## ETEE UFO Sightings Analysis

Let's learn how combining pandas with ipywidgets `interact` makes for an easy-to-search dataframe.

This example is similar to the Fudgemart Products exercise from before, only it uses pandas and a UFO dataset.

The dataset covers 5 months of UFO sightings from 2016:

https://github.com/mafudge/datasets/tree/master/ufo-sightings


    1. combine 5 months of UFO sightings into a single dataset
    2. interact to filter/search dataframe
        2.1 drop down of state
        2.2 drop down of shape
        2.3 text search in summary
        

In [1]:
# Figure out the data frame: load two of the monthly files and peek at them
import pandas as pd
ufo1 = pd.read_csv('https://raw.githubusercontent.com/mafudge/datasets/master/ufo-sightings/ufo-sightings-2016-01.csv')
ufo1.head()  # not displayed: a notebook cell only shows its last expression
ufo2 = pd.read_csv('https://raw.githubusercontent.com/mafudge/datasets/master/ufo-sightings/ufo-sightings-2016-02.csv')
ufo2.head()



Unnamed: 0,Date / Time,City,State,Shape,Duration,Summary,Posted
0,2/29/16 23:45,Harbor Beach,MI,Light,1 minute,Yellow/white ball of light.,3/4/2016
1,2/29/16 23:30,Sebastian,FL,Triangle,20-40 minutes,6 low flying craft with loud engines and 2 whi...,3/4/2016
2,2/29/16 23:00,Salunga/Landisville area right by highway 283,PA,Triangle,5-15 minutes,Pennsylvania triangular aircraft sighting.,3/4/2016
3,2/29/16 22:00,York,PA,Triangle,30 minutes,Myself and 2 friends of mine were driving to t...,3/4/2016
4,2/29/16 21:35,Joliet,IL,Unknown,10 minutes,At approximately 21:35 we heard a very loud no...,3/4/2016


In [2]:
# ready to make a function!
def read_ufo_data():
    ufos = []
    for i in range(1,6):
        ufo = pd.read_csv(f'https://raw.githubusercontent.com/mafudge/datasets/master/ufo-sightings/ufo-sightings-2016-0{i}.csv')
        ufos.append(ufo)
    # concatenate once, after all five files are loaded
    df = pd.concat(ufos, ignore_index=True)
    return df

df = read_ufo_data()
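
To see what `ignore_index=True` does in `pd.concat`, here is a minimal sketch that uses two tiny in-memory frames instead of the real CSV files. The frames `jan` and `feb` and their values are made up for illustration; only the `State` and `Shape` column names come from the dataset.

```python
import pandas as pd

# Hypothetical stand-ins for two monthly files
jan = pd.DataFrame({"State": ["NY", "CA"], "Shape": ["Light", "Disk"]})
feb = pd.DataFrame({"State": ["PA"], "Shape": ["Triangle"]})

# Without ignore_index, each frame keeps its own row labels, so labels repeat
stacked = pd.concat([jan, feb])
print(list(stacked.index))

# With ignore_index=True, the combined frame gets a fresh 0..n-1 index
combined = pd.concat([jan, feb], ignore_index=True)
print(list(combined.index))
```

A fresh index matters here because every monthly file starts counting its rows from 0, and repeated labels make later lookups by label ambiguous.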

In [3]:
# let's figure out a dropdown
df['State']

df['State'].unique()

# need to drop the nan
df['State'].dropna().unique()

# that's our list
states = list(df['State'].dropna().unique())
# need to add wildcard; sort first so '*ANY*' is guaranteed to stay at the top
states.sort()
states.insert(0,'*ANY*')
states[:10]


['*ANY*', 'AB', 'AK', 'AL', 'AR', 'AZ', 'BC', 'CA', 'CO', 'CT']

In [4]:
# Let's generalize it to a function!

In [5]:
def dedupe_series(series: pd.Series, add_any=True) -> list[str]:
    values = sorted(list(series.dropna().unique()))
    if add_any:
        values.insert(0, "*ANY*")
    return values


states = dedupe_series(df['State'])
shapes = dedupe_series(df["Shape"], add_any=False)
print(states[:10])
print(shapes[:10])

['*ANY*', 'AB', 'AK', 'AL', 'AR', 'AZ', 'BC', 'CA', 'CO', 'CT']
['Changing', 'Chevron', 'Cigar', 'Circle', 'Cone', 'Cross', 'Cylinder', 'Diamond', 'Disk', 'Egg']
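
The reason for `dropna()` can be shown with a small sketch. The series `raw` below is made up for illustration. A column that mixes strings with a missing value cannot be sorted directly, because Python cannot compare a float `nan` with a `str`.

```python
import pandas as pd

# Hypothetical column with a missing value, like the State column
raw = pd.Series(["NY", float("nan"), "CA", "NY"])

# Sorting the raw unique values fails: nan (a float) cannot be compared to a str
try:
    sorted(raw.unique())
    sort_failed = False
except TypeError:
    sort_failed = True

# Dropping the missing values first makes the sort safe
values = sorted(raw.dropna().unique())
# Insert the wildcard after sorting so it always stays first
values.insert(0, "*ANY*")
print(sort_failed, values)
```

Inserting `*ANY*` after the sort, as `dedupe_series` does, keeps the wildcard at the top no matter which characters the real values start with.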


## Complete working code

In [6]:
import pandas as pd
from IPython.display import display, HTML
from ipywidgets import widgets, interact_manual
import warnings
warnings.filterwarnings('ignore')
pd.set_option('display.max_colwidth', None)

def read_ufo_data():
    ufos = []
    for i in range(1,6):
        ufo = pd.read_csv(f'https://raw.githubusercontent.com/mafudge/datasets/master/ufo-sightings/ufo-sightings-2016-0{i}.csv')
        ufos.append(ufo)
    # concatenate once, after all five files are loaded
    df = pd.concat(ufos, ignore_index=True)
    return df

def dedupe_series(series: pd.Series) -> list[str]:
    values = sorted(list(series.dropna().unique()))
    values.insert(0, "*ANY*")
    return values

df = read_ufo_data()
states = dedupe_series(df['State'])
shapes = dedupe_series(df["Shape"])

display(HTML("<h1>Search UFO Sightings</h1>"))
@interact_manual(state=states, shape=shapes)
def on_click(state, shape):
    search_df = df
    if state != '*ANY*':
        search_df = search_df[ search_df.State == state ]
    if shape != '*ANY*':
        search_df = search_df[ search_df.Shape == shape ]
    display(search_df)


interactive(children=(Dropdown(description='state', options=('*ANY*', 'AB', 'AK', 'AL', 'AR', 'AZ', 'BC', 'CA'…
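
Step 2.3 of the plan, the text search in the summary, is not implemented in the code above. One possible way to add it is sketched below as a plain pandas function, so it can be tried without any widgets. The function name `filter_sightings`, its `text` parameter, and the `sample` frame are assumptions made for illustration and are not part of the original notebook.

```python
import pandas as pd

ANY = "*ANY*"

def filter_sightings(df, state=ANY, shape=ANY, text=""):
    """Return rows matching the state, the shape, and a substring of Summary."""
    result = df
    if state != ANY:
        result = result[result["State"] == state]
    if shape != ANY:
        result = result[result["Shape"] == shape]
    if text:
        # case-insensitive literal match; rows with a missing Summary never match
        mask = result["Summary"].str.contains(text, case=False, na=False, regex=False)
        result = result[mask]
    return result

# Small made-up sample with the same column names as the UFO dataset
sample = pd.DataFrame({
    "State": ["MI", "FL", "PA"],
    "Shape": ["Light", "Triangle", "Triangle"],
    "Summary": [
        "Yellow/white ball of light.",
        None,
        "Pennsylvania triangular aircraft sighting.",
    ],
})

matches = filter_sightings(sample, shape="Triangle", text="AIRCRAFT")
print(matches)
```

In the notebook, `on_click` could take an extra `text=""` argument and call a function like this one. Passing a string default to `interact_manual` should render a text box next to the two dropdowns.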