We start by acquiring the data on UFO sightings. Fortunately, GitHub user Link Wentz has been nice enough to download
it from the website and parse it into a reasonable format, which we make available here.

However, some of the entries contain irregularities, which we would like to clean up!

In [None]:
import pandas as pd

sightings = pd.read_csv('../nuforc_events_complete.csv',
                        usecols=['event_time', 'city', 'state',
                                 'shape', 'duration', 'summary'])
print(sightings.head(10))
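
One quick way to spot irregular entries is to count missing values per column and list the distinct values in `state`. The sketch below runs on a small made-up frame (`toy` and its rows are hypothetical stand-ins, not values from the real file); the same two calls should work on `sightings`.

```python
import pandas as pd

# Hypothetical toy rows standing in for the real CSV (values are made up).
toy = pd.DataFrame({
    'city': ['Phoenix', 'Toronto', 'Austin', None],
    'state': ['AZ', 'ON', 'tx', None],
    'shape': ['light', 'disk', None, 'circle'],
})

# Number of missing values in each column.
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Frequency of every state value, keeping missing entries visible.
state_counts = toy['state'].value_counts(dropna=False)
print(state_counts)
```

Entries such as a non-US code, a lowercase code, or a missing state would all show up in the second printout.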


In [None]:
# keep only sightings whose state code maps to a US state (or DC)
valid_states = {
    'AK', 'AL', 'AR', 'AZ', 'CA', 'CO', 'CT', 'DC', 'DE', 'FL', 'GA', 'HI',
    'IA', 'ID', 'IL', 'IN', 'KS', 'KY', 'LA', 'MA', 'MD', 'ME', 'MI', 'MN',
    'MO', 'MS', 'MT', 'NC', 'ND', 'NE', 'NH', 'NJ', 'NM', 'NV', 'NY', 'OH',
    'OK', 'OR', 'PA', 'RI', 'SC', 'SD', 'TN', 'TX',
    'UT', 'VA', 'VT', 'WA', 'WI', 'WV', 'WY'
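}
# .copy() gives us an independent frame, so the column assignment below
# does not trigger a SettingWithCopyWarning
sightings = sightings.loc[sightings.state.isin(valid_states), :].copy()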

# parse the date information into a more useful format
sightings['event_time'] = pd.to_datetime(sightings.event_time,
                                         format="%Y-%m-%dT%H:%M:%SZ", errors='coerce')
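
Note that `errors='coerce'` silently turns every timestamp that does not match the format into `NaT`, so it is worth checking how many rows were affected. A minimal sketch on made-up strings (the `raw` series is hypothetical; on the real data the same check would be `sightings.event_time.isna().sum()`):

```python
import pandas as pd

# Hypothetical raw timestamps: one well-formed, three that cannot be parsed.
raw = pd.Series(['2017-04-01T21:15:00Z', '2017-04-01 21:15', None, 'unknown'])

# Same format string and error handling as in the cell above.
parsed = pd.to_datetime(raw, format="%Y-%m-%dT%H:%M:%SZ", errors='coerce')

# Entries that did not match the format are now NaT.
n_unparsed = parsed.isna().sum()
print(f"{n_unparsed} of {len(parsed)} timestamps could not be parsed")

# Optionally keep only the rows with a usable timestamp.
clean = parsed.dropna()
```

Whether to drop those rows or keep them depends on the analysis; the per-state totals below do not need the timestamp at all.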


Let's examine the total number of sightings recorded for each state across the whole dataset. We see that California is a very popular destination!

In [None]:
import plotly.express as px


state_totals = sightings.groupby('state').size()

fig = px.choropleth(locations=[str(x) for x in state_totals.index],
                    scope="usa", locationmode="USA-states",
                    color=state_totals.values,
                    range_color=[0, state_totals.max()])
fig.show()
