# UFO Question

Our data science team has predicted that the Earth is going to be invaded by an alien force in the
next few years. Our only hope is to replicate a device that can block all alien technology within a radius of
~300 km. Sadly, the device was sold in 2004 to an anonymous buyer who wanted to protect her hometown, and
we don't know how to contact her again. We know that the device has been active since 2004 in one
city in the USA, and we want to know where to start our search.

We've included a dataset called `ufo.csv`. This dataset contains over 80,000 reports of UFO sightings
over the last century (all of them verified by the ESA). Using this dataset, try to guess the city in
which the device has been hidden.
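
The ~300 km blocking radius is a distance on the Earth's surface, so comparing a sighting with a candidate city needs a great-circle distance. A minimal sketch using the haversine formula follows. The names `haversine_km` and `within_radius`, and the mean Earth radius of 6371 km, are assumptions of this example and do not come from the notebook.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(sighting, center, radius_km=300.0):
    """True if a (lat, lon) sighting lies inside the radius around a (lat, lon) center."""
    return haversine_km(*sighting, *center) <= radius_km
```

A candidate city could then be scored by counting how many sightings since 2004 satisfy `within_radius`; a city hiding the device would be expected to have very few.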


In [1]:
import pandas as pd

from pymongo import MongoClient

import folium

In [2]:
# Load the sightings into a DataFrame
df = pd.read_csv('ufo.csv')

# Connect to a local MongoDB instance (default host and port)
ufo = MongoClient()

# Use the `ufo` database
db = ufo.ufo

df.head()

Unnamed: 0.1,Unnamed: 0,datetime,city,state,country,shape,duration,total_time,comments,date_posted,latitude,longitude,year,distance
0,0,10/10/1949 20:30,san marcos,tx,us,cylinder,2700.0,45 minutes,This event took place in early fall around 194...,4/27/2004,29.883056,-97.941111,2004,1242.667772
1,1,10/10/1949 21:00,lackland afb,tx,,light,7200.0,1-2 hrs,1949 Lackland AFB&#44 TX. Lights racing acros...,12/16/2005,29.38421,-98.581082,2005,1325.486319
2,2,10/10/1955 17:00,chester (uk/england),,gb,circle,20.0,20 seconds,Green/Orange circular disc over Chester&#44 En...,1/21/2008,53.2,-2.916667,2008,6515.416577
3,3,10/10/1956 21:00,edna,tx,us,circle,20.0,1/2 hour,My older brother and twin sister were leaving ...,1/17/2004,28.978333,-96.645833,2004,1211.971352
4,4,10/10/1960 20:00,kaneohe,hi,us,light,900.0,15 minutes,AS a Marine 1st Lt. flying an FJ4B fighter/att...,1/22/2004,21.418056,-157.803611,2004,6960.923396


In [3]:
# Keep only the four-digit year of each sighting and store it as an integer
df['datetime'] = df['datetime'].str.extract(r'(\d{4})', expand=False).astype(int)
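
The year extraction can be tried on a couple of sample strings before running it on the whole column. The sketch below compares the regex route with a `pd.to_datetime` route. The sample values are illustrative, and the format string `'%m/%d/%Y %H:%M'` is an assumption based on the rows shown above.

```python
import pandas as pd

# Sample values in the same "month/day/year hour:minute" layout as the dataset
raw = pd.Series(['10/10/1949 20:30', '10/30/2006 21:00'])

# Regex route: the first run of four digits is the year
years_regex = raw.str.extract(r'(\d{4})', expand=False).astype(int)

# Datetime route: strings that do not match the format become NaT instead of raising
parsed = pd.to_datetime(raw, format='%m/%d/%Y %H:%M', errors='coerce')
years_parsed = parsed.dt.year
```

The regex route tolerates malformed timestamps as long as a four-digit year is present, while the datetime route also validates the rest of the string.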

In [20]:
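# US sightings from 2004 onward whose distance lies in [300, 320) km, just beyond the ~300 km blocking radius
df1 = df[(df['datetime']>=2004) & (df['country']=='us') & (df['distance']>=300) & (df['distance']<320)]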

df1.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 211 entries, 151 to 78504
Data columns (total 14 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   Unnamed: 0   211 non-null    int64  
 1   datetime     211 non-null    int32  
 2   city         211 non-null    object 
 3   state        211 non-null    object 
 4   country      211 non-null    object 
 5   shape        208 non-null    object 
 6   duration     211 non-null    float64
 7   total_time   211 non-null    object 
 8   comments     211 non-null    object 
 9   date_posted  211 non-null    object 
 10  latitude     211 non-null    float64
 11  longitude    211 non-null    float64
 12  year         211 non-null    int64  
 13  distance     211 non-null    float64
dtypes: float64(4), int32(1), int64(2), object(7)
memory usage: 23.9+ KB
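
The filter above keeps a single `[300, 320)` km band. Binning the `distance` column is one way to see how sightings spread across bands before settling on those cut-offs. This is a sketch on toy values only: the numbers are hypothetical, and the notebook does not state what `distance` is measured from.

```python
import pandas as pd

# Toy distances in km (hypothetical values, not taken from ufo.csv)
toy = pd.DataFrame({'distance': [120.0, 305.0, 310.0, 450.0, 700.0]})

# Left-closed bins: [0, 300), [300, 320), [320, 1000)
bands = pd.cut(toy['distance'], bins=[0, 300, 320, 1000], right=False)

# Number of sightings per band, kept in bin order
band_counts = bands.value_counts(sort=False)
```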


In [21]:
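# Store every report in the `geoloc` collection
db.geoloc.insert_many(df.to_dict('records'))

# Geospatial index on `geo_loc`; the records inserted above have no such field,
# so it must be added to the documents before this index is useful
db.geoloc.create_index([('geo_loc','2dsphere')])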

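def geomap(df, zoom_start=1.5):
    # Work on a re-indexed copy so the caller's DataFrame keeps its original index
    df = df.reset_index(drop=True)

    mapa = folium.Map(location=[0, 0], tiles='openstreetmap', zoom_start=zoom_start)

    # Start at 0 so the first row is plotted too
    for i in range(len(df)):
        # `icon` is not a CircleMarker option, so it is dropped here
        folium.CircleMarker([float(df.latitude[i]), float(df.longitude[i])],
                            popup=str(df.city[i]),
                            radius=1).add_to(mapa)

    return mapa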
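
The `create_index` call in this cell targets a `geo_loc` field, and a `2dsphere` index expects that field to hold GeoJSON. A minimal sketch of building such a field from the `latitude` and `longitude` columns follows. The helper names `to_geojson_point` and `add_geo_loc` are hypothetical and not part of the notebook.

```python
def to_geojson_point(latitude, longitude):
    """Build a GeoJSON Point; GeoJSON lists longitude before latitude."""
    lat, lon = float(latitude), float(longitude)
    # NaN fails both comparisons, so missing coordinates are rejected as well
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f'coordinates out of range: lat={lat}, lon={lon}')
    return {'type': 'Point', 'coordinates': [lon, lat]}


def add_geo_loc(record):
    """Return a copy of a record dict with a `geo_loc` field added."""
    enriched = dict(record)
    enriched['geo_loc'] = to_geojson_point(record['latitude'], record['longitude'])
    return enriched


# One row from the notebook's output, reduced to the fields used here
sample = {'city': 'memphis', 'latitude': 35.149444, 'longitude': -90.048889}
doc = add_geo_loc(sample)
```

Records enriched this way could be passed to `insert_many` in place of the raw `df.to_dict('records')` output, after which the `2dsphere` index would cover them.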

In [22]:
df1

Unnamed: 0.1,Unnamed: 0,datetime,city,state,country,shape,duration,total_time,comments,date_posted,latitude,longitude,year,distance
151,156,2006,blairsville,ga,us,unknown,10.0,<10 sec.,Intermittant streak by moon&#44 not seen on ph...,10/30/2006,34.876111,-83.958333,2006,305.862395
877,900,2009,memphis,tn,us,oval,300.0,twice - 1-5 minute sighti,Day and Night - Saw object twice c.2009. 3-wi...,5/13/2012,35.149444,-90.048889,2012,305.272767
1183,1216,2013,hiawassee,ga,us,circle,900.0,15 minutes,We just witnessed an amazing event. About ten...,10/14/2013,34.949167,-83.757500,2013,318.307369
1346,1380,2010,flora,il,us,cross,90.0,~1.5min,Strange gray object spotted in my small town,11/21/2010,38.668889,-88.485556,2010,308.148769
1557,1594,2006,buchanan,ga,us,light,300.0,3-5 minutes,Peculiar flashing lights over Buchanan&#44 Ge...,10/30/2006,33.802500,-85.188611,2006,308.372241
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
76354,78110,2013,hoover,al,us,disk,1200.0,20 minutes,Bright blue saucer with blue lights beaming fr...,9/30/2013,33.405278,-86.811389,2013,308.723830
76555,78314,2013,chelsea,al,us,fireball,8.0,5-8 seconds,Very bright white and blue light crossing over...,9/30/2013,33.340000,-86.630278,2013,316.942024
76715,78484,2010,kennesaw,ga,us,other,180.0,3 minutes,saw black bell-shaped object hovering over I-575,11/21/2010,34.023333,-84.615556,2010,319.282429
76733,78503,2011,memphis,tn,us,light,420.0,7 min,I know that what me and my daughters witness w...,10/10/2011,35.149444,-90.048889,2011,305.272767


In [23]:
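# .loc[:1500] is label-based and would keep only rows whose original index is <= 1500,
# so plot every row of df1 instead
where = geomap(df1)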

where
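
A filtered frame such as `df1` keeps the row labels of the original DataFrame, so label-based and position-based slicing select different rows. The sketch below illustrates the difference on a toy frame that reuses a few index labels and city names from the output above.

```python
import pandas as pd

# Toy frame that mimics a filtered DataFrame: row labels are the original, non-contiguous ones
toy = pd.DataFrame(
    {'city': ['blairsville', 'memphis', 'hiawassee', 'flora', 'buchanan', 'hoover']},
    index=[151, 877, 1183, 1346, 1557, 76354],
)

by_label = toy.loc[:1500]      # rows whose label is <= 1500
by_position = toy.iloc[:1500]  # the first 1500 rows, or all rows if there are fewer
```

Here `by_label` holds only the rows labelled 151 to 1346, while `by_position` holds the whole frame.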