# Exploring 3025 records that hit on the Initial RegEx: 'XX NM dir XXX'
***
***DAEN690***

***George Mason University***

***Author:*** Grace Cox (Team LEGO)

***Date:*** October 1, 2021
***
This Jupyter Notebook explores 3025 records (of the 5167 non-duplicate, standard format records) that hit on the initial 'XX NM dir XXX' regular expression. The process for exploring these records is as follows:

1. Drop null values from dataframe (these records did not link to the Airports Cleaned dataset on the 'IDENT' field).
<br>
2. Create a dictionary relating bearing abbreviations to their respective degrees.
<br>
3. Separate the UAS location into the distance (NM), bearing (abbrev), and Airport identifier.
<br>
4. Convert all distances from NM to kilometers, and all bearing abbreviations into their respective degrees. We also want to remove all records for which there is no bearing or distance information.
<br>
5. Calculate the UAS lat/long using the geopy library.
***
### INPUT: 
*Dataframe (read from a .csv file) containing the 3025 records, out of 5191 records, that hit on the initial 'XX NM dir XXX' regular expression.*

### OUTPUT:

<font color='green'>**UAS LATITUDE/LONGITUDE COORDINATE CALCULATED**</font>

*FAA Airport Identifier: 2804 records*

## Import Statements and Read Files

In [1]:
# Import Statements
import pandas as pd
import numpy as np

import geopy
from geopy.distance import geodesic

In [2]:
# Import std_format_uas_loc.csv (contains the 3025 records that hit on the initial RegEx, as described above)
df1 = pd.read_csv('C:/Users/grace/OneDrive/Desktop/GMU/DAEN690/std_format_uas_loc.csv')

In [3]:
df1

Unnamed: 0,REMARKS,UAS_LOC,IDENT,GLOBAL_ID,NAME,LATITUDE,LONGITUDE,ICAO_ID
0,Aircraft observed alarge orange UAS with flash...,['3 NM NNW of CLE'],CLE,2A962A82-442A-4A82-9535-214B1B978E2E,Cleveland-Hopkins Intl,41.409407,-81.854691,KCLE
1,"Aircraft observed a UAS while at 5,000 feet 11...",['11 NM SSE of SLC'],SLC,CB41ECB2-ADC7-49A5-9C05-81854AFD415C,Salt Lake City Intl,40.788388,-111.977773,KSLC
2,Aircraft observed a multi-colored UAS off the ...,['7NM E of SAN'],SAN,50CC510F-CE80-405D-BE8D-05AFC123107F,San Diego Intl,32.733556,-117.189667,KSAN
3,"Aircraft observed a triangular shaped, grey an...",['23NM SW of ORL'],ORL,28DD4E4B-2AD4-4122-AD63-A0842EB23993,Exec,28.545462,-81.332930,KORL
4,Aircraft observed a red quad copter UAS while ...,['4NM SSE of MLB'],MLB,3ACDEDBD-58E6-4FFF-A8A4-C7CAB20B3299,Melbourne Orlando Intl,28.102750,-80.645250,KMLB
...,...,...,...,...,...,...,...,...
3020,Aircraft reported a quad-copter UAS at 12 O'cl...,['10 NM SE of ADS'],ADS,3646423B-964B-466A-9CE9-3CB43457F772,Addison,32.968556,-96.836444,KADS
3021,Aircraft reported a UAS 3 NM N of MMU while E ...,['3 NM N of MMU'],MMU,18FAF388-FEDD-48FC-88CF-83EA626E928D,Morristown Muni,40.799338,-74.414889,KMMU
3022,Aircraft reported a UAS sensor hit while N bou...,['21 NM NW of BXK'],BXK,4285C89D-6393-48C3-B0AD-945171391D35,Buckeye Muni,33.420417,-112.686181,KBXK
3023,From MOR: Aircraft reported a NMAC with a flat...,['5 NM ENE of DRK'],DRK,,,,,


#### Drop all NULL values from DataFrame 

2,810 UAS sightings following a Standard Format remain

In [4]:
df1 = df1.dropna().reset_index() # DROP ALL NULL VALUES FOR THE TIME BEING (LEAVES 2810 RECORDS)
df1

Unnamed: 0,index,REMARKS,UAS_LOC,IDENT,GLOBAL_ID,NAME,LATITUDE,LONGITUDE,ICAO_ID
0,0,Aircraft observed alarge orange UAS with flash...,['3 NM NNW of CLE'],CLE,2A962A82-442A-4A82-9535-214B1B978E2E,Cleveland-Hopkins Intl,41.409407,-81.854691,KCLE
1,1,"Aircraft observed a UAS while at 5,000 feet 11...",['11 NM SSE of SLC'],SLC,CB41ECB2-ADC7-49A5-9C05-81854AFD415C,Salt Lake City Intl,40.788388,-111.977773,KSLC
2,2,Aircraft observed a multi-colored UAS off the ...,['7NM E of SAN'],SAN,50CC510F-CE80-405D-BE8D-05AFC123107F,San Diego Intl,32.733556,-117.189667,KSAN
3,3,"Aircraft observed a triangular shaped, grey an...",['23NM SW of ORL'],ORL,28DD4E4B-2AD4-4122-AD63-A0842EB23993,Exec,28.545462,-81.332930,KORL
4,4,Aircraft observed a red quad copter UAS while ...,['4NM SSE of MLB'],MLB,3ACDEDBD-58E6-4FFF-A8A4-C7CAB20B3299,Melbourne Orlando Intl,28.102750,-80.645250,KMLB
...,...,...,...,...,...,...,...,...,...
2805,3019,Aircraft reported a medium sized quad-copter U...,['3 NM SW of SAT'],SAT,19784DDA-7416-457C-80E6-583E1AB80A66,San Antonio Intl,29.533958,-98.469057,KSAT
2806,3020,Aircraft reported a quad-copter UAS at 12 O'cl...,['10 NM SE of ADS'],ADS,3646423B-964B-466A-9CE9-3CB43457F772,Addison,32.968556,-96.836444,KADS
2807,3021,Aircraft reported a UAS 3 NM N of MMU while E ...,['3 NM N of MMU'],MMU,18FAF388-FEDD-48FC-88CF-83EA626E928D,Morristown Muni,40.799338,-74.414889,KMMU
2808,3022,Aircraft reported a UAS sensor hit while N bou...,['21 NM NW of BXK'],BXK,4285C89D-6393-48C3-B0AD-945171391D35,Buckeye Muni,33.420417,-112.686181,KBXK


## Create Dictionary of Bearings and Respective Degrees

In [5]:
bearing_deg = {
    'N' : 0,
    'NNE' : 23,
    'NE' : 45,
    'ENE' : 68,
    'E' : 90,
    'ESE' : 113,
    'SE' : 135,
    'SSE' : 158,
    'S' : 180,
    'SSW' : 203,
    'SW' : 225,
    'WSW' : 248,
    'W' : 270,
    'WNW' : 293,
    'NW' : 315,
    'NNW' : 338
}
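
The sixteen abbreviations above are the points of a 16-point compass rose, spaced 22.5 degrees apart, and the dictionary values are those angles rounded half up to whole degrees. As a rough sketch, the same table could be generated rather than typed by hand. The names `COMPASS_POINTS` and `build_bearing_table` are hypothetical and not part of the notebook.

```python
# Hypothetical helper: rebuild the bearing table from the 22.5-degree spacing
# of a 16-point compass rose instead of typing each value by hand.
COMPASS_POINTS = ['N', 'NNE', 'NE', 'ENE', 'E', 'ESE', 'SE', 'SSE',
                  'S', 'SSW', 'SW', 'WSW', 'W', 'WNW', 'NW', 'NNW']

def build_bearing_table(points=COMPASS_POINTS):
    step = 360 / len(points)  # 22.5 degrees between neighbouring points
    # int(x + 0.5) rounds halves up (22.5 -> 23); Python's round() would give 22
    return {name: int(i * step + 0.5) for i, name in enumerate(points)}

generated = build_bearing_table()
```

Rounding to whole degrees shifts the intercardinal bearings by at most half a degree.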

In [44]:
# bearing_deg.values()
# bearing_deg.keys()
# bearing_deg.items()

#### Break Up UAS_LOC column into the Distances and Bearing Abbreviations 

In [6]:
# Extract Distances and Bearings from UAS_LOC that were pulled from Remarks
# with a Standard Format in SkyWatch
uas_loc = df1['UAS_LOC']
distances = []
bearings = []

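for i in range(len(uas_loc)):
    # UAS_LOC is read from the .csv as a string such as "['3 NM NNW of CLE']",
    # so the first token still carries the leading "['" (e.g. "['3" or "['7NM")
    split = uas_loc[i].split(' ')[0:3]
    
    if 'NM' in split[0]: # no space between the distance and NM (i.e. 3NM)
        distance_nm = split[0][:-2] # drop the trailing 'NM'
        bearing_abrev = split[1].replace('of', '') # blank when no bearing was given
        
    else: 
        distance_nm = split[0]
        bearing_abrev = split[2].replace('of', '') # blank when no bearing was given
    
    distances.append(float(distance_nm[2:])) # [2:] strips the leading "['"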
    bearings.append(bearing_abrev)
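
The position-based split above could also be written as a single regular expression that captures the distance, the bearing, and the airport identifier in one pass. The sketch below is an illustration only: `LOC_PATTERN` and `parse_uas_loc` are hypothetical names, and the pattern is an assumption about the 'XX NM dir XXX' format, not the regular expression used upstream to select these records.

```python
import re

# Assumed pattern for strings like "['3 NM NNW of CLE']" or "['7NM E of SAN']";
# it is an illustration, not the regular expression used upstream.
LOC_PATTERN = re.compile(
    r"(?P<dist>\d+(?:\.\d+)?)\s*NM\s+(?P<bearing>[NSEW]{1,3})?\s*of\s+(?P<ident>[A-Z0-9]{3,4})"
)

def parse_uas_loc(text):
    """Return (distance_nm, bearing, ident), or None if the text does not match."""
    match = LOC_PATTERN.search(text)
    if match is None:
        return None
    bearing = match.group('bearing') or ''  # blank when no bearing was given
    return float(match.group('dist')), bearing, match.group('ident')

with_space = parse_uas_loc("['3 NM NNW of CLE']")
without_space = parse_uas_loc("['7NM E of SAN']")
```

A record with no bearing (for example `'5 NM of DRK'`) comes back with a blank bearing, which matches how the loop above leaves `bearing_abrev` empty.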

In [7]:
# Create DataFrame to store UAS Location Information
new_df = pd.DataFrame(df1['REMARKS'])
new_df['UAS_LOC'] = df1['UAS_LOC']
new_df['IDENT'] = df1['IDENT']
new_df['Distance'] = distances
new_df['Bearing'] = bearings

new_df

Unnamed: 0,REMARKS,UAS_LOC,IDENT,Distance,Bearing
0,Aircraft observed alarge orange UAS with flash...,['3 NM NNW of CLE'],CLE,3.0,NNW
1,"Aircraft observed a UAS while at 5,000 feet 11...",['11 NM SSE of SLC'],SLC,11.0,SSE
2,Aircraft observed a multi-colored UAS off the ...,['7NM E of SAN'],SAN,7.0,E
3,"Aircraft observed a triangular shaped, grey an...",['23NM SW of ORL'],ORL,23.0,SW
4,Aircraft observed a red quad copter UAS while ...,['4NM SSE of MLB'],MLB,4.0,SSE
...,...,...,...,...,...
2805,Aircraft reported a medium sized quad-copter U...,['3 NM SW of SAT'],SAT,3.0,SW
2806,Aircraft reported a quad-copter UAS at 12 O'cl...,['10 NM SE of ADS'],ADS,10.0,SE
2807,Aircraft reported a UAS 3 NM N of MMU while E ...,['3 NM N of MMU'],MMU,3.0,N
2808,Aircraft reported a UAS sensor hit while N bou...,['21 NM NW of BXK'],BXK,21.0,NW


#### Convert all Distances from NM to Kilometers and Bearing Abbreviations into Degrees

In [8]:
dist_kilo = []

for i in range(len(new_df)):
    distanceKilo = new_df['Distance'][i] * 1.852 # converting NM to kilometers
    dist_kilo.append(distanceKilo)

bearDegree = pd.DataFrame(new_df['Bearing'])
bearDegree = bearDegree.replace({'Bearing': bearing_deg})

new_df['Distance_Kilometers'] = dist_kilo
new_df['Bearing_Degrees'] = bearDegree
new_df['Airport_Latitude'] = df1['LATITUDE']
new_df['Airport_Longitude'] = df1['LONGITUDE']
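
The same conversion can be done without an explicit loop. The following is a minimal sketch on a toy frame. The `demo` rows, `NM_TO_KM`, and `bearing_lookup` are stand-ins introduced for illustration, not names from the notebook.

```python
import pandas as pd

NM_TO_KM = 1.852  # one nautical mile is exactly 1.852 km

# Toy frame standing in for new_df; the three rows are illustrative only
demo = pd.DataFrame({'Distance': [3.0, 11.0, 7.0],
                     'Bearing': ['NNW', 'SSE', '']})
bearing_lookup = {'NNW': 338, 'SSE': 158}  # subset of bearing_deg

# Column arithmetic applies the factor to every row at once
demo['Distance_Kilometers'] = demo['Distance'] * NM_TO_KM
# .map() leaves NaN where the abbreviation is not a key (e.g. a blank bearing)
demo['Bearing_Degrees'] = demo['Bearing'].map(bearing_lookup)
```

One difference is worth noting: `.replace()` with a dictionary leaves unmatched values (such as a blank string) unchanged, whereas `.map()` turns them into `NaN`.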

In [9]:
new_df

Unnamed: 0,REMARKS,UAS_LOC,IDENT,Distance,Bearing,Distance_Kilometers,Bearing_Degrees,Airport_Latitude,Airport_Longitude
0,Aircraft observed alarge orange UAS with flash...,['3 NM NNW of CLE'],CLE,3.0,NNW,5.556,338,41.409407,-81.854691
1,"Aircraft observed a UAS while at 5,000 feet 11...",['11 NM SSE of SLC'],SLC,11.0,SSE,20.372,158,40.788388,-111.977773
2,Aircraft observed a multi-colored UAS off the ...,['7NM E of SAN'],SAN,7.0,E,12.964,90,32.733556,-117.189667
3,"Aircraft observed a triangular shaped, grey an...",['23NM SW of ORL'],ORL,23.0,SW,42.596,225,28.545462,-81.332930
4,Aircraft observed a red quad copter UAS while ...,['4NM SSE of MLB'],MLB,4.0,SSE,7.408,158,28.102750,-80.645250
...,...,...,...,...,...,...,...,...,...
2805,Aircraft reported a medium sized quad-copter U...,['3 NM SW of SAT'],SAT,3.0,SW,5.556,225,29.533958,-98.469057
2806,Aircraft reported a quad-copter UAS at 12 O'cl...,['10 NM SE of ADS'],ADS,10.0,SE,18.520,135,32.968556,-96.836444
2807,Aircraft reported a UAS 3 NM N of MMU while E ...,['3 NM N of MMU'],MMU,3.0,N,5.556,0,40.799338,-74.414889
2808,Aircraft reported a UAS sensor hit while N bou...,['21 NM NW of BXK'],BXK,21.0,NW,38.892,315,33.420417,-112.686181


In [10]:
new_df.to_csv('NaN_bearing_standard.csv', index = False)

#### Remove records from DataFrame with no Bearing Information Extracted

2,804 UAS sighting records remain

In [11]:
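# Assign the result back rather than calling replace(..., inplace = True) on the
# selected column, which newer pandas versions may not propagate to new_df
new_df['Bearing'] = new_df['Bearing'].replace('', np.nan)
new_df.dropna(subset = ['Bearing'], inplace = True) #LEAVES 2804 ROWS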

new_df = new_df.reset_index()
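
To see how many records this step discards, the row count can be compared before and after the filter. This is a small sketch on a toy frame, with `demo`, `before`, and `dropped` as illustrative names. Unlike the cell above, it uses `reset_index(drop=True)`, so the old index is not kept as a column.

```python
import numpy as np
import pandas as pd

# Toy stand-in for new_df: the second row has no bearing extracted
demo = pd.DataFrame({'IDENT': ['CLE', 'DRK', 'SAN'],
                     'Bearing': ['NNW', '', 'E']})

before = len(demo)
demo['Bearing'] = demo['Bearing'].replace('', np.nan)  # blank -> NaN
demo = demo.dropna(subset=['Bearing']).reset_index(drop=True)
dropped = before - len(demo)  # number of records without a bearing
```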

In [12]:
new_df

Unnamed: 0,index,REMARKS,UAS_LOC,IDENT,Distance,Bearing,Distance_Kilometers,Bearing_Degrees,Airport_Latitude,Airport_Longitude
0,0,Aircraft observed alarge orange UAS with flash...,['3 NM NNW of CLE'],CLE,3.0,NNW,5.556,338,41.409407,-81.854691
1,1,"Aircraft observed a UAS while at 5,000 feet 11...",['11 NM SSE of SLC'],SLC,11.0,SSE,20.372,158,40.788388,-111.977773
2,2,Aircraft observed a multi-colored UAS off the ...,['7NM E of SAN'],SAN,7.0,E,12.964,90,32.733556,-117.189667
3,3,"Aircraft observed a triangular shaped, grey an...",['23NM SW of ORL'],ORL,23.0,SW,42.596,225,28.545462,-81.332930
4,4,Aircraft observed a red quad copter UAS while ...,['4NM SSE of MLB'],MLB,4.0,SSE,7.408,158,28.102750,-80.645250
...,...,...,...,...,...,...,...,...,...,...
2799,2805,Aircraft reported a medium sized quad-copter U...,['3 NM SW of SAT'],SAT,3.0,SW,5.556,225,29.533958,-98.469057
2800,2806,Aircraft reported a quad-copter UAS at 12 O'cl...,['10 NM SE of ADS'],ADS,10.0,SE,18.520,135,32.968556,-96.836444
2801,2807,Aircraft reported a UAS 3 NM N of MMU while E ...,['3 NM N of MMU'],MMU,3.0,N,5.556,0,40.799338,-74.414889
2802,2808,Aircraft reported a UAS sensor hit while N bou...,['21 NM NW of BXK'],BXK,21.0,NW,38.892,315,33.420417,-112.686181


#### Calculate Lat/Long Coordinates for Sighted UAS using geopy

In [13]:
uas_lat = []
uas_long = []


for i in range(len(new_df)):
    lat_airport = pd.to_numeric(new_df['Airport_Latitude'][i])
    long_airport = pd.to_numeric(new_df['Airport_Longitude'][i])
    b = pd.to_numeric(new_df['Bearing_Degrees'][i])
    d = pd.to_numeric(new_df['Distance_Kilometers'][i])
    
    origin = geopy.Point(lat_airport, long_airport)
    destination = geodesic(kilometers=d).destination(origin,b)
    
    lat2, lon2 = destination.latitude, destination.longitude
    
    uas_lat.append(lat2)
    uas_long.append(lon2)
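
As a sanity check on the geopy results, the destination point can be approximated with the standard great-circle formula on a sphere. This is not what `geodesic` does internally, since geopy solves the problem on an ellipsoid. The sketch below assumes a spherical Earth with a mean radius of about 6371 km, so its output should be close to, but not identical to, the values computed above. `destination_spherical` and `EARTH_RADIUS_KM` are hypothetical names.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical model (assumption)

def destination_spherical(lat_deg, lon_deg, bearing_deg, distance_km):
    """Great-circle destination point on a sphere, returned as (lat, lon) in degrees."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    theta = math.radians(bearing_deg)       # bearing, clockwise from north
    delta = distance_km / EARTH_RADIUS_KM   # angular distance in radians

    lat2 = math.asin(math.sin(lat1) * math.cos(delta)
                     + math.cos(lat1) * math.sin(delta) * math.cos(theta))
    lon2 = lon1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(lat1),
                             math.cos(delta) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lat2), math.degrees(lon2)

# 3 NM (5.556 km) NNW (338 degrees) of CLE, using the airport coordinates from the data
cle_lat, cle_lon = destination_spherical(41.409407, -81.854691, 338, 5.556)
```

At the short distances in this dataset, the spherical and ellipsoidal answers are expected to differ only by a small fraction of a degree, so a large disagreement would point to a units or bearing mix-up rather than a modelling difference.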

In [14]:
# Append UAS Lat/Long information to DataFrame
new_df['UAS_Latitude'] = uas_lat
new_df['UAS_Longitude'] = uas_long

In [15]:
new_df

Unnamed: 0,index,REMARKS,UAS_LOC,IDENT,Distance,Bearing,Distance_Kilometers,Bearing_Degrees,Airport_Latitude,Airport_Longitude,UAS_Latitude,UAS_Longitude
0,0,Aircraft observed alarge orange UAS with flash...,['3 NM NNW of CLE'],CLE,3.0,NNW,5.556,338,41.409407,-81.854691,41.455788,-81.879601
1,1,"Aircraft observed a UAS while at 5,000 feet 11...",['11 NM SSE of SLC'],SLC,11.0,SSE,20.372,158,40.788388,-111.977773,40.618259,-111.887586
2,2,Aircraft observed a multi-colored UAS off the ...,['7NM E of SAN'],SAN,7.0,E,12.964,90,32.733556,-117.189667,32.733479,-117.051359
3,3,"Aircraft observed a triangular shaped, grey an...",['23NM SW of ORL'],ORL,23.0,SW,42.596,225,28.545462,-81.332930,28.273338,-81.639923
4,4,Aircraft observed a red quad copter UAS while ...,['4NM SSE of MLB'],MLB,4.0,SSE,7.408,158,28.102750,-80.645250,28.040768,-80.617026
...,...,...,...,...,...,...,...,...,...,...,...,...
2799,2805,Aircraft reported a medium sized quad-copter U...,['3 NM SW of SAT'],SAT,3.0,SW,5.556,225,29.533958,-98.469057,29.498509,-98.509572
2800,2806,Aircraft reported a quad-copter UAS at 12 O'cl...,['10 NM SE of ADS'],ADS,10.0,SE,18.520,135,32.968556,-96.836444,32.850395,-96.696550
2801,2807,Aircraft reported a UAS 3 NM N of MMU while E ...,['3 NM N of MMU'],MMU,3.0,N,5.556,0,40.799338,-74.414889,40.849370,-74.414889
2802,2808,Aircraft reported a UAS sensor hit while N bou...,['21 NM NW of BXK'],BXK,21.0,NW,38.892,315,33.420417,-112.686181,33.668009,-112.982708


In [16]:
# Export to .csv file
new_df.to_csv('uas_latlong_standard_2804.csv', index = False)