# Automated UAS SkyWatch Standard Format Remark Functions & Script

## Origin Location: NAVAID System
***
***DAEN690***

***George Mason University***

***Author:*** Grace Cox (Team LEGO)

***Date:*** October 19, 2021

***
***How to Use:***

`To obtain the Standard Format Remarks along with complete UAS Location information in the form of latitudes and longitudes, call: uas_lat_long('file_path')`

`This function will output the dataframe containing the complete UAS location information for Standard Format Remarks referencing a NAVAID System.`
***

## Import Statements

In [3]:
# Import Statements
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import chart_studio.plotly as py
import re
import geopy
from geopy.distance import geodesic

from IPython.display import display, HTML

## Read in Files

In [2]:
# Read in Incidents_Cleaned_Standard.csv (contains 5191 records that are of 'Standard Format')
faa_standard = pd.read_csv('C:/Users/grace/OneDrive/Desktop/GMU/DAEN690/Incidents_Cleaned_Standard.csv')

# REMARK MUST REFERENCE A NAVAID System

## uas_loc_NAVAID()

Create a Function that takes in a .csv file of SkyWatch records for which the Remarks are in the Standard Format and uses Regular Expressions to extract the portion of the REMARK that reads 'XX NM direction [of] XXX' from the remark where XXX references a NAVAID System

In [4]:
# Create Function FOR STANDARD FORMAT REMARKS that:
# A. grabs REMARKS field from input dataframe/file.csv containing STANDARD FORMAT REMARKS
# B. Uses Regular Expressions to extract 'XX NM direction [of] XXX' from the remark

def uas_loc_NAVAID (file):
    '''
    This function takes in a .csv file of SkyWatch records for which the Remarks are in the Standard Format and uses
    regular expressions to extract the portion of the remark that reads 'XX NM direction [of] XXX'.
    
    This function also exports the following .csv files:
    A) StandardRemark_UAS_Location_NAVAID_NULL -- Contains all UAS Locations from Standard Format Remarks (INCLUDING NULL VALUES)
    B) StandardRemark_UAS_Location_NAVAID_nonNULL -- Contains all UAS Locations from Standard Format Remarks (NOT INCLUDING NULL VALUES)
    
    @param file: the file path (str) for a .csv file containing SkyWatch reports for which the REMARKS are
                    in Standard Format
    
    @output df1 : a dataframe containing the Standard Format REMARK as well as the UAS Location portion of the remark
                that reads 'XX NM direction [of] XXX' as well as the associated NAVAID System information linked with the
                NAVAID Identifier.
    '''
    # Read in .csv file provided by user
    faa_standard = pd.read_csv(file)
    
    # Read in Cleaned NAVAID Systems dataset
    nav_clean = pd.read_csv('C:/Users/grace/OneDrive/Desktop/GMU/DAEN690/NAVAID_cleaned.csv')
    
    # Create List that contains each Standard Remark, and the Heading/Direction
    # information contained in each remark

    remark_uas_loc = []
    remarks = faa_standard['REMARKS']

    # regular expression for any heading/direction
    headir_regex = '\.?[0-9]\.?[0-9]*[0-9]*\s?NM* [N|S|E|W|NW|NE|SW|SE|SSE|SSW|SNE|SNW|NNE|NSE|NNE|NNW|WSW|WNW|WSE|WNE|ENE|ESE|ESW|ENW]*\s?of?\s?[A-Z][A-Z][A-Z][A-Z]?'

    # Loop through all remarks and search for the heading/direction regex above
    for i in range(len(remarks)):
        head_dir = re.findall(headir_regex, remarks[i])
        remark_uas_loc.append(remarks[i])
        remark_uas_loc.append(head_dir)

    # Split Remarks and Heading/Directions into two seperate lists and create
    # pandas dataframe
    remark = []
    uas_loc = []

    for i in range(0, len(remark_uas_loc), 2):
        remark.append(remark_uas_loc[i])
        uas_loc.append(remark_uas_loc[i+1])

    remark_uas_loc_df = pd.DataFrame()
    remark_uas_loc_df['REMARKS'] = remark
    remark_uas_loc_df['UAS Location'] = uas_loc
    
    # Export final dataframe (INCLUDING NULL UAS LOCATIONS) to .csv file
    remark_uas_loc_df.to_csv('StandardRemark_UAS_Location_NAVAID_NULL.csv', index = False)
    
    # Get list of UAS locations from the above dataframe
    uas_loc = remark_uas_loc_df['UAS Location'] 

    # If the regular expressions did not hit on any location information, pass it UNKN for now
    for i in range(len(remark_uas_loc_df)):
        if len(uas_loc[i]) == 0:
            uas_loc[i] = 'UNKN'
    
    uas_loc_nonNull = uas_loc[uas_loc != 'UNKN'].to_list()
    uas_navaid = []

    for i in range(len(uas_loc_nonNull)):
        navaid = uas_loc_nonNull[i][0].split(' ')[-1]

        if len(navaid) <= 4:
            uas_navaid.append(navaid)
        else:
            trim_air = navaid[-3:]
            uas_navaid.append(trim_air) # we perform this trim to account for a lack of space, for example: 'ofCLE'
    
    uas_navaid_df = pd.DataFrame()
    uas_navaid_df['IDENT'] = uas_navaid
    
    # Create Dataframe of Standard Format Remarks with NON NULL UAS Locations
    remark_uas_loc_nn = remark_uas_loc_df[remark_uas_loc_df['UAS Location'] != 'UNKN'].reset_index()
    
    # Export final dataframe (EXCLUDING NULL UAS LOCATIONS) to .csv file
    remark_uas_loc_nn.to_csv('StandardRemark_UAS_Location_NAVAID_nonNULL.csv', index = False)
    
    # JOIN DATASETS TO HAVE ALL LOCATION INFORMATION FROM REMARK

    # DATAFRAMES USED:
    # airportsC = airports_cleaned.csv (the cleaned airports dataset)
    # uas_airport_df = dataframe of FAA Airport Identifiers found in the 3033 Standard Formats hit on by Regular Expressions
    # full_loc = dataframe containing the UAS sighting remark from SkyWatch as well as the UAS location information extracted

    uas_nav_loc= pd.merge(uas_navaid_df, nav_clean, on='IDENT', how='left')
    full_loc = pd.DataFrame()
    full_loc['REMARKS'] = remark_uas_loc_nn['REMARKS']
    full_loc['UAS_LOC'] = remark_uas_loc_nn['UAS Location']

    df1 = pd.concat([full_loc, uas_nav_loc], axis = 1) #dataframe with final UAS Location information READY FOR CALCULATIONS
                                                                # from the 3,033 records extracted from the Standard Format Remarks
    
    df1 = df1.dropna().reset_index() # DROP ALL values that do not reference an airport using FAA Identifier
    
    return df1

In [6]:
uas_loc_NAVAID('C:/Users/grace/OneDrive/Desktop/GMU/DAEN690/Incidents_Cleaned_Standard.csv')

Unnamed: 0,index,REMARKS,UAS_LOC,IDENT,LONGITUDE,LATITUDE,NAME_TXT
0,4,"Aircraft observed a triangular shaped, grey an...",[23NM SW of ORL],ORL,-81.335022,28.542730,ORLANDO
1,5,Aircraft observed a red quad copter UAS while ...,[4NM SSE of MLB],MLB,-80.635343,28.105287,MELBOURNE
2,6,Aircraft observed a small black rotary wing UA...,[31NM SE of LAX],LAX,-118.432019,33.933149,LOS ANGELES
3,7,Aircraft observed a blackquad UAS off the left...,[3NM NE of LAX],LAX,-118.432019,33.933149,LOS ANGELES
4,9,Aircraft observed a red UAS off the left side ...,[7NM NE of LGA],LGA,-73.868602,40.783724,LA GUARDIA
...,...,...,...,...,...,...,...
1703,3023,Aircraft reported a red and silver quad-copter...,[7 NM W of SLI],LAX,-118.432019,33.933149,LOS ANGELES
1704,3024,Aircraft reported a white quad-copter UAS off ...,[8 NM NW of SEA],ITO,-155.010969,19.721354,HILO
1705,3026,Aircraft reported a black quad-copter UAS whil...,[2NM SE of LGA],TCH,-111.981924,40.850256,WASATCH
1706,3029,Aircraft reported a UAS 3 NM N of MMU while E ...,[3 NM N of MMU],ATL,-84.435069,33.629083,ATLANTA


## bearing_dist_NAVAIDIdent()

Create a Function that takes in a .csv file of SkyWatch records for which the Remarks are in the Standard Format and uses the uas_loc_NAVAID() function to split UAS Locations extracted from Remarks into their origin identifier, distance (in both NM and Kilometers) and bearing (as an abbreviation and in degrees).

In [5]:
def bearing_dist_NAVAIDIdent(file):
    '''
    This function takes in a .csv file (string) of SkyWatch records for which the Remarks are in the Standard Format
    and uses the uas_loc_NAVAID() function to split the UAS Location within the remark into the distance (in both NM and 
    kilometers), bearing abbreviation and degrees, and associated NAVAID System Identifier.
    
    @param file : the file path (str) for a .csv file containing SkyWatch reports for which the REMARKS are
                    in Standard Format
    @output new_df_validBearing: returns a dataframe containing the Remark, UAS Location, NAVAID System Identifier, Distance 
                                    (in NM) from the airport, and bearing abbreviation (WHERE ALL BEARINGS ARE VALID)
                
                *** it should be noted that the full dataset that includes remarks with invalid bearing information is 
                    exported into the StandardRemark_UAS_Location_Converted_NAVAIDallBearing.csv file 
                    
                *** it should be noted that the dataset that only contains remarks with VALID bearing information is
                    exported into the StandardRemark_UAS_Location_General_NAVAIDvalidBearing.csv file
    '''
    # Call uas_loc_airportFAA() function created above
    df1 = uas_loc_NAVAID(file)
    
    # Create Dictionary of Bearings and Respective Degrees
    bearing_deg = {
    'N' : 0,
    'NNE' : 23,
    'NE' : 45,
    'ENE' : 68,
    'E' : 90,
    'ESE' : 113,
    'SE' : 135,
    'SSE' : 158,
    'S' : 180,
    'SSW' : 203,
    'SW' : 225,
    'WSW' : 248,
    'W' : 270,
    'WNW' : 293,
    'NW' : 315,
    'NNW' : 338
    }
    
    # Extract Distances and Bearings from UAS_LOC that were pulled from Remarks
    # with a Standard Format in SkyWatch
    uas_loc = df1['UAS_LOC']
    distances = []
    bearings = []

    for i in range(len(uas_loc)):
        split = uas_loc[i][0].split(' ')[0:3]

        if 'NM' in split[0]: #if there is no space between the distance and NM (i.e. 3NM)
            distance_nm = split[0][0:len(split[0]) - 2]
            bearing_abrev = split[1].replace('of', '')

        else: 
            distance_nm = split[0]
            bearing_abrev = split[2].replace('of', '')

        distances.append(float(distance_nm))
        bearings.append(bearing_abrev)
    
    # Create DataFrame to store UAS Location Information
    new_df = pd.DataFrame(df1['REMARKS'])
    new_df['UAS_LOC'] = df1['UAS_LOC']
    new_df['IDENT'] = df1['IDENT']
    new_df['Distance_NM'] = distances
    new_df['Bearing'] = bearings
    
    # Convert all distance from NM to kilometers and all bearing abbreviations to their associated degrees
    dist_kilo = []

    for i in range(len(new_df)):
        distanceKilo = new_df['Distance_NM'][i] * 1.852 # converting NM to kilometers
        dist_kilo.append(distanceKilo)

    bearDegree = pd.DataFrame(new_df['Bearing'])
    bearDegree = bearDegree.replace({'Bearing': bearing_deg})
    
    for i in range(len(bearDegree)):
        if bearDegree['Bearing'][i] == '':
            bearDegree.replace('', np.nan, inplace = True)

    new_df['Distance_Kilometers'] = dist_kilo
    new_df['Bearing_Degrees'] = bearDegree
    
    new_df['NAVAID_Latitude'] = df1['LATITUDE']
    new_df['NAVAID_Longitude'] = df1['LONGITUDE']
    
    # Export final new_df to .csv 
    new_df.to_csv('StandardRemark_UAS_Location_Converted_NAVAIDallBearing.csv', index = False)
    
    # Create Dataframe of Standard Format Remarks without NULL BEARING INFO
    new_df_validBearing = new_df.dropna().reset_index()
    
    # Export final dataframe (EXCLUDING NULL BEARING INFO) to .csv file
    new_df_validBearing.to_csv('StandardRemark_UAS_Location_General_NAVAIDvalidBearing.csv', index = False)
    
    return new_df_validBearing

In [11]:
#bearing_dist_NAVAIDIdent('C:/Users/grace/OneDrive/Desktop/GMU/DAEN690/Incidents_Cleaned_Standard.csv')

## uas_NAVAID_lat_long()

Calculates the latitude and longitude for the sighted UAS using the geopy library (***WHEN THE UAS SIGHTING REFERENCES A NAVAID System IDENTIFIER***)

In [9]:
def uas_NAVAID_lat_long(file):  
    '''
    This function takes in a .csv file (string) of SkyWatch records for which the Remarks are in the Standard Format
    and uses the bearing_dist_originIdent() function to output the dataframe containing complete UAS Location information
    in terms of the UAS' Latitude and Longitude.
    
    @param file : the file path (str) for a .csv file containing SkyWatch reports for which the REMARKS are
                    in Standard Format
    
    @output new_df: returns a dataframe containing the Remark, UAS Location, NAVAID System Identifier, Distance (in NM) from 
                    the airport, and bearing abbreviation (WHERE ALL BEARINGS ARE VALID)
                
                *** it should be noted that the full dataset that includes remarks for which UAS latitudes and longitudes
                    were calculated is located in the StandardRemark_UAS_LatLong_validBearing_NAVAIDIdent.csv file.
                    
    
    '''
    # Call bearing_dist_originIdent() function created above
    new_df = bearing_dist_NAVAIDIdent(file)
    
    uas_lat = []
    uas_long = []


    for i in range(len(new_df)):
        lat_airport = pd.to_numeric(new_df['NAVAID_Latitude'][i])
        long_airport = pd.to_numeric(new_df['NAVAID_Longitude'][i])
        b = pd.to_numeric(new_df['Bearing_Degrees'][i])
        d = pd.to_numeric(new_df['Distance_Kilometers'][i])

        origin = geopy.Point(lat_airport, long_airport)
        destination = geodesic(kilometers=d).destination(origin,b)

        lat2, lon2, = destination.latitude, destination.longitude

        uas_lat.append(lat2)
        uas_long.append(lon2)
    
    # Append UAS Lat/Long information to DataFrame
    new_df['UAS_Latitude'] = uas_lat
    new_df['UAS_Longitude'] = uas_long
    
    # Export final dataframe (EXCLUDING NULL BEARING INFO) to .csv file
    new_df.to_csv('StandardRemark_UAS_LatLong_validBearing_NAVAIDIdent.csv', index = False)
    
    return new_df

In [13]:
uas_NAVAID_lat_long('C:/Users/grace/OneDrive/Desktop/GMU/DAEN690/unknown_standard_complete.csv')

Unnamed: 0,index,REMARKS,UAS_LOC,IDENT,Distance_NM,Bearing,Distance_Kilometers,Bearing_Degrees,NAVAID_Latitude,NAVAID_Longitude,UAS_Latitude,UAS_Longitude
