# Upload FEMA Flood Claim Data
This notebook uploads the FEMA Flood Claim data in the `NC_Claims.csv` file (fetched using `FEMA_FloodClaims_ExtractData.ipynb`) to ArcGIS Online.

### A. Workspace Setup
* Import packages
* Log into Duke's AGOL account via the ArcGIS Pro application

In [12]:
#Packages
from pathlib import Path
import pandas as pd
from arcgis.gis import GIS, ItemProperties, ItemTypeEnum

In [13]:
#Log into ArcGIS Online
gis = GIS('pro')
print(f'Logged in as {gis.users.me.username}')

Logged in as ars158_dukeuniv


### B. Data Import & Wrangling
* Set the CSV filename
* Read the data into a dataframe, setting appropriate field types

In [14]:
#Set the CSV filename
the_csv = Path.cwd().parent/'data'/'raw'/'NC_Claims.csv'

In [15]:
#Read the CSV into a dataframe, setting appropriate dtypes
df = pd.read_csv(
    the_csv,
    parse_dates=['dateOfLoss'],
    dtype={
        'occupancyType': str,
        'reportedZipCode': str,
        'countyCode': str,
        'censusTract': str,
        'censusBlockGroupFips': str,
    })

#Display dataframe info; check for field types and missing values
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 109268 entries, 0 to 109267
Data columns (total 16 columns):
 #   Column                          Non-Null Count   Dtype              
---  ------                          --------------   -----              
 0   dateOfLoss                      109268 non-null  datetime64[ns, UTC]
 1   occupancyType                   109268 non-null  object             
 2   amountPaidOnBuildingClaim       83993 non-null   float64            
 3   totalBuildingInsuranceCoverage  109268 non-null  int64              
 4   yearOfLoss                      109268 non-null  int64              
 5   buildingDamageAmount            89218 non-null   float64            
 6   netBuildingPaymentAmount        109268 non-null  float64            
 7   causeOfDamage                   107407 non-null  object             
 8   floodEvent                      85463 non-null   object             
 9   state                           109268 non-null  object             
 

In [16]:
#Fill missing values in numeric columns with 0
df.fillna(0, inplace=True)

#Display the first few rows of the dataframe
df.head()

Unnamed: 0,dateOfLoss,occupancyType,amountPaidOnBuildingClaim,totalBuildingInsuranceCoverage,yearOfLoss,buildingDamageAmount,netBuildingPaymentAmount,causeOfDamage,floodEvent,state,reportedZipCode,countyCode,censusTract,censusBlockGroupFips,latitude,longitude
0,2011-08-28 00:00:00+00:00,1,2775.48,250000,2011,3776.0,2775.48,4,Hurricane Irene,NC,27948,37055,37055970300,370559703003,36.0,-75.7
1,1999-09-16 00:00:00+00:00,1,0.0,59800,1999,300.0,0.0,2,Hurricane Floyd,NC,28425,37141,37141920204,371419202042,34.6,-77.8
2,2008-08-27 00:00:00+00:00,1,0.0,250000,2008,4793.0,0.0,4,0,NC,28207,37119,37119002800,371190028002,35.2,-80.8
3,1999-09-16 00:00:00+00:00,3,12267.44,250000,1999,12768.0,12267.44,4,Hurricane Floyd,NC,27909,37139,37139960200,371399602003,36.3,-76.2
4,1998-08-26 00:00:00+00:00,1,23272.18,43000,1998,24486.0,23272.18,4,Hurricane Bonnie,NC,27817,37013,37013931000,370139310001,35.5,-77.0


### C. Upload the csv file to AGOL and publish as a table service

#### C1. Get/create the AGOL folder in which to add the CSV file: `FEMA_859`

In [17]:
#Set the folder name
folder_name = 'FEMA_859'

#Get the folder
the_folder = gis.content.folders.get(folder_name)
print(f'Folder found:{the_folder.name}')

#If the folder does not exist, create it
if(not the_folder): 
    the_folder = gis.content.folders.create(folder_name)
    print(f'Folder created:{the_folder.name}')

AttributeError: 'NoneType' object has no attribute 'name'

In [None]:
gis.url

'https://www.arcgis.com/'

#### C2. Add the CSV to the folder
* First, create the [ItemProperties](https://developers.arcgis.com/python/latest/api-reference/arcgis.gis.toc.html#itemproperties) object. This stores the properties for the CSV file we'll upload. 
* Then, use the [`folder.add()`](https://developers.arcgis.com/python/latest/api-reference/arcgis.gis.toc.html#arcgis.gis._impl._content_manager.Folder.add) function to upload the CSV file, referring to the item property object just created. 

In [None]:
#Create the item properties object
the_item_properties = ItemProperties(
    item_type=ItemTypeEnum.CSV,
    title='FEMA_FloodClaims11',
    description='FEMA flood claims data fetched from https://www.fema.gov/openfema-data-page/fima-nfip-redacted-claims-v2',
    tags="ENV859, FEMA, insurance, floods"
    )

In [None]:
#Add the item to the folder
csv_item = the_folder.add(
    item_properties=the_item_properties,
    file=str(the_csv)
).result()

AttributeError: 'NoneType' object has no attribute 'add'

In [None]:
# Confirm properties
print("Object type:\t",type(csv_item))
print("Item name:\t",csv_item.name)
print("Item type:\t",csv_item.type)

### D. Publish the CSV
The CSV file we just uploaded cannot be used in any on-line analysis; it can only be downloaded. [*Publishing*](https://developers.arcgis.com/python/latest/api-reference/arcgis.gis.toc.html#arcgis.gis.Item.publish) the CSV file creates a service from the object, which enables it to be used analytically. 
* The process of publishing a CSV file involves setting field types and field name aliases. 
    - Here, field aliases are assigned using the dictionary created below.  
    - Then, I've created a function that reads the data types of the columns in the dataframe, combines that with the aliases in the alias dictionary, and generates a dictionary (`field_props`) in the format AGOL requires in the publishing process. 
* The field properties dictionary is then added as component in another dictionary (`publish_params`) which stores all the settings used to publish the CSV as a feature service. 

>⚠️ *Note that the name of the service must be unique; if anything else on the entire portal has that name, and error will be raised.*

In [None]:
#Create a dictionary of aliases
aliasDict = {
    'dateOfLoss': 'Date of Loss',
    'occupancyType': 'Occupancy Type',
    'amountPaidOnBuildingClaim':'Amount Paid on Building Claim',
    'totalBuildingInsuranceCoverage':'Total Building Insurance Coverage',
    'yearOfLoss':'Year of Loss',
    'buildingDamageAmount':'Building Damage Amount',
    'netBuildingPaymentAmount':'Net Building Payment Amount',
    'causeOfDamage':'Cause of Damage Code',
    'floodEvent':'Flood Event',
    'state':'State',
    'reportedZipCode': 'Reported Zip Code',
    'countyCode': 'County Code',
    'censusTract': 'Census Tract',
    'censusBlockGroupFips': 'Census Block Group FIPS',
}

In [18]:
#Function to translate Pandas field definitions into ESRI field definitions
#https://developers.arcgis.com/rest/users-groups-and-items/publish-item/#csv-publish-parameters-json-properties
def get_field_params(df):

    #Reset indices
    df_temp = df.reset_index()

    #Create the lookup dictionary
    esri_fld = {
        'object':'esriFieldTypeString',
        'string':'esriFieldTypeString',
        'datetime64[ns]':'esriFieldTypeDate',
        'int32':'esriFieldTypeInteger',
        'int64':'esriFieldTypeInteger',
        'float64':'esriFieldTypeDouble',
        'datetime64[ns, utc]':'esriFieldTypeDate',
    }

    #Create an emtpy list
    fld_params = []

    #Iterate through fields; create and dict, and add to the list of dicts
    for fld in df_temp.columns:
        #Get the field type
        fld_type = str(df_temp[fld].dtype).lower()
        #Check that the field is convertable
        if not fld_type in list(esri_fld.keys()):
            print(f"{fld} is not a valid type ({fld_type})")
        #Get the esri field type
        esri_type = esri_fld[fld_type]
        #Get the field alias, using the original name if it has none
        if fld in aliasDict.keys(): 
            the_alias = aliasDict[fld] 
        else: 
            the_alias = fld
        #Construct the dict
        fld_dict = {'name':fld, 'type': esri_type, 'alias':the_alias}
        #If the field is a string, set the length
        if fld_type == 'object': 
            fld_dict['length'] = int(df_temp[fld].str.len().max())
        #Append the dict to the list
        fld_params.append(fld_dict)

    return(fld_params)

In [19]:
#Create the field properties dictionary via the function above
field_props = get_field_params(df)

NameError: name 'aliasDict' is not defined

In [20]:
#Create the item properties dictionary
publish_params = {
    'name':'FEMA Flood Claims',
    'type':'Table',
    'locationType':'none',
    'layerInfo':{
        'fields':field_props
    }
}

NameError: name 'field_props' is not defined

In [None]:
#Publish the service as a Feature Layer
the_service = csv_item.publish(publish_parameters=publish_params)

In [None]:
#Display the URL of the published service
the_service.url

In [None]:
#Share the service with everyone
the_service.share(everyone=True)