# Implementing Pandas with GIS Data

The ArcGIS API for Python gives us this flexibility: it reads GIS data into Pandas DataFrames and writes the results back out as GIS data!

In [None]:
import os
import arcpy
import arcgis
import pandas as pd

### Transforming Local GIS Data to a Pandas DataFrame

In [None]:
ca_city_boundary = 'ca_cities_boundaries'
sdf = pd.DataFrame.spatial.from_featureclass(ca_city_boundary)

In [None]:
sdf

Now we can manipulate the table with Pandas!

In [None]:
sdf['Pop2010'].mean()

In [None]:
# Let's keep only the boundaries of cities whose
# population is greater than the average city population

In [None]:
avg = sdf['Pop2010'].mean()

query = sdf['Pop2010'] > avg

sdf.loc[query]
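
To see what the boolean mask is doing without needing the feature class, here is a minimal sketch on a made-up table. The city names and populations are invented for illustration, and the `toy` DataFrame is only a stand-in for `sdf`.

```python
import pandas as pd

# Hypothetical stand-in for the spatially enabled DataFrame; all values are invented.
toy = pd.DataFrame({
    "NAME": ["A", "B", "C", "D"],
    "Pop2010": [1_000, 5_000, 20_000, 94_000],
})

avg = toy["Pop2010"].mean()      # 30000.0
mask = toy["Pop2010"] > avg      # boolean Series with one True/False per row
large = toy.loc[mask].copy()     # copy() gives an independent DataFrame

print(large)
```

Only row `D` survives, because it is the only one above the mean. The same pattern applies unchanged to `sdf`, since the mask is just a Series of booleans aligned to the rows.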

In [None]:
large_cities_df = sdf.loc[query]

In [None]:
large_cities_df

Now let's save all of the cities that have a population greater than the average <br>
as a new feature class!

In [None]:
# Keep the exported feature class from being added to the active map automatically
arcpy.env.addOutputsToMap = False

In [None]:
output_fc = os.path.join(arcpy.env.workspace, "larger_cities")
large_cities_df.spatial.to_featureclass(output_fc, sanitize_columns=False)

We can also export this data to CSV or Excel, whatever we need. <br>
However, since the SHAPE column holds spatial data, it makes sense to hold on to it and export to a feature class.
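
As a rough sketch of the CSV route, the geometry column can be dropped before writing plain text. The table below is invented, and the `"<geometry>"` strings are placeholders standing in for real geometry objects.

```python
import io
import pandas as pd

# Hypothetical table; the SHAPE values are placeholders, not real geometries.
toy = pd.DataFrame({
    "NAME": ["Alpha", "Beta"],
    "Pop2010": [1200, 3400],
    "SHAPE": ["<geometry>", "<geometry>"],
})

# Write to an in-memory buffer here; pass a file path instead to create a real file.
buffer = io.StringIO()
toy.drop(columns=["SHAPE"]).to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(csv_text)
```

The output has only the attribute columns, which is the trade-off described above: the CSV is easy to share, but the spatial information is gone.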

### Getting a REST Service as a DataFrame

Previously we created feature layers on ArcGIS Online using the API. <br>
How can we interact with them as DataFrames?

Let's say you need to issue a report of all the most recent earthquakes in the world with a magnitude greater than 5,<br>
and this report needs to be recreated at the end of every month.<br>
Luckily, Esri serves a REST service of earthquakes that is updated continuously.<br>
In the output report, we want to clean up the table so it includes only the columns we want.<br>

In [None]:
# Connect using the account currently signed in to ArcGIS Pro
mygis = arcgis.GIS('pro')

In [None]:
earthquakes = r'https://services9.arcgis.com/RHVPKKiFTONKtxq3/ArcGIS/rest/services/USGS_Seismic_Data_v1/FeatureServer/0'

earthquakes_fl = arcgis.features.FeatureLayer(earthquakes)

In [None]:
earthquakes_fl

In [None]:
earthquakes_fl.query().sdf

In [None]:
earthquakes_df = earthquakes_fl.query().sdf

In [None]:
cols_we_want = ['id', 'mag', 'place', 'eventTime', 'dmin', 'longitude', 'latitude', 'SHAPE']

earthquakes_df = earthquakes_df[cols_we_want]
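
Selecting columns by name raises a `KeyError` if any of them is absent, which could happen if the service's schema changes between monthly runs. A small check like the one below can make that failure easier to diagnose. The helper name `missing_columns` and the example schema are hypothetical, not part of the ArcGIS API.

```python
def missing_columns(wanted, available):
    """Return the names in `wanted` that are absent from `available`, in order."""
    available = set(available)
    return [name for name in wanted if name not in available]


cols_we_want = ['id', 'mag', 'place', 'eventTime', 'dmin', 'longitude', 'latitude', 'SHAPE']

# Hypothetical schema in which 'dmin' has been removed from the service.
hypothetical_columns = ['OBJECTID', 'id', 'mag', 'place', 'eventTime',
                        'longitude', 'latitude', 'SHAPE']

missing = missing_columns(cols_we_want, hypothetical_columns)
print(missing)
```

In the notebook, the same helper could be called as `missing_columns(cols_we_want, earthquakes_df.columns)` before subsetting.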

Let's filter the dataset for all records with a magnitude greater than 5!

In [None]:
query = earthquakes_df['mag'] > 5

In [None]:
earthquakes_df = earthquakes_df.loc[query].copy()
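
The filter above runs in Pandas after every record has been downloaded. An alternative is to let the service do the filtering through the `where` argument of `FeatureLayer.query`. The sketch below only builds the query string, and it rests on assumptions: that "most recent" means the current calendar month, that `eventTime` is a date field, and that the service accepts `TIMESTAMP '...'` literals. The helper names are hypothetical.

```python
from datetime import date


def month_window(today):
    """Return (first day of today's month, first day of the following month)."""
    start = today.replace(day=1)
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end


def build_where(today, min_mag=5):
    """Build a SQL where clause for one calendar month (assumed field names)."""
    start, end = month_window(today)
    return (
        f"mag > {min_mag} "
        f"AND eventTime >= TIMESTAMP '{start:%Y-%m-%d} 00:00:00' "
        f"AND eventTime < TIMESTAMP '{end:%Y-%m-%d} 00:00:00'"
    )


where = build_where(date(2024, 12, 31))
print(where)
```

If the service accepts the clause, it could be passed as `earthquakes_fl.query(where=where).sdf`, so that only the matching records are transferred.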

In [None]:
output_fc = os.path.join(arcpy.env.workspace, 'Earthquakes_Report')
earthquakes_df.spatial.to_featureclass(output_fc, sanitize_columns=False)

Now let's export the report to an Excel file!

In [None]:
earthquakes_df = earthquakes_df.drop(columns=['SHAPE'])

In [None]:
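# data_folder is not defined earlier in this notebook; the current working
# directory is used here as an assumption. Point it at your own output folder.
data_folder = os.getcwd()
output_excel_earthquakes = os.path.join(data_folder, 'Earthquakes_Report.xlsx')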

earthquakes_df.to_excel(output_excel_earthquakes, index=False, sheet_name='from_python')
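
Because the report is regenerated every month, a fixed file name means each run overwrites the previous one. One possible approach, sketched here with a hypothetical helper and folder name, is to put the year and month in the file name.

```python
import os
from datetime import date


def monthly_report_path(folder, run_date, stem="Earthquakes_Report", ext=".xlsx"):
    """Build a path such as <folder>/Earthquakes_Report_2024-03.xlsx."""
    return os.path.join(folder, f"{stem}_{run_date:%Y-%m}{ext}")


# "reports" is a placeholder folder name; date.today() would be used in a real run.
example_path = monthly_report_path("reports", date(2024, 3, 31))
print(example_path)
```

The resulting path can be passed to `to_excel` in place of the fixed name, keeping one file per month.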