<a href="https://colab.research.google.com/github/noahcreany/EcologyCenter_SpatialPy/blob/main/3_Wrangling_Spatial_Data_with_GeoPandas.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#GeoPandas
GeoPandas is a Python library built on Pandas for reading and working with spatial data. Essentially, GeoPandas lets you work with a file's geometries and manipulate its attribute table together, all in code.
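
To make the "geometries plus attribute table" idea concrete, here is a minimal sketch that builds a GeoDataFrame by hand. The area names, agencies, acreages, and square geometries are made up for illustration and are not taken from the Utah dataset.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Hypothetical toy data: two rectangular "areas" with a small attribute table
gdf = gpd.GeoDataFrame(
    {
        "Name": ["Area A", "Area B"],
        "Admin": ["BLM", "USFS"],
        "Acres": [100.0, 250.0],
    },
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(2, 0), (4, 0), (4, 1), (2, 1)]),
    ],
    crs="EPSG:4326",
)

# Attribute columns behave like ordinary Pandas columns
admin_acres = gdf.groupby("Admin")["Acres"].sum()
print(admin_acres)

# The geometry column adds spatial properties, e.g. the overall bounding box
print(gdf.total_bounds)  # minx, miny, maxx, maxy
```

The same pattern applies to data read from a file: `gpd.read_file` fills in both the attribute columns and the `geometry` column for you.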

*First we have to make GeoPandas work in Google Colab*

In [None]:
%%time
# Install GeoPandas in Colab
!apt install libspatialindex-dev
!pip install rtree
!pip install geopandas

In [None]:
import geopandas as gpd

Download the Utah Wilderness Areas dataset and read it into a GeoDataFrame

In [None]:
#Named wilderness_url to avoid shadowing Python's built-in zipfile module
wilderness_url = 'https://opendata.arcgis.com/datasets/61aa9c31ec96412480f99990d5668a98_0.zip'
utah_wild = gpd.read_file(wilderness_url)
utah_wild.head()

In [None]:
#Which agency manages Utah Wilderness?
utah_wild.Admin.value_counts()

In [None]:
#How much land does each manage?
utah_wild.groupby('Admin')['Acres'].sum()

In [None]:
#How much land does each manage as % of whole?
(utah_wild.groupby('Admin')['Acres'].sum()/(utah_wild.Acres.sum())).mul(100).round(2)

In [None]:
utah_wild.plot()

Let's add the State and county boundaries for some context

In [None]:
state = gpd.read_file('https://opendata.arcgis.com/datasets/8344c33ec2114341a59c4c1d72bcf38a_0.zip')
state = state.loc[state.STATE=='Utah']
counties = gpd.read_file('https://opendata.arcgis.com/datasets/90431cac2f9f49f4bcf1505419583753_0.zip')

In [None]:
state.head()

In [None]:
ax = state.plot(figsize=(10,10), color = 'none',edgecolor = 'black',zorder = 3)
counties.plot(color = 'none',edgecolor = 'lightgrey',ax =ax)
utah_wild.plot(color = 'forestgreen',ax =ax)

#Coordinate Reference Systems

Let's check the CRS of the shapefiles we already have. Adding ```.crs``` to a GeoDataFrame returns its coordinate reference system, including the EPSG code.

In [None]:
#Before we can match our CRS, check what each layer currently uses
print('utah_wild: ', utah_wild.crs)
print('counties: ', counties.crs)
print('state: ',state.crs)
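
Reading three printouts by eye works for a handful of layers. As a hedged sketch, the check can also be done in code. The helper `share_crs` below is a hypothetical name introduced here, not a GeoPandas function, and the two one-point layers are toy data.

```python
import geopandas as gpd
from shapely.geometry import Point


def share_crs(*layers):
    """Return True when every layer reports the same CRS (hypothetical helper)."""
    first = layers[0].crs
    return all(layer.crs == first for layer in layers)


# Toy layers: one in longitude/latitude, one in Web Mercator meters
lonlat = gpd.GeoSeries([Point(-111.9, 40.8)], crs="EPSG:4326")
mercator = gpd.GeoSeries([Point(0, 0)], crs="EPSG:3857")

# .crs is a pyproj CRS object, so it exposes more than the printed name
print(lonlat.crs.to_epsg(), lonlat.crs.is_geographic)
print(mercator.crs.to_epsg(), mercator.crs.is_projected)

print(share_crs(lonlat, mercator))  # the layers differ, so they need reprojecting
print(share_crs(lonlat, lonlat))
```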

Let's reproject our data to Web Mercator (EPSG:3857, also called WGS 84 / Pseudo-Mercator), the projection used by most web basemaps. To do this we add ```.to_crs(epsg=3857)``` to the GeoDataFrame

In [None]:
utah_wild_t = utah_wild.to_crs(epsg=3857)
counties_t = counties.to_crs(epsg=3857)
state_t = state.to_crs(epsg=3857)
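
To give a feel for what the reprojection to EPSG:3857 does to each coordinate, here is a minimal sketch of the spherical Web Mercator formulas in plain Python. The function name `lonlat_to_web_mercator` is hypothetical and the sample coordinate is only roughly in northern Utah. In practice `.to_crs` handles this (and datum details) for you.

```python
import math

# Sphere radius used by EPSG:3857 (the WGS 84 semi-major axis, in meters)
R = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# An illustrative point: x is negative (west of Greenwich), y positive (north of the equator)
x, y = lonlat_to_web_mercator(-111.89, 40.76)
print(round(x), round(y))
```

This is why the axis values change from degrees to large numbers in meters after reprojecting.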

Let's add a satellite background to the map using Contextily



```
!pip install contextily
```



In [None]:
!pip install contextily

In [None]:
import contextily as cx

#Let's take a look at all of the basemaps we have at our disposal!
cx.providers

We'll reuse the same plotting code. After we've added our layers to ```ax```, we add the following line, swapping ```provider.mapname``` for a provider such as ```NASAGIBS.BlueMarble```


```
cx.add_basemap(ax, source = cx.providers.provider.mapname)
```



In [None]:
ax = state_t.plot(figsize=(15,15), color = 'none',edgecolor = 'black',zorder = 3)
counties_t.plot(color = 'none',edgecolor = 'lightgrey',ax =ax)
utah_wild_t.plot(color = 'forestgreen',ax =ax)

cx.add_basemap(ax, source = cx.providers.NASAGIBS.BlueMarble)
ax.axis('off') #remove x,y axes


#Data Manipulation

Let's add the County to the ```utah_wild``` dataset so we can see how much wilderness is in each county.




In [None]:
counties.columns

In [None]:
#Spatial join: attach the attributes of each county that a Wilderness area intersects
utah_wild = gpd.sjoin(utah_wild, counties, how='inner')
utah_wild.head()

In [None]:
utah_wild.sample(5)

Let's clean up our newly merged GeoDataFrame

In [None]:
utah_wild.columns

In [None]:
utah_wild = utah_wild.rename(columns = {'NAME':'County'})

In [None]:
#How many acres of Wilderness in each Utah County
utah_wild.groupby('County')['Acres'].sum()
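
One caveat with summing `Acres` after a spatial join: `sjoin` matches on intersection by default, so an area that straddles a county line appears once per county, and its full acreage is counted in each. The sketch below shows this on made-up toy geometries and, as one possible alternative, uses `gpd.overlay` to cut the polygon at the county line before measuring area. The column names and coordinates are assumptions for illustration only.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data: two square "counties" side by side, and one
# "wilderness" rectangle that straddles their shared border at x = 10
toy_counties = gpd.GeoDataFrame(
    {"County": ["A", "B"]},
    geometry=[box(0, 0, 10, 10), box(10, 0, 20, 10)],
    crs="EPSG:26912",
)
toy_wild = gpd.GeoDataFrame(
    {"Name": ["Toy Wilderness"]},
    geometry=[box(8, 2, 14, 4)],
    crs="EPSG:26912",
)

# sjoin keeps the whole polygon for every county it intersects
joined = gpd.sjoin(toy_wild, toy_counties, how="inner")
print(len(joined))  # one input polygon, but one row per intersecting county

# overlay splits the polygon at the border, so each piece has its own area
pieces = gpd.overlay(toy_wild, toy_counties, how="intersection")
pieces["area_m2"] = pieces.geometry.area
area_by_county = pieces.groupby("County")["area_m2"].sum()
print(area_by_county)
```

Areas computed this way are in the units of the CRS (square meters here), so they would need converting before comparing with an `Acres` column.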

In [None]:
#How many acres of Wilderness per person based on recent Census estimate?
(utah_wild.groupby('County')['Acres'].sum()/utah_wild.groupby('County')['POP_LASTCE'].mean()).round(2)

How far away is the nearest Wilderness area for Utah Cities?

In [None]:
utah_cities = gpd.read_file('https://opendata.arcgis.com/datasets/543fa1f073714198a3dbf8a292bdf30c_0.zip')
utah_cities.geometry

In [None]:
#Let's use a projected UTM CRS (EPSG:26912, NAD83 / UTM zone 12N) for both GeoDFs so distances come out in meters
utah_wild_utm = utah_wild.to_crs(epsg= 26912)
utah_cities_utm = utah_cities.to_crs(epsg = 26912)
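
The reason for switching to a projected CRS before measuring distance can be seen with two toy points one degree of latitude apart. This is a minimal sketch with illustrative coordinates, not part of the Utah data: in EPSG:4326 the distance is expressed in degrees, while in EPSG:26912 it is expressed in meters.

```python
import geopandas as gpd
from shapely.geometry import Point

# Two illustrative points on the same meridian, one degree of latitude apart
pts = gpd.GeoSeries([Point(-111, 40), Point(-111, 41)], crs="EPSG:4326")

# In a geographic CRS the planar distance is just the difference in degrees
dist_degrees = pts.iloc[0].distance(pts.iloc[1])
print(dist_degrees)

# After reprojecting to UTM zone 12N the same distance is in meters
pts_utm = pts.to_crs(epsg=26912)
dist_meters = pts_utm.iloc[0].distance(pts_utm.iloc[1])
print(round(dist_meters))  # on the order of 111 km
```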

In [None]:
#Since the cities are polygons, let's represent each one as a point using the centroid of its polygon
utah_cities_utm['centroid'] = utah_cities_utm.geometry.centroid

In [None]:
utah_cities_utm['Distance_To_Wld'] = utah_cities_utm['centroid'].apply(lambda x: utah_wild_utm.geometry.distance(x).min())
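
The `apply` above measures the distance from each centroid to every Wilderness polygon and keeps the minimum. As a hedged alternative, assuming GeoPandas 0.10 or newer, `gpd.sjoin_nearest` can return both the nearest feature and the distance in one call. The sketch uses made-up points and boxes, and the column names `city`, `wild`, and `dist_m` are illustrative.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data in a projected CRS, so distances are in meters
toy_cities = gpd.GeoDataFrame(
    {"city": ["near", "far"]},
    geometry=[Point(0, 0), Point(100, 0)],
    crs="EPSG:26912",
)
toy_wild = gpd.GeoDataFrame(
    {"wild": ["W1", "W2"]},
    geometry=[box(3, -1, 5, 1), box(50, -1, 60, 1)],
    crs="EPSG:26912",
)

# For each city, find the closest wilderness polygon and record the distance
nearest = gpd.sjoin_nearest(toy_cities, toy_wild, how="left", distance_col="dist_m")
print(nearest[["city", "wild", "dist_m"]])
```

Unlike the `apply` version, this also tells you which Wilderness area is the nearest one, at the cost of possible duplicate rows when two areas are exactly tied.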

In [None]:
utah_cities_utm

In [None]:
print('Mean distance to the nearest Wilderness area: {} meters'.format(utah_cities_utm.Distance_To_Wld.mean()))


In [None]:
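#idxmin() returns the index label of the smallest distance, so look that row up with .loc
print('City closest to a Wilderness area ({} m):'.format(utah_cities_utm.Distance_To_Wld.min()))
print(utah_cities_utm.loc[utah_cities_utm.Distance_To_Wld.idxmin()])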

Let's plot the distance to Wilderness for these cities

In [None]:
utah_cities = gpd.GeoDataFrame(utah_cities)

In [None]:
#Make the centroid points the active geometry and keep only the columns we need
utah_cities = utah_cities_utm.set_geometry('centroid')[['centroid','Distance_To_Wld']]

#Reproject to match the basemap; we can use the .crs from another dataframe
utah_cities = utah_cities.to_crs(utah_wild_t.crs)

ax = utah_cities.plot(figsize=(15,15),marker='o',
                          column = 'Distance_To_Wld', cmap = 'hot_r', legend = True)
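#Stamen styles may be missing from newer contextily/xyzservices releases; if so, swap in e.g. cx.providers.CartoDB.Positron
cx.add_basemap(ax, source = cx.providers.Stamen.Toner)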
ax.axis('off') #remove x,y axes
