<a href="https://colab.research.google.com/github/noahcreany/EcologyCenter_SpatialPy/blob/main/1_EC_PythonIntro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Welcome to an Intro to GeoSpatial Analysis in Python

This is a general introduction to using open-source Python packages (the Python counterpart of R libraries) for mapping and spatial statistics.

Outline for Workshop:
1.   A brief intro to Python
2.   Wrangling geometries with GeoPandas
3.   Make a Map
4.   Spatial Statistics

*Note: Many of these modules were inspired by Python "courses" on Kaggle.com (A great source to get started in Python - https://www.kaggle.com/learn).*

Let's get started!




#A Brief Intro to Python

I'm going to assume some of you have programming experience in R; perhaps that's what brought you here. Either way, I'll cover some basic aspects of Python as it runs here in Google Colab (a hosted Jupyter Notebook).

In [None]:
# Comments are made with '#'
a_var = 3

print(a_var)
# Assigning objects is easy, and a variable can hold many types of values
a_list = ['item 1', 'item 2','item 3'] # or [1,2,3] for a numeric list
print(a_list)
# or, in a notebook, the last expression in a cell is displayed automatically
a_list

3
['item 1', 'item 2', 'item 3']


['item 1', 'item 2', 'item 3']
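
For readers coming from R, indexing is the habit most likely to trip you up. The sketch below rebuilds the list from the cell above (it is redefined here only so the snippet runs on its own) and shows that Python counts from 0, that negative indices count back from the end rather than dropping elements as they do in R, and that a slice excludes its stop position. The names `first`, `last`, and `first_two` are introduced just for this illustration.

```python
# Same list as in the cell above, redefined so this snippet is self-contained
a_list = ['item 1', 'item 2', 'item 3']

first = a_list[0]        # first element (R would use a_list[1])
last = a_list[-1]        # negative indices count from the end
first_two = a_list[0:2]  # slices include the start and exclude the stop

print(first)
print(last)
print(first_two)
```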

In [None]:
#Sometimes a dictionary is helpful for loops:
a_dict = {'item 1':1, 'item 2':2,'item 3':3}
print('Get Item 1: ', a_dict['item 1'])
print('Get Item 2: ', a_dict['item 2'])

Get Item 1:  1
Get Item 2:  2
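
The cell above only looks values up by key. As a small sketch of the loop use mentioned in its comment, `.items()` walks through key/value pairs; the `doubled` dictionary is a hypothetical name introduced for this illustration.

```python
# Same dictionary as above, redefined so this snippet is self-contained
a_dict = {'item 1': 1, 'item 2': 2, 'item 3': 3}

# .items() yields (key, value) pairs in insertion order
doubled = {}
for key, value in a_dict.items():
    doubled[key] = value * 2
    print(key, '->', value)

print(doubled)
```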


In [None]:
def a_function(num,mul):
  product = num*mul
  return product

a_function(7,9) 

63

#Pandas, the 'Tidyverse' of Python
Pandas is a "dataframe" package in Python.

It has a few conventions of its own, but the way you subset and manipulate variables carries over directly to GeoPandas, which simply adds geometries to a Pandas dataframe.

Pandas Cheatsheet:
https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf

GeoPandas Cheatsheet:
https://github.com/prasunkgupta/python-cheat-sheets/blob/master/geopandas-shapely-geopy.ipynb
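
As a rough sketch of what "subsetting" looks like in practice, the snippet below selects columns and filters rows on a small made-up table (the `toy` frame, its `site` and `count` columns, and their values are all hypothetical). The `import pandas as pd` line is covered in the next cell. The same patterns should carry over to a GeoDataFrame.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    'site': ['A', 'B', 'C', 'D'],
    'count': [12, 3, 7, 20],
})

one_column = toy['count']                       # a single column (a Series)
two_columns = toy[['site', 'count']]            # a list of names gives a DataFrame
big_sites = toy[toy['count'] > 5]               # boolean mask keeps matching rows
big_names = toy.loc[toy['count'] > 5, 'site']   # filter rows and pick a column at once
first_row = toy.iloc[0]                         # position-based selection

print(big_sites)
print(list(big_names))
```

Bracket syntax is used for `count` on purpose: `toy.count` would refer to the DataFrame's `count` method rather than the column.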

In [None]:
#import packages in python, often abbreviated like this:
import pandas as pd

In [None]:
import numpy as np
import geopandas as gpd
import requests

In [None]:
#If geopandas is not found, install it and then re-run the import cell above
!pip install geopandas

In [None]:
df = pd.DataFrame(a_dict, index = [1])
df.head()

Unnamed: 0,item 1,item 2,item 3
1,1,2,3


In [None]:
#print columns
for i in df.columns: print(i)

item 1
item 2
item 3


In [None]:
#list of columns
cols = list(df.columns)
cols

['item 1', 'item 2', 'item 3']

In [None]:
#strip the spaces from each column name
renamecols = [s.replace(" ", "") for s in cols]
renamecols

['item1', 'item2', 'item3']

In [None]:
dict(zip(cols,renamecols))

{'item 1': 'item1', 'item 2': 'item2', 'item 3': 'item3'}

In [None]:
df = df.rename(columns=dict(zip(cols,renamecols)))
df.head()

Unnamed: 0,item1,item2,item3
1,1,2,3
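
The list, `zip`, and `rename` steps above spell out the mechanics. As a more compact alternative (not the approach used in this notebook), pandas can strip the spaces in one line. The `toy` frame below is a hypothetical stand-in for `df`, and `renamed_a` and `renamed_b` are names introduced for this sketch.

```python
import pandas as pd

# Hypothetical one-row frame with spaces in its column names
toy = pd.DataFrame({'item 1': [1], 'item 2': [2], 'item 3': [3]})

# Option 1: vectorised string method on the column Index
renamed_a = toy.copy()
renamed_a.columns = renamed_a.columns.str.replace(' ', '', regex=False)

# Option 2: pass a function to rename; it is applied to each column name
renamed_b = toy.rename(columns=lambda name: name.replace(' ', ''))

print(list(renamed_a.columns))
print(list(renamed_b.columns))
```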


In [None]:
df.item1 = df.item1.mul(1000)
df.item2 = np.log10(df.item2)
df.item3 = np.log2(df.item3)
df.head()

Unnamed: 0,item1,item2,item3
1,1000,0.30103,1.584963


In [None]:
df = df.round(3)
df.head()

Unnamed: 0,item1,item2,item3
1,1000,0.301,1.585


##Webscraping and setting up for GeoPandas

Let's grab some COVID-19 data from the NYTimes GitHub page; it's very easy to import into pandas.

In [None]:
url = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv'
df = pd.read_csv(url)
df.head()

Unnamed: 0,date,state,fips,cases,deaths
0,2020-01-21,Washington,53,1,0
1,2020-01-22,Washington,53,1,0
2,2020-01-23,Washington,53,1,0
3,2020-01-24,Illinois,17,1,0
4,2020-01-24,Washington,53,1,0
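
By default, `read_csv` leaves the `date` column as plain text. A minimal sketch of parsing it while reading follows, using the `parse_dates` argument of `pd.read_csv`; the three sample rows are copied from the preview above and inlined so the snippet runs without a network connection. `sample_csv`, `sample`, and `span` are names introduced for this illustration.

```python
import io
import pandas as pd

# Three rows in the same layout as us-states.csv, inlined for offline use
sample_csv = io.StringIO(
    "date,state,fips,cases,deaths\n"
    "2020-01-21,Washington,53,1,0\n"
    "2020-01-22,Washington,53,1,0\n"
    "2020-01-24,Illinois,17,1,0\n"
)

# parse_dates converts the named column to datetimes while reading
sample = pd.read_csv(sample_csv, parse_dates=['date'])

# With real datetimes, subtraction gives a Timedelta
span = sample['date'].max() - sample['date'].min()
print(sample.dtypes)
print(span.days)
```

With real datetimes, date arithmetic becomes available. As text, ISO-formatted dates still sort correctly, which is why `min()` and `max()` work in the next cell.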


In [None]:
#How many states in DF? (the data also lists D.C. and U.S. territories, so expect more than 50)
print('Number of States: ',len(df.state.unique()))

#Date Range
print('DateRange:')
print('First: ', df.date.min(), '| Last:',df.date.max())

Number of States:  56
DateRange:
First:  2020-01-21 | Last: 2022-04-03
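
One caution before aggregating: in the NYT files, `cases` and `deaths` are running totals to date rather than new counts per day, so adding them up across dates counts the same cases many times. The sketch below uses a made-up two-state table (all names and numbers are hypothetical) to compare a sum with two alternatives: taking the latest value per state, and differencing to recover daily new cases.

```python
import pandas as pd

# Hypothetical miniature of the NYT layout: `cases` is a running total
toy = pd.DataFrame({
    'date':  ['2020-03-01', '2020-03-02', '2020-03-03',
              '2020-03-01', '2020-03-02', '2020-03-03'],
    'state': ['A', 'A', 'A', 'B', 'B', 'B'],
    'cases': [10, 15, 30, 5, 5, 8],
})

# Summing a running total counts early cases again on every later day
summed = toy.groupby('state')['cases'].sum()

# The latest row per state holds the cumulative count to date
latest = toy.sort_values('date').groupby('state')['cases'].last()

# New cases per day are the day-to-day differences within each state;
# the first day of each state has no previous row, so keep its own total
toy['new_cases'] = toy.groupby('state')['cases'].diff().fillna(toy['cases'])

print(summed.to_dict())
print(latest.to_dict())
print(toy.groupby('state')['new_cases'].sum().to_dict())
```

In this toy table, the per-state sum of `new_cases` matches `latest`, while `summed` is larger than either.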


In [None]:
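#Which state has the most cases? Caution: `cases` is already a running total, so this sum adds up
#every daily total and is far larger than the true count (.max() per state gives the cumulative figure)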
statesorted = df.groupby('state')['cases'].sum().sort_values(ascending = False)
print(statesorted)

state
California                  2441868410
Texas                       1931087037
Florida                     1625137686
New York                    1341605937
Illinois                     878593174
Pennsylvania                 749691563
Ohio                         724264887
Georgia                      721949950
North Carolina               675172785
New Jersey                   622583773
Michigan                     616134899
Arizona                      569827891
Tennessee                    562278022
Indiana                      484674155
Massachusetts                448593084
Wisconsin                    443752805
Virginia                     440020028
Missouri                     408955726
South Carolina               397935909
Minnesota                    386315335
Alabama                      371785869
Colorado                     352980687
Louisiana                    347739360
Washington                   328897523
Kentucky                     326162597
Oklahoma           

56