# CSV to GeoDataframe with GeoPandas
If a CSV file includes coordinates, either a coordinate pair representing a point location or a series of coordinate pairs tracing a line or a polygon's perimeter, then we can use those coordinates to construct a geometric object and thus create a spatially enabled dataframe, which GeoPandas calls a <u>geodataframe</u>.

Here we focus on the steps involved in going from raw coordinate data stored in a field of a CSV file to a spatial dataframe. In doing so, we discuss the hierarchy of components that go into adding spatial elements to a dataframe: from geometries, to geoseries, and finally to geodataframes.

We'll start with the simplest example of creating a point spatial dataframe from a CSV file containing latitude and longitude coordinates. The data we'll use in this exercise is electric vehicle charging locations in North Carolina ([source](https://afdc.energy.gov/data_download)).

## 1. Constructing a Pandas dataframe from the CSV file
We'll use an API to fetch CSV data listing the electric vehicle charging locations in North Carolina and load that file directly into a familiar Pandas dataframe named `df_EVStations`.

In [None]:
#Import the requests and pandas libraries
import requests
import pandas as pd

In [None]:
#Construct the request
serviceURL = 'https://developer.nrel.gov/api/alt-fuel-stations/v1.csv'
parameters = {
    'access':'all',
    'api_key':'oA9dHswdtlpAx5qLEdV1StM1mUB8KsgWluSfoEuL',
    'fuel_type':'ELEC',
    'status':'all',
    'state':'NC',
    'download':'true'
}

In [None]:
#Process the request
response = requests.get(serviceURL,parameters)
df_EVStations = pd.read_csv(response.url)

In [None]:
#Run if the above fails...
#df_EVStations_All = pd.read_csv('./data/alt_fuel_stations (Nov 14 2019).csv',low_memory=False)
#df_EVStations = df_EVStations_All.query('State == "NC"').reset_index()
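
The request cell above hands `response.url` to `pd.read_csv`, which downloads the file a second time. A minimal sketch of an alternative, assuming the CSV text is already in memory, is to wrap that text in `io.StringIO` and parse it directly. The `csv_text` string and its values are made up so the example runs offline; with a live request it would be `response.text`.

```python
import io
import pandas as pd

# Hypothetical stand-in for response.text; these rows are invented for illustration
csv_text = (
    "Station Name,Latitude,Longitude\n"
    "Example Station A,35.99,-78.90\n"
    "Example Station B,35.78,-78.64\n"
)

# Parse the in-memory text instead of requesting the URL again
df_example = pd.read_csv(io.StringIO(csv_text))
print(df_example.shape)
```

With a live request, calling `response.raise_for_status()` before parsing would surface HTTP errors early rather than letting them show up as a malformed dataframe.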

In [None]:
#Examine the columns, noting the data include "Latitude" and "Longitude" columns
df_EVStations.columns

## 2. Creating geometries from latitude and longitude coordinates
Now that we have our dataframe with coordinate values, the next step is to convert these raw values into geometric objects, points in our case. This is done with the `shapely` package. First, we'll demonstrate how this is done with a single coordinate pair, and then reveal a nifty way to do this for all coordinate pairs in our dataframe.

#### Creating a point geometry from a single coordinate pair

In [None]:
#Extract latitude and longitude values from our first record
theLat = df_EVStations.loc[0,'Latitude']
theLng = df_EVStations.loc[0,'Longitude']
print(theLat, theLng)

In [None]:
#Import the Point class from shapely's geometry module
from shapely.geometry import Point

In [None]:
#Construct a shapely point from our XY coordinates
thePoint = Point(theLng,theLat)
type(thePoint)
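
Note the argument order in the cell above: a shapely `Point` takes x first and y second, so longitude comes before latitude. A small sketch, using made-up coordinates rather than values from the station data, shows one way to check this on the resulting object:

```python
from shapely.geometry import Point

# Made-up coordinates for illustration only
example_lat = 36.0
example_lng = -78.9

# x is longitude, y is latitude
example_point = Point(example_lng, example_lat)
print(example_point.x, example_point.y)
```

Swapping the two arguments would not raise an error; it would silently place the point somewhere else on the globe, so a quick look at `.x` and `.y` is a cheap sanity check.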

Ok, we now have a point object. What we next need to do is repeat this for all records in our dataframe, storing the geometries in a new collection. 

We could simply iterate through all rows in our dataframe (e.g. using Pandas' `iterrows()` function). However, a more concise and typically faster method exists: a Python "list comprehension". (See more [here]() on list comprehension...)

#### Creating a list of point geometries by iterating through all records

In [None]:
#Old style:
thePoints = []
for i,row in df_EVStations.iterrows():
    theLat = row['Latitude']
    theLng = row['Longitude']
    thePoint = Point(theLng,theLat)
    thePoints.append(thePoint)
len(thePoints)

#### Creating a list of point geometries by iterating through all records - *using list comprehension*

In [None]:
#New style: Using list comprehension
thePoints = [Point(xy) for xy in zip(df_EVStations['Longitude'],df_EVStations['Latitude'])]
len(thePoints)

#### Understanding *list comprehension*
A lot is going on in the above statement. 

* First, the `zip(df_EVStations['Longitude'],df_EVStations['Latitude'])` code creates a Python "zip" object, which pairs up the items of two (or more) collections position by position: the first item of each, then the second item of each, and so on. <br><br>Behold:

In [None]:
#Zip the two columns of data such that they share a common index
zipObject = zip(df_EVStations['Longitude'],df_EVStations['Latitude'])
#Convert the zip object to a list
zipAsList = list(zipObject)
#Reveal the first 3 objects in the list
zipAsList[:3]

* The second action in the statement is a for loop that iterates through each item in the zip object, storing the current value (i.e. coordinate pair) as a variable named `xy`.
* And the third action is constructing a Point object using this coordinate pair, again done within the for loop. 
* And at each iteration, the output accumulates as a new list, which we save as the variable `thePoints`. 
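
The three actions described above can be sketched with plain Python lists, without Pandas or shapely. The coordinate values here are made up, and a formatted string stands in for the `Point` object so the example has no dependencies:

```python
# Made-up longitude and latitude values for illustration only
lngs = [-78.9, -80.8, -82.5]
lats = [36.0, 35.2, 35.6]

# zip pairs the items position by position
pairs = list(zip(lngs, lats))

# Loop form: build the output list one item at a time
labels_loop = []
for xy in pairs:
    labels_loop.append(f"({xy[0]}, {xy[1]})")

# Equivalent list comprehension: same loop, same output, one statement
labels_comp = [f"({xy[0]}, {xy[1]})" for xy in zip(lngs, lats)]

print(labels_loop == labels_comp)
```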

## 3. Creating the geodataframe
We are almost there! The remaining step is to convert our existing Pandas dataframe to a GeoPandas *geo*dataframe. To do this we simply call the GeoPandas `GeoDataFrame` constructor, passing it the original dataframe, the list of geometries corresponding to each row in this dataframe, and the <u>coordinate reference system</u> or **crs** to which our geometries are referenced. 

These coordinate reference systems can actually take many forms. But most often, you'll just use the format shown below, replacing `4326` with the EPSG code (also called the "WKID") of any coordinate reference system listed at https://spatialreference.org.  

In [None]:
#Define the coordinate reference system for WGS84 (EPSG/WKID 4326)
#Note: the older {'init':'epsg:4326'} dictionary form is deprecated
theCRS = 'EPSG:4326'
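
GeoPandas accepts a CRS in several equivalent forms. The sketch below, which uses made-up points, passes WGS84 once as an authority string and once as an integer EPSG code and checks that both resolve to the same CRS. It also shows a `GeoSeries`, the layer that sits between individual geometries and a geodataframe:

```python
import geopandas as gpd
from shapely.geometry import Point

# Made-up points for illustration only
example_points = [Point(-78.9, 36.0), Point(-80.8, 35.2)]

# The same CRS given as an authority string and as an integer EPSG code
gs_from_string = gpd.GeoSeries(example_points, crs="EPSG:4326")
gs_from_int = gpd.GeoSeries(example_points, crs=4326)

# Both forms resolve to the same CRS object
print(gs_from_string.crs == gs_from_int.crs)
```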

In [None]:
#Import geopandas
import geopandas as gpd

In [None]:
#Create the spatial dataframe from the Pandas dataframe, the geometry collection and crs
sdf_EVStations = gpd.GeoDataFrame(df_EVStations,geometry=thePoints,crs=theCRS)

In [None]:
#Display the geodataframe as a map!
sdf_EVStations.plot(figsize=(18,5));
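
As an alternative to building the list of points by hand, GeoPandas provides `points_from_xy`, which builds the point geometries directly from two coordinate columns. A minimal sketch, assuming a small made-up dataframe (`df_example`) standing in for `df_EVStations`:

```python
import pandas as pd
import geopandas as gpd

# Made-up rows standing in for df_EVStations
df_example = pd.DataFrame({
    "Station Name": ["Example Station A", "Example Station B"],
    "Latitude": [36.0, 35.2],
    "Longitude": [-78.9, -80.8],
})

# points_from_xy takes x (longitude) first, then y (latitude)
gdf_example = gpd.GeoDataFrame(
    df_example,
    geometry=gpd.points_from_xy(df_example["Longitude"], df_example["Latitude"]),
    crs="EPSG:4326",
)
print(gdf_example.geometry.head())
```

This produces the same kind of geodataframe as the list-comprehension route; which one to use is largely a matter of taste, though the helper avoids the intermediate Python list.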