<a href="https://colab.research.google.com/github/cagBRT/Intro-to-Programming-with-Python/blob/master/GeoSpatial_Visualization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook we use the Ames housing dataset to demonstrate how to do geospatial visualization.

In [None]:
# Clone the entire repo.
!git clone -l -s https://github.com/cagBRT/Intro-to-Programming-with-Python.git cloned-repo
%cd cloned-repo

In [None]:
#!pip install geopandas
!pip install contextily
!pip install shapely

In [None]:
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import contextily as ctx
from shapely.geometry import Point

**Get the data**<br>

The Ames dataset lists information on houses in Ames, Iowa.


In [None]:
# Load the dataset
Ames = pd.read_csv('/content/Ames (1).csv', on_bad_lines='skip')

We list the columns of the dataset. <br>
For this demonstration we will only use Latitude and Longitude.

In [None]:
Ames.columns

## **Convert the DataFrame to a GeoDataFrame**

Converting to a GeoDataFrame gives us access to geospatial functionality on our <br>
dataset: each row gets a geometry (here, a point built from its longitude and latitude) that can be reprojected, buffered, and plotted on a map.

In [None]:
# Convert the DataFrame to a GeoDataFrame
geometry = [Point(xy) for xy in zip(Ames['Longitude'], Ames['Latitude'])]
geo_df = gpd.GeoDataFrame(Ames, geometry=geometry)
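
As an alternative to building the `Point` list by hand, geopandas offers `points_from_xy`. The sketch below uses a made-up two-row table (the variable names `sample` and `sample_gdf` and all values are hypothetical, not taken from the Ames data) and also passes the CRS at construction time, which is one way to avoid setting it in a separate step.

```python
import pandas as pd
import geopandas as gpd

# Hypothetical stand-in for the Ames table; the values are invented for illustration
sample = pd.DataFrame({
    "Longitude": [-93.62, -93.64],
    "Latitude": [42.05, 42.03],
    "SalePrice": [150000, 210000],
})

# points_from_xy takes x (longitude) first, then y (latitude)
sample_gdf = gpd.GeoDataFrame(
    sample,
    geometry=gpd.points_from_xy(sample["Longitude"], sample["Latitude"]),
    crs="EPSG:4326",  # longitude/latitude on WGS 84
)

print(sample_gdf[["SalePrice", "geometry"]])
```

Note the argument order: longitude is the x coordinate and latitude is the y coordinate, matching the `zip(Ames['Longitude'], Ames['Latitude'])` call above.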

## Set the CRS<br>

**CRS=Coordinate Reference System**<br>

A coordinate reference system (CRS) defines how the two-dimensional, projected map relates to real locations on the earth. There are two types of coordinate reference systems: geographic coordinate systems, which use angular units such as degrees of longitude and latitude, and projected coordinate systems, which use linear units such as meters on a flat plane.<br>

**The distance between two points will differ under a different CRS**, and the map will look different. <br>

In this notebook, the CRS for the GeoDataFrame is set to “EPSG:4326,” which corresponds to the widely used WGS 84 (or [World Geodetic System 1984](https://en.wikipedia.org/wiki/World_Geodetic_System)) latitude-longitude coordinate system.<br><br>


WGS 84 is the de facto standard for satellite positioning, GPS, and various mapping applications.<br>

Selecting an appropriate CRS depends on factors like scale, accuracy, and the geographic scope of your data, ensuring precision in geospatial analysis and visualization.

In [None]:
# Set the CRS for the GeoDataFrame
geo_df.crs = "EPSG:4326"
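
To make the point about distances concrete, here is a minimal sketch using only the standard library. It compares three "distances" between the same two points: raw degrees (EPSG:4326 coordinates), spherical Web Mercator meters (the EPSG:3857 formula), and a great-circle estimate of the ground distance. The two points are made up to lie roughly in the Ames area, and the helper names `haversine_m` and `to_web_mercator` are illustrative, not part of any library.

```python
import math

EARTH_RADIUS_M = 6371008.8         # mean Earth radius, for the great-circle estimate
WEB_MERCATOR_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by EPSG:3857


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def to_web_mercator(lon, lat):
    """Project lon/lat in degrees to spherical Web Mercator x/y in meters."""
    x = WEB_MERCATOR_RADIUS_M * math.radians(lon)
    y = WEB_MERCATOR_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# Two invented points near Ames, Iowa (not rows from the dataset)
lon1, lat1 = -93.65, 42.00
lon2, lat2 = -93.60, 42.05

ground = haversine_m(lon1, lat1, lon2, lat2)
x1, y1 = to_web_mercator(lon1, lat1)
x2, y2 = to_web_mercator(lon2, lat2)
projected = math.hypot(x2 - x1, y2 - y1)
in_degrees = math.hypot(lon2 - lon1, lat2 - lat1)

print(f"EPSG:4326 coordinate distance: {in_degrees:.4f} degrees")
print(f"EPSG:3857 coordinate distance: {projected:.0f} map units")
print(f"Great-circle ground distance:  {ground:.0f} m")
print(f"Web Mercator stretch factor:   {projected / ground:.2f}")
```

At this latitude the Web Mercator figure comes out noticeably larger than the ground distance, because the projection stretches distances by roughly 1/cos(latitude). This is worth keeping in mind whenever a length is measured in EPSG:3857.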

We create a convex hull around all the points. The hull is the smallest convex polygon that contains every house, so it gives us a visual outline of the area the data covers.

In [None]:
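# Merge all points into one geometry and take its convex hull
# (newer geopandas releases prefer geo_df.union_all() over unary_union)
convex_hull = geo_df.unary_union.convex_hull
convex_hull_geo = gpd.GeoSeries(convex_hull, crs="EPSG:4326")
# Reproject to EPSG:3857 so the buffer distance is given in map units (nominally meters), not degrees
convex_hull_transformed = convex_hull_geo.to_crs(epsg=3857)
# Pad the hull outward by 500 map units
buffered_hull = convex_hull_transformed.buffer(500)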
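
To illustrate what a convex hull is, the following pure-Python sketch computes one for a handful of made-up points using Andrew's monotone chain algorithm. This is chosen for brevity only; shapely delegates to the GEOS library, which may use a different algorithm. The function names and sample points are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower hull, scanning left to right
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper hull, scanning right to left
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # Each chain ends where the other begins, so drop the duplicated endpoints
    return lower[:-1] + upper[:-1]


# Four corners of a rectangle plus three interior points (invented data)
points = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2), (3, 2)]
hull = convex_hull(points)
print(hull)  # → [(0, 0), (4, 0), (4, 3), (0, 3)]
```

Only the four corner points survive; the interior points do not affect the hull. In the same way, the hull of the Ames data is determined by the outermost houses.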

To plot the data we reproject to EPSG:3857 (Web Mercator), the CRS used by most web-based mapping applications.

In [None]:
# Plotting the map with Sale Prices, a basemap, and the buffered convex hull as a border
fig, ax = plt.subplots(figsize=(12, 8))

# Reproject the points to EPSG:3857 and color them by sale price
geo_df.to_crs(epsg=3857).plot(column='SalePrice', cmap='coolwarm', ax=ax, legend=True,
                              markersize=20)

# The buffer operation adds a margin around the convex hull.
# Here the margin is 500 map units: nominally meters in EPSG:3857,
# though Web Mercator stretches distances away from the equator.
buffered_hull.boundary.plot(ax=ax, color='black', label='Buffered Boundary of Ames')
ctx.add_basemap(ax, source=ctx.providers.CartoDB.Positron)
ax.set_axis_off()
ax.legend(loc='upper right')
colorbar = ax.get_figure().get_axes()[1]
colorbar.set_ylabel('Sale Price', rotation=270, labelpad=20, fontsize=15)
plt.title('Sale Prices of Individual Houses in Ames, Iowa with Buffered Boundary',
          fontsize=18)
plt.show()