# Lab No 4: Geovisualization II - Apps.
## 3 Challenges

# Challenge No 1

In [None]:
# Here is the code for challenge 1, Lab 4

The Urbanity dashboard allows users to conduct bivariate analysis of urban features such as building footprint, green view and mean building perimeter, to name a few. In this exercise I found the programme accessible and easy to use, with the enhanced network feature representation providing context for the patterns found across urban features in cities (Yap et al., 2023).
I chose to explore Bogota and Melbourne and to compare the indicators Youth and Mean Building Footprint; the comparison turned out to be less straightforward than expected. I chose these two indicators because they may indicate a city's quality of mobility, its economic activity, and how compact and cosmopolitan it is. Also, if a city is more compact and younger, costs of living may be higher, so those choosing to start a family or retire may move to the outskirts for economic reasons. I chose these two cities because of their stark differences in climate, cost of living, urban layout and crime rates, although both have vibrant cultural hubs, prestigious universities, well-integrated green spaces and well-designed public transport networks. The linear regression generated by Urbanity for Bogota indicated no significant overall correlation between mean building footprint and youth. The map tells a different story, with higher point counts towards the outskirts of the city. Melbourne, by contrast, shows a negative, somewhat exponential, relationship between mean building footprint and youth. On the spatial maps provided by Urbanity, an obvious spatial pattern is observed, with the city centre and certain neighbourhoods standing out as having higher point counts.
As noted in Yap et al. (2023), Urbanity is easy to use, as this analysis demonstrated, but it focuses on urban physicalism, which can be disconnected from qualitative angles of urban analysis. Overall, the dashboard gave an effective sense of how a bivariate relationship operates over the structure and sprawl of a city, although the aggregated linear regression graphs are a little hard to interpret because of the variation across each city as a whole. The dashboard assumes that its data is of high quality, and there is scope for better customisation: adapting the scale of analysis within the linear regressions, exploring specific neighbourhoods, and showing where aggregation may be skewing the analysis and presentation in Urbanity.
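
The point about aggregation can be illustrated with a minimal sketch. It uses purely synthetic numbers (not Urbanity data) for two hypothetical neighbourhoods whose local trends run in opposite directions; the variable names and values are assumptions made for illustration only. Fitting one regression line to the pooled data shows how a city-wide fit can suggest little or no correlation even when each neighbourhood has a strong relationship of its own.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic, hypothetical data: two neighbourhoods with opposite local trends.
# x stands in for one indicator, y for the other.
x_a = rng.uniform(0, 10, 200)
y_a = 2.0 * x_a + rng.normal(0, 1, 200)          # positive local trend
x_b = rng.uniform(0, 10, 200)
y_b = -2.0 * x_b + 20 + rng.normal(0, 1, 200)    # negative local trend


def fit_line(x, y):
    """Return slope, intercept and Pearson r of a least-squares line."""
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r


# One regression per neighbourhood, then one on the pooled (aggregated) data
slope_a, _, r_a = fit_line(x_a, y_a)
slope_b, _, r_b = fit_line(x_b, y_b)
slope_all, _, r_all = fit_line(np.concatenate([x_a, x_b]),
                               np.concatenate([y_a, y_b]))

print(f"Neighbourhood A: slope={slope_a:.2f}, r={r_a:.2f}")
print(f"Neighbourhood B: slope={slope_b:.2f}, r={r_b:.2f}")
print(f"Pooled city:     slope={slope_all:.2f}, r={r_all:.2f}")
```

In this toy setup the two strong local relationships cancel out in the pooled fit, which is one way an aggregated regression could look flat while the map still shows clear spatial structure.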


![BOGOTA.png](attachment:d8be7ec3-eb17-444a-b089-5f684d534a51.png)

![BOGOTA2.png](attachment:3d1dffe3-6e89-4c3b-b4fe-b44fbe2faaf2.png)

![MELBOURNE.png](attachment:8f96bc84-af4f-40ac-b984-d6453739a6e3.png)

![MELBOURNE2.png](attachment:8a4ae865-44ca-47ce-ad8a-4afc3a607217.png)

Yap, W., Stouffs, R. and Biljecki, F. (2023). Urbanity: automated modelling and analysis of multidimensional networks in cities. npj Urban Sustainability, 3(1). doi: https://doi.org/10.1038/s42949-023-00125-w.

# Challenge No 2

In [None]:
# Here is the code for challenge 2, Lab 4

![URBAN_HEAT_ISLAND.png](attachment:a271709d-82ab-4a4d-bad2-716b5f05407a.png)

https://www.arcgis.com/apps/dashboards/dd7aebe3fa214c849e72f71ab439e754

# Challenge No 3

In [None]:
# Here is the code for challenge 3, Lab 4

In [None]:
import requests
import zipfile
import os
import pandas as pd
import geopandas as gpd

#downloading shapefile
url = "https://maps.gov.scot/ATOM/shapefiles/SG_SIMD_2020.zip"
zip_path = "SG_SIMD_2020.zip"
extract_path = "SG_SIMD_2020"
response = requests.get(url, timeout=60)
response.raise_for_status() #stop early if the download failed
with open(zip_path, "wb") as f:
    f.write(response.content)

#unzipping shapefile
with zipfile.ZipFile(zip_path, "r") as zip_ref:
    zip_ref.extractall(extract_path)

#loading the shapefile into our gdf
shapefile = [f for f in os.listdir(extract_path) if f.endswith(".shp")][0]
gdf = gpd.read_file(os.path.join(extract_path, shapefile))

#printing the first five rows of data
print(gdf.head())

**References**: bytecode (2024). Reading multiple shapefiles with geopandas from a zip file in memory. [online] Stack Overflow. Available at: https://stackoverflow.com/questions/77823335/reading-multiple-shapefiles-with-geopandas-from-a-zip-file-in-memory.

In [None]:
#subset our data frame to keep only data for the City of Edinburgh
#(.copy() so that columns can be added later without a SettingWithCopyWarning)
gdf_subset = gdf[gdf["LAName"] == "City of Edinburgh"].copy()
gdf_subset.head(3)

In [None]:
#seeing the column titles
gdf_subset.columns

In [None]:
#importing the libraries used for the histograms and choropleth maps
import numpy as np
import mapclassify as mc
import matplotlib.pyplot as plt
import folium
import seaborn as sns

#creating one subplot per histogram
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

#exploring different variables through generating histograms
sns.histplot(data=gdf_subset, x="IncRate",ax=axes[0], kde=True) 
sns.histplot(data=gdf_subset, x="EduAttend",ax=axes[1], kde=True) 
sns.histplot(data=gdf_subset, x="CrimeRate",ax=axes[2], kde=True) 

axes[0].set_title("Income Rate")
axes[1].set_title("Educational Attendance")
axes[2].set_title("Crime Rate")

plt.tight_layout()
plt.show()

In [None]:
#checking for any missing values
print(gdf_subset['IncRate'].isna().sum())
print(gdf_subset['EduAttend'].isna().sum())
print(gdf_subset['CrimeRate'].isna().sum())

In [None]:
#checking the data types
print(gdf_subset['IncRate'].dtype)
print(gdf_subset['EduAttend'].dtype)
print(gdf_subset['CrimeRate'].dtype)

In [None]:
# number of classes for classification
num_classes = 5

# using natural breaks (jenks classification)
classifier_nb = mc.NaturalBreaks(gdf_subset['CrimeRate'], k=num_classes)
print(classifier_nb)
print(min(classifier_nb.bins), max(classifier_nb.bins))
print(classifier_nb.bins) #upper bound of each class

# using equal interval classification
classifier_ei = mc.EqualInterval(gdf_subset['CrimeRate'], k=num_classes)
print(classifier_ei)
print(min(classifier_ei.bins), max(classifier_ei.bins))
print(classifier_ei.bins) 
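
To make the equal interval idea concrete, here is a minimal NumPy-only sketch of how equal-width class bounds can be derived and how unevenly a skewed variable can fall into them. The helper `equal_interval_bins` and the toy values are hypothetical and are not mapclassify's implementation; they only mirror the idea of splitting the range from minimum to maximum into `k` classes of equal width.

```python
import numpy as np


def equal_interval_bins(values, k):
    """Upper bounds of k equal-width classes spanning min(values)..max(values)."""
    values = np.asarray(values, dtype=float)
    # k + 1 evenly spaced edges; drop the first edge (the minimum)
    return np.linspace(values.min(), values.max(), k + 1)[1:]


# Hypothetical right-skewed values standing in for a rate column
values = np.array([1, 2, 2, 3, 3, 4, 5, 6, 8, 50], dtype=float)

bins = equal_interval_bins(values, k=5)

# Class index of each value: the first class whose upper bound is >= the value
labels = np.searchsorted(bins, values, side="left")

# Number of observations per class
counts = np.bincount(labels, minlength=len(bins))

print(bins)
print(counts)
```

With a single large outlier, almost every observation lands in the first class and the middle classes stay empty, which is why equal intervals can produce a map dominated by one colour when the variable is skewed.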

In [None]:
#creating a histogram with breakpoints for the crime rate in Edinburgh

fig, ax = plt.subplots(figsize=(8, 5))

sns.histplot(data=gdf_subset, x="CrimeRate", ax=ax, kde=True, bins=20)

# defining the style of the lines to represent the breakpoints
# the first breakpoint carries the legend label
ax.axvline(classifier_nb.bins[0], color='red', linestyle='dashed', linewidth=2, label='Breakpoints') 
# looping over the remaining elements in the array 'classifier_nb.bins'
for bin_value in classifier_nb.bins[1:]:
    ax.axvline(bin_value, color='red', linestyle='dashed', linewidth=2) 
 
#styling the histogram
ax.set_title("Histogram with Breakpoints for Natural Breaks")

#adding a legend
plt.legend()
plt.show()

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(15, 5))
#generating histogram for crime rates in Edinburgh with natural breaks
sns.histplot(data=gdf_subset, x="CrimeRate", ax=axes[0], kde=True, bins=20)
axes[0].axvline(classifier_nb.bins[0], color='red', linestyle='dashed', linewidth=2, label='Natural Breaks')
for bin_value in classifier_nb.bins[1:]:
    axes[0].axvline(bin_value, color='red', linestyle='dashed', linewidth=2)
axes[0].set_title("Crime Rate Histogram with Natural Breaks")
axes[0].legend()

#generating histogram for crime rates in Edinburgh with equal intervals
sns.histplot(data=gdf_subset, x="CrimeRate", ax=axes[1], kde=True, bins=20)
axes[1].axvline(classifier_ei.bins[0], color='blue', linestyle='dashed', linewidth=2, label='Equal Intervals')
for bin_value in classifier_ei.bins[1:]:
    axes[1].axvline(bin_value, color='blue', linestyle='dashed', linewidth=2)
axes[1].set_title("Crime Rate Histogram with Equal Intervals")
axes[1].legend()
#using the tight layout to display the histogram with natural breaks
plt.tight_layout()
plt.show()

In [None]:
#generating a choropleth map with natural breaks
fig, ax = plt.subplots(figsize=(12, 10))
gdf_subset.plot(column='CrimeRate', ax=ax,
         legend=True, cmap='viridis',
         scheme='UserDefined',
         classification_kwds={'bins': classifier_nb.bins} 
        )
plt.title("Choropleth Map using mapclassify with Natural Breaks - Map 1")
plt.show()

In [None]:
#generating a choropleth map with equal intervals
fig, ax = plt.subplots(figsize=(12, 10))
gdf_subset.plot(column='CrimeRate', ax=ax,
         legend=True, cmap='viridis',
         scheme='UserDefined',
         classification_kwds={'bins': classifier_ei.bins},
        )
plt.title("Choropleth Map using Classifier with Equal Intervals - Map 2")
plt.show()

In [None]:
#comparing the two choropleth maps for crime rates in Edinburgh
fig, axs = plt.subplots(1, 2, figsize=(18, 8))

gdf_subset.plot(column='CrimeRate', ax=axs[0],
         legend=True, cmap='viridis',
         scheme='UserDefined',
         classification_kwds={'bins': classifier_nb.bins}
        )

axs[0].set_title("Choropleth Map with Natural Breaks")

gdf_subset.plot(column='CrimeRate', ax=axs[1],
         legend=True, cmap='viridis',
         scheme='UserDefined',
         classification_kwds={'bins': classifier_ei.bins})

axs[1].set_title("Choropleth Map with Equal Intervals")
#using the tight layout
plt.tight_layout() 
plt.show()

In [None]:
#classifying Edinburgh crime rates with natural breaks for the Mapbox choropleth
num_classes = 5

classifier_edi = mc.NaturalBreaks(gdf_subset['CrimeRate'], k=num_classes)
gdf_subset['classification_edi'] = classifier_edi.yb #yb holds the class index (0 to k-1) of each row

print(classifier_edi)
print(gdf_subset[['CrimeRate', 'classification_edi']])

In [None]:
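import plotly.express as px

#Mapbox expects longitude/latitude, so reproject to WGS84 (EPSG:4326) first
gdf_edi_wgs84 = gdf_subset.to_crs(epsg=4326)

fig_edinburgh = px.choropleth_mapbox(gdf_edi_wgs84,
                           geojson=gdf_edi_wgs84.geometry,
                           locations=gdf_edi_wgs84.index,
                           color="classification_edi",
                           color_continuous_scale="viridis",
                           range_color=(0, 4), #class indices run from 0 to k-1
                           opacity=0.5,
                           center={"lat": 55.9533, "lon": -3.1883}, #approximate centre of Edinburgh
                           mapbox_style="carto-positron",
                           zoom=9.5)
fig_edinburgh.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig_edinburgh.show()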

In [None]:
#checking for the LA name for glasgow
print(gdf["LAName"].unique())  

In [None]:
#comparing glasgow and edinburgh 

#creating a subset for glasgow to compare to edinburgh
#(.copy() so that columns can be added later without a SettingWithCopyWarning)
gdf_subset2 = gdf[gdf["LAName"] == "Glasgow City"].copy()
gdf_subset2.head(3)

In [None]:
#observing the columns
gdf_subset2.columns

In [None]:
#classifying Glasgow crime rates with natural breaks for the Mapbox choropleth
num_classes = 5

classifier_gla = mc.NaturalBreaks(gdf_subset2['CrimeRate'], k=num_classes)
gdf_subset2['classification_gla'] = classifier_gla.yb #yb holds the class index (0 to k-1) of each row

print(classifier_gla)
print(gdf_subset2[['CrimeRate', 'classification_gla']])

In [None]:
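import plotly.express as px

#Mapbox expects longitude/latitude, so reproject to WGS84 (EPSG:4326) first
gdf_gla_wgs84 = gdf_subset2.to_crs(epsg=4326)

fig_glasgow = px.choropleth_mapbox(gdf_gla_wgs84,
                           geojson=gdf_gla_wgs84.geometry,
                           locations=gdf_gla_wgs84.index,
                           color="classification_gla",
                           color_continuous_scale="viridis",
                           range_color=(0, 4), #class indices run from 0 to k-1
                           opacity=0.5,
                           center={"lat": 55.866193, "lon": -4.258246},
                           mapbox_style="carto-positron",
                           zoom=9.5)
fig_glasgow.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig_glasgow.show()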

In [None]:
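#comparing the mapbox for Glasgow vs Edinburgh's crime rate
from plotly.subplots import make_subplots

fig = make_subplots(rows=1, cols=2, subplot_titles=("Glasgow Crime Rate", "Edinburgh Crime Rate"),
                    specs=[[{"type": "mapbox"}, {"type": "mapbox"}]])

#combining both figures
fig.add_trace(fig_glasgow.data[0], row=1, col=1)
fig.add_trace(fig_edinburgh.data[0], row=1, col=2)

#updating the layout; each subplot has its own map (mapbox, mapbox2) needing a style, centre and zoom
fig.update_layout(mapbox=dict(style="carto-positron", zoom=9.5,
                              center={"lat": 55.866193, "lon": -4.258246}),
                  mapbox2=dict(style="carto-positron", zoom=9.5,
                               center={"lat": 55.9533, "lon": -3.1883}), #approximate centre of Edinburgh
                  coloraxis=dict(colorscale="Viridis", cmin=0, cmax=4), #shared scale for class indices 0 to 4
                  margin={"r":0,"t":50,"l":0,"b":0},
                  height=600,
                  showlegend=False)
#displaying the figure
fig.show()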