# Summary

<p>
This notebook explores the city of Nashville using geospatial or geographic data.
</p>   
<p>
A version of the notebook is available on<br>
<a href="https://nbviewer.jupyter.org/github/RolfChung/geospatial_data_1/blob/main/geospatialData_1_Nashville_345.ipynb" target="_blank">nbviewer</a>.<br>
Unlike GitHub, nbviewer renders the folium plots.
</p>
<p>
   
"Geographic data and information is defined in the ISO/TC 211 series of standards as data and information having an implicit or explicit association with a location relative to Earth (a geographic location or geographic position)."<br>
<a href="https://en.wikipedia.org/wiki/Geographic_data_and_information" target="_blank">Wikipedia</a> 
</p>

<p>
In practical terms, this means locations are described by longitudes and latitudes. Two coordinate reference systems are mainly used here: EPSG:4326 (coordinates in decimal degrees) and EPSG:3857 (coordinates in meters).
</p>
<p>
"A spatial reference system (SRS) or coordinate reference system (CRS) is a coordinate-based local, regional or global system used to locate geographical entities. A SRS commonly defines a specific map projection, as well as transformations between different SRS."<br>
<a href="https://en.wikipedia.org/wiki/Spatial_reference_system" target="_blank">Wikipedia</a>  
<br>
It is essential to apply the right CRS for the operation at hand.<br>
Otherwise results such as areas and distances will be wrong.<br>
More on this in the notebook.
</p> 
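
The relationship between the two systems can be made concrete with a small sketch. Assuming the spherical Mercator formulas that EPSG:3857 is based on, the hypothetical helper `lonlat_to_web_mercator` converts decimal degrees into meters. The Nashville coordinates are approximate and only serve as an illustration; in the notebook itself this conversion is left to GeoPandas.

```python
import math

# Sphere radius used by EPSG:3857 (equal to the WGS 84 semi-major axis), in meters.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project EPSG:4326 degrees to EPSG:3857 meters (spherical Mercator)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Approximate location of downtown Nashville (illustrative values).
x, y = lonlat_to_web_mercator(-86.7782, 36.16684)
print(round(x), round(y))
```

The same pair of numbers describes the same place in both systems; only the unit and the axis scaling differ, which is why mixing the two in one plot or calculation gives nonsense.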

<p>
    On the Python side, the <b>GeoPandas</b> package is used here.
</p> 

<p>
"GeoPandas is an open source project to make working with geospatial data in python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types. Geometric operations are performed by shapely. Geopandas further depends on fiona for file access and matplotlib for plotting."<br>
<a href="https://geopandas.readthedocs.io/en/latest/index.html" target="_blank">GeoPandas</a>  
</p> 

<p>
I fully agree with this. In general, GeoPandas makes life easier and is an entry point into the
many geospatial packages and dependencies behind it.
</p>

<p>
Further topics explored here are:
</p>

<ul>
  <li>Shapefiles</li>
  <li>GeoDataFrames</li>
  <li>GeoJSON</li>
  <li>Geospatial joins</li>
  <li>Geospatial calculations</li>
  <li>Geopandas plots</li>
  <li>Folium plots</li>
  <li>Choropleth plots</li>
</ul> 

# Import packages

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os
import geopandas as geop
import json, requests
from shapely.geometry import Point
import pprint
import sys
import folium

%matplotlib inline


In [None]:
print("Version Pandas: {}".format(pd.__version__))
print("Version Geopandas: {}".format(geop.__version__))
print("Version Numpy: {}".format(np.__version__))

Did the import work?

In [None]:
import types
def imports():
    for name, val in globals().items():
        if isinstance(val, types.ModuleType):
            yield val.__name__
list(imports())

In [None]:
# Seaborn | Style And Color: https://www.geeksforgeeks.org/seaborn-style-and-color/
sns.set() 
sns.set_style("whitegrid") 
sns.set_style("ticks", {"xtick.major.size":8, "ytick.major.size":8})
sns.axes_style("whitegrid")

In [None]:
currentwd = os.getcwd()
# print(currentwd)

## Shapefiles of Nashville

<p>
"The shapefile format is a geospatial vector data format for geographic information system (GIS) software. ... The shapefile format can spatially describe vector features: points, lines, and polygons, representing, for example, water wells, rivers, and lakes. Each item usually has attributes that describe it, such as name or temperature."<br>
<a href="https://en.wikipedia.org/wiki/Shapefile" target="_blank">Wikipedia</a> 
</p> 



<p>
Shapefiles of Nashville are used here with <b>GeoPandas 0.9.0</b>.
</p>

<p>
The Nashville geospatial data is available on 
<a href="https://data.nashville.gov/General-Government/Service-Districts-GIS-/xxxs-vvs4" target="_blank">data.nashville.gov</a> <br>
According to the description: <br>
Geographic boundaries of the Urban Services District and General Services District of Nashville and Davidson County.
</p> 

<p>
Nashville has two overarching districts: the Urban Services District and the General Services District.<br>
Within these two districts there are many sub-districts, stored here in "neighborhoods.geojson".<br>
</p> 

In [None]:
nv = geop.read_file('datasets/neighborhoods.geojson', encoding='utf-8')

In [None]:
print(type(nv))

In [None]:
print(nv.dtypes)

In [None]:
print(nv.shape)

In [None]:
print(nv.columns)

In [None]:
print(nv.index)

In [None]:
nv.info()

In [None]:
nv.head()

In [None]:
nv.tail()

#### Coordinate Reference Systems
<p>
"The Coordinate Reference System (CRS) is important because the geometric shapes in a GeoSeries or GeoDataFrame object are simply a collection of coordinates in an arbitrary space. A CRS tells Python how those coordinates relate to places on the Earth."<br>
<a href="https://geopandas.org/docs/user_guide/projections.html" target="_blank">geopandas</a> 
</p> 


In [None]:
nv.geometry.crs

Shapes of some NV service districts

In [None]:
print(nv.name[0])
nv.loc[0, 'geometry']

In [None]:
print(nv.name[1])
nv.loc[1, 'geometry']

In [None]:
print(nv.name[144])
nv.loc[144, 'geometry']

### Shape of Nashville service districts

In [None]:
nv.plot(edgecolor='black', linewidth=1, color='cyan', figsize=(15,5))
plt.title('Shape of Nashville', fontsize=18)
plt.show()

In [None]:
nv.plot(edgecolor='black', linewidth=1, column='name', legend=False, figsize=(15,5))
plt.title('Shape of Nashville colored by districts', fontsize=18)
plt.show()

In [None]:
nv.plot(edgecolor='black', linewidth=1, column='name', legend=True,
        legend_kwds={'bbox_to_anchor':(1,1), 'ncol':4, 'loc':'upper left'},
        figsize=(15,5))
plt.title('Shape of Nashville colored by service districts', fontsize=18)
plt.show()

### Shape of Nashville overarching service districts

In [None]:
nv_shape = geop.read_file('datasets/Nash_shape/geo_export_586b6f7f-f69c-426a-be97-7cc49f2b415b.shp')

In [None]:
print(type(nv_shape))

In [None]:
print(nv_shape.dtypes)

In [None]:
nv_shape.shape

In [None]:
nv_shape.columns

In [None]:
nv_shape.index

In [None]:
nv_shape.info()

In [None]:
nv_shape

In [None]:
USD_shape = nv_shape.loc[0, 'geometry']
USD_shape

In [None]:
# for i in pd.Series(nv_shape.iloc[0,3]):
    # print(i)

In [None]:
nv_shape.plot(edgecolor='black', color='yellow', figsize=(12,5))
plt.title('Shape of Nashville')
plt.show()

In [None]:
nv_shape.plot(edgecolor='black', cmap='Set3', figsize=(12,5))
plt.title('Shape of Nashville')
plt.show()

#### Generating the outer boundary of Nashville as a whole from the given geometries

<strong>with GeoSeries.unary_union</strong>

<p>
Return a geometry containing the union of all geometries in the GeoSeries.
</p>
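
As a minimal sketch of what such a union does, two adjacent unit squares (made-up geometries, not Nashville data) are merged with `shapely.ops.unary_union`, the Shapely function this kind of operation relies on. The shared edge disappears and a single polygon remains.

```python
from shapely.geometry import box
from shapely.ops import unary_union

# Two hypothetical "districts" that share the edge x = 1.
west = box(0, 0, 1, 1)
east = box(1, 0, 2, 1)

merged = unary_union([west, east])

print(merged.geom_type)  # one geometry instead of two
print(merged.area)       # sum of both areas, since they do not overlap
print(merged.bounds)     # bounding box of the combined shape
```

Newer GeoPandas releases (1.0 and later) offer `union_all()` for the same purpose and deprecate the `unary_union` attribute; with the GeoPandas 0.9.0 used here, `unary_union` is the available option.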


In [None]:
nv_shape_unary = nv_shape.unary_union
nv_shape_unary

In [None]:
nv_shape_unary_gdf =  geop.GeoDataFrame(geometry=[nv_shape_unary], crs=nv_shape.crs)
nv_shape_unary_gdf.plot(edgecolor='black', color='olive')
plt.show()

### Working with attributes and methods of GeoSeries geometry data
<p>
starting with:<br> 
</p> 

#### Calculating the areas of NV

In [None]:
nv_shape.geometry.crs

In [None]:
area_usd = nv_shape.geometry[0].area
print("Area of {name}: {area}".format(name= nv_shape.loc[0, 'name'], area= area_usd))

In [None]:
area_gsd = nv_shape.geometry[1].area
print("Area of {name}: {area}".format(name= nv_shape.loc[1, 'name'], area= area_gsd))

<p>
Above, the areas are calculated in squared decimal degrees, the unit of the EPSG:4326 CRS.<br>
This is not human friendly.<br>
Areas in m^2 and km^2 are easier to interpret.<br>
Projecting to EPSG:3857 gives usable approximations only for small extents.<br>
Otherwise you have to correct for the fact that the
earth is a sphere and not a disc ;-)<br>
More on this in <a href="https://gis.stackexchange.com/questions/242545/how-can-epsg3857-be-in-meters" target="_blank">'How can EPSG:3857 be in meters?'</a>.
</p>

<p>
Using epsg=3857 (meters) as the CRS returns areas in m^2.
</p> 
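
How large the distortion is can be estimated with a short sketch. On the spherical Mercator model behind EPSG:3857, lengths are stretched by roughly 1/cos(latitude), so small areas are stretched by roughly 1/cos²(latitude). The helper name `web_mercator_area_scale` and the latitude value are assumptions for illustration.

```python
import math


def web_mercator_area_scale(lat_deg):
    """Approximate factor by which EPSG:3857 inflates small areas at a latitude."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


nashville_lat = 36.17  # approximate latitude of Nashville
scale = web_mercator_area_scale(nashville_lat)
print(f"Area inflation factor near Nashville: {scale:.2f}")

# A hypothetical area measured in EPSG:3857, corrected back to ground area.
area_3857_km2 = 100.0
print(f"Approximate ground area: {area_3857_km2 / scale:.1f} km^2")
```

At Nashville's latitude the factor is about 1.5, so areas computed in EPSG:3857 are noticeably larger than the true ground areas. Dividing by the factor, or projecting to an equal-area CRS instead, are two possible ways to account for this.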

<p>
<b>Convert m squared to km squared:</b><br>
"How many m squared in 1 km squared? The answer is 1.000.000.
We assume you are converting between square metre and square kilometre.
You can view more details on each measurement unit:
m squared or km squared
The SI derived unit for area is the square meter.
1 square meter is equal to 1.0E-6 km squared."<br>
<a href="https://www.convertunits.com/from/m+squared/to/km+squared" target="_blank">convertunits.com</a> 
</p> 


In [None]:
nv_areas_m = nv_shape.geometry.to_crs(epsg=3857).area
nv_areas_m

In [None]:
conversion_m_km = 1000000
nv_areas_m / conversion_m_km

In [None]:
round(nv_areas_m / (10**6), 2)

Converting km^2 to square miles.

In [None]:
conversion_miles = 0.3861  # square miles per km^2

nv_area_km2 = pd.Series(list(nv_areas_m / (10**6))).round(2)
print(nv_area_km2)
# use the named constant instead of repeating the literal
nv_area_miles2 = pd.Series(list(np.array(nv_area_km2) * conversion_miles)).round(2)
print(nv_area_miles2)

In [None]:
nv_areas_df = pd.DataFrame(zip(nv_area_km2, nv_area_miles2), columns=('km2', 'miles2'),
                           index=['UrbanSD', 'GeneralSD'])
nv_areas_df

In [None]:
# column sums; access the results by label rather than by position
nv_areas_sums = nv_areas_df.sum(axis=0)
nv_areas_sums_series = \
pd.Series(data={'km2':nv_areas_sums['km2'], 'miles2':nv_areas_sums['miles2']}, name="Total_areas")

In [None]:
# DataFrame.append was removed in pandas 2.0; pd.concat works across versions
pd.concat([nv_areas_df, nv_areas_sums_series.to_frame().T])

# Schools in Nashville data set
## Import data: schools in Nashville
<p>
Different methods of importing data are tried out here.
</p> 

In [None]:
# os.listdir('datasets')

<b>Open method</b>

In [None]:
path_1 = 'datasets/schools.csv'

In [None]:
sd_1 = open(path_1, mode='r') # Open the file for reading
sd_1_text = sd_1.read() # Read a file’s contents

print(sd_1.closed) #  Check whether file is closed, file is not closed

sd_1.close() # Close file

print(sd_1_text[:430])

<b>With Open - method</b>

In [None]:
with open(path_1, 'r') as sd_2:
    # read in and print the first two lines of the file sd_2
    print(sd_2.readline()) 
    print(sd_2.readline())

In [None]:
# import json, requests
schools_endpoint = 'https://data.nashville.gov/resource/vpdy-5e23.json'

In [None]:
schools_reqdata = requests.get(schools_endpoint).json()

In [None]:
print(type(schools_reqdata))
print(schools_reqdata[:1])

In [None]:
schools_reqdata_df = pd.DataFrame(schools_reqdata)
# print(schools_reqdata_df.head())

<b>Pandas</b>

In [None]:
sd_pd = pd.read_csv(path_1,
                   header=0, 
                   sep=",")

## Data exploration: school data

In [None]:
type(sd_pd)

In [None]:
sd_pd.shape

In [None]:
sd_pd.columns

In [None]:
# number of non-NA values per column
sd_pd.count()

In [None]:
sd_pd.info()

In [None]:
sd_pd.head(3)

In [None]:
sd_pd.tail(3)

In [None]:
city_gb = sd_pd.groupby('City')['School ID'].count()
city_gb

In [None]:
city_gb.sort_values(ascending=False).plot(kind='bar', edgecolor='black',
                                          linewidth=3, title="Number of schools by city",
                                          color=['g', 'r'])
plt.grid(linewidth=1, color='gray')
plt.show()

In [None]:
level_gb = sd_pd.groupby('School Level')['School ID'].count()
level_gb

In [None]:
level_gb.sort_values(ascending=False).plot(kind='bar', color=['magenta', 'cyan'],
                                           edgecolor='black', linewidth=2,
                                           title="Number of schools by level")
plt.grid(linewidth=0.7, color='brown')
plt.show()

In [None]:
sd_pd.groupby('State')['State'].count()

## Geospatial data exploration: school data

In [None]:
fig, ax = plt.subplots(4,1, figsize=(12,20))

sns.scatterplot(x='Latitude', y='Longitude', 
                data=sd_pd, color='r', edgecolor='black', ax=ax[0])
ax[0].set_title('School locations by longitude and latitude', fontsize=14)
ax[0].grid(linewidth=1, color='gray')

sns.regplot(x='Latitude', y='Longitude', 
            data=sd_pd, color='g', ax=ax[1],
            scatter_kws={'s':45, 'alpha':0.83, 'color':'magenta', 'edgecolor':'black',
            'linewidth':4},
            line_kws={'color': 'darkblue', 'linewidth':3})
ax[1].set_title('Regression plot', fontsize=14)
ax[1].grid(linewidth=1, color='gray')

sns.scatterplot(x='Latitude', y='Longitude', 
                data=sd_pd, color='r', edgecolor='black', ax=ax[2],
                hue='School Level')
ax[2].set_title('School locations colored by school level', fontsize=14)
ax[2].grid(linewidth=1, color='gray')
ax[2].legend(bbox_to_anchor=(1,1), loc='upper left')


sns.scatterplot(x='Latitude', y='Longitude', 
                data=sd_pd, color='r', edgecolor='black', ax=ax[3],
                hue='City') 

ax[3].set_title('School locations colored by city', fontsize=14)
ax[3].grid(linewidth=1, color='gray')
ax[3].legend(bbox_to_anchor=(1,1), loc='upper left')


plt.subplots_adjust(hspace=0.5)
plt.show()

In [None]:
sd_4 = sd_pd.copy()
print(type(nv_shape))
print(type(sd_4))

#### Combining NV shape polygons & scatter location plots: school districts

In [None]:
fig, ax1 = plt.subplots(1,1, figsize=(8, 8))
nv_shape.plot(column='name', legend=True, ax=ax1, edgecolor='black',
              linewidth=1, legend_kwds={'loc':'lower left'}, cmap='Set1', aspect=1)
sns.scatterplot(x='Longitude', y='Latitude', data= sd_4, ax= ax1, color='blue', legend=False)
ax1.set_title('Locations of schools in Nashville service districts', fontsize=12)
plt.show()

Most schools are in the Urban Services District.

## Constructing a geopandas data frame from the schools geocoordinates

<p>
A GeoPandas data frame needs a "geometry" column.<br>
This column stores geometric objects rather than plain numbers.<br>
The schools data frame is a regular pandas data frame.<br>
It stores latitude and longitude, but not as geometry objects.<br>
The shapely package offers classes for geometric objects such as points, lines, and polygons.<br>
The school locations are points.<br>
Below, the latitudes and longitudes are turned into shapely points.
</p> 
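
An alternative to building the points row by row is `geopandas.points_from_xy`, which creates the whole geometry column in one call. The sketch uses a tiny made-up data frame with the same column names as the schools data; the school names and coordinates are hypothetical.

```python
import pandas as pd
import geopandas as geop

# Hypothetical stand-in for the schools data frame.
schools = pd.DataFrame({
    "School Name": ["School A (made up)", "School B (made up)"],
    "Latitude": [36.17, 36.10],
    "Longitude": [-86.78, -86.70],
})

# x is the longitude, y is the latitude: the order matters.
schools_gdf = geop.GeoDataFrame(
    schools,
    geometry=geop.points_from_xy(schools["Longitude"], schools["Latitude"]),
    crs="EPSG:4326",
)

print(schools_gdf[["School Name", "geometry"]])
print(schools_gdf.crs)
```

Either way, the important detail is the argument order: shapely points are (x, y), which for geographic data means (longitude, latitude).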

In [None]:
sd_5 = sd_4.copy()
print(type(sd_5))
# print(sd_5.columns)
# print(sd_5.head())

In [None]:
sd_5['geometry'] = sd_5.apply(lambda g: Point((g.Longitude, g.Latitude)), axis=1)

In [None]:
sd_5[['Latitude', 'Longitude','geometry']].head()

Converting the data frame to a geopandas df.

In [None]:
sd_gp = geop.GeoDataFrame(sd_5, crs="EPSG:4326", geometry='geometry')

print(type(sd_gp))
print(sd_gp.crs)
# print(sd_gp.head())

Changing the coordinate reference system from decimal degrees (EPSG:4326) to meters (EPSG:3857).

In [None]:
sd_gp_meters = sd_gp.copy()

sd_gp_meters.geometry = sd_gp_meters.geometry.to_crs(epsg = 3857) 

print(sd_gp_meters.crs)
print(sd_gp.loc[:2,'geometry'])
print(sd_gp_meters.geometry[:2])

Plotting both GeoDataFrames together.

In [None]:
fig, ax1 = plt.subplots(1,1, figsize=(8, 5))

nv_shape.plot(column='name', legend=True, ax=ax1, edgecolor='black',
              linewidth=1, legend_kwds={'loc':'lower left'}, cmap='Set1', aspect=1)
sd_gp.plot(ax=ax1, edgecolor='black', color='green')


ax1.set_title('Locations of schools in Nashville service districts', fontsize=12)
plt.show()

## School districts

In [None]:
sdist = geop.read_file('datasets/school_districts.geojson')

print(type(sdist))


In [None]:
print(sdist.columns.to_list())

In [None]:
sdist.info()

In [None]:
sdist[[ 'city', 'district']].head()

In [None]:
pprint.pprint(sdist[['city', 'geometry']].head())

In [None]:
sdist.info()

In [None]:
fig, ax1 = plt.subplots(1,1, figsize=(8, 5))

sdist.plot(column='city', legend=True, ax=ax1, edgecolor='black',
              linewidth=1, legend_kwds={'loc':'lower left'}, cmap='Set3', aspect=1)
sd_gp.plot(ax=ax1, edgecolor='black', color='green')


ax1.set_title('Locations of schools \n in Nashville school districts by city', fontsize=12)
plt.show()

In [None]:
fig, ax1 = plt.subplots(1,1, figsize=(8, 5))

sdist.plot(column='district', legend=True, ax=ax1, edgecolor='black',
           linewidth=1, 
           legend_kwds={'loc':'upper left', 'bbox_to_anchor':(1,1)}, 
           cmap='Set1', aspect=1)
sd_gp.plot(ax=ax1, edgecolor='black', color='green')


ax1.set_title('Locations of schools \n in Nashville school districts', fontsize=12)
plt.show()

### Attributes of school data
#### Areas

Calculating the areas of the school location data.

In [None]:
sd_gp_meters.crs

In [None]:
print(sd_gp_meters.geometry.area.unique())
print(sd_gp_meters.geometry.area[:3])

The areas are all 0. This was expected: a point location has no area.<br>
What about the school district areas?

In [None]:
sdist_m = sdist.to_crs(epsg=3857)

print(sdist_m.crs)

Areas in km^2.

In [None]:
sd_areas_dict_city = {}

for i in range(0, len(sdist_m)):
    a  = sdist_m.geometry[i].area / conversion_m_km
    c =  sdist_m.city[i]
    sd_areas_dict_city[c] = round(a,4)
    
pprint.pprint(sd_areas_dict_city)

In [None]:
sd_areas_dict_district = {}

for i in range(0, len(sdist_m)):
    a  = sdist_m.geometry[i].area / conversion_m_km
    c =  sdist_m.district[i]
    sd_areas_dict_district[c] = round(a,4)
    
pprint.pprint(sd_areas_dict_district)

In [None]:
ad_v = list(sd_areas_dict_district.values())
ad_v_m = list(np.array(ad_v) * conversion_miles)
ad_v_m = [round(i,4) for i in ad_v_m]
ad_v_m

In [None]:
df_sd_kmm = pd.DataFrame(zip(ad_v, ad_v_m), columns=['a_km2', 'a_m2'])
df_sd_kmm.index.name = 'School_districts'
df_sd_kmm

#### GeoSeries Centroids
<p>
Centroids are derived from geometry variables.<br>
Geometry variables are shapes or areas consisting of lines or polygons.<br>
Centroids are points at the centers of these areas.
</p>    
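
A small sketch with made-up shapes illustrates the idea, and also one caveat: for a concave shape the centroid can fall outside the shape itself. Shapely's `representative_point()` returns a point that is guaranteed to lie within the geometry, which may be preferable for labels or point-in-polygon checks.

```python
from shapely.geometry import Polygon

# A simple rectangle: the centroid is the geometric middle.
rect = Polygon([(0, 0), (4, 0), (4, 2), (0, 2)])
print(rect.centroid)

# A U-shaped polygon: the centroid falls into the notch, outside the shape.
u_shape = Polygon([(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)])
print(u_shape.centroid, u_shape.contains(u_shape.centroid))

# representative_point() always lies within the polygon.
inside = u_shape.representative_point()
print(inside, u_shape.contains(inside))
```

School districts with irregular outlines may behave like the U shape, so a centroid is best read as a center of mass rather than as a location inside the district.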


In [None]:
print(sdist_m.columns)
print(sdist_m[['district', 'city', 'geometry']].head())


In [None]:
sdist_m.crs

In [None]:
sdistrict_centroids = \
pd.DataFrame(list(zip(sdist_m['district'], sdist_m.geometry.centroid)),
             columns=['District', 'Centroid'])

print(type(sdistrict_centroids))
sdistrict_centroids

In [None]:
sdistrict_centroids.iloc[0,:]

In [None]:
print(sdist_m['district'][0])
district_1 = sdist_m.geometry[0].centroid
print(district_1)

In [None]:
dist1_series = geop.GeoSeries(district_1, crs='EPSG:3857')

print(type(dist1_series))
print(dist1_series.crs)
print(dist1_series)

In [None]:
print(sdist_m['district'][8])

district_7 = sdist_m.geometry[8].centroid

print(district_7)
print(type(district_7))

In [None]:
dist7_series = geop.GeoSeries(district_7, crs='EPSG:3857')

print(type(dist7_series))
print(dist7_series.crs)
print(dist7_series)

In [None]:
sdist_m.geometry.centroid

In [None]:
sdist_m.geometry.centroid.distance(sdist_m.geometry.centroid)

Center of Nashville

In [None]:
nv_center = nv_shape_unary_gdf.to_crs(3857).centroid
nv_center

In [None]:
nv_center_4 = nv_center.to_crs(4326)
nv_center_4

In [None]:
fig, ax = plt.subplots(1,1, figsize=(10,4))

nv_shape_unary_gdf.to_crs(3857).plot(edgecolor='black', linewidth=3, color='slateblue', ax=ax)
nv_center.plot(ax=ax, edgecolor='black', linewidth=0.3, color='aqua')

for ax in fig.axes:
    plt.sca(ax)
    plt.xticks(rotation=90)

plt.title('Center of Nashville as a whole')
plt.show()

### GeoSeries.distance()

<p>
allows calculating the distance between geometries, here between two points.<br>
The "from" point here is the Nashville mayor's office, Public Square, Suite 100,
Nashville, TN 37201.<br>
Its latitude and longitude are 36.166840, -86.778200.<br>
The "other" points are the centroids from above.
</p> 
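
Distances measured in EPSG:3857 are stretched by roughly 1/cos(latitude), about 1.24 near Nashville. As a cross-check, a great-circle distance can be computed directly from longitudes and latitudes with the haversine formula. The sketch assumes a spherical earth with a mean radius of 6,371,008.8 m; the function name `haversine_m` and the second point are hypothetical.

```python
import math


def haversine_m(lon1, lat1, lon2, lat2, radius_m=6371008.8):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


# Mayor's office (from the text) and a made-up second location.
mayor_lon, mayor_lat = -86.778200, 36.166840
other_lon, other_lat = -86.70, 36.10

d = haversine_m(mayor_lon, mayor_lat, other_lon, other_lat)
print(f"{d:.0f} m, {d / 1000:.2f} km")
```

If the EPSG:3857 distances in this section come out roughly a quarter larger than the haversine values for the same points, that is the projection's stretching and not a data error.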

In [None]:
mayor = geop.GeoSeries(Point((-86.778200, 36.166840 )), crs="EPSG:4326")

print(type(mayor))
print(mayor)
print(mayor.crs)

In [None]:
# to_crs returns a new GeoSeries, so no explicit copy is needed
mayor_2 = mayor.to_crs(3857)

print(type(mayor_2))
print(mayor_2.crs)
print(mayor_2)
print(mayor_2.index)

Distance of district 1 centroid to the mayor's office.

In [None]:
dist_dist1 = mayor_2.distance(other=district_1)

distance_dist1_m = round(float(dist_dist1))
print(distance_dist1_m, 'm')

distance_dist1_km = round(float(dist_dist1 / 1000),2)
print(distance_dist1_km, 'km')

Distance of district 7 centroid to the mayor's office.

In [None]:
distance_dist7 = mayor_2.distance(other=district_7)

distance_dist7_m = round(float(distance_dist7))
print(distance_dist7_m, 'm')

distance_dist7_km = round(float(distance_dist7/1000),2)
print(distance_dist7_km, 'km')

Distances between mayor and every school district centroid.
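
`GeoSeries.distance` also accepts a single shapely geometry, in which case the distance from that geometry to every element is computed in one vectorized call. The sketch uses made-up points in a projected CRS; `office` and `centroids` are hypothetical stand-ins for the mayor's office and the district centroids.

```python
import geopandas as geop
from shapely.geometry import Point

# Hypothetical coordinates in meters (EPSG:3857), chosen for easy arithmetic.
office = Point(0, 0)
centroids = geop.GeoSeries([Point(3000, 4000), Point(-6000, 8000)], crs="EPSG:3857")

# One call returns the distance from `office` to each centroid.
distances_m = centroids.distance(office)
distances_km = (distances_m / 1000).round(2)

print(distances_m)
print(distances_km)
```

Both geometries have to be in the same CRS; the single shapely point carries no CRS of its own, so it is up to the caller to make sure its coordinates match.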

In [None]:
sd_centers_df = \
geop.GeoDataFrame(sdistrict_centroids, crs='EPSG:3857', geometry='Centroid')

sd_centers_df.rename(columns={'Centroid': 'geometry'}, inplace=True)

print(type(sd_centers_df))
print(sd_centers_df.crs)
print(sd_centers_df.head(1))

In [None]:
distances_mayor = []

for n, row in enumerate(sd_centers_df.iterrows()):
    # print(type(row[1]))
    # print(row[1][1])
    
    distance = round(float(mayor_2.distance(other=row[1][1])),2)
    distances_mayor.append(distance)
    
distances_mayor

In [None]:
sd_centers_df_2 = sd_centers_df.copy()

sd_centers_df_2.rename(columns={'geometry': 'district centroid'}, inplace=True)
sd_centers_df_2['distance_m'] = distances_mayor
sd_centers_df_2['distance km'] = round(sd_centers_df_2['distance_m'] / 1000, 2)

sd_centers_df_2

#### Spatial joins

<p>
A spatial join uses binary predicates such as intersects and crosses to combine two GeoDataFrames based on the spatial relationship between their geometries.
</p> 

<p>
A common use case might be a spatial join between a point layer and a polygon layer where you want to retain the point geometries and grab the attributes of the intersecting polygons.<br>
<a href="https://geopandas.org/gallery/spatial_joins.html" target="_blank">geopandas.org</a> 
</p> 

<p>
The matching is done on the geometry columns that every GeoDataFrame has.<br>
Without a geometry column, a data structure cannot be a GeoDataFrame.<br>
The join types in GeoPandas (inner, left, right) follow SQL logic.
</p>
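
A toy example with made-up geometries shows the difference between an inner and a left spatial join. Two square "districts" and three "schools" are used, one of which lies outside both squares. The sketch relies on the default predicate (intersects), so it does not depend on whether the installed GeoPandas names the argument `op` (as in 0.9.0) or `predicate` (0.10 and later).

```python
import geopandas as geop
from shapely.geometry import Point, box

# Hypothetical polygons and points, not Nashville data.
districts = geop.GeoDataFrame(
    {"district": ["west", "east"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)
points = geop.GeoDataFrame(
    {"school": ["a", "b", "c"]},
    geometry=[Point(0.5, 0.5), Point(1.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)

# Inner join (the default): only points that match a polygon are kept.
inner = geop.sjoin(points, districts)
# Left join: every point is kept; unmatched points get NaN in the polygon columns.
left = geop.sjoin(points, districts, how="left")

print(inner[["school", "district"]])
print(left[["school", "district"]])
```

Comparing the row counts of the inner and left results, as done with `.shape` in this section, is therefore a quick way to see how many points found no matching polygon.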



Joining schools with the main service districts.

In [None]:
# print(type(nv_shape))
# print(nv_shape)
# print(nv_shape.crs)
print(nv_shape.name)

In [None]:
# sdist_m : school district polygons (EPSG:3857)

# print(type(sdist_m.head(1))
# print(type(sdist_m))

sdist_m_4326 = sdist_m.to_crs(4326)
print(sdist_m_4326.crs)

An 'intersects' join returns the columns of both data frames for every matching pair.

In [None]:
schools_intersect_maindistricts = geop.sjoin(sd_gp, nv_shape, op = 'intersects')
print(schools_intersect_maindistricts.shape)


All schools fall into the main service districts.

In [None]:
print(schools_intersect_maindistricts.name.unique())
print(schools_intersect_maindistricts.name.value_counts())

In [None]:
print("Schools in maindistricts: {}".format(schools_intersect_maindistricts.shape[0]))

# schools_intersect_maindistricts.head(1)
schools_intersect_maindistricts.columns.tolist()

The variables 'index_right' and 'area_sq_mi', for example, come from the main districts df (nv_shape).<br>
School location points do not have areas.

In [None]:
schools_intersect_maindistricts.plot(edgecolor='black', color='magenta')
plt.show()

Does a left join change anything?<br>
Probably not in this case as all schools fall into one of the two districts.

In [None]:
schools_intersect_maindistricts_left = geop.sjoin(sd_gp, nv_shape, how = 'left')

# print(type(schools_intersect_maindistricts_left))
print(schools_intersect_maindistricts_left.shape)

Does the 'within' predicate change anything?<br>
Probably not in this case.

In [None]:
schools_intersect_maindistricts_within = geop.sjoin(sd_gp, nv_shape, op = 'within')

print(type(schools_intersect_maindistricts_within))
print(schools_intersect_maindistricts_within.shape)
print("Schools within main districts: {}".format(schools_intersect_maindistricts_within.shape[0]))

The same holds here, as all schools fall into one of the two districts.

Joining schools with school districts.

In [None]:
# print(sd_gp.head(1))
# print(sd_gp.crs)
# print(type(sd_gp))
# print(sd_gp.head())

In [None]:
school_intersect_districts = geop.sjoin(sd_gp, sdist_m_4326, op = 'intersects')
print(type(school_intersect_districts))

In [None]:
print(sd_gp.shape)
print(school_intersect_districts.shape)

The columns of the joined df consist of the columns of both input dfs.

In [None]:
# concat allows combining Series of different lengths into one df

intersect_schools = \
pd.concat([pd.Series(sd_gp.columns.tolist()), 
           pd.Series(sdist_m_4326.columns.tolist()),
           pd.Series(school_intersect_districts.columns.tolist())], 
           ignore_index=True, 
           axis=1)

intersect_schools.columns = ['sd_gp', 'sdist_m_4326' ,'Joined_df']

intersect_schools = intersect_schools.fillna(0)
intersect_schools

In [None]:
school_intersect_districts.plot(edgecolor='black', color='magenta')
plt.show()

Joining schools with NV neighborhoods.<br>
Unlike above, not every school might fall into a neighborhood.

In [None]:
# print(type(nv))
# print(nv.head(1))

In [None]:
schools_intersect_neighborhoods = geop.sjoin(sd_gp, nv, op = 'intersects')
schools_intersect_neighborhoods_left = geop.sjoin(sd_gp, nv, how = 'left')
schools_intersect_neighborhoods_right = geop.sjoin(sd_gp, nv, how = 'right')

print(type(schools_intersect_neighborhoods))
# print(schools_intersect_neighborhoods.head(1))

In [None]:
print("Intersect:", schools_intersect_neighborhoods.shape)
print("Left:", schools_intersect_neighborhoods_left.shape)
print("Right:", schools_intersect_neighborhoods_right.shape)

In [None]:
schools_intersect_neighborhoods.City.value_counts()

In [None]:
n_unique_neigh = len(schools_intersect_neighborhoods.name.unique())
print("Number of unique neighborhoods: {}".format(n_unique_neigh))

In [None]:
print("Schools intersecting with neighborhoods: {}".format(schools_intersect_neighborhoods.shape[0]))

Not every neighborhood has a school.

In [None]:
schools_intersect_neighborhoods.plot(edgecolor='black', color='magenta')
plt.show()

Top five neighborhoods by number of schools.

In [None]:
schools_intersect_neighborhoods[['School Name', 'name']].groupby('name').aggregate('count').\
sort_values(ascending=False, by='School Name')[:5]

Joining areas (polygons).

In [None]:
maindist_inter_neigh_within = geop.sjoin(nv, nv_shape, op='within')
print(maindist_inter_neigh_within.shape)

In [None]:
maindist_inter_neigh = geop.sjoin(nv, nv_shape, op='intersects')
print(maindist_inter_neigh.shape)

In [None]:
maindist_inter_neigh.head(2)

In [None]:
print(maindist_inter_neigh.columns.tolist())

In [None]:
maindist_inter_neigh['name_left'].unique()[:5]

In [None]:
maindist_inter_neigh['name_right'].value_counts().sort_values(ascending=False)[:10]

Of the neighborhoods 195 are in the USD and 112 in the GSD.

In [None]:
maindist_inter_neigh = maindist_inter_neigh.\
                       rename(columns={'name_right':'main_Districts', 'name_left':'neigborhoods'})
# maindist_inter_neigh.main_Districts

In [None]:
maindist_inter_neigh[['neigborhoods', 'main_Districts']].groupby('main_Districts').agg('count')

# Chicken permits in Nashville data set
## Import json data: chicken permits in Nashville

In [None]:
chickens = pd.read_json('https://data.nashville.gov/resource/vpdy-5e23.json')

## Data exploration: chicken permits data

In [None]:
print(type(chickens))
print(chickens.shape)
print(chickens.columns.tolist())

In [None]:
chickens.head(3)

In [None]:
chickens.tail(3)

In [None]:
chickens.info()

In [None]:
chickens = chickens.drop(chickens.columns[7:13], axis=1)

In [None]:
nas =       pd.DataFrame(zip(chickens.isna().sum(), chickens.count(),
                chickens.isna().sum() + chickens.count()), 
            columns=['NA\'s', 'Values', 'Total'], 
            index=chickens.columns)

nas

### Groupby: chicken permits data

In [None]:
gb_district = chickens.groupby('district')['permit'].sum()
gb_district[:3]

In [None]:
gb_district.sort_values().plot(kind='bar', 
                               figsize=(15,7), edgecolor='black', linewidth=4,
                               color=['magenta', 'yellow', 'cyan', "crimson", 'brown'],
                               title='Chicken permits by districts', fontsize=15,
                               grid=True)

plt.title('Chicken permits by districts', fontsize=23, fontweight='bold')

plt.show()

### Geospatial data exploration: chicken permits data
<b>Longitude and latitudes</b>

<p>
The geocoordinates are stored together with other information
in the variable mapped_location.<br>
For tidy data, longitude and latitude should each be stored in their own column.
</p> 

In [None]:
with pd.option_context('display.max_rows', None,
                       'display.max_columns', None,
                       'display.precision', 1000,
                       ):
    print(chickens.loc[:5, ['city', 'mapped_location']])

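# Why doesn't option_context print the whole dict?
# Long cell values are truncated by 'display.max_colwidth' (default 50), which is not set above.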

Trying to understand the construction.

In [None]:
ml = chickens.mapped_location
print(type(ml))
print(ml[:3])

In [None]:
ml10 = ml[10]
print(type(ml10))
print(ml10['latitude'])

Conclusion:<br>
This is a nested construction: n dicts are stored inside a pd.Series.<br>
Therefore the Series is iterated with list comprehensions instead of being accessed as a dict of dicts.
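
A possible alternative to the list comprehensions is to expand the Series of dicts into columns in one step. The sketch uses a small made-up Series; the key `other_info` is a hypothetical placeholder for whatever else is stored next to the coordinates.

```python
import pandas as pd

# Hypothetical stand-in for chickens.mapped_location: one dict per row,
# with the coordinates stored as strings.
ml_demo = pd.Series([
    {"latitude": "36.17", "longitude": "-86.78", "other_info": "made up"},
    {"latitude": "36.10", "longitude": "-86.70", "other_info": "made up"},
])

# Each dict becomes a row, each key a column; keep and convert the coordinates.
coords = pd.DataFrame(ml_demo.tolist(), index=ml_demo.index)
coords = coords[["latitude", "longitude"]].astype(float)

print(coords)
print(coords.dtypes)
```

This only works as written if every entry is a dict with both keys; rows with missing locations would need to be filtered or filled first.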

In [None]:
latitudes = [i['latitude'] for i in ml]
latitudes = [float(i) for i in latitudes]

In [None]:
print(type(latitudes[99]))
print(len(latitudes))
print(latitudes[:5])

In [None]:
longitudes = [i['longitude'] for i in ml]
longitudes = [float(i) for i in longitudes]

In [None]:
print(longitudes[:5])
print(type(longitudes[15]))
print(len(longitudes))

In [None]:
chickens_2 = chickens.copy()

In [None]:
chickens_2['Longitudes'] = longitudes
chickens_2['Latitudes'] = latitudes

In [None]:
chickens_2.columns

In [None]:
chickens_2[['Longitudes', 'Latitudes']].head()

In [None]:
geo_354 = chickens_2[['Longitudes', 'Latitudes']]
# a list keeps the order of the statistics; a set is unordered
geo_354.aggregate(['min', 'max', 'median'])

In [None]:
geo_354.head(3)

In [None]:
geo_354.info()

### Is there a problem with overplotting?

A lot of geocoordinate points are close to each other and fit in nearby bins.



In [None]:
fig = plt.figure(figsize=(15,5))
ax1 = fig.add_subplot(1,2,1)
sns.histplot(x='Longitudes', data=geo_354, edgecolor='black', color='green', ax=ax1, bins=50)

ax2 = fig.add_subplot(1,2,2)
sns.histplot(x='Latitudes', data=geo_354, edgecolor='black', color='crimson', ax=ax2, bins=50)
plt.show()


plt.scatter(x=chickens_2['Longitudes'], y=chickens_2['Latitudes'], s=0.5, alpha=0.7)
plt.show()

# jointplot creates its own figure; its size is controlled by `height`
g = sns.jointplot(x='Longitudes', y='Latitudes',  data= chickens_2, height=5,
                  color='g', s=2)
# x shows the longitudes and y the latitudes
g.set_axis_labels('longitude', 'latitude')
g.ax_joint.grid()
plt.show()

# `fill` replaces the deprecated `shade` argument (seaborn >= 0.11)
sns.kdeplot(data = chickens_2, x='Longitudes', y='Latitudes', cmap="Reds", fill=True)
plt.title('Determining overplotting with a 2D density graph', loc='left')
plt.show()



In [None]:
plt.figure(figsize=(20,5))
sns.stripplot(x='Longitudes', y='Latitudes',  data= chickens_2, s=4)
plt.xticks(rotation = 90)
plt.grid()
plt.show()

### Locations: chicken permits data

In [None]:
sns.set_style('darkgrid')
fig, (ax1, ax2) = plt.subplots(2,1, figsize=(8,10))

sns.scatterplot(x='Longitudes', y='Latitudes', data= chickens_2, ax= ax1, 
                alpha=0.2, 
                s=200,
                marker = "o")
ax1.set_xlim(left=-86.9, right=-86.7)
ax1.set_title('Locations: chicken permits data', fontsize=23)


sns.scatterplot(x='Longitudes', y='Latitudes',  data= chickens_2, ax= ax2, hue='permit', s=200)
ax2.set_xlim(left=-86.9, right=-86.7)
plt.show()

In [None]:
sns.set_style('darkgrid')
fig, (ax1, ax2) = plt.subplots(2,1, figsize=(8,10))

sns.scatterplot(x='Longitudes', y='Latitudes', data= chickens_2, ax= ax1, alpha=0.2, size="permit")
ax1.set_xlim(left=-86.9, right=-86.7)
ax1.set_title('Locations: chicken permits data', fontsize=23)


sns.scatterplot(x='Longitudes', y='Latitudes', data= chickens_2, ax= ax2, hue='permit')
ax2.set_xlim(left=-86.9, right=-86.7)


for ax in fig.axes:
    plt.sca(ax)
    plt.xticks(rotation=90)


    
plt.subplots_adjust(hspace=0.3)
plt.show()

### Combining shape polygons & scatter location plots
<p>
into layered plots.
</p>


In [None]:
fig, ax1 = plt.subplots(1,1, figsize=(12, 12))
nv_shape.plot(column='name', legend=True, ax=ax1, edgecolor='black',
              linewidth=1, legend_kwds={'loc':'lower left'}, aspect=0.8)
sns.scatterplot(x='Longitudes', y='Latitudes', data= chickens_2, ax= ax1, color='red', 
                size="permit", legend=False)
ax1.set_title('Locations of chicken permits in Nashville general service districts', fontsize=12)
plt.show()

In [None]:
fig, ax1 = plt.subplots(1,1, figsize=(10,5))

nv.plot(ax=ax1, column='name', legend=False, aspect=0.5)
sns.scatterplot(x='Longitudes', y='Latitudes', data= chickens_2, ax= ax1, color='black')
ax1.set_xlim(left=-87, right=-86.7)
ax1.set_title('Locations of chicken permits in Nashville sub service districts', fontsize=12)

plt.show()

It seems that many permits were issued for the same location or chicken farm.
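One way to check this impression is to count how often each coordinate pair occurs. The sketch below uses only the standard library and a small made-up list of (longitude, latitude) pairs; in the notebook the pairs would come from the `Longitudes` and `Latitudes` columns of `chickens_2`. The helper name `count_repeated_locations` and the rounding to five decimals are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical (longitude, latitude) pairs standing in for the permit data
permit_coords = [
    (-86.78160, 36.16590),
    (-86.78160, 36.16590),
    (-86.78160, 36.16590),
    (-86.75012, 36.20417),
    (-86.80231, 36.11805),
    (-86.80231, 36.11805),
]


def count_repeated_locations(coords, ndigits=5):
    """Return {location: count} for locations that occur more than once."""
    rounded = [(round(lon, ndigits), round(lat, ndigits)) for lon, lat in coords]
    counts = Counter(rounded)
    return {loc: n for loc, n in counts.items() if n > 1}


repeated = count_repeated_locations(permit_coords)
# Print the most frequent locations first
for location, n in sorted(repeated.items(), key=lambda item: -item[1]):
    print(location, n)
```

Locations that appear several times are the ones that overlap in the scatter plots above.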

# Creating interactive maps of Nashville with Folium

<p>
Folium is a Python package based on the leaflet.js library,<br>
or in the words of the <a href="https://python-visualization.github.io/folium/" target="_blank">Folium 0.12.1 documentation:</a>
</p> 

<p>
"Folium makes it easy to visualize data that’s been manipulated in Python on an interactive leaflet map. It enables both the binding of data to a map for choropleth visualizations as well as passing rich vector/raster/HTML visualizations as markers on the map."
</p>

<p>
Below, a map is created that is centered on <b>Graceland, the home of Elvis Presley, in Memphis</b>, TN.<br>
Latitude: 35.0459425757 / Longitude: -90.0229972176
</p> 

In [None]:
# Shapely points are (x, y) = (longitude, latitude)
graceland_df = pd.DataFrame([Point((-90.0229972176, 35.0459425757))],
                            columns=['geometry'])

graceland_gdf = geop.GeoDataFrame(graceland_df, geometry='geometry', crs='EPSG:4326')

graceland_gdf 

In [None]:
# Shapely points are (x, y) = (longitude, latitude)
graceland_point = Point((-90.0229972176, 35.0459425757))
print(graceland_point)
print(type(graceland_point))


<b>How are the coordinates of a Shapely point extracted?</b><br>

<p>
Indexing the point directly, as if it were a tuple, raises an error.<br>
Instead, use the coords attribute.<br>
<a href="https://stackoverflow.com/questions/20474549/extract-points-coordinates-from-a-polygon-in-shapely" target="_blank">
stackoverflow.com</a> 
</p> 


In [None]:
print(graceland_point.coords[0][0])
print(type(graceland_point.coords[0][0]))
print(graceland_point.coords[0][1])

# coords[0] is (x, y) = (longitude, latitude); Folium expects [latitude, longitude]
grace_long = graceland_point.coords[0][0]
grace_lat = graceland_point.coords[0][1]
grace_coord = [grace_lat, grace_long]
print(grace_coord)

In [None]:
graceland_gseries = geop.GeoSeries(graceland_point, crs="EPSG:4326")

print(type(graceland_gseries))
print(graceland_gseries)
print(graceland_gseries[0])

Another way is to use the x and y attributes.

In [None]:
graceland_centerpoint = graceland_gseries[0]
grace_y_lat = graceland_centerpoint.y
grace_x_long = graceland_centerpoint.x

print(grace_y_lat)
print(grace_x_long)

In [None]:
f = folium.Figure(width=600, height=300)

graceland = folium.Map(location = grace_coord, zoom_start=28, tiles='openstreetmap').add_to(f)
#display(graceland)

In [None]:
graceland_gseries

In [None]:
grace_tojson = graceland_gseries.to_json()
grace_tojson
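GeoJSON stores a position as [longitude, latitude], while a Folium map is centered with [latitude, longitude]. The sketch below builds a point feature by hand with the standard library to make the two orders explicit. The helper names `point_feature` and `folium_location` are hypothetical; they are not part of GeoPandas or Folium.

```python
import json

# Graceland coordinates as given in the text above
GRACELAND_LAT = 35.0459425757
GRACELAND_LON = -90.0229972176


def point_feature(lon, lat, name):
    """Build a GeoJSON Feature for a point; GeoJSON positions are [lon, lat]."""
    return {
        "type": "Feature",
        "properties": {"name": name},
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
    }


def folium_location(feature):
    """Return [lat, lon], the order used for a map centre, from a point Feature."""
    lon, lat = feature["geometry"]["coordinates"]
    return [lat, lon]


feature = point_feature(GRACELAND_LON, GRACELAND_LAT, "Graceland")
print(json.dumps(feature))
print(folium_location(feature))
```

Keeping the swap in one small function makes it harder to mix up the two conventions when the same point is used for both the GeoJSON layer and the map centre.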

In [None]:
graceland_json = folium.GeoJson(grace_tojson)

In [None]:
folium.Popup('Graceland').add_to(graceland_json)

In [None]:
graceland_json.add_to(graceland)

In [None]:
display(graceland)

Clicking the point on the map should open the pop-up 'Graceland'.

#### Folium map of schools in Nashville

In [None]:
geocoord_NV_center = [float(nv_center_4.geometry.y), float(nv_center_4.geometry.x)]

In [None]:
sd_gp.columns

In [None]:
geocoords_schools_map = []
school_names = []
for row in sd_gp.iterrows():
        geocoord_schools = [row[1]['Latitude'], row[1]['Longitude']]
        geocoords_schools_map.append(geocoord_schools)
        school_name = row[1]['School Name']
        school_names.append(school_name)
        
geocoords_schools_map[:2]
school_names[:4]

In [None]:
f_schools = folium.Figure(width = 500, height = 300)

school_title = """Schools in Nashville within school districts"""

school_title_2 = \
'''<h1 align="center" style="font-size:16px; font-style:italic"><b>{}</b></h1>
'''.format(school_title )

map_schools = folium.Map(location=geocoord_NV_center,
                         width='100%', height='100%', left='0%', top='0%', 
                         position='relative', tiles='OpenStreetMap', 
                         attr=None, min_zoom=2, max_zoom=18, 
                         zoom_start=11, min_lat=- 90, max_lat=90, min_lon=- 180, 
                         max_lon=180, max_bounds=False, crs='EPSG3857', 
                         control_scale=False, prefer_canvas=False, 
                         no_touch=False, disable_3d=False, png_enabled=False)

map_schools.get_root().html.add_child(folium.Element(school_title_2))

# Add school district borders
map_schools_borders = folium.GeoJson(nv.geometry,
                                     style_function = \
                                     lambda x: 
                                     {'color':'black', 'weight':0.5, 'fillColor':'blue'}).\
                                     add_to(map_schools)


for row in sd_gp.iterrows():
        geocoord_schools = [row[1]['Latitude'], row[1]['Longitude']]
        school_name = row[1]['School Name']
        folium.Marker(location=geocoord_schools, popup=school_name,
                      icon = folium.Icon(color='green', icon_color='white', icon='heart-empty', 
                                     angle=0, prefix='glyphicon')).add_to(map_schools)
        


display(map_schools)

# Art work in Nashville data set
## Quick reproduction: art work in Nashville

In [None]:
arty = pd.read_csv('datasets/public_art.csv')

print(type(arty))
print(arty.info())
print(arty[['Title', 'Latitude', 'Longitude']].head())

In [None]:
arty_gp = arty.copy()
arty_gp['geometry'] = arty.apply(lambda x: Point((x.Longitude, x.Latitude)), axis=1)

# EPSG:4326 = WGS 84, coordinates in decimal degrees
arty_gp = geop.GeoDataFrame(arty_gp, crs='EPSG:4326', geometry='geometry')


print(type(arty_gp))
print(arty_gp.crs)
print(arty_gp[['Title', 'Latitude', 'Longitude', 'geometry']].head())

In [None]:
fig, ax1 = plt.subplots(1,1, figsize=(12, 12))
plt.tight_layout()

nv_shape.plot(column='name', legend=True, ax=ax1, edgecolor='black',
              linewidth=1, legend_kwds={'loc':'upper right'}, aspect=0.8)

arty_gp.plot(column ='Title', legend=True, ax=ax1, edgecolor='black',
             linewidth=1, 
             legend_kwds={'loc':'upper center', 'bbox_to_anchor':(0.5, -0.05), 'ncol':4}, 
             aspect=0.8)

ax1.set_title('Locations of art in Nashville general service districts', fontsize=17,
              fontweight='bold')
plt.show()

In [None]:
fig, ax1 = plt.subplots(1,1, figsize=(12, 12))
plt.tight_layout()

nv.plot(column='name', legend=True, ax=ax1, edgecolor='black',
        linewidth=1, legend_kwds={'loc':'upper right'}, aspect=0.8,
         cmap='gist_gray')

arty_gp.plot(column ='Title', legend=True, ax=ax1, edgecolor='black',
             linewidth=1, 
             legend_kwds={'loc':'upper center', 'bbox_to_anchor':(0.5, -0.05), 'ncol':4}, 
             aspect=0.8, cmap='Reds')

ax1.set_title('Locations of art in Nashville neighborhoods', fontsize=17,
              fontweight='bold')
plt.show()

#### Geospatial join

In [None]:
# `predicate` replaces the older `op` keyword in recent GeoPandas versions
art_within_neighbors = geop.sjoin(arty_gp, nv, predicate='within')
print(type(art_within_neighbors))

print(art_within_neighbors.crs)
print(art_within_neighbors.shape)
print(arty_gp.shape)

art_within_neighbors.head(1)

Only 40 pieces of art are within neighbourhoods.
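The `within` predicate of the spatial join asks, for every point, whether it lies inside a polygon. GeoPandas leaves that test to Shapely; the sketch below is not that implementation but a plain ray-casting version, written only to show the idea. The square "neighborhood" and the two art locations are made up.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; points exactly on the
    boundary are not handled specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical square neighborhood and two hypothetical art locations (lon, lat)
neighborhood = [(-86.80, 36.15), (-86.76, 36.15), (-86.76, 36.18), (-86.80, 36.18)]
art_inside = (-86.78, 36.16)
art_outside = (-86.70, 36.16)

print(point_in_polygon(*art_inside, neighborhood))
print(point_in_polygon(*art_outside, neighborhood))
```

A point that falls in none of the polygons has no match, which is why the joined frame can have fewer rows than `arty_gp`.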

In [None]:
neighbor_12 = pd.Series(art_within_neighbors.name.unique().tolist())

In [None]:
art_within_neighbors.columns.tolist()
art_within_neighbors[['name', 'Title']].groupby('name').\
agg('count').sort_values(by='Title', ascending=False)

Most of the art is in the Urban Residents neighborhood. Let's take a closer look at it.

In [None]:
urban_art = art_within_neighbors[art_within_neighbors.name=='Urban Residents']

print(type(urban_art))
print(urban_art.shape)
print(urban_art.columns.tolist())

In [None]:
urban_neighbors = nv.loc[nv.name=='Urban Residents']

print(urban_neighbors.shape)

In [None]:
fig, ax = plt.subplots(1,1, figsize=(12,5))

urban_neighbors.plot(edgecolor='black', linewidth=3, color='lime', ax=ax)
urban_art.plot(edgecolor='black', linewidth=3, ax=ax, column='Type', legend=True,
               legend_kwds={'ncol':1, 'bbox_to_anchor':(1, 1), 'loc':'upper center'}
               )

for ax in fig.axes:
    plt.sca(ax)
    plt.xticks(rotation=90)

plt.title('Art work in Urban Residents')
plt.show()

In [None]:
urban_neighbors = urban_neighbors.to_crs(3857)
print(urban_neighbors.crs)
urban_area_km_squared = urban_neighbors.geometry.area / (10**6)
urban_area_km_squared 
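Areas measured in EPSG:3857 are inflated away from the equator, so the value above is best read as an approximation. For a spherical Mercator projection the area scale factor is about 1 / cos²(latitude). The sketch below evaluates it for the latitude of the Nashville map centre used in this notebook; the function name `mercator_area_scale` and the example area of 10 km² are assumptions made for illustration.

```python
import math


def mercator_area_scale(lat_deg):
    """Approximate factor by which spherical Web Mercator inflates areas."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


nashville_lat = 36.1636  # latitude of the Nashville map centre in this notebook
factor = mercator_area_scale(nashville_lat)
print(round(factor, 2))

# Hypothetical projected area, corrected with the approximate factor
area_3857_km2 = 10.0
print(round(area_3857_km2 / factor, 2))
```

For precise areas, an equal-area projection suited to the region would be the safer choice.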

#### Nashville Urban Center Residents district.

<p>
Creating an interactive Folium map.
</p> 


#### Add (district) border shape to the Folium map.

In [None]:
urban_neighbors

In [None]:
urban_center = urban_neighbors.geometry.centroid
print(type(urban_center))
print(urban_center.crs)

urban_center

In [None]:
# crs?
print(urban_center.crs)
urban_c566 = urban_center.to_crs(4326)
print(urban_c566.crs)
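The `to_crs` calls switch between degrees (EPSG:4326) and metres (EPSG:3857). GeoPandas hands this work to pyproj; the sketch below is not that code but the textbook spherical Mercator formulas, shown to make clear what the conversion does. The function names are hypothetical.

```python
import math

# WGS 84 semi-major axis in metres, used as the sphere radius by EPSG:3857
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert degrees (EPSG:4326) to metres (spherical Web Mercator)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def web_mercator_to_lonlat(x, y):
    """Convert metres (spherical Web Mercator) back to degrees."""
    lon_deg = math.degrees(x / EARTH_RADIUS_M)
    lat_deg = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon_deg, lat_deg


# Nashville map centre used in this notebook
nashville_lon, nashville_lat = -86.7823, 36.1636
x, y = lonlat_to_web_mercator(nashville_lon, nashville_lat)
print(x, y)
print(web_mercator_to_lonlat(x, y))
```

The round trip returns the original longitude and latitude up to floating-point error, which is why a centroid computed in EPSG:3857 can be converted back to EPSG:4326 for Folium.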

In [None]:
uc_geoc = [ urban_c566.y , urban_c566.x]
uc_geoc = [float(i) for i in uc_geoc]
uc_geoc 

In [None]:
fig, ax = plt.subplots(1,1, figsize=(12,5))

urban_neighbors.plot(edgecolor='black', linewidth=3, color='lime', ax=ax)
urban_center.plot(ax=ax, edgecolor='black', linewidth=3, cmap='gist_rainbow')

for ax in fig.axes:
    plt.sca(ax)
    plt.xticks(rotation=90)

plt.title('Center of Urban Residents')
plt.show()

In [None]:
f = folium.Figure(width=400, height=300)

nv_urbanc_map = folium.Map(location=uc_geoc, zoom_start=14,
                           tiles="openstreetmap").add_to(f)
display(nv_urbanc_map)

In [None]:
ub_borders =   folium.GeoJson(urban_neighbors.geometry,
               style_function= \
               lambda x: {'fillColor': 'crimson',
                          "color": "#ff7800",
                          "weight": 5,
                          "opacity": 0.8,
                          "stroke":True}).add_to(nv_urbanc_map)

#### Add a pop up to the (district) border shape of the Folium map.

<p>
The pop-up text may contain HTML.
</p> 

In [None]:
nv_descript = \
"""<strong>Nashville</strong> is the capital and most populous city of the U.S. state of Tennessee. 
The city is the county seat of Davidson County and is located on the Cumberland River.
It is the 23rd most-populous city in the United States.
<a href="https://en.wikipedia.org/wiki/Nashville,_Tennessee"
target="_blank">Wikipedia</a>"""

folium.Popup(nv_descript ).add_to(ub_borders)

In [None]:
display(nv_urbanc_map)

Lockeland Springs district.

In [None]:
lock = nv.loc[nv.name == 'Lockeland Springs'].to_crs(3857)
print(lock)
print(lock.crs)

lock_art = art_within_neighbors.loc[art_within_neighbors.name == 'Lockeland Springs'].to_crs(3857)
print(lock_art[['Title', 'Medium', 'Location']])
print(lock_art.crs)

lock_center = nv.loc[nv.name=='Lockeland Springs'].to_crs(3857).geometry.centroid
print(lock_center)
print(type(lock_center))
print(lock_center.crs)

In [None]:
lock_center_4326 = lock_center.to_crs(4326)
# y is the latitude and x the longitude; Folium expects [latitude, longitude]
lock_lat = lock_center_4326.geometry.y.iloc[0]
lock_long = lock_center_4326.geometry.x.iloc[0]
lock_coords = [float(lock_lat), float(lock_long)]
print(lock_coords)

In [None]:
fig, ax = plt.subplots(1,1, figsize=(10,4))

lock.plot(edgecolor='black', linewidth=3, color='crimson', ax=ax)
lock_art.plot(column='Title', ax=ax, linewidth=0.3, edgecolor='black', legend=True,
              legend_kwds={'ncol':1, 'bbox_to_anchor':(0.5, -0.1), 'loc':'upper center'})
lock_center.plot(ax=ax, color='lime')

plt.title('Art in Lockeland Springs')
plt.show()

In [None]:
# Add title - Thanks to: 
# https://stackoverflow.com/questions/61928013/adding-a-title-or-text-to-a-folium-map

loc = 'Shape of Lockeland Springs on a Folium map of Nashville'
title_html = '''
             <h3 align="center" style="font-size:16px"><b>{}</b></h3>
             '''.format(loc)   

f = folium.Figure(width=600, height=300)

map_lock = folium.Map(location=lock_coords, zoom_start=14).add_to(f)
folium.GeoJson(lock.geometry).add_to(map_lock)

map_lock.get_root().html.add_child(folium.Element(title_html))

map_lock.save('lock.html')

display(map_lock)

<strong>Creating an interactive Folium map.</strong> 
<p>
This includes adding the urban district borders, a title, and a pop-up to the map.
</p> 

In [None]:
# print(urban_neighbors)
urban_center_4326 = urban_center.to_crs(4326)
print(urban_center_4326.crs)

urban_center_geocoord_list = [float(urban_center_4326.y), float(urban_center_4326.x)]
print(urban_center_geocoord_list)

title_urbc = 'Nashville centered around the center of the urban district within borders'

title_urbc_html = '''
                  <h2 align="center" style="font-size:14px; font-style:italic"><b>{}</b></h2>
           
                  '''.format(title_urbc)
# print(title_urbc_html)

f_nv_ubc = folium.Figure(width=600, height=300)

nv_urbcenter = folium.Map(location=urban_center_geocoord_list,
                          zoom_start=14, tiles='Stamen Terrain').add_to(f_nv_ubc)

# folium.GeoJson(lock.geometry).add_to(map_lock)
urbd_borders = folium.GeoJson(urban_neighbors.geometry,
               style_function= \
               lambda x: {'fillColor': 'limegreen',
                          "color": "plum",
                          "weight": 5,
                          "opacity": 0.8,
                          "stroke":True}).add_to(nv_urbcenter)

nv_urbcenter.get_root().html.add_child(folium.Element(title_urbc_html))
folium.Popup(nv_descript).add_to(urbd_borders)



In [None]:
display(nv_urbcenter)

<strong>Adding markers to the map.</strong>

<p>
Iterating over the rows of the df gets the markers.
</p> 

In [None]:
urban_art = urban_art.copy()
urban_art.iloc[0,0] = 'Fourth and Commerce Sculpture'

In [None]:
urban_art.iloc[0,0]

In [None]:
# print(urban_art.head(1))
# print(urban_art.geometry.y)
# print(urban_art.columns)
# Getting the index number of the geometry column
# print(len(urban_art.geometry))

In [None]:
print(urban_art.isna().sum())

The NA values could be a problem when generating pop-ups.

In [None]:
urban_art = urban_art.fillna("")
print(urban_art.isna().sum())

Special characters such as "'" could also cause problems.

In [None]:
# regex=True replaces the apostrophe inside longer strings, not only cells equal to "'"
urban_art = urban_art.replace(to_replace="'", value='`', regex=True)
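An alternative to swapping out the apostrophe is to escape special characters before they are placed into the pop-up HTML. The sketch below uses `html.escape` from the standard library; this is a different technique from the replacement used in this notebook. The helper name `popup_html` and the sample title and description are made up.

```python
import html


def popup_html(title, description):
    """Build a small HTML snippet for a pop-up, escaping special characters."""
    safe_title = html.escape(title, quote=True)
    safe_description = html.escape(description, quote=True)
    return (
        '<strong>' + safe_title + '</strong>'
        + '<p style="font-size:12px">' + safe_description + '</p>'
    )


# Hypothetical title and description containing characters that can break markup
snippet = popup_html("Artist's <Untitled>", 'Bronze & steel, "untitled"')
print(snippet)
```

Escaping keeps the original text readable in the pop-up, whereas a replacement changes the characters that are displayed.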

In [None]:
type(urban_art)

In [None]:
geo_index = urban_art.columns.get_loc('geometry')
geo_index_title = urban_art.columns.get_loc('Title')

Example: get the titles of the pieces of art by iterating over the rows.

In [None]:
urban_art_pieces = []
for _, row_values in urban_art.iterrows():
    # 'Title' is the first column of the art data
    piece_of_art = row_values['Title']
    urban_art_pieces.append(piece_of_art)
    
print(len(urban_art_pieces))
print(urban_art_pieces[:3])

In [None]:
title_urbc_html_2 = \
"""Art in the Nashville urban district marked on the map."""

title_urbc_html_2 = '''
                  <h2 align="center" style="font-size:14px; font-style:italic"><b>{}</b></h2>
           
                  '''.format(title_urbc_html_2)

nv_urbcenter_2 = folium.Map(location=urban_center_geocoord_list,
                            zoom_start=16, tiles='Stamen Terrain')

urbd_borders_2 = folium.GeoJson(urban_neighbors.geometry,
               style_function= \
               lambda x: {'fillColor': 'limegreen',
                          "color": "plum",
                          "weight": 5,
                          "opacity": 0.8,
                          "stroke":True}).add_to(nv_urbcenter_2)

nv_urbcenter_2.get_root().html.add_child(folium.Element(title_urbc_html_2))
folium.Popup(nv_descript).add_to(urbd_borders_2)

display(nv_urbcenter_2)

<p>
The Folium marker produced an error, possibly because the text for the pop-up was too long.<br>
The pop-up was not rendered.<br>
The solution is to adjust the size of the pop-up frame to the text.<br>
Alternatively, the font size may be adjusted.<br>
Thanks to:<br>
<a href="https://stackoverflow.com/questions/62228489/python-folium-how-to-create-a-folium-map-marker-with-multiple-popup-text-line"
target="_blank">Stack</a> 
</p> 

In [None]:
urban_art.columns

In [None]:
for _, row_values in urban_art.iterrows():
    point = row_values['geometry']
    # Folium expects [latitude, longitude], i.e. [y, x]
    geocoords_uba = [point.y, point.x]

    title = (
        '<h3 align="center" style="font-size:12px; font-style:italic">'
        + row_values['Title'] + '</h3>'
    )

    # Wrap the HTML in an IFrame so that the pop-up gets a fixed-size frame
    iframe = folium.IFrame(title, width=100, height=60)
    popup = folium.Popup(iframe, max_width=100)

    marker_uba = folium.Marker(location=geocoords_uba, popup=popup)
    marker_uba.add_to(nv_urbcenter_2)

In [None]:
display(nv_urbcenter_2)

In [None]:
nv_urbcenter_23 = nv_urbcenter_2 


for row in urban_art.iterrows():
    row_values = row[1]
    # Folium expects [latitude, longitude], i.e. [y, x]
    geocoords_uba = [row_values['geometry'].y, row_values['geometry'].x]
    
    title_desc = '<strong align=center>' + row_values['Title'] + '</strong>'  + ': ' \
    + '<p style="font-size:12px">' + row_values['Description'] + '</p>'
    
    iframe = folium.IFrame(title_desc, width=300, height=100)
    
    popup = folium.Popup(iframe, max_width=300)
    
    # print(popup)
    
    folium.Marker(location=geocoords_uba, popup=popup, 
                  icon=folium.Icon(color="indigo")).add_to(nv_urbcenter_23)

In [None]:
display(nv_urbcenter_23)

## Choropleth 

<p>
A <b>choropleth</b> map is: 
</p> 
<p>
"A choropleth map (from Greek χῶρος choros 'area/region' and πλῆθος plethos 'multitude') is a type of thematic map in which a set of pre-defined areas is colored or patterned in proportion to a statistical variable that represents an aggregate summary of a geographic characteristic within each area, such as population density or per-capita income." (Wikipedia)
</p> 
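The idea of colouring areas "in proportion to a statistical variable" can be sketched without any plotting library: each value is assigned a class, and each class a colour. The sketch below uses equal-width intervals, which is only one of several classification schemes. The school counts, the three-colour palette and the function name `equal_interval_bins` are made up for illustration.

```python
def equal_interval_bins(values, n_bins):
    """Assign each value a class index from 0 to n_bins - 1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical school counts per district and a light-to-dark colour ramp
school_counts = {"1": 30, "2": 11, "3": 19, "4": 18, "5": 36}
palette = ["#edf8fb", "#66c2a4", "#006d2c"]

classes = equal_interval_bins(list(school_counts.values()), len(palette))
colours = {district: palette[c] for district, c in zip(school_counts, classes)}
print(colours)
```

Plotting libraries do this mapping internally, usually with a continuous colour map instead of a few discrete classes.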


In [None]:
sd_gp.columns.to_list()

In [None]:
schools_in_districts = sd_gp[['geometry', 'School Name', 'Latitude', 'Longitude']]

In [None]:
schools_in_districts.head(2)

In [None]:
print(sdist.crs)
print(sdist.geometry.crs)

In [None]:
sdist.head(1)

In [None]:
type(sd_gp)

In [None]:
sd_gp.head(1)

In [None]:
# schools_in_districts = gpd.sjoin(school_districts, schools_geo, op = 'contains')
# `predicate` replaces the older `op` keyword in recent GeoPandas versions
schools_in_districts = geop.sjoin(sdist, sd_gp, predicate='contains')

In [None]:
schools_in_districts.head(1)

In [None]:
schools_in_districts.info()

In [None]:
schools_in_districts_2 = \
schools_in_districts.loc[:, ['district', 'School ID', 'School Name', 'city',
                             'geometry', 'Latitude', 'Longitude']]

In [None]:
schools_in_districts_2.rename(columns=
                              {'School ID': 'school_id', 'School Name': 'name'}, 
                               inplace=True)

schools_in_districts_2.head(2)

In [None]:
schools_in_districts_2_3857 = schools_in_districts_2.to_crs('EPSG:3857')
print(schools_in_districts_2_3857.crs)
schools_in_districts_2_3857.head(2)

In [None]:
schools_in_districts_2.district.unique()

In [None]:
schools_in_districts_2.crs

In [None]:
schools_in_districts_3857 = schools_in_districts_2.to_crs(epsg=3857)
print(schools_in_districts_3857.crs)
schools_in_districts_3857.head(1)

#### Calculating the area in km^2

In [None]:
# define a variable for m^2 to km^2
sqm_to_sqkm = 10**6

In [None]:
schools_in_districts_3857['area'] = \
round(schools_in_districts_3857.geometry.area / sqm_to_sqkm, 2)

schools_in_districts_3857.head(1)

In [None]:
schools_in_districts_4326 = schools_in_districts_3857.to_crs(epsg=4326)
schools_in_districts_4326.crs

In [None]:
schools_in_districts_4326.plot(column = 'area', cmap = 'Reds', edgecolor = 'black',
                               legend=True)
plt.title("Choropleth of school districts based on area")
plt.show()

In [None]:
schools_counts = schools_in_districts_4326.groupby(['district']).size()

print(type(schools_counts))
schools_counts

In [None]:
schools_counts_df = schools_counts.to_frame().reset_index(level=0)
schools_counts_df.columns = ['district', 'school_counts']
schools_counts_df.head(2)

In [None]:
# districts with schools counted
district_school_counts = sdist.merge(schools_counts_df, on='district')
district_school_counts[['last_name', 'district', 'school_counts']].head()

In [None]:
print(type(district_school_counts))
print(np.shape(district_school_counts))

In [None]:
district_school_counts.dtypes

In [None]:
print(type(district_school_counts.geometry[1]))

In [None]:
district_school_counts_json = district_school_counts.to_json()
district_school_counts_json[:700]

In [None]:
district_school_counts.to_file("district_school_counts_gejson.geojson", driver='GeoJSON')

In [None]:
district_school_counts.plot(column='school_counts', cmap='BuGn', edgecolor='black',
                            legend=True)
plt.title("Choropleth of districts colored by school counts")
plt.xlabel('longitude')
plt.ylabel('latitude')
plt.show()

In [None]:
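# Project to a metric CRS first: areas computed in EPSG:4326 would be in square degrees
district_area_sqkm = district_school_counts.to_crs(epsg=3857).geometry.area / sqm_to_sqkm
district_school_counts['density'] = district_school_counts.school_counts / district_area_sqkm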

In [None]:
print(district_school_counts.columns.to_list())

In [None]:
district_school_counts.plot(column='density', cmap='Blues', edgecolor='black', linewidth=1, 
                            legend=True,  figsize=(10,6))

plt.title("Choropleth of districts by school density")
plt.xlabel('longitude')
plt.ylabel('latitude')

plt.show()

In [None]:
print(district_school_counts.crs)
print(district_school_counts.geometry.crs)

In [None]:
title_2222 = \
"""Number of schools in school district."""

title_2222 = \
''' <h2 align="center" style="font-size:14px; font-style:italic"><b>{}</b></h2>'''.format(title_2222)


f = folium.Figure(width=600, height=300)

nashville = [36.1636,-86.7823]
m = folium.Map(location=nashville, zoom_start=10).add_to(f)

# folium.Choropleth takes the same arguments as the deprecated Map.choropleth method
folium.Choropleth(
    geo_data=district_school_counts,
    name='Choropleth',
    data=district_school_counts,
    columns=['district', 'density'],
    key_on='feature.properties.district',
    fill_color='Set1',
    fill_opacity=0.75,
    line_opacity=0.5,
    legend_name='Schools per km squared by School District'
).add_to(m)

m.get_root().html.add_child(folium.Element(title_2222))

# Add layer control and display
folium.LayerControl().add_to(m)
display(m)

### Adding markers 

In [None]:
district_school_counts_3857 = district_school_counts.to_crs(epsg=3857)

district_school_counts_3857['center'] = district_school_counts_3857.geometry.centroid

print(district_school_counts_3857.crs)
district_school_counts.head(1)

In [None]:
district_school_counts['center'] = district_school_counts_3857.center.to_crs(epsg = 4326)

In [None]:
district_school_counts = district_school_counts_3857.to_crs(epsg = 4326)
district_school_counts['center'] = district_school_counts_3857.center.to_crs(epsg = 4326)

print(district_school_counts.geometry.crs)
print(district_school_counts.center.crs)
district_school_counts.head(1)

In [None]:
for _, district_row in district_school_counts.iterrows():
    centerpoint = district_row['center']
    # Folium expects [latitude, longitude], i.e. [y, x]
    marker_location = [centerpoint.y, centerpoint.x]
    popup_text = ('District: ' + str(district_row['district'])
                  + ' - Number of schools: ' + str(district_row['school_counts']))
    marker = folium.Marker(location=marker_location, popup=popup_text)
    marker.add_to(m)
    
display(m)