# Analysis and Visualization of Complex Agro-Environmental Data
---
## Visualization of Geospatial Data 



### 1. The `Geopandas` module

The `Geopandas` module introduces GIS functionality into `python`. It extends the data types used by `pandas` to allow spatial operations on geometric types. Geometric operations are performed by the `shapely` module. It further depends on the `fiona` module for file access and on `matplotlib` for plotting.

We will show how to import shapefiles and merge tables, using as an example a visualization of human population density in Portuguese municipalities.

In [323]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px

%matplotlib inline

Importing shapefiles (you may need to install `geopandas` and `mapclassify`)

In [None]:
import geopandas as gpd
# import geoplot as gplt # I WAS UNABLE TO INSTALL GEOPLOT (AND CARTOPY)!!!

# Import shapefile of Portuguese civil parishes ('freguesias')
port_regions = gpd.read_file("Shapes/DGT/CAOP_2020.shp")
port_regions.head()

Plot map

In [None]:
port_regions.plot(figsize = (15,15)) # default colors

Change colors

In [None]:
port_regions.plot(figsize = (15,15), color="darkgreen", edgecolor="black")

Import shapefile of Portuguese municipalities (polygon vector layer)

In [None]:
port_munic = gpd.read_file("Shapes/DGT/Concelhos_dd.shp") # change path accordingly!!
port_munic.head()

Convert polygon vector layer to point vector layer (using the centroid)

In [None]:
port_munic_cent = port_munic.copy() # copy poly to new GeoDataFrame
# change the geometry
port_munic_cent.geometry = port_munic_cent['geometry'].centroid
# same crs
port_munic_cent.crs = port_munic.crs
port_munic_cent.head()
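
To make the centroid step more concrete, the sketch below computes the area-weighted centroid of a simple polygon with the shoelace formula, which is conceptually what `.centroid` returns for a polygon without holes. The function name `polygon_centroid` and the sample rectangle are illustrative assumptions and are not part of `geopandas` or `shapely`. The coordinates are treated as planar, so for a layer stored in degrees the result is only an approximation of the geographic centre.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon (zero area)")
    # 6 * area == 3 * twice_area
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


# Hypothetical 4 x 2 rectangle, used only for illustration
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(polygon_centroid(rectangle))
```

For real layers, `.centroid` remains the tool to use, since it also handles holes and multi-part polygons.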

Plot the map

In [None]:
port_munic.plot(figsize = (15,15))

Plot map and centroids

In [None]:
ax = port_munic.plot(figsize = (15,15))
port_munic_cent.plot(color="white", alpha = 0.7, ax=ax)

Import csv table with population density per municipality in Portugal

In [None]:
dens_pop = pd.read_csv("Shapes/Dens_pop_municipal.csv", sep=";", encoding="CP1252")
dens_pop.head()

Join the table with the imported shapefiles (polygons and centroids)

In [None]:
port_munic_denspop = port_munic.merge(dens_pop, left_on="Concelho", right_on="Nome")
port_munic_denspop_cent = port_munic_cent.merge(dens_pop, left_on="Concelho", right_on="Nome")
port_munic_denspop.head()
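
Because `merge` performs an inner join by default, a municipality whose name is spelled differently in the two tables is dropped without any warning. The sketch below shows one way to check for this with an outer join and `indicator=True`. The two small tables are made-up stand-ins (including the density values); only the column names `Concelho`, `Nome` and `2021` follow the notebook.

```python
import pandas as pd

# Toy stand-ins for the shapefile attribute table and the CSV (values are invented)
shapes = pd.DataFrame({"Concelho": ["Lisboa", "Porto", "Évora"]})
density = pd.DataFrame({"Nome": ["Lisboa", "Porto", "Evora"],
                        "2021": [100.0, 200.0, 300.0]})

# An outer join with indicator=True labels each row by where its key was found
check = shapes.merge(density, how="outer",
                     left_on="Concelho", right_on="Nome", indicator=True)

# Rows that are not labelled 'both' would be lost in the default inner join
unmatched = check[check["_merge"] != "both"]
print(unmatched[["Concelho", "Nome", "_merge"]])
```

An empty `unmatched` table means that every municipality found a partner in the other table.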

Create a choropleth map

In [None]:
port_munic_denspop.plot(figsize = (10,10), 
                        column="2021", 
                        legend=True
                        )

In [None]:
# Same, but classified using quantiles
port_munic_denspop.plot(figsize = (10,10), 
                        column="2021",
                        legend=True,
                        scheme='quantiles' # classify into quantiles; requires mapclassify (the legend type also changes)
                        )
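
A quantile scheme splits the values into classes that hold roughly the same number of observations, which is why the legend changes from a continuous color bar to a list of classes. The sketch below illustrates the idea with the standard library only. The helper `quantile_classes`, the choice of five classes and the sample values are assumptions made for illustration, and `mapclassify` may compute its break points slightly differently.

```python
import bisect
import statistics
from collections import Counter


def quantile_classes(values, k=5):
    """Assign each value to one of k classes holding roughly equal counts."""
    # k - 1 interior break points
    breaks = statistics.quantiles(values, n=k, method="inclusive")
    # bisect_left places a value equal to a break in the lower class
    classes = [bisect.bisect_left(breaks, v) for v in values]
    return classes, breaks


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up values
classes, breaks = quantile_classes(values)
print(breaks)
print(Counter(classes))
```

With skewed data such as population density, quantile classes usually show more contrast on the map than a linear color scale, at the cost of hiding how extreme the largest values are.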

In [None]:
# Same, with another color palette
port_munic_denspop.plot(figsize = (10,10), 
                        column="2021", 
                        legend=True, cmap='OrRd', 
                        scheme='quantiles'
                        )

Create a scatter plot map

In [None]:
ax = port_munic.plot(figsize = (15,15))
port_munic_denspop_cent.plot(column="2021", 
                    legend=True,
                    scheme='quantiles',
                    ax=ax
                    )

Create a bubble plot map

In [None]:
ax = port_munic.plot(figsize = (15,15))
port_munic_denspop_cent.plot(markersize="2021", 
                    color="pink",
                    alpha=0.4,
                    legend=True, # has no effect here: no `column` is given to color by
                    scheme='quantiles', # has no effect here for the same reason
                    ax=ax
                    )

### 2. The `Folium` module

`Folium` makes it easy to visualize data that has been manipulated in Python on an interactive `leaflet` map. It enables both the binding of data to a map for choropleth visualizations as well as passing rich vector/raster/HTML visualizations as markers on the map.

The library has a number of built-in tilesets from OpenStreetMap, Mapbox, and Stamen, and supports custom tilesets with Mapbox or Cloudmade API keys. `Folium` supports image, video, GeoJSON and TopoJSON overlays.

A useful feature of Folium is that it provides easy functionality to export an interactive map to HTML, making it a useful tool in web development.

In [None]:
import folium

We first need to add lat and long to the 'port_munic_denspop_cent' attribute table

In [None]:
# Add lat and long to the 'port_munic_denspop_cent' attribute table
port_munic_denspop_cent["Long"] = port_munic_denspop_cent['geometry'].x
port_munic_denspop_cent["Lat"] = port_munic_denspop_cent['geometry'].y
port_munic_denspop_cent.head()

Create a map

In [None]:
m = folium.Map(location = [40, -9],
               zoom_start = 6)
m

Save to `html`

In [126]:
m.save('my_map.html')

Create an interactive bubble plot map

In [None]:
import math

m=folium.Map(
    location=[port_munic_denspop_cent['Lat'].mean(), port_munic_denspop_cent['Long'].mean()],
    zoom_start=8)

def get_radius(pop):
  return math.log(pop)*2

port_munic_denspop_cent.apply(
    lambda row: folium.CircleMarker(
        location=[row['Lat'], row['Long']],
        radius=get_radius(row['2021']),
        popup=row['Concelho'], # information that you get by clicking on top of the bubble
        tooltip='<h5>Click here for more info</h5>',
        stroke=True,
        weight=1,
        color="#3186cc",
        fill=True,
        fill_color="#3186cc",
        opacity=0.9,
        fill_opacity=0.3,
        ).add_to(m),
    axis=1)
m
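
A logarithmic radius has two edge cases: `math.log` raises a `ValueError` for zero or negative input, and it returns a negative number for values between 0 and 1. The sketch below is one possible guard. The name `safe_radius` and its `scale` and `min_radius` parameters are hypothetical and not part of `folium`.

```python
import math


def safe_radius(value, scale=2.0, min_radius=1.0):
    """Log-scaled marker radius with a lower bound (hypothetical helper)."""
    # Missing, zero or negative values fall back to the smallest marker
    if value is None or math.isnan(value) or value <= 0:
        return min_radius
    return max(min_radius, math.log(value) * scale)


for density in (0.0, 0.5, 5.0, 5000.0):  # made-up densities
    print(density, round(safe_radius(density), 2))
```

It could replace `get_radius` in the `CircleMarker` call if the data contain missing or very small values.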

Create a choropleth map

In [None]:
m = folium.Map(location = [40, -9],
               zoom_start = 6)

folium.Choropleth(
    # geographical locations
    geo_data = port_munic,
    name = "choropleth",
    # the data set we are using
    data = dens_pop,
    columns = ["Nome", "2021"],
    # match each name in 'Nome' to the 'Concelho' property of the shapes
    key_on = "feature.properties.Concelho",
    # YlGn is a yellow-to-green color scale
    fill_color = "YlGn",
    fill_opacity = 0.7,
    legend_name = "Population density (2021)",
).add_to(m)
 
m
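
The `key_on` argument tells `folium.Choropleth` where, inside each GeoJSON feature, to find the value that is matched against the first column listed in `columns`. The sketch below mimics that lookup with plain dictionaries to show why the choice of path matters. The function `resolve_key` and the two features are hypothetical, and this is not how `folium` implements the lookup internally.

```python
def resolve_key(feature, key_on):
    """Follow a dotted path such as 'feature.properties.Concelho'."""
    parts = key_on.split(".")
    if parts[0] != "feature":
        raise ValueError("key_on is expected to start with 'feature'")
    node = feature
    for part in parts[1:]:
        node = node[part]
    return node


# Hypothetical GeoJSON-like features (geometry omitted for brevity)
features = [
    {"type": "Feature", "id": "0", "properties": {"Concelho": "Lisboa"}},
    {"type": "Feature", "id": "1", "properties": {"Concelho": "Porto"}},
]
data_keys = {"Lisboa", "Porto"}  # values of the key column in the data table

for key_on in ("feature.id", "feature.properties.Concelho"):
    matched = {resolve_key(f, key_on) for f in features} & data_keys
    print(key_on, "->", len(matched), "of", len(features), "features matched")
```

When the path points at a field that does not hold the same values as the data table, no feature is matched and the map is drawn without meaningful colors.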

### 3. Interactive geospatial visualization with `plotly`

`Plotly` also provides interactive geospatial visualization functionalities. It is especially useful for generating a variety of geographical plots that are easy to build, debug and customize.

We will use `plotly` to demonstrate how to generate different classes of geographical plots with several publicly available datasets from a variety of contexts.

Let's start with a quick interactive map using `plotly express` (https://plotly.com/python/scatter-plots-on-maps/)

In [None]:
df = px.data.gapminder().query("year == 2007")
fig = px.scatter_geo(df, locations="iso_alpha",
                     size="pop", # size of markers, "pop" is one of the columns of gapminder
                     )
fig.show()

#### 3.1 Create choropleth maps (world renewable production and consumption)

In [334]:
import pandas as pd
import plotly.express as px
import seaborn as sns
import matplotlib.pyplot as plt

Import the renewable energy production dataset

In [None]:
renewable_energy_prod_url = "https://raw.githubusercontent.com/TrainingByPackt/Interactive-Data-Visualization-with-Python/master/datasets/share-of-electricity-production-from-renewable-sources.csv"
renewable_energy_prod_df = pd.read_csv(renewable_energy_prod_url)
renewable_energy_prod_df.head()

In [None]:
# Sort the production DataFrame based on the feature 'Year'.
renewable_energy_prod_df.sort_values(by=['Year'],inplace=True)
renewable_energy_prod_df.head()

In [270]:
# Generate a choropleth map using the plotly express module animated based on 'Year'.

# keep only the years 2008 to 2016 and plot the filtered table
renewable_energy_prod = renewable_energy_prod_df.query('Year<2017 and Year>2007')
fig = px.choropleth(renewable_energy_prod, locations="Code",
                    color="Renewable electricity (% electricity production)",
                    hover_name="Country", 
                    animation_frame="Year",
                    color_continuous_scale='Greens')

In [None]:
#Update layout to include suitable title text and projection style and display figure.

fig.update_layout(
    # add a title text for the plot
    title_text = 'Renewable energy production across the world (% of electricity production)',
    # set projection style for the plot
    geo = dict(projection={'type':'natural earth'}) # by default, projection type is set to 'equirectangular'
)

fig.show()

Now let's import the renewable energy consumption dataset

In [None]:
renewable_energy_cons_url = "https://raw.githubusercontent.com/TrainingByPackt/Interactive-Data-Visualization-with-Python/master/datasets/renewable-energy-consumption-by-country.csv"
renewable_energy_cons_df = pd.read_csv(renewable_energy_cons_url)
renewable_energy_cons_df.head()

In [None]:
#Convert the DataFrame to desired format.
#renewable_energy_long_df = pd.wide_to_long(renewable_energy_df, stubnames='Consumption', i=['Country', 'Code','Year'], j='Energy_Source')
#renewable_energy_long_df.head()
renewable_energy_cons_df = pd.melt(renewable_energy_cons_df, \
                                   id_vars=['Country', 'Code','Year'], \
                                   var_name="Energy Source", \
                                   value_name="Consumption (terrawatt-hours)")
renewable_energy_cons_df.head()
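
`pd.melt` turns a wide table, with one column per energy source, into a long table with one row per combination of country, year and source. This is the layout that makes it easy to filter by source afterwards. The sketch below shows the reshaping on a tiny table. The countries, codes, source columns and numbers are invented for illustration and do not come from the dataset.

```python
import pandas as pd

# Toy wide table: one column per energy source (all values are invented)
wide_df = pd.DataFrame({
    "Country": ["A", "B"],
    "Code": ["AAA", "BBB"],
    "Year": [2015, 2015],
    "Hydro": [1.0, 2.0],
    "Wind": [0.5, 1.5],
})

# Columns not listed in id_vars are stacked into two new columns
long_df = pd.melt(wide_df,
                  id_vars=["Country", "Code", "Year"],
                  var_name="Energy Source",
                  value_name="Consumption")
print(long_df)
```

The two source columns become four rows, with the former column names stored in `Energy Source`.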

In [None]:
#Sort the consumption DataFrame based on the Year feature.

renewable_energy_cons_df.sort_values(by=['Year'], inplace=True)
renewable_energy_cons_df.head()

In [274]:
#Generate a choropleth map for renewable energy consumption using the plotly express module animated based on 'Year'.

import plotly.express as px

renewable_energy_total_cons = renewable_energy_cons_df[renewable_energy_cons_df['Energy Source']=='Total'].query('Year<2017 and Year>2007')
fig = px.choropleth(renewable_energy_total_cons, locations="Code",
                    color="Consumption (terrawatt-hours)",
                    hover_name="Country", 
                    animation_frame="Year",
                    color_continuous_scale='Blues')

In [None]:
#Update layout of the consumption plot to include suitable title text and projection style.

fig.update_layout(
    # add a title text for the plot
    title_text = 'Renewable energy consumption across the world (terawatt-hours)',
    # set projection style for the plot
    geo = dict(projection={'type':'natural earth'}) # by default, projection type is set to 'equirectangular'
)

fig.show()

#### 3.2 Add animation into a choropleth map

The next example uses the worldwide use of the internet

In [None]:
#Read the data from the .csv file:
internet_usage_url = "https://raw.githubusercontent.com/TrainingByPackt/Interactive-Data-Visualization-with-Python/master/datasets/share-of-individuals-using-the-internet.csv"
internet_usage_df = pd.read_csv(internet_usage_url)
internet_usage_df.head()

In [304]:
#Subset the data to one specific year since the DataFrame contains records from multiple years:
internet_usage_2016 = internet_usage_df.query("Year==2016")

In [None]:
#Generate a world-wide choropleth map using plotly’s choropleth function:
import plotly.express as px

fig = px.choropleth(internet_usage_2016,
                    locations="Code", # column containing ISO 3166 country codes
                    color="Individuals using the Internet (% of population)", # column by which to color-code
                    hover_name="Country", # column to display in hover information
                    color_continuous_scale=px.colors.sequential.Plasma)

fig.show()

In [None]:
internet_usage_2016 = internet_usage_df.query("Year==2016")

#Add title text to the choropleth map
import plotly.express as px
fig = px.choropleth(internet_usage_2016,
                    locations="Code",
                    color="Individuals using the Internet (% of population)", # column by which to color-code
                    hover_name="Country", # column to display in hover information
                    color_continuous_scale=px.colors.sequential.Plasma)
fig.update_layout(
    # add a title text for the plot
    title_text = 'Internet usage across the world (% population) - 2016'
)

fig.show()

In [None]:
#Set geo_scope to Europe in the update_layout function:
import plotly.express as px
fig = px.choropleth(internet_usage_2016,
                    locations="Code",
                    color="Individuals using the Internet (% of population)", # column by which to color-code
                    hover_name="Country", # column to display in hover information
                    color_continuous_scale=px.colors.sequential.Plasma)

fig.update_layout(
    # add a title text for the plot
    title_text = 'Internet usage across the European Continent (% population) - 2016',
    geo_scope = 'europe' # can be set to north america | south america | africa | asia | europe | usa
)

fig.show()

In [None]:
#Set the projection type to natural earth:
import plotly.express as px
fig = px.choropleth(internet_usage_2016,
                    locations="Code",
                    color="Individuals using the Internet (% of population)", # column by which to color-code
                    hover_name="Country", # column to display in hover information
                    color_continuous_scale=px.colors.sequential.Plasma)

fig.update_layout(
    # add a title text for the plot
    title_text = 'Internet usage across the world (% population) - 2016',
    # set projection style for the plot
    geo = dict(projection={'type':'natural earth'}) # by default, projection type is set to 'equirectangular'
)

fig.show()

In [None]:
#Add animation to year column using animation_frame=year:
import plotly.express as px
fig = px.choropleth(internet_usage_df, locations="Code",
                    color="Individuals using the Internet (% of population)", # column by which to color-code
                    hover_name="Country", # column to add to hover information
                    animation_frame="Year", # column on which to animate
                    color_continuous_scale=px.colors.sequential.Plasma)
                    
fig.update_layout(
    # add a title text for the plot
    title_text = 'Internet usage across the world (% population)',
    # set projection style for the plot
    geo = dict(projection={'type':'natural earth'}) # by default, projection type is set to 'equirectangular'
)

fig.show()

In [None]:
#Sort the dataset by Year 
internet_usage_df.sort_values(by=["Year"],inplace=True)
internet_usage_df.head()
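
Sorting matters for the animation because, as far as we can tell, `plotly express` builds the frames in the order in which the values first appear in the `animation_frame` column. The sketch below reproduces that ordering with the standard library. The helper `frame_order` and the list of years are illustrative assumptions and are not part of `plotly`.

```python
def frame_order(years):
    """Distinct values in order of first appearance (dicts keep insertion order)."""
    return list(dict.fromkeys(years))


years = [2010, 1990, 2016, 1990, 2000, 2010]  # made-up, unsorted 'Year' column
print(frame_order(years))          # frame order for the unsorted table
print(frame_order(sorted(years)))  # frame order after sorting by year
```

With an unsorted table the slider would jump back and forth in time, whereas after sorting the frames run chronologically.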

In [None]:
#Let’s generate the animated plot again now that the sorting is done

fig = px.choropleth(internet_usage_df, locations="Code",
                    color="Individuals using the Internet (% of population)", # column by which to color-code
                    hover_name="Country", # column to add to hover information
                    animation_frame="Year", # column on which to animate
                    color_continuous_scale=px.colors.sequential.Plasma)
                    
fig.update_layout(
    # add a title text for the plot
    title_text = 'Internet usage across the world (% population)',
    # set projection style for the plot
    geo = dict(projection={'type':'natural earth'}) # by default, projection type is set to 'equirectangular'
)

fig.show()

#### 3.3 Create bubble plots in a map

In [None]:
internet_users_url = "https://raw.githubusercontent.com/TrainingByPackt/Interactive-Data-Visualization-with-Python/master/datasets/number-of-internet-users-by-country.csv"
internet_users_df = pd.read_csv(internet_users_url)
internet_users_df.head()

In [None]:
#Learning from our previous experience, let's sort the DataFrame by the Year feature.
internet_users_df.sort_values(by=['Year'],inplace=True)
internet_users_df.head()

In [None]:
#Let's first plot the number of internet users across the world in 2016.

import plotly.express as px

fig = px.scatter_geo(internet_users_df.query("Year==2016"), 
                    locations="Code", # name of column indicating country-codes
                    size="Number of internet users (users)", # name of column by which to size the bubble
                    hover_name="Country", # name of column to be displayed while hovering over the map
                    size_max=80, # parameter to scale all bubble sizes
                    color_continuous_scale=px.colors.sequential.Plasma)
                    
fig.update_layout(
    # add a title text for the plot
    title_text = 'Internet users across the world - 2016',
    # set projection style for the plot
    geo = dict(projection={'type':'natural earth'}) # by default, projection type is set to 'equirectangular'
)

fig.show()

In [None]:
#Now, let's animate the bubble plot to show the increase in the number of internet users across the years, by using the animation_frame parameter.
import plotly.express as px

fig = px.scatter_geo(internet_users_df, 
                    locations="Code", # name of column indicating country-codes
                    size="Number of internet users (users)", # name of column by which to size the bubble
                    hover_name="Country", # name of column to be displayed while hovering over the map
                    size_max=80, # parameter to scale all bubble size
                    animation_frame="Year",
                    )
                    
fig.update_layout(
    # add a title text for the plot
    title_text = 'Internet users across the world',
    # set projection style for the plot
    geo = dict(projection={'type':'natural earth'}) # by default, projection type is set to 'equirectangular'
)

fig.show()

#### 3.4 Create a line flow map

In the next example we will show how to plot lines in a map with `plotly` using flight connections in the USA.

Import airport locations

In [None]:
us_airports_url = "https://raw.githubusercontent.com/TrainingByPackt/Interactive-Data-Visualization-with-Python/master/datasets/airports.csv"
us_airports_df = pd.read_csv(us_airports_url)
us_airports_df.head()

Scatter plot on a map

In [None]:
#We can generate a scatter plot on the US map to indicate the locations of all airports in our dataset, using the graph_objects module.

import plotly.graph_objects as go

fig = go.Figure()

fig.add_trace(go.Scattergeo(
    locationmode = 'USA-states',
    lon = us_airports_df['LONGITUDE'],
    lat = us_airports_df['LATITUDE'],
    hoverinfo = 'text',
    text = us_airports_df['AIRPORT'],
    mode = 'markers',
    marker = dict(size = 5,color = 'black')))

fig.update_layout(
    title_text = 'Airports in USA',
    showlegend = False,
    geo = go.layout.Geo(
        scope = 'usa'
    ),
)

fig.show()

Flight records

In [None]:
#Now let's load the file containing flight records.
new_year_2015_flights_url = "https://raw.githubusercontent.com/TrainingByPackt/Interactive-Data-Visualization-with-Python/master/datasets/new_year_day_2015_delayed_flights.csv"
new_year_2015_flights_df = pd.read_csv(new_year_2015_flights_url)
new_year_2015_flights_df.head()

Origin dataset

In [None]:
#Along with the source and destination airports for each flight, we need to have the longitude and latitude 
# information of the corresponding airports. To do this, we need to merge the DataFrames containing airport and 
# flight data. Let’s first merge to obtain longitude and latitudes for the origin airports of all flights.
# merge the DataFrames on origin airport codes
new_year_2015_flights_df = new_year_2015_flights_df.merge(us_airports_df[['IATA_CODE','LATITUDE','LONGITUDE']], \
                                                          left_on='ORIGIN_AIRPORT', \
                                                          right_on='IATA_CODE', \
                                                          how='inner')

# drop the duplicate column containing airport code
new_year_2015_flights_df.drop(columns=['IATA_CODE'],inplace=True)

# rename the latitude and longitude columns to reflect that they correspond to the origin airport
new_year_2015_flights_df.rename(columns={"LATITUDE":"ORIGIN_AIRPORT_LATITUDE", "LONGITUDE":"ORIGIN_AIRPORT_LONGITUDE"},inplace=True)
new_year_2015_flights_df.head()

Destination dataset

In [None]:
#Now, we will perform a similar merging to get the latitude, longitude data for destination airports of all flights.
# merge the DataFrames on destination airport codes
new_year_2015_flights_df = new_year_2015_flights_df.merge(us_airports_df[['IATA_CODE','LATITUDE','LONGITUDE']], \
                                                          left_on='DESTINATION_AIRPORT', \
                                                          right_on='IATA_CODE', \
                                                          how='inner')

# drop the duplicate column containing airport code
new_year_2015_flights_df.drop(columns=['IATA_CODE'],inplace=True)

# rename the latitude and longitude columns to reflect that they correspond to the destination airport
new_year_2015_flights_df.rename(columns={'LATITUDE':'DESTINATION_AIRPORT_LATITUDE', 'LONGITUDE':'DESTINATION_AIRPORT_LONGITUDE'},inplace=True)
new_year_2015_flights_df.head()

Create line flow map

In [None]:
# Now we draw the lines: for each flight, one line between the origin and destination airports.
# This is done by passing the longitudes and latitudes of both airports to the lon and lat
# parameters of Scattergeo and setting mode to 'lines' instead of 'markers'.
# Each flight is added to the existing figure with its own add_trace call,
# so it may take a few minutes for the plot to show the flight routes.

for i in range(len(new_year_2015_flights_df)):
    fig.add_trace(
        go.Scattergeo(
            locationmode = 'USA-states',
            lon = [new_year_2015_flights_df['ORIGIN_AIRPORT_LONGITUDE'][i], new_year_2015_flights_df['DESTINATION_AIRPORT_LONGITUDE'][i]],
            lat = [new_year_2015_flights_df['ORIGIN_AIRPORT_LATITUDE'][i], new_year_2015_flights_df['DESTINATION_AIRPORT_LATITUDE'][i]],
            mode = 'lines',
            line = dict(width = 1,color = 'red')
        )
    )
    
fig.update_layout(
    title_text = 'Delayed flights on Jan 1, 2015 in the USA',
    showlegend = False,
    geo = go.layout.Geo(
        scope = 'usa'
    ),
)
  
fig.show()
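
Adding one trace per flight is what makes this cell slow. A common alternative is to draw all routes in a single trace, separating consecutive segments with `None` so that they are not joined. The sketch below only builds the coordinate lists. The helper `build_line_coords` and the two routes are hypothetical, and the approach assumes that `Scattergeo` leaves gaps at `None` values, which we believe is its default behaviour.

```python
def build_line_coords(routes):
    """Flatten (lon0, lat0, lon1, lat1) routes into None-separated lists."""
    lons, lats = [], []
    for lon0, lat0, lon1, lat1 in routes:
        lons += [lon0, lon1, None]  # None marks a break between segments
        lats += [lat0, lat1, None]
    return lons, lats


# Two made-up routes, for illustration only
routes = [(-118.4, 33.9, -73.8, 40.6), (-87.9, 42.0, -122.4, 37.6)]
lons, lats = build_line_coords(routes)
print(lons)
print(lats)
```

The resulting lists could then be passed as `lon=lons, lat=lats` to a single `go.Scattergeo(mode='lines')` trace, with the routes taken from the origin and destination columns of `new_year_2015_flights_df`.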

## References

Folium. https://python-visualization.github.io/folium/

Geospatial Data in Python - Interactive Visualization. https://www.codementor.io/@abdelfettahbesbes/geospatial-data-in-python-interactive-visualization-1oti7dtr2v

Interactive Data Visualization with Python. https://github.com/TrainingByPackt/Interactive-Data-Visualization-with-Python 

Introduction to GeoPandas. https://geopandas.org/en/stable/getting_started/introduction.html