# Map Visualizations with Python


> ### Datasets:
>
>**[TripAdvisor European restaurants](https://www.kaggle.com/datasets/stefanoleone992/tripadvisor-european-restaurants)**
>
>This dataset includes restaurants with attributes such as location data, average rating, number of reviews, open hours, cuisine types, awards, etc. The dataset combines the restaurants from the main European countries. In the context of this lab, we will work with a subset of the dataset that includes restaurants in Greece.
>    
>
>**[International tourism, number of arrivals](https://data.worldbank.org/indicator/ST.INT.ARVL)**
>
>This dataset contains the yearly number of inbound tourists for every country. The data on inbound tourists refer to the number of arrivals, not to the number of people traveling. Thus a person who makes several trips to a country during a given period is counted each time as a new arrival.


Folium is a Python library designed for creating a variety of interactive maps using the Leaflet.js library. It integrates Python's data manipulation capabilities with Leaflet's efficient mapping functions, allowing users to process data in Python before rendering it on a Leaflet map with Folium.

The library comes with several built-in tilesets such as OpenStreetMap. Folium supports both GeoJSON and TopoJSON overlays, enabling the creation of choropleth maps.



Folium must be installed before it can be imported:


In [1]:
!pip3 install folium==0.16.0
import folium



Import pandas and numpy:


In [2]:
import numpy as np
import pandas as pd

To generate a map, you create a **Folium** *Map* object, and then you display it.


In [3]:
# use a name other than `map` to avoid shadowing Python's built-in map()
base_map = folium.Map()
base_map

Folium maps are interactive, allowing you to zoom and pan. When creating a map, you can specify its initial center and zoom level. The center is given as a *latitude* and *longitude* pair, and the zoom level as an integer from 0 up to 18. Higher values result in a more zoomed-in view.

In [4]:
# Athens latitude and longitude values
athens_coords = [37.983810, 23.727539]

# `athens_view` is used instead of `map` to avoid shadowing the built-in map()
athens_view = folium.Map(location=athens_coords, zoom_start=5)

athens_view

Let's create the map again with a higher zoom level.


In [5]:
# `athens_view` is used instead of `map` to avoid shadowing the built-in map()
athens_view = folium.Map(location=athens_coords, zoom_start=10)

athens_view

As you can see, the higher the zoom level the more the map is zoomed into the given center.
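
To get a feel for what a zoom level means in practice, the sketch below estimates the ground resolution (meters per screen pixel) at a given latitude and zoom. It assumes the standard Web Mercator tiling with 256-pixel tiles, which the default OpenStreetMap tileset follows; the function name `ground_resolution` and the constants are our own and not part of Folium.

```python
import math

EQUATOR_M = 40_075_016.686  # Earth's equatorial circumference in meters (WGS 84)
TILE_SIZE_PX = 256          # assumed tile size in pixels


def ground_resolution(latitude_deg, zoom):
    """Approximate meters per pixel at a latitude and zoom level."""
    # at zoom z the whole world is 2**z tiles wide
    world_width_px = TILE_SIZE_PX * 2 ** zoom
    return EQUATOR_M * math.cos(math.radians(latitude_deg)) / world_width_px


# compare the two zoom levels used above, plus the maximum level
for zoom in (5, 10, 18):
    print(zoom, round(ground_resolution(37.98381, zoom), 2))
```

Each additional zoom level doubles the width of the world in pixels, so the meters covered by one pixel are halved.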


## Maps with Markers <a id="6"></a>


Read the CSV file with the restaurants in Greece, load it into a dataframe, and keep only the restaurants in Athens:

In [6]:
df_restaurants = pd.read_csv('tripadvisor_restaurants_greece.csv')

df_restaurants = df_restaurants[df_restaurants['city'] == 'Athens']

df_restaurants

| | restaurant_name | original_location | country | region | province | city | address | latitude | longitude | claimed | ... | excellent | very_good | average | poor | terrible | food | service | value | atmosphere | keywords |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7090 | Mollini Cafe & Bistrot | ["Europe", "Greece", "Attica", "Athens"] | Greece | Attica | | Athens | 9-11 Ippokratous, Athens 106 79 Greece | 37.981620 | 23.733213 | Unclaimed | ... | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | | | | | |
| 7091 | Rodakino | ["Europe", "Greece", "Attica", "Athens"] | Greece | Attica | | Athens | Iraklidon 70 Iraklidon 70 & Evadnis, Athens Gr... | 37.976078 | 23.712852 | Claimed | ... | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | | | | | |
| 7092 | Laika Bar-Resto | ["Europe", "Greece", "Attica", "Athens"] | Greece | Attica | | Athens | 30 Pellis, Athens 104 47 Greece | 37.982970 | 23.711530 | Claimed | ... | 13.0 | 0.0 | 1.0 | 1.0 | 0.0 | 5.0 | 4.5 | 4.5 | | |
| 7093 | Nuovo Alfisti Espresso Bar | ["Europe", "Greece", "Attica", "Athens"] | Greece | Attica | | Athens | 1-3 Drakou, Athens 117 42 Greece | 37.964680 | 23.726791 | Claimed | ... | 2.0 | 1.0 | 0.0 | 0.0 | 0.0 | | | | | |
| 7094 | Stoa Proia | ["Europe", "Greece", "Attica", "Athens"] | Greece | Attica | | Athens | 39 Panepistimiou, Athens 105 64 Greece | 37.980614 | 23.732487 | Unclaimed | ... | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.5 | 4.0 | 5.0 | | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 10000 | Terra Delicia | ["Europe", "Greece", "Attica", "Athens"] | Greece | Attica | | Athens | 59 Leoforos Georgiou Papandreou Goudi, Zografo... | 37.982243 | 23.769420 | Claimed | ... | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 | 5.0 | 5.0 | | |
| 10001 | Nikkei | ["Europe", "Greece", "Attica", "Athens"] | Greece | Attica | | Athens | Ksanthipou 10 & Dinokratous 1, Athens 106 74 G... | 37.979076 | 23.742205 | Claimed | ... | 71.0 | 22.0 | 9.0 | 2.0 | 12.0 | 4.0 | 4.0 | 3.5 | | ceviche, causa, manchego cheese, smoked chicke... |
| 10002 | Acropolis Ami Roof Garden | ["Europe", "Greece", "Attica", "Athens"] | Greece | Attica | | Athens | 10 Iras BEST WESTERN Acropolis Ami Boutique Ho... | 37.964554 | 23.730417 | Claimed | ... | 12.0 | 1.0 | 3.0 | 2.0 | 2.0 | 3.5 | 4.0 | 3.0 | | |
| 10003 | Indian Haveli | ["Europe", "Greece", "Attica", "Athens"] | Greece | Attica | | Athens | Leof. Andrea Siggrou 12, Athens 117 42 Greece | 37.968480 | 23.731436 | Claimed | ... | 1501.0 | 121.0 | 33.0 | 13.0 | 7.0 | 4.5 | 4.5 | 4.5 | | butter chicken, paneer, gulab jamun, mango, curry |


Check the columns of the dataset:

In [7]:
df_restaurants.columns

Index(['restaurant_name', 'original_location', 'country', 'region', 'province',
       'city', 'address', 'latitude', 'longitude', 'claimed', 'awards',
       'popularity_detailed', 'popularity_generic', 'top_tags', 'price_level',
       'price_range', 'meals', 'cuisines', 'special_diets', 'features',
       'vegetarian_friendly', 'vegan_options', 'gluten_free',
       'original_open_hours', 'open_days_per_week', 'open_hours_per_week',
       'working_shifts_per_week', 'avg_rating', 'total_reviews_count',
       'default_language', 'reviews_count_in_default_language', 'excellent',
       'very_good', 'average', 'poor', 'terrible', 'food', 'service', 'value',
       'atmosphere', 'keywords'],
      dtype='object')

And find how many rows and columns it contains:


In [8]:
df_restaurants.shape

(2915, 41)

Now, let's create a map centered on Athens with a zoom level of 11.


In [9]:
athens_map = folium.Map(location=athens_coords, zoom_start=11)

athens_map

To be able to visualize the restaurants on the map, we need to remove any rows that do not contain latitude and longitude values:

In [10]:
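# reassign instead of using inplace=True, since df_restaurants is a filtered slice of the original dataframe
df_restaurants = df_restaurants.dropna(subset=['latitude', 'longitude'])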

df_restaurants.shape

(2838, 41)
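
The same filtering idea can be written without pandas. The sketch below uses made-up sample records and keeps only the rows whose coordinates are present and fall inside the valid latitude and longitude ranges. The range check goes beyond what `dropna` does and is our own addition, as is the helper name `has_valid_coords`.

```python
import math


def has_valid_coords(row):
    """Return True if the row has usable latitude and longitude values."""
    lat, lng = row.get("latitude"), row.get("longitude")
    for value in (lat, lng):
        # treat both None and NaN as missing
        if value is None or (isinstance(value, float) and math.isnan(value)):
            return False
    return -90 <= lat <= 90 and -180 <= lng <= 180


# hypothetical records, not taken from the dataset
sample_rows = [
    {"name": "A", "latitude": 37.98, "longitude": 23.73},
    {"name": "B", "latitude": float("nan"), "longitude": 23.71},
    {"name": "C", "latitude": 37.96, "longitude": None},
    {"name": "D", "latitude": 137.0, "longitude": 23.70},
]

kept = [row["name"] for row in sample_rows if has_valid_coords(row)]
print(kept)  # ['A']
```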

To show these restaurants on the map, we create a *FeatureGroup*, add a marker for each restaurant to it, and then add the group to `athens_map`. Each marker also gets pop-up text that is displayed when you click it.

In [11]:
# create a feature group for the restaurants
restaurants = folium.map.FeatureGroup()

# collect the coordinates and the pop-up label (cuisines) of each restaurant
latitudes = list(df_restaurants.latitude)
longitudes = list(df_restaurants.longitude)
labels = list(df_restaurants.cuisines.fillna('N/A'))

# add a marker for each restaurant to the feature group
for lat, lng, label in zip(latitudes, longitudes, labels):
    folium.Marker([lat, lng], popup=label).add_to(restaurants)

# add the feature group to the map and display the map
athens_map.add_child(restaurants)

As you can see, the map is overly congested. Overplotting can be a serious problem, which affects the responsiveness of the map, and also complicates its visual analysis.
To solve this problem, we can group the markers into different clusters. Each cluster is then represented by the number of restaurants in this area.

To implement this, we start off by instantiating a *MarkerCluster* object and adding all the data points in the dataframe to this object.

In [12]:
from folium import plugins

# Create a clean copy of the Athens map
athens_map = folium.Map(location = athens_coords, zoom_start = 12)

# create a marker cluster object for the restaurants
restaurants = plugins.MarkerCluster().add_to(athens_map)

# add each restaurant to the restaurants marker cluster
for lat, lng in zip(df_restaurants.latitude, df_restaurants.longitude):
    folium.Marker(
        location=[lat, lng],
        icon=None,
    ).add_to(restaurants)

# display the map
athens_map

Now the map is no longer congested, and you can clearly see the distribution of restaurants in Athens. The map is also much more responsive.
When you zoom out, markers are merged into clusters; when you zoom in, a cluster splits into several subclusters or single markers.
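
To illustrate why zooming in splits a cluster, here is a simplified sketch that groups points into square grid cells whose size halves with every zoom level. This is not the algorithm the *MarkerCluster* plugin uses internally; the function `grid_clusters` and the cell-size rule are assumptions chosen only to show the principle. The four sample points are the first four restaurants from the table preview.

```python
import math
from collections import Counter


def grid_clusters(points, zoom):
    """Count points per grid cell; cells halve in size with each zoom level."""
    cell_deg = 360 / 2 ** zoom  # assumed cell width in degrees
    counts = Counter()
    for lat, lng in points:
        cell = (math.floor(lat / cell_deg), math.floor(lng / cell_deg))
        counts[cell] += 1
    return counts


points = [
    (37.981620, 23.733213),
    (37.976078, 23.712852),
    (37.982970, 23.711530),
    (37.964680, 23.726791),
]

# print the cluster sizes at a coarse, a medium, and a fine zoom level
for zoom in (0, 12, 20):
    sizes = sorted(grid_clusters(points, zoom).values(), reverse=True)
    print(zoom, sizes)
```

At the coarsest level all points share one cell, while at a fine enough level every point ends up in a cell of its own.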

## Choropleth Maps <a id="8"></a>

A `Choropleth` map is a map in which areas are colored according to the value of the variable being visualized. With choropleth maps, you can easily see how a metric varies over a geographic area.

Let's read the international tourism data and create a `Choropleth` to compare yearly tourist arrivals between different countries.

Read the international tourism data into a *pandas* dataframe.


In [13]:
df_tourism = pd.read_csv('international_tourism.csv')

df_tourism.head()

| | Country Name | Country Code | Indicator Name | Indicator Code | 1960 | 1961 | 1962 | 1963 | 1964 | 1965 | ... | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Aruba | ABW | International tourism, number of arrivals | ST.INT.ARVL | | | | | | | ... | 1481000.0 | 1667000.0 | 1739000.0 | 1832000.0 | 1758000.0 | 1863000.0 | 1897000.0 | 1951000.0 | | |
| 1 | Afghanistan | AFG | International tourism, number of arrivals | ST.INT.ARVL | | | | | | | ... | | | | | | | | | | |
| 2 | Angola | AGO | International tourism, number of arrivals | ST.INT.ARVL | | | | | | | ... | 528000.0 | 650000.0 | 595000.0 | 592000.0 | 397000.0 | 261000.0 | 218000.0 | 218000.0 | | |
| 3 | Albania | ALB | International tourism, number of arrivals | ST.INT.ARVL | | | | | | | ... | 3514000.0 | 3256000.0 | 3673000.0 | 4131000.0 | 4736000.0 | 5118000.0 | 5927000.0 | 6406000.0 | 2658000.0 | |
| 4 | Andorra | AND | International tourism, number of arrivals | ST.INT.ARVL | | | | | | | ... | 7900000.0 | 7676000.0 | 7797000.0 | 7850000.0 | 8025000.0 | 8152000.0 | 8328000.0 | 8235000.0 | 5207000.0 | |


To create a `Choropleth` map, we need a GeoJSON file that defines the boundaries of the geographical entities we are interested in. Since our dataset contains data for all countries in the world, we need a GeoJSON file covering all countries. You can find the file **world.geo.json** in the zip file you downloaded for this lab.
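
Each feature in a GeoJSON file carries a geometry plus a `properties` dictionary, and a choropleth joins the data to the shapes through one of those properties. The sketch below shows how a dotted path such as `feature.properties.iso_a3` resolves inside a single feature. The tiny collection is made up for illustration: the polygon is a rough rectangle rather than a real boundary, the property names are assumed to mirror those in **world.geo.json**, and the helper `resolve` is our own.

```python
import json

# a minimal, hypothetical FeatureCollection with one feature
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "Greece", "iso_a3": "GRC"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[20.0, 35.0], [28.0, 35.0], [28.0, 41.5],
                         [20.0, 41.5], [20.0, 35.0]]]
      }
    }
  ]
}
"""


def resolve(feature, dotted_path):
    """Follow a path such as 'feature.properties.iso_a3' inside one feature."""
    node = feature
    for key in dotted_path.split(".")[1:]:  # skip the leading 'feature'
        node = node[key]
    return node


collection = json.loads(geojson_text)
codes = [resolve(f, "feature.properties.iso_a3") for f in collection["features"]]
print(codes)  # ['GRC']
```

A country whose code in the dataframe has no matching property value in the GeoJSON file cannot be colored, so it is worth checking that both sides use the same coding scheme.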

In [None]:
world_geo = 'world.geo.json'

'world.geo.json'

Then we create a `Choropleth` map, as follows:

In [16]:
# Set up Choropleth map for inbound tourists in 2020
world_map = folium.Map(location=[0, 0], zoom_start=2)

folium.Choropleth(
    geo_data=world_geo,
    data=df_tourism,
    # specifies the columns in the DataFrame to be used
    columns=['Country Code', '2020'],
    # specifies the key or property in the GeoJSON feature properties that matches the country codes.
    key_on='feature.properties.iso_a3',
    # determines the number of bins or color categories for the Choropleth map
    bins=6, 
    fill_color='Greens', # See https://colorbrewer2.org/ for more color schemes
    fill_opacity=1,
    line_opacity=0.2,
    legend_name='Number of Inbound Tourists in 2020',
    line_color="#0000",
    show=True,
    overlay=True,
    nan_fill_color = "White"
).add_to(world_map)

# display map
world_map

As you can see in the map legend, the darker the color of a country, the higher the number of inbound tourists for 2020.
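
How values are assigned to colors depends on the bin edges. The sketch below assumes that an integer `bins` value yields equal-width bins between the minimum and maximum of the data, which is our reading of Folium's behavior rather than something stated in this lab. The helper names and the sample values are made up for illustration.

```python
def equal_width_edges(values, n_bins):
    """Return n_bins + 1 equally spaced edges from min(values) to max(values)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]


def bin_index(value, edges):
    """Index of the bin containing value; the last bin includes the maximum."""
    for i in range(len(edges) - 2):
        if value < edges[i + 1]:
            return i
    return len(edges) - 2


# hypothetical arrival counts: four small countries and one very large one
arrivals = [1_951_000, 218_000, 6_406_000, 8_235_000, 90_000_000]

edges = equal_width_edges(arrivals, 6)
counts = [0] * 6
for value in arrivals:
    counts[bin_index(value, edges)] += 1

print(counts)  # [4, 0, 0, 0, 0, 1]
```

With skewed data like this, most areas land in the lightest bin. Passing an explicit list of edges to `bins`, or plotting a log-transformed column, are two common ways to spread the colors more evenly.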


**Question**: Generate a choropleth map to visualize the average yearly number of tourist arrivals for the years 2010-2020.

In [21]:
# solution: average the yearly arrivals over 2010-2020 for each country
df_tourism['Average'] = df_tourism.loc[:, '2010':'2020'].mean(axis=1)

world_map = folium.Map(location=[0, 0], zoom_start=2)

folium.Choropleth(
    geo_data=world_geo,
    data=df_tourism,
    # specifies the columns in the DataFrame to be used
    columns=['Country Code', 'Average'],
    # specifies the key or property in the GeoJSON feature properties that matches the country codes.
    key_on='feature.properties.iso_a3',
    # determines the number of bins or color categories for the Choropleth map
    bins=4, 
    fill_color='Greens', # See https://colorbrewer2.org/ for more color schemes
    fill_opacity=1,
    line_opacity=0.2,
    legend_name='Average Number of Inbound Tourists (2010-2020)',
    line_color="#0000",
    show=True,
    overlay=True,
    nan_fill_color = "White"
).add_to(world_map)

world_map
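
One caveat when averaging across year columns: the pandas `mean` method skips missing values by default, so countries with gaps are averaged over fewer years than countries with complete data. The pure-Python sketch below mimics that behavior; the function name and the sample figures are made up for illustration.

```python
import math


def mean_skipping_nan(values):
    """Average the non-missing values; return NaN if all are missing."""
    present = [v for v in values if not math.isnan(v)]
    if not present:
        return math.nan
    return sum(present) / len(present)


nan = math.nan
# hypothetical yearly arrivals for three countries
yearly = {
    "full": [100.0, 200.0, 300.0],
    "gappy": [100.0, nan, 300.0],
    "empty": [nan, nan, nan],
}

averages = {name: mean_skipping_nan(values) for name, values in yearly.items()}
for name, value in averages.items():
    print(name, value)
```

Here `full` and `gappy` both average to 200.0 even though one of them is missing a year, and `empty` stays missing, which on the map corresponds to the `nan_fill_color`.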