# Map visualization with Python

In [1]:
import numpy as np  # useful for scientific computing in Python
import pandas as pd # primary data structure library

## 1. Introduction to Folium

Folium is a Python library for creating several types of Leaflet maps. Because Folium maps are interactive, the library is well suited to building dashboards. Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library: manipulate your data in Python, then visualize it on a Leaflet map via Folium.

In [None]:
! pip install folium

In [2]:
import folium

In [3]:
# define the world map
world_map = folium.Map()

# display world map
world_map

In [4]:
# define the world map centered around Seoul with a low zoom level
world_map = folium.Map(location=[37.5665, 126.978], zoom_start=4)

# display world map
world_map
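
To get a feel for what `zoom_start` means, the sketch below computes the approximate ground resolution of a Web Mercator tile map at a given latitude and zoom level. It assumes 256-pixel tiles (the Leaflet default); the helper name `meters_per_pixel` is only for illustration and is not part of Folium.

```python
import math

EARTH_CIRCUMFERENCE_M = 40_075_016.686  # equatorial circumference (WGS 84)
TILE_SIZE_PX = 256                      # assumed tile size (Leaflet default)

def meters_per_pixel(latitude_deg: float, zoom: int) -> float:
    """Approximate ground distance covered by one pixel in Web Mercator."""
    world_width_px = TILE_SIZE_PX * 2 ** zoom  # map width doubles with each zoom step
    return EARTH_CIRCUMFERENCE_M * math.cos(math.radians(latitude_deg)) / world_width_px

# Compare the zoom levels used in this notebook at Seoul's latitude
for zoom in (4, 11, 12):
    print(f"zoom {zoom:>2}: ~{meters_per_pixel(37.5665, zoom):,.0f} m per pixel")
```

Each extra zoom level halves the distance per pixel, which is why a zoom of 4 shows a whole region while 11 or 12 shows a single city.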

### Different styles of maps

#### A. Stamen Toner maps: black and white

In [5]:
world_map = folium.Map(location=[37.5665, 126.978], zoom_start=4, tiles='Stamen Toner')
world_map

#### B. Stamen Terrain maps: hill shading and natural vegetation colors

In [6]:
world_map = folium.Map(location=[37.5665, 126.978], zoom_start=4, tiles='Stamen Terrain')
world_map

#### C. Mapbox Bright maps
Similar to the default style, except that borders are not visible at low zoom levels. Also, unlike the default style, which displays country names in each country's native language, the Mapbox Bright style displays all country names in English.

In [7]:
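# NOTE: this cell passes tiles='Stamen Terrain'; the style described above corresponds to
# tiles='Mapbox Bright', which was a built-in option only in older Folium releases
world_map = folium.Map(location=[37.5665, 126.978], zoom_start=6, tiles='Stamen Terrain')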
world_map

## 2. Map with markers

Practice by marking police department incidents using San Francisco crime data.

Data: San Francisco Police Department incidents for the year 2016, [Police Department Incidents](https://data.sfgov.org/Public-Safety/Police-Department-Incidents-Previous-Year-2016-/ritf-b9ki) from the San Francisco public data portal. The incidents are derived from the San Francisco Police Department (SFPD) Crime Incident Reporting system. The dataset was updated daily and covers the entire year of 2016. Addresses and locations have been anonymized by moving them to mid-block or to an intersection.

In [8]:
df_incidents = pd.read_csv('California crime 2016.csv')

In [9]:
df_incidents.head()

Unnamed: 0,IncidntNum,Category,Descript,DayOfWeek,Date,Time,PdDistrict,Resolution,Address,X,Y,Location,PdId
0,120058272,WEAPON LAWS,POSS OF PROHIBITED WEAPON,Friday,01/29/2016 12:00:00 AM,11:00,SOUTHERN,"ARREST, BOOKED",800 Block of BRYANT ST,-122.403405,37.775421,"(37.775420706711, -122.403404791479)",12005827212120
1,120058272,WEAPON LAWS,"FIREARM, LOADED, IN VEHICLE, POSSESSION OR USE",Friday,01/29/2016 12:00:00 AM,11:00,SOUTHERN,"ARREST, BOOKED",800 Block of BRYANT ST,-122.403405,37.775421,"(37.775420706711, -122.403404791479)",12005827212168
2,141059263,WARRANTS,WARRANT ARREST,Monday,04/25/2016 12:00:00 AM,14:59,BAYVIEW,"ARREST, BOOKED",KEITH ST / SHAFTER AV,-122.388856,37.729981,"(37.7299809672996, -122.388856204292)",14105926363010
3,160013662,NON-CRIMINAL,LOST PROPERTY,Tuesday,01/05/2016 12:00:00 AM,23:50,TENDERLOIN,NONE,JONES ST / OFARRELL ST,-122.412971,37.785788,"(37.7857883766888, -122.412970537591)",16001366271000
4,160002740,NON-CRIMINAL,LOST PROPERTY,Friday,01/01/2016 12:00:00 AM,00:30,MISSION,NONE,16TH ST / MISSION ST,-122.419672,37.76505,"(37.7650501214668, -122.419671780296)",16000274071000
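
Note that in this table `X` holds the longitude and `Y` the latitude, while the `Location` string repeats both in `(latitude, longitude)` order. A minimal sketch of parsing that string with the standard library, using the first row shown above; the helper name `parse_location` is hypothetical.

```python
import ast

def parse_location(text: str) -> tuple[float, float]:
    """Turn a '(lat, lng)' string into a (lat, lng) tuple of floats."""
    lat, lng = ast.literal_eval(text)
    return float(lat), float(lng)

# Location string from the first row of the table above
lat, lng = parse_location("(37.775420706711, -122.403404791479)")
x, y = -122.403405, 37.775421  # X and Y columns of the same row

# Y matches the latitude and X the longitude (up to rounding)
print(abs(lat - y) < 1e-6, abs(lng - x) < 1e-6)
```

Folium expects `[latitude, longitude]`, so markers are built from `[Y, X]`, not `[X, Y]`.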


In [10]:
df_incidents.shape

(150500, 13)

The data includes 150,500 crimes and 13 variables. Let's just work with the first 100 incidents in this dataset.

In [11]:
# get the first 100 crimes in the df_incidents dataframe
df_incidents = df_incidents.iloc[0:100, :]

Visualize where crimes took place in San Francisco. 

In [12]:
# San Francisco latitude and longitude values
latitude = 37.77
longitude = -122.42
# create map and display it
sanfran_map = folium.Map(location=[latitude, longitude], zoom_start=12)

# display the map of San Francisco
sanfran_map

In [13]:
sanfran_map = folium.Map(location=[latitude, longitude], zoom_start=12)


### A. Marking crime locations
Now let's superimpose the locations of the crimes onto the map. 

In [17]:
for lat, lng in zip(df_incidents.Y, df_incidents.X):  # Y is latitude, X is longitude
    folium.CircleMarker(
        location=[lat, lng],
        radius=5,  # define how big you want the circle markers to be
        color='crimson',
        fill=True,
        fill_color='crimson'
    ).add_to(sanfran_map)

In [18]:
sanfran_map

### B. Labeling markers with the crime category
Let's make each marker display the category of the crime in a pop-up when clicked.

In [19]:
# add pop-up text to each marker on the map
latitudes = list(df_incidents.Y)
longitudes = list(df_incidents.X)
labels = list(df_incidents.Category)

for lat, lng, label in zip(latitudes, longitudes, labels):
    folium.Marker([lat, lng], popup=label).add_to(sanfran_map)    
    
    
sanfran_map 

In [16]:
# create map and display it
sanfran_map = folium.Map(location=[latitude, longitude], zoom_start=12)

# loop through the 100 crimes and add each to the map
for lat, lng, label in zip(df_incidents.Y, df_incidents.X, df_incidents.Category):
    folium.CircleMarker(
        location=[lat, lng],
        radius=5,  # define how big you want the circle markers to be
        color='crimson',
        fill=True,
        fill_color='crimson',
        popup=label
    ).add_to(sanfran_map)

# show map
sanfran_map

### C. Grouping markers into clusters to reduce clutter


Showing every marker individually makes the map cluttered. A cleaner remedy is to group the markers into clusters. Each cluster is then represented by the number of crimes in its neighborhood. These clusters can be thought of as pockets of San Francisco that you can then analyze separately.

To implement this, we start off by instantiating a *MarkerCluster* object and adding all the data points in the dataframe to this object.

In [20]:
from folium import plugins

# let's start again with a clean copy of the map of San Francisco
sanfran_map = folium.Map(location = [latitude, longitude], zoom_start = 12)

# instantiate a marker cluster object for the incidents in the dataframe
incidents = plugins.MarkerCluster().add_to(sanfran_map)

# loop through the dataframe and add each data point to the marker cluster
for lat, lng, label in zip(df_incidents.Y, df_incidents.X, df_incidents.Category):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,
    ).add_to(incidents)

# display map
sanfran_map
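
To illustrate the idea behind clustering, the sketch below bins points into a coarse latitude/longitude grid and counts how many fall into each cell. This is a deliberately simplified stand-in, not the algorithm `MarkerCluster` uses, and the cell size of 0.01 degrees is an arbitrary assumption. The coordinates are the five rows shown in the incidents table earlier.

```python
from collections import Counter
import math

def grid_cluster(points, cell_deg=0.01):
    """Count points per grid cell; keys are (lat_index, lng_index) pairs."""
    counts = Counter()
    for lat, lng in points:
        cell = (math.floor(lat / cell_deg), math.floor(lng / cell_deg))
        counts[cell] += 1
    return counts

points = [
    (37.775421, -122.403405),
    (37.775421, -122.403405),
    (37.729981, -122.388856),
    (37.785788, -122.412971),
    (37.765050, -122.419672),
]

# Cells with the most incidents come first
for cell, n in grid_cluster(points).most_common():
    print(cell, n)
```

A real clustering plugin also regroups the markers as the zoom level changes, which a fixed grid like this does not attempt.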

## 3. Choropleth map

A choropleth map is a thematic map in which areas are shaded or patterned in proportion to the statistical variable being displayed, such as population density or per-capita income. It provides an easy way to visualize how a measurement varies across a geographic area or to show the level of variability within a region. The examples below shade the districts of Seoul by crime statistics.
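
The core of a choropleth is a mapping from each area's value to a color class. A minimal sketch of that step, assuming equal-width bins and a hypothetical four-color palette (Folium's own binning and palettes may differ). The values are taken from the 살인 (murder) column of the Seoul table shown further down.

```python
def assign_color_classes(values, palette):
    """Map each key's value to a palette color using equal-width bins."""
    lo, hi = min(values.values()), max(values.values())
    width = (hi - lo) / len(palette) or 1.0  # fall back to 1.0 if all values are equal
    classes = {}
    for key, value in values.items():
        index = min(int((value - lo) / width), len(palette) - 1)  # clamp the maximum
        classes[key] = palette[index]
    return classes

# 살인 (murder) column for five Seoul districts
murders = {"강남구": 0.916667, "강동구": 0.166667, "강북구": 0.416667,
           "관악구": 0.583333, "광진구": 0.166667}
palette = ["#f1eef6", "#d7b5d8", "#df65b0", "#ce1256"]  # light to dark, illustrative

for district, color in assign_color_classes(murders, palette).items():
    print(district, color)
```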

In [21]:
import json
import folium
import warnings
warnings.simplefilter(action = "ignore", category = FutureWarning)

geo_path = 'skorea_municipalities_geo_simple.json'
with open(geo_path, encoding='utf-8') as f:
    geo_str = json.load(f)  # parsed GeoJSON (a dict, despite the variable name)
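
A choropleth can only color a district if the key in the GeoJSON matches a key in the data. The sketch below checks that on a tiny hand-written GeoJSON fragment. It assumes, as `key_on = 'feature.id'` later in this notebook implies, that each feature carries the district name in its `id` field. The geometry is a placeholder, not real boundary data, and `missing_keys` is a hypothetical helper.

```python
def missing_keys(geojson: dict, data_keys) -> set:
    """Return feature ids that have no matching entry in the data."""
    feature_ids = {feature["id"] for feature in geojson["features"]}
    return feature_ids - set(data_keys)

# Placeholder GeoJSON with the overall shape of a FeatureCollection
sample_geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "강남구",
         "geometry": {"type": "Polygon",
                      "coordinates": [[[0, 0], [0, 1], [1, 1], [0, 0]]]}},
        {"type": "Feature", "id": "강동구",
         "geometry": {"type": "Polygon",
                      "coordinates": [[[1, 1], [1, 2], [2, 2], [1, 1]]]}},
    ],
}

# Districts present in the GeoJSON but absent from the data
print(missing_keys(sample_geo, ["강남구"]))
```

Running the same check against `geo_str` and the index of the crime table is a quick way to spot spelling or encoding mismatches before drawing the map.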

Practice with data on the number of crimes in the districts of Seoul in 2016. Data: 서울범죄2016 (Seoul crime 2016). This data is free to use and was made public by the government of Korea. The crime counts are cumulative totals for the entire year of 2016.

In [22]:
# `convert_float` and `encoding` are no longer accepted by read_excel in recent pandas versions
df = pd.read_excel('서울범죄2016.xlsx')
df.head()

Unnamed: 0,구별,강간,강도,살인,절도,폭력,강간검거율,강도검거율,살인검거율,절도검거율,폭력검거율,인구수,CCTV,범죄,검거
0,강남구,1.0,0.941176,0.916667,0.953472,0.661386,77.728285,85.714286,76.923077,42.857143,86.484594,570500,2780,0.89454,85.463066
1,강동구,0.15562,0.058824,0.166667,0.445775,0.289667,78.846154,100.0,75.0,33.347422,82.890855,453233,773,0.22331,85.550226
2,강북구,0.146974,0.529412,0.416667,0.126924,0.274769,82.352941,92.857143,100.0,43.096234,88.637222,330192,748,0.298949,94.070728
3,관악구,0.628242,0.411765,0.583333,0.562094,0.428234,69.0625,100.0,88.888889,30.561715,80.109157,525515,1496,0.522733,85.212224
4,광진구,0.397695,0.529412,0.166667,0.67157,0.269094,91.666667,100.0,100.0,42.200925,83.047619,372164,707,0.406888,96.37582


To inspect the data more easily, we build a table indexed by district (구별) using `pd.pivot_table`.

In [23]:
guDF = pd.pivot_table(df, index='구별', aggfunc='sum')  # string alias instead of np.sum
guDF.head()

구별,CCTV,강간,강간검거율,강도,강도검거율,검거,범죄,살인,살인검거율,인구수,절도,절도검거율,폭력,폭력검거율
강남구,2780,1.0,77.728285,0.941176,85.714286,85.463066,0.89454,0.916667,76.923077,570500,0.953472,42.857143,0.661386,86.484594
강동구,773,0.15562,78.846154,0.058824,100.0,85.550226,0.22331,0.166667,75.0,453233,0.445775,33.347422,0.289667,82.890855
강북구,748,0.146974,82.352941,0.529412,92.857143,94.070728,0.298949,0.416667,100.0,330192,0.126924,43.096234,0.274769,88.637222
관악구,1496,0.628242,69.0625,0.411765,100.0,85.212224,0.522733,0.583333,88.888889,525515,0.562094,30.561715,0.428234,80.109157
광진구,707,0.397695,91.666667,0.529412,100.0,96.37582,0.406888,0.166667,100.0,372164,0.67157,42.200925,0.269094,83.047619
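
In the rows shown, each district appears only once, so `pivot_table` here mainly turns `구별` into the index and sorts the columns. The aggregation becomes visible when a key repeats, as in this small sketch with made-up numbers (the column names and values are illustrative only).

```python
import pandas as pd

# Illustrative data: district "A" appears twice
toy = pd.DataFrame({
    "district": ["A", "A", "B"],
    "crimes": [3, 4, 10],
    "cctv": [1, 2, 5],
})

# Rows sharing an index value are combined with the aggregation function
summary = pd.pivot_table(toy, index="district", aggfunc="sum")
print(summary)
```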


In [24]:
map = folium.Map(location=[37.5502, 126.982], zoom_start=11, tiles='Stamen Toner')

In [84]:
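# Map.choropleth() was deprecated and later removed from Folium; folium.Choropleth replaces it
folium.Choropleth(geo_data = geo_str,
                  data = guDF['살인'],       # Series: district name -> value
                  fill_color = 'PuRd',      # PuRd, YlGnBu
                  key_on = 'feature.id').add_to(map)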
map

In contrast to the number of murders, the next map applies a choropleth of the murder arrest rate (살인검거율) to the map of Seoul.

In [25]:
map = folium.Map(location=[37.5502, 126.982], zoom_start=11, tiles='Stamen Toner')

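# Map.choropleth() was deprecated and later removed from Folium; folium.Choropleth replaces it
folium.Choropleth(geo_data = geo_str,
                  data = guDF['살인검거율'],  # Series: district name -> value
                  fill_color = 'YlGnBu',    # PuRd, YlGnBu
                  key_on = 'feature.id').add_to(map)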
map

## 4. Export as image

Folium generates an interactive Leaflet map as an HTML5 document, but sometimes a static image of the map is needed.

### A. Save as html and capture

In [48]:
map.save('map.html')

### B. Capture using selenium

In [42]:
!pip install selenium



In [56]:
import selenium.webdriver

In [60]:
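from pathlib import Path
# PhantomJS support was removed in Selenium 4; this assumes headless Chrome is available instead
options = selenium.webdriver.ChromeOptions()
options.add_argument('--headless')
driver = selenium.webdriver.Chrome(options=options)
driver.set_window_size(1000, 800)  # choose a resolution
driver.get(Path('map.html').resolve().as_uri())  # the browser needs a file:// URL
# You may need to add time.sleep(seconds) here
driver.save_screenshot('screenshot.png')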

True
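
Instead of a fixed `time.sleep`, one option is to poll until some condition holds. A minimal sketch of such a helper using only the standard library; `wait_until` is a hypothetical name, and which condition indicates that the map has finished loading is left as an assumption for the caller.

```python
import time

def wait_until(predicate, timeout=10.0, interval=0.25):
    """Poll predicate() until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return bool(predicate())  # one last check at the deadline

# Stand-in condition for demonstration: becomes true after about half a second
start = time.monotonic()
ready = wait_until(lambda: time.monotonic() - start > 0.5, timeout=5.0)
print(ready)
```

With Selenium, the predicate could for example wrap `driver.execute_script("return document.readyState") == "complete"`, although map tiles may still be loading at that point, so a short extra pause can still be necessary.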

[References]
- Coursera course: Data Visualization with Python
- https://pinkwink.kr/
- https://python-graph-gallery.com/