## Leaflet visualisation of ATMs in the Czech Republic

In this notebook, we will look at all the ATMs of banks in the Czech Republic.

We will visualize ATMs as points of different colors (based on the bank) and different sizes (based on average transactions per week) on an interactive Leaflet map.

Leaflet is a JavaScript library for interactive mapping and can be used in Python through the folium library. You can read more about Leaflet here: 

### How we will proceed

This notebook follows the process of getting the data, cleaning it and finally visualizing it. It is therefore divided into two parts, each representing a step in the process:

1. In the first part, we get the dataset from a webpage, clean it and save it as a CSV file.
2. In the second part, we calculate the new values needed for mapping and visualize them on a Leaflet background.

### Notable libraries

- parsel
- pandas
- folium

In [7]:
import requests
from parsel import Selector
import folium
import pandas as pd
import random

## Part 1 - Downloading and Cleaning ATM data

In [3]:
url = "http://www.kurzy.cz/banky/bankomaty/"  # all ATMs in CZ; 5509 on 20 June 2017
req = requests.get(url)
sel = Selector(req.text)
all_address = sel.xpath('//script[contains(.,"point_list")]').re_first(r'point_list = \[(.*)\]\];')

list_addresses = []
# first split by left bracket
for item in all_address.split('['):
    # then replace redundant characters
    replace_item1 = item.replace("<br /><b>GPS: </b>", ',')
    replace_item2 = replace_item1.replace("</b><br />", ",")
    replace_item = replace_item2.replace("'<b>", "")
    # and split an element (one ATM) by comma, creating a list of lists
    clean_item = replace_item.split(',')
    # strip whitespace
    strip_list = list(map(str.strip, clean_item))
    # creates list of lists
    list_addresses.append(strip_list)
    
len(list_addresses[1:])

5509
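The cleaning loop above can be tried offline on a small string. The sketch below is only an illustration: `sample` is a made-up fragment that mimics the markup the `replace()` calls look for (it is not copied from kurzy.cz), and `parse_points` is a hypothetical helper name.

```python
# Made-up sample: it only mimics the markup that the replace() calls expect,
# the real page content may differ in detail.
sample = (
    "[50.088289, 14.429073, '<b>Fio banka</b><br />Praha 1, Náměstí Republiky"
    "<br /><b>GPS: </b>50.088289, 14.429073'],"
    "[50.089492, 14.427249, '<b>Fio banka</b><br />Praha 1, Rybná 682/14"
    "<br /><b>GPS: </b>50.089492, 14.427249']"
)


def parse_points(raw):
    """Split the raw point list into one list of stripped fields per ATM."""
    rows = []
    for item in raw.split("["):
        if not item.strip():
            continue  # skip the empty chunk in front of the first bracket
        cleaned = (
            item.replace("<br /><b>GPS: </b>", ",")
            .replace("</b><br />", ",")
            .replace("'<b>", "")
        )
        rows.append([field.strip() for field in cleaned.split(",")])
    return rows


rows = parse_points(sample)
# The first five fields are latitude, longitude, bank and the two address parts
print(rows[0][:5])
```

Skipping empty chunks inside the helper removes the need to slice off the first element afterwards, as the notebook does with `list_addresses[1:]`.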

In [27]:
import warnings
warnings.filterwarnings('ignore')

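# Skip the first element (the text before the first bracket) and keep only the first five columns
GenericTable = pd.DataFrame(list_addresses[1:])
# .copy() gives an independent table, so adding columns does not write into a slice of GenericTable
ATM_Table = GenericTable.loc[:,0:4].copy()

# Name the columns - street and city do not come in a consistent order, hence the two mixed names
headers = ["Latitude", "Longitude", "Bank", "Street/City", "City/Street"]
ATM_Table.columns = headers
ATM_Table["Valid_To_Date"] = "2017-06-05"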

len(ATM_Table)

5509

Now we want to create a fictional transaction history for all ATMs, as we don't have real transactional data. But if somebody from a bank feels like providing it to me, I'll be more than happy to include real values here :)

Anyway, I want some outliers at the top and bottom.

In [28]:
# 10% bottom, 80% normal, 10% top
distribution = [1,4,5,5,5,5,5,5,6,9]
random.seed(521)
# Draw a separate random value for every row. It doesn't matter which column you choose,
# it only needs to be a Series so that apply() calls the function once per ATM.
ATM_Table["Avg_Weekly_Transactions"] = ATM_Table["Bank"].apply(lambda x: random.randrange(5000, 10000) * random.choice(distribution) * random.randint(5,10))
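The same idea can be sketched without pandas, which makes the value range easy to check. This is a minimal sketch, not the notebook's exact procedure: `fake_weekly_transactions` is a hypothetical helper, and it uses a local `random.Random` instance instead of the global seed.

```python
import random

# Same multipliers as the `distribution` list in the notebook
MULTIPLIERS = [1, 4, 5, 5, 5, 5, 5, 5, 6, 9]


def fake_weekly_transactions(n, seed=521):
    """Return n reproducible fictional weekly transaction values."""
    rng = random.Random(seed)  # local generator, the global random state stays untouched
    return [
        rng.randrange(5000, 10000) * rng.choice(MULTIPLIERS) * rng.randint(5, 10)
        for _ in range(n)
    ]


values = fake_weekly_transactions(5509)
# Bounds follow from the factors: 5000*1*5 at the low end, 9999*9*10 at the high end
print(min(values) >= 25_000, max(values) <= 899_910)
```

The multiplier list is what produces the outliers: a draw of 1 pushes a value towards the bottom of the range and a draw of 9 towards the top.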


In [10]:
# saving the file with windows encoding that shows czech characters correctly
ATM_Table.to_csv("ATMs_In_CR_Windows1250.csv", sep = ";", index = False, encoding = "cp1250")
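Why the code page matters can be shown with the standard library alone. The sketch below assumes nothing beyond Python's built-in `csv` module and codecs; the sample row is taken from the data shown in this notebook.

```python
import csv
import io

row = ["50.088289", "14.429073", "Česká spořitelna", "Praha 1", "Náměstí Republiky"]

# Write one semicolon-separated line and encode it the way the notebook does
buffer = io.StringIO()
csv.writer(buffer, delimiter=";").writerow(row)
encoded = buffer.getvalue().encode("cp1250")

# Decoding with the same code page restores the original text
decoded = next(csv.reader(io.StringIO(encoded.decode("cp1250")), delimiter=";"))

# A Western European code page cannot represent all Czech letters
try:
    "Česká spořitelna".encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

print(decoded == row, latin1_ok)
```

Whatever encoding is used for writing has to be passed again when reading, which is why `cp1250` appears in both `to_csv` and `read_csv` in this notebook.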

## Part 2 - Leaflet visualization

In [11]:
map_osm = folium.Map(location=[49.8971263, 15.7008136], zoom_start=7)

In [35]:
map_osm

In [33]:
import os 
os.getcwd()
print("Working directory")

Working directory


In [14]:
os.chdir('Directory with files')
ATMs = pd.read_csv("ATMs_In_CR_Windows1250.csv", sep = ";", encoding = "cp1250")

Adding color coding to each bank. As a first test, every ATM simply gets a random color.

In [15]:
import random
ATMs["Color"] = ATMs["Bank"].apply(lambda x: random.choice(["blue", "red"]))

In [16]:
ATMs.head()

|   | Latitude | Longitude | Bank | Street/City | City/Street | Valid_To_Date | Avg_Weekly_Transactions | Color |
|---|---|---|---|---|---|---|---|---|
| 0 | 50.088289 | 14.429073 | Fio banka | Praha 1 | Náměstí Republiky | 2017-06-05 | 40719 | blue |
| 1 | 50.089492 | 14.427249 | Fio banka | Praha 1 | Rybná 682/14 | 2017-06-05 | 352900 | red |
| 2 | 50.088228 | 14.431682 | Fio banka | Praha 1 | V Celnici 1028/10 | 2017-06-05 | 225890 | red |
| 3 | 50.087924 | 14.430162 | Fio banka | Praha 1 | V Celnici 1031/4 | 2017-06-05 | 203875 | blue |
| 4 | 50.077502 | 14.418 | Fio banka | Praha 2 | Odborů 278/4 | 2017-06-05 | 271512 | blue |


Now I need to create a function that assigns a color based on the name of the bank.

In [17]:
ATMs["Bank"].unique()

array(['Fio banka', 'Česká spořitelna', 'Raiffeisenbank', 'Sberbank CZ',
       'Oberbank AG', 'Air bank', 'Citibank', 'ČSOB', 'Komerční banka',
       'MONETA', 'UniCredit Bank', 'Poštovní spořitelna'], dtype=object)

In [18]:
# Hex colors based on each bank's logo color
BANK_COLORS = {
    "Fio banka": "#104e8b",
    "Česká spořitelna": "#ee2c2c",
    "Raiffeisenbank": "#eec900",
    "Sberbank CZ": "#228b22",
    "Oberbank AG": "#8b2500",
    "Air bank": "#8cc83c",
    "Citibank": "#1c86ee",
    "Poštovní spořitelna": "#ffd700",
    "ČSOB": "#00bfff",
    "Komerční banka": "#141414",
    "MONETA": "#5d478b",
    "UniCredit Bank": "#ff0000",
}

def bank_color(bank):
    '''Return the logo-based color for a bank name, or white for an unknown bank.'''
    return BANK_COLORS.get(bank, "#ffffff")

ATMs["Color"] = ATMs["Bank"].apply(bank_color)

In [19]:
ATMs.sample(100)

|   | Latitude | Longitude | Bank | Street/City | City/Street | Valid_To_Date | Avg_Weekly_Transactions | Color |
|---|---|---|---|---|---|---|---|---|
| 2319 | 49.592961 | 17.247874 | Komerční banka | Tř. Svobody 1035/14 | 779 00 Olomouc | 2017-06-05 | 313250 | #141414 |
| 3011 | 49.594191 | 18.010816 | MONETA | Masarykovo nám. 19 | Nový Jičín | 2017-06-05 | 239946 | #5d478b |
| 550 | 49.714800 | 16.266286 | Česká spořitelna | Palackého nám. 184 | Polička | 2017-06-05 | 336042 | #ee2c2c |
| 4591 | 49.880733 | 15.063376 | Poštovní spořitelna | Komenského 160 | Uhlířské Janovice | 2017-06-05 | 56448 | #ffd700 |
| 4589 | 49.018888 | 13.579542 | Poštovní spořitelna | č.p. 17 | Kvilda | 2017-06-05 | 51830 | #ffd700 |
| 4599 | 49.051950 | 15.809351 | Poštovní spořitelna | Nám. ČSA 39 | Moravavské Budějovice | 2017-06-05 | 204925 | #ffd700 |
| 2836 | 49.701310 | 13.426333 | Komerční banka | Písecká 1/972 | 326 00 Plzeň | 2017-06-05 | 379620 | #141414 |
| 961 | 50.057964 | 14.389393 | Česká spořitelna | Praha 5 - Radlice | Praha 5 | 2017-06-05 | 348201 | #ee2c2c |
| 5137 | 50.382744 | 13.268313 | Poštovní spořitelna | Poštovní 1553 | Kadaň | 2017-06-05 | 260820 | #ffd700 |
| 31 | 50.054430 | 14.486114 | Fio banka | Praha 10 | K Vodě 3200/3 | 2017-06-05 | 249900 | #104e8b |


In [34]:
map_osm = folium.Map(location=[49.8971263, 15.7008136], zoom_start=7)
#ATMs.loc[0:300,:].apply(lambda row:folium.Marker(location=[row["Latitude"], row["Longitude"]],
#                                                icon = folium.Icon(color = row["Color"]),
#                                                popup = "Bank: " + row["Bank"]).add_to(map_osm),
#         axis=1)

marker_cluster = folium.MarkerCluster().add_to(map_osm)

# using sample data to quickly generate a map instead of all data
ATMs.sample(200).apply(lambda row:folium.RegularPolygonMarker(
        [row["Latitude"], row["Longitude"]],
        # html popup: https://gis.stackexchange.com/questions/185897/how-can-i-include-html-in-a-folium-marker-popup
                popup= folium.Popup(folium.Html(''.join(['<b>', row["Bank"], '</b><div>', row["Street/City"], " ", row["City/Street"], '</div>',
                                        '<div><b>Weekly_Transactions: </b>', str(row['Avg_Weekly_Transactions']), ' CZK</div>'])
                            ,script=True)),
        fill_color=row["Color"],
        number_of_sides=4,
        radius=15
        ).add_to(marker_cluster),
    axis=1)

map_osm

Next, we add a size based on the transactions at every ATM. The idea is to have reasonably sized shapes, so I will have to normalize the values a little.

In [21]:
ATMs["Avg_Weekly_Transactions"].describe()
# The min value is 25k and max value is 900k, so 25k would have to be radius ~4 and 900k should be radius around ~25.

count      5509.000000
mean     283128.860773
std      137644.903252
min       25365.000000
25%      200280.000000
50%      270400.000000
75%      353592.000000
max      889740.000000
Name: Avg_Weekly_Transactions, dtype: float64

In [22]:
# What does a row in a dataframe look like?
ATMs.loc[1]

Latitude                        50.0895
Longitude                       14.4272
Bank                          Fio banka
Street/City                     Praha 1
City/Street                Rybná 682/14
Valid_To_Date                2017-06-05
Avg_Weekly_Transactions          352900
Color                           #104e8b
Name: 1, dtype: object

In [23]:
import math
math.sqrt(9), math.sqrt(300), math.log(125, 2), math.log(4500, 2), math.log(math.sqrt(25000), 2), math.log(math.sqrt(900000), 2)
# As we can see, sqrt is ideal to use here

(3.0,
 17.320508075688775,
 6.965784284662088,
 12.135709286104401,
 7.304820237218406,
 9.889782737939562)
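The effect of the square root can be checked on the actual minimum and maximum from `describe()`. The sketch below compares the scaling the notebook uses for the final map, `sqrt(value / 3000)`, with a hypothetical alternative (`radius_scaled`, not part of the notebook) that maps the square root linearly onto a chosen radius range.

```python
import math

V_MIN, V_MAX = 25365, 889740  # min and max taken from describe() above


def radius_simple(value):
    """Scaling used for the final map in this notebook: sqrt(value / 3000)."""
    return math.sqrt(value / 3000.0)


def radius_scaled(value, r_min=4.0, r_max=25.0):
    """Hypothetical alternative: map sqrt(value) linearly onto [r_min, r_max]."""
    t = (math.sqrt(value) - math.sqrt(V_MIN)) / (math.sqrt(V_MAX) - math.sqrt(V_MIN))
    return r_min + t * (r_max - r_min)


for v in (V_MIN, 270400, V_MAX):
    print(v, round(radius_simple(v), 1), round(radius_scaled(v), 1))
```

The simple version needs only one hand-tuned constant, while the scaled version hits the target radii exactly at the cost of depending on the data's minimum and maximum.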

Finally, we plot the whole map with all the points.

In [24]:
# how to correctly join list of strings
row = ATMs.loc[1]
''.join(['<b>' , row["Bank"] , '</b><div>' ,row["Street/City"] , " " , row["City/Street"] , '</div>',
        '<div><b>Avg_Transactions: </b>' , str(row['Avg_Weekly_Transactions']) , '</div>'])

'<b>Fio banka</b><div>Praha 1 Rybná 682/14</div><div><b>Avg_Transactions: </b>352900</div>'
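The join above can be wrapped in a small helper so that both maps build their popups the same way. This is a sketch under assumptions: `popup_html` is a hypothetical name, and escaping the text fields with `html.escape` is an addition of mine, in case an address ever contains characters such as `&` or `<`.

```python
from html import escape


def popup_html(row):
    """Build the popup markup for one ATM; `row` is any mapping with the notebook's column names."""
    return "".join([
        "<b>", escape(str(row["Bank"])), "</b>",
        "<div>", escape(str(row["Street/City"])), " ", escape(str(row["City/Street"])), "</div>",
        "<div><b>Avg_Transactions: </b>", str(row["Avg_Weekly_Transactions"]), "</div>",
    ])


example = {
    "Bank": "Fio banka",
    "Street/City": "Praha 1",
    "City/Street": "Rybná 682/14",
    "Avg_Weekly_Transactions": 352900,
}
print(popup_html(example))
```

Because the helper only uses item access, it should work equally with a plain dict and with a row taken from the dataframe.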

In [31]:
map_colors_sizes = folium.Map(location=[49.8971263, 15.7008136], zoom_start=7)

marker_cluster = folium.MarkerCluster().add_to(map_colors_sizes)

ATMs.apply(lambda row:folium.RegularPolygonMarker(
        [row["Latitude"], row["Longitude"]],
        color = "#ffffff",
        # html popup: https://gis.stackexchange.com/questions/185897/how-can-i-include-html-in-a-folium-marker-popup
        popup= folium.Popup(folium.Html(''.join(['<b>', row["Bank"], '</b><div>', row["Street/City"], " ", row["City/Street"], '</div>',
                                        '<div><b>Weekly_Transactions: </b>', str(row['Avg_Weekly_Transactions']), ' CZK</div>'])
                            ,script=True)),
        #popup= row["Bank"] + ", " + row["Street/City"] + " " + row["City/Street"],
        fill_color=row["Color"],
        number_of_sides=8,
        # radius based on manual experimentation of resulting circle values
        radius=math.sqrt((row["Avg_Weekly_Transactions"]/3000.0))
        ).add_to(marker_cluster),
    axis=1)

# map_colors_sizes

0       <folium.features.RegularPolygonMarker object a...
1       <folium.features.RegularPolygonMarker object a...
2       <folium.features.RegularPolygonMarker object a...
3       <folium.features.RegularPolygonMarker object a...
4       <folium.features.RegularPolygonMarker object a...
...

In [32]:
map_colors_sizes.save(outfile = "ATMs_CzechRepublic_5509.html")

You can view the resulting map at this link: http://52.59.160.3:3838/sample-apps/ATMs_CZ/ATMs_CzechRepublic_5509.html

What is not cool in this map?

- ~~The popup window doesn't allow new lines as it takes only raw strings. See here: https://github.com/python-visualization/folium/issues/469~~ 
- **Edit:** It is possible to use an HTML-formatted popup by concatenating strings with nested syntax: `folium.Popup(folium.Html(''.join([list, of, strings, here])))` 
- There is no option to add a legend to the map plot through the folium interface. See here: https://github.com/python-visualization/folium/issues/528
- There is no way to filter the displayed values by a selected category (name of the bank, or amount of transactions less/more than x).

All of these issues could be solved by manually editing the resulting .html file at specific places. But that is neither user-friendly nor easily reusable :(
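As an illustration of the manual route for the legend, the markup can at least be generated from the color mapping instead of being typed by hand. The sketch below only builds an HTML string; `legend_html` is a hypothetical helper, the inline styles are arbitrary choices, and the dictionary is a subset of the bank colors used earlier.

```python
# Subset of the bank colors used in this notebook
bank_colors = {
    "Fio banka": "#104e8b",
    "Česká spořitelna": "#ee2c2c",
    "Komerční banka": "#141414",
}


def legend_html(colors, title="Banks"):
    """Return a fixed-position HTML box with one colored swatch per bank."""
    items = "".join(
        '<div><span style="display:inline-block;width:12px;height:12px;'
        'background:{color};margin-right:6px;"></span>{name}</div>'.format(
            color=color, name=name
        )
        for name, color in colors.items()
    )
    return (
        '<div style="position:fixed;bottom:30px;left:30px;z-index:9999;'
        'background:white;padding:8px;border:1px solid grey;">'
        "<b>{title}</b>{items}</div>"
    ).format(title=title, items=items)


legend = legend_html(bank_colors)
print(legend)
```

The resulting string could be pasted into the saved .html file. Depending on the folium version, it may also be possible to attach it from Python with something like `map_colors_sizes.get_root().html.add_child(folium.Element(legend))`, but treat that call as an assumption to verify against the installed version.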

### What we accomplished

We were able to parse data about ATMs in the Czech Republic from the kurzy.cz webpage, which also uses the Leaflet library, although the parsing code isn't neat.
If you know how to parse it better, please add your answer here: https://stackoverflow.com/questions/44353476/get-coordinates-from-leaflet-app-embedded-in-webpage-using-python-scraping

We added additional info to the dataset before visualizing it: color coding of banks and a marker radius based on the fictional transactions of each ATM.

### ... and what we didn't
I couldn't show a legend matching the banks to their colors, ~~and we don't~~ but we do have a popup window as neatly formatted as the one on the kurzy.cz page. The missing legend may only be down to lack of trying, because it could be added manually in the produced HTML.

Folium is still in development, as it is an open-source library maintained by cool guys in their free time.