# Swedish maps for data using Python and folium

### Background

Working with Swedish maps in Python is hard for a couple of reasons. Working with Swedish maps on Windows, using folium for visualization, is even harder. That's why I gave up at least three times. You might think that you can just download the GeoJSON files. But no. The files are only published in ArcView shape and MapInfo TAB formats. Even if you manage to convert them, Swedish names contain special characters like **å**, **ä**, and **ö**. Folium couldn't handle those when I tried. And then you find out that the coordinates in your freshly generated GeoJSON file are in the wrong coordinate system.

Thanks to a lot of reading and [this](https://ocefpaf.github.io/python4oceanographers/blog/2015/12/14/geopandas_folium/) blog post, I was finally able to solve it. Thank you, Filipe Fernandes. So here's how you do it.

### Files

First we download shapefiles from either [Statistics Sweden](https://www.scb.se/) or [The Swedish Election Authority](https://www.val.se/servicelankar/other-languages/english-engelska/about-us/the-swedish-election-authority.html) and extract the necessary files from the archive, for example ```Lan_RT90_region.zip```. Use the links below to get to the download pages.

The file from the Swedish Election Authority is formatted a little differently, so if you use that one you will need to modify the code a bit. It mainly differs in that it uses the full names for regions and the key has a different name.

* https://www.scb.se/hitta-statistik/regional-statistik-och-kartor/regionala-indelningar/digitala-granser/
* https://data.val.se/val/val2018/statistik/#gis

## Imports

We will need the following libraries:

In [121]:
import pandas as pd
import geopandas as geo
import pyepsg
import json
import folium

## Making a GeoJSON file

First we define the path to the shapefile we downloaded earlier.

In [122]:
shape = 'geodata/lan/Lan_RT90_region.shx'

Then we create a ```GeoDataFrame``` object from that file.

In [123]:
lan = geo.GeoDataFrame.from_file(shape)

Using ```pyepsg``` we can determine that our file uses the wrong coordinate system for ```folium```. The file is in RT90 2.5 gon V (EPSG ```3021```), but folium expects WGS 84 latitude and longitude, which is EPSG ```4326```.

In [124]:
pyepsg.get(lan.crs['init'].split(':')[1])

<ProjectedCRS: 3021, RT90 2.5 gon V>
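
The ```lan.crs['init']``` lookup relies on the dict-style CRS used by older geopandas releases. Newer releases expose ```crs``` as a pyproj ```CRS``` object, which offers a ```to_epsg()``` method instead. Here is a minimal sketch of a helper that copes with both forms. The name ```epsg_code``` is made up for this example.

```python
def epsg_code(crs):
    """Return the EPSG code of a CRS as an int, or None if it is unknown.

    Handles the legacy dict form ({'init': 'epsg:3021'}) as well as
    objects that offer a to_epsg() method, such as pyproj's CRS.
    """
    if hasattr(crs, 'to_epsg'):
        return crs.to_epsg()
    if isinstance(crs, dict) and 'init' in crs:
        return int(crs['init'].split(':')[1])
    return None


# Legacy dict form, as used in the cell above
print(epsg_code({'init': 'epsg:3021'}))  # 3021
```

With a loaded GeoDataFrame the call would be ```epsg_code(lan.crs)```.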

All we would normally have to do now is convert it to the correct coordinate system and turn the result into GeoJSON. We can do both with the following code.

In [125]:
gjson = lan.to_crs(epsg=4326).to_json()

But since this is Swedish, the GeoJSON needs some additional work. We will get to that in the next section.
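
A quick way to confirm that the conversion worked is to check that every coordinate falls within longitude and latitude ranges. GeoJSON stores positions as ```[longitude, latitude]```, while projected systems like RT90 use values in the millions. The sketch below is one way to do that check for simple geometries. The function names and the sample coordinates are made up for illustration.

```python
import json


def iter_positions(coords):
    """Yield every [x, y] position from a nested GeoJSON coordinate array."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for part in coords:
            yield from iter_positions(part)


def looks_like_wgs84(geojson_text):
    """Return True if all positions fit within longitude/latitude ranges."""
    collection = json.loads(geojson_text)
    for feature in collection['features']:
        for x, y, *_ in iter_positions(feature['geometry']['coordinates']):
            if not (-180 <= x <= 180 and -90 <= y <= 90):
                return False
    return True


# Tiny hand-made samples; the coordinate values are made up for illustration
projected = json.dumps({'type': 'FeatureCollection', 'features': [
    {'type': 'Feature', 'properties': {},
     'geometry': {'type': 'Polygon',
                  'coordinates': [[[1628000, 6580000], [1629000, 6580000],
                                   [1629000, 6581000], [1628000, 6580000]]]}}]})
geographic = json.dumps({'type': 'FeatureCollection', 'features': [
    {'type': 'Feature', 'properties': {},
     'geometry': {'type': 'Point', 'coordinates': [18.07, 59.33]}}]})

print(looks_like_wgs84(projected))   # False
print(looks_like_wgs84(geographic))  # True
```

After the conversion, ```looks_like_wgs84(gjson)``` should return ```True```.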

## Removing Swedish characters

In this section we remove the Swedish characters. If we check the ```gjson``` data we will see that the Swedish names contain special characters. Also, ```LnKod``` holds numeric codes, and the region names lack the *län* suffix.

In [126]:
lan = json.loads(gjson)
for item in lan['features']:
    print(item['properties'])

{'LnKod': '01', 'LnNamn': 'Stockholms'}
{'LnKod': '03', 'LnNamn': 'Uppsala'}
{'LnKod': '04', 'LnNamn': 'Södermanlands'}
{'LnKod': '05', 'LnNamn': 'Östergötlands'}
{'LnKod': '06', 'LnNamn': 'Jönköpings'}
{'LnKod': '07', 'LnNamn': 'Kronobergs'}
{'LnKod': '08', 'LnNamn': 'Kalmar'}
{'LnKod': '09', 'LnNamn': 'Gotlands'}
{'LnKod': '10', 'LnNamn': 'Blekinge'}
{'LnKod': '12', 'LnNamn': 'Skåne'}
{'LnKod': '13', 'LnNamn': 'Hallands'}
{'LnKod': '14', 'LnNamn': 'Västra Götalands'}
{'LnKod': '17', 'LnNamn': 'Värmlands'}
{'LnKod': '18', 'LnNamn': 'Örebro'}
{'LnKod': '19', 'LnNamn': 'Västmanlands'}
{'LnKod': '20', 'LnNamn': 'Dalarnas'}
{'LnKod': '21', 'LnNamn': 'Gävleborgs'}
{'LnKod': '22', 'LnNamn': 'Västernorrlands'}
{'LnKod': '23', 'LnNamn': 'Jämtlands'}
{'LnKod': '24', 'LnNamn': 'Västerbottens'}
{'LnKod': '25', 'LnNamn': 'Norrbottens'}
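
Instead of scanning the printout by eye, we can list the property values that contain non-ASCII characters. This is a small sketch using ```str.isascii()``` (Python 3.7 or later). The function name is hypothetical, and the sample reuses three of the entries printed above.

```python
def non_ascii_values(features):
    """Return (key, value) pairs whose string value has non-ASCII characters."""
    found = []
    for feature in features:
        for key, value in feature['properties'].items():
            if isinstance(value, str) and not value.isascii():
                found.append((key, value))
    return found


# Three of the properties printed above, wrapped as minimal features
sample = [
    {'properties': {'LnKod': '01', 'LnNamn': 'Stockholms'}},
    {'properties': {'LnKod': '04', 'LnNamn': 'Södermanlands'}},
    {'properties': {'LnKod': '12', 'LnNamn': 'Skåne'}},
]
print(non_ascii_values(sample))
# [('LnNamn', 'Södermanlands'), ('LnNamn', 'Skåne')]
```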


Let's get a list of the correct names. We can use the same file that we will use later to create a choropleth map.

We download statistics on layoff notices (*varsel*) in Sweden. The file used here covers July 2020, with June 2020 and July 2019 for comparison. The statistics are published by the Swedish Public Employment Service (Arbetsförmedlingen).

* https://arbetsformedlingen.se/om-oss/statistik-och-analyser/statistik

Let's load the file into pandas and check the head.

In [127]:
afstat = pd.read_excel('data/varsel-lan-2020-07.xls', header=4, skipfooter=9)
afstat.head()

,Unnamed: 0,Unnamed: 1,2020-07,2020-06,2019-07
0,AB,Stockholms län,2144,4718,1067
1,C,Uppsala län,45,52,32
2,D,Södermanlands län,5,245,88
3,E,Östergötlands län,44,154,191
4,F,Jönköpings län,50,267,26


The first thing we notice is that the first two columns are unnamed, so let's change that. The first column is a one- or two-character code associated with the region and the second is the region's full name. So let's rename the columns to ```Kod``` and ```Region```.

In [128]:
afstat.rename(columns={'Unnamed: 0':'Kod', 'Unnamed: 1':'Region'}, inplace=True)

Now let's create a new column without our banes *å*, *ä* and *ö*. First we put them in a dictionary with the banes as the keys and the characters we want them changed to as the values. Then we use pandas' ```replace``` method on the ```Region``` column and pass the result to a new column we call ```Reg```.

In [129]:
banes = {'å':'a', 'ä':'a', 'ö':'o', 'Å':'A', 'Ä':'A', 'Ö':'O'}

afstat['Reg'] = afstat['Region'].replace(banes, regex=True)

afstat

,Kod,Region,2020-07,2020-06,2019-07,Reg
0,AB,Stockholms län,2144,4718,1067,Stockholms lan
1,C,Uppsala län,45,52,32,Uppsala lan
2,D,Södermanlands län,5,245,88,Sodermanlands lan
3,E,Östergötlands län,44,154,191,Ostergotlands lan
4,F,Jönköpings län,50,267,26,Jonkopings lan
5,G,Kronobergs län,18,58,39,Kronobergs lan
6,H,Kalmar län,125,719,35,Kalmar lan
7,I,Gotlands län,0,20,0,Gotlands lan
8,K,Blekinge län,17,5,10,Blekinge lan
9,M,Skåne län,258,524,304,Skane lan
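
With ```regex=True```, each dictionary key is treated as a pattern and replaced wherever it occurs inside a string, not only when it matches the whole cell. For plain Python strings the same substitution can be sketched with ```str.translate```. The helper name ```strip_banes``` is made up for this example.

```python
# Same mapping as the banes dictionary, compiled into a translation table
BANES_TABLE = str.maketrans({'å': 'a', 'ä': 'a', 'ö': 'o',
                             'Å': 'A', 'Ä': 'A', 'Ö': 'O'})


def strip_banes(text):
    """Replace å, ä and ö (either case) with their ASCII look-alikes."""
    return text.translate(BANES_TABLE)


print(strip_banes('Västra Götalands län'))  # Vastra Gotalands lan
print(strip_banes('Skåne län'))             # Skane lan
```

In pandas this could be applied with ```afstat['Region'].map(strip_banes)```, which should give the same ```Reg``` column as above.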


Let's pass the new column to a list using pandas' ```.to_list()``` method.

In [130]:
reglist = afstat['Reg'].to_list()

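The last thing we need to do is iterate over all the ```features``` in the GeoJSON and replace ```LnNamn``` with the names we put in the ```reglist``` variable. Note that this replaces names by position, so it only works because the rows in the statistics file and the features in the GeoJSON are in the same order.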

In [131]:
lan = json.loads(gjson)
for index, item in enumerate(lan['features']):
    item['properties']['LnNamn'] = reglist[index]
    print(item['properties']['LnNamn'])

Stockholms lan
Uppsala lan
Sodermanlands lan
Ostergotlands lan
Jonkopings lan
Kronobergs lan
Kalmar lan
Gotlands lan
Blekinge lan
Skane lan
Hallands lan
Vastra Gotalands lan
Varmlands lan
Orebro lan
Vastmanlands lan
Dalarnas lan
Gavleborgs lan
Vasternorrlands lan
Jamtlands lan
Vasterbottens lan
Norrbottens lan
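
Assigning names by position breaks silently if the two sources are ever sorted differently. A more defensive sketch derives the new name from each feature's own ```LnNamn``` and then checks it against the names in the statistics. The function name and the two-county sample are made up for illustration.

```python
BANES_TABLE = str.maketrans('åäöÅÄÖ', 'aaoAAO')


def rename_by_own_name(features, known_names, suffix=' lan'):
    """Rewrite each LnNamn from its own value; raise if a name is unknown."""
    known = set(known_names)
    for feature in features:
        props = feature['properties']
        new_name = props['LnNamn'].translate(BANES_TABLE) + suffix
        if new_name not in known:
            raise ValueError(f'no matching row for {new_name!r}')
        props['LnNamn'] = new_name
    return features


# Features deliberately listed in a different order than the data rows
features = [
    {'properties': {'LnKod': '12', 'LnNamn': 'Skåne'}},
    {'properties': {'LnKod': '01', 'LnNamn': 'Stockholms'}},
]
reglist_sample = ['Stockholms lan', 'Skane lan']

for feature in rename_by_own_name(features, reglist_sample):
    print(feature['properties']['LnNamn'])
# Skane lan
# Stockholms lan
```

The ```ValueError``` makes a mismatch visible right away instead of leaving a county without data on the map.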


Finally, let's create a choropleth map of Sweden using folium. As we can see, the layoffs are concentrated in Stockholms län and Västra Götalands län (they are called regions now, by the way, but let's just ignore that). These counties are home to the two most populous cities in Sweden, namely Stockholm and Gothenburg. It turns out that our data isn't all that well suited for this kind of visualization with the default bins. Let's change the bins!

In [142]:
lat = 64
long = 16

l = folium.Map(location=[lat, long], zoom_start=5)

folium.Choropleth(
    geo_data=lan,
    name='choropleth',
    data=afstat,
    columns=['Reg', '2020-07'],
    key_on='feature.properties.LnNamn',
    fill_color='YlGn',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Antal varsel'
).add_to(l)

folium.LayerControl().add_to(l)

l

In [133]:
lat = 64
long = 16

bins = list(afstat['2020-07'].quantile([0, 0.25, 0.5, 0.75, 1]))

l = folium.Map(location=[lat, long], zoom_start=5)

folium.Choropleth(
    geo_data=lan,
    name='choropleth',
    data=afstat,
    columns=['Reg', '2020-07'],
    key_on='feature.properties.LnNamn',
    fill_color='YlGn',
    fill_opacity=0.7,
    line_opacity=0.2,
    bins=bins,
    legend_name='Antal varsel'
).add_to(l)

folium.LayerControl().add_to(l)

l
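
The ```bins``` list holds the minimum, the three quartiles and the maximum, so each color class covers roughly a quarter of the counties instead of being stretched by one large outlier. Here is a sketch of the same edges using only the standard library. It assumes that ```statistics.quantiles``` with ```method='inclusive'``` corresponds to the linear interpolation pandas uses by default. The sample holds only the ten July 2020 values visible in the table earlier, so the edges differ from those of the full data set.

```python
import statistics


def quartile_bins(values):
    """Return [min, Q1, median, Q3, max] as floats, using linear interpolation."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method='inclusive')
    return [float(min(values)), q1, q2, q3, float(max(values))]


# The ten 2020-07 values visible in the table above (not the full data set)
sample = [2144, 45, 5, 44, 50, 18, 125, 0, 17, 258]
print(quartile_bins(sample))
# [0.0, 17.25, 44.5, 106.25, 2144.0]
```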

## Saving the geojson so you don't have to

Here's how to save the GeoJSON to a file so you don't have to repeat the conversion:

In [134]:
with open('geodata/lan/Lan_RT90_region.json', 'w') as fout:
    json.dump(lan, fout)
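
In a later session the saved file can be read back with ```json.load```, which returns the same kind of dictionary we passed as ```geo_data``` above. The round trip is sketched below with a tiny stand-in dictionary and a temporary directory. The file name and sample values are made up for this example.

```python
import json
import os
import tempfile

# Stand-in for the lan dictionary; a real one holds all 21 counties
lan_sample = {'type': 'FeatureCollection', 'features': [
    {'type': 'Feature',
     'properties': {'LnKod': '01', 'LnNamn': 'Stockholms lan'},
     'geometry': {'type': 'Point', 'coordinates': [18.07, 59.33]}}]}

# A temporary directory keeps the example from touching real files
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'lan_sample.json')
    with open(path, 'w') as fout:
        json.dump(lan_sample, fout)
    with open(path) as fin:
        reloaded = json.load(fin)

print(reloaded == lan_sample)  # True
```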

## Let's rinse and repeat with the municipality shapefile

Let's do just that. I won't explain each step here, since they are the same as for the counties.

In [135]:
shape2 = 'geodata/kommun/Kommun_RT90_region.shx'

kommun = geo.GeoDataFrame.from_file(shape2)

gjson2 = kommun.to_crs(epsg=4326).to_json()

In [136]:
scbstat = pd.read_excel('data/BE0101N1.xlsx', header=2, skipfooter=67)
scbstat.rename(columns={'Unnamed: 0':'Kod', 'Unnamed: 1':'Kommun'}, inplace=True)

scbstat['Kom'] = scbstat['Kommun'].replace(banes, regex=True)
scbstat.head()

,Kod,Kommun,2019,Kom
0,114.0,Upplands Väsby,46786.0,Upplands Vasby
1,115.0,Vallentuna,34090.0,Vallentuna
2,117.0,Österåker,45574.0,Osteraker
3,120.0,Värmdö,45000.0,Varmdo
4,123.0,Järfälla,79990.0,Jarfalla


In [137]:
kommunlist = scbstat['Kom'].to_list()

In [138]:
kommun = json.loads(gjson2)

for index, item in enumerate(kommun['features']):
    item['properties']['KnNamn'] = kommunlist[index]
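
With this many municipalities, assigning names by position is fragile, and matching on the municipality code is safer. The sketch below assumes that the features carry a four-digit code in a property called ```KnKod```. That name is a guess by analogy with ```LnKod``` and is not confirmed by the output above. It also assumes that ```Kod``` was read as a number, as in ```114.0```. All function names are made up for this example.

```python
def kommun_code(value):
    """Format a code read as a number (114.0) as a four-digit string ('0114')."""
    return f'{int(value):04d}'


def rename_by_code(features, codes, names, code_key='KnKod', name_key='KnNamn'):
    """Set each feature's name from the row that shares its municipality code.

    code_key='KnKod' is an assumption about the shapefile's attribute name.
    """
    lookup = {kommun_code(code): name for code, name in zip(codes, names)}
    for feature in features:
        props = feature['properties']
        props[name_key] = lookup[props[code_key]]
    return features


# Hand-made features in a different order than the statistics rows
features = [
    {'properties': {'KnKod': '0117', 'KnNamn': 'Österåker'}},
    {'properties': {'KnKod': '0114', 'KnNamn': 'Upplands Väsby'}},
]
codes = [114.0, 115.0, 117.0]
names = ['Upplands Vasby', 'Vallentuna', 'Osteraker']

for feature in rename_by_code(features, codes, names):
    print(feature['properties'])
# {'KnKod': '0117', 'KnNamn': 'Osteraker'}
# {'KnKod': '0114', 'KnNamn': 'Upplands Vasby'}
```

On the real data the call would look like ```rename_by_code(kommun['features'], scbstat['Kod'], scbstat['Kom'])```. A missing code raises a ```KeyError``` rather than shifting every name after it.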

In [139]:
with open('geodata/kommun/Kommun_RT90_region.json', 'w') as fout:
    json.dump(kommun, fout)

In [140]:
lat = 64
long = 16


bins = list(scbstat['2019'].quantile([0, 0.25, 0.5, 0.75, 1]))

k = folium.Map(location=[lat, long], zoom_start=5, tiles='CartoDB positron')

folium.Choropleth(
    geo_data=kommun,
    name='choropleth',
    data=scbstat,
    columns=['Kom', '2019'],
    key_on='feature.properties.KnNamn',
    fill_color='YlGn',
    fill_opacity=0.7,
    line_opacity=0.2,
    bins=bins,
    legend_name='Antal invanare'
).add_to(k)

folium.LayerControl().add_to(k)

k

In [141]:
k.save('kommun.html')