# Geographic Data Merge

I want to visualize the data on an actual map, since fitting all 79 municipalities on a single chart (for example, wages by year) would be hard to read.

The geographic data comes from [geoBoundaries](https://www.geoboundaries.org): I searched for the BIH country code and downloaded the ADM1 and ADM3 boundary files.

In [1]:
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

from tools.config import GEO_DIR, SAVE_DIR, GIF_DIR

Read in the geographic files.

In [2]:
adm1 = gpd.read_file(GEO_DIR + "geoBoundaries-BIH-ADM1-all/geoBoundaries-BIH-ADM1.geojson")
adm3 = gpd.read_file(GEO_DIR + "geoBoundaries-BIH-ADM3-all/geoBoundaries-BIH-ADM3.geojson")

Both files need to share the same CRS (coordinate reference system) before they can be joined spatially.

In [3]:
print(f"ADM1 CRS: {adm1.crs}")
print(f"ADM3 CRS: {adm3.crs}")

ADM1 CRS: EPSG:4326
ADM3 CRS: EPSG:4326
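
Both layers already match here, so nothing needs to change. If they had differed, one layer would need reprojecting first. A minimal sketch of that step, assuming two toy layers (`layer_a`, `layer_b`) and a hypothetical helper `align_crs` that are not part of this notebook:

```python
import geopandas as gpd
from shapely.geometry import Point

# Toy layers standing in for adm1/adm3; the coordinates are illustrative only.
layer_a = gpd.GeoDataFrame(
    {"name": ["a"]}, geometry=[Point(18.4, 43.9)], crs="EPSG:4326"
)
layer_b = gpd.GeoDataFrame(
    {"name": ["b"]}, geometry=[Point(2048000, 5450000)], crs="EPSG:3857"
)


def align_crs(left, right):
    """Return `right` reprojected to the CRS of `left` when the two differ."""
    if left.crs is None or right.crs is None:
        raise ValueError("Both layers need a defined CRS before aligning them.")
    if left.crs != right.crs:
        right = right.to_crs(left.crs)
    return right


layer_b_aligned = align_crs(layer_a, layer_b)
print(f"Before: {layer_b.crs} | After: {layer_b_aligned.crs}")
```

`to_crs` returns a new GeoDataFrame, so the original layer is left untouched.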


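I had issues joining on the exact municipality polygons, so I will join on one representative point per municipality instead. Note that `representative_point()` returns a point guaranteed to lie inside the polygon, which is not necessarily its geometric center.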

In [4]:
adm3_points = adm3.copy()
adm3_points.geometry = adm3.geometry.representative_point()
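
To show why a representative point is a safer choice than a centroid for a `within` join, here is a minimal sketch using a made-up U-shaped polygon (not one of the real municipalities). For concave shapes the centroid can fall outside the polygon, while the representative point stays inside:

```python
from shapely.geometry import Polygon

# Hypothetical U-shaped polygon: a 3x3 square with a notch cut from the top.
u_shape = Polygon(
    [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]
)

centroid = u_shape.centroid
inside_point = u_shape.representative_point()

# The centroid lands in the notch, so it is not contained by the polygon.
print("centroid inside polygon:", u_shape.contains(centroid))
# The representative point is guaranteed to lie within the polygon.
print("representative point inside polygon:", u_shape.contains(inside_point))
```

The same reasoning applies to irregular municipality borders: a point that always lies inside its own polygon is a more dependable stand-in for that polygon in the join.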

Confirm the `shapeName` of each entity.

In [5]:
adm1['shapeName'].unique()

array(['Brčko District', 'Republika Srpska',
       'Federation of Bosnia and Herzegovina'], dtype=object)

In [6]:
fbih_entity = adm1[adm1['shapeName'] == 'Federation of Bosnia and Herzegovina']

In [7]:
joined_points = gpd.sjoin(adm3_points, fbih_entity, how='inner', predicate='within')

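# Take an explicit copy so that adding columns later does not trigger SettingWithCopyWarning
fbih_municipalities = adm3.loc[joined_points.index].copy()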
f"Found {len(fbih_municipalities)} municipalities."

'Found 80 municipalities.'

It looks like one extra municipality is included, likely Brčko, but it will be dropped when merging with the economic and political data later. The following cell was made raw to avoid committing an image to version control.

The plot looks great, so the next step is merging with the other data. Before doing that, I have to make sure the `shapeName` column uses the same format as my `Municipality` column.

In [9]:
eco_pol = pd.read_excel(SAVE_DIR + "combined_clean.xlsx")
eco_pol

Unnamed: 0,Municipality,Year,Gross Average Wage,ethnic_concentration_hhi,political_fragmentation_hhi,Percentage of Agricultural Businesses,Employees
0,banovici,2012,1220,0.953261,0.706868,0.019355,5056
1,banovici,2013,1248,0.953261,0.827413,0.022340,5214
2,banovici,2014,1261,0.953261,0.975511,0.023256,5167
3,banovici,2015,1270,0.953261,0.917824,0.025316,5230
4,banovici,2016,1276,0.953261,0.863865,0.025000,5169
...,...,...,...,...,...,...,...
864,zivinice,2018,1187,0.905964,0.581126,0.038872,10137
865,zivinice,2019,1185,0.905964,0.585950,0.036850,10911
866,zivinice,2020,1208,0.905964,0.590914,0.035323,10916
867,zivinice,2021,1241,0.905964,0.596017,0.030917,10929


I used an LLM again to build this name mapping, since some of the names need more than lowercasing and converting to ASCII.

Also, the extra municipality turned out to be `Istočni Stari Grad`, which belongs to RS and not FBiH!

In [10]:
name_map = {
    'Olovo': 'olovo',
    'Goražde': 'gorazde',
    'Mostar': 'grad mostar',
    'Živinice': 'zivinice',
    'Široki Brijeg': 'siroki brijeg',
    'Bihać': 'bihac',
    'Tomislavgrad': 'tomislavgrad',
    'Gornji Vakuf-Uskoplje': 'gornji vakuf uskoplje',
    'Ilijaš': 'ilijas',
    'Orašje': 'orasje',
    'Domaljevac-Šamac': 'domaljevac samac',
    'Odžak': 'odzak',
    'Doboj East': 'doboj istok',
    'Gradačac': 'gradacac',
    'Gračanica': 'gracanica',
    'Srebrenik': 'srebrenik',
    'Čelić': 'celic',
    'Teočak': 'teocak',
    'Sapna': 'sapna',
    'Tuzla': 'tuzla',
    'Lukavac': 'lukavac',
    'Banovići': 'banovici',
    'Kladanj': 'kladanj',
    'Kalesija': 'kalesija',
    'Istočni Stari Grad': None,
    'Stari Grad': 'stari grad sarajevo',
    'Trnovo (BiH)': 'trnovo',
    'Hadžići': 'hadzici',
    'Ilidža': 'ilidza',
    'Novo Sarajevo': 'novo sarajevo',
    'Centar': 'centar sarajevo',
    'Novi Grad': 'novi grad sarajevo',
    'Vogošća': 'vogosca',
    'Foča-Ustikolina': 'foca',
    'Pale-Prača': 'pale',
    'Ljubuški': 'ljubuski',
    'Posušje': 'posusje',
    'Grude': 'grude',
    'Ravno': 'ravno',
    'Neum': 'neum',
    'Prozor-Rama': 'prozor rama',
    'Konjic': 'konjic',
    'Jablanica': 'jablanica',
    'Čapljina': 'capljina',
    'Čitluk': 'citluk',
    'Stolac': 'stolac',
    'Velika Kladuša': 'velika kladusa',
    'Bužim': 'buzim',
    'Cazin': 'cazin',
    'Ključ': 'kljuc',
    'Sanski Most': 'sanski most',
    'Bosanska Krupa': 'bosanska krupa',
    'Bosanski Petrovac': 'bosanski petrovac',
    'Drvar': 'drvar',
    'Bosansko Grahovo': 'bosansko grahovo',
    'Glamoč': 'glamoc',
    'Kupres (BiH)': 'kupres',
    'Livno': 'livno',
    'Doboj Jug': 'doboj jug',
    'Usora': 'usora',
    'Tešanj': 'tesanj',
    'Maglaj': 'maglaj',
    'Breza': 'breza',
    'Visoko': 'visoko',
    'Žepče': 'zepce',
    'Zenica': 'zenica',
    'Zavidovići': 'zavidovici',
    'Kakanj': 'kakanj',
    'Vareš': 'vares',
    'Kreševo': 'kresevo',
    'Kiseljak': 'kiseljak',
    'Busovača': 'busovaca',
    'Vitez': 'vitez',
    'Fojnica': 'fojnica',
    'Dobretići': 'dobretici',
    'Jajce': 'jajce',
    'Donji Vakuf': 'donji vakuf',
    'Travnik': 'travnik',
    'Novi Travnik': 'novi travnik',
    'Bugojno': 'bugojno'
}
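
As a rough illustration of which entries in the mapping above follow the simple rule and which need a manual override, here is a sketch with a hypothetical helper `simple_normalize` (not part of this notebook). It lowercases, strips diacritics, and replaces hyphens with spaces, then flags the names where that is not enough. Only a small sample of the mapping is used:

```python
import unicodedata


def simple_normalize(name: str) -> str:
    """Lowercase, strip combining diacritics, and turn hyphens into spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.lower().replace("-", " ")


# A few entries copied from name_map.
sample = {
    "Živinice": "zivinice",
    "Gornji Vakuf-Uskoplje": "gornji vakuf uskoplje",
    "Mostar": "grad mostar",
    "Doboj East": "doboj istok",
}

# Entries where the simple rule does not reproduce the target name.
needs_manual = {
    shape: target
    for shape, target in sample.items()
    if simple_normalize(shape) != target
}
print(needs_manual)
```

Names such as `Mostar` and `Doboj East` fall outside the simple rule, which is why a hand-checked dictionary is used here instead of a pure transformation.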

Now apply the mapping and merge the DataFrames.

In [11]:
fbih_municipalities["Municipality"] = fbih_municipalities["shapeName"].map(name_map)

In [12]:
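# Inner join (the default): municipalities mapped to None, such as
# Istočni Stari Grad, have no match in eco_pol and are dropped here.
fbih_municipalities = fbih_municipalities.merge(
    eco_pol,
    on="Municipality"
)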
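
Because an inner merge drops unmatched rows silently, it can be worth checking what fails to match before relying on the result. A minimal sketch with toy frames (`shapes` and `eco` are stand-ins for the real data, not the notebook's variables), using the `indicator` option of `merge`:

```python
import pandas as pd

# Toy stand-ins for the geographic names and the economic panel.
shapes = pd.DataFrame({
    "shapeName": ["Olovo", "Mostar", "Istočni Stari Grad"],
    "Municipality": ["olovo", "grad mostar", None],
})
eco = pd.DataFrame({
    "Municipality": ["olovo", "olovo", "grad mostar"],
    "Year": [2012, 2013, 2012],
})

# A left merge with indicator=True labels rows that found no partner.
check = shapes.merge(eco, on="Municipality", how="left", indicator=True)
unmatched_shapes = check.loc[check["_merge"] == "left_only", "shapeName"].tolist()

# The reverse check: names in the panel that have no geometry.
missing_geometry = sorted(
    set(eco["Municipality"]) - set(shapes["Municipality"].dropna())
)

print("Shapes without economic data:", unmatched_shapes)
print("Economic rows without a shape:", missing_geometry)
```

An empty second list would confirm that every municipality in the panel received a polygon.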

In [13]:
# fbih_municipalities

Now try plotting the gross average wage per municipality over time.

In [24]:
plot_column = 'Gross Average Wage'

fig, ax = plt.subplots(1, 1, figsize=(12, 12))
ax.set_axis_off()

vmin = fbih_municipalities[plot_column].min()
vmax = fbih_municipalities[plot_column].max()

def update(year):
    fig.clear()

    ax = fig.add_subplot(111) # 111 means 1 row, 1 column, 1st subplot
    ax.set_axis_off()
    
    data_for_year = fbih_municipalities[fbih_municipalities['Year'] == year]
    
    data_for_year.plot(
        column=plot_column,
        ax=ax,
        vmin=vmin,
        vmax=vmax,
        cmap='YlGnBu',
        edgecolor='black',
        linewidth=1.0,
        legend=True,
        legend_kwds={
            'label': plot_column,
            'orientation': "vertical"
        }
    )
    
    ax.set_title(
        f"{plot_column} in FBiH Municipalities - Year: {year}",
        fontdict={'fontsize': 18, 'fontweight': 'bold'}
    )
    ax.set_axis_off()


years = sorted(fbih_municipalities['Year'].unique())

ani = FuncAnimation(
    fig,
    update,
    frames=years,
    repeat=True,
    interval=1000
)


print("Generating GIF... This might take a moment.")
ani.save(
    GIF_DIR + 'gross_average_wage_animation.gif',
    writer='pillow',
    fps=0.5,
)

plt.close(fig)

print("Animation saved as 'gross_average_wage_animation.gif'")

Generating GIF... This might take a moment.
Animation saved as 'gross_average_wage_animation.gif'


The GIF doesn't look like it's doing anything at first, but opening it in Jupyter Notebook shows that it works!

Embedding it in Markdown also displays it!

![Gross Average Wages for each Municipality over time.](../gifs/gross_average_wage_animation.gif)
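
One possible reason the GIF can look static at first (an assumption, not something verified here) is the frame rate. When `fps` is passed to `ani.save`, the saved frame rate comes from `fps` rather than from `interval`, so `fps=0.5` should hold each frame for about two seconds. A small sketch of the arithmetic, with hypothetical helper names and a made-up frame count:

```python
def frame_duration_ms(fps: float) -> float:
    """Milliseconds each frame stays on screen at the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def loop_length_s(n_frames: int, fps: float) -> float:
    """Seconds needed to play through all frames once."""
    return n_frames / fps


n_frames = 10  # hypothetical count; the real value is len(years)
print(frame_duration_ms(0.5))        # 2000.0
print(loop_length_s(n_frames, 0.5))  # 20.0
```

Raising `fps` to 1 or 2 would make the year-to-year change easier to notice in viewers that only show the first frame briefly.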

Now save the geo-merged data for PA4.

In [19]:
fbih_municipalities.to_file(
    SAVE_DIR + "geo_combined_clean.gpkg",
    driver="GPKG"
)

In [21]:
gdf = gpd.read_file(SAVE_DIR + "geo_combined_clean.gpkg")
gdf.head(5)

Unnamed: 0,shapeName,shapeISO,shapeID,shapeGroup,shapeType,Municipality,Year,Gross Average Wage,ethnic_concentration_hhi,political_fragmentation_hhi,Percentage of Agricultural Businesses,Employees,geometry
0,Olovo,,43093233B78325541023921,BIH,ADM3,olovo,2012,891,0.940177,0.681965,0.097436,1650,"POLYGON ((18.49439 44.10319, 18.49699 44.10277..."
1,Olovo,,43093233B78325541023921,BIH,ADM3,olovo,2013,881,0.940177,0.795133,0.101737,1666,"POLYGON ((18.49439 44.10319, 18.49699 44.10277..."
2,Olovo,,43093233B78325541023921,BIH,ADM3,olovo,2014,902,0.940177,0.935533,0.115764,1771,"POLYGON ((18.49439 44.10319, 18.49699 44.10277..."
3,Olovo,,43093233B78325541023921,BIH,ADM3,olovo,2015,887,0.940177,0.836525,0.128079,1916,"POLYGON ((18.49439 44.10319, 18.49699 44.10277..."
4,Olovo,,43093233B78325541023921,BIH,ADM3,olovo,2016,921,0.940177,0.750267,0.137529,2124,"POLYGON ((18.49439 44.10319, 18.49699 44.10277..."
