Using Statistics Denmark's API console, we obtain the data set "IFOR41: Inequality measure measured on equivalent disposable income by inequality measure and municipality". From this data set we extract the municipality code, the year, and the Gini coefficient, keeping observations from 1992 onwards.

Sources:

- IFOR41: Inequality measure measured on equivalent disposable income by inequality measure and municipality. Retrieved 16-08-2023. https://www.statistikbanken.dk/statbank5a/SelectVarVal/Define.asp?MainTable=IFOR41&PLanguage=0&PXSId=0&wsid=cftree

Script by Lars Kjær

In [1]:
# get data from api.statbank
import pandas as pd
url = "https://api.statbank.dk/v1/data/IFOR41/CSV?valuePresentation=CodeAndValue&KOMMUNEDK=*&ULLIG=*&Tid=*"
ifor41_raw = pd.read_csv(url, sep = ";")
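
The request URL in the cell above can also be assembled from its parts, which makes it easier to see which variables are selected. The following is only a sketch: `build_statbank_url` and `BASE` are hypothetical names introduced here, not part of the original script or of any Statistics Denmark client library. It assumes the URL layout `<base>/<table>/<format>?<selections>` seen in the hard-coded string.

```python
from urllib.parse import urlencode

# Base endpoint taken from the hard-coded URL in the script.
BASE = "https://api.statbank.dk/v1/data"


def build_statbank_url(table, fmt="CSV", **selections):
    """Hypothetical helper: build a data URL for one table.

    Each keyword argument is a variable name mapped to its selection,
    where "*" selects every value of that variable.
    """
    query = {"valuePresentation": "CodeAndValue", **selections}
    # safe="*" keeps the wildcard readable instead of encoding it as %2A.
    return f"{BASE}/{table}/{fmt}?{urlencode(query, safe='*')}"


url = build_statbank_url("IFOR41", KOMMUNEDK="*", ULLIG="*", Tid="*")
print(url)
```

With these arguments the helper reproduces the URL string used in the script, so it can be passed to `pd.read_csv(url, sep=";")` unchanged.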

In [2]:
input_df = ifor41_raw.copy()

############### wrangle the data ########################


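# Subset the data so that only rows representing the Gini coefficient remain.
# .copy() makes the subset an independent frame, so adding columns below
# does not trigger pandas' SettingWithCopyWarning.
input_df = input_df[input_df['ULLIG'] == '70 Gini-koefficient'].copy()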


# Get municipality code (the first whitespace-separated token of KOMMUNEDK)
def get_muni_code(value):
    return value.split()[0]
input_df['muni_code'] = input_df['KOMMUNEDK'].apply(get_muni_code)

# Get year (the first whitespace-separated token of TID)
def get_year(value):
    return value.split()[0]
input_df['year'] = input_df['TID'].apply(get_year)

# Get Gini coefficient values (swap the decimal comma for a decimal point)
def get_gini_index(value):
    return value.replace(',', '.')
input_df['gini_index'] = input_df['INDHOLD'].apply(get_gini_index)

# change data types
input_df['muni_code'] = input_df['muni_code'].astype(int)
input_df['year'] = input_df['year'].astype(int)
input_df['gini_index'] = input_df['gini_index'].astype(float)

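# Keep observations from 1992 onwards
input_df = input_df.query('year >= 1992')

# Store the three derived columns in the output variable.
# Selecting by name is more robust than relying on column position.
output_df = input_df[['muni_code', 'year', 'gini_index']]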

In [4]:
output_df.to_csv('gini_index.csv',index=False)
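
The wrangling steps above can also be wrapped in a single function and checked on a small in-memory frame, without calling the API. This is a minimal sketch rather than the script's own method: `extract_gini` is a hypothetical name, the sample rows are made up for illustration (including the non-Gini label and the numeric values), and the `'<code> <text>'` cell layout is assumed from the `split()[0]` logic in the script.

```python
import pandas as pd


def extract_gini(raw: pd.DataFrame, first_year: int = 1992) -> pd.DataFrame:
    """Hypothetical helper mirroring the notebook's wrangling steps."""
    # Keep only the Gini coefficient rows; copy to get an independent frame.
    gini = raw[raw["ULLIG"] == "70 Gini-koefficient"].copy()
    # The first whitespace-separated token holds the code or the year.
    gini["muni_code"] = gini["KOMMUNEDK"].str.split().str[0].astype(int)
    gini["year"] = gini["TID"].astype(str).str.split().str[0].astype(int)
    # Convert decimal commas to decimal points before casting to float.
    gini["gini_index"] = (
        gini["INDHOLD"].astype(str).str.replace(",", ".", regex=False).astype(float)
    )
    gini = gini[gini["year"] >= first_year]
    return gini[["muni_code", "year", "gini_index"]].reset_index(drop=True)


# Made-up sample rows in the assumed '<code> <text>' layout.
sample = pd.DataFrame(
    {
        "ULLIG": ["70 Gini-koefficient", "70 Gini-koefficient", "10 Made-up measure"],
        "KOMMUNEDK": ["101 København", "101 København", "101 København"],
        "TID": ["1990 1990", "2020 2020", "2020 2020"],
        "INDHOLD": ["25,10", "30,25", "1,50"],
    }
)

result = extract_gini(sample)
print(result)
```

Only the 2020 Gini row survives: the 1990 row falls before the year cut-off and the third row is not a Gini measure. The vectorised `.str` accessors replace the per-value `apply` calls but are intended to do the same thing.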