# A descriptive data analysis of the Danish Regions

You may need to install the DST API wrapper (`dstapi`) and `pandas-datareader` to run all the code in this project. Uncomment and run the following cells to install them.

In [None]:
# The DST API wrapper
    # %pip install git+https://github.com/alemartinello/dstapi

In [None]:
# A wrapper for multiple APIs with a pandas interface
    # %pip install pandas-datareader

Imports and set magics:

In [170]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from matplotlib_venn import venn2
from dstapi import DstApi
import pandas_datareader

# autoreload modules when code is run
%load_ext autoreload
%autoreload 2


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## 1. <a id='toc1_'></a>[Fetching and exploring data](#toc0_)

### 1.1. <a id='toc1_1_'></a>[Dataset 1: The account and budgets of regions](#toc0_)

We'll use [dstapi](https://github.com/alemartinello/dstapi) by Alessandro Martinello to fetch data from Danmarks Statistik. 

First, we create a `DstApi` **object** that will allow us to interact with the DST server.

In [171]:
ind = DstApi('REGR11') # object to interact with DST server

A quick overview of the available data in REGR11:

In [172]:
tabsum = ind.tablesummary(language='en')
display(tabsum)

Table REGR11: Regions accounts by main accounts by region, main account, dranst, kind, price unit and time
Last update: 2022-04-22T08:00:00


Unnamed: 0,variable name,# values,First value,First value label,Last value,Last value label,Time variable
0,OMRÅDE,6,000,All Denmark,081,Region Nordjylland,False
1,FUNK1,6,X,I alt hovedkonto 0-5,5,5 Interest etc.,False
2,DRANST,5,1,1 Current expenditure,7,7 Financing,False
3,ART,52,UE,Expenses exclusive calculating expenses,97,9.7 Internal revenues,False
4,PRISENHED,2,LOBM,"Current prices (DKK 1,000)",INDL,"Per capita, current prices (DKK)",False
5,Tid,15,2007,2007,2021,2021,True


To get an overview of the available values for each variable in the dataset, we make a loop:

In [173]:
# The available values for each variable:
for variable in tabsum['variable name']:
    print(variable+':')
    display(ind.variable_levels(variable, language='en'))

OMRÅDE:


Unnamed: 0,id,text
0,0,All Denmark
1,84,Region Hovedstaden
2,85,Region Sjælland
3,83,Region Syddanmark
4,82,Region Midtjylland
5,81,Region Nordjylland


FUNK1:


Unnamed: 0,id,text
0,X,I alt hovedkonto 0-5
1,1,1 Healthcare
2,2,2 Social and specialeducation
3,3,3 County development
4,4,4 Joint purpose and administration
5,5,5 Interest etc.


DRANST:


Unnamed: 0,id,text
0,1,1 Current expenditure
1,2,2 Reimbursement from central government
2,3,3 Capital expenditure
3,4,4 Interests
4,7,7 Financing


ART:


Unnamed: 0,id,text
0,UE,Expenses exclusive calculating expenses
1,UI,Expenses inclusive calculating expenses
2,TOT,Total
3,I,Incomes
4,S0,0 Calculating expenses
5,00,"0.0 Balance sheets, entries"
6,01,0.1 Depreciation
7,02,0.2 Changes in stocks
8,03,0.3 Pension provision for civil servants
9,04,0.4 Interest


PRISENHED:


Unnamed: 0,id,text
0,LOBM,"Current prices (DKK 1,000)"
1,INDL,"Per capita, current prices (DKK)"


Tid:


Unnamed: 0,id,text
0,2007,2007
1,2008,2008
2,2009,2009
3,2010,2010
4,2011,2011
5,2012,2012
6,2013,2013
7,2014,2014
8,2015,2015
9,2016,2016


In [174]:
# the _define_base_params -method gives us a nice template (selects all available data)
params = ind._define_base_params(language='en')
params

{'table': 'regr11',
 'format': 'BULK',
 'lang': 'en',
 'variables': [{'code': 'OMRÅDE', 'values': ['*']},
  {'code': 'FUNK1', 'values': ['*']},
  {'code': 'DRANST', 'values': ['*']},
  {'code': 'ART', 'values': ['*']},
  {'code': 'PRISENHED', 'values': ['*']},
  {'code': 'Tid', 'values': ['*']}]}

In [175]:
# manually selecting the data we want
params = {'table': 'regr11',
 'format': 'BULK',
 'lang': 'en',
 'variables': [{'code': 'OMRÅDE', 'values': ['*']},
  {'code': 'FUNK1', 'values': ['X']},
  {'code': 'DRANST', 'values': ['1']},
  {'code': 'ART', 'values': ['TOT']},
  {'code': 'PRISENHED', 'values': ['LOBM']},
  {'code': 'Tid', 'values': ['*']}]}
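
Writing the dictionary by hand is easy to get wrong when the selection changes. As a minimal sketch, the helper below starts from the "select everything" template and overrides only the variables we want to restrict. The name `build_params` and its arguments are illustrative and are not part of `dstapi`.

```python
def build_params(table, variables, selection, lang='en'):
    """Build a DST-style request dictionary (hypothetical helper).

    variables: list of variable codes in the table
    selection: dict mapping a variable code to the list of values to keep;
               variables not mentioned keep the wildcard '*'
    """
    unknown = set(selection) - set(variables)
    if unknown:
        raise KeyError(f'unknown variable(s): {sorted(unknown)}')
    return {
        'table': table,
        'format': 'BULK',
        'lang': lang,
        'variables': [
            {'code': code, 'values': list(selection.get(code, ['*']))}
            for code in variables
        ],
    }

# same selection as the hand-written dictionary
codes = ['OMRÅDE', 'FUNK1', 'DRANST', 'ART', 'PRISENHED', 'Tid']
params_sketch = build_params(
    'regr11', codes,
    {'FUNK1': ['X'], 'DRANST': ['1'], 'ART': ['TOT'], 'PRISENHED': ['LOBM']},
)
```

Keeping the wildcard as the default means that adding a restriction is a one-entry change, and a misspelled variable code fails immediately instead of silently selecting everything.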

Now we can load the data from DST via the API, using the selection specified in the `params` dictionary.

In [176]:
inc_api = ind.get_data(params=params)
inc_api.head(5)

Unnamed: 0,OMRÅDE,FUNK1,DRANST,ART,PRISENHED,TID,INDHOLD
0,All Denmark,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2017,114350951
1,Region Nordjylland,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2017,11718599
2,Region Syddanmark,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2009,20462606
3,Region Hovedstaden,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2009,30587411
4,Region Syddanmark,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2021,27219129


We can sort by OMRÅDE and TID to get a nicer structure in the data. 

In [177]:
inc_api.sort_values(by=['OMRÅDE', 'TID'], inplace=True)
inc_api.reset_index(inplace=True) #resetting index 
inc_api.head(5)

Unnamed: 0,index,OMRÅDE,FUNK1,DRANST,ART,PRISENHED,TID,INDHOLD
0,22,All Denmark,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2007,84398718
1,44,All Denmark,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2008,90810648
2,9,All Denmark,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2009,96968991
3,52,All Denmark,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2010,99429448
4,84,All Denmark,I alt hovedkonto 0-5,1 Current expenditure,Total,"Current prices (DKK 1,000)",2011,99328557


In [178]:
new_df = inc_api[['OMRÅDE', 'TID', 'INDHOLD']] # select the relevant columns from the DataFrame above and store them in a new DataFrame
new_df = new_df.rename(columns={'OMRÅDE': 'region', 'TID': 'year', 'INDHOLD': 'expenditure'}) # rename the columns
new_df['expenditure'] = new_df['expenditure'].div(10**6)  # INDHOLD is in DKK 1,000, so dividing by 10**6 gives billion DKK

new_df.head(5)

Unnamed: 0,region,year,expenditure
0,All Denmark,2007,84.398718
1,All Denmark,2008,90.810648
2,All Denmark,2009,96.968991
3,All Denmark,2010,99.429448
4,All Denmark,2011,99.328557
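
The scaling factor follows from the unit of `INDHOLD`: the table reports DKK 1,000 and one billion is 10^9, so dividing by 10^9 / 10^3 = 10^6 gives billion DKK. A small check using the 2007 value for All Denmark (the constant names are ours):

```python
# INDHOLD is reported in thousands of DKK (PRISENHED = 'LOBM')
THOUSAND = 10**3
BILLION = 10**9

value_in_thousands = 84_398_718              # All Denmark, 2007
value_in_dkk = value_in_thousands * THOUSAND
value_in_billions = value_in_dkk / BILLION   # same as dividing by 10**6

print(round(value_in_billions, 6))
```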


We can now make an interactive plot to inspect total operating expenditure in Denmark, region by region, from 2007 to 2021.

In [179]:
def _plot_timeseries(dataframe, variable, region, years):
    
    fig = plt.figure(dpi=100)
    ax = fig.add_subplot(1,1,1)
    
    dataframe.loc[:,['year']] = pd.to_numeric(dataframe['year'])
    I = (dataframe['year'] >= years[0]) & (dataframe['year'] <= years[1]) & (dataframe['region'] == region)
        
    x = dataframe.loc[I,'year']
    y = dataframe.loc[I,variable]

    ax.set_title('Expenditure of Danish Regions')
    ax.set_xlabel('Years', fontsize = 10)
    ax.set_ylabel('Billion kr., current prices', fontsize = 10)
    ax.plot(x,y)
    

In [180]:
def plot_timeseries(dataframe):

    widgets.interact(_plot_timeseries, 
    dataframe = widgets.fixed(dataframe),
    
    variable = widgets.Dropdown(
        description='variable', 
        options=['expenditure'], 
        value='expenditure'),
        
    region = widgets.Dropdown(description='region', 
                                        options=dataframe.region.unique(), 
                                        value='Region Nordjylland'),

    years=widgets.IntRangeSlider(
            description="years",
            min=2007,
            max=2021,
            value=[2007, 2021],
            continuous_update=False,
        )   
    ); 


In [181]:
plot_timeseries(new_df)

interactive(children=(Dropdown(description='variable', options=('expenditure',), value='expenditure'), Dropdow…
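
The plotting function converts `year` in place, which changes the DataFrame that is passed in. A possible alternative, sketched here with a hypothetical helper `select_series` and toy numbers, is to do the filtering in a separate function that returns a new DataFrame and leaves its input untouched.

```python
import pandas as pd

def select_series(dataframe, region, years):
    """Return the rows for one region within an inclusive (first, last) year range."""
    year = pd.to_numeric(dataframe['year'])          # does not modify the input
    keep = year.between(years[0], years[1]) & (dataframe['region'] == region)
    return dataframe.loc[keep].assign(year=year[keep])

# toy data, not the DST figures
toy = pd.DataFrame({
    'region': ['A', 'A', 'A', 'B'],
    'year': ['2007', '2008', '2009', '2008'],        # years stored as text
    'expenditure': [1.0, 2.0, 3.0, 4.0],
})
subset = select_series(toy, 'A', (2008, 2009))
```

The plot function could then call `select_series` and plot `subset['year']` against the chosen variable.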

### 1.2. <a id='toc1_1_'></a>[Dataset 2: Number of employees per region](#toc0_)

In [261]:
import requests 
import json
from pandas import json_normalize  # top-level import; the pandas.io.json path is deprecated

d = {
  "table": "Personale-måned",
  "time": [
    {
      "y1": "2023",
      "m1": "01"
    },
    {
      "y1": "2022",
      "m1": "01"
    },
    {
      "y1": "2021",
      "m1": "01"
    },
    {
      "y1": "2020",
      "m1": "01"
    },
    {
      "y1": "2019",
      "m1": "01"
    },
    {
      "y1": "2018",
      "m1": "01"
    },
    {
      "y1": "2017",
      "m1": "01"
    },
    {
      "y1": "2016",
      "m1": "01"
    },
    {
      "y1": "2015",
      "m1": "01"
    },
    {
      "y1": "2014",
      "m1": "01"
    },
    {
      "y1": "2013",
      "m1": "01"
    },
    {
      "y1": "2012",
      "m1": "01"
    },
    {
      "y1": "2011",
      "m1": "01"
    },
    {
      "y1": "2010",
      "m1": "01"
    },
    {
      "y1": "2009",
      "m1": "01"
    },
    {
      "y1": "2008",
      "m1": "01"
    },
    {
      "y1": "2007",
      "m1": "01"
    }
  ],
  "control": [
    "kom_reg"
  ],
  "data": [
    "fuldtid"
  ],
  "selection": [
    {
      "name": "Udvalgte population",
      "filters": {
        "omr": [
          "1",
          "8"
        ]
      }
    }
  ],
  "options": {
    "totals": True,
    "outputFormat": "json",
    "actions": [],
    "tableName": "Antal ansatte",
    "subLimit": 5,
    "modelName": "SIRKA",
    "timeIncreasing": True
  },
  "dimension": {
    "viewportHeight": 591,
    "viewportWidth": 638,
    "xsMaxWidth": 768,
    "smMaxWidth": 992,
    "mdMaxWidth": 1200,
    "CONSTANTS": {
      "XS": 0,
      "SM": 1,
      "MD": 2,
      "LG": 3,
      "MAIL": 4
    }
  }
}

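# send the request; the body is the JSON-encoded dictionary
r = requests.post("https://www.krl.dk/sirka/sirkaApi/tableApi", data=json.dumps(d))
r.raise_for_status()  # stop early if the server returns an error

# parse the JSON response and flatten it into a DataFrame
payload = json.loads(r.content)
df2 = json_normalize(payload)
display(df2)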


Unnamed: 0,_YM,_BM,kom_reg,fuldtid
0,200701,Udvalgte population,081,11143.754125
1,200701,Udvalgte population,082,24408.394261
2,200701,Udvalgte population,083,23871.169375
3,200701,Udvalgte population,084,35617.518759
4,200701,Udvalgte population,085,15271.505287
...,...,...,...,...
148,202301,Udvalgte population,085,17323.430604
149,202301,Udvalgte population,999,890.179819
150,202301,Udvalgte population,,128846.836405
151,202301,,,128846.836405
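
The `time` list in the request repeats the same pattern for every year, so it can also be generated. A minimal sketch, assuming the API only needs the `y1` and `m1` keys exactly as in the hand-written dictionary (the helper name `january_periods` is ours):

```python
def january_periods(first_year, last_year):
    """One {'y1': year, 'm1': '01'} entry per year, newest first."""
    return [
        {'y1': str(year), 'm1': '01'}
        for year in range(last_year, first_year - 1, -1)
    ]

time_entries = january_periods(2007, 2023)
print(len(time_entries), time_entries[0], time_entries[-1])
```

The result could be assigned to `d["time"]` before the request is sent.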


In [262]:
# rename columns 
df2 = df2.rename(columns={'_YM': 'year', 'kom_reg': 'region', 'fuldtid': 'fulltime_emp'}) 

# select the relevant columns (copy so later assignments do not touch a view)
df2 = df2[['year', 'region', 'fulltime_emp']].copy()

# name the regions by their kom_reg code, which can be found at KRL or DST
region_names = {
    '081': 'Region Nordjylland',
    '082': 'Region Midtjylland',
    '083': 'Region Syddanmark',
    '084': 'Region Hovedstaden',
    '085': 'Region Sjælland',
    '999': 'Øvrige',
}
df2['region'] = df2['region'].replace(region_names)

# drop rows with missing values: the API returns its total ("i alt") rows without a region code
df2 = df2.dropna()

# sort values on year and region
df2 = df2.sort_values(by=['year', 'region'])

df2.sample(10)


Unnamed: 0,year,region,fulltime_emp
36,201101,Region Nordjylland,12824.35372
66,201401,Region Hovedstaden,38913.814186
76,201501,Region Sjælland,15630.318222
75,201501,Region Hovedstaden,39506.4261
117,202001,Region Nordjylland,12557.495233
109,201901,Region Midtjylland,27242.838212
146,202301,Region Syddanmark,26560.169205
59,201301,Øvrige,502.242758
29,201001,Region Syddanmark,24203.558309
86,201601,Øvrige,248.521337


In [263]:
# build "All Denmark" as the sum over all regions (including 'Øvrige') in each year
df_alt = df2.groupby('year', as_index=False)['fulltime_emp'].sum()

# add the region label so the totals can be combined with df2
df_alt['region'] = 'All Denmark'
df_alt.tail()

Unnamed: 0,year,fulltime_emp,region
12,201901,120500.020156,All Denmark
13,202001,122103.375728,All Denmark
14,202101,128186.597949,All Denmark
15,202201,131915.486956,All Denmark
16,202301,128846.836405,All Denmark


In [264]:
# combine the regional data with the "All Denmark" totals and renumber the rows
df_combined = pd.concat([df2, df_alt], axis=0, ignore_index=True)

# keep only the year from the YYYYMM period and store it as an integer
df_combined['year'] = df_combined['year'].str[:4].astype(int)

# display final dataframe 
df_combined.head()


Unnamed: 0,year,region,fulltime_emp
0,2007,Region Hovedstaden,35617.518759
1,2007,Region Midtjylland,24408.394261
2,2007,Region Nordjylland,11143.754125
3,2007,Region Sjælland,15271.505287
4,2007,Region Syddanmark,23871.169375
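
KRL reports periods as `YYYYMM` strings such as `200701`. The sketch below, on toy values, shows two ways to parse such strings with pandas: slicing out the year and month, or converting to a proper date. The column names are illustrative.

```python
import pandas as pd

periods = pd.DataFrame({'period': ['200701', '201501', '202301']})

# first four characters are the year, last two the month
periods['year'] = periods['period'].str[:4].astype(int)
periods['month'] = periods['period'].str[4:].astype(int)

# alternative: parse the whole string as a date (first day of the month)
periods['date'] = pd.to_datetime(periods['period'], format='%Y%m')
```

The integer year is enough for annual data; the date column becomes useful if several months per year are fetched.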


Now let us plot the number of full-time employees region by region, just as we did with expenditure earlier.

In [265]:
def _plot_timeseries(dataframe, variable, region, years):
    
    fig = plt.figure(dpi=100)
    ax = fig.add_subplot(1,1,1)
    
    dataframe.loc[:,['year']] = pd.to_numeric(dataframe['year'])
    I = (dataframe['year'] >= years[0]) & (dataframe['year'] <= years[1]) & (dataframe['region'] == region)
        
    x = dataframe.loc[I,'year']
    y = dataframe.loc[I,variable]

    ax.set_title('Fulltime employees of Danish Regions')
    ax.set_xlabel('Years', fontsize = 10)
    ax.set_ylabel('Fulltime employees', fontsize = 10)
    ax.plot(x,y)

In [266]:
def plot_timeseries(dataframe):

    widgets.interact(_plot_timeseries, 
    dataframe = widgets.fixed(dataframe),
    
    variable = widgets.Dropdown(
        description='variable', 
        options=['fulltime_emp'], 
        value='fulltime_emp'),
        
    region = widgets.Dropdown(description='region', 
                                        options=dataframe.region.unique(), 
                                        value='Region Nordjylland'),

    years=widgets.IntRangeSlider(
            description="years",
            min=2007,
            max=2021,
            value=[2007, 2021],
            continuous_update=False,
        )   
    ); 

In [267]:
plot_timeseries(df_combined)

interactive(children=(Dropdown(description='variable', options=('fulltime_emp',), value='fulltime_emp'), Dropd…

## 2. <a id='toc1_'></a>[Merging the data sets](#toc0_)

Now we want to merge the data for expenditures and employment for the regions.

First, let us understand the differences between the datasets we have created.

**Find differences:**

In [309]:
#Find differences
diff_y = [y for y in new_df.year.unique() if y not in df_combined.year.unique()] 
print(f'years in new_df data, but not in df_combined data: {diff_y}')

diff_m = [m for m in new_df.region.unique() if m not in df_combined.region.unique()] 
print(f'regions in new_df data, but not in df_combined data: {diff_m}')

diff_r = [r for r in df_combined.region.unique() if r not in new_df.region.unique()] 
print(f'regions in df_combined data, but not in new_df data: {diff_r}')

years in new_df data, but not in df_combined data: []
regions in new_df data, but not in df_combined data: []
regions in df_combined data, but not in new_df data: ['Øvrige']
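
The same comparison can be written with Python sets, which makes the direction of each difference explicit. A small sketch using a subset of the region labels:

```python
left_regions = {'All Denmark', 'Region Nordjylland', 'Region Sjælland'}
right_regions = {'All Denmark', 'Region Nordjylland', 'Region Sjælland', 'Øvrige'}

only_left = left_regions - right_regions     # in the left data only
only_right = right_regions - left_regions    # in the right data only
in_both = left_regions & right_regions       # present in both

print(sorted(only_left), sorted(only_right))
```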


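Because both DataFrames were cleaned to use the same region names and year format, there are no major differences between the datasets. `df_combined` additionally contains the category `Øvrige` and the years 2022 and 2023.

We now perform a **left join** (one-to-one) on `region` and `year`. It keeps every observation in the left dataset (`new_df`) and adds the matching values from the right dataset (`df_combined`); observations that exist only in `df_combined` are dropped.

This gives us the final combined dataset we have been aiming for.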

In [310]:
#Merging the datasets 
region_exp_emp = pd.merge(new_df, df_combined, on=['region','year'], how='left')
region_exp_emp.head(10)

Unnamed: 0,region,year,expenditure,fulltime_emp
0,All Denmark,2007,84.398718,110800.519217
1,All Denmark,2008,90.810648,109977.924505
2,All Denmark,2009,96.968991,113612.885754
3,All Denmark,2010,99.429448,118856.231359
4,All Denmark,2011,99.328557,116444.296419
5,All Denmark,2012,103.60632,115318.482198
6,All Denmark,2013,104.810414,117186.444884
7,All Denmark,2014,107.159252,119226.61548
8,All Denmark,2015,109.495306,120088.195058
9,All Denmark,2016,112.300921,118814.060742
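
To check that a left join behaves as intended, pandas can validate the key structure and report where each row came from. The sketch below uses toy data, not the regional figures; `validate` and `indicator` are standard `pd.merge` parameters.

```python
import pandas as pd

left = pd.DataFrame({'region': ['A', 'A', 'B'], 'year': [2007, 2008, 2007],
                     'expenditure': [1.0, 1.5, 2.0]})
right = pd.DataFrame({'region': ['A', 'B', 'C'], 'year': [2007, 2007, 2007],
                      'fulltime_emp': [10.0, 20.0, 30.0]})

merged = pd.merge(left, right, on=['region', 'year'], how='left',
                  validate='one_to_one',   # raises MergeError on duplicate keys
                  indicator=True)          # adds a '_merge' column

print(merged)
```

Rows marked `left_only` in `_merge` had no match in the right data and get `NaN` in the right-hand columns; region `C`, which exists only on the right, does not appear at all.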


In [311]:
def _plot_timeseries(dataframe, variable1, variable2, region, years):
    
    fig = plt.figure(dpi=100)
    ax1 = fig.add_subplot(1,1,1)
    ax2 = ax1.twinx()
    
    dataframe.loc[:,['year']] = pd.to_numeric(dataframe['year'])
    I = (dataframe['year'] >= years[0]) & (dataframe['year'] <= years[1]) & (dataframe['region'] == region)
        
    x = dataframe.loc[I,'year']
    y1 = dataframe.loc[I,variable1]
    y2 = dataframe.loc[I,variable2]

    ax1.set_title('Fulltime employees and Expenditure of Danish Regions')
    ax1.set_xlabel('Years', fontsize = 10)
    ax1.set_ylabel('Fulltime employees', fontsize = 10, color = 'blue')
    ax1.plot(x,y1, color='blue')
    
    ax2.set_ylabel('Expenditure, billion kr.', fontsize = 10, color='red')
    ax2.plot(x,y2, color='red')

In [312]:
def plot_timeseries(dataframe):

    widgets.interact(_plot_timeseries, 
    dataframe = widgets.fixed(dataframe),
    
    variable1 = widgets.Dropdown(
        description='variable1', 
        options=['fulltime_emp'], 
        value='fulltime_emp'),
    
    variable2 = widgets.Dropdown(
        description='variable2', 
        options=['expenditure'], 
        value='expenditure'),

    region = widgets.Dropdown(description='region', 
                              options=dataframe.region.unique(), 
                              value='Region Nordjylland'),

    years=widgets.IntRangeSlider(
            description="years",
            min=2007,
            max=2021,
            value=[2007, 2021],
            continuous_update=False,
        )   
    ); 

In [313]:
plot_timeseries(region_exp_emp)

interactive(children=(Dropdown(description='variable1', options=('fulltime_emp',), value='fulltime_emp'), Drop…

# Analysis

To get a quick overview of the data, we show some **summary statistics** on a meaningful aggregation. 

First we pivot the dataset we have created.

In [317]:
# Pivot to wide format: one row per region, one column per (variable, year) pair
df_pivot = region_exp_emp.pivot(index='region', columns='year')

# Swap the two levels of the column index, so that the year becomes the top level and the variable the second level
df_pivot = df_pivot.swaplevel(axis=1)

# Sort the column index by the top level, which is now the years
df_pivot = df_pivot.sort_index(axis=1)

# Display the pivoted dataframe
display(df_pivot)


year,2007,2007,2008,2008,2009,2009,2010,2010,2011,2011,...,2017,2017,2018,2018,2019,2019,2020,2020,2021,2021
Unnamed: 0_level_1,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,...,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp
region,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2
All Denmark,84.398718,110800.519217,90.810648,109977.924505,96.968991,113612.885754,99.429448,118856.231359,99.328557,116444.296419,...,114.350951,118529.733422,116.519487,119193.132816,118.743393,120500.020156,125.056607,122103.375728,129.320174,128186.597949
Region Hovedstaden,26.698705,35617.518759,28.504013,34602.377841,30.587411,35741.521158,30.786874,37829.511637,30.746773,36252.762736,...,36.049598,39177.004969,36.631218,38649.541892,37.344899,39092.312202,39.072816,39412.585161,40.488706,41437.232412
Region Midtjylland,17.904448,24408.394261,19.244569,25427.344754,20.489229,26448.80972,21.310588,27843.600491,21.250113,27558.52501,...,24.347025,27159.791736,24.905225,27254.629531,25.411603,27242.838212,26.81048,27517.651209,27.812688,29134.004179
Region Nordjylland,8.741839,11143.754125,9.5578,11321.328307,10.238996,12113.164528,10.442487,12918.091286,10.361579,12824.35372,...,11.718599,12393.890065,11.95012,12405.144169,12.141622,12433.889366,12.821721,12557.495233,13.312703,13225.388547
Region Sjælland,13.229846,15271.505287,14.360099,14772.524183,15.190749,15027.910046,15.71321,15517.610603,15.416587,15179.296882,...,17.776432,15654.200622,18.211613,15789.60861,18.485286,15684.546298,19.532283,15998.882329,20.486948,16656.824661
Region Syddanmark,17.82388,23871.169375,19.144167,23394.697863,20.462606,23861.508604,21.176289,24203.558309,21.553505,24339.892253,...,24.459297,23896.940634,24.821311,24841.622235,25.359983,25285.149299,26.819307,25726.86799,27.219129,26877.504902
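
The effect of the three steps on the column index is easier to see on a small table. A toy sketch with two regions and two years (the values are made up):

```python
import pandas as pd

toy = pd.DataFrame({
    'region': ['A', 'A', 'B', 'B'],
    'year': [2007, 2008, 2007, 2008],
    'expenditure': [1.0, 2.0, 3.0, 4.0],
    'fulltime_emp': [10.0, 20.0, 30.0, 40.0],
})

wide = toy.pivot(index='region', columns='year')   # columns: (variable, year)
wide = wide.swaplevel(axis=1)                       # columns: (year, variable)
wide = wide.sort_index(axis=1)                      # group the columns by year

print(wide.columns.tolist())
```

After sorting, a single cell can be selected with a `(year, variable)` tuple, for example `wide.loc['B', (2008, 'fulltime_emp')]`.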


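The pivoted dataset has one row per region and one column per year and variable, so `describe()` summarises the distribution across regions for each year. Note that the `All Denmark` row is included (the count is 6), so the mean, standard deviation and maximum are dominated by the national total.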

In [318]:
df_pivot.describe()

year,2007,2007,2008,2008,2009,2009,2010,2010,2011,2011,...,2017,2017,2018,2018,2019,2019,2020,2020,2021,2021
Unnamed: 0_level_1,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,...,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp,expenditure,fulltime_emp
count,6.0,6.0,6.0,6.0,6.0,6.0,6.0,6.0,6.0,6.0,...,6.0,6.0,6.0,6.0,6.0,6.0,6.0,6.0,6.0,6.0
mean,28.132906,36852.143504,30.270216,36582.699575,32.322997,37800.966635,33.143149,39528.100614,33.109519,38766.52117,...,38.116984,39468.593574,38.839829,39688.946542,39.581131,40039.792589,41.685536,40552.809608,43.106725,42586.258775
std,28.20252,37201.839696,30.317497,36886.221482,32.385308,38087.645337,33.171565,39879.325217,33.15189,38995.527309,...,38.216878,39862.003757,38.931424,40031.57629,39.680329,40525.230151,41.770134,41056.429015,43.186281,43106.525456
min,8.741839,11143.754125,9.5578,11321.328307,10.238996,12113.164528,10.442487,12918.091286,10.361579,12824.35372,...,11.718599,12393.890065,11.95012,12405.144169,12.141622,12433.889366,12.821721,12557.495233,13.312703,13225.388547
25%,14.378355,17421.421309,15.556116,16928.067603,16.508713,17236.309685,17.07898,17689.097529,16.874969,17469.445725,...,19.41908,17714.885625,19.864038,18052.612016,20.20396,18084.697048,21.351832,18430.878744,22.169993,19211.994721
50%,17.864164,24139.781818,19.194368,24411.021308,20.475918,25155.159162,21.243438,26023.5794,21.401809,25949.208632,...,24.403161,25528.366185,24.863268,26048.125883,25.385793,26263.993756,26.814893,26622.2596,27.515909,28005.754541
75%,24.500141,32815.237634,26.189152,32308.619569,28.062866,33418.343299,28.417803,35333.03385,28.448456,34079.203304,...,33.152023,36172.701661,33.69972,35800.813802,34.361575,36129.943705,36.009439,36438.851673,37.319702,38361.425354
max,84.398718,110800.519217,90.810648,109977.924505,96.968991,113612.885754,99.429448,118856.231359,99.328557,116444.296419,...,114.350951,118529.733422,116.519487,119193.132816,118.743393,120500.020156,125.056607,122103.375728,129.320174,128186.597949
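
The table passed to `describe()` contains an `All Denmark` row alongside the five regions, so the national total enters the mean and the maximum. If statistics across regions only are wanted, one option is to drop that row first. A toy sketch with made-up numbers:

```python
import pandas as pd

toy = pd.DataFrame(
    {'expenditure': [60.0, 10.0, 20.0, 30.0]},
    index=['All Denmark', 'Region A', 'Region B', 'Region C'],
)

with_total = toy.describe()                              # total row included
regions_only = toy.drop(index='All Denmark').describe()  # regions only

print(with_total.loc['mean', 'expenditure'], regions_only.loc['mean', 'expenditure'])
```

The same pattern, `df_pivot.drop(index='All Denmark').describe()`, should apply to the pivoted table, assuming the index label is exactly `All Denmark`.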


# Conclusion

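We fetched the current expenditure of the Danish regions from Statistics Denmark (table REGR11) and their full-time employment from KRL, cleaned both datasets, and merged them on region and year. For Denmark as a whole, current expenditure rose from about 84 to 129 billion DKK in current prices between 2007 and 2021 (roughly 53 percent), while full-time employment rose from about 110,800 to 128,200 (roughly 16 percent). Region Hovedstaden is the largest region on both measures and Region Nordjylland the smallest. Since expenditure is measured in current prices, part of the difference between the two growth rates is likely due to price and wage increases rather than real growth.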