# Inflation, Unemployment and Real GDP

The goal of this project is to analyze how inflation, unemployment and real GDP have developed in Denmark. We import consumer price data (table PRIS111) and unemployment data (table AULP01) from Statistics Denmark and merge them with a real GDP series read from a CSV file. We then compare the three series from 2007 onwards, compute summary statistics and calculate the annual growth rate of real GDP.

Import and load the relevant libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from matplotlib_venn import venn2
import plotly.graph_objects as go
import datetime
import pandas_datareader # install with `pip install pandas-datareader`
from dstapi import DstApi # install with `pip install git+https://github.com/alemartinello/dstapi`

# autoreload modules when code is run
%load_ext autoreload
%autoreload 2

# user written modules
import dataproject as dp

plt.rcParams.update({"axes.grid":True,"grid.color":"black","grid.alpha":"0.25","grid.linestyle":"--"})
plt.rcParams.update({'font.size': 14})

## Importing inflation data from Statistics Denmark

We use the following dictionaries to rename the columns and values of table PRIS111

In [None]:
columns_dict = {}
columns_dict['VAREGR'] = 'variable'
columns_dict['ENHED'] = 'unit'
columns_dict['TID'] = 'Year'
columns_dict['INDHOLD'] = 'Inflation rate'

var_dict = {} # var is for variable
var_dict['00 Consumer price index, total'] = 'Y'


unit_dict = {}
unit_dict['Index'] = 'Indexnumber'
unit_dict['Percentage change compared to previous month (per cent)'] = 'pct month'
unit_dict['Percentage change compared to same month the year before (per cent)'] = 'pct year'


In [None]:
# Importing inflation data via Api.  
PRIS111_api = DstApi('PRIS111') 
params = PRIS111_api._define_base_params(language='en')

PRIS111 = PRIS111_api.get_data(params)


In [None]:
# Rename the columns of the DataFrame PRIS111.
PRIS111.rename(columns=columns_dict,inplace=True)

# Map the raw variable labels to short names.
# Assigning the result back avoids an in-place replace on a column accessor,
# which may silently do nothing in recent pandas versions.
PRIS111['variable'] = PRIS111['variable'].replace(var_dict)

# Map the raw unit labels to short names.
PRIS111['unit'] = PRIS111['unit'].replace(unit_dict)
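To illustrate how the dictionaries are applied, here is a minimal sketch on a toy DataFrame. The second variable label and all values are made up for illustration and are not taken from PRIS111.

```python
import pandas as pd

# Toy frame standing in for the raw download (labels and values are hypothetical)
toy = pd.DataFrame({
    "VAREGR": ["00 Consumer price index, total", "01 Some other group"],
    "ENHED": ["Index", "Index"],
})

# Rename the raw column codes to readable names
toy = toy.rename(columns={"VAREGR": "variable", "ENHED": "unit"})

# A dict passed to Series.replace maps each key to its value in one pass;
# labels that are not keys in the dict are left unchanged.
toy["variable"] = toy["variable"].replace({"00 Consumer price index, total": "Y"})
toy["unit"] = toy["unit"].replace({"Index": "Indexnumber"})

print(toy)
```

Because labels missing from the dictionary are left untouched, the later filter on `var_dict.values()` is what actually drops the other price groups.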



In [None]:
# Only keep rows where the variable is in `[Y]`.

# Ensure 'Year' is a string before applying string methods
PRIS111['Year'] = PRIS111['Year'].astype(str)

# Filter based on 'variable' values
var_vals = list(var_dict.values())
I_var = PRIS111['variable'].isin(var_vals)
PRIS111 = PRIS111[I_var].copy()  # copy so later column assignments do not hit a view

# Grouping variables
# PRIS111.groupby(['variable','unit']).describe()

## Exploring data set

In [None]:
# Remove rows where 'unit' is either "Indexnumber" or "pct month"
units_to_exclude = ['Indexnumber', 'pct month']
I_unit = ~PRIS111['unit'].isin(units_to_exclude)
PRIS111 = PRIS111[I_unit]

# Keep only rows where 'Year' ends with "M12" (December), then remove the "M12" part
I_year_suffix = PRIS111['Year'].str.endswith("M12")
PRIS111 = PRIS111[I_year_suffix].copy()  # copy to avoid SettingWithCopyWarning below
PRIS111['Year'] = PRIS111['Year'].str.replace('M12', '', regex=False)

# Convert 'Year' to integer for proper comparison and sorting
PRIS111['Year'] = PRIS111['Year'].astype(int)

# Exclude rows before "2007" and the year "2023"
PRIS111 = PRIS111[(PRIS111['Year'] >= 2007) & (PRIS111['Year'] != 2023)]

# Sort the DataFrame by 'year' in ascending order
PRIS111 = PRIS111.sort_values(by='Year')

# PRIS111
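The December filter can be sketched on a toy example. The numbers below are illustrative only and are not DST data.

```python
import pandas as pd

# Monthly periods in the "YYYYMmm" format used by the table
toy = pd.DataFrame({
    "Year": ["2021M11", "2021M12", "2022M06", "2022M12"],
    "Inflation rate": [3.4, 3.1, 8.2, 8.7],  # made-up values
})

# Keep the December rows and take a copy before modifying columns
december = toy[toy["Year"].str.endswith("M12")].copy()

# Strip the month suffix and convert the remaining year to an integer
december["Year"] = december["Year"].str.replace("M12", "", regex=False).astype(int)

print(december)
```

Note that this uses the December year-on-year change as the annual inflation rate. An annual average of the monthly figures would be an alternative definition and would generally give different numbers.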



## Merging inflation and unemployment data from Statistics Denmark

We merge the inflation data with table AULP01 (unemployment)

In [None]:

FT_api = DstApi('AULP01')
params = FT_api._define_base_params(language='en')
params['variables'][0]['values'] = ['000']
## 000 is the code for all of Denmark; this can be seen by using: FT_api.variable_levels('HOVEDDELE', language='en')

params['variables'][1]['values'] = ['TOT']
params['variables'][2]['values'] = ['TOT']
# Years 2007-2022, matching the years kept in the inflation data
params['variables'][3]['values'] = [str(year) for year in range(2007, 2023)]

# Download the data only after the query parameters are fully defined
unemp = FT_api.get_data(params=params)
unemp.rename(columns={'TID':'Year','INDHOLD':'Unemployment'},inplace=True)
unemp = unemp.loc[:,['Year','Unemployment']]

In [None]:
# Make sure both merge keys have the same type (string)
PRIS111['Year'] = PRIS111['Year'].astype(str)
unemp['Year'] = unemp['Year'].astype(str)

# We merge our two dataframes inflation (PRIS111) and Unemployment (unemp)
merged = pd.merge(PRIS111, unemp, how='left', on='Year')

Combined_data = merged[['Year', 'Inflation rate', 'Unemployment']]
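With a left merge, years that are missing from the unemployment data silently get `NaN`. A minimal sketch of how this could be checked, using the `indicator` option of `pd.merge` on toy data (all values are made up):

```python
import pandas as pd

# Toy inputs: the unemployment frame lacks the year 2007
inflation = pd.DataFrame({"Year": ["2007", "2008", "2009"],
                          "Inflation rate": [2.0, 3.0, 1.0]})
unemployment = pd.DataFrame({"Year": ["2008", "2009"],
                             "Unemployment": [2.5, 4.5]})

# indicator=True adds a '_merge' column telling where each row came from
check = pd.merge(inflation, unemployment, how="left", on="Year", indicator=True)

# Rows marked 'left_only' had no match in the right-hand DataFrame
unmatched = check.loc[check["_merge"] == "left_only", "Year"].tolist()
print(unmatched)
```

An empty list would mean that every inflation year found a matching unemployment observation.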

In [None]:
Combined_data.head(5)

## Real GDP for Denmark

In [None]:
# Importing data from CSV file 'Real GDP'
GDP = pd.read_csv('GDP new1.csv',delimiter=';')

## Display the data set

In [None]:
#Here we display the data to see the values for Real GDP
GDP.head(10)

In [None]:
# We display our combined data for the inflation and unemployment rate. 
Combined_data.head(2)

We combine all three data sets: real GDP, the inflation rate and the unemployment rate

In [None]:
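# We merge the real GDP DataFrame with the combined inflation and unemployment data.
# 'Year' is a string in Combined_data, so the GDP key is cast to string for the merge.
merging = pd.merge(GDP.astype({'Year': str}), Combined_data, how='left', on=['Year'])

# Take a copy so that later column assignments do not operate on a view
Combined_data1 = merging[["Year", "Inflation rate", "Unemployment", "Real GDP"]].copy()
Combined_data1.head(10)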

In [None]:
#Printing to see what type it is, string or float?
type_ = GDP['Year'].dtype
type1_ = GDP['Real GDP'].dtype
print(type_,type1_)

In [None]:
# We convert 'Year' column to integer if it's stored as string
GDP['Year'] = pd.to_numeric(GDP['Year'], errors='coerce')
# Here we do the same converting column Real GDP to a numeric type
GDP['Real GDP'] = pd.to_numeric(GDP['Real GDP'], errors='coerce')
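Since `errors='coerce'` turns every value that cannot be parsed into `NaN` without raising an error, it can be useful to count how many values were lost. The sketch below uses made-up strings. Whether the CSV file actually contains decimal commas or `..` placeholders is an assumption.

```python
import pandas as pd

# Made-up raw values: a decimal comma and a '..' placeholder cannot be parsed
raw = pd.Series(["1850.5", "1900,2", "..", "2000"])

# Unparseable entries become NaN instead of raising an error
converted = pd.to_numeric(raw, errors="coerce")

# Number of values that turned into NaN during the conversion
n_lost = int(converted.isna().sum() - raw.isna().sum())
print(converted)
print("Values lost in conversion:", n_lost)
```

If decimal commas turn out to be the cause, `pd.read_csv` accepts a `decimal=','` argument that parses them directly.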

In [None]:
# Convert all four columns ('Year', 'Inflation rate', 'Unemployment', 'Real GDP') to numeric types.
# Values that cannot be parsed become NaN.
Combined_data1 = Combined_data1.apply(pd.to_numeric, errors='coerce')


In [None]:
# We sort our DataFrame by the 'Year' column
GDP.sort_values(by='Year', inplace=True)

# Plotting the graph for Real GDP
plt.figure(figsize=(10, 4))
plt.plot(GDP['Year'], GDP['Real GDP'], marker='o', color='blue')
plt.ylabel('Real GDP (DKK billion)')
plt.title('Real GDP Over Time For Denmark')
plt.xlabel('Year')
plt.grid(True)  # Add gridlines
plt.tight_layout()
plt.show()

Describe what we see on the graph. 

## Analyzing the data

In [None]:
# Calculate average GDP
average_GDP = Combined_data1['Real GDP'].mean()

# Calculate maximum and minimum GDP
max_GDP = Combined_data1['Real GDP'].max()
min_GDP = Combined_data1['Real GDP'].min()

# Calculate average Unemployment Rate
average_unemp = Combined_data1['Unemployment'].mean()

# Calculate average Inflation Rate
average_inf = Combined_data1['Inflation rate'].mean()


# Print the results
print("Average GDP:", average_GDP)
print("Maximum GDP:", max_GDP)
print("Minimum GDP:", min_GDP)
print("Average Unemployment Rate:", average_unemp)
print("Average Inflation Rate:", average_inf)
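The same kind of summary can also be computed in one call with `DataFrame.agg`. The sketch below uses a toy DataFrame with made-up values and the same column names as `Combined_data1`.

```python
import pandas as pd

# Toy data with the same column names as Combined_data1 (values are made up)
toy = pd.DataFrame({
    "Real GDP": [100.0, 110.0, 105.0],
    "Unemployment": [4.0, 6.0, 5.0],
    "Inflation rate": [2.0, 1.0, 3.0],
})

# One row per statistic, one column per variable
summary = toy.agg(["mean", "min", "max"]).round(2)
print(summary)
```

This gives the minimum and maximum for unemployment and inflation as well, which makes the three series easier to compare.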

Describe what we see above regarding the results.

In [None]:
# Calculate the annual growth rate of GDP
GDP['GDP_growth'] = GDP['Real GDP'].pct_change() * 100  # Calculate percentage change in GDP

# Create a new DataFrame with Year and GDP Growth Rate
GDP_growth_ = pd.DataFrame({'Year': GDP['Year'], 'GDP_growth': GDP['GDP_growth']})

# Print the DataFrame with Year and GDP Growth Rate
print(GDP_growth_)
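The growth rate computed by `pct_change()` is the percentage change from the previous row, $g_t = (Y_t / Y_{t-1} - 1) \cdot 100$, so the first year has no growth rate and the data must be sorted by year. A small sketch with made-up GDP values shows that it matches the manual calculation:

```python
import pandas as pd

# Made-up GDP levels, deliberately unsorted
gdp = pd.DataFrame({"Year": [2021, 2020, 2022],
                    "Real GDP": [210.0, 200.0, 205.8]})

# pct_change compares each row with the previous one, so sort by year first
gdp = gdp.sort_values("Year").reset_index(drop=True)
gdp["GDP_growth"] = gdp["Real GDP"].pct_change() * 100

# Manual version of the same formula using shift
manual = (gdp["Real GDP"] / gdp["Real GDP"].shift(1) - 1) * 100

print(gdp)
```

Here growth is roughly 5 percent in 2021 and roughly -2 percent in 2022, while 2020 is `NaN` because there is no earlier observation.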