# Recipients of subsidy to private day-care and day-care of own children by region

> **Note the following:** 
> 1. This is *not* meant to be an example of an actual **data analysis project**, just an example of how to structure such a project.
> 1. Remember the general advice on structuring and commenting your code.
> 1. The `dataproject.py` file includes a function which can be used multiple times in this notebook.

Imports and set magics:

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from matplotlib_venn import venn2

# autoreload modules when code is run
%load_ext autoreload
%autoreload 2

# user written modules
import dataproject

import datetime

import pandas_datareader # install with `pip install pandas-datareader`
import pydst # install with `pip install git+https://github.com/elben10/pydst`

# plot styling (matplotlib.pyplot is already imported as plt above)
plt.rcParams.update({"axes.grid":True,"grid.color":"black","grid.alpha":0.25,"grid.linestyle":"--"})
plt.rcParams.update({'font.size': 14})


# Read and clean data

Set up the data loader and identify the data table we wish to work with.

In [24]:
# Set up the data loader with the language set to English
Dst = pydst.Dst(lang='en') 

# Get a list of all subjects 
Dst.get_subjects() 

# Get all tables in subject '1' (people)
tables = Dst.get_tables(subjects=['1']) 

# Display only the tables in '1' whose text contains 'Recipients',
# because we know that the title of the table of interest contains this word
display(tables[tables['text'].str.contains('Recipients')])

| | id | text | unit | updated | firstPeriod | latestPeriod | active | variables |
|---|---|---|---|---|---|---|---|---|
| 137 | DAGTIL4 | Recipients of subsidy to private day-care and ... | Number | 2022-03-30 08:00:00 | 2008 | 2021 | True | [region, grant type, affected, time] |
| 217 | HJEMSYG | Recipients of home nursing | Number | 2022-06-10 08:00:00 | 2016 | 2021 | True | [region, age, sex, time] |


Running `Dst.get_subjects()` shows that the table we need belongs to the subject 'people', which has the id 1.

We then fetch a list of all tables within this subject.

Finally, we display only the tables that have the word 'Recipients' in their text, because we know that the title of the table we want, DAGTIL4, contains it.
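
The keyword filter can be tried out without calling the API. The sketch below uses a small hand-made DataFrame as a stand-in for the result of `Dst.get_tables()`; the third row (`XYZ1`) is a made-up placeholder, and only the `id` and `text` columns are reproduced.

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by Dst.get_tables().
# The row 'XYZ1' is a hypothetical placeholder, not a real table.
tables_demo = pd.DataFrame({
    'id': ['DAGTIL4', 'HJEMSYG', 'XYZ1'],
    'text': [
        'Recipients of subsidy to private day-care and day-care of own children',
        'Recipients of home nursing',
        'Some unrelated table',
    ],
})

# Boolean mask: True where the title contains the keyword.
# case=False makes the match case-insensitive,
# na=False treats missing titles as non-matches.
mask = tables_demo['text'].str.contains('recipients', case=False, na=False)
matches = tables_demo.loc[mask, ['id', 'text']]
print(matches)
```

Matching case-insensitively is a small design choice that avoids missing a table only because its title is capitalized differently.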


**$\large \color{lightblue}{Import}$**


In [25]:
# Inspect the variables available in DAGTIL4
Rec_vars = Dst.get_variables(table_id='DAGTIL4')
display(Rec_vars)

# Download the full table as a DataFrame ('*' selects all values of a variable)
variables_rec = {'OMRÅDE':['*'],'TILSKUDSART':['*'],'BERORT':['*'], 'TID':['*']}
rec_gen = Dst.get_data(table_id='DAGTIL4', variables=variables_rec)
rec_gen.sort_values(by=['TID', 'OMRÅDE'], inplace=True)

# Rename the columns
rec_gen = rec_gen.rename(columns={'OMRÅDE':'Municipality','TILSKUDSART':'Subsidy_type','TID':'Year', 'BERORT':'Affected'})


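# Inspect the variables available in BY2 (population by municipality, age and gender)
By_vars = Dst.get_variables(table_id='BY2')
display(By_vars)

# Download the full table as a DataFrame; 'TID' is spelled as in the sort and rename below
variables_by = {'KOMK':['*'],'ALDER':['*'], 'KØN':['*'], 'TID':['*']}
by_gen = Dst.get_data(table_id='BY2', variables=variables_by)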
by_gen.sort_values(by=['TID', 'KOMK'], inplace=True)

# Rename the columns
by_gen = by_gen.rename(columns={'KOMK':'Municipality','ALDER':'Age','KØN':'Gender', 'TID':'Year'})
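
The `INDHOLD` column can contain the marker `..` instead of a number (see the merge output further down), so it has to be converted before it can be summed or plotted. A minimal sketch of that cleaning step, assuming a toy frame that mimics a few rows of `rec_gen`; the column name `Recipients` is introduced here for illustration only:

```python
import pandas as pd

# Toy stand-in for rec_gen; the values mimic rows from the merge output in this notebook.
rec_demo = pd.DataFrame({
    'Municipality': ['Ballerup'] * 4,
    'Subsidy_type': ['Subsidy to parents who choose private day-care'] * 2
                    + ['Subsidy for day-care of own children'] * 2,
    'Affected': ['Children', 'Families', 'Children', 'Families'],
    'Year': [2010] * 4,
    'INDHOLD': ['12', '..', '14', '13'],
})

# Convert counts to numbers; anything that cannot be parsed (such as '..') becomes NaN.
rec_demo['Recipients'] = pd.to_numeric(rec_demo['INDHOLD'], errors='coerce')

# Count how many observations are missing after the conversion.
n_missing = rec_demo['Recipients'].isna().sum()
print(rec_demo[['Subsidy_type', 'Affected', 'Recipients']])
print(f'missing values: {n_missing}')
```

Whether missing cells should be dropped or kept as NaN depends on the later analysis; the sketch only makes them explicit.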


**$\large \color{lightblue}{Merge}$** 

In [20]:
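# Merge the two DataFrames on Municipality and Year; the inner join keeps only keys present in both.
# Neither DataFrame is unique on these keys, so every population row is paired with
# every subsidy row for the same municipality and year.
one2one = pd.merge(by_gen,rec_gen,on=['Municipality', 'Year'],how='inner')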
one2one.head(10)

| | Municipality | Age | Gender | Year | BYST | INDHOLD_x | Subsidy_type | Affected | INDHOLD_y |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Ballerup | 91 years | Women | 2010 | Greater Copenhagen Region | 27 | Subsidy to parents who choose private day-care | Children | 12 |
| 1 | Ballerup | 91 years | Women | 2010 | Greater Copenhagen Region | 27 | Subsidy to parents who choose private day-care | Families | .. |
| 2 | Ballerup | 91 years | Women | 2010 | Greater Copenhagen Region | 27 | Subsidy for day-care of own children | Children | 14 |
| 3 | Ballerup | 91 years | Women | 2010 | Greater Copenhagen Region | 27 | Subsidy for day-care of own children | Families | 13 |
| 4 | Ballerup | 7 years | Men | 2010 | Greater Copenhagen Region | 235 | Subsidy to parents who choose private day-care | Children | 12 |
| 5 | Ballerup | 7 years | Men | 2010 | Greater Copenhagen Region | 235 | Subsidy to parents who choose private day-care | Families | .. |
| 6 | Ballerup | 7 years | Men | 2010 | Greater Copenhagen Region | 235 | Subsidy for day-care of own children | Children | 14 |
| 7 | Ballerup | 7 years | Men | 2010 | Greater Copenhagen Region | 235 | Subsidy for day-care of own children | Families | 13 |
| 8 | Ballerup | 86 years | Women | 2010 | Greater Copenhagen Region | 58 | Subsidy to parents who choose private day-care | Children | 12 |
| 9 | Ballerup | 86 years | Women | 2010 | Greater Copenhagen Region | 58 | Subsidy to parents who choose private day-care | Families | .. |
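
In the output above, each population row appears four times, once per combination of `Subsidy_type` and `Affected`, because neither DataFrame is unique on `Municipality` and `Year`. One possible way to get a single row per municipality and year is to aggregate or filter both sides first and let pandas verify the key structure with `validate`. The sketch below uses toy data with illustrative numbers; the column names `Population` and `Recipients` are assumptions, not names from the tables.

```python
import pandas as pd

# Toy stand-ins for by_gen and rec_gen (numbers are illustrative, not real data).
by_demo = pd.DataFrame({
    'Municipality': ['Ballerup'] * 4,
    'Year': [2010, 2010, 2011, 2011],
    'Gender': ['Men', 'Women', 'Men', 'Women'],
    'Population': [100, 110, 105, 115],
})
rec_demo = pd.DataFrame({
    'Municipality': ['Ballerup'] * 4,
    'Year': [2010, 2010, 2011, 2011],
    'Affected': ['Children', 'Families', 'Children', 'Families'],
    'Recipients': [12, 10, 15, 11],
})

# Merging directly multiplies rows: 2 x 2 combinations per year.
naive = pd.merge(by_demo, rec_demo, on=['Municipality', 'Year'], how='inner')

# Collapse the population data to one row per municipality and year.
pop_total = by_demo.groupby(['Municipality', 'Year'], as_index=False)['Population'].sum()

# Keep a single category on the recipient side so the keys are unique there too.
children = rec_demo.loc[rec_demo['Affected'] == 'Children',
                        ['Municipality', 'Year', 'Recipients']]

# validate='one_to_one' raises pandas.errors.MergeError if either side has duplicate keys.
merged = pd.merge(pop_total, children, on=['Municipality', 'Year'],
                  how='inner', validate='one_to_one')
print(len(naive), len(merged))
print(merged)
```

Which categories to keep or sum over is an analytical choice; the point of the sketch is only that `validate` turns a silent row explosion into an explicit error.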


Next, we examine the values each variable can take:

In [5]:
# one2one is a merged DataFrame, not a table in the API,
# so we look up the variables of the source table DAGTIL4 instead
Rec_vars = Dst.get_variables(table_id='DAGTIL4')

# print the id and text of every value of the categorical variables
for var_id in ['TILSKUDSART','BERORT']:
     print(var_id)
     values = Rec_vars.loc[Rec_vars.id == var_id,['values']].values[0,0]
     for value in values:
         print(f' id = {value["id"]}, text = {value["text"]}')

# print the first 6 values of OMRÅDE
# values = Rec_vars.loc[Rec_vars.id == 'OMRÅDE',['values']].values[0,0]
# print(values[:6])

## Explore each data set

To **explore the raw data**, you may provide **static** and **interactive plots** that show important developments.

An interactive plot where you can click between municipalities?

In [23]:
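# Gender only exists in the population data, so we plot from by_gen.
# (In one2one every population row is repeated once per Subsidy_type/Affected pair.)
# Assumes BY2 has no aggregate ('total') rows; otherwise filter them out first.
pop = by_gen.copy()
pop['INDHOLD'] = pd.to_numeric(pop['INDHOLD'], errors='coerce') # non-numeric entries become NaN

# create a pivot table with total population by gender and year
pop_pivot = pop.pivot_table(index='Gender', columns='Year', values='INDHOLD', aggfunc='sum')

# create a subset of data for Men and Women
men = pop_pivot.loc['Men']
women = pop_pivot.loc['Women']

# set the style using Seaborn (before the figure is created, so that it takes effect)
import seaborn as sns
sns.set_style('whitegrid')

# create line plots for Men and Women
plt.plot(pop_pivot.columns, men, label='Men', color='blue')
plt.plot(pop_pivot.columns, women, label='Women', color='red')

# create bar plot for the difference between Women and Men
diff = women - men
plt.bar(pop_pivot.columns, diff, label='Difference', color='lightblue', alpha=0.5)

# add axis labels and legend
plt.xlabel('Year')
plt.ylabel('Persons')
plt.legend()

# set the plot title and adjust the layout
plt.title('Population by gender')
plt.tight_layout()

# show the plot
plt.show()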

**Interactive plot** :

In [6]:
def plot_func():
    # Function that operates on data set
    pass

widgets.interact(plot_func, 
    # Let the widget interact with data through plot_func()    
); 



Explain what you see when moving elements of the interactive plot around. 
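
As a starting point for the placeholder above, here is a minimal sketch of a plot with a dropdown for switching between municipalities. It assumes a cleaned DataFrame with one numeric value per municipality and year; the frame `demo`, its numbers, the column `Recipients` and the helper `select_municipality` are all made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt
import ipywidgets as widgets

# Toy stand-in for the cleaned recipient data (illustrative numbers only).
demo = pd.DataFrame({
    'Municipality': ['Ballerup', 'Ballerup', 'Aarhus', 'Aarhus'],
    'Year': [2010, 2011, 2010, 2011],
    'Recipients': [12, 15, 40, 38],
})

def select_municipality(df, municipality):
    """Return the rows for one municipality, sorted by year."""
    return df.loc[df['Municipality'] == municipality].sort_values('Year')

def plot_func(municipality):
    """Draw the time series for the municipality chosen in the dropdown."""
    subset = select_municipality(demo, municipality)
    fig, ax = plt.subplots()
    ax.plot(subset['Year'], subset['Recipients'], marker='o')
    ax.set_xlabel('Year')
    ax.set_ylabel('Recipients')
    ax.set_title(municipality)
    plt.show()

# The dropdown lists every municipality; interact() calls plot_func on each change.
dropdown = widgets.Dropdown(options=sorted(demo['Municipality'].unique()),
                            description='Municipality')
widgets.interact(plot_func, municipality=dropdown);
```

Keeping the data selection in its own function makes it easy to check the numbers separately from the widget.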

# Analysis

To get a quick overview of the data, we show some **summary statistics** on a meaningful aggregation. 
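
One possible aggregation is by year, summarizing the distribution across municipalities. The sketch below assumes a cleaned DataFrame with a numeric `Recipients` column; the frame and its numbers are illustrative, not actual results.

```python
import pandas as pd

# Toy stand-in for the cleaned recipient data (illustrative numbers only).
demo = pd.DataFrame({
    'Municipality': ['Ballerup', 'Aarhus', 'Ballerup', 'Aarhus', 'Ballerup', 'Aarhus'],
    'Year': [2010, 2010, 2011, 2011, 2012, 2012],
    'Recipients': [12, 40, 15, 38, None, 42],
})

# One row per year: number of municipalities with data, plus mean, min and max.
# 'count' ignores missing values, so it also shows where data is incomplete.
summary = demo.groupby('Year')['Recipients'].agg(['count', 'mean', 'min', 'max'])
print(summary)
```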

MAKE FURTHER ANALYSIS. EXPLAIN THE CODE BRIEFLY AND SUMMARIZE THE RESULTS.

# Conclusion

ADD CONCISE CONCLUSION.