#### Before Start

Starting point of the cohabiting couples project.  

This project was run with Python 2.7.10; the packages are listed [at the end of this notebook](#Packages). They are the ones shipped with `Winpython 2.7.10.2` (portable version), although some may have been updated and should be matched to the `requirement.txt` file in the `./Programme/` directory.

In order to run this series of notebooks, you need to have:
1. Run the SAS script `Convert_EDP_sas_to_csv.sas` with the SAS software in order to export to csv the 52 SAS files that constitute the EDP 2015. You will have to **modify the script** so that:
 1. the first path points to the folder containing the `sas7bdat` files (you will likely only have to change the CASD project name).
 2. the second path points to `./Data/csv/` (written as an absolute path), so that the 52 `csv` files end up in the `./Data/csv/` folder.
 
This script takes about 15 minutes to run.

2. Run the notebook [create hdf](create_hdf.ipynb). You may need to ask the CASD services to increase the swap file (`pagefile.sys`) so that you do not get a `MemoryError`.
This notebook creates the 69 GB file `edp_2015_final.h5`, which contains the whole EDP in the HDF5 format (Hierarchical Data Format).
**Running the next cells will execute this notebook.**




In [1]:
import datetime
start_time = datetime.datetime.now(); print(start_time)

2019-04-26 16:36:20.710000


In [2]:
%%time
%%capture
%run create_hdf.ipynb

Wall time: 27min 40s
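
Once both conversion steps have run, a quick sanity check can confirm that the expected files are in place before launching the long pipeline. The sketch below is only an illustration: the helper name `check_prerequisites` is hypothetical, and the location of `edp_2015_final.h5` is not stated in this notebook, so the HDF5 path is left as an optional argument to be supplied by the user.

```python
import glob
import os

# Number of SAS tables exported to csv, as stated in the prerequisites above.
EXPECTED_CSV_COUNT = 52


def check_prerequisites(csv_dir, hdf_path=None, expected_csv=EXPECTED_CSV_COUNT):
    """Return a list of human-readable problems (empty list: nothing missing).

    `hdf_path` is optional because the location of the HDF5 file is an
    assumption that depends on the local setup.
    """
    problems = []
    csv_files = glob.glob(os.path.join(csv_dir, "*.csv"))
    if len(csv_files) != expected_csv:
        problems.append("expected {0} csv files in {1}, found {2}".format(
            expected_csv, csv_dir, len(csv_files)))
    if hdf_path is not None and not os.path.isfile(hdf_path):
        problems.append("missing HDF5 file: {0}".format(hdf_path))
    return problems


if __name__ == "__main__":
    # Relative path taken from the prerequisites; adapt it to your own tree.
    for problem in check_prerequisites("./Data/csv/"):
        print(problem)
```

The function returns messages instead of raising, so all missing pieces are reported in a single pass.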


## Notebooks

The notebooks are organized in two sets.
The first set must be run in the order presented below.

**Running this notebook** executes those notebooks in the script section; their code is left unchanged, and only their stored outputs are overwritten because `nbconvert` is called with `--inplace`.  
The [Results](#Results) notebooks contain the results of the article and must be run independently.


#### First set:

1. [Data preparation: Isolate cohabiting couples](./Programme/Data_preparation_isolate_cohabiting_couples.ipynb)
This notebook isolates the cohabiting couples with children under 18 from the other types of households. It creates pickle files that contain the household identifiers `ID_FISC_LOG_DFF`.
2. [biologic_child_2014.py](./Programme/biologic_script/biologic_child_2014.py)  
A script that keeps only the households where both parents have the same birth date as the one recorded in the **civil registries**, in order to exclude stepfamilies.
3. [Optimizers 2013](./Programme/optimize_2013.ipynb), which computes all the potential income taxes for a household.
The equivalent notebook must also be executed for 2014, [Optimizers 2014](./Programme/optimize_2014.ipynb), and for the counterfactuals [IR_2013_on_2014_income](./Programme/Conterfactual_tax/optimize_cleaned_2014_income_tax_on_2013_income.ipynb) and [IR_2014_on_2013_income](./Programme/Conterfactual_tax/optimize_cleaned_2013_income_tax_on_2014_income.ipynb).
4. [Recensement](./Programme/Recensement.ipynb)
Gets information from the census survey over a span of 5 years in order to add household characteristics such as diplomas.
5. [Marriage Separation](./Programme/marriage_separation.ipynb)
Isolates couples that separate, get married, or enter a civil union (PACS) between 2014 and 2015.
6. [Data preparation (variables creation)](./Programme/Data_preparation.ipynb)
Creates some variables needed for the analysis, such as the age of the eldest child.
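
To make the stepfamily filter of the `biologic_child_2014.py` step more concrete, here is a toy sketch of the matching rule it describes: a household is kept only when the birth dates of its two adults coincide with the parents' birth dates found in the civil registry. This is not the project's code. The function name, the dictionary layout and the sample identifiers are all hypothetical, and the real script may handle several children or missing records differently.

```python
import datetime


def biological_households(declared, registry):
    """Return the IDs of households whose two adults match the registry parents.

    declared: dict mapping a household ID to the birth dates of its two adults
              (hypothetical layout).
    registry: dict mapping a household ID to the parents' birth dates recorded
              in the civil registry for the household's child (hypothetical).
    """
    kept = set()
    for household_id, adult_dates in declared.items():
        parent_dates = registry.get(household_id)
        # Order does not matter: compare the two pairs once sorted.
        if parent_dates is not None and sorted(adult_dates) == sorted(parent_dates):
            kept.add(household_id)
    return kept


# Toy data: household "B" has one adult who is not a registered parent.
declared = {
    "A": (datetime.date(1980, 1, 5), datetime.date(1982, 7, 19)),
    "B": (datetime.date(1975, 3, 2), datetime.date(1979, 11, 30)),
}
registry = {
    "A": (datetime.date(1982, 7, 19), datetime.date(1980, 1, 5)),
    "B": (datetime.date(1975, 3, 2), datetime.date(1977, 6, 14)),
}
print(sorted(biological_households(declared, registry)))
```

Households absent from the registry are dropped here, which is a conservative choice made for this sketch only.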

### Results

These notebooks contain the results presented in the article (and more). They should be run independently.

[Descriptive stats](./Programme/Descriptive_stats.ipynb)
Contains descriptive statistics for years 2013 and 2014.  

[Cooperation](./Programme/Regressions_cooperation.ipynb)
Contains multinomial logit models showing that cohabiting couples that do not optimize tend to separate more often, while couples that do optimize are more likely to enter a civil union (marriage or PACS).


[Learning and transitions](./Programme/Learning_and_transitions.ipynb)
Shows how households react to a change (or the absence of change) in their optimal allocation between the fiscal years 2013 and 2014.

#### Second set:


1. [Child Repartition](./Programme/Child_repartition_stat_all_year.ipynb)

An additional notebook showing that the distribution of children does not vary much over the 2010-2015 period.



In [3]:
import datetime
start_time = datetime.datetime.now(); print(start_time)

2019-04-26 17:04:01.052000


In [4]:
%pwd

u'C:\\Users\\IMPTEMP_A_PACIFIC\\Desktop\\Cohabitant_project(EDP_2015)'

In [5]:
%cd Programme
datetime.datetime.now()

C:\Users\IMPTEMP_A_PACIFIC\Desktop\Cohabitant_project(EDP_2015)\Programme


datetime.datetime(2019, 4, 26, 17, 4, 1, 129000)

In [6]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1 Data_preparation_isolate_cohabiting_couples.ipynb

Wall time: 30min 24s


[NbConvertApp] Converting notebook Data_preparation_isolate_cohabiting_couples.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 13973 bytes to Data_preparation_isolate_cohabiting_couples.ipynb
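
The shell command used in the cell above is repeated for every notebook of the pipeline. As a sketch only, the same call could be wrapped in a small helper so that a list of notebooks runs in order and stops at the first failure. The function names `nbconvert_command` and `run_notebooks` are hypothetical; the flags are copied from the command above. On installations newer than the IPython 3 used here, the entry point is `jupyter nbconvert` rather than `ipython nbconvert`.

```python
from __future__ import print_function

import subprocess


def nbconvert_command(notebook, timeout=-1):
    """Build the command that executes `notebook` in place (flags as above)."""
    return [
        "ipython", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        "--ExecutePreprocessor.timeout={0}".format(timeout),
        notebook,
    ]


def run_notebooks(notebooks, runner=subprocess.check_call):
    """Run the notebooks one after the other.

    With the default runner, a non-zero exit code raises
    subprocess.CalledProcessError, which stops the sequence.
    """
    for notebook in notebooks:
        runner(nbconvert_command(notebook))


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    run_notebooks(
        ["optimize_2013.ipynb", "optimize_2014.ipynb"],
        runner=lambda command: print(" ".join(command)),
    )
```

Passing the runner as an argument keeps the dry run and the real execution on the same code path.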


In [7]:
%cd ..
datetime.datetime.now()

C:\Users\IMPTEMP_A_PACIFIC\Desktop\Cohabitant_project(EDP_2015)


datetime.datetime(2019, 4, 26, 17, 34, 25, 863000)

In [8]:
%pwd


u'C:\\Users\\IMPTEMP_A_PACIFIC\\Desktop\\Cohabitant_project(EDP_2015)'

In [9]:
%%time
%%capture
%run ./Programme/biologic_script/biologic_child_2014.py
#Creates 'C:/Users/IMPTEMP_A_PACIFIC/Desktop/EDP_2015/Data/hdf/edp_concubin.h5'

Wall time: 9min 24s


In [10]:
%cd Programme/

C:\Users\IMPTEMP_A_PACIFIC\Desktop\Cohabitant_project(EDP_2015)\Programme


In [11]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1 ./biologic_id_to_pickle.ipynb

Wall time: 17.9 s


[NbConvertApp] Converting notebook ./biologic_id_to_pickle.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 4450 bytes to biologic_id_to_pickle.ipynb


In [12]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1  optimize_2013.ipynb

Wall time: 3min 53s


[NbConvertApp] Converting notebook optimize_2013.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 276171 bytes to optimize_2013.ipynb


In [13]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1  optimize_2014.ipynb

Wall time: 2min 32s


[NbConvertApp] Converting notebook optimize_2014.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 193790 bytes to optimize_2014.ipynb


In [14]:
%pwd

u'C:\\Users\\IMPTEMP_A_PACIFIC\\Desktop\\Cohabitant_project(EDP_2015)\\Programme'

In [15]:
%cd ./Conterfactual_tax

C:\Users\IMPTEMP_A_PACIFIC\Desktop\Cohabitant_project(EDP_2015)\Programme\Conterfactual_tax


In [16]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1  optimize_cleaned_2014_income_tax_on_2013_income.ipynb

Wall time: 3min 55s


[NbConvertApp] Converting notebook optimize_cleaned_2014_income_tax_on_2013_income.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 346982 bytes to optimize_cleaned_2014_income_tax_on_2013_income.ipynb


In [17]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1  optimize_cleaned_2013_income_tax_on_2014_income.ipynb

Wall time: 3min 58s


[NbConvertApp] Converting notebook optimize_cleaned_2013_income_tax_on_2014_income.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 315293 bytes to optimize_cleaned_2013_income_tax_on_2014_income.ipynb


In [18]:
%cd ..

C:\Users\IMPTEMP_A_PACIFIC\Desktop\Cohabitant_project(EDP_2015)\Programme


In [19]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1  Recensement.ipynb

Wall time: 7min 27s


[NbConvertApp] Converting notebook Recensement.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 28694 bytes to Recensement.ipynb


In [20]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1 marriage_separation.ipynb

Wall time: 3min 37s


[NbConvertApp] Converting notebook marriage_separation.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 49577 bytes to marriage_separation.ipynb


In [21]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1 Data_preparation.ipynb

Wall time: 3min 13s


[NbConvertApp] Converting notebook Data_preparation.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 57094 bytes to Data_preparation.ipynb


### Results notebooks

In [22]:
import datetime
datetime.datetime.now()

datetime.datetime(2019, 4, 26, 18, 12, 47, 764000)

In [23]:
%cd Programme

[Error 2] Le fichier spécifié est introuvable: u'Programme'
C:\Users\IMPTEMP_A_PACIFIC\Desktop\Cohabitant_project(EDP_2015)\Programme


In [24]:
 datetime.datetime.now()

datetime.datetime(2019, 4, 26, 18, 12, 48, 363000)

In [25]:
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1 Descriptive_stats.ipynb

[NbConvertApp] Converting notebook Descriptive_stats.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 1783839 bytes to Descriptive_stats.ipynb


In [26]:
 datetime.datetime.now()

datetime.datetime(2019, 4, 26, 18, 27, 24, 448000)

In [27]:
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1 Regressions_cooperation.ipynb

[NbConvertApp] Converting notebook Regressions_cooperation.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 1973682 bytes to Regressions_cooperation.ipynb


In [28]:
 datetime.datetime.now()

datetime.datetime(2019, 4, 26, 18, 28, 59, 677000)

In [29]:
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1 Learning_and_transitions.ipynb

[NbConvertApp] Converting notebook Learning_and_transitions.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 548765 bytes to Learning_and_transitions.ipynb


In [30]:
 datetime.datetime.now()

datetime.datetime(2019, 4, 26, 18, 29, 18, 528000)

#### Second set

In [31]:
 datetime.datetime.now()

datetime.datetime(2019, 4, 26, 18, 29, 18, 588000)

In [32]:
%%time
! Ipython nbconvert --to notebook --execute --inplace --ExecutePreprocessor.timeout=-1 Child_repartition_stat_all_year.ipynb

Wall time: 8min 4s


[NbConvertApp] Converting notebook Child_repartition_stat_all_year.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python2
[NbConvertApp] Writing 90656 bytes to Child_repartition_stat_all_year.ipynb


##### Packages

In [33]:
!pip freeze

adodbapi==2.6.0.7
alabaster==0.7.6
astroid==1.3.6
Babel==1.3
backports.ssl-match-hostname==3.4.0.2
baresql==0.7.1
bcolz==0.10.0
beautifulsoup4==4.4.0
Biryani==0.10.4
blaze==0.8.2
bokeh==0.9.2
brewer2mpl==1.4.1
certifi==2015.4.28
cffi==1.1.2
click==4.1
colorama==0.3.3
configparser==3.5.0b2
cvxopt==1.1.7
cx-Freeze==4.3.4
cyordereddict==0.2.2
Cython==0.22.1
cytoolz==0.7.3
dask==0.6.1
datashape==0.4.6
db.py==0.4.4
decorator==3.4.2
dill==0.2.3
docopt==0.6.2
docutils==0.12
enum34==1.0.4
Flask==0.10.1
fonttools==2.4
formlayout==1.0.15
funcsigs==0.4
functools32==3.2.3.post1
greenlet==0.4.7
guidata==1.6.2
guiqwt==2.3.2
h5py==2.5.0
holoviews==1.3.2
husl==4.0.2
ipynb==0.3
ipython==3.2.1
ipython-sql==0.3.6
isodate==0.5.4
itsdangerous==0.24
jedi==0.9.0
Jinja2==2.7.3
joblib==0.8.4
jsonschema==2.5.1
julia==0.1.1.8
Keras==0.1.2
llvmlite==0.6.0
lmfit==0.8.3
locket==0.2.0
logilab-common==1.0.2
logutils==0.3.3
lxml==3.4.4
Markdown==2.6.2
MarkupSafe==0.23
matplotlib==1.4.3
mingwpy==0.1.0b3
mistune==0.7
mp

Error [Error 2] Le fichier spécifié est introuvable while executing command git config remote.origin.url
cannot determine version of editable source in c:\users\imptemp_a_pacific\desktop\openfisca\openfisca-france-data (git command not found in path)


In [34]:
!pip freeze > requirement.txt

Error [Error 2] Le fichier spécifié est introuvable while executing command git config remote.origin.url
cannot determine version of editable source in c:\users\imptemp_a_pacific\desktop\openfisca\openfisca-france-data (git command not found in path)


In [35]:
! python --version

Python 2.7.10


In [36]:
stop_time = datetime.datetime.now(); print("now: ",stop_time);
execution_time = stop_time - start_time; print("execution_time: ",execution_time)
# Clear every variable from the interactive namespace
%reset -f

('now: ', datetime.datetime(2019, 4, 26, 18, 37, 54, 915000))
('execution_time: ', datetime.timedelta(0, 5633, 863000))
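
The last cell prints the run time as a raw `datetime.timedelta`, which is hard to read at a glance. A minimal sketch of a formatter follows; the helper name `format_elapsed` is hypothetical, and the value is the one printed above.

```python
import datetime


def format_elapsed(delta):
    """Format a timedelta as hours, minutes and seconds (fractions dropped)."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{0}h {1:02d}min {2:02d}s".format(hours, minutes, seconds)


# Value printed by the last cell of this notebook.
elapsed = datetime.timedelta(0, 5633, 863000)
print(format_elapsed(elapsed))  # 1h 33min 53s
```

Since `start_time` is re-assigned in cell `In [3]`, this duration does not include the 27 minutes spent running `create_hdf.ipynb`.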
