revised demographics.py and tests and csv data files #36

Closed
wants to merge 71 commits into from

jpycroft
Collaborator

Hello @rickecon @jdebacker @nikhilwoodruff,

I think this is getting close now ...

This new pull request replaces the original one started in March 2021. With changing git addresses and so on, it is best to start afresh.
@rickecon: I notice that the old pull request included changes to parameter_plots.py & OG-UKplots.mplstyle. Are these still needed?

test_demographics.py
The testing suite for all of the functions in demographics.py is based on the OG-USA test file. Note that some tests are fairly simple and only check whether the output has the correct shape.

The six tests in test_demographics.py are:
• test_get_pop_objs tests whether the steady state omega_SS equals the long run omega[-1,:]
• test_pop_smooth tests smooth evolution of omega and g_n
• test_imm_smooth tests smooth evolution of imm_rates
• test_fert_rates tests the shape of fert_rates
• test_get_mort (added by Rick) tests the shape of mort_rates and checks the values of mort_rates against a data list within the test (with 100-period lives and with fewer)
• test_get_imm_resid tests shape of imm_rates

demographics.py
For demographics.py, recent changes are largely code to create further csv files for the downloaded data (saved in \data\demographics), which required some rewriting/restructuring of the code. The raw data can also be checked from these csv files. Seven csv files are needed.
Alternatively, all data can be taken from the Eurostat site by setting download=True in each function.

I was having issues with recovering dataframes from csv files that behaved differently to the original dataframes. I’ve found solutions which work, though there may be more elegant ways of doing it.
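
As an illustration of the kind of mismatch described above (not necessarily the one encountered here), a DataFrame with a named index and integer column labels does not come back identical from a csv file unless the index column and the label types are restored on read. This is a minimal sketch; the column names and values are made up for the example.

```python
import io

import pandas as pd

# Hypothetical data: ages as the index, years as integer column labels.
original = pd.DataFrame(
    {2017: [0.010, 0.020], 2018: [0.011, 0.021]},
    index=pd.Index([15, 16], name="age"),
)

# Write to an in-memory csv so the example needs no files on disk.
csv_text = original.to_csv()

# Naive read: the index becomes an ordinary column and the
# year labels come back as strings ("2017", "2018").
naive = pd.read_csv(io.StringIO(csv_text))

# Restore the index on read, then convert the column labels back to integers.
restored = pd.read_csv(io.StringIO(csv_text), index_col="age")
restored.columns = restored.columns.astype(int)
```

Here `naive.equals(original)` is false while `restored.equals(original)` is true, so any code that selects columns by integer year only works on the restored frame.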

@jpycroft
Collaborator Author

Hi @rickecon,

You mentioned that for you the model was breaking at "curve_fit" within the get_fert function. I was testing just now.

I'm using test_demographics.py and deliberately failing test_get_fert so that I can print from within the get_fert function.
I am able to print the values for over50pred and under15pred, which are produced using curve_fit. I get:

under15pred:  [ 0.71483251  2.04317571  5.83992322 16.69200697 47.71006158]
over50pred:  [109.39731081  64.017876    37.46242405  21.92252076  12.82877253
   7.50722996   4.39313282   2.57080389   1.50440083   0.88035569
   0.51517264]

These values are then scaled to exactly match the under 15 / over 50 totals and included in fert_rates, which gives these values:

fert_rates:  [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 8.91698170e-07 2.61903633e-06
 7.71009048e-06 2.26687005e-05 6.62549564e-05 4.39801153e-04
 1.64448885e-03 4.33788391e-03 8.14303979e-03 1.33199877e-02
 1.70616361e-02 2.05305283e-02 2.40569888e-02 2.71474409e-02
 3.09965506e-02 3.54377049e-02 3.94466671e-02 4.45074687e-02
 4.93073608e-02 5.22590234e-02 5.47425029e-02 5.69597063e-02
 5.47451534e-02 5.19441887e-02 4.81662924e-02 4.35275293e-02
 3.80291283e-02 3.13433478e-02 2.56444012e-02 2.06582167e-02
 1.49545220e-02 9.99803910e-03 6.32723184e-03 3.60467171e-03
 2.06909270e-03 1.13429480e-03 6.41484730e-04 3.57761398e-04
 2.52635447e-04 1.63146237e-04 1.17397731e-04 6.82993578e-05
 3.98483860e-05 2.33556964e-05 1.38901464e-05 8.31351634e-06
 5.00777170e-06 3.04512830e-06 1.84115779e-06 1.10259192e-06
 6.65555594e-07 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]

So unfortunately, this does not explain why your version is not running. Let me know if I can check something else.
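
The rescaling step mentioned above can be sketched as follows: the fitted values are multiplied by a constant so that their sum equals a known group total. The under15pred values are the ones printed above; the target total is a made-up number for illustration, and the actual code in get_fert may scale differently.

```python
import numpy as np

# Fitted values for the under-15 ages, as printed above.
under15pred = np.array(
    [0.71483251, 2.04317571, 5.83992322, 16.69200697, 47.71006158]
)

# Hypothetical aggregate for the whole under-15 group (not from the data).
under15_total = 50.0

# Scale so the age-specific values sum to the group total while
# keeping the fitted shape (the relative proportions) unchanged.
under15_scaled = under15pred * (under15_total / under15pred.sum())
```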

@jpycroft
Collaborator Author

Hi @rickecon,
I've pushed changes to demographics.py to make it run with get_data_df, instead of get_sdmx_data_df.
All seven pytest tests in test_demographics.py are passing.
Various automatic checks are NOT passing. Apart from rerunning black formatting (which fails anyway), I haven't addressed them yet.
A few notes below for the record.
Best,
Jon

Notes on the switch from get_sdmx_data_df to get_data_df for downloading from Eurostat:

  • change filter_pars entry contents
  • not CAPS for geo, sex, age
  • replace beg_yr with "2018_value"
  • some changes to column heading in df, e.g., "geo" to "geo\TIME_PERIOD"
  • change to (Year) + "_flag" to drop columns
  • Unfortunately, had to hard code some years, e.g., 'startPeriod':"2018", not beg_yr=2018 and 'startPeriod':beg_yr. Note: str(beg_yr) does not solve the problem.
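
A minimal sketch of the column handling these notes describe, assuming the downloaded table has one "<year>_value" and one "<year>_flag" column per year. The table below is a made-up stand-in (including the age codes), nothing is downloaded, and the sketch covers only the column names, not the 'startPeriod' filter issue noted in the last bullet.

```python
import pandas as pd

beg_yr = 2018

# Made-up stand-in for a downloaded table; no request is sent to Eurostat.
df = pd.DataFrame(
    {
        "geo\\TIME_PERIOD": ["UK", "UK"],
        "age": ["Y15", "Y16"],
        f"{beg_yr}_value": [0.011, 0.021],
        f"{beg_yr}_flag": [None, "p"],
    }
)

# Column names for the chosen year, built from beg_yr.
value_col = f"{beg_yr}_value"
flag_col = f"{beg_yr}_flag"

# Drop the flag column and keep the value column.
df = df.drop(columns=[flag_col])
```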

## [0.1.2] - 2022-10-26

### Changed

* Updated `demographics.py` and `test_demographics.py` (and `calibrate.py`).
* Added some `.csv` data files and `.png` image files.
* Updated the `initial_guess_r_SS` starting value in `oguk_default_parameters.json`.
=======
Collaborator

(think this might be a merge conflict BTW)

@jpycroft
Collaborator Author

Hi @rickecon, @nikhilwoodruff,

Here is an update on trying to run OG-UK from the demog_uk branch. I note that the tests are failing and, Nikhil, your point above that a merge conflict might be involved.

At present, I get the following error (full output attached):

File "c:\users\jonat\repos\og-uk\oguk\calibrate.py", line 153, in get_tax_function_parameters
num_etr_params = dict_params["tfunc_etr_params_S"].shape[2]
AttributeError: 'list' object has no attribute 'shape'

(dict_params is a dictionary, btw: type(dict_params): <class 'dict'>)

This seems strange, because presumably it works for others, and (I think) it used to work for me too. In any case, I can no longer solve for the SS.

I can't work out which of the changes I made could cause this. There were the changes to demographics.py to switch from get_sdmx_data_df to get_data_df. I also recreated the environment from environment.yml (because I removed and reinstalled Anaconda), using the version on the demog_uk branch.
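
The error message indicates that the value stored under "tfunc_etr_params_S" is a plain Python list rather than a NumPy array. One possible workaround, sketched below with made-up values, is to convert the value with np.asarray before reading its shape. Whether that is the right fix for calibrate.py depends on why a list ends up there, which is not established in this thread.

```python
import numpy as np

# Hypothetical stand-in for the loaded parameters: a nested list, not an array.
dict_params = {
    "tfunc_etr_params_S": [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]],
}

etr_params = dict_params["tfunc_etr_params_S"]

# A plain list has no .shape attribute, which is what raises the AttributeError.
has_shape = hasattr(etr_params, "shape")

# Converting to an array first makes the same indexing work.
num_etr_params = np.asarray(etr_params).shape[2]
```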

JPnotes_OG-UK_19dec2022.txt

@rickecon
Member

@jpycroft. This PR was superseded by PR #57. We can close this now.

@rickecon rickecon closed this Jan 30, 2023