Update criterion functions for new estimagic interface #389

Merged
merged 23 commits into from Apr 28, 2021


@amageh amageh commented Dec 2, 2020

Current behavior

The current criterion functions will not work with the upcoming changes to estimagic's optimization capabilities. In particular, the way outputs are returned needs to be adjusted.

get_log_like_func for maximum likelihood estimation:
The following arguments control what objects will be returned:

  • return_scalar: If True, returns the mean log-likelihood; if False, returns a NumPy array of individual log-likelihood contributions.
  • return_comparison_plot_data: If True, returns a pandas.DataFrame in tidy data format with the columns identifier, period, choice, value, and kind. The value column contains the likelihood contributions (see "Restructure likelihood.py for #310 and estimagic's comparison plot", #313).

If both are True, the criterion will return a tuple.

get_moment_errors_func for method of simulated moments estimation:
The following arguments control what objects will be returned:

  • return_scalar: If True, returns the weighted square product of the moment errors (float); if False, returns the moment errors multiplied by the root of the weighting matrix.
  • return_simulated_moments: If True, returns the simulated moments in the same data structure as the input empirical_moments.
  • return_comparison_plot_data: If True, returns a pandas.DataFrame in tidy data format with empirical and simulated moments (see "Add comparison_plot_data to msm and return moments", #363).

If either return_simulated_moments or return_comparison_plot_data is True, the function returns a tuple. Both cannot be True at the same time.
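The two return_scalar modes are closely related, which the following minimal sketch tries to illustrate. It assumes a diagonal weighting matrix passed as a vector of weights; the function name moment_errors_output and its arguments are hypothetical and are not respy's actual implementation.

```python
import numpy as np


def moment_errors_output(simulated, empirical, weights_diag, return_scalar=True):
    """Hypothetical helper illustrating the two return_scalar modes.

    Assumes a diagonal weighting matrix given by the vector ``weights_diag``.
    """
    errors = np.asarray(simulated, dtype=float) - np.asarray(empirical, dtype=float)
    # Multiply the errors by the (elementwise) root of the diagonal weights.
    root_weighted = np.sqrt(np.asarray(weights_diag, dtype=float)) * errors
    if return_scalar:
        # Sum of squared root-weighted errors equals errors' W errors.
        return float(root_weighted @ root_weighted)
    return root_weighted


scalar = moment_errors_output([2.0, 4.0], [1.0, 2.0], [4.0, 0.25])
vector = moment_errors_output([2.0, 4.0], [1.0, 2.0], [4.0, 0.25], return_scalar=False)
```

Under this assumption, summing the squares of the vector output reproduces the scalar output, which is why least-squares optimizers can work with the vector form directly.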

Desired behavior

To fit the new estimagic optimization interface, the criterion should return either a scalar/NumPy.ndarray or a dictionary containing the additional information (contributions).

Solution / Implementation

Both functions now return either a scalar or a dictionary (no tuples). I implemented this by eliminating the function arguments return_simulated_moments and return_comparison_plot_data. There are other ways to do it, but I feel this makes the functions more concise. As far as I can judge, these changes compromise backwards compatibility either way.
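A rough sketch of the scalar-or-dictionary return pattern is shown below. The helper name criterion_output, the extras argument, and the dictionary keys "value", "contributions", and "comparison_plot_data" are assumptions made for illustration; the docstrings in the PR are the authoritative description of the actual outputs.

```python
import numpy as np


def criterion_output(contributions, return_scalar=True, extras=None):
    """Hypothetical helper: package criterion output as a scalar or a dict."""
    contributions = np.asarray(contributions, dtype=float)
    value = float(contributions.mean())
    if return_scalar:
        # Scalar mode: only the aggregated criterion value is returned.
        return value
    # Dictionary mode: the value plus additional information, never a tuple.
    out = {"value": value, "contributions": contributions}
    if extras:
        out.update(extras)
    return out


as_scalar = criterion_output([-1.0, -3.0])
as_dict = criterion_output(
    [-1.0, -3.0], return_scalar=False, extras={"comparison_plot_data": None}
)
```

Callers that previously unpacked a tuple would instead look up the entries they need by key, so adding further entries later does not change the call signature.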

Both function docstrings describe what the function outputs look like. Some additional details on the comparison plot data:

Likelihood function:

  • I tested the new interface on the kw_94_one model (1000 agents and 40 periods). The pickled output is around 1.7 MB.
       identifier  period  choice      value    kind
0               0       0       a   -1.03995  choice
1               0       1       b  -0.844175  choice
2               0       2       b  -0.173064  choice
3               0       3     edu    -2.0254  choice
4               0       4     edu  -0.527486  choice
...           ...     ...     ...        ...     ...
79995         999      35       b   -10.0324    wage
79996         999      36       b    -10.013    wage
79997         999      37       b   -9.97582    wage
79998         999      38       b   -10.4655    wage
79999         999      39       b   -14.1459    wage

The columns contain the following data types:

identifier      uint16
period           uint8
choice        category
value          float64
kind          category
dtype: object
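Imposing these data types could look roughly like the sketch below. It rebuilds three rows from the table above and casts them with DataFrame.astype; the variable names are illustrative and this is not the code used in the PR. Note that uint16 can hold identifiers up to 65,535 and uint8 periods up to 255, so larger models would need wider types.

```python
import pandas as pd

# Three rows taken from the example output above, with default dtypes.
df = pd.DataFrame(
    {
        "identifier": [0, 0, 999],
        "period": [0, 1, 39],
        "choice": ["a", "b", "b"],
        "value": [-1.03995, -0.844175, -14.1459],
        "kind": ["choice", "choice", "wage"],
    }
)

# Compact dtypes as listed above; small unsigned integers and categoricals
# typically reduce the size of the stored DataFrame.
dtypes = {
    "identifier": "uint16",
    "period": "uint8",
    "choice": "category",
    "value": "float64",
    "kind": "category",
}
compact = df.astype(dtypes)
```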

Simulated Method of Moments:

Example of comparison plot data created for kw_94_one on a sample of 320 moments. The pickled output here is around 30 KB, but its size will vary considerably with the choice of moments. I imposed only a few data types (the columns moment_set and kind are set to categorical) since most depend on user-defined inputs.

     moment_column  moment_index          value  moment_set       kind
0            count             0     535.000000           0  empirical
1            count             1     634.000000           0  empirical
2            count             2     722.000000           0  empirical
3            count             3     740.000000           0  empirical
4            count             4     767.000000           0  empirical
...            ...           ...            ...         ...        ...
635            max            35  109416.436026           0  simulated
636            max            36   99576.614121           0  simulated
637            max            37  115929.849186           0  simulated
638            max            38  100242.059114           0  simulated
639            max            39   98889.945043           0  simulated
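One way such a tidy table could be assembled from wide moment tables is sketched below. It assumes the empirical and simulated moments are pandas.DataFrames with one column per moment and a shared index; the helper names tidy_moments and comparison_plot_data and all numbers are illustrative only and do not reproduce respy's implementation.

```python
import pandas as pd


def tidy_moments(moments, kind, moment_set=0):
    """Reshape a wide moments table into long format (hypothetical helper)."""
    tidy = (
        moments.rename_axis("moment_index")
        .reset_index()
        .melt(id_vars="moment_index", var_name="moment_column", value_name="value")
    )
    tidy["moment_set"] = moment_set
    tidy["kind"] = kind
    return tidy[["moment_column", "moment_index", "value", "moment_set", "kind"]]


def comparison_plot_data(empirical, simulated):
    """Stack empirical and simulated moments and set the categorical columns."""
    data = pd.concat(
        [tidy_moments(empirical, "empirical"), tidy_moments(simulated, "simulated")],
        ignore_index=True,
    )
    return data.astype({"moment_set": "category", "kind": "category"})


# Illustrative values only.
empirical = pd.DataFrame({"count": [535.0, 634.0], "max": [100.0, 200.0]})
simulated = pd.DataFrame({"count": [500.0, 650.0], "max": [110.0, 190.0]})
plot_data = comparison_plot_data(empirical, simulated)
```

Only moment_set and kind are cast to categorical here, mirroring the description above; the remaining columns keep whatever types the user-defined moments produce.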

Todo

  • Add data types for comparison plot data for both criterion functions.
  • Update returned objects in likelihood module.
  • Update returned objects in msm module.
  • Update documentation (msm guides)
  • Adjust tests.
  • Add more tests (?)
  • Document PR in CHANGES.rst.

@amageh amageh removed the request for review from mo2561057 December 2, 2020 16:54
@amageh amageh added the method-of-simulated-moments All these issues should be tackled when making respy ready for MSM. label Dec 2, 2020

codecov bot commented Dec 3, 2020

Codecov Report

Merging #389 (43f8797) into main (a4a5ff4) will decrease coverage by 2.41%.
The diff coverage is 54.54%.

❗ Current head 43f8797 differs from pull request most recent head 6cdd6b0. Consider uploading reports for the commit 6cdd6b0 to get more accurate results

@@            Coverage Diff             @@
##             main     #389      +/-   ##
==========================================
- Coverage   76.59%   74.17%   -2.42%     
==========================================
  Files          50       50              
  Lines        3640     3636       -4     
==========================================
- Hits         2788     2697      -91     
- Misses        852      939      +87     
Flag         Coverage Δ
end_to_end   74.17% <54.54%> (+0.85%) ⬆️
integration  ?


Impacted Files                                    Coverage Δ
respy/tests/test_likelihood.py                    48.83% <14.28%> (-44.50%) ⬇️
respy/likelihood.py                               76.43% <37.50%> (-8.54%) ⬇️
respy/tests/test_method_of_simulated_moments.py   86.66% <59.09%> (-13.34%) ⬇️
respy/method_of_simulated_moments.py              83.78% <100.00%> (+3.78%) ⬆️
respy/tests/test_flexible_choices.py              54.54% <0.00%> (-33.34%) ⬇️
respy/tests/test_solve.py                         50.22% <0.00%> (-16.45%) ⬇️
respy/tests/utils.py                              92.59% <0.00%> (-3.71%) ⬇️
respy/tests/test_model_processing.py              45.28% <0.00%> (-2.84%) ⬇️
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a4a5ff4...6cdd6b0.

@amageh amageh requested a review from janosg December 7, 2020 11:11
mo2561057 and others added 2 commits February 27, 2021 06:59
@amageh amageh added this to the 2.0.1 milestone Mar 31, 2021
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Annica Gehlen <39128048+amageh@users.noreply.github.com>

pre-commit-ci bot and others added 4 commits April 20, 2021 17:22
* [pre-commit.ci] pre-commit autoupdate (#401)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/asottile/pyupgrade: v2.11.0 → v2.12.0](asottile/pyupgrade@v2.11.0...v2.12.0)
- [github.com/PyCQA/flake8: 3.9.0 → 3.9.1](PyCQA/flake8@3.9.0...3.9.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Annica Gehlen <39128048+amageh@users.noreply.github.com>
updates:
- [github.com/asottile/pyupgrade: v2.11.0 → v2.13.0](asottile/pyupgrade@v2.11.0...v2.13.0)
- [github.com/asottile/reorder_python_imports: v2.4.0 → v2.5.0](asottile/reorder-python-imports@v2.4.0...v2.5.0)
- [github.com/psf/black: 20.8b1 → 21.4b0](psf/black@20.8b1...21.4b0)
- [github.com/PyCQA/flake8: 3.9.0 → 3.9.1](PyCQA/flake8@3.9.0...3.9.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Annica Gehlen <39128048+amageh@users.noreply.github.com>
@amageh amageh merged commit fc3da1f into main Apr 28, 2021
@amageh amageh deleted the update-crit branch April 28, 2021 09:57