Numpyrize backward induction. #145
Conversation
…lve with dataframe.
… in pyth_contributions.
… construct_emax_risk and get_exogenous_variables. Removed unused functions.
Hi @janosg, I am not able to reproduce the error in test_f2py::test_6 on my machine. Check out the second failed run on Travis, as the first log is not very informative. Can you give me a hint?
@tobiasraabe, on my machine test_f2py::test_6 also passes.
Did you also leave mpi and omp activated? If I now deactivate one of them during compilation, I get a segmentation fault.
If I deactivate either of the two, or both, I also get a segfault with no further information that would help you.
Thanks for helping me! The segfault was not caused by Fortran, but by an IndexError in a jitted function, which also manifests as a segmentation fault. I thought Numba was more verbose than this.
No, in general Numba segfaults are much harder to diagnose than Fortran segfaults!
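A practical way to surface the hidden IndexError: Numba disables bounds checking in jitted functions by default, so an out-of-range index crashes with a bare segfault. Recent Numba versions document an environment variable that re-enables bounds checking globally; the script name below is a placeholder, not part of this repository:

```shell
# NUMBA_BOUNDSCHECK=1 enables bounds checking in all jitted functions,
# turning the silent segfault into a Python IndexError with a traceback.
# "reproduce_crash.py" is a placeholder for whatever triggers the fault.
NUMBA_BOUNDSCHECK=1 python reproduce_crash.py
```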
Thanks for the great work! I created some issues to address later, once our development has settled down a bit after these tumultuous times :)
I cancelled the build as it took too much time. But the tests were successful on my machine, the Python suite was fine on AppVeyor, and the Fortran suite on Travis.
Problem
One major bottleneck of the Python implementation is `pyth_backward_induction`, which accounts for the majority of the runtime. Even for a small state space (10 periods, 2 types), extracting a state and finding the emax of its child states in the subsequent period take 11% and 86% of the time, respectively. This obstacle can only be alleviated by translating the procedure to NumPy and Numba.

Solution
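The key idea behind the speed-up is to replace the per-state DataFrame lookup with a single vectorized gather over a precomputed indexer array. A minimal sketch in pure NumPy; all names and shapes here are illustrative, not respy's actual API:

```python
import numpy as np

# Illustrative sizes only; the real state space is much larger.
n_states, n_states_next, n_choices = 4, 6, 3

rng = np.random.default_rng(0)

# emaxs of next period's states, known from the previous induction step
emaxs_next = rng.normal(size=n_states_next)

# indexer[s, c]: row of the child state reached from state s via choice c,
# built once alongside the state space
indexer = rng.integers(0, n_states_next, size=(n_states, n_choices))

# One fancy-indexing call gathers every child's emax at once,
# replacing a Python-level loop with per-state DataFrame lookups.
emaxs_children = emaxs_next[indexer]  # shape: (n_states, n_choices)

print(emaxs_children.shape)  # (4, 3)
```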
The final solution is very fast. Getting the emaxs from subsequent periods is no longer an obstacle and will not become a bottleneck anytime soon. The new bottleneck, which accounts for over 92% of the runtime, is `construct_emax_risk`. You can find the profiling results for 10 evaluations in the appendix.

- `pyth_create_systematic_rewards` in the state space and an update function for the rewards used in `pyth_criterion`.
- `pyth_backward_induction`, `pyth_simulate`, `pyth_contributions` and other related functions to the new state space.
- `get_emaxs_sub`, but it doubles runtime.
- `construct_emax_risk` as an independent gufunc.
- `pyth_create_state_space` so that no states are allowed where an agent always chose occupation A or B or education and the lagged choice is home. (Thanks @mo2561057)
- `testing_pull_request.py` works.

Appendix
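The profiling output referenced above is not reproduced here. As a rough illustration of the kind of computation that dominates it, here is a hedged pure-NumPy sketch of an emax-under-risk kernel; the names and shapes are assumptions, not `construct_emax_risk`'s actual signature:

```python
import numpy as np

# Illustrative sizes; real runs use many more states and draws.
n_states, n_choices, n_draws = 5, 4, 200

rng = np.random.default_rng(42)
rewards = rng.normal(size=(n_states, n_choices))  # systematic rewards
draws = rng.normal(size=(n_draws, n_choices))     # Monte Carlo shocks

# Value of each choice under each draw via broadcasting:
# (n_states, 1, n_choices) + (1, n_draws, n_choices)
values = rewards[:, None, :] + draws[None, :, :]

# emax: average over draws of the best choice under each draw
emax = values.max(axis=2).mean(axis=1)  # shape: (n_states,)

print(emax.shape)  # (5,)
```

Compiling such a kernel as a generalized ufunc (e.g. with Numba's `@guvectorize` over the state dimension) keeps this loop-free structure while the per-state work runs in compiled code.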