Generalize spline tax functions to 2D #839
The error is "ModuleNotFoundError: No module named 'pygam'". Do new modules/packages need to be added elsewhere?
@prrathi. This looks really good. For the tests to run on the GitHub Action, you need to add the pygam package to the Add the following three lines at the end of
Add the following two lines after
After you make these changes, you'll want to update your
Make sure your |
@rickecon thanks for the help, think previous issues should be fixed but it's giving an error with |
@prrathi. You need to do two things to this PR.
After following these steps (1) and (2), I think it is likely that all the CI tests will pass.
merging changes from original repo
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #839      +/-   ##
==========================================
- Coverage   81.96%   79.93%   -2.04%
==========================================
  Files          18       18
  Lines        4037     4146     +109
==========================================
+ Hits         3309     3314       +5
- Misses        728      832     +104
@prrathi. Perfect. All tests are passing. Now I just need to review this. I will try to get this done by the end of this week.
@rickecon sounds good. One thing that is somewhat missing is test cases comparing the actual interpolated values against expected interpolated values, analogous to your test case for the other methodology. I wasn't sure how to obtain those expected values, so I didn't write any cases for that.
@rickecon @jdebacker following up to see if there were any updates on reviewing this. I should also be able to make this Monday's meeting if there is one, so we can discuss more there.
@rickecon I think the only reference here is https://pygam.readthedocs.io/en/latest/. I had experimented with adding a tensor term in addition to the current spline terms for each dimension, but with this specific dataset I hadn't seen improvement. There are also some other model classes I didn't explore.
@prrathi. Can you merge the most recent updates to OG-Core into your branch? That will be a better foundation for me to work off of.
merging updates
@rickecon merged
tests/test_txfunc.py
Outdated
# X, y, weights, lam=100, incl_uncstr=True, show_plot=True, method='pygam'
# )

with open("test_io_data/micro_data_dict_for_tests.pkl", "rb") as f:
Let's use os.path.join(CUR_PATH, "test_io_data", "micro_data_dict_for_tests.pkl") for the file name so we can open it from directories other than tests (e.g., this test fails to open the pickle file when I run pytest from OG-Core/).
Also, should we make use of utils.safe_read_pickle as we do elsewhere?
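A minimal sketch of the path handling suggested here, under stated assumptions: the helper name load_pickle_relative_to is hypothetical, a temporary directory stands in for the tests folder, and plain pickle is used in place of utils.safe_read_pickle. In a real test module, base_dir would be CUR_PATH (assumed to be defined as os.path.abspath(os.path.dirname(__file__))).

```python
import os
import pickle
import tempfile


def load_pickle_relative_to(base_dir, *parts):
    """Open a pickle located under base_dir, independent of the working directory."""
    path = os.path.join(base_dir, *parts)
    with open(path, "rb") as f:
        return pickle.load(f)


# Throwaway directory standing in for the tests folder (illustration only).
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "test_io_data"))
demo_file = os.path.join(
    demo_dir, "test_io_data", "micro_data_dict_for_tests.pkl"
)
with open(demo_file, "wb") as f:
    pickle.dump({"example_key": [1, 2, 3]}, f)

# In a real test module, pass CUR_PATH instead of demo_dir.
loaded = load_pickle_relative_to(
    demo_dir, "test_io_data", "micro_data_dict_for_tests.pkl"
)
```

Because the path is anchored to a known directory rather than the current working directory, the file opens the same way whether pytest is launched from the repository root or from the tests folder.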
@prrathi I see you removed your local path. That's helpful. But also note the suggested changes to specify paths from the CUR_PATH.
@jdebacker @rickecon added the change. Also, from testing locally, I think this needs numpy version < 1.24, because the pygam package has code that doesn't work with numpy >= 1.24. Specifically, it calls numpy.int, which was deprecated and then removed in numpy 1.24.
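To make the numpy constraint concrete, here is a small sketch that checks whether the installed numpy still exposes the legacy numpy.int alias (deprecated in numpy 1.20, removed in 1.24). The function name has_legacy_int_alias is hypothetical and is not part of pygam or OG-Core.

```python
import warnings

import numpy as np


def has_legacy_int_alias():
    """Return True if this numpy still provides the removed np.int alias."""
    with warnings.catch_warnings():
        # numpy 1.20 to 1.23 emit a DeprecationWarning on access; silence it.
        warnings.simplefilter("ignore")
        return hasattr(np, "int")


# Major and minor version of the installed numpy, e.g. (1, 23).
major, minor = (int(part) for part in np.__version__.split(".")[:2])
alias_present = has_legacy_int_alias()
```

If alias_present is False, any library code that still calls numpy.int will raise an AttributeError, which matches the failure described above.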
@jdebacker @rickecon did those, also added the option to |
See my PR to your branch with a few changes necessary to make things run. One thing I wasn't clear on in your changes: can one estimate |
ogcore/txfunc.py
Outdated
if not np.isscalar(bins):
    err_msg = "monotone_spline2 ERROR: bins value is not type scalar"
"""
New args: |
Why isn't this just "Args" as in other docstrings?
You can probably remove references to "new" and "old" and just refer to the methods as "eilers" and "pygam".
Fixed
ogcore/txfunc.py
Outdated
elif tax_func_type == "mono2D":
    mono_interp, _, wsse_cstr, _, _ = monotone_spline(
        df[["total_labinc", "total_capinc"]].values,
        df["etr"].values,
This case should use txrates rather than df["etr"] since lines above adjust txrates for the specific rate chosen (i.e., ETR, MTRx, MTRy).
Also, for consistency with other cases, will want to use X and Y rather than columns straight from df here - same for the weights.
Think your PR to my fork fixed this
Update to mono2d
@rickecon @jdebacker Following up on the last meeting and the comments above I was looking more into the code integrating
I think I should be able to add the necessary logic for the first issue, and maybe an idea like an interpolation of functions could work for the second issue, but I wanted to make sure these are actual issues that need fixing.
@prrathi Thank you for looking through the code to find these issues. Re the Re the
cc @rickecon
Yes, it only relies on the labor income, capital income, and weight columns of the data so would work with both options (regardless of filters on the age column). Also realized the |
@prrathi Now that PR #861 is merged into the Once you sync and make those changes, ping me and I'll run this branch in OG-USA to make sure we can not just estimate the We're getting close! Took a few more changes I didn't anticipate when you set out to do this! |
@jdebacker sorry for the delay on this. Would the best way to do this be to create a branch with my current updates, sync my master with this master, and then work through the two?
@prrathi I think you should Let me know if you have questions about any merge conflicts and I can advise. |
@rickecon @jdebacker the above changes should be set, just adapting the |
@prrathi Thanks for the updates. I just added two comments. With those changes I was able to solve the SS of the OG-USA models using the |
@jdebacker do you mean a review for the code? I don't see anything from my end
ogcore/txfunc.py
Outdated
]
for t in range(income.shape[0])
]
txrates = np.array(txrates) |
Can you make this line:
txrates = np.squeeze(np.array(txrates))
Otherwise the shape is (S, 1) rather than (S,), which causes issues when using the functions in the model.
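A short illustration of the shape issue described in this comment. The values and the size S = 4 are made up for the example; only np.array and np.squeeze come from the suggested line.

```python
import numpy as np

S = 4  # hypothetical number of ages, for illustration only

# Each entry is a length-1 list, mimicking one prediction per row.
txrates = [[0.1 * s] for s in range(S)]

as_array = np.array(txrates)     # shape (S, 1)
squeezed = np.squeeze(as_array)  # shape (S,)
```

Note that np.squeeze drops every length-1 axis, so with S equal to 1 the result would be a 0-dimensional array rather than shape (1,).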
ogcore/txfunc.py
Outdated
splines=[100, 100],
)
wsse = wsse_cstr
params = mono_interp |
Please make this line:
params = [mono_interp]
This will make sure the mono function is in a list and is thus consistent with how the parametric tax functions have their parameters in lists.
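One way to see why a uniform list container helps is the hypothetical dispatcher below. It is not OG-Core code: the name evaluate_tax_rate and the two-parameter linear form are assumptions made only to show that a fitted callable wrapped in a list can be handled through the same params[0] access as numeric parameters.

```python
def evaluate_tax_rate(params, labinc, capinc):
    """Evaluate a tax rate from a params list (hypothetical sketch)."""
    first = params[0]
    if callable(first):
        # Nonparametric case: the list holds a single fitted function.
        return first(labinc, capinc)
    # Parametric case: an illustrative linear form with two coefficients.
    a, b = params
    return a * labinc + b * capinc


# A stand-in for a fitted interpolant, wrapped in a list as requested.
mono_interp = lambda x, y: 0.25
rate_from_function = evaluate_tax_rate([mono_interp], 10.0, 5.0)
rate_from_numbers = evaluate_tax_rate([0.1, 0.2], 10.0, 5.0)
```

With the function left unwrapped, the params[0] access would fail for the spline case, so callers would need a separate code path.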
@prrathi Sorry about that - I forgot to click to submit my review. Please see comments now.
@prrathi. I just submitted a PR to your branch that makes the changes that @jdebacker requested in |
Update version number and jdebacker review comments
@rickecon @jdebacker sounds good, just merged
@prrathi. Thank you for this excellent PR. And thanks @jdebacker for all the maintenance and testing of this. I think this is ready to merge.
This PR addresses issue #828 through pyGAM, which enables monotonic splines for any number of dimensions. It creates a new method within the monotone_spline function, alongside new arguments to specify the method and necessary inputs. It also adds a new function, avg_by_bin_2d, for binning data with any number of x dimensions (maybe the function name should be changed), which is used in the 2D case for the tax data example created in test_txfunc.py. @jdebacker @rickecon