-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kindergeldübertrag #751
The head ref may contain hidden characters: "kindergeld\u00FCbertrag_new"
Kindergeldübertrag #751
Conversation
…ests accordingly.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #751 +/- ##
==========================================
+ Coverage 89.46% 89.51% +0.04%
==========================================
Files 51 52 +1
Lines 3656 3682 +26
==========================================
+ Hits 3271 3296 +25
- Misses 385 386 +1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
|
||
aggregate_by_p_id_kindergeldübertrag = { | ||
"kindergeld_übertrag_m": { | ||
"p_id_to_aggregate_by": "p_id_kindergeld_empf", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just thinking of a weird case here: what if Kindergeldempfänger is outside the BG? E.g. ex-partner. Is that even allowed? Do we have/need a check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The concept of Kindergeldübertrag is that the Kindergeldempfänger is in a different BG than the child (and hence there is Kindergeldübertrag). Are you thinking about different fg
s?
I.e.
p_id | hh_id | fg_id | bg_id | p_id_kindergeld_empf | p_id_elternteil_1 | p_id_elternteil_2 | kind |
---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | -1 | -1 | -1 | false |
1 | 0 | 1 | 1 | -1 | -1 | -1 | false |
2 | 0 | 0 | 2 | 1 | 0 | 1 | true |
I'd argue that this is not possible because then, the child should count towards the fg
of p_id==1
. But we don't have a test for that, we don't account for separated couples in the same household when creating fg_id
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I meant fg's.
Precondition for fg is living in the same household. I just wonder whether it is possible that the other parent (i.e., of parents living in separate households) gets the Kindergeld. And if not (what I'd kinda expect), whether we check that as a condition on the data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll create an issue on the creation of fg_id
. Intuitively, the child should form an fg
with the parent that gets the Kindergeld (but could be something else of course). It's not clear to me what GETTSIM is currently doing in that regard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure the fg conditions on being part of the same household. My main question is whether Kindergeld does the same when parents are separated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me rephrase, that was not clear:
My main question is whether kindergeldempfänger
conditions on that parent being part of the same household when parents are separated — and we are talking about means-tested transfers (ofc parents receive kindergeld for kids at college even though they may not be in the same household).
(I am just worried that the machinery here depends implicitly on this assumption being true, would like to make it explicit)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay so my point was that the creation of fg_id
in case parents have a child, are separated but live in the same household should react to who claims Kindergeld.
Your point is that if parents live in different households p_id_kindergeldempfänger
should be compatible with the place of living of the child (if such a requirement exists, presumably it does) if one of the involved individuals gets transfers?
(I am just worried that the machinery here depends implicitly on this assumption being true, would like to make it explicit)
We create neither of the variables endogenously. I'm not sure to which extent creating a consistency check here would create a circle in the DAG.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your point is that if parents live in different households
p_id_kindergeldempfänger
should be compatible with the place of living of the child (if such a requirement exists, presumably it does) if one of the involved individuals gets transfers?
Exactly.
(I am just worried that the machinery here depends implicitly on this assumption being true, would like to make it explicit)
We create neither of the variables endogenously. I'm not sure to which extent creating a consistency check here would create a circle in the DAG.
Would be a check pre-DAG, just whether the data passed conforms to expectations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But how do we check pre-DAG whether the involved individuals get transfers?
If I understood it correctly, if all involved individuals don't have their SGB II Bedarf gedeckt, we want to allow for
p_id | hh_id | fg_id | bg_id | p_id_kindergeld_empfänger | kind |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | -1 | false |
1 | 0 | 0 | 0 | 0 | true |
2 | 1 | 1 | 1 | -1 | false |
but we don't want to allow for / it would be inconsistent if
p_id | hh_id | fg_id | bg_id | p_id_kindergeld_empfänger | kind |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | -1 | false |
1 | 0 | 0 | 0 | 2 | true |
2 | 1 | 1 | 1 | -1 | false |
But if all adults have their SGB II Bedarf gedeckt, the latter situation would be perfectly fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine, let's not get carried away here. It may be more about alleinerziehend
status (or, more precisely, having unterhaltsanspruch
) and kindergeld
, basically it is baked in there already (deduction of 1/2 kindergeld
from the alimony payment). We should still think of a check for that, but no need to do it here.
As always, closing an issue by thinking through it opens at least three new ones... 😂
Is that skipped test related to #696? I thought the ALG2 comment there referred to other adults (who could work in principle), but not underage children? |
Yes, that's the bug we discovered some time ago. The skipped test shows that the We could fix this with a |
Cool, pick whatever is easier. |
…est case for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent! No show-stoppers left AFAICT!
_mean_kindergeld_per_child_m: float, | ||
p_id_kindergeld_empf: np.ndarray[int], | ||
p_id: np.ndarray[int], | ||
) -> float: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returns an array of floats, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the following error in the _convert_data_to_correct_types
when using numpy.ndarray[float]
:
ValueError: The data types of the following columns are invalid:
E
E - kindergeld_zur_bedarfsdeckung_m: The internal type numpy.ndarray[float] is not yet supported.
E
E Note that conversion from floating point to integers or Booleans inherently suffers from approximation error. It might well be that your data seemingly obey the restrictions when scrolling through them, but in fact they do not (for example, because 1e-15 is displayed as 0.0).
E The best solution is to convert all columns to the expected data types yourself.
Other functions in the code use numpy.ndarray[bool]
as expected return and this works. Any idea @lars-reimann?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like GETTSIM raises this error itself. Generally, one should never use floating point numbers to represent exact monetary values, for the reasons cited in the error message. But if a close approximation is fine, we could adjust the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that we check for the type of an input Series in convert_series_to_internal_type
, which needs to be float
, bool
etc.:
277 internal_type = None
278 if column_name in TYPES_INPUT_VARIABLES:
279 internal_type = TYPES_INPUT_VARIABLES[column_name]
280 elif (
281 column_name in functions_overridden
282 and "return" in functions_overridden[column_name].__annotations__
283 ):
284 internal_type = functions_overridden[column_name].__annotations__["return"]
285
286 # Make conversion if necessary
287 if internal_type and not check_series_has_expected_type(series, internal_type):
288 try:
289 data[column_name] = convert_series_to_internal_type(
290 series, internal_type
291 )
292 collected_conversions.append(
293 f" - {column_name} from {series.dtype} "
294 f"to {internal_type.__name__}"
295 )
296
297 except ValueError as e:
298 breakpoint()
299 -> collected_errors.append(f" - {column_name}: {e}")
(Pdb++) column_name
'kindergeld_zur_bedarfsdeckung_m'
(Pdb++) internal_type
numpy.ndarray[float]
(Pdb++) check_series_has_expected_type(series, internal_type)
False
(Pdb++) check_series_has_expected_type(series, float)
True
(Pdb++) functions_overridden[column_name].__policy_info__
*** AttributeError: 'function' object has no attribute '__policy_info__'
When we use the correct type annotation for a function that is already vectorised, this fails. @lars-reimann, can you think of a sensible solution here? Probably means making a case distinction regarding the definition of internal_type
on line 284 above, depending on whether the function has the skip_vectorization
decorator, though I am afraid the last line of my debugging shows we haven't applied it at that point.
Other functions in the code use
numpy.ndarray[bool]
as expected return and this works.
Most likely we never use them as input columns in tests directly, so convert_series_to_internal_type
never gets called. Although I have not checked.
@MImmesberger: I think the short-term hack here is to leave the annotation at float
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The attribute added by the decorator is called __info__
, not __policy_info__
. Something like this should allow detecting whether the function is vectorized:
gettsim/src/_gettsim_tests/test_docs.py
Line 139 in add6232
if hasattr(func, "__info__") and func.__info__["skip_vectorization"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, misremembered! Thanks!
What problem do you want to solve?
Closes #742
The implementation will still rely on manually removing children from their Bedarfsgemeinschaft (via eigenbedarf_gedeckt) until the corresponding issue #622 has been addressed.
(This PR is based on #744)