Missing birth date for one individual in ERFS_FPR 2009 and 2010 causes build to crash #184

Open
elie-gerschel opened this issue Jan 15, 2020 · 2 comments

@elie-gerschel
Contributor

The issue appeared when building either ERFS_FPR 2010 or ERFS_FPR 2009.

Here is what I did:

from openfisca_france_data import france_data_tax_benefit_system
from openfisca_france_data.erfs_fpr.get_survey_scenario import get_survey_scenario

tax_benefit_system = france_data_tax_benefit_system

survey_scenario = get_survey_scenario(
    tax_benefit_system = tax_benefit_system,
    year = 2009,
    rebuild_input_data = True,
)

Here is what actually happened:

File "C:\Users\elieg\openfisca-france-data\openfisca_france_data\erfs_fpr\input_data_builder\step_03_variables_individuelles.py", line 141, in create_variables_individuelles
year = year)

File "C:\Users\elieg\openfisca-france-data\openfisca_france_data\erfs_fpr\input_data_builder\step_03_variables_individuelles.py", line 942, in create_date_naissance
'day': day_birth,

File "C:\Users\elieg\Anaconda3\envs\ipp\lib\site-packages\pandas\util\_decorators.py", line 208, in wrapper
return func(*args, **kwargs)

File "C:\Users\elieg\Anaconda3\envs\ipp\lib\site-packages\pandas\core\tools\datetimes.py", line 781, in to_datetime
result = _assemble_from_unit_mappings(arg, errors, box, tz)

File "C:\Users\elieg\Anaconda3\envs\ipp\lib\site-packages\pandas\core\tools\datetimes.py", line 906, in _assemble_from_unit_mappings
raise ValueError("cannot assemble the " "datetimes: {error}".format(error=e))

ValueError: cannot assemble the datetimes: time data '608' does not match format '%Y%m%d' (match)

Data values

Attached are the data values for the two individuals that I think cause the issue: their year of birth (variable "naia") is 0.

problem_guy_2009.xlsx
problem_guy_2010.xlsx
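
A minimal sketch of how a birth year of 0 could produce the '608' in the error message, assuming the faulty individual was born on 8 June (the month and day here are made-up values, not taken from the attached files). As far as I understand it, pandas packs the year, month and day columns into a single YYYYMMDD integer before parsing, so a year of 0 leaves only the month and day digits:

```python
import pandas as pd

# Hypothetical rows: the second one mimics the faulty individual (year of birth 0).
components = pd.DataFrame({"year": [1975, 0], "month": [3, 6], "day": [14, 8]})

# Same arithmetic pandas applies before parsing with the '%Y%m%d' format.
packed = components["year"] * 10000 + components["month"] * 100 + components["day"]
print(packed.tolist())

try:
    pd.to_datetime(components)
    failed = False
except ValueError as error:
    # Expected to be a "cannot assemble the datetimes" error.
    failed = True
    print(error)
```

The second packed value has only three digits, which cannot match '%Y%m%d'.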

Context

I identify more as an Economist (I make microsimulations with real populations).

@benjello
Member

You should add a check on naia values using an assert and/or explicitly fix the erroneous values upstream.
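
A rough sketch of what such a check and fix could look like. The helper names `invalid_naia` and `fix_naia_from_age`, the `age` column and the 120-year bound are assumptions made for illustration; they are not existing functions of openfisca-france-data, and the right imputation rule depends on what the survey actually provides:

```python
import pandas as pd


def invalid_naia(individus, year, max_age=120):
    """Boolean mask of rows whose birth year is outside [year - max_age, year]."""
    # Missing values also count as invalid, since between() returns False for NaN.
    return ~individus["naia"].between(year - max_age, year)


def fix_naia_from_age(individus, year, age_column="age"):
    """Replace implausible birth years with year - age (assumes an age column exists)."""
    fixed = individus.copy()
    mask = invalid_naia(fixed, year)
    fixed.loc[mask, "naia"] = year - fixed.loc[mask, age_column]
    return fixed


# Hypothetical data: the second individual has a birth year of 0.
individus = pd.DataFrame({"naia": [1975, 0], "age": [34, 41]})
print(individus[invalid_naia(individus, 2009)])  # show the offending rows first

individus = fix_naia_from_age(individus, 2009)
assert not invalid_naia(individus, 2009).any(), "implausible naia values remain"
```

Running the assert just before the dates are assembled would make the build fail with an explicit message instead of the pandas parsing error.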

@benjello
Member

@elie-gerschel: I am assigning this issue to you. Get back to me if you can't solve it by yourself.
