Parse Age without Date of Birth #125
Types of changes
I added an alternate parsing function for Deceased fields with no DOB available. Also, I changed the existing parsing functions to convert date strings to datetimes earlier, and then changed the lines that compared datetimes to use < and <= instead of the custom functions is_posterior and is_in_range. (The docstring for is_posterior seems to be wrong about what the function does, but I tried to match its actual functionality.)
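For illustration, the comparison change described above might look roughly like this. It is a minimal sketch, not the code from this PR: the helper `to_datetime`, the `%m/%d/%Y` format, and the sample dates are all assumptions made for the example.

```python
from datetime import datetime


def to_datetime(value, fmt="%m/%d/%Y"):
    """Hypothetical helper: convert a date string to a datetime as early as possible."""
    return datetime.strptime(value, fmt)


# Made-up sample values, converted once up front.
start = to_datetime("01/01/2019")
end = to_datetime("01/31/2019")
current = to_datetime("01/16/2019")

# A chained comparison covers the kind of check a range helper would wrap.
in_range = start <= current <= end

# A strict comparison covers the kind of check an ordering helper would wrap.
is_after_start = start < current
```

Once the values are datetimes, the builtin operators work directly on them, so no string-level comparison logic is needed.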
This is to fix issue #92, Deceased fields that can't be parsed because they contain no DOB.
I added one new test case with the Deceased field that was failing. That one is passing. There's still one record that's failing and showing age -1, but that seems to be a different issue, involving a DOB format with spaces in it. I also changed some tests to expect datetimes instead of strings.
By the way, I accidentally added these commits on top of my old commits that aren't relevant to this pull request, but I decided to submit the pull request anyway before I make it worse in an attempt to modify the git history. The overall diff for the pull request looks correct to me.
rgreinho left a comment
Good job with the PR. I am just doing a small pre-review until all the checks pass (note to self: I should have a bot sending me reminders).
My main remark on this PR, however, is that all the values should be strings. If you use a datetime object, we cannot serialize the entity to JSON without adding custom logic. So this part should be reverted, unless you find a workaround.
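One possible workaround, sketched here only as an assumption about how it could be done, is the `default` hook of `json.dumps` from the standard library. The function name `serialize_date`, the `DOB` key, and the `%m/%d/%Y` output format are hypothetical and not taken from the project.

```python
import datetime
import json


def serialize_date(value):
    """Fallback for json.dumps: render dates as strings, reject anything else."""
    # datetime.datetime is a subclass of datetime.date, so one check covers both.
    if isinstance(value, datetime.date):
        return value.strftime("%m/%d/%Y")
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


# Hypothetical entry holding a datetime instead of a string.
entry = {"DOB": datetime.datetime(1966, 8, 30)}
encoded = json.dumps(entry, default=serialize_date)
```

With this approach the entity can keep datetime values internally, and the conversion to a string happens only at serialization time.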
Hmm, I didn't know there were no tests for those formatters. I added some, but not for the Google Sheets formatter because I don't really understand its requirements. So I don't know whether this PR breaks your Google Sheets workflow, @rgreinho. Could you maybe tell me whether it does, or write a test describing the expected behavior?
About the custom operator functions I want to delete: I think a big advantage of using the builtin operators is that you don't have to test or document them.
I do agree that
rgreinho left a comment
Great job with the tests for the formatters! I did not write any because they were very simple and just using the stdlib, but I am definitely a big fan of adding tests for them, especially as we're adding custom logic to them.
Delaying the conversion of the values to strings is a good idea, and allows us to manipulate Python objects for all the operations without having to convert their values back and forth. I just want to ensure we generate the same output for all the formatters (CSV, print), and you did a good job reformatting the date for JSON.
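As a rough sketch of what "same output for all the formatters" could mean for CSV, one option is to convert dates to strings in a single shared step just before writing. The `stringify` helper, the `DATE_FORMAT` constant, and the field names below are assumptions for the example, not the project's actual formatter code.

```python
import csv
import datetime
import io

DATE_FORMAT = "%m/%d/%Y"  # assumed output format


def stringify(entry):
    """Hypothetical helper: return a copy of entry with dates rendered as strings."""
    return {
        key: value.strftime(DATE_FORMAT) if isinstance(value, datetime.date) else value
        for key, value in entry.items()
    }


# Made-up entry that keeps a date object until output time.
entry = {"Case": "hypothetical-case", "Date": datetime.date(2019, 1, 16)}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(entry))
writer.writeheader()
writer.writerow(stringify(entry))
csv_text = buffer.getvalue()
```

Routing every formatter through one conversion step like this would keep the date representation identical across JSON, CSV, and printed output.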
Regarding the custom operators, I would really like to keep them. I added the missing tests and fixed and updated the documentation, so I think there is no reason to delete them. Besides, I strongly think that they read better than the operators.
I don't think fixing #128 will be such a big deal because we already have half of the logic.
>>> # https://austintexas.gov/news/traffic-fatality-29-3
>>> from dateparser.search import search_dates
>>> search_dates('Patrick Leonard Ervin, Black male, D.O.B. August 30, 1966')
[('August 30, 1966', datetime.datetime(1966, 8, 30, 0, 0))]