Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reducing cyclomatic complexity of search and addressing offset awareness issue #881

Merged
merged 6 commits into from
Feb 18, 2021

Conversation

SimplicityGuy
Copy link
Contributor

Includes 2 contributions:

  1. Since the search functionality is complex, the first change addresses cyclomatic complexity and deep indentation in the search code. This should make the code easier to read and maintain.
  2. Addresses TypeError: can't compare offset-naive and offset-aware datetimes #679 by handling timezone offset awareness.

* fix: scoping where the language is retrieved
* chore: reduce cyclomatic complexity and indentation on complicated code
@codecov
Copy link

codecov bot commented Feb 13, 2021

Codecov Report

Merging #881 (63d8bed) into master (615c729) will increase coverage by 0.04%.
The diff coverage is 98.48%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #881      +/-   ##
==========================================
+ Coverage   98.15%   98.20%   +0.04%     
==========================================
  Files         229      229              
  Lines        2495     2504       +9     
==========================================
+ Hits         2449     2459      +10     
+ Misses         46       45       -1     
Impacted Files Coverage Δ
dateparser/search/search.py 99.34% <98.27%> (+<0.01%) ⬆️
dateparser/parser.py 99.01% <100.00%> (+0.01%) ⬆️
dateparser/search/__init__.py 100.00% <100.00%> (ø)
dateparser/freshness_date_parser.py 99.04% <0.00%> (+0.95%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 615c729...63d8bed. Read the comment docs.

@SimplicityGuy
Copy link
Contributor Author

Oooo, ok. I'll add an additional test here.

@SimplicityGuy
Copy link
Contributor Author

hey @noviluni So seems like the CC is complaining because I haven't hit this line.

Totally understand. Here's the problem... after code tracing, it doesn't seem that dateobj will ever have tzinfo as the code is currently written, so in practice this line of code is dead. My concern is that if the code changes to now start adding at tzinfo, this bug will be reintroduced. While I can easily remove this path, I do not feel that for the longer term this is the right thing to do.

Thoughts?

@Gallaecio
Copy link
Member

What about adding an assert for dateobj’s tzinfo being None, with a comment that explains that if it ever becomes possible for that to be something other than None, the code needs to change, and describes exactly how it needs to change?

@SimplicityGuy
Copy link
Contributor Author

What about adding an assert for dateobj’s tzinfo being None, with a comment that explains that if it ever becomes possible for that to be something other than None, the code needs to change, and describes exactly how it needs to change?

Great idea! Totally forgot about using an assert here. Changing it now.

… that future maintainers can address this if the condition becomes possible.
Copy link
Collaborator

@noviluni noviluni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @SimplicityGuy! Good job here!

The search functionality has to be refactored and checked because it is hard to read and there are a bunch of known bugs: https://github.com/scrapinghub/dateparser/issues?q=is%3Aissue+is%3Aopen+label%3Asearch_dates

Thank you very much for addressing this! I left one minor comment.

@@ -447,6 +448,19 @@ def _correct_for_time_frame(self, dateobj):

dateobj = dateobj + delta

# NOTE: If this assert fires, self.now needs to be made offset-aware in a similar
# way that dateobj is temporarily made offset-aware.
assert not (self.now.tzinfo is None and dateobj.tzinfo is not None), "Review comment for details."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SimplicityGuy
I know that it doesn't seem to happen, but Imagine we introduce the bug without noticing and some user gets this error. I think we should do the error message a bit more self-explanatory. What about something like...?:

Suggested change
assert not (self.now.tzinfo is None and dateobj.tzinfo is not None), "Review comment for details."
assert not (self.now.tzinfo is None and dateobj.tzinfo is not None), "`self.now` shouldn't have tzinfo. Review comment in code for details."

or

Suggested change
assert not (self.now.tzinfo is None and dateobj.tzinfo is not None), "Review comment for details."
assert not (self.now.tzinfo is None and dateobj.tzinfo is not None), "Timezone info is wrong. Review comment in code for details."

In that way, people affected by this (either developers or users) will be able to provide more context if they find this error.

@Gallaecio let me know what you think

@noviluni
Copy link
Collaborator

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants