filter: Use intermediate columns for grouping #1070

Merged
victorlin merged 3 commits into master from victorlin/filter/use-temporary-date-columns on Oct 25, 2022

victorlin
Member

@victorlin victorlin commented Oct 24, 2022

Description of proposed changes

Previously:

  1. year/month/day columns were created on the DataFrame used for grouping.
  2. month was overwritten with (year, month).

This allowed grouping by year, but it also allowed grouping by day, which was an unintended side effect. Instead of adding columns with those names, use names with a more distinct prefix so they can be safely discarded before grouping happens.
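
For illustration only, here is a minimal sketch of the idea rather than the actual code in augur/filter.py. The prefix string `__tmp_date_`, the function name `add_group_by_columns`, and the assumption that every date is a complete `YYYY-MM-DD` string are all hypothetical.

```python
import pandas as pd

# Hypothetical prefix; the real string used in augur/filter.py may differ.
TMP_PREFIX = "__tmp_date_"


def add_group_by_columns(metadata, group_by):
    """Return a copy of `metadata` with only the requested date-derived columns.

    Assumes a 'date' column of complete 'YYYY-MM-DD' strings (an assumption
    of this sketch; ambiguous dates are not handled here).
    """
    df = metadata.copy()

    # Split the date into temporary columns whose names cannot collide
    # with user-facing group-by names such as 'year', 'month', or 'day'.
    parts = df["date"].str.split("-", expand=True).astype(int)
    parts.columns = [TMP_PREFIX + name for name in ("year", "month", "day")]
    df = pd.concat([df, parts], axis=1)

    # Expose only the columns that were explicitly requested.
    if "year" in group_by:
        df["year"] = df[TMP_PREFIX + "year"]
    if "month" in group_by:
        # Month is stored as (year, month) so that the same month in
        # different years lands in different groups.
        df["month"] = list(zip(df[TMP_PREFIX + "year"], df[TMP_PREFIX + "month"]))

    # Discard every temporary column before grouping, so 'day' never
    # becomes an accidental group-by key.
    tmp_columns = [c for c in df.columns if c.startswith(TMP_PREFIX)]
    return df.drop(columns=tmp_columns)


metadata = pd.DataFrame(
    {"strain": ["a", "b"], "date": ["2022-01-15", "2022-02-03"]}
)
grouped_input = add_group_by_columns(metadata, ["month"])
```

With this shape, requesting only month leaves no year or day column on the result, so any later attempt to group by day has nothing to find.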

Related issue(s)

Fixes #1069

Testing

  • Checks pass

Checklist

  • Add a message in CHANGES.md summarizing the changes in this PR. Keep headers and formatting consistent with the rest of the file.

@victorlin victorlin self-assigned this Oct 24, 2022
@victorlin victorlin requested a review from a team October 24, 2022 21:44
@victorlin victorlin marked this pull request as ready for review October 24, 2022 21:44
@codecov

codecov bot commented Oct 24, 2022

Codecov Report

Base: 61.77% // Head: 61.80% // Increases project coverage by +0.03% 🎉

Coverage data is based on head (7eacd26) compared to base (0a5bfc1).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1070      +/-   ##
==========================================
+ Coverage   61.77%   61.80%   +0.03%     
==========================================
  Files          52       52              
  Lines        6316     6321       +5     
  Branches     1550     1551       +1     
==========================================
+ Hits         3902     3907       +5     
  Misses       2141     2141              
  Partials      273      273              
Impacted Files Coverage Δ
augur/filter.py 96.00% <100.00%> (+0.03%) ⬆️


@corneliusroemer
Member

Grouping by day wouldn't be that terrible as a side effect 🙃

@victorlin victorlin force-pushed the victorlin/update-filter-group-by branch from f2a6d71 to 3043718 Compare October 25, 2022 17:33
Base automatically changed from victorlin/update-filter-group-by to master October 25, 2022 17:47
Contributor

@joverlee521 joverlee521 left a comment

It's nice to remove the side effect of grouping by day. This is open enough to add grouping by day as an official group-by category in the future if requested.

I only have one comment on the internal prefix string.

augur/filter.py (outdated; resolved)
@victorlin victorlin force-pushed the victorlin/filter/use-temporary-date-columns branch from 8375de9 to 7eacd26 Compare October 25, 2022 20:22
@victorlin victorlin merged commit 0826806 into master Oct 25, 2022
@victorlin victorlin deleted the victorlin/filter/use-temporary-date-columns branch October 25, 2022 20:37
@tsibley tsibley mentioned this pull request Oct 25, 2022
3 tasks
Successfully merging this pull request may close these issues.

filter: Grouping by day works when it shouldn't
3 participants