feat: implement time grain in temporal filters #24035

Merged on May 12, 2023 (3 commits)

@villebro villebro commented May 12, 2023

SUMMARY

While QueryObjectFilterClause already accepts a grain:

    class QueryObjectFilterClause(TypedDict, total=False):
        col: Column
        op: str  # pylint: disable=invalid-name
        val: Optional[FilterValues]
        grain: Optional[str]
        isExtra: Optional[bool]

the grain was not used when constructing the actual WHERE clause. This PR applies the time grain to temporal filters, making it possible to filter by full time periods. For example, with a weekly time grain the date 2000-01-01 (a Saturday) becomes 1999-12-27 (the previous Monday). Without applying the time grain to the temporal range, the first week in the query would contain only two days (Saturday and Sunday), which can give the impression that no events took place Monday through Friday.
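To make the weekly example concrete, here is a minimal sketch in plain Python. It is not the code from this PR (Superset applies the grain through the database's own SQL expression); `floor_to_week` is a hypothetical helper, and a Monday-based week is assumed.

```python
from datetime import date, timedelta


def floor_to_week(day: date) -> date:
    """Return the Monday starting the week that contains `day`.

    Hypothetical helper for illustration only; assumes weeks start on Monday.
    """
    # date.weekday() is 0 for Monday and 6 for Sunday.
    return day - timedelta(days=day.weekday())


range_start = date(2000, 1, 1)  # a Saturday
week_start = floor_to_week(range_start)

# Number of days of the first weekly bucket that pass the filter when the
# lower bound is NOT floored to the grain.
next_week_start = week_start + timedelta(days=7)
days_without_grain = (next_week_start - range_start).days

print(week_start.isoformat(), days_without_grain)
```

With the lower bound floored to `week_start`, the first weekly bucket covers all seven days instead of only the weekend.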

While writing the test I noticed many redundant split strings (they were likely split originally to respect the line width, but were later moved onto the same line by Black). I cleaned up the ones I found under db_engine_specs.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@@ -1945,7 +1958,6 @@ def get_sqla_query( # pylint: disable=too-many-arguments,too-many-locals,too-ma
inner_groupby_exprs = []
inner_select_exprs = []
for gby_name, gby_obj in groupby_series_columns.items():
label = get_column_name(gby_name)

bycatch - this variable was not used anywhere (apparently this function is so bloated nowadays that the linters just give up on it)

codecov bot commented May 12, 2023

Codecov Report

Merging #24035 (91b993d) into master (7fe0ca1) will increase coverage by 0.00%.
The diff coverage is 65.21%.

❗ Current head 91b993d differs from pull request most recent head 83f70b3. Consider uploading reports for the commit 83f70b3 to get more accurate results

@@           Coverage Diff           @@
##           master   #24035   +/-   ##
=======================================
  Coverage   68.21%   68.22%           
=======================================
  Files        1941     1941           
  Lines       75259    75261    +2     
  Branches     8168     8168           
=======================================
+ Hits        51341    51344    +3     
+ Misses      21829    21828    -1     
  Partials     2089     2089           
Flag       Coverage           Δ
hive       53.18% <39.13%>    (-0.01%) ⬇️
mysql      78.94% <47.82%>    (+0.01%) ⬆️
postgres   79.01% <47.82%>    (+0.01%) ⬆️
presto     53.10% <39.13%>    (-0.01%) ⬇️
python     82.78% <65.21%>    (+<0.01%) ⬆️
sqlite     77.53% <47.82%>    (+0.01%) ⬆️
unit       53.05% <39.13%>    (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files                        Coverage Δ
superset/charts/api.py                85.85% <ø> (ø)
superset/dashboards/api.py            92.58% <ø> (ø)
superset/databases/api.py             91.58% <ø> (ø)
superset/datasets/api.py              88.00% <ø> (ø)
superset/db_engine_specs/db2.py       91.66% <ø> (ø)
superset/db_engine_specs/hive.py      87.50% <ø> (ø)
superset/db_engine_specs/mysql.py     98.82% <ø> (ø)
superset/extensions/__init__.py       98.86% <0.00%> (ø)
superset/importexport/api.py          100.00% <ø> (ø)
superset/initialization/__init__.py   91.43% <0.00%> (ø)
... and 13 more


@@ -1353,6 +1363,7 @@ def get_timestamp_expression(
"""
Return a SQLAlchemy Core element representation of self to be used in a query.

:param column: column object

bycatch - missing docstring

"PT1S": "CAST({col} as TIMESTAMP)" " - MICROSECOND({col}) MICROSECONDS",
"PT1S": "CAST({col} as TIMESTAMP) - MICROSECOND({col}) MICROSECONDS",

cleanup - that split is no longer needed:

>>> "CAST({col} as TIMESTAMP)" " - MICROSECOND({col}) MICROSECONDS" == "CAST({col} as TIMESTAMP) - MICROSECOND({col}) MICROSECONDS"
True

@michael-s-molina michael-s-molina left a comment

LGTM

@villebro villebro merged commit f7dd52b into apache:master May 12, 2023
29 checks passed
@mistercrunch added the 🏷️ bot and 🚢 3.0.0 labels on Mar 8, 2024
Labels: 🏷️ bot, size/M, 🚢 3.0.0
3 participants