FEAT: Create ExtractWeekOfYear operation #2177
c5a77f6 to 0cee0aa
b44f888 to 6786c58
This PR is ready for review! Thanks!
c771f92 to 6feeb2f
Probably it is not a blocker for review or merge. It is ready for review again. Thanks.
looks good, added some comments to make the code more clear
ibis/omniscidb/operations.py
Outdated
    https://en.wikipedia.org/wiki/ISO_week_date
    """
    # n_weeks_year adjustment
    def _p(year):
Can you replace _p with a descriptive name? Also, the leading _ is not needed: since this function is inside another, marking it as private doesn't add much value (it can't be used anywhere else anyway).
I used p here because it is the name of the function used by the algorithm presented in the link in the See Also section.
But I can change that.
ibis/omniscidb/operations.py
Outdated
    dow = d.day_of_week.index() + 1

    result = doy - dow
    result += 10
I think it makes the code less readable to have all this, instead of simply:
d.day_of_year() - d.day_of_week.index() + 11
You're right. I just followed the algorithm from https://en.wikipedia.org/wiki/ISO_week_date and forgot to refactor it. Thanks!
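The simplification discussed in this thread can be checked with plain Python. The sketch below is not the PR's ibis code: it uses the standard-library `datetime` module, and both function names are made up for illustration. It assumes the weekday index is zero-based with Monday as 0 (which is how `date.weekday()` behaves). Under that assumption, folding the `+ 1` and `+ 10` steps from the diff gives a constant of 9, so the constant in any one-line version is worth re-deriving rather than copying.

```python
from datetime import date


def preliminary_week_stepwise(day: date) -> int:
    """Mirror the step-by-step form shown in the diff."""
    day_of_year = day.timetuple().tm_yday
    day_of_week = day.weekday() + 1  # ISO numbering: Monday=1 .. Sunday=7
    result = day_of_year - day_of_week
    result += 10
    return result // 7


def preliminary_week(day: date) -> int:
    """Same value as a single expression.

    With a zero-based weekday index, the "+ 1" and "+ 10" above
    collapse into "+ 9".
    """
    return (day.timetuple().tm_yday - day.weekday() + 9) // 7


if __name__ == "__main__":
    sample = date(2022, 1, 3)  # a Monday
    print(preliminary_week_stepwise(sample), preliminary_week(sample))
```

The value is only preliminary: near the start or end of a year it can be 0 or 53 and still needs the adjustment described in the linked ISO week date article.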
ibis/omniscidb/operations.py
Outdated
    )

    op = expr.op()
    return translator.translate(_extract_woy_expr(op.args[0]))
Why do we need a nested function here?
Also, things like op = expr.op() seem unnecessary and confusing, if you are going to use the result just once.
You're right. I probably needed that for some reason in the past and forgot to refactor it. Thanks!
ibis/sql/sqlite/compiler.py
Outdated
    )
    / 7
    + 1,
    sa.INTEGER,
I guess it's black that is producing this unreadable code. But you can use a variable here instead.
ok I will improve that. thanks!
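As an illustration of the "use a variable" suggestion, here is a minimal sketch that names each sub-expression before combining them. It is an assumption, not the PR's code: the diff above only shows the tail of the expression (`/ 7`, `+ 1`, `sa.INTEGER`), so this uses a common SQLite idiom for ISO week numbers and the standard-library `sqlite3` module rather than SQLAlchemy. The constant and function names are hypothetical.

```python
import sqlite3
from datetime import date

# Thursday of the ISO week containing :day. Going back three days and then
# forward to the next Thursday ('weekday 4') lands on that week's Thursday.
THURSDAY_OF_WEEK = "date(:day, '-3 days', 'weekday 4')"

# Day of year (001-366) of that Thursday.
DAY_OF_YEAR = f"strftime('%j', {THURSDAY_OF_WEEK})"

# Integer division by 7 turns the day of year into a 1-based week number.
WEEK_OF_YEAR_SQL = f"SELECT CAST(({DAY_OF_YEAR} - 1) / 7 + 1 AS INTEGER)"


def sqlite_week_of_year(connection: sqlite3.Connection, day: date) -> int:
    """Return the ISO week number of `day`, computed by SQLite."""
    row = connection.execute(WEEK_OF_YEAR_SQL, {"day": day.isoformat()}).fetchone()
    return row[0]


if __name__ == "__main__":
    with sqlite3.connect(":memory:") as connection:
        print(sqlite_week_of_year(connection, date(2021, 1, 1)))
```

Naming the intermediate pieces keeps each line short enough that an auto-formatter has no reason to split the arithmetic across several lines.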
thanks for the feedback @datapythonista
I will work on that.
Nice changes, added some comments that I think should improve readability
ibis/omniscidb/operations.py
Outdated
    https://en.wikipedia.org/wiki/ISO_week_date

    """
    # adjustment factor for week year extraction
Do you mind moving this to the docstring of adjustment_factor and providing a proper docstring for it? For me, and I guess for anyone else reading the code, it's not obvious what this is doing. It would be nice to have more information, including examples. I guess this is related to leap years, but it would be nice to know exactly what the reasoning is, without having to spend a decent amount of time reading the code and researching.
ibis/omniscidb/operations.py
Outdated
    def adjustment_factor(year):
        return (year + (year // 4) - (year // 100) + (year // 400)) % 7

    need_adjustment = (adjustment_factor(year) == 4) | (
    - need_adjustment = (adjustment_factor(year) == 4) | (
    + needs_adjustment = (adjustment_factor(year) == 4) | (
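For readers wondering what the adjustment factor does, the following is a hedged sketch in plain Python, based on the ISO week date article linked in the diff rather than on the PR's ibis expressions. The second half of the condition is cut off in the hunk above; the `year - 1` check used here comes from that article and is an assumption about what the PR does. The name `weeks_in_iso_year` is hypothetical.

```python
from datetime import date


def adjustment_factor(year: int) -> int:
    """Return the weekday of 31 December of `year` (0=Sunday .. 6=Saturday).

    The leap-year terms (// 4, // 100, // 400) account for the extra
    days that shift the weekday from one year to the next.
    """
    return (year + year // 4 - year // 100 + year // 400) % 7


def weeks_in_iso_year(year: int) -> int:
    """Return 53 for "long" ISO years and 52 otherwise.

    A year has 53 ISO weeks when 31 December falls on a Thursday
    (factor 4), or when 31 December of the previous year falls on a
    Wednesday (factor 3), which means 1 January is a Thursday.
    """
    needs_adjustment = (
        adjustment_factor(year) == 4 or adjustment_factor(year - 1) == 3
    )
    return 52 + (1 if needs_adjustment else 0)


if __name__ == "__main__":
    for year in (2015, 2020, 2021):
        # 28 December always lies in the last ISO week of its year,
        # so the standard library gives an independent cross-check.
        print(year, weeks_in_iso_year(year), date(year, 12, 28).isocalendar()[1])
```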
ibis/omniscidb/operations.py
Outdated
    def _woy_preliminary(d: Union[ir.DateValue, ir.TimestampValue]) -> ir.Expr:
    """
    Return a preliminary week of year (WOY).
Can you explain in another paragraph what a preliminary week of the year is, please?
ibis/omniscidb/operations.py
Outdated
    Parameters
    ----------
    d : ibis.expr.types.DateValue or ibis.expr.types.TimestampValue
What is d? Would be nice to have a more descriptive name, and also a description if the name is not clear enough.
ibis/omniscidb/operations.py
Outdated
    d = expr.op().args[0]

    w = _woy_preliminary(d)
    y = d.year()
Please, avoid naming variables like d, w, y, and use descriptive names.
ibis/pyspark/compiler.py
Outdated
    @compiles(ops.ExtractQuarter)
    def compile_extract_quarter(t, expr, scope, **kwargs):
Is this intentional? Feels unrelated
you're right, maybe some problem after rebasing ... thanks!
ibis/omniscidb/operations.py
Outdated
    return 52 + (ibis.case().when(need_adjustment, 1).else_(0).end())


    def _woy_preliminary(d: Union[ir.DateValue, ir.TimestampValue]) -> ir.Expr:
Is a function really needed for this? It is used just once; moving these two lines of code inside _extract_woy would keep things much simpler.
ibis/omniscidb/operations.py
Outdated
    return (result // 7).cast('int16')


    def _extract_woy(translator, expr) -> str:
I think being explicit about names will make things easier. In this PR, what woy means is clear from the context. But someone working on something unrelated would need to go and read the docstring, while naming it week_of_year makes the meaning immediately obvious.
    - def _extract_woy(translator, expr) -> str:
    + def _extract_week_of_year(translator, expr) -> str:
bf61069 to d255621
Looks great. Added couple of small comments, to lgtm.
ibis/omniscidb/operations.py
Outdated
    """
    date = expr.op().args[0]

    week = _week_of_year_preliminary(date)
Creating a separate function for _week_of_year_preliminary seems unnecessary and overcomplicated:
    - week = _week_of_year_preliminary(date)
    + week_of_year_without_adjustment = ((date.day_of_year() - date.day_of_week.index() + 11) // 7).cast('int16')
Thanks @datapythonista, I made this change. I just kept the variable name as week and added a comment about it there.
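To show how the preliminary week and the adjustment fit together, here is a self-contained sketch of the full calculation described in the linked ISO week date article. It is plain Python, not the ibis expression from this PR, and all names are hypothetical. It assumes a zero-based weekday index with Monday as 0, which makes the folded constant 9.

```python
from datetime import date


def adjustment_factor(year: int) -> int:
    # Weekday of 31 December (0=Sunday .. 6=Saturday), per the linked article.
    return (year + year // 4 - year // 100 + year // 400) % 7


def weeks_in_iso_year(year: int) -> int:
    # Long ISO years have 53 weeks, all others 52.
    is_long = adjustment_factor(year) == 4 or adjustment_factor(year - 1) == 3
    return 53 if is_long else 52


def week_of_year(day: date) -> int:
    """Return the ISO 8601 week number of `day`."""
    # Preliminary week: may be 0 in early January or 53 in late December.
    week = (day.timetuple().tm_yday - day.weekday() + 9) // 7
    if week < 1:
        # The day belongs to the last week of the previous ISO year.
        return weeks_in_iso_year(day.year - 1)
    if week > weeks_in_iso_year(day.year):
        # The day belongs to week 1 of the next ISO year.
        return 1
    return week


if __name__ == "__main__":
    for sample in (date(2021, 1, 1), date(2019, 12, 30), date(2022, 6, 15)):
        print(sample, week_of_year(sample), sample.isocalendar()[1])
```

Comparing against `date.isocalendar()` over a range of dates is a cheap way to catch off-by-one mistakes in the constant or in the two boundary cases.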
ibis/omniscidb/operations.py
Outdated
    https://en.wikipedia.org/wiki/ISO_week_date

    """
    # adjustment factor for week year extraction
Can you remove the line # adjustment factor for week year extraction please. I think it was left here unintentionally.
@datapythonista the comment and I will replace it by # noqa: D202
Try to keep in mind that eventually someone will read that code, and would wonder what is the reason for it. Adding a comment for the reader, besides the comment for the linter, will make our future lives easier. Something like:
ibis/omniscidb/operations.py
Outdated
    .end()
    )

    return translator.translate(result)
why are you writing these as ibis operations? if the db doesn't support them, then don't add them.
ok. removed. thanks.
0310253 to 3d09d2f
@xmnlab there are many conflicting files here; can you rebase, please?
b566abe to ebc168a
@datapythonista thanks! rebased!
ebc168a to 99ad527
ibis/sql/sqlite/compiler.py
Outdated
    @@ -144,6 +144,39 @@ def _extract_epoch_seconds(t, expr):
    return sa.cast(sa_expr, sa.BigInteger)


    def _extract_week_of_year(t, expr):
Not sure if I'm missing something, but in this comment Jeff said that if the db doesn't support the operation, let's simply not have it. Shouldn't we be removing this function then?
I can remove it too. I will push the changes in a bit.
Thanks @xmnlab, lgtm
034b95f to 9c63372
@jreback I think your comments were addressed. Do you mind having another look here? I think this should be ready.
Create ExtractWeekOfYear operation and add its support to Clickhouse, CSV, MySQL, Pandas, Parquet, PostgreSQL, PySpark, SQLite and Spark
notes: