-
Notifications
You must be signed in to change notification settings - Fork 598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEAT: Adding more window operations to OmniSciDB #1976
FEAT: Adding more window operations to OmniSciDB #1976
Conversation
ae0472a
to
79c9a49
Compare
9c72721
to
db07745
Compare
Currently CI is raising some errors: py36 building:
doc building
|
db07745
to
4055e04
Compare
@jreback any more thought about this PR? let me know and if it is OK .. I will rebase to fix the conflict on release.rst |
@jreback any more thought about this PR? let me know and if it is OK .. I will rebase to fix the conflict on release.rst |
hey @jreback ! a gentle reminder about this PR :) |
ibis/tests/all/test_window.py
Outdated
result.extend(sub_result) | ||
if diff > 0: | ||
diff -= 1 | ||
return pd.Series(result, index=x.index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you evaluating directly in the api like this?
are the ntile ops not defined?
these should be dispatched generically, not done like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas doesn't have ntile
operation.
the most similar operation isqcut
but it seems there are some differences.
what is your recommendation in this case?
7971a1a
to
9871e6f
Compare
ibis/tests/all/test_window.py
Outdated
""" | ||
# internal ntile function | ||
def _ntile(x: pandas.Series, bucket: int): | ||
n = x.shape[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you actually trying to execute operations inside api definitions?
this is an anti pattern in ibis
further these are actually definitions of pandas ops and not omnisci db native ops
so am puzzled what you are attempting to accomplish
the way to do this is to define these as pandas ops
the api then should return these ops and ibis will dispatch to them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just created that because AFAIK currently there is no way to test ibis ntile operation ...
so that would be just a temporary way to test ibis ntile operations
just for clarification, by pandas ops
do you mean 1) ibis pandas ops or really 2) pandas ops?
if it is 2) .. as it could take an extra time, probably I would prefer to remove ntile here and implement that in a different PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure you are following. This completely breaks the ibis style & philosophy. I am really not sure what your end goal is Does omniscidb actually have these functions? if so certainly you can add them that backend, but this shouldn't have anything to do with pandas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes .. maybe I am not following you ... what do you mean by api definitions
? because I added that just to test_window.py
it shouldn't be available into the API.
I can remove this code with no problem .. my question is .. how can I test omniscidb ntile operation? normally we just compare the result from the backend to the result from pure pandas operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no ideal what these operations actually do. Some documentation and examples would go along way. You need to write a much more explict test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds fair @jreback thanks for the feedback ... I will probably move ntile to a new PR and I will do it for sure. thanks!
9871e6f
to
a7a7eec
Compare
@jreback I removed the ntile operation from this PR. it is done for a new review. thanks! |
@xmnlab pls rebase. this looks fine. you missed my point above. I don't think is is necessary to add any operations to pandas proper itself (and will likely be rejected). But on the ibis pandas back-end I think you could simply implement these and would provide the good testing comparison. |
a7a7eec
to
ed0ecd0
Compare
rebase done, thanks @jreback ! I just opened that issue there to see if there is any interesting from the pandas community for that implementation .. and also check if there are other ways to get the desired result :) about your suggestion about to implement that on pandas backend .. that is reasonable, I will need to create a separated test for that ... because on tests/all it compares the result between each backend and pandas (pure) .. and in this new case it will check the results between each backend and pandas backend (let me know if I am missing anything ...) again, thanks for the review and suggestions! |
this is ok thanks @xmnlab |
In this PR, added the follow window operations: