Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEAT-#4419: Extend virtual partitioning API to pandas on Dask #4420
FEAT-#4419: Extend virtual partitioning API to pandas on Dask #4420
Changes from 3 commits
28cdc19
15b0072
ae4eb6c
026bd14
fa377a8
b9d68f7
17a9c65
348272d
6784b4b
4efd83b
af4edbc
0ac6de3
8c44eb1
a5af1ff
971e00a
87cad7c
0de19cd
4f5dd50
f5d3eb1
5e748cd
0cc25c4
43296b2
66c4360
9f48f0d
8dab0d1
2d9f150
9a945e8
5b322ee
a761182
6a9855c
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Instead of having two different loops here, can we combine them into one if/else case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe a better variable name than
o
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be located near to
axis = None
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - I can open a refactor PR for that if you think it warrants it, or I can just roll it in with another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either way of these is fine to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm, but this means that our perspective of a partition might contain N dask partitions, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup - a virtual partition is just one or more block partitions (physical partitions) and an axis that they are aligned along.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just have the if check for this encapsulated within
drain_call_queue
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question - tbh I don't think there's really any huge reason to do so. It does save an extra call on the stack, but thats an extremely minuscule effect (so def not why). I think it just boils down to the idea that the
drain_call_queue
method (conceptually) should only be called when we have a call queue?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a test that covers this function and
width
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't really have any unit testing at the partition level. This PR is actually the first to add any testing thats supposed to cover that level. We could look into adding more testing, but I think that would belong in a separate PR.