Rework the python bindings using conversion traits from arrow-rs #873
Any chance of rewording the commits to be something other than single characters?
Squashed.
@jorgecarleitao @andygrove we should roll up apache/arrow-rs#691 and this PR so we can proceed with the ibis datafusion backend.
The simplification looks great, thanks @kszucs! It looks like there are some minor Rust build errors that need to be addressed.
I had to bump the arrow-rs dependency everywhere, so we need to address a couple more issues. @houqp there is an async sort order issue which is rather hard to comprehend. Any ideas what could have caused this error?
Looks great so far! It is an awesome simplification!
I am curious about the naming equivalents but I think we should merge this regardless.
An observation: from here on we are bound to use the same pyo3 version used by arrow-rs.
I suggest an extra reviewer (e.g. @alamb, @houqp) -- I focused mostly on the Python part.
python/src/catalog.rs
Outdated
    database: Arc<dyn SchemaProvider>,
}

#[pyclass(name = "Table", subclass)]
Is this related to pa.Table? It is the Python equivalent of the TableProvider, right?
Nope, it was an attempt to expose the catalog information to ibis, but I may defer the catalog.rs to a follow-up PR instead.
python/src/catalog.rs
Outdated
    catalog: Arc<dyn CatalogProvider>,
}

#[pyclass(name = "Database", subclass)]
Same here; this is SchemaProvider in DataFusion, right?
Right. I'm going to remove this module from the PR to avoid confusion.
I need to cover the newly exposed Python API with unit tests.
@kszucs I took a quick look at the |
I am pretty sure I saw a failure in the |
Thanks Andrew for the heads up! I wonder, wouldn't it be better if we asserted on record batch equality rather than on its string representation?
@kszucs -- I agree that doing so would avoid this kind of "small floating point changes require many test changes" problem. I think the downside is that (in my opinion) the tests are then harder to read and update. Let me prepare a draft PR for updating datafusion to the latest arrow-rs so we can at least decouple the "changes needed for just the arrow-rs upgrade" from the other changes in this PR.
I understand, though the test actually checks that the two string representations are equal rather than that the results are equal.
Great, thank you!
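To make the trade-off discussed above concrete, here is a minimal sketch in plain Python (standard library only). It is not the DataFusion test code: `format_rows` and `rows_equal` are hypothetical helpers, and the row data is made up purely for illustration. It shows how a string comparison can fail on a tiny floating-point difference while a value comparison with a tolerance still passes.

```python
import math


def format_rows(rows):
    """Render rows as text, standing in for a pretty-printer (hypothetical format)."""
    return "\n".join(f"| {name} | {value} |" for name, value in rows)


def rows_equal(left, right, rel_tol=1e-9):
    """Compare rows by value, tolerating tiny floating-point differences."""
    if len(left) != len(right):
        return False
    return all(
        l_name == r_name and math.isclose(l_value, r_value, rel_tol=rel_tol)
        for (l_name, l_value), (r_name, r_value) in zip(left, right)
    )


# Made-up data: the second result differs only by floating-point rounding.
expected = [("a", 0.3), ("b", 1.5)]
actual = [("a", 0.1 + 0.2), ("b", 1.5)]

# Comparing the rendered text is sensitive to the last printed digit.
strings_match = format_rows(expected) == format_rows(actual)

# Comparing the values with a tolerance is not.
values_match = rows_equal(expected, actual)
```

The cost noted in the thread still applies: a failing value comparison does not show a readable table unless the test also prints one.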
🤔 Yes, you are correct. There may be something deeper going on here, perhaps related to the change to use unstable sort in apache/arrow-rs#552. All the more reason to debug it now. I will work on that.
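As a hedged illustration of why an unstable sort could disturb order-sensitive assertions (the thread only suspects this as a cause), the sketch below uses plain Python with made-up rows. `is_sorted_by` is a hypothetical helper. Both results are correctly ordered by the sort key, but rows with equal keys appear in a different relative order, which is exactly what an unstable sort is allowed to produce.

```python
def is_sorted_by(rows, key):
    """Return True if rows are in non-decreasing order of key(row)."""
    keys = [key(row) for row in rows]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def score(row):
    """Sort key: the second field of each (name, score) row."""
    return row[1]


# Two made-up results of sorting by score; "x" and "y" tie on score 1.
run_one = [("x", 1), ("y", 1), ("z", 2)]
run_two = [("y", 1), ("x", 1), ("z", 2)]

# Both are valid answers to "sort by score".
both_valid = is_sorted_by(run_one, score) and is_sorted_by(run_two, score)

# An assertion on exact row order treats them as different.
exact_match = run_one == run_two

# Comparing after sorting on the full row ignores the order of tied rows.
same_rows = sorted(run_one) == sorted(run_two)
```

One way to keep such a test deterministic is to sort on enough columns to break every tie, or to compare results in an order-insensitive way when the query itself does not fix the order of ties.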
#984 is the PR to update to arrow-rs master. The |
@kszucs @jorgecarleitao quick question, do we want to release this change as part of 0.3.0 or delay to 0.4.0?
@houqp rebased
Thank you @kszucs! Great simplification to the Python bindings :)
Alternative to #856
Inspired by #856 (comment)
Depends on apache/arrow-rs#691
The CI will probably fail since I hardcoded a local arrow-rs dependency.