Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polars: lazyframe_domain ffi #769

Merged
merged 1 commit into from
Apr 18, 2024
Merged

Polars: lazyframe_domain ffi #769

merged 1 commit into from
Apr 18, 2024

Conversation

Shoeboxam
Copy link
Member

Closes #768

@Shoeboxam Shoeboxam linked an issue Jun 9, 2023 that may be closed by this pull request
@ankke ankke mentioned this pull request Jun 13, 2023
@ankke ankke mentioned this pull request Jun 26, 2023
@PaulineMauryL PaulineMauryL mentioned this pull request Jun 30, 2023
python/src/opendp/_convert.py Outdated Show resolved Hide resolved
dp.series_domain("A", dp.atom_domain(T=dp.f64)),
dp.series_domain("B", dp.atom_domain(T=dp.i32)),
dp.series_domain("C", dp.option_domain(dp.atom_domain(T=dp.String))),
dp.series_domain("D", dp.atom_domain(T=dp.i32)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the python types also work, it'd be good to test them. If they don't, I'd worry about usability -- Not necessarily something to address now, but a comment pointing to an issue would be good if it's not something for right now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An annoying aspect is that Polars defaults to i64, but the default inferred type of int is i32. So if you don't specify the types exactly, then the domains will each expect i32 but the data (a few lines below) is actually i64.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. It would be easy for users to get this wrong, so if we can't get around this, maybe just a test to make sure the error message is readable. I'll file a little issue for this.

python/test/test_polars.py Outdated Show resolved Hide resolved
python/test/test_polars.py Outdated Show resolved Hide resolved
rust/src/data/ffi.rs Show resolved Hide resolved
rust/src/domains/polars/frame/ffi.rs Outdated Show resolved Hide resolved
Comment on lines +126 to +123
"keys" => MarginPub::Keys,
"lengths" => MarginPub::Lengths,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This corresponds to this test helper?

def _with_margin(domain, by):
    return dp.with_margin(domain, by=by, public_info="lengths", max_partition_size=50)

I feel weird about passing in strings when we really just need to convey a boolean.

Also: It looks like "lengths" is exercised in the tests. Should we have some coverage for "keys"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's actually three states: no public info, known keys, and known lengths. I think the stringly-typed argument is clearer. An enum would be best, but I don't have FFI for it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added another test for initializing with keys.

@Shoeboxam Shoeboxam force-pushed the 768-lazyframe-domain-ffi branch 2 times, most recently from d4322b3 to ea54342 Compare April 14, 2024 02:46
@Shoeboxam Shoeboxam force-pushed the 768-lazyframe-domain-ffi branch 2 times, most recently from be70738 to 86809f6 Compare April 16, 2024 21:33
@Shoeboxam Shoeboxam requested a review from mccalluc April 16, 2024 21:55
mccalluc
mccalluc previously approved these changes Apr 17, 2024
Copy link
Contributor

@mccalluc mccalluc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Changes look good.

dp.series_domain("A", dp.atom_domain(T=dp.f64)),
dp.series_domain("B", dp.atom_domain(T=dp.i32)),
dp.series_domain("C", dp.option_domain(dp.atom_domain(T=dp.String))),
dp.series_domain("D", dp.atom_domain(T=dp.i32)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. It would be easy for users to get this wrong, so if we can't get around this, maybe just a test to make sure the error message is readable. I'll file a little issue for this.

@Shoeboxam Shoeboxam force-pushed the 766-series-domain-ffi branch 2 times, most recently from be5783c to 222bb6b Compare April 17, 2024 23:01
Base automatically changed from 766-series-domain-ffi to main April 18, 2024 02:17
@Shoeboxam Shoeboxam dismissed mccalluc’s stale review April 18, 2024 02:17

The base branch was changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FFI data loaders for Polars LazyFrames add lazyframe_domain ffi
3 participants