Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

depr(duckdb): deprecate read_in_memory #9666

Merged
merged 1 commit into from
Aug 1, 2024

Conversation

gforsyth
Copy link
Member

Description of changes

This functionality is available via create_table (and also available
across all backends instead of just DuckDB).

@gforsyth
Copy link
Member Author

ugh, I forgot this is also in polars and maybe datafusion

@gforsyth gforsyth marked this pull request as draft July 22, 2024 21:11
This functionality is available via `create_table` (and also available
across all backends instead of just DuckDB).
@gforsyth gforsyth marked this pull request as ready for review July 29, 2024 18:50
@cpcloud
Copy link
Member

cpcloud commented Jul 30, 2024

By deprecating this are we now requiring that people effectively copy their data into backends that can operate directly on the in-memory data?

@gforsyth
Copy link
Member Author

I don't think so. It's currently only available on the DuckDB backend.

The suggested alternative works on all backends (except Druid) where a user can pass an in-memory object to create_table and operate on it from there.

@cpcloud
Copy link
Member

cpcloud commented Jul 30, 2024

Oh, also I forgot about memtable 😅 which definitely does not use create_table with the DuckDB backend!

@gforsyth
Copy link
Member Author

gforsyth commented Aug 1, 2024

@cpcloud -- anything else you'd like to see here?

Copy link
Member

@cpcloud cpcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this LGTM!

For some reason I incorrectly assumed this was a breaking change, but it's not.

@cpcloud cpcloud added the refactor Issues or PRs related to refactoring the codebase label Aug 1, 2024
@cpcloud cpcloud changed the title refactor(duckdb): deprecate read_in_memory depr(duckdb): deprecate read_in_memory Aug 1, 2024
@cpcloud cpcloud added the deprecation Issues or PRs related to deprecating APIs label Aug 1, 2024
@gforsyth gforsyth merged commit e13af72 into ibis-project:main Aug 1, 2024
88 checks passed
@gforsyth gforsyth deleted the deprecate_read_in_memory branch August 1, 2024 16:16
@cpcloud cpcloud added this to the 9.3 milestone Aug 5, 2024
@acuitymd-filip
Copy link

It seems like a breaking change. Our use-case is registering Arrow tables and using DuckDB as a strictly in-memory query engine for Arrow data. Using register_in_memory we can pass in a pointer to Arrow data and register it with DuckDB without doing memory copies, etc.

The original functionality was basically equivalent to duckdb.register which created a view inside of DuckDB, effectively, which then referenced the pointer to the arrow table.

The recommended upgrade path creates a table in DuckDB, which involves memory copies, even if the temp flag is set to true.

@acuitymd-filip
Copy link

acuitymd-filip commented Aug 24, 2024

@gforsyth Would you prefer if I create an issue for this ^?

@gforsyth
Copy link
Member Author

Hey @acuitymd-filip -- you can open an issue if you'd like to discuss this further (that will probably be better for searches, anyway).

I can update the options in the upgrade path. If you use ibis.memtable with the DuckDB backend, we don't make any copies of the data and DuckDB doesn't make copies when accessing it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deprecation Issues or PRs related to deprecating APIs refactor Issues or PRs related to refactoring the codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants