
Allow self-joining of Polars lazyframes #466

Merged
evertlammerts merged 1 commit into duckdb:v1.5-variegata from evertlammerts:polars_lazyframes_is_pure
May 19, 2026


@evertlammerts
Member

Fixes #452

This makes sure that Polars' CSE wraps the lazyframe in a CACHE[...] in its plan. This way, the lazyframe will only be consumed once.

We've seen this reported more than once. The issue with joining a streaming result against itself is that a result can only be consumed once. One workaround is to use multiple connections, but that only works if DuckDB comes up with a query plan in which a result is not referenced against itself.
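
To illustrate the consume-once problem in isolation, here is a minimal sketch in plain Python. It does not use the DuckDB or Polars APIs; `stream_rows` and `self_join` are hypothetical stand-ins, with a generator playing the role of the streaming result and a list playing the role of a cached copy.

```python
def stream_rows():
    """Hypothetical stand-in for a streaming result: iterable exactly once."""
    yield from [(1, "a"), (2, "b"), (3, "c")]


def self_join(left, right):
    """Naive equi-join on the first field; materializes the right side first."""
    right_rows = list(right)
    return [(l, r) for l in left for r in right_rows if l[0] == r[0]]


# Both sides are the same one-shot stream: reading the right side drains it,
# so the left side has nothing left to yield and the join comes back empty.
stream = stream_rows()
broken = self_join(stream, stream)

# Consuming the stream once into a cache lets both sides read the same rows.
cached = list(stream_rows())
fixed = self_join(cached, cached)

print(len(broken), len(fixed))
```

The second variant is only an analogy for what a cache node in a query plan achieves: the source is read once and both join inputs are served from the stored rows.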

As it turns out, Polars allows I/O plugins to register with is_pure, and Polars will then de-duplicate them. Afaict, there's not really any downside. I don't see why anybody would want to treat a streaming result as not-pure, or what that would mean. I guess we'll see whether that assumption stands the test of time.

@evertlammerts evertlammerts merged commit f87d6d9 into duckdb:v1.5-variegata May 19, 2026
15 checks passed