aschackmull
Contributor

This is a sequence of smaller performance tweaks (commit-by-commit review is encouraged). All of them reduce work by avoiding splits of the tuple stream in the main recursive pipelines, either by making sure negations are simple anti-joins (thereby also avoiding materialisation) or by avoiding the tuple duplication that arises from disjunctive filters.
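
To make the two effects concrete, here is a toy Python sketch. It is not QL and not the code changed in this PR; it only assumes that a disjunctive filter evaluated as a union of per-branch results emits a tuple once per matching branch, and that a simple negation can be evaluated as an anti-join against an existing set. All names (`filter_by_split`, `filter_single_pass`, `anti_join`, the sample rows) are hypothetical.

```python
from typing import Callable, List, Set, Tuple

Row = Tuple[int, int]
Pred = Callable[[Row], bool]


def filter_by_split(rows: List[Row], p: Pred, q: Pred) -> List[Row]:
    """Evaluate `p or q` as a union of two branches.

    The input stream is scanned once per branch, and a row that
    satisfies both predicates is emitted twice.
    """
    return [r for r in rows if p(r)] + [r for r in rows if q(r)]


def filter_single_pass(rows: List[Row], p: Pred, q: Pred) -> List[Row]:
    """Evaluate `p or q` as one filter, so each row is emitted at most once."""
    return [r for r in rows if p(r) or q(r)]


def anti_join(rows: List[Row], excluded: Set[int]) -> List[Row]:
    """Keep rows whose first column has no match in `excluded`.

    This models a negation that only needs a membership check against
    an already available set, with no intermediate relation built.
    """
    return [r for r in rows if r[0] not in excluded]


def is_even(r: Row) -> bool:
    return r[0] % 2 == 0


def is_big(r: Row) -> bool:
    return r[1] >= 20


rows: List[Row] = [(1, 10), (2, 20), (3, 30), (4, 40)]

split = filter_by_split(rows, is_even, is_big)
single = filter_single_pass(rows, is_even, is_big)
kept = anti_join(rows, {2, 3})

print(len(split), len(single), kept)
```

Both filters produce the same set of rows, but the split version carries extra duplicate tuples that later steps have to process or deduplicate. In a recursive pipeline that overhead would be paid on every iteration.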

@aschackmull aschackmull added the no-change-note-required label (This PR does not need a change note) Nov 24, 2021
@aschackmull aschackmull requested review from a team as code owners November 24, 2021 13:47
Contributor

@MathiasVP MathiasVP left a comment

LGTM if DCA is happy! I'm curious about the performance implications of removing disjunction-induced tuple duplication. (Could we have a ql-for-ql query for this?)

Contributor

@hvitved hvitved left a comment

LGTM

Contributor

@atorralba atorralba left a comment

FWIW, LGTM

Member

@RasmusWL RasmusWL left a comment

👍 from Python (after also talking through what some of the changes do 👍)

@aschackmull aschackmull merged commit a066429 into github:main Nov 25, 2021
@aschackmull aschackmull deleted the dataflow/perf branch November 25, 2021 14:01

5 participants