Add JoinReftype to Relational Joins (to add asof, positional, dependent joins) #7987

Merged
merged 24 commits into from
Jul 6, 2023

Conversation

Tmonster
Contributor

Currently users can't specify the join ref type for join relations. This means they cannot create join relations that perform asof, positional, or dependent joins.

In this PR we add the join ref type to the join relation class, while keeping the old constructor for the substrait package so that CI doesn't fail.

@Tmonster Tmonster requested review from hannes and krlmlr June 19, 2023 08:59
Collaborator

@krlmlr krlmlr left a comment

Thanks, great!

tools/rpkg/R/relational.R Outdated Show resolved Hide resolved
@hannes
Member

hannes commented Jun 19, 2023

looks good to me!

Contributor

@hawkfish hawkfish left a comment

Just some small stuff.

}

rel_join <- function(left, right, conds, join = c("inner", "left", "right", "outer", "cross", "semi", "anti")) {
rel_join_ <- function(left, right, conds,
join = c("inner", "left", "right", "outer", "cross", "semi", "anti", "asof"),
Contributor

asof is a JoinRefType, not a JoinType.

expect_error(rel_join(test_df1, test_df2, cond, ref_type="asof"), "Binder Error")
})

test_that("multiple conditions for asof join throws error", {
Contributor

"multiple inequality"

expect_equal(rel_df, expected_result)
})

test_that("ASOF join works", {
Contributor

Maybe test outer ASOF joins?
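
For context on what an outer ASOF test would exercise, here is a minimal, self-contained sketch of ASOF matching. It is not DuckDB code: `Trade`, `Quote`, and `AsofJoin` are hypothetical names invented for illustration, and the sketch assumes the "latest right row at or before the left row" rule with a single inequality. The `keep_unmatched` flag models the difference between an inner ASOF join and a left outer one.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical row types for illustration only; these are not DuckDB types.
struct Trade {
    int time;
};
struct Quote {
    int time;
    double price;
};

// For each trade, pick the most recent quote with quote.time <= trade.time.
// `quotes` must be sorted by time in ascending order.
// With keep_unmatched == false, trades without an earlier quote are dropped
// (inner ASOF). With keep_unmatched == true they are kept and paired with
// std::nullopt (left outer ASOF).
std::vector<std::pair<Trade, std::optional<Quote>>>
AsofJoin(const std::vector<Trade> &trades, const std::vector<Quote> &quotes, bool keep_unmatched) {
    std::vector<std::pair<Trade, std::optional<Quote>>> result;
    for (const auto &trade : trades) {
        std::optional<Quote> match;
        for (const auto &quote : quotes) {
            if (quote.time > trade.time) {
                break; // later quotes cannot match
            }
            match = quote; // keep the latest quote seen so far
        }
        if (match || keep_unmatched) {
            result.emplace_back(trade, match);
        }
    }
    return result;
}
```

A trade that precedes every quote is the interesting case: the inner variant omits it, while the outer variant returns it with an empty right side, which is the behavior an outer ASOF test would need to pin down.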

Collaborator

@Mytherin Mytherin left a comment

Thanks for the PR! Looks good - some comments.

} else if (join_ref_type == "asof") {
join_ref = JoinRefType::ASOF;
} else if (join_ref_type == "dependent") {
join_ref = JoinRefType::DEPENDENT;
Collaborator

Dependent joins are only used internally for subquery flattening and shouldn't be explicitly created by the user.
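
One way to act on this comment is to reject `"dependent"` where the user-supplied string is mapped to the enum. The sketch below is only an illustration under stated assumptions: the `JoinRefType` enum is a local stand-in that reuses names mentioned in this thread rather than DuckDB's header, and `ParseUserJoinRefType` is a hypothetical helper, not a function from the PR.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Local stand-in that mirrors names from this thread; not DuckDB's definition.
enum class JoinRefType { REGULAR, CROSS, POSITIONAL, ASOF, DEPENDENT };

// Hypothetical helper: map a user-facing string to a join ref type,
// refusing the internal-only DEPENDENT variant.
JoinRefType ParseUserJoinRefType(const std::string &name) {
    if (name == "regular") {
        return JoinRefType::REGULAR;
    }
    if (name == "cross") {
        return JoinRefType::CROSS;
    }
    if (name == "positional") {
        return JoinRefType::POSITIONAL;
    }
    if (name == "asof") {
        return JoinRefType::ASOF;
    }
    if (name == "dependent") {
        // Dependent joins exist for subquery flattening only.
        throw std::invalid_argument("dependent joins cannot be created explicitly");
    }
    throw std::invalid_argument("unknown join ref type: " + name);
}
```

Keeping `DEPENDENT` in the enum but out of the accepted strings leaves the internal code path untouched while closing it off from the relational API.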

})

test_that("Invalid asof join condition throws error", {
dbExecute(con, "CREATE OR REPLACE MACRO neq(a, b) AS a <> b")
Collaborator

Can we test positional joins as well?

@github-actions github-actions bot marked this pull request as draft June 23, 2023 13:54
@Mytherin Mytherin marked this pull request as ready for review June 25, 2023 08:33
@github-actions github-actions bot marked this pull request as draft June 26, 2023 08:56
@Mytherin Mytherin marked this pull request as ready for review June 26, 2023 09:41
@github-actions github-actions bot marked this pull request as draft June 26, 2023 13:35
@Tmonster
Contributor Author

Waiting for CI to pass on my own fork before marking ready to review Tmonster#57

@Tmonster Tmonster marked this pull request as ready for review June 27, 2023 06:45
@github-actions github-actions bot marked this pull request as draft June 28, 2023 07:47
@Tmonster Tmonster marked this pull request as ready for review June 28, 2023 07:47
Collaborator

@Mytherin Mytherin left a comment

Thanks for the PR! LGTM - one minor comment


// run a positional cross product
auto join_ref_type = JoinRefType::POSITIONAL;
vcross = v1->CrossProduct(v2, join_ref_type);
Collaborator

This should be a positional join not a positional cross product. Could we move this to the JoinRef as well?

Contributor Author

I'm not sure what you mean; there are join types and join ref types. Do you mean just make it a regular join? It would look something like
vcross = v1->Join(v2, "v1.i=v2.j", JoinType::INNER, join_ref_type);

Collaborator

Something like v1->PositionalJoin(v2) would make more sense to me

Contributor Author

Hmm, I feel like adding the type of join to the function name might add unnecessary overhead/maintenance. If we do, we may also end up creating relational functions like v1->InnerJoin(v2), v1->LeftJoin(v2), v1->AsOfJoin(v2), etc. Since the CrossProduct relation already exists, I think we should keep it in case someone is already using it. But I think we should stick with
vcross = v1->Join(v2, "v1.i=v2.j", JoinType::INNER, JoinRefType::POSITIONAL) for positional joins
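
Part of the disagreement above is that a positional join pairs rows by position rather than by a condition. The sketch below illustrates that pairing in isolation. `PositionalPairs` is a hypothetical helper, not part of DuckDB's relational API, and the padding of the shorter side with empty values is an assumption modeled on how the SQL `POSITIONAL JOIN` is commonly described.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper: pair the i-th row of `left` with the i-th row of
// `right`. No join condition is involved. Assumption: when one input is
// shorter, its missing rows are represented by std::nullopt.
template <class L, class R>
std::vector<std::pair<std::optional<L>, std::optional<R>>>
PositionalPairs(const std::vector<L> &left, const std::vector<R> &right) {
    std::vector<std::pair<std::optional<L>, std::optional<R>>> out;
    const std::size_t n = std::max(left.size(), right.size());
    out.reserve(n);
    for (std::size_t i = 0; i < n; i++) {
        std::optional<L> l;
        std::optional<R> r;
        if (i < left.size()) {
            l = left[i];
        }
        if (i < right.size()) {
            r = right[i];
        }
        out.emplace_back(std::move(l), std::move(r));
    }
    return out;
}
```

Because nothing in this pairing depends on a predicate, a condition string such as `"v1.i=v2.j"` has no obvious role in it, which may be why a dedicated entry point was suggested in the review.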

Collaborator

@krlmlr krlmlr left a comment

Looks good, thanks!

@Tmonster Tmonster requested a review from Mytherin July 4, 2023 10:06
@Mytherin Mytherin changed the base branch from feature to master July 4, 2023 13:17
@Mytherin Mytherin merged commit 150e109 into duckdb:master Jul 6, 2023
54 checks passed
@Mytherin
Collaborator

Mytherin commented Jul 6, 2023

Thanks! Looks good

krlmlr pushed a commit to krlmlr/duckdb-r that referenced this pull request Sep 2, 2023
…relational

Add JoinReftype to Relational Joins (to add asof, positional, dependent joins)
krlmlr pushed a commit to krlmlr/duckdb-r that referenced this pull request Sep 2, 2023
…nt joins)

- Merge pull request duckdb/duckdb#7987 from Tmonster/add_asof_join_to_relational

- Merge pull request duckdb/duckdb#8227 from Tmonster/remove_R_warning: Change join ref type to cross if join type is cross and join ref type is regular

- Merge pull request duckdb/duckdb#8274 from krlmlr/b-relational-join-tests: Fix handling of cross joins
krlmlr pushed a commit to duckdb/duckdb-r that referenced this pull request Sep 5, 2023
…nt joins)

- Merge pull request duckdb/duckdb#7987 from Tmonster/add_asof_join_to_relational

- Merge pull request duckdb/duckdb#8227 from Tmonster/remove_R_warning: Change join ref type to cross if join type is cross and join ref type is regular

- Merge pull request duckdb/duckdb#8274 from krlmlr/b-relational-join-tests: Fix handling of cross joins