feat(r/sedonadb): Add join expression evaluation #781
Co-authored-by: Copilot <copilot@github.com>
Pull request overview
Adds join-expression support to the R sedonadb bindings, enabling sd_join() with dplyr-like join condition specification (sd_join_by()) and post-join column selection/disambiguation (sd_join_select_default(), sd_join_select()), backed by new Rust FFI methods for join execution and expression introspection.
Changes:
- Introduces sd_join(), sd_join_by(), sd_join_select_default(), and sd_join_select() plus join expression evaluation utilities.
- Extends Rust/R FFI to support DataFusion join_on() and adds expression inspection helpers (qualified_name(), variant_name(), parse_binary()).
- Adds comprehensive testthat coverage and snapshots for join-expression behavior and default selection rules.
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| r/sedonadb/R/join-expression.R | New join condition/select specification and join-expression evaluation + default post-join projection logic. |
| r/sedonadb/R/dataframe.R | Adds sd_join() and extends sd_summarise()/sd_summarize() with .env. |
| r/sedonadb/R/expression.R | Adds sd_expr_parse_binary() and makes expression masks drop duplicate column names. |
| r/sedonadb/R/000-wrappers.R | Adds generated R wrappers for InternalDataFrame$join() and new SedonaDBExpr inspection methods. |
| r/sedonadb/src/rust/src/dataframe.rs | Adds InternalDataFrame::join() using DataFusion join_on() with aliases and parsed JoinType. |
| r/sedonadb/src/rust/src/expression.rs | Exposes qualified_name(), variant_name(), and parse_binary() over FFI for R-side logic. |
| r/sedonadb/src/rust/api.h | Declares new FFI symbols for join and expression inspection. |
| r/sedonadb/src/init.c | Registers new .Call entry points for join + expression inspection. |
| r/sedonadb/NAMESPACE | Exports new join APIs and S3 methods for printing and $ table refs. |
| r/sedonadb/tests/testthat/test-join-expression.R | New test suite for join-by/select evaluation, ambiguity errors, and default projection behavior. |
| r/sedonadb/tests/testthat/_snaps/join-expression.md | Snapshot outputs for join-expression printing and evaluated expressions. |
| r/sedonadb/tests/testthat/test-dataframe.R | Adds an integration-style test ensuring select behavior is applied to join results. |
| r/sedonadb/tests/testthat/test-expression.R | Adds tests for new expression inspection helpers. |
| r/sedonadb/man/sd_join.Rd | New user-facing docs for sd_join(). |
| r/sedonadb/man/sd_join_by.Rd | New user-facing docs for sd_join_by(). |
| r/sedonadb/man/sd_join_select.Rd | New user-facing docs for sd_join_select(). |
| r/sedonadb/man/sd_join_select_default.Rd | New user-facing docs for sd_join_select_default(). |
| r/sedonadb/man/sd_expr_column.Rd | Adds alias/doc entry for sd_expr_parse_binary(). |
| r/sedonadb/man/sd_summarise.Rd | Documents new .env parameter for sd_summarise()/sd_summarize(). |
| r/sedonadb/.Rbuildignore | Ignores local AI assistant marker files. |
| .pre-commit-config.yaml | Excludes testthat snapshot directory from trailing-whitespace hook. |
| .gitignore | Ignores .positai. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot reviewed 22 out of 23 changed files in this pull request and generated 3 comments.
If there are no objections I'll merge this sometime tomorrow!
Adds sd_join() to the R bindings with friendly specification of the join condition and the output selection. These are both a huge pain and are very verbose to deal with: there are a lot of ways to specify join keys and a lot of ways to deal with disambiguating names on the output. I implemented roughly how this is done in dplyr with join_by() and suffix, with an escape hatch for other types of selections one might want to do.

This also works with st_intersects() and friends for a spatial join:

Created on 2026-04-24 with reprex v2.1.1
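For readers unfamiliar with the dplyr conventions the description refers to, here is a minimal sketch of join_by() and suffix in plain dplyr (version 1.1.0 or later). It is not the sedonadb API and not the reprex from this PR; the data frames x and y and their column names are made up for illustration.

```r
library(dplyr)

# Two illustrative tables (hypothetical data) that share a non-key column name
x <- tibble(id = 1:3, value = c("a", "b", "c"))
y <- tibble(key = c(1L, 3L), value = c("A", "C"))

# join_by() states the join condition; suffix disambiguates the
# non-key columns that appear on both sides
joined <- inner_join(x, y, by = join_by(id == key), suffix = c("_x", "_y"))

names(joined)
#> [1] "id"      "value_x" "value_y"
```

For an equality join, dplyr keeps only the left-hand key column by default, which is why key does not appear in the result.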