Create and use adapter subclass of Relational that forwards to a parent object or to a data frame scan #949

@krlmlr

Closes tidyverse/duckplyr#442.

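The example below highlights the problem: after rel_to_altrep() materializes the data frame for rel2, both rel2 and rel3 still print the full relation tree (projection over the original data frame scan) instead of scanning the materialized data frame.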

Implementation guide:

  • Create a subclass of Relational that takes an ALTREP data frame as an SEXP
  • Implement methods for this Relational object to unconditionally forward to the parent relational object stored in the SEXP
  • Create C++ functions rel_project2() and rel_filter2()
  • Use the new Relational subclass in these new C++ functions
  • Ensure the example passes when the ...2() functions are used
  • In the subclass, replace the unconditional forwarding with a check: if the data frame is materialized, forward to a new relational object with a data frame scan; otherwise, forward to the parent relational object as before
  • Check that the example still works and actually uses the materialized data frame
  • Ensure that all operators have ...2() versions
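
The steps in the guide can be sketched in plain C++ without any DuckDB or R headers. This is a minimal illustration only: `Relation`, `DataFrameScan`, `Projection`, `Filter`, `LazyFrame`, `AdapterRelation` and `filter2()` are hypothetical stand-ins invented for this sketch, not the actual duckdb classes or the proposed `rel_filter2()`. In particular, `LazyFrame` stands in for the ALTREP data frame held as an SEXP, and `ToString()` stands in for every method the real subclass would have to forward.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a relation node (not DuckDB's Relation class).
struct Relation {
    virtual ~Relation() = default;
    virtual std::string ToString() const = 0;
};

// Leaf node: scans a data frame identified by a name.
struct DataFrameScan : Relation {
    std::string name;
    explicit DataFrameScan(std::string n) : name(std::move(n)) {}
    std::string ToString() const override { return "scan(" + name + ")"; }
};

struct Projection : Relation {
    std::shared_ptr<Relation> child;
    explicit Projection(std::shared_ptr<Relation> c) : child(std::move(c)) {}
    std::string ToString() const override { return "project(" + child->ToString() + ")"; }
};

struct Filter : Relation {
    std::shared_ptr<Relation> child;
    explicit Filter(std::shared_ptr<Relation> c) : child(std::move(c)) {}
    std::string ToString() const override { return "filter(" + child->ToString() + ")"; }
};

// Hypothetical stand-in for the ALTREP data frame: it remembers the relation
// it was created from and whether its data has been materialized.
struct LazyFrame {
    std::shared_ptr<Relation> parent;
    bool materialized = false;
    std::string name;
};

// Adapter: forwards to the parent relation until the frame is materialized,
// then forwards to a fresh data frame scan instead.
struct AdapterRelation : Relation {
    std::shared_ptr<LazyFrame> frame;
    explicit AdapterRelation(std::shared_ptr<LazyFrame> f) : frame(std::move(f)) {
        assert(frame && frame->parent);
    }
    std::shared_ptr<Relation> Target() const {
        if (frame->materialized) {
            return std::make_shared<DataFrameScan>(frame->name);
        }
        return frame->parent;
    }
    std::string ToString() const override { return Target()->ToString(); }
};

// Hypothetical analogue of a ...2() operator: it builds on the adapter
// rather than on the parent relation directly.
std::shared_ptr<Relation> filter2(std::shared_ptr<LazyFrame> frame) {
    return std::make_shared<Filter>(std::make_shared<AdapterRelation>(std::move(frame)));
}
```

Because the adapter resolves its target on every call, a filter built before materialization picks up the data frame scan afterwards. In this sketch, a `LazyFrame` named `rel2` whose parent is a projection over `scan(df1)` makes `filter2(frame)->ToString()` yield `filter(project(scan(df1)))` before `materialized` is set and `filter(scan(rel2))` after.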
drv <- duckdb::duckdb()
con <- DBI::dbConnect(drv)
df1 <- tibble::tibble(a = 1)

"mutate"
#> [1] "mutate"
rel1 <- duckdb:::rel_from_df(con, df1)
"mutate"
#> [1] "mutate"
rel2 <- duckdb:::rel_project(
  rel1,
  list(
    {
      tmp_expr <- duckdb:::expr_reference("a")
      duckdb:::expr_set_alias(tmp_expr, "a")
      tmp_expr
    },
    {
      tmp_expr <- duckdb:::expr_constant(2)
      duckdb:::expr_set_alias(tmp_expr, "b")
      tmp_expr
    }
  )
)
"filter"
#> [1] "filter"
rel3 <- duckdb:::rel_filter(
  rel2,
  list(
    duckdb:::expr_comparison(
      "==",
      list(
        duckdb:::expr_reference("b"),
        duckdb:::expr_constant(2)
      )
    )
  )
)
rel3
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Filter [(b = 2.0)]
#>   Projection [a as a, 2.0 as b]
#>     r_dataframe_scan(0x125e1b3d0)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)

# This materializes the data frame
duckdb:::rel_to_altrep(rel2)
#>   a b
#> 1 1 2

# Expecting this to use a data frame scan only, without a projection
rel2
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [a as a, 2.0 as b]
#>   r_dataframe_scan(0x125e1b3d0)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)

# Expecting this to use filter only, with a data frame scan based on rel2
rel3
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Filter [(b = 2.0)]
#>   Projection [a as a, 2.0 as b]
#>     r_dataframe_scan(0x125e1b3d0)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)

Created on 2025-01-04 with reprex v2.1.1

Labels: duckplyr 🗜️ (Support for the duckplyr R package)
