@paleolimbot
Member

Implements the (Geo)Parquet writer in the R bindings:

library(sedonadb)

nc <- sf::read_sf(system.file("shape/nc.shp", package = "sf"))

tmp_parquet <- tempfile(fileext = ".parquet")
nc |> 
  as_sedonadb_dataframe() |> 
  sd_write_parquet(tmp_parquet)

sd_read_parquet(tmp_parquet)
#> ┌─────────┬───┬─────────┬──────────────────────────────────────────────────────┐
#> │   AREA  ┆ … ┆ NWBIR79 ┆                       geometry                       │
#> │ float64 ┆   ┆ float64 ┆                       geometry                       │
#> ╞═════════╪═══╪═════════╪══════════════════════════════════════════════════════╡
#> │   0.114 ┆ … ┆    19.0 ┆ MULTIPOLYGON(((-81.4727554321289 36.23435592651367,… │
#> ├╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
#> │   0.061 ┆ … ┆    12.0 ┆ MULTIPOLYGON(((-81.2398910522461 36.36536407470703,… │
#> ├╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
#> │   0.143 ┆ … ┆   260.0 ┆ MULTIPOLYGON(((-80.45634460449219 36.24255752563476… │
#> ├╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
#> │    0.07 ┆ … ┆   145.0 ┆ MULTIPOLYGON(((-76.00897216796875 36.31959533691406… │
#> ├╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
#> │   0.153 ┆ … ┆  1197.0 ┆ MULTIPOLYGON(((-77.21766662597656 36.24098205566406… │
#> ├╌╌╌╌╌╌╌╌╌┼╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
#> │   0.097 ┆ … ┆  1237.0 ┆ MULTIPOLYGON(((-76.74506378173828 36.23391723632812… │
#> └─────────┴───┴─────────┴──────────────────────────────────────────────────────┘
#> Preview of up to 6 row(s)

Created on 2025-10-10 with reprex v2.1.1

@paleolimbot paleolimbot requested a review from Copilot October 11, 2025 16:38
Contributor

Copilot AI left a comment

Pull Request Overview

This PR implements the sd_write_parquet() function for the R bindings of sedonadb, enabling users to write DataFrames to (Geo)Parquet files with various options for partitioning, sorting, and GeoParquet metadata handling.

Key changes:

  • Added R function sd_write_parquet() with comprehensive parameter validation and configurable output options
  • Implemented underlying Rust functionality in dataframe.rs with support for partitioning, sorting, and GeoParquet versions
  • Added comprehensive test coverage for various write scenarios including geometry data handling
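As a rough illustration of the sorting and GeoParquet version options listed above, a call might look like the sketch below. The argument names `sort_by` and `geoparquet_version`, and the value `"1.1"`, are assumptions that do not appear in this thread; check `?sd_write_parquet` for the actual signature and defaults.

```r
library(sedonadb)

# Same example data as in the PR description
nc <- sf::read_sf(system.file("shape/nc.shp", package = "sf"))
df <- as_sedonadb_dataframe(nc)

tmp_parquet <- tempfile(fileext = ".parquet")

# ASSUMPTION: sort_by and geoparquet_version are illustrative argument names
# and values, not confirmed by this conversation.
sd_write_parquet(
  df,
  tmp_parquet,
  sort_by = "AREA",
  geoparquet_version = "1.1"
)

# Read the file back to check the round trip
sd_read_parquet(tmp_parquet)
```

Writing to a path that ends in `.parquet` is the same pattern the PR description uses; partitioned output would presumably target a directory instead.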

Reviewed Changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| r/sedonadb/R/dataframe.R | Implements the main sd_write_parquet() R function with parameter validation |
| r/sedonadb/src/rust/src/dataframe.rs | Adds the Rust to_parquet() method with GeoParquet options |
| r/sedonadb/tests/testthat/test-dataframe.R | Comprehensive test suite covering all write scenarios |
| r/sedonadb/R/000-wrappers.R | Auto-generated wrapper for the new Rust function |
| r/sedonadb/src/init.c | C binding registration for the new function |
| r/sedonadb/src/rust/api.h | C header declaration for the new function |
| r/sedonadb/src/rust/Cargo.toml | Adds the datafusion-expr dependency |
| r/sedonadb/man/sd_write_parquet.Rd | Documentation for the new function |
| r/sedonadb/NAMESPACE | Export declaration for the new function |

paleolimbot and others added 2 commits October 11, 2025 11:43
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@paleolimbot paleolimbot marked this pull request as ready for review October 11, 2025 17:10
@paleolimbot paleolimbot merged commit 85ce98d into apache:main Oct 14, 2025
13 checks passed
@paleolimbot paleolimbot deleted the r-write-parquet branch October 14, 2025 21:25