[GH-2674] Add RS_SetCRS and RS_CRS for custom CRS string support #2677
Implements RS_SetCRS(raster, crsString) that accepts CRS definitions in EPSG, WKT1, WKT2, PROJ, and PROJJSON formats. Also implements RS_CRS(raster[, format]) that exports the raster CRS in any of these formats (default: PROJJSON). Closes #2674
Pull request overview
Adds new raster CRS utilities to Sedona to support setting and retrieving raster CRS using full CRS definition strings (not limited to integer EPSG/SRID), exposed both in the common Java raster layer and as Spark SQL functions.
Changes:
- Introduces `RS_SetCRS(raster, crsString)` to set raster CRS from EPSG/WKT1/WKT2/PROJ/PROJJSON inputs via a GeoTools + proj4sedona parsing pipeline.
- Introduces `RS_CRS(raster[, format])` to export raster CRS as `projjson` (default), `wkt2`, `wkt1`, or `proj`.
- Adds extensive unit + integration + round-trip compliance tests and SQL docs for the new functions.
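The output-format selection described for `RS_CRS` could be sketched roughly as follows. This is a minimal illustration, not Sedona's code: the class `CrsFormatSketch`, the `Format` enum, and the `resolve` method are hypothetical names, and both the case-insensitive matching and the fallback to `projjson` for a null or blank argument are assumptions.

```java
import java.util.Locale;

/** Hypothetical sketch of output-format selection; not Sedona's actual code. */
final class CrsFormatSketch {
  enum Format { PROJJSON, WKT2, WKT1, PROJ }

  /** Maps a user-supplied format name to a Format, rejecting unknown names. */
  static Format resolve(String format) {
    // Trim once and reuse the result.
    String trimmed = (format == null) ? "" : format.trim();
    if (trimmed.isEmpty()) {
      return Format.PROJJSON; // assumed default when no format is given
    }
    // Locale.ROOT keeps lowercasing independent of the JVM's default locale.
    switch (trimmed.toLowerCase(Locale.ROOT)) {
      case "projjson":
        return Format.PROJJSON;
      case "wkt2":
        return Format.WKT2;
      case "wkt1":
        return Format.WKT1;
      case "proj":
        return Format.PROJ;
      default:
        throw new IllegalArgumentException("Unsupported CRS format: " + format);
    }
  }

  public static void main(String[] args) {
    System.out.println(resolve(null));
    System.out.println(resolve(" WKT2 "));
  }
}
```

Rejecting unknown names with an exception, rather than silently falling back to the default, keeps a typo in the format argument from producing output in an unexpected format.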
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `common/src/main/java/org/apache/sedona/common/raster/RasterEditors.java` | Adds CRS string parsing + projection-name normalization to support setCrs. |
| `common/src/main/java/org/apache/sedona/common/raster/RasterAccessors.java` | Adds CRS export in multiple formats via proj4sedona. |
| `spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/raster/RasterEditors.scala` | Adds Spark SQL expression wrapper for RS_SetCRS. |
| `spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/raster/RasterAccessors.scala` | Adds Spark SQL expression wrapper for RS_CRS (1-arg + 2-arg). |
| `spark/common/src/main/scala/org/apache/sedona/sql/UDF/Catalog.scala` | Registers RS_SetCRS and RS_CRS in the function catalog. |
| `spark/common/src/test/scala/org/apache/sedona/sql/rasteralgebraTest.scala` | Adds Spark integration tests for the new SQL functions and formats. |
| `common/src/test/java/org/apache/sedona/common/raster/RasterEditorsTest.java` | Adds unit tests for setCrs across input formats and projection support. |
| `common/src/test/java/org/apache/sedona/common/raster/RasterAccessorsTest.java` | Adds unit tests for crs() output formats, null handling, and invalid format. |
| `common/src/test/java/org/apache/sedona/common/raster/CrsRoundTripComplianceTest.java` | Adds broad CRS round-trip/idempotency compliance coverage across formats/EPSG codes. |
| `docs/api/sql/Raster-Operators/RS_SetCRS.md` | Adds SQL documentation for RS_SetCRS. |
| `docs/api/sql/Raster-Operators/RS_CRS.md` | Adds SQL documentation for RS_CRS output formats and limitations. |
| `docs/api/sql/Raster-Functions.md` | Adds index entries for the two new raster functions. |
- Use longitude-first axis order for WKT parsing (FORCE_LONGITUDE_FIRST_AXIS_ORDER)
- Remove tryResolveToEpsg(): RS_SRID returns 0 for custom CRS, use RS_CRS instead
- Add null/empty guard for format parameter in crs()
- Use ConcurrentHashMap for thread-safe alias cache writes
- Guard DefaultMathTransformFactory downcast with instanceof
- Catch specific exceptions in proj4sedona parsing, attach as suppressed
- Remove 'lossless' claim from PROJJSON docs
- Update RS_SRID docs: 0 can mean custom (non-EPSG) CRS
- Fix Javadoc to include WKT2 in format list
- Update Spark test to expect SRID=0 for WKT1 without AUTHORITY
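The "catch specific exceptions, attach as suppressed" item can be illustrated with a small two-stage parse. This is a sketch under stated assumptions: `FallbackParseSketch`, `parsePrimary`, and `parseFallback` are hypothetical stand-ins that only inspect a string prefix, not the GeoTools or proj4sedona parsers, and `IllegalArgumentException` is used here only as an example of a specific exception type.

```java
/** Hypothetical sketch of a two-stage parse that keeps both failures; not Sedona's actual code. */
final class FallbackParseSketch {

  // Stand-in for the first parser: only understands "EPSG:<code>".
  static String parsePrimary(String s) {
    if (s.startsWith("EPSG:")) {
      return "primary:" + s;
    }
    throw new IllegalArgumentException("primary parser rejected input");
  }

  // Stand-in for the second parser: only understands PROJ strings.
  static String parseFallback(String s) {
    if (s.startsWith("+proj=")) {
      return "fallback:" + s;
    }
    throw new IllegalArgumentException("fallback parser rejected input");
  }

  static String parse(String input) {
    try {
      return parsePrimary(input);
    } catch (IllegalArgumentException primaryError) {
      try {
        return parseFallback(input);
      } catch (IllegalArgumentException fallbackError) {
        // Report the first failure as the cause and keep the second as suppressed,
        // so neither stack trace is lost.
        IllegalArgumentException failure =
            new IllegalArgumentException("Cannot parse CRS string: " + input, primaryError);
        failure.addSuppressed(fallbackError);
        throw failure;
      }
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("EPSG:4326"));
    System.out.println(parse("+proj=longlat +datum=WGS84 +no_defs"));
  }
}
```

Catching only the exception type each parser is expected to throw lets unrelated failures (for example, an `OutOfMemoryError` or a programming bug) propagate instead of being mistaken for a parse error.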
- Bump proj4sedona from 0.0.6 to 0.0.7 (fixes bugs #44-#48)
- Remove documented limitations for datum name loss (#47), lat_ts drift (#44), ellipsoid expansion (#45), WKT2 drift (#46)
- Convert WKT2/PROJJSON import-fail tests to normal round-trips (#48)
- Add EPSG:28992 to WKT2 round-trip tests (floating-point drift fixed)
- Add EPSG:6933 test (Lambert Cylindrical Equal Area now works)
- Fix export path: prefer EPSG SRID over WKT1 to avoid projection name compatibility issues between GeoTools and proj4sedona
- Add WKT1 projection name normalization for export fallback
- Handle +proj=sterea PROJ string re-import (normalize to +proj=stere)
- Use normalized matching in Tier 3 fallback (handles space vs underscore)
- Update docs to remove resolved limitations
Consolidate all CRS name normalization into a single utility class:
- Shared PROJECTION_PATTERN regex (was duplicated in RasterEditors + RasterAccessors)
- Pre-normalized fallback map keys for O(1) Tier 3 lookup (was O(n) per call)
- Remove duplicate entries from fallback map (space vs underscore variants collapse)
- Single entry points: normalizeProjInput(), normalizeWkt1ForGeoTools(), normalizeWkt1ForProj4sedona()
- Remove ~210 lines of scattered normalization code from RasterEditors and RasterAccessors
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
…y RS_CRS null semantics
- RasterAccessors.crs(): apply normalizeWkt1ForProj4sedona() when EPSG code fails and raw WKT1 also fails (consistent with srid==0 branch)
- RS_CRS.md: clarify that RS_SRID=0 can mean either no CRS or custom CRS, recommend RS_CRS(raster) IS NULL to test for missing CRS
proj4sedona 0.0.8 registers sterea as alias for Stereographic (#57), so the +proj=sterea → +proj=stere normalization in CrsNormalization is no longer needed.
Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.
- parseCrsString: add null/blank validation up front
- Parameter stripping loop: detect no-op strips, always record lastError
- Rename testSetCrsWithAllProj4SedonaProjections → Representative
- CrsNormalization.normalizeForMatch: use Locale.ROOT for toLowerCase
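The reason for the `Locale.ROOT` change can be shown with a short demonstration. `String.toLowerCase()` without an argument uses the JVM's default locale, and under a Turkish locale an uppercase `I` lowercases to a dotless `ı`, which would break name matching. The class and method names below (`LocaleRootSketch`, `toMatchKey`) are hypothetical and the projection name is only an example input.

```java
import java.util.Locale;

/** Demonstrates why locale-independent lowercasing matters for name matching. */
final class LocaleRootSketch {

  // Locale-independent lowercasing, in the spirit of the change described above.
  static String toMatchKey(String name) {
    return name.toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    Locale turkish = Locale.forLanguageTag("tr-TR");
    String name = "OBLIQUE_MERCATOR";

    // Under Turkish casing rules, 'I' becomes dotless 'ı' (U+0131).
    String turkishLower = name.toLowerCase(turkish);
    String rootLower = toMatchKey(name);

    System.out.println(turkishLower.equals("oblique_mercator")); // false
    System.out.println(rootLower.equals("oblique_mercator"));    // true
  }
}
```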
- RasterAccessors: extract createProjFromWkt1() helper to deduplicate the try/catch/normalize/retry logic used in both srid>0 and srid==0 branches
- RasterEditors.parseCrsString: hoist Hints+CRSFactory creation out of Step 2 try-block so Step 3 reuses the same instances
- RasterEditors.stripWktParameter: use Pattern.compile() explicitly instead of String.replaceAll() which recompiles on every call
- RasterAccessors.crs: avoid trimming format string twice
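The `Pattern.compile()` point can be sketched as below. `String.replaceAll(regex, ...)` compiles its regex argument on each call, so a helper that runs repeatedly benefits from holding on to the compiled `Pattern`. The regex, the per-name cache, and the names `StripParameterSketch` and `stripParameter` are assumptions for illustration; the actual `stripWktParameter` may be structured differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

/** Hypothetical sketch of stripping a WKT1 PARAMETER entry with a reused Pattern. */
final class StripParameterSketch {

  // One compiled pattern per parameter name, safe to populate from several threads.
  private static final Map<String, Pattern> PATTERNS = new ConcurrentHashMap<>();

  private static Pattern patternFor(String paramName) {
    // Matches a leading comma plus PARAMETER["<name>", <value>].
    return PATTERNS.computeIfAbsent(
        paramName,
        name ->
            Pattern.compile(
                ",\\s*PARAMETER\\s*\\[\\s*\"" + Pattern.quote(name) + "\"\\s*,[^\\]]*\\]",
                Pattern.CASE_INSENSITIVE));
  }

  /** Returns the WKT with the named PARAMETER removed, or unchanged if it is absent. */
  static String stripParameter(String wkt, String paramName) {
    return patternFor(paramName).matcher(wkt).replaceAll("");
  }

  public static void main(String[] args) {
    String wkt =
        "PROJCS[\"x\",PARAMETER[\"central_meridian\",0],"
            + "PARAMETER[\"scale_factor\",1],UNIT[\"metre\",1]]";
    String stripped = stripParameter(wkt, "scale_factor");
    System.out.println(stripped);
    // A no-op strip can be detected by comparing input and output.
    System.out.println(stripParameter(wkt, "false_easting").equals(wkt));
  }
}
```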
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Did you read the Contributor Guide?
Is this PR related to a ticket?
[GH-XXX] my subject. Closes #2674 (Can I register a custom CRS to a user-defined EPSG code to use with RS_SetSRID?)
What changes were proposed in this PR?
Add two new raster functions, `RS_SetCRS` and `RS_CRS`, to support custom CRS string definitions beyond simple integer SRID codes.
RS_SetCRS(raster, crsString)
Sets the CRS of a raster using a CRS definition string. Unlike `RS_SetSRID`, which only accepts integer EPSG codes, `RS_SetCRS` accepts CRS definitions in multiple formats:

| Format | Example |
|---|---|
| EPSG code | `EPSG:4326` |
| WKT1 | `GEOGCS["WGS 84", ...]` |
| WKT2 | `GEOGCRS["WGS 84", ...]` |
| PROJ string | `+proj=longlat +datum=WGS84 +no_defs` |
| PROJJSON | `{"type": "GeographicCRS", ...}` |

Internally, non-WKT1 formats (WKT2, PROJ, PROJJSON) are parsed using proj4sedona 0.0.8 and converted to WKT1 for GeoTools compatibility. The function includes a 3-tier projection name resolution strategy (exact alias match, normalized matching, hardcoded fallback) to handle naming differences between proj4sedona and GeoTools.
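The 3-tier projection name resolution could look roughly like the sketch below. It is an illustration only: `ProjectionNameResolverSketch`, its `key` and `resolve` methods, the small set of target names, and the single hardcoded mapping are all hypothetical, and the normalization rule (lowercase, keep only letters and digits) is an assumption rather than the rule `CrsNormalization` uses.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of a 3-tier projection name lookup; not Sedona's actual code. */
final class ProjectionNameResolverSketch {

  // Names the target library is assumed to accept (illustrative subset).
  private static final Set<String> TARGET_NAMES = new LinkedHashSet<>();
  // Normalized key -> target name, built once so Tier 2 is a single map lookup.
  private static final Map<String, String> NORMALIZED_TARGETS = new HashMap<>();
  // Tier 3: hand-written mappings for names that differ beyond spacing or case.
  private static final Map<String, String> HARDCODED = new HashMap<>();

  static {
    for (String n :
        new String[] {"Transverse_Mercator", "Lambert_Conformal_Conic_2SP", "Mercator_1SP"}) {
      TARGET_NAMES.add(n);
      NORMALIZED_TARGETS.put(key(n), n);
    }
    // Illustrative entry only; keys are stored pre-normalized.
    HARDCODED.put(key("Lambert Conformal Conic"), "Lambert_Conformal_Conic_2SP");
  }

  /** Lowercases with Locale.ROOT and drops everything except letters and digits. */
  static String key(String name) {
    StringBuilder sb = new StringBuilder();
    for (char c : name.toLowerCase(Locale.ROOT).toCharArray()) {
      if (Character.isLetterOrDigit(c)) {
        sb.append(c);
      }
    }
    return sb.toString();
  }

  static Optional<String> resolve(String name) {
    // Tier 1: exact match.
    if (TARGET_NAMES.contains(name)) {
      return Optional.of(name);
    }
    // Tier 2: match after normalization (space vs underscore, case).
    String k = key(name);
    String normalized = NORMALIZED_TARGETS.get(k);
    if (normalized != null) {
      return Optional.of(normalized);
    }
    // Tier 3: hardcoded fallback, also keyed by the normalized form.
    return Optional.ofNullable(HARDCODED.get(k));
  }

  public static void main(String[] args) {
    System.out.println(resolve("Transverse_Mercator"));
    System.out.println(resolve("transverse mercator"));
    System.out.println(resolve("Lambert Conformal Conic"));
    System.out.println(resolve("No Such Projection"));
  }
}
```

Ordering the tiers from cheapest and most certain to most permissive means a name that already matches is never rewritten, and the hand-maintained table is consulted only as a last resort.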
RS_CRS(raster[, format])
Returns the CRS of a raster as a string in the specified format:
- `projjson` (default) - Modern JSON representation
- `wkt2` - ISO 19162 Well-Known Text 2
- `wkt1` - OGC Well-Known Text 1
- `proj` - PROJ string format

Returns `null` if the raster has no CRS defined.
Design decisions
- For a custom CRS, `RS_SRID` will return 0. Users should use `RS_CRS` to retrieve the full CRS definition. This avoids expensive EPSG database scans for every `RS_SetCRS` call.
- WKT parsing uses longitude-first axis order (`FORCE_LONGITUDE_FIRST_AXIS_ORDER`), consistent with Sedona's existing CRS handling in `FunctionsGeoTools`.
- The alias cache uses `ConcurrentHashMap` for safe concurrent Spark execution.

Files changed
Java common layer:
- `CrsNormalization.java` - Centralized CRS name normalization utility bridging GeoTools ↔ proj4sedona
- `RasterEditors.java` - `setCrs()` implementation with CRS parsing pipeline
- `RasterAccessors.java` - `crs()` implementation with multi-format export

Spark SQL:
- `RasterEditors.scala`, `RasterAccessors.scala` - Spark SQL expression wrappers
- `Catalog.scala` - Function registration

Tests:
- `RasterEditorsTest.java` - 7 unit tests (EPSG, WKT1, WKT2, PROJ, PROJJSON, all proj4sedona projections)
- `RasterAccessorsTest.java` - 7 unit tests (all output formats, null handling, invalid format)
- `CrsRoundTripComplianceTest.java` - 81 round-trip compliance tests across 22+ representative EPSG codes × 4 formats (PROJ, PROJJSON, WKT1, WKT2), verifying idempotency of export-import-re-export cycles
- `rasteralgebraTest.scala` - 9 Spark integration tests

Documentation:
- `RS_SetCRS.md`, `RS_CRS.md` - Individual function docs with limitations sections
- `RS_SRID.md` - Updated to document that 0 can mean custom (non-EPSG) CRS
- `Raster-Functions.md` - Index page entries

Dependency:
How was this patch tested?
- `RasterEditorsTest` tests (EPSG, WKT1, WKT2, PROJ, PROJJSON, all proj4sedona projections)
- `RasterAccessorsTest` tests (all output formats, null handling, invalid format)
- `CrsRoundTripComplianceTest` tests verifying idempotency across 22+ representative EPSG codes × 4 formats
- `rasteralgebraTest.scala`
- `mvn test -pl common`

Did this PR include necessary documentation updates?
`v1.9.0` format.