@eadgbear
Contributor

@eadgbear eadgbear commented May 6, 2025

Latest commit: Adding object functions v1

  • Introduced three new variant object manipulation functions:
    • object_delete: Remove elements from variant objects
    • object_insert: Add new elements to variant objects
    • object_pick: Select specific elements from variant objects
  • Minor refinements to timestamp and time-related functions
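As a rough illustration of what the three object functions are meant to do, here is a minimal sketch over a plain map. It is not the PR's implementation: the `Object` alias, the string-only values, and the `update` flag on `object_insert` are assumptions made for this example.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for a variant object: string keys mapped to string values.
type Object = BTreeMap<String, String>;

/// Returns a copy of `obj` without the listed keys.
fn object_delete(obj: &Object, keys: &[&str]) -> Object {
    obj.iter()
        .filter(|(k, _)| !keys.contains(&k.as_str()))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

/// Returns a copy of `obj` with `key` set to `value`.
/// Assumption: inserting an existing key is an error unless `update` is true.
fn object_insert(obj: &Object, key: &str, value: &str, update: bool) -> Result<Object, String> {
    if obj.contains_key(key) && !update {
        return Err(format!("key '{key}' already exists"));
    }
    let mut out = obj.clone();
    out.insert(key.to_string(), value.to_string());
    Ok(out)
}

/// Returns a copy of `obj` containing only the listed keys; unknown keys are ignored.
fn object_pick(obj: &Object, keys: &[&str]) -> Object {
    obj.iter()
        .filter(|(k, _)| keys.contains(&k.as_str()))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}
```

All three return a new object rather than mutating the input, which matches how a SQL scalar function would behave on a per-row value.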

Code standards alignment

  • Comprehensive refactoring of variant functions for better code quality
  • Standardized implementation across array operations including:
    • Array manipulation (append, cat, compact, construct)
    • Array queries (contains, distinct, position)
    • Array transformations (reverse, sort, slice)
    • Array combinations (intersection, except, overlap)
  • Improved code organization and documentation
  • Enhanced error handling and type safety

Initial variant function implementation

  • Established core variant function framework
  • Added extensive array manipulation capabilities:
    • Basic operations (append, prepend, remove)
    • Query operations (contains, position, size)
    • Transformation functions (compact, distinct, sort)
    • Range and generation functions
    • Multi-array operations (zip, overlap, intersection)
  • Implemented JSON parsing and handling
  • Added object aggregation support
  • Created visitor pattern implementation for variant types
  • Restructured codebase to support variant operations
  • Improved query planning for variant operations
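For readers unfamiliar with the visitor pattern mentioned above, the following is a minimal sketch of how a visitor over a variant type could look. The `Variant` enum, the `VariantVisitor` trait, and the `LeafCounter` example are hypothetical names chosen for illustration; the PR's actual types are not shown in this thread.

```rust
// Hypothetical variant type; the PR's real representation may differ.
#[derive(Debug, Clone, PartialEq)]
enum Variant {
    Null,
    Bool(bool),
    Number(f64),
    Text(String),
    Array(Vec<Variant>),
    Object(Vec<(String, Variant)>),
}

/// One method per variant kind; `Output` is whatever the traversal computes.
trait VariantVisitor {
    type Output;
    fn visit_null(&mut self) -> Self::Output;
    fn visit_bool(&mut self, v: bool) -> Self::Output;
    fn visit_number(&mut self, v: f64) -> Self::Output;
    fn visit_text(&mut self, v: &str) -> Self::Output;
    fn visit_array(&mut self, items: &[Variant]) -> Self::Output;
    fn visit_object(&mut self, fields: &[(String, Variant)]) -> Self::Output;
}

impl Variant {
    /// Dispatches to the visitor method that matches this value's kind.
    fn accept<V: VariantVisitor>(&self, visitor: &mut V) -> V::Output {
        match self {
            Variant::Null => visitor.visit_null(),
            Variant::Bool(b) => visitor.visit_bool(*b),
            Variant::Number(n) => visitor.visit_number(*n),
            Variant::Text(s) => visitor.visit_text(s),
            Variant::Array(items) => visitor.visit_array(items),
            Variant::Object(fields) => visitor.visit_object(fields),
        }
    }
}

/// Example visitor: counts scalar leaves, recursing into arrays and objects.
struct LeafCounter;

impl VariantVisitor for LeafCounter {
    type Output = usize;
    fn visit_null(&mut self) -> usize { 1 }
    fn visit_bool(&mut self, _: bool) -> usize { 1 }
    fn visit_number(&mut self, _: f64) -> usize { 1 }
    fn visit_text(&mut self, _: &str) -> usize { 1 }
    fn visit_array(&mut self, items: &[Variant]) -> usize {
        items.iter().map(|item| item.accept(&mut *self)).sum()
    }
    fn visit_object(&mut self, fields: &[(String, Variant)]) -> usize {
        fields.iter().map(|(_, value)| value.accept(&mut *self)).sum()
    }
}
```

The appeal of this shape is that each new operation becomes a new visitor type, so the `match` over variant kinds lives in one place (`accept`).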

TODO

object_construct needs more work and will arrive in a subsequent PR

Contributor

@rampage644 rampage644 left a comment

Didn't dive deep into the actual UDF implementation; the overall structure LGTM.

@rampage644 rampage644 requested review from Vedin, jonathanc-n and ravlio May 6, 2025 13:51
@rampage644 rampage644 requested a review from Copilot May 12, 2025 02:08

Copilot AI left a comment

Pull Request Overview

This PR introduces variant object manipulation functions and refactors existing variant functions and module structures for improved organization and type safety. Key changes include:

  • Updates to the module paths for UDFs and UDAFs, e.g. moving functions into dedicated udafs and udfs folders.
  • The addition of new dependency jsonpath_lib and a new lint rule in Cargo.toml.
  • Removal of the aggregate/mod.rs file and adjustments in snapshot metadata.

Reviewed Changes

Copilot reviewed 73 out of 73 changed files in this pull request and generated no comments.

File | Description
--- | ---
crates/runtime/src/execution/datafusion/functions/udfs/date_from_parts.rs | Updated import paths to reflect new module structure for UDFs.
crates/runtime/src/execution/datafusion/functions/udafs/mod.rs | Introduced module to register UDAFs.
crates/runtime/src/execution/datafusion/functions/udafs/any_value.rs | Removed the Apache license header and added an attribute to the new constructor.
crates/runtime/src/execution/datafusion/functions/table/flatten.rs | Adjusted test import paths to match the new UDF module structure.
crates/runtime/src/execution/datafusion/functions/mod.rs | Restructured module exports to separate udafs and udfs.
crates/runtime/src/execution/datafusion/functions/aggregate/mod.rs | Removed the aggregate module likely due to refactoring consolidations.
crates/runtime/Cargo.toml | Added new dependency jsonpath_lib.
crates/metastore/src/snapshots/embucket_metastore__metastore__tests__create_volumes.snap | Updated snapshot metadata by adding snapshot_kind.
Cargo.toml | Added a new lint rule to allow redundant_closure_for_method_calls.
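
For context on the lint named in the Cargo.toml entry: Clippy's redundant_closure_for_method_calls reports closures that do nothing but call a method on their argument. The sketch below shows the two equivalent spellings; the function names are made up for this example and do not come from the PR.

```rust
/// Closure form: the shape that the lint reports when it is enabled.
fn lengths_with_closure(words: &[String]) -> Vec<usize> {
    words.iter().map(|w| w.len()).collect()
}

/// Method-path form: the replacement the lint suggests.
fn lengths_with_path(words: &[String]) -> Vec<usize> {
    words.iter().map(String::len).collect()
}
```

Allowing the lint means the closure form is accepted as-is, which some teams prefer for readability.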
Comments suppressed due to low confidence (1)

crates/runtime/src/execution/datafusion/functions/udafs/any_value.rs:1

  • The Apache license header has been removed in this file. Please ensure that the appropriate licensing text is maintained according to project licensing policies.
// Licensed to the Apache Software Foundation (ASF) under one

@ravlio
Contributor

ravlio commented May 12, 2025

@eadgbear do you plan to resolve conflicts?

@eadgbear
Contributor Author

@ravlio What was the reason for moving all of the functions to a new crate? It's now made this PR even bigger with the merge

@rampage644
Contributor

The decision to move functions into a separate crate followed as a natural consequence of #744.

Contributor

@rampage644 rampage644 left a comment

To make sure this won't accidentally get merged without a rebase

@ravlio
Contributor

ravlio commented May 14, 2025

@eadgbear I found the following missing functions:

array_construct_compact
array_flatten
array_to_string
array_union_agg
array_unique_agg

If you don't mind, I'll take these on.

@rampage644
Contributor

array_construct_compact

I think that one is achieved via an AST rewrite rule, btw.

@eadgbear
Contributor Author

eadgbear commented May 14, 2025 via email

@ravlio
Contributor

ravlio commented May 14, 2025

@eadgbear the thing is that array_to_string takes a separator argument:

ARRAY_TO_STRING( <array> , <separator_string> )

Not a big deal to implement, though: just replace the , with <separator_string>. Anyway, let me implement this.
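
A minimal sketch of the separator-aware behaviour described above, assuming the array elements have already been converted to strings. The Rust signature is hypothetical and only mirrors the two-argument SQL form; how the real function handles nulls or nested values is not covered here.

```rust
/// Joins already-stringified array elements, placing `separator` between them.
fn array_to_string(elements: &[&str], separator: &str) -> String {
    elements.join(separator)
}
```

With a hard-coded "," this reduces to the simpler behaviour the comment refers to; taking the separator as a parameter is the only change.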

@eadgbear
Contributor Author

eadgbear commented May 14, 2025 via email

@rampage644 rampage644 closed this May 21, 2025

6 participants