Implement `project` for `Transform`. #264

liurenjie1024 · 2024-03-14T02:47:54Z

What's project used for? Say we have a row filter a = 10 and a partition spec bucket(a, 37) as bs. If a row matches a = 10, then its partition value must match bucket(10, 37), so we project a = 10 to bs = bucket(10, 37).
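To make the example concrete, here is a minimal sketch of projecting an equality filter through a bucket transform. Everything in it is hypothetical and for illustration only: `stand_in_hash` is a placeholder (the real bucket transform uses a 32-bit Murmur3 hash), and `EqFilter`, `PartitionEq` and `project_eq` are made-up names, not types from this crate.

```rust
/// Placeholder hash, NOT the Murmur3 hash the real bucket transform uses.
fn stand_in_hash(v: i32) -> u32 {
    (v as u32).wrapping_mul(2_654_435_761)
}

/// Hypothetical bucket(v, n): maps a value to one of `n` buckets.
fn bucket(v: i32, n: u32) -> u32 {
    stand_in_hash(v) % n
}

/// Row filter `a = literal` on the source column (illustrative type).
struct EqFilter {
    literal: i32,
}

/// Projected filter `bs = value` on the partition column (illustrative type).
#[derive(Debug, PartialEq)]
struct PartitionEq {
    value: u32,
}

/// Projects `a = literal` to `bs = bucket(literal, n)`.
/// Any row with a = literal must live in that bucket, so partitions
/// with a different `bs` value can be skipped safely.
fn project_eq(filter: &EqFilter, n: u32) -> PartitionEq {
    PartitionEq {
        value: bucket(filter.literal, n),
    }
}
```

Calling `project_eq(&EqFilter { literal: 10 }, 37)` yields the partition predicate for the case above. The projection is inclusive: rows with other values of a may share the bucket, so the original row filter still has to be applied afterwards.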

Blocked by #283


marvinlanhenke · 2024-03-14T13:58:17Z

@liurenjie1024
...just to make sure I'm on the right track - and to confirm the idea.

Here is what I would do (high-level):

- implement fn project on Transform
- match Transform & original BoundPredicate
- handle each Transform (e.g. Bucket) and Predicate according to Java Implementation
- return Option<Predicate>

In order to implement the ProjectionEvaluator later, we can then get the PartitionSpec from the manifest, iterate over fields: Vec<PartitionField> and call fn project(...) on each field's transform...

some pseudo-implementation to illustrate:

impl Transform {
    // ...
    pub fn project(&self, name: String, pred: &BoundPredicate) -> Option<Predicate> {
        match self {
            Transform::Bucket(_) => match pred {
                BoundPredicate::Unary(expr) => Some(Predicate::Unary(UnaryExpression::new(
                    expr.op(),
                    Reference::new(name),
                ))),
                _ => unimplemented!(),
            },
            _ => unimplemented!(),
        }
    }
}

#[test]
fn test_bucket_project() {
    let trans = Transform::Bucket(8);

    let name = "projected_name".to_string();

    let field = NestedField::required(1, "a", Type::Primitive(PrimitiveType::Int));

    let pred = BoundPredicate::Unary(UnaryExpression::new(
        PredicateOperator::IsNull,
        BoundReference::new("original_name", Arc::new(field)),
    ));

    let result = trans.project(name, &pred).unwrap();

    assert_eq!(format!("{result}"), "projected_name IS NULL");
}
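The later step mentioned above (iterating over the partition fields and calling project on each field's transform) could look roughly like the following. This is a self-contained sketch with simplified stand-in types: `Pred`, `PartitionField` and `project_all` are hypothetical and much smaller than the crate's real `BoundPredicate` and `PartitionSpec`. The only behavior assumed here is that a bucket transform cannot project a starts-with predicate, while unary predicates are carried over to the partition column.

```rust
/// Simplified stand-in for a bound predicate (illustrative only).
#[derive(Debug, Clone, PartialEq)]
enum Pred {
    IsNull(String),
    StartsWith(String, String),
}

impl Pred {
    /// Name of the column the predicate refers to.
    fn column(&self) -> &str {
        match self {
            Pred::IsNull(c) | Pred::StartsWith(c, _) => c.as_str(),
        }
    }
}

#[derive(Debug, Clone, Copy)]
enum Transform {
    Identity,
    Bucket(u32),
}

impl Transform {
    /// Returns the predicate rewritten against the partition column `name`,
    /// or None when this transform cannot project the predicate.
    fn project(&self, name: &str, pred: &Pred) -> Option<Pred> {
        match (self, pred) {
            (_, Pred::IsNull(_)) => Some(Pred::IsNull(name.to_string())),
            (Transform::Identity, Pred::StartsWith(_, p)) => {
                Some(Pred::StartsWith(name.to_string(), p.clone()))
            }
            (Transform::Bucket(_), Pred::StartsWith(..)) => None,
        }
    }
}

/// Simplified stand-in for a partition field (illustrative only).
struct PartitionField {
    source: String,
    name: String,
    transform: Transform,
}

/// Projects `pred` through every field whose source column matches it.
fn project_all(fields: &[PartitionField], pred: &Pred) -> Vec<Pred> {
    fields
        .iter()
        .filter(|f| f.source == pred.column())
        .filter_map(|f| f.transform.project(&f.name, pred))
        .collect()
}
```

Returning Option from project keeps the caller simple: fields that cannot project a predicate just drop out of the result via `filter_map`.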

liurenjie1024 · 2024-03-19T02:11:16Z

@marvinlanhenke Sorry for the late reply. Yeah, this is exactly what I'm thinking about, thanks!

marvinlanhenke · 2024-03-19T06:32:20Z

thank you, once #283 is ready, I will continue here.

marvinlanhenke · 2024-03-25T12:30:01Z

@ZENOTME
are you still working on #287 and the proposed interface?
just looking for a quick update on this - and huge thanks for your effort on this.

ZENOTME · 2024-03-25T12:32:20Z

Yes. I will update it later.

marvinlanhenke · 2024-03-27T13:16:36Z

@liurenjie1024 @ZENOTME
...the fn project(...) is getting kinda extensive (lengthy). I think it'd be appropriate to introduce some helper functions here instead of writing a huge match statement - to make the code more readable and reduce duplication.

Any suggestions/preferences where to put those?

1. mod.rs in iceberg/src/transform
2. create utils.rs in iceberg/src/spec
3. create fns on impl Transform itself

thanks for your thoughts on this

EDIT:
For now, I decided to go with option no. 3, which I like since the scope of those helper functions is restricted to Transform - putting them into some utils.rs would not be justified, since nobody else would need those fns...
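A rough sketch of what option no. 3 could look like, assuming simplified stand-in types (`Pred` and the `project_with` helper are hypothetical names, not the code from the PR): the public fn project stays a short match, and the shared logic lives in a private helper on impl Transform that takes the value-level transform as a closure.

```rust
/// Simplified stand-in for a predicate (illustrative only).
#[derive(Debug, Clone, PartialEq)]
enum Pred {
    IsNull(String),
    Eq(String, i64),
}

enum Transform {
    Identity,
    /// Truncate to a multiple of the given positive width.
    Truncate(i64),
}

impl Transform {
    /// Public entry point: one short arm per transform.
    pub fn project(&self, name: &str, pred: &Pred) -> Option<Pred> {
        match self {
            Transform::Identity => Self::project_with(name, pred, |v| v),
            // Integer truncation: v - (v mod w), with a non-negative remainder.
            Transform::Truncate(w) => Self::project_with(name, pred, |v| v - v.rem_euclid(*w)),
        }
    }

    /// Private helper shared by all arms; it is only visible inside this
    /// module, so nothing outside Transform depends on it.
    fn project_with(name: &str, pred: &Pred, f: impl Fn(i64) -> i64) -> Option<Pred> {
        match pred {
            Pred::IsNull(_) => Some(Pred::IsNull(name.to_string())),
            Pred::Eq(_, v) => Some(Pred::Eq(name.to_string(), f(*v))),
        }
    }
}
```

In this toy version every case projects, so the Option is always Some; the return type is kept only to mirror the signature discussed earlier in the thread.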

liurenjie1024 · 2024-03-28T01:24:58Z

Hi, @marvinlanhenke Thanks for raising this. I prefer option no. 3 too, for a similar reason: keeping related code together helps to organize things better.

ZENOTME · 2024-03-28T13:30:56Z

+1 for no. 3 too

liurenjie1024 added this to the 0.3.0 Release milestone Mar 14, 2024

marvinlanhenke mentioned this issue Mar 14, 2024

[WIP] Implement project for Transform. #264 #269

Closed

This was referenced Mar 20, 2024

feat: add transform_literal #287

Merged

Implement transforms projection #289

Closed

marvinlanhenke mentioned this issue Mar 27, 2024

feat: Project transform #309

Merged

marvinlanhenke mentioned this issue Mar 31, 2024

Bug: fn day_timestamp_micro produces wrong results #311

Closed

liurenjie1024 closed this as completed in #309 Apr 5, 2024

marvinlanhenke mentioned this issue Apr 6, 2024

Add BoundPredicateVisitor trait #320

Closed
