Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Duration Field & FieldType support #1374

Merged
merged 41 commits into from
Apr 6, 2023
Merged

feat: Duration Field & FieldType support #1374

merged 41 commits into from
Apr 6, 2023

Conversation

chloeminkyung
Copy link
Contributor

@chloeminkyung chloeminkyung commented Apr 4, 2023

  • implementation
  • mathematical
  • unit tests
  • will tackle on sql representation of Duration on separate PR (e.g. INTERVAL '1 MINUTE', DURATION '1 MINUTE', "6/04/2023 10:00" + 30 MINUTES)
pub struct DozerDuration(pub std::time::Duration, pub TimeUnit);

pub enum TimeUnit {
    Seconds,
    Milliseconds,
    Microseconds,
    Nanoseconds,
}

Field implementation

Duration(DozerDuration),

FieldType implementation

/// Duration up to nanoseconds.
Duration,
// Duration + Duration = Duration
// Duration - Duration = Duration
// Duration * Duration = Error
// Duration / Duration = Error
// Duration % Duration = Error

// Duration + Timestamp = Timestamp
// Duration - Timestamp = Error
// Duration * Duration = Error
// Duration / Duration = Error
// Duration % Duration = Error

// Timestamp + Duration = Timestamp
// Timestamp - Duration = Timestamp
// Timestamp * Duration = Error
// Timestamp / Duration = Error
// Timestamp % Duration = Error

// Timestamp + Timestamp = Error
// Timestamp - Timestamp = Duration (valid only when Timestamp (left) > Timestamp (right))
// Timestamp * Timestamp = Error
// Timestamp / Timestamp = Error
// Timestamp % Timestamp = Error

Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
# Conflicts:
#	dozer-orchestrator/src/pipeline/sinks.rs
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
@chubei
Copy link
Contributor

chubei commented Apr 4, 2023

Can you summarize:

  • How is duration represented in REST API
  • How is duration represented in gRPC API
  • How does duration convert to and from arrow data types in gRPC connector
  • How does duration convert from arrow data types in object store connector

And I remember we determined to use

pub struct DozerDuration(pub std::time::Duration, pub TimeUnit);

Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
@chloeminkyung
Copy link
Contributor Author

chloeminkyung commented Apr 5, 2023

Can you summarize:

  • How is duration represented in REST API
  • How is duration represented in gRPC API
  • How does duration convert to and from arrow data types in gRPC connector
  • How does duration convert from arrow data types in object store connector

And I remember we determined to use

pub struct DozerDuration(pub std::time::Duration, pub TimeUnit);
  • How is duration represented in REST API
fn convert_duration_to_object(d: &DozerDuration) -> Value {
    let mut m = Map::new();
    m.insert("value".to_string(), Value::from(d.0.as_nanos().to_string())); // as_nanos() will give out u128 value which has to be represented in string as there is no support for 128 primitive value yet
    m.insert("time_unit".to_string(), Value::from(d.1.to_string()));
    Value::Object(m)
}
  • How is duration represented in gRPC API
fn map_duration_to_prost_coord_map(d: DozerDuration) -> Value {
    Value {
        value: Some(value::Value::DurationValue(DurationType {
            value: d.0.as_nanos().to_string(),
            time_unit: d.1.to_string(),
        })),
    }
}
  • How does duration convert to and from arrow data types in gRPC connector
FieldType::Duration => DataType::Duration(TimeUnit::Nanosecond),

FieldType::Duration => {
            let mut builder = arrow::array::DurationNanosecondArray::builder(count);
            for field in fields {
                match field {
                    Field::Duration(value) => builder.append_value(value.0.as_nanos().as_i64()),
                    Field::Null => builder.append_null(),
                    _ => panic!("Unexpected field type"),
                }
            }
            Arc::new(builder.finish())
        }
  • How does duration convert from arrow data types in object store connector
GrpcTypes::value::Value::DurationValue(d) => {
            let duration_type_desc = descriptor.duration_field.message.clone();
            let value_field_desc = &descriptor.duration_field.value;
            let time_unit_field_desc = &descriptor.duration_field.time_unit;
            let mut duration = DynamicMessage::new(duration_type_desc);
            duration.set_field(value_field_desc, prost_reflect::Value::String(d.value));
            duration.set_field(
                time_unit_field_desc,
                prost_reflect::Value::String(d.time_unit),
            );
            Value::Message(duration)
        }

Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
@coveralls
Copy link

coveralls commented Apr 5, 2023

Pull Request Test Coverage Report for Build 4626028337

  • 776 of 1109 (69.97%) changed or added relevant lines in 30 files are covered.
  • 440 unchanged lines in 26 files lost coverage.
  • Overall coverage increased (+0.6%) to 72.398%

Changes Missing Coverage Covered Lines Changed/Added Lines %
dozer-api/src/generator/oapi/utils.rs 0 1 0.0%
dozer-cache/src/cache/lmdb/cache/main_environment/mod.rs 0 1 0.0%
dozer-ingestion/tests/test_suite/basic.rs 0 1 0.0%
dozer-sql/src/pipeline/expression/datetime.rs 0 1 0.0%
dozer-sql/src/pipeline/expression/python_udf.rs 0 1 0.0%
dozer-sql/src/pipeline/expression/tests/execution.rs 9 10 90.0%
dozer-ingestion/tests/test_suite/connectors/sql.rs 0 2 0.0%
dozer-sql/src/pipeline/expression/scalar/number.rs 0 2 0.0%
dozer-api/src/rest/api_generator.rs 14 17 82.35%
dozer-sql/src/pipeline/aggregation/count.rs 0 3 0.0%
Files with Coverage Reduction New Missed Lines %
dozer-api/src/cache_builder/mod.rs 1 59.09%
dozer-api/src/generator/oapi/utils.rs 1 48.15%
dozer-api/src/grpc/types_helper.rs 1 41.4%
dozer-api/src/grpc/typed/helper.rs 3 59.24%
dozer-api/src/rest/api_generator.rs 4 70.63%
dozer-sql/src/pipeline/aggregation/max.rs 4 57.01%
dozer-sql/src/pipeline/aggregation/min.rs 4 57.01%
dozer-sql/src/pipeline/errors.rs 4 21.74%
dozer-api/src/generator/oapi/generator.rs 5 84.95%
dozer-core/src/executor/source_node.rs 5 94.74%
Totals Coverage Status
Change from base Build 4625570253: 0.6%
Covered Lines: 33422
Relevant Lines: 46164

💛 - Coveralls

Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
@chloeminkyung chloeminkyung requested review from v3g42, chubei, snork-alt and mediuminvader and removed request for v3g42 April 6, 2023 05:52
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
@chloeminkyung chloeminkyung marked this pull request as ready for review April 6, 2023 05:54
Signed-off-by: Chloe Kim <chloeminkyung@gmail.com>
Copy link
Contributor

@v3g42 v3g42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@chloeminkyung chloeminkyung removed the request for review from chubei April 6, 2023 06:19
@chloeminkyung chloeminkyung merged commit 9ddf429 into main Apr 6, 2023
@chloeminkyung chloeminkyung deleted the feat/duration branch April 6, 2023 07:09
@chloeminkyung chloeminkyung linked an issue Apr 6, 2023 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Introduce Duration as a FieldType
5 participants