Skip to content

Conversation

@binary-signal
Copy link
Contributor

Add ipc_compression feature to arrow dependency

@binary-signal
Copy link
Contributor Author

plus implement deserialize for timestamps

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds support for Fluss temporal data types (Time, Timestamp, TimestampLTz) with zstd compression by enabling the ipc_compression feature in the Arrow dependency. The implementation includes Arrow type conversions based on precision values and JSON serialization/deserialization for these temporal types.

Key changes:

  • Enabled ipc_compression feature for Arrow dependency to support zstd compression
  • Implemented Arrow type conversions for Time, Timestamp, and TimestampLTz with precision-based TimeUnit selection
  • Added JSON serialization/deserialization support for temporal types with precision handling

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
Cargo.toml Added ipc_compression feature to arrow dependency for zstd compression support
crates/fluss/src/record/arrow.rs Added TimeUnit import, temporal type builders, and to_arrow_type implementations for Time/Timestamp/TimestampLTz
crates/fluss/src/metadata/json_serde.rs Added JSON serialization/deserialization for Time/Timestamp/TimestampLTz types with precision handling

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

1..=3 => ArrowDataType::Time32(TimeUnit::Millisecond),
4..=6 => ArrowDataType::Time64(TimeUnit::Microsecond),
7..=9 => ArrowDataType::Time64(TimeUnit::Nanosecond),
_ => ArrowDataType::Time32(TimeUnit::Second), // fallback
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback comment is misleading. According to TimeType::MAX_PRECISION (9), precision values greater than 9 should be considered invalid rather than silently falling back to Time32(Second). Consider either adding validation to reject invalid precision values, or updating the comment to clarify this is handling out-of-range precision values (e.g., '// fallback for invalid precision > 9').

Suggested change
_ => ArrowDataType::Time32(TimeUnit::Second), // fallback
_ => ArrowDataType::Time32(TimeUnit::Second), // fallback for invalid precision > 9

Copilot uses AI. Check for mistakes.
1..=3 => TimeUnit::Millisecond,
4..=6 => TimeUnit::Microsecond,
7..=9 => TimeUnit::Nanosecond,
_ => TimeUnit::Microsecond, // default is 6, so microseconds
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback branches in both Timestamp and TimestampLTz match arms (lines 566 and 577) use Microsecond with the comment 'default is 6, so microseconds'. This duplicates the logic. Consider extracting a helper function like precision_to_time_unit(precision: u32) -> TimeUnit to avoid code duplication and ensure consistency between Timestamp and TimestampLTz conversions.

Copilot uses AI. Check for mistakes.
7..=9 => TimeUnit::Nanosecond,
_ => TimeUnit::Microsecond, // default is 6, so microseconds
};
// Use UTC as the timezone for local timezone timestamps
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment 'Use UTC as the timezone for local timezone timestamps' is confusing. TimestampLTz represents 'TIMESTAMP WITH LOCAL TIME ZONE', which typically means the timestamp should be stored in UTC but interpreted in the local timezone. The implementation is correct, but the comment should clarify that UTC is used for storage while the local timezone is applied during interpretation/display.

Suggested change
// Use UTC as the timezone for local timezone timestamps
// Store timestamps in UTC; local timezone is applied during interpretation/display

Copilot uses AI. Check for mistakes.
Comment on lines 118 to 124
obj.insert(Self::FIELD_NAME_PRECISION.to_string(), json!(_type.precision()));
}
DataType::Timestamp(_type) => {
todo!()
obj.insert(Self::FIELD_NAME_PRECISION.to_string(), json!(_type.precision()));
}
DataType::TimestampLTz(_type) => {
todo!()
obj.insert(Self::FIELD_NAME_PRECISION.to_string(), json!(_type.precision()));
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The three consecutive match arms (Time, Timestamp, TimestampLTz) have identical implementations - all insert precision into the JSON object. These can be combined into a single match arm pattern: DataType::Time(_type) | DataType::Timestamp(_type) | DataType::TimestampLTz(_type) => { obj.insert(Self::FIELD_NAME_PRECISION.to_string(), json!(_type.precision())); }

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant