Set minimum timestamp of partitions properly #3141

Merged
merged 4 commits into from May 11, 2023

dominiklohmann
Member

This fixes a regression in the partition transformer: whenever the transformation pipeline created new table slices, the original import timestamp was lost as part of the executed pipeline, so the resulting partitions ended up with a minimum timestamp of the epoch.

@dominiklohmann dominiklohmann added the bug Incorrect behavior label May 11, 2023
@dominiklohmann dominiklohmann requested a review from tobim May 11, 2023 07:02
The catalog already gets its schemas from the partition synopses, so
there's no need to send it the schemas individually. This was previously
needed when type registry and catalog were separate.
@tobim
Member

tobim commented May 11, 2023

I believe we need to adjust the initial value of state.min_import_time to vast::time::max(), and symmetrically state.max_import_time to vast::time::min(), to make it fully correct. Possible diff:

diff --git a/libvast/include/vast/system/partition_transformer.hpp b/libvast/include/vast/system/partition_transformer.hpp
index e9f8d5a770..e79184775e 100644
--- a/libvast/include/vast/system/partition_transformer.hpp
+++ b/libvast/include/vast/system/partition_transformer.hpp
@@ -103,10 +103,10 @@ struct partition_transformer_state {
   size_t events = 0ull;
 
   /// Oldest import timestamp of the input data.
-  vast::time min_import_time = {};
+  vast::time min_import_time = vast::time::max();
 
   /// Newest import timestamp of the input data.
-  vast::time max_import_time = {};
+  vast::time max_import_time = vast::time::min();
 
   /// The data of the newly created partition(s).
   std::multimap<type, active_partition_state::serialization_data> data = {};
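
To illustrate why the initial values in the diff above matter, here is a minimal, self-contained sketch. It is not the partition transformer's actual code: `timestamp` is a hypothetical stand-in for `vast::time` (assumed here to be a nanosecond-resolution `system_clock` time point), and the sketch assumes the state is updated with a running `std::min`/`std::max` over the incoming import times, which this thread does not show. Under those assumptions, a default-initialized minimum starts at the epoch and can never move above it, while a minimum that starts at `max()` is replaced by the first real timestamp.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical stand-in for vast::time (assumption: nanoseconds since the
// system_clock epoch).
using timestamp = std::chrono::time_point<std::chrono::system_clock,
                                          std::chrono::nanoseconds>;

// Mirrors the old defaults: both bounds start at the epoch.
struct epoch_initialized_range {
  timestamp min_import_time = {};
  timestamp max_import_time = {};
};

// Mirrors the suggested defaults: each bound starts at the opposite extreme.
struct sentinel_initialized_range {
  timestamp min_import_time = timestamp::max();
  timestamp max_import_time = timestamp::min();
};

// Folds import times into a range with a running min/max (assumed update rule).
template <class Range>
Range fold_import_times(const std::vector<timestamp>& import_times) {
  Range range;
  for (const auto& t : import_times) {
    range.min_import_time = std::min(range.min_import_time, t);
    range.max_import_time = std::max(range.max_import_time, t);
  }
  return range;
}

// Small usage example with two post-epoch import times.
inline void check_example() {
  const std::vector<timestamp> import_times{
    timestamp{std::chrono::seconds{200}},
    timestamp{std::chrono::seconds{100}},
  };
  const auto old_range
    = fold_import_times<epoch_initialized_range>(import_times);
  const auto new_range
    = fold_import_times<sentinel_initialized_range>(import_times);
  // The epoch-initialized minimum never leaves the epoch.
  assert(old_range.min_import_time == timestamp{});
  // The sentinel-initialized minimum picks up the oldest real import time.
  assert(new_range.min_import_time == timestamp{std::chrono::seconds{100}});
  assert(new_range.max_import_time == timestamp{std::chrono::seconds{200}});
}
```

Note that in this sketch the epoch-initialized maximum still comes out right for post-epoch data, so the visible symptom is on the minimum side; initializing the maximum to `min()` keeps the two bounds symmetric and also covers timestamps before the epoch.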

@dominiklohmann dominiklohmann merged commit 9981b7c into main May 11, 2023
40 checks passed
@dominiklohmann dominiklohmann deleted the topic/partition-transformer-timestamps branch May 11, 2023 13:08