Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write homogenous partitions from the partition transformer #2277

Merged
merged 15 commits into from Jun 2, 2022

Conversation

lava
Copy link
Member

@lava lava commented May 13, 2022

Ensure that the partition transformer always produces uniform partitions, ie. partitions that only contain a single type.

馃摑 Checklist

  • All user-facing changes have changelog entries.
  • The changes are reflected on docs.tenzir.com/vast, if necessary.
  • The PR description contains instructions for the reviewer, if necessary.

馃幆 Review Instructions

Mainly look at commit 38a4729 , ideally run locally and see that it produces multiple partitions.

@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from d221b48 to ac94034 Compare May 13, 2022 16:42
@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from 363ff9e to b42dba9 Compare May 24, 2022 08:36
@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from 77ba623 to 84d4f22 Compare May 25, 2022 11:50
@lava lava marked this pull request as ready for review May 25, 2022 11:52
@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from 84d4f22 to 925509b Compare May 25, 2022 11:56
@lava lava changed the title wip: Write homogenous partitions from the partition transformer Write homogenous partitions from the partition transformer May 25, 2022
@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from 925509b to fea5ff3 Compare May 27, 2022 14:05
lava added 4 commits May 27, 2022 20:11
This changes the partition transformer to write not just
one output partition but rather a separate partition for
every output type.

In general the logic is "all-or-nothing", ie. the partition
transformer returns an error if a single output partition
fails to be persisted to disk.

This has a number of downstream implications, in particular
the index now also returns a vector of new partition infos
and the catalog has been expanded with a few bulk handlers.
On systems where `python` points to Python 2 by default,
using a mixture of calls to `python` and `python3` leads
to an inconsistent environment missing required
dependencies.
@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from fea5ff3 to eba7c29 Compare May 27, 2022 18:12
@lava lava changed the base branch from master to topic/remove-legacy-index May 27, 2022 18:12
Base automatically changed from topic/remove-legacy-index to master May 27, 2022 19:18
Copy link
Member

@tobim tobim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few additional questions. I think this only needs a little polish now (and maybe a fix for the serialization data if I understand the code correctly).

libvast/include/vast/system/actors.hpp Show resolved Hide resolved
libvast/src/system/partition_transformer.cpp Outdated Show resolved Hide resolved
libvast/src/system/partition_transformer.cpp Outdated Show resolved Hide resolved
@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from f1eb482 to 0b77acf Compare May 30, 2022 14:58
Copy link
Member

@tobim tobim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great that this is fixed now. Only the test needs to use the VAST_FMT_RUNTIME macro fix.

libvast/test/system/partition_transformer.cpp Outdated Show resolved Hide resolved
Copy link
Member

@tobim tobim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new changes look ok as far as I can assess them right now.

I'm not sure if possible at the moment but we should add an integration test for this at some time.

@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from 23b3feb to 93456ef Compare May 31, 2022 13:51
@lava
Copy link
Member Author

lava commented Jun 1, 2022

I'm not sure if possible at the moment but we should add an integration test for this at some time.

It's possible but really ugly at the moment; see the compaction tests. This will become much easier when we have the vast flush command.

Co-authored-by: Gy艖z艖 G谩sp谩r <gyozo.gaspar@tenzir.com>
@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from abde549 to d7bcfa9 Compare June 2, 2022 12:58
Co-authored-by: Gy艖z艖 G谩sp谩r <gyozo.gaspar@tenzir.com>
@lava lava force-pushed the story/sc-31754/homogenous-partition-transformer branch from d7bcfa9 to 9acd287 Compare June 2, 2022 15:01
@lava lava enabled auto-merge June 2, 2022 15:01
@lava lava merged commit 56b2bd4 into master Jun 2, 2022
@lava lava deleted the story/sc-31754/homogenous-partition-transformer branch June 2, 2022 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants