MINIFICPP-1380 - Batch behavior for CompressContent and MergeContent processors #917

Closed
adamdebreceni wants to merge 14 commits

adamdebreceni
Contributor

Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file?
  • If applicable, have you updated the NOTICE file?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.

@adamdebreceni adamdebreceni marked this pull request as draft October 1, 2020 12:03
@adamdebreceni adamdebreceni marked this pull request as ready for review October 2, 2020 10:08
extensions/libarchive/BinFiles.h: 1 review thread (outdated, resolved)
extensions/libarchive/CompressContent.cpp: 3 review threads (outdated, resolved)
extensions/libarchive/CompressContent.h: 1 review thread (outdated, resolved)
core::Property BinFiles::BatchSize(
core::PropertyBuilder::createProperty("Batch Size")
->withDescription("Maximum number of FlowFiles processed in a single session")
->withDefaultValue<uint32_t>(1)->build());
Contributor

Wonder if this is the best default value...

Contributor Author

I wanted to preserve the original behavior, as this is an optimization feature the user can configure on an as-needed basis.
Should we change it, and if so, what do you think would be the appropriate default?
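
To make the trade-off concrete, here is a minimal sketch of a batch loop capped by a "Batch Size" value. `FlowFile`, `Session`, and `processBatch` are hypothetical stand-ins written for this illustration, not the MiNiFi classes; only the `session->get()` / `nullptr` pattern comes from the diff under review. With a batch size of 1 the loop handles a single FlowFile per call, which is how the default preserves the original behavior.

```cpp
#include <cstdint>
#include <deque>
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT the MiNiFi classes.
struct FlowFile {
  std::string content;
};

struct Session {
  std::deque<std::shared_ptr<FlowFile>> queue;

  // Returns the next queued FlowFile, or nullptr when the queue is empty.
  std::shared_ptr<FlowFile> get() {
    if (queue.empty()) {
      return nullptr;
    }
    auto flow = queue.front();
    queue.pop_front();
    return flow;
  }
};

// Handles at most batch_size FlowFiles from one session and
// returns how many were actually handled.
uint32_t processBatch(Session& session, uint32_t batch_size) {
  uint32_t processed = 0;
  for (; processed < batch_size; ++processed) {
    auto flow = session.get();
    if (flow == nullptr) {
      break;  // queue drained before the batch filled up
    }
    // ... per-FlowFile work (compress, bin, etc.) would go here ...
  }
  return processed;
}
```

A larger value lets one session drain several queued FlowFiles in a single call, while the returned count tells the caller whether any input was available at all.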

auto flow = session->get();

if (flow == nullptr) {
break;
Contributor

I wonder if the processor should yield when 0 FlowFiles were processed.

Contributor Author

👍

Contributor Author

The problem with yielding is that this processor (MergeContent) can function correctly even when there are no incoming FlowFiles: it can still emit already processed merged files.
This is not true for CompressContent, so I added a yield there.
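
The distinction between the two processors can be sketched as a small decision function. `ProcessorKind` and `shouldYield` are hypothetical names invented for this illustration and do not appear in the PR; the sketch only encodes the rule described above.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not taken from the MiNiFi code base.
enum class ProcessorKind { CompressContent, MergeContent };

// Decides whether a trigger that consumed `processed` FlowFiles should yield.
bool shouldYield(ProcessorKind kind, uint32_t processed) {
  if (processed > 0) {
    return false;  // real work was done, so stay schedulable
  }
  switch (kind) {
    case ProcessorKind::CompressContent:
      return true;   // nothing useful can happen without input
    case ProcessorKind::MergeContent:
      return false;  // may still emit already merged files
  }
  return false;  // not reached for valid enum values
}
```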

#define FOR_EACH_7(fn, delim, _1, _2, _3, _4, _5, _6, _7) \
fn(_1) delim() fn(_2) delim() fn(_3) delim() fn(_4) delim() \
fn(_5) delim() fn(_6) delim() fn(_7)
#define FOR_EACH_8(fn, delim, _1, _2, _3, _4, _5, _6, _7, _8) \
Contributor

I wonder if this can be done generically with variadic arguments: process the first of the variadic arguments and pass the rest to a recursive call.

Contributor Author

I wish, but recursive macro expansion is a no-go 😢

Contributor Author

we could give something like this a try:
#define FOR_EACH_8(fn, delim, _1, ...) fn(_1) delim() FOR_EACH_7(fn, delim, __VA_ARGS__)
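
A small, self-contained sketch of that chained form, shortened to four levels. A macro cannot expand itself recursively, but each `FOR_EACH_N` here calls the distinct macro `FOR_EACH_(N-1)`, which the preprocessor does expand. `STRINGIFY`, `COMMA`, and `kNames` are hypothetical helpers added only for the demo. Some older MSVC preprocessor modes are known to forward `__VA_ARGS__` as a single argument, so an extra expansion wrapper may be needed there.

```cpp
#include <string>

// Each FOR_EACH_N peels off the first argument and forwards the rest
// to FOR_EACH_(N-1). This chains distinct macros; it is not recursion.
#define FOR_EACH_1(fn, delim, _1) fn(_1)
#define FOR_EACH_2(fn, delim, _1, ...) fn(_1) delim() FOR_EACH_1(fn, delim, __VA_ARGS__)
#define FOR_EACH_3(fn, delim, _1, ...) fn(_1) delim() FOR_EACH_2(fn, delim, __VA_ARGS__)
#define FOR_EACH_4(fn, delim, _1, ...) fn(_1) delim() FOR_EACH_3(fn, delim, __VA_ARGS__)

// Hypothetical demo helpers: turn a token into a string literal,
// and produce a comma as the delimiter between elements.
#define STRINGIFY(x) #x
#define COMMA() ,

// Expands to { "red" , "green" , "blue" , "alpha" }.
static const char* const kNames[] = {
    FOR_EACH_4(STRINGIFY, COMMA, red, green, blue, alpha)};
```

Each level still has to be written out once, but every definition stays a single short line instead of listing all parameters by hand.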

libminifi/include/utils/Enum.h: 1 review thread (resolved)
libminifi/test/Utils.h: 1 review thread (outdated, resolved)
Contributor

@lordgamez lordgamez left a comment

LGTM!

@arpadboda arpadboda closed this in ed523d4 Nov 10, 2020
4 participants