MINIFICPP-1434: Fix flaky CSite2SiteTests on macOS #994

Closed
martinzink wants to merge 2 commits into apache:main from martinzink:MINIFICPP-1434

@martinzink martinzink commented Feb 5, 2021

Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file?
  • If applicable, have you updated the NOTICE file?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.

@martinzink
Member Author

The issue is that on macOS the default SIGPIPE handler crashes the application.
In production code we already ignore this signal by setting the SIGPIPE handler to nothing; see https://github.com/apache/nifi-minifi-cpp/blob/main/main/MiNiFiMain.cpp#L181
Given this, I think it would be appropriate to ignore it in these tests as well.

I've tested this on my fork with custom CI, on both macos-xcode12.0 and macos-xcode11.2.1, with 200 runs each:
make test ARGS="--timeout 300 -j4 --output-on-failure -R CSite2SiteTests --repeat-until-fail 200"
Without the fix both configurations failed (both around the 100th run); with the fix they passed 3 times (that's 1200 runs without failure).

TEST_CASE("TestSiteToBootStrap", "[S2S3]") {

#ifndef WIN32
signal(SIGPIPE, SIG_IGN);
Contributor

Could you elaborate how this fixes the issue?

Member Author

Sure. A SIGPIPE is sent to a process if it tries to write to a socket that has been shut down for writing or isn't connected (anymore). This can easily happen in a multithreaded environment.
The issue is that on macOS the default SIGPIPE handler crashes the application.
In production code we already ignore this signal by setting the SIGPIPE handler to nothing; see https://github.com/apache/nifi-minifi-cpp/blob/main/main/MiNiFiMain.cpp#L181

Given that in production code we are ignoring the signal, I think it would be best to ignore it here as well.

I've tested this on my fork with custom CI, on both macos-xcode12.0 and macos-xcode11.2.1, with 200 runs each:
make test ARGS="--timeout 300 -j4 --output-on-failure -R CSite2SiteTests --repeat-until-fail 200"
Without the fix both configurations failed (both around the 100th run); with the fix they passed 3 times (that's 1200 runs without failure).
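
For anyone who wants to see the effect in isolation, here is a minimal sketch of what ignoring SIGPIPE changes. It is not taken from the MiNiFi code base: the helper name errno_after_write_to_closed_peer is hypothetical, and a local socketpair stands in for the Site2Site connection. It assumes a POSIX system (macOS or Linux). With the signal ignored, the write to the closed peer fails with EPIPE instead of terminating the process.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper for illustration only (not part of MiNiFi).
// Writes one byte to a socket whose peer has already been closed and returns
// the errno reported by write(), 0 if the write unexpectedly succeeded,
// or -1 if the socket pair could not be created.
int errno_after_write_to_closed_peer() {
  // Same call as in the PR: ignore SIGPIPE for the whole process.
  // Without this line the write below would terminate the process.
  signal(SIGPIPE, SIG_IGN);

  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
    return -1;
  }

  close(fds[1]);  // the peer goes away, as it can in the flaky test

  const char byte = 'x';
  errno = 0;
  const ssize_t written = write(fds[0], &byte, 1);
  const int saved_errno = errno;  // save before close() can overwrite it

  close(fds[0]);
  return written == -1 ? saved_errno : 0;
}
```

Calling the helper should return EPIPE, which the caller can handle like any other failed write. Removing the signal call makes the same write kill the process, which matches the intermittent test crashes described above.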

Contributor

Thanks for the details, makes sense!
At least whoever finds this change later will see the reasoning here.

@arpadboda arpadboda closed this in b977773 Feb 8, 2021
@martinzink martinzink deleted the MINIFICPP-1434 branch March 6, 2023 08:49