Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable backup handler in tests #1399

Merged
merged 2 commits into from
Apr 30, 2020
Merged

Disable backup handler in tests #1399

merged 2 commits into from
Apr 30, 2020

Conversation

t-bast
Copy link
Member

@t-bast t-bast commented Apr 29, 2020

We've been witnessing random test suites freezes (since ages).
We've observed that when these freezes happen, there are usually a lot of "too many open files" errors raised by the OS.

The backup handler is a likely culprit as the IntegrationSpec is running multiple nodes and exchanging HTLCs at a fast rate.

At least it won't hurt disabling it in tests, and will speed up the test suite.

@t-bast t-bast requested a review from pm47 April 29, 2020 14:38
@t-bast
Copy link
Member Author

t-bast commented Apr 29, 2020

Allright one of the test runs failed, so it's clearly not the only culprit...
Back to digging into files...

EDIT: the file limit was somewhat low on semaphore, since I added the limit configuration it doesn't seem to repro

We've been witnessing random test suites freezes (since ages).
We've observed that when these freezes happen, there are usually a lot of
"too many open files" errors raised by the OS.

The backup handler is a likely culprit as the IntegrationSpec is running
multiple nodes and exchanging HTLCs at a fast rate.

At least it won't hurt disabling it in tests, and will speed up the
test suite.
Comment on lines +24 to +26
# configure file limits
- ulimit -S -n 1024000
- echo "fs.file-max = 1024000" | sudo tee -a /etc/sysctl.conf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have fixed the test issues in #1400 I wonder if this is really necessary. If it is, why do the test pass? Shouldn't the "too many open files" error completely break the tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the "too many open files" error completely break the tests?

It looks like it does, but it makes our test suite hang so we don't cleanly witness it (because we kill the build after 15 minutes to avoid wasting resources).

I wonder if this is really necessary

I can't be entirely sure, but since I added that I haven't repro-ed the hang on semaphore (I restarted the test suite several times), so it looks like it happens less often (hopefully it doesn't happen anymore). At least it can't hurt to raise this limit (the ulimit is already at 1024000 by default in semaphore, but it wasn't configured in sysctl.conf).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so this fixes the intermittent hanging while #1400 introduced then fixed a permanent hanging.

@t-bast t-bast merged commit c3a66f3 into master Apr 30, 2020
@t-bast t-bast deleted the fix-test-freeze branch April 30, 2020 09:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants