-
Notifications
You must be signed in to change notification settings - Fork 321
sdjournal: fix a race in GetEntry test #181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hard-sleep instead of waiting for a journal event. This fixes a race due to waiting for any events but enumerating only matching ones.
|
All together #178, #179, and #181 let the whole testsuite run on travis without failures. |
|
@lucab how about just putting them up in a single PR? |
|
@jonboulle #179 is some "creative" travis usage which I'm not sure if ok. The other two could have been merged yes, but this last one was unplanned at first as I was expecting #180 to be harder to trace down. |
|
@lucab is that like "creative accounting"? ;-). |
|
@jonboulle exactly 😄 I expanded the discussion in that PR. I'm not joining here the other two testsuite-fixing PRs now, but I'll avoid wasting resources next time. Relevant to this specific PR, there is a thread ongoing on systemd ML: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036946.html |
| if r < 0 { | ||
| t.Fatalf("Error waiting to journal") | ||
| } | ||
| time.Sleep(time.Duration(1) * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you expand on why a hard sleep is better than waiting for the journal? This seems counter-intuitive to me as time.Sleep cries for flakes later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From ML thread (quoting myself in lack of a better source):
sd_journal_wait() will trigger on *any* events, while sd_journal_get_data()
will apply the filter and find no matching entries.
In our context, Wait() will return as soon as there is a new message available in the log even if it doesn't match the filter, in which case Next() will return EOF (if expected entry is not yet available).
This sleep could be replaced with something like a Wait&Next with a retrial counter, but it will not eliminate the flakiness of the test: it will then depend on the magic number of retrials and additionally also on how many events happen before our expected entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could loop over sd_journal_wait()/sd_journal_get_data() until we eventually get our message matching the field, but I agree this could end up in an endless loop.
Not really happy about the time.Sleep at all, so let's observe this in subsequent CI builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me neither, but I think sdjournal would need to grow a C API for filtered event-triggering to properly address this usecase. May I go on and merge this as-is for now?
Hard-sleep instead of waiting for a journal event.
This fixes a race due to waiting for any events but enumerating only
matching ones.
Closes #180