Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: fix raciness in stop mode tests #5526

merged 7 commits into from Jul 23, 2018


Copy link

@zyga zyga commented Jul 17, 2018

This branch fixes raciness in the stop-mode tests. This test was failing for
the following reasons:

  • the registration of the signal handler was racing with the kill
  • the shell semantics for trap is error prone and was used incorrectly

The first issue is fixed with a "stamp" like file that we look for in the test.
This ensures that before we attempt to stop the service we wait for it to
initialize fully, that is to register the signal handlers.

The second issue is fixed by re-writing the test in Python 3. This requires
some more explanation. We had established signal handlers using the "trap"
shell builtin and were expecting them to run when a signal (sent by systemd as
a result of stopping the service) arrives. What we can see by running the shell
script with "strace -e %signal" indicates that shell disables trap handlers
before executing any command. A trivial test involved increasing the sleep time
to a large value and sending the expected signal at that time. The signal
handler never runs in that case.

This is fixed by writing the test service in Python and not relying on
additional processes. This has the expected semantics and works well in

Shell is tricky, more than we though, and we keep learning this the hard way.

Signed-off-by: Zygmunt Krynicki

zyga added 4 commits July 17, 2018 12:13
This fixes the issue where shell disables signal handlers while running
external commands. This was causing the signal handler established via
"trap" do nothing at all while the signal was delivered during "sleep 1"
process lifetime.

Signed-off-by: Zygmunt Krynicki <>
This fixes this issue:

    Traceback (most recent call last):
      File "./start-stop-mode", line 49, in <module>
      File "./start-stop-mode", line 45, in main
      File "./start-stop-mode", line 29, in on_signal
	print("got {}".format(signal.Signals(signum).name.lower()))
      File "./start-stop-mode", line 29, in on_signal
	print("got {}".format(signal.Signals(signum).name.lower()))
    RuntimeError: reentrant call inside <_io.BufferedWriter name='<stdout>'>

Signed-off-by: Zygmunt Krynicki <>
This allows the test to either use system synchronization or hand-made
synchronization but in either case, not be racy.

Signed-off-by: Zygmunt Krynicki <>
Signed-off-by: Zygmunt Krynicki <>
zyga added 2 commits July 17, 2018 14:02
I've seen the light now. I can sleep.

Signed-off-by: Zygmunt Krynicki <>
Signed-off-by: Zygmunt Krynicki <>
Copy link

codecov-io commented Jul 17, 2018

Codecov Report

Merging #5526 into master will increase coverage by 0.23%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5526      +/-   ##
+ Coverage   79.06%   79.29%   +0.23%     
  Files         514      514              
  Lines       38949    39492     +543     
+ Hits        30795    31316     +521     
- Misses       5681     5688       +7     
- Partials     2473     2488      +15
Impacted Files Coverage Δ
overlord/ifacestate/implicit.go 96.42% <0%> (-3.58%) ⬇️
overlord/hookstate/hookmgr.go 75.23% <0%> (+0.93%) ⬆️
overlord/snapstate/snapstate.go 85.6% <0%> (+3.55%) ⬆️
overlord/ifacestate/helpers.go 78.79% <0%> (+7.88%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4dd420d...85d675d. Read the comment docs.

Copy link
Collaborator Author

zyga commented Jul 19, 2018

As a general comment, the test still suffers some issues but we suspect those are in our general log extracting helpers. We've decided to merge this as is as it fixes real issues. Investigation of the log extractor bug continues in the branch referenced by mvo above.

Copy link

@mvo5 mvo5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you! Much nicer than the original shell version.

mvo5 added a commit to mvo5/snappy that referenced this pull request Jul 19, 2018
@bboozzoo bboozzoo merged commit c6ce331 into snapcore:master Jul 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
None yet
4 participants