Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#12018 Fix flaky twisted.test.test_twistd tests. #12019

Merged
merged 10 commits into from
Oct 31, 2023

Conversation

adiroiban
Copy link
Member

@adiroiban adiroiban commented Oct 12, 2023

Scope and purpose

Fixes #12018

It looks like the twistd tests are failing quite often.

This is caused by test_getLogObserverDefaultFile not closing the log file... but finding the source of this error wasn't that easy :)

[FAIL]
Traceback (most recent call last):
  File "/home/runner/work/twisted/twisted/.tox/alldeps-withcov-posix/lib/python3.11/site-packages/twisted/test/test_twistd.py", line 1030, in test_setUidSameAsCurrentUid
    self.assertEqual(expectedWarning, warningsShown[0]["message"])
  File "/home/runner/work/twisted/twisted/.tox/alldeps-withcov-posix/lib/python3.11/site-packages/twisted/trial/_synctest.py", line 444, in assertEqual
    super().assertEqual(first, second, msg)
  File "/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/unittest/case.py", line 873, in assertEqual
    assertion_func(first, second, msg=msg)
  File "/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/unittest/case.py", line 1253, in assertMultiLineEqual
    self.fail(self._formatMessage(msg, standardMsg))
twisted.trial.unittest.FailTest: 'tried to drop privileges and setuid 1001[52 chars]ing.' != "unclosed file <_io.FileIO name='/home/ru[103 chars]rue>"
- tried to drop privileges and setuid 1001 but uid is already 1001; should we be root? Continuing.
+ unclosed file <_io.FileIO name='/home/runner/work/twisted/twisted/.tox/alldeps-withcov-posix/tmp/_trial_temp/3/twistd.log' mode='rb+' closefd=True>


twisted.test.test_twistd.UnixApplicationRunnerStartApplicationTests.test_setUidSameAsCurrentUid

This tries to fix the error.

it also moves more tests that start an application in the same testcase, so that they are not executed in parallel.

There is missing coverage, but the code is just moved.

@adiroiban
Copy link
Member Author

Since this generates many flaky tests on trunk I think we can fix it as it is now.

So this is kind of "emergency PR" to reduce the noise on other PRs.

It should reduce the probability, since now the tests should no longer be executed in parallel.

If I run $ trial -j 2 -u twisted.test.test_twistd

I still get a failure...maybe after 100 runs.

  UnixApplicationRunnerStartApplicationTests
    test_applicationStartsWithConfiguredNumericIDs ...                     [OK]
    test_setUidSameAsCurrentUid ...                                        [OK]
    test_setUidWithoutGid ...                                              [OK]
    test_applicationStartsWithConfiguredNameIDs ...                     [ERROR]
    test_setupEnvironment ...                                              [OK]
    test_shedPrivileges ...                                                [OK]
    test_shedPrivilegesError ...                                           [OK]

===============================================================================
[ERROR]
Traceback (most recent call last):
  File "/home/adi/chevah/twisted/src/twisted/test/test_twistd.py", line 1012, in test_applicationStartsWithConfiguredNameIDs
    self._applicationStartsWithConfiguredID(
  File "/home/adi/chevah/twisted/src/twisted/test/test_twistd.py", line 981, in _applicationStartsWithConfiguredID
    runner.preApplication()
  File "/home/adi/chevah/twisted/src/twisted/scripts/_twistd_unix.py", line 205, in preApplication
    checkPID(self.config["pidfile"])
  File "/home/adi/chevah/twisted/src/twisted/scripts/_twistd_unix.py", line 109, in checkPID
    sys.exit(
builtins.SystemExit: Another twistd server is running, PID 290716

This could either be a previously started instance of your application or a
different application entirely. To start a new one, either run it in some other
directory, or use the --pidfile and --logfile parameters to avoid clashes.


twisted.test.test_twistd.UnixApplicationRunnerStartApplicationTests.test_applicationStartsWithConfiguredNameIDs
-------------------------------------------------------------------------------
Ran 88 tests in 0.167s

needs-review

@@ -1013,22 +903,132 @@ def test_setUidSameAsCurrentUid(self):
If the specified UID is the same as the current UID of the process,
then a warning is displayed.
"""

# FIXME:https://github.com/twisted/twisted/issues/10332
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like I reported this issue some time ago.

The thing is that at the start of the test there was no failure... so the extra failure was generated while running the test.

To help troubleshoot these tests, I have updated the code to assert the warnings length last, and show all warnings on error.

@adiroiban
Copy link
Member Author

With the code from 52a1023

If I run $ trial -u -j2 twisted.test.test_twistd.UnixApplicationRunnerStartApplicationTests it fails after the second run

and is fixed with this diff

diff --git a/src/twisted/test/test_twistd.py b/src/twisted/test/test_twistd.py
index e31ee7c720..46a230c63d 100644
--- a/src/twisted/test/test_twistd.py
+++ b/src/twisted/test/test_twistd.py
@@ -884,7 +884,9 @@ class UnixApplicationRunnerStartApplicationTests(TestCase):
         self.patch(os, "setgid", setgid)
 
         options = twistd.ServerOptions()
-        options.parseOptions(["--nodaemon", "--uid", str(wantedUid)])
+        options.parseOptions(
+            ["--nodaemon", "--uid", str(wantedUid), "--pidfile=setuid.pid"]
+        )

@adiroiban
Copy link
Member Author

If I run $ trial -u -j16 twisted.test.test_twistd I get a faiure quite fast.

but if I apply this diff it works

diff --git a/src/twisted/test/test_twistd.py b/src/twisted/test/test_twistd.py
index 46a230c63d..dfd0fbc65f 100644
--- a/src/twisted/test/test_twistd.py
+++ b/src/twisted/test/test_twistd.py
@@ -1733,7 +1733,7 @@ class UnixAppLoggerTests(TestCase):
 
         self.assertEqual(self.signals, [])
 
-    def test_getLogObserverDefaultFile(self):
+    def _getLogObserverDefaultFile(self):
         """
         When daemonized and C{logfile} is empty, the observer returned by
         L{UnixAppLogger._getLogObserver} points at C{twistd.log} in the current

so I guess the issue is with test_getLogObserverDefaultFile not cleaning the pid file

@adiroiban
Copy link
Member Author

The big issue was with 77f4e16

I still think that it is worth having all the tests that start an application in the same testcase, so that we don't start multiple application from the same process at the same time.

@exarkun
Copy link
Member

exarkun commented Oct 13, 2023

I still think that it is worth having all the tests that start an application in the same testcase, so that we don't start multiple application from the same process at the same time.

This doesn't make sense. It doesn't make any difference to disttrial whether tests are part of the same testcase or not. Any organization of tests leaves the same potential for concurrent execution.

@adiroiban
Copy link
Member Author

adiroiban commented Oct 13, 2023

You are right in terms of parallel execution.

Thanks for looking into this.

There was somethin here :)

You can try to run $ trial -u -j2 twisted.test.test_twistd on trunk and on this branch.

It fails in trunk and works on this branch.

It look like the issue is with tests starting the applications with twistd.pid file, but not stopping the application at the end of the test.

I have reverted the move.

I will leave a comment inline for the pid part.

If you have time, please take a look at the latest changes.

I hope they can be merged and have fewer flaky tests.

Thanks

@@ -994,7 +994,9 @@ def setgid(gid):
self.patch(os, "setgid", setgid)

options = twistd.ServerOptions()
options.parseOptions(["--nodaemon", "--uid", str(wantedUid)])
options.parseOptions(
["--nodaemon", "--uid", str(wantedUid), "--pidfile={}".format(pidFile)]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a workaround.

Without this $ trial -u -j2 twisted.test.test_twistd would fail with something like

Traceback (most recent call last):
  File "/home/adi/chevah/twisted/src/twisted/test/test_twistd.py", line 1012, in test_applicationStartsWithConfiguredNameIDs
    self._applicationStartsWithConfiguredID(
  File "/home/adi/chevah/twisted/src/twisted/test/test_twistd.py", line 981, in _applicationStartsWithConfiguredID
    runner.preApplication()
  File "/home/adi/chevah/twisted/src/twisted/scripts/_twistd_unix.py", line 205, in preApplication
    checkPID(self.config["pidfile"])
  File "/home/adi/chevah/twisted/src/twisted/scripts/_twistd_unix.py", line 109, in checkPID
    sys.exit(
builtins.SystemExit: Another twistd server is running, PID 290716

This could either be a previously started instance of your application or a
different application entirely. To start a new one, either run it in some other
directory, or use the --pidfile and --logfile parameters to avoid clashes.


twisted.test.test_twistd.UnixApplicationRunnerStartApplicationTests.test_applicationStartsWithConfiguredNameIDs
-------------------------------------------------------------------------------
Ran 88 tests in 0.167s

options.parseOptions(["--nodaemon", "--uid", str(wantedUid)])
options.parseOptions(
["--nodaemon", "--uid", str(wantedUid), "--pidfile={}".format(pidFile)]
)
application = service.Application("test_setupEnvironment")
self.runner = UnixApplicationRunner(options)
runner = UnixApplicationRunner(options)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that one problem is that the tests are starting the application here, but they don't stop the application at the end of the test.

Not sure what API should be used to stop an application

@@ -1739,7 +1737,8 @@ def test_getLogObserverDefaultFile(self):
"""
logFiles = _patchTextFileLogObserver(self.patch)
logger = UnixAppLogger({"logfile": "", "nodaemon": False})
logger._getLogObserver()
observer = logger._getLogObserver()
self.addCleanup(observer._outFile.close)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was the main issue.
The file was not closed at cleanup.

The observer API/implementation is not idea... the file is opened when you get the observer.... but then there is no API to close the observer.

@adiroiban
Copy link
Member Author

@glyph do you have time to look into this PR

This is a flaky tests for which I think we have a fix.

I would like to have it merged so that at least we have 1 fewer flaky test.

Thanks

Copy link
Member

@glyph glyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a ton for tracking this down. I think the suggestions inline are worth applying; let's not litter up the filesystem.

src/twisted/test/test_twistd.py Outdated Show resolved Hide resolved
src/twisted/test/test_twistd.py Outdated Show resolved Hide resolved
adiroiban and others added 2 commits October 31, 2023 18:28
Co-authored-by: Glyph <glyph@twistedmatrix.com>
@adiroiban
Copy link
Member Author

Thanks a ton for tracking this down. I think the suggestions inline are worth applying; let's not litter up the filesystem.

Thanks for the suggestions.
I was focused on getting the tests passed... so I did the bare minimum of changes.

I have applied your suggestions and I will merge.

Happy to see the flaky test fixed.

@adiroiban adiroiban merged commit ca3f79e into trunk Oct 31, 2023
23 checks passed
@adiroiban adiroiban deleted the 12018-twistd-flaky-tests branch October 31, 2023 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

twisted.test.test_twistd tests are not safe to be run in parallel
4 participants