Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-109917: Fix test instability in test_concurrent_futures #110306

Merged
merged 1 commit into from
Oct 3, 2023

Conversation

elfstrom
Copy link
Contributor

@elfstrom elfstrom commented Oct 3, 2023

The test had an instability issue due to the ordering of the dummy queue operation and the real wakeup pipe operations. Both primitives are thread safe but not done atomically as a single update and may interleave arbitrarily. With the old order of operations this can lead to an incorrect state where the dummy queue is full but the wakeup pipe is empty. By swapping the order in clear() I think this can no longer happen in any possible operation interleaving (famous last words).

This is a minimal update to fix the issue. Long term it might be better to rewrite the test along the lines of #110129 where we make it possible to override the wakeup message and cause the blocking behaviour that way instead of the small dummy queue used now. Less mocking is usually better and the test can be faster as well. It does require a minor modification to the implementation and there were other issues in #110129 but the general test idea seems better than the current solution.

The test had an instability issue due to the ordering of the dummy
queue operation and the real wakeup pipe operations. Both primitives
are thread safe but not done atomically as a single update and may
interleave arbitrarily. With the old order of operations this can lead
to an incorrect state where the dummy queue is full but the wakeup
pipe is empty. By swapping the order in clear() I think this can no
longer happen in any possible operation interleaving (famous last
words).
@bedevere-app bedevere-app bot added tests Tests in the Lib/test dir awaiting review labels Oct 3, 2023
@cjw296 cjw296 added skip issue skip news 3.11 only security fixes 3.12 bugs and security fixes 3.13 bugs and security fixes labels Oct 3, 2023
@cjw296 cjw296 merged commit a376a72 into python:main Oct 3, 2023
28 checks passed
@cjw296 cjw296 added needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes labels Oct 3, 2023
@miss-islington
Copy link
Contributor

Thanks @elfstrom for the PR, and @cjw296 for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @elfstrom for the PR, and @cjw296 for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 3, 2023
…honGH-110306)

The test had an instability issue due to the ordering of the dummy
queue operation and the real wakeup pipe operations. Both primitives
are thread safe but not done atomically as a single update and may
interleave arbitrarily. With the old order of operations this can lead
to an incorrect state where the dummy queue is full but the wakeup
pipe is empty. By swapping the order in clear() I think this can no
longer happen in any possible operation interleaving (famous last
words).
(cherry picked from commit a376a72)

Co-authored-by: elfstrom <elfstrom@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 3, 2023
…honGH-110306)

The test had an instability issue due to the ordering of the dummy
queue operation and the real wakeup pipe operations. Both primitives
are thread safe but not done atomically as a single update and may
interleave arbitrarily. With the old order of operations this can lead
to an incorrect state where the dummy queue is full but the wakeup
pipe is empty. By swapping the order in clear() I think this can no
longer happen in any possible operation interleaving (famous last
words).
(cherry picked from commit a376a72)

Co-authored-by: elfstrom <elfstrom@users.noreply.github.com>
@bedevere-app
Copy link

bedevere-app bot commented Oct 3, 2023

GH-110315 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Oct 3, 2023
@bedevere-app
Copy link

bedevere-app bot commented Oct 3, 2023

GH-110316 is a backport of this pull request to the 3.11 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.11 only security fixes label Oct 3, 2023
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 FreeBSD 3.x has failed when building commit a376a72.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/1223/builds/326) and take a look at the build logs.
  4. Check if the failure is related to this commit (a376a72) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/1223/builds/326

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "<frozen importlib._bootstrap>", line 1354, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1316, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1256, in _find_spec
  File "<frozen importlib._bootstrap_external>", line 1538, in find_spec
  File "<frozen importlib._bootstrap_external>", line 1512, in _get_spec
  File "<frozen importlib._bootstrap_external>", line 1630, in find_spec
  File "<frozen importlib._bootstrap_external>", line 161, in _path_isfile
  File "<frozen importlib._bootstrap_external>", line 153, in _path_is_mode_type
  File "<frozen importlib._bootstrap_external>", line 147, in _path_stat
ValueError: embedded null byte


Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-freebsd/build/Lib/threading.py", line 1067, in _bootstrap_inner
    self.run()
  File "/buildbot/buildarea/3.x.ware-freebsd/build/Lib/threading.py", line 1004, in run
    self._target(*self._args, **self._kwargs)
  File "/buildbot/buildarea/3.x.ware-freebsd/build/Lib/test/test_interpreters.py", line 584, in task
    interp = interpreters.create()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/buildbot/buildarea/3.x.ware-freebsd/build/Lib/test/support/interpreters.py", line 26, in create
    id = _interpreters.create(isolated=isolated)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: interpreter creation failed
k

cjw296 pushed a commit that referenced this pull request Oct 3, 2023
…-110306) (#110316)

gh-109917: Fix test instability in test_concurrent_futures (GH-110306)

The test had an instability issue due to the ordering of the dummy
queue operation and the real wakeup pipe operations. Both primitives
are thread safe but not done atomically as a single update and may
interleave arbitrarily. With the old order of operations this can lead
to an incorrect state where the dummy queue is full but the wakeup
pipe is empty. By swapping the order in clear() I think this can no
longer happen in any possible operation interleaving (famous last
words).
(cherry picked from commit a376a72)

Co-authored-by: elfstrom <elfstrom@users.noreply.github.com>
cjw296 pushed a commit that referenced this pull request Oct 3, 2023
…-110306) (#110315)

gh-109917: Fix test instability in test_concurrent_futures (GH-110306)

The test had an instability issue due to the ordering of the dummy
queue operation and the real wakeup pipe operations. Both primitives
are thread safe but not done atomically as a single update and may
interleave arbitrarily. With the old order of operations this can lead
to an incorrect state where the dummy queue is full but the wakeup
pipe is empty. By swapping the order in clear() I think this can no
longer happen in any possible operation interleaving (famous last
words).
(cherry picked from commit a376a72)

Co-authored-by: elfstrom <elfstrom@users.noreply.github.com>
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot s390x Fedora 3.x has failed when building commit a376a72.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/223/builds/4604) and take a look at the build logs.
  4. Check if the failure is related to this commit (a376a72) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/223/builds/4604

Failed tests:

  • test_subprocess

Failed subtests:

  • test_pipesize_default - test.test_subprocess.ProcessTestCaseNoPoll.test_pipesize_default

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-fedora-z/build/Lib/test/test_subprocess.py", line 763, in test_pipesize_default
    self.assertEqual(
AssertionError: 65536 != 8192

@vstinner
Copy link
Member

vstinner commented Oct 4, 2023

Thanks for the fix :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.11 only security fixes 3.12 bugs and security fixes 3.13 bugs and security fixes skip issue skip news tests Tests in the Lib/test dir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants