Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix runner hanging, fix an error in thread handling, and serialize concurrent generator-runner invocations #7201

Conversation

christian-monch
Copy link
Contributor

This PR fixes a number of issues.

The PR extends the number of concurrent runner-call scenarios beyond the current implementation, i.e. serialize synchronous (non-generator) ThreadedRunner.run()-calls. The scenarios now supported are:

  1. Concurrently invoke ThreadedRunner.run() without generator, leading to serialization of invocations.
  2. Concurrently invoke ThreadedRunner.run() with generator, leading to serialisation of invocations, i.e. later invocations return if the currently active generator is exhausted
  3. Concurrently read from a generator returned by ThreadedRunner.run(). Any number of threads can call send() on a generator, results will be returned as soon as they are available. Which of the waiting threads is continued is non-deterministic. If the generator is exhausted, all threads will receive a StopIteration-exception.

The serialization should fix spurious test errors. Fixes #7138

In addition the PR fixes an error that could lead to hanging threads if close() or del is called on a BatchedCommand-instance. This should fix a number of hanging errors in tests that use the @serve_path_via_http() decorator. Fixes #6804

This commit adds code to record the thread that currently
"owns" or "uses" a ThreadedRunner-instance. This allows to
improve the message of the re-entered-RuntimeError.

The commit robustifies the generator by ensuring that the
exhausted state is actually trivial, i.e. calling send() on
an exhausted generator will only use minimal information
that is internal to the generator.

In addition it tests for None-message in the first send()-call.
This moves the import of datalad.cfg from
datalad.cmd.BatchedCommand._get_abandon to the
global scope of datalad.cmd. That prevents an
error when BatchedCommand instances are deleted
during shutdown of python.
This commit adds code to report a situation in which
the process does not exist anymore, but
a_threaded_runner.process.poll() does return None
instead of the return code

That might be due to an error in the handling of
processes in BatchedCommand.

THIS IS A TEMPORARY COMMIT IN ORDER TO
WORK OUT THE PROBLEMS
This commit removes the storing of ThreadedRunner
instances in WitlessRunner. The stored runner is
not used and it is not useful since multiple threads
may enter WitlessRunner.run() and overwrite the
stored runner in unpredictable ways

The commit also improves the description of the
`cwd`-parameters and their behavior.

The commit fixes a problem with path reporting in
CommandErrors which could appear if a non None
`cwd` was provided to `WitlessRunner.run()`. I.e.
the default `cwd` set in the constructor would
have been reported instead of the actual `cwd`,
provided when `WitlessRunner.run()` was called.

The tests are extended to check for multi-thread
support of `WitlessRunner.run()` in the
non-generator mode.
This commit blocks a thread that is starting a
generator based ThreadedRunner.run, if there
is already an active generator. The thread is
unblocked, when the active generator is
exhausted.
Use python script instead of `cat` to make
make tests windows compatible
This commit fixes an error where a process exit was
falsly assumed when poll() returned None. That gets
rid of the "lost process"-state.

Also adapt the tests to the new synchronization
in generator runners, i.e. block a new run on an
active generator runner until the existing generator
is exhausted.
@christian-monch christian-monch added the semver-internal Changes only affect the internal API label Nov 23, 2022
changelog.d/pr-7201.md Outdated Show resolved Hide resolved
changelog.d/pr-7201.md Outdated Show resolved Hide resolved
Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the description -- magnificent. Thank you ! Left some minor comments

datalad/runner/exception.py Outdated Show resolved Hide resolved
datalad/cmd.py Show resolved Hide resolved
datalad/runner/nonasyncrunner.py Outdated Show resolved Hide resolved
datalad/runner/tests/test_nonasyncrunner.py Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Nov 23, 2022

Codecov Report

Base: 88.89% // Head: 90.92% // Increases project coverage by +2.03% 🎉

Coverage data is based on head (c17d197) compared to base (653438e).
Patch coverage: 93.92% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##            maint    #7201      +/-   ##
==========================================
+ Coverage   88.89%   90.92%   +2.03%     
==========================================
  Files         355      355              
  Lines       46532    46671     +139     
  Branches     6332     6348      +16     
==========================================
+ Hits        41363    42437    +1074     
+ Misses       5154     4219     -935     
  Partials       15       15              
Impacted Files Coverage Δ
datalad/runner/tests/test_threadsafety.py 96.77% <ø> (-3.23%) ⬇️
datalad/cmd.py 93.75% <80.00%> (+0.04%) ⬆️
datalad/runner/nonasyncrunner.py 98.12% <88.88%> (-1.53%) ⬇️
datalad/runner/tests/test_nonasyncrunner.py 98.97% <95.45%> (-1.03%) ⬇️
datalad/runner/runner.py 100.00% <100.00%> (ø)
datalad/runner/tests/test_witless_runner.py 96.66% <100.00%> (+0.37%) ⬆️
datalad/tests/test_cmd.py 100.00% <100.00%> (ø)
datalad/_version.py 45.68% <0.00%> (ø)
datalad/__init__.py 98.00% <0.00%> (+16.00%) ⬆️
... and 2 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

christian-monch and others added 8 commits November 23, 2022 22:23
Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
Co-authored-by: Yaroslav Halchenko <debian@onerussian.com>
This commit adds a comment to explain why `ImportError`
is caught in SafeDelCloseMixIn.__del__()
This commit filters warnings about unraisable exceptions
in `__del__()`, which lead to test errors in some
python 3.7 versions.
Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just left one typo fix suggestion. Glad to see new tests! Thank you!

@christian-monch
Copy link
Contributor Author

Just left one typo fix suggestion. Glad to see new tests! Thank you!

NP. You are welcome :-)

@yarikoptic-gitmate
Copy link
Collaborator

PR released in 0.17.10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-internal Changes only affect the internal API
Projects
None yet
3 participants