
Conversation

@nmheim
Contributor

@nmheim nmheim commented Aug 28, 2025

Relevant issue or PR

#356

Description of changes

  • Add a threading.Lock to tesseract_core.runtime.logs.LogPipe
  • Add a threading.Lock to tesseract_core.sdk.logs.LogPipe
  • Make the apply endpoint as picklable as possible
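The locking change described above can be sketched as follows. This is an illustrative stand-in, not the actual `tesseract_core` implementation: the class name matches the source, but the internals (a buffer and a `write` method) are assumptions for the sake of the example.

```python
import threading


class LogPipe:
    """Minimal sketch of a pipe-like log sink guarded by a threading.Lock.

    Illustrative only; the real tesseract_core LogPipe has different internals.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._buffer = []

    def write(self, message: str) -> None:
        # Serialize writes so concurrent threads cannot interleave or drop lines
        with self._lock:
            self._buffer.append(message)

    def flush(self) -> None:
        with self._lock:
            self._buffer.clear()


# Several threads writing concurrently; the lock keeps each write atomic.
pipe = LogPipe()
threads = [
    threading.Thread(target=lambda: [pipe.write("x") for _ in range(100)])
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The same idea applies to both the runtime and SDK variants of the class: any state mutated from multiple threads is touched only while holding the lock.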

Testing done

  • Add new reproducer example to tests (and rename the example?)

@codecov

codecov bot commented Aug 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.66%. Comparing base (a6ad367) to head (16e507c).

❗ There is a different number of reports uploaded between BASE (a6ad367) and HEAD (16e507c).

HEAD has 3 fewer uploads than BASE:

| Flag | BASE (a6ad367) | HEAD (16e507c) |
|------|----------------|----------------|
|      | 29             | 26             |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #357      +/-   ##
==========================================
- Coverage   76.35%   70.66%   -5.69%     
==========================================
  Files          29       29              
  Lines        3345     3348       +3     
  Branches      525      525              
==========================================
- Hits         2554     2366     -188     
- Misses        558      761     +203     
+ Partials      233      221      -12     

☔ View full report in Codecov by Sentry.


@nmheim nmheim requested a review from apaleyes September 1, 2025 11:43

docker system prune --force

tesseract run reproducer apply '{"inputs":{}}' --output-path outputs
Contributor

@johnbcoughlin johnbcoughlin Sep 2, 2025


if we're going to commit this example let's name it something more specific, like process-pool-executor-reproducer

@josiahbjorgaard josiahbjorgaard self-requested a review September 29, 2025 14:45
@josiahbjorgaard

josiahbjorgaard commented Sep 30, 2025

I dug into how we are doing this log redirect and the calling thread executes here - the LogPipe thread is executed within an ExitStack(), which I now know means that it calls each item as if it were a context manager.

This is why LogPipe does something unusual: it starts and joins the thread inside its __enter__ and __exit__ overrides. That is necessary because ExitStack() only invokes the context manager methods of LogPipe, so the thread must be started and joined there. I find this structure overly complicated, because I don't see the need to run LogPipe in a thread at all: we necessarily block the calling thread until LogPipe's run has finished executing, which is why we need .join() in __exit__. I also don't see the need for it to be a context manager; we could instead do this in the calling function:

logpipe = LogPipe()
logpipe.start()
logpipe.join()

but still, why a thread if we block on join without doing other work?

Regardless, without restructuring the LogPipe class, I think the solution here is fine and I'm able to test it successfully. I recommend adding a warning on join timeout, because at the timeout we are effectively telling the LogPipe thread to stop running its run method.
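The pattern described above can be sketched as follows. `LogPipeSketch` is a hypothetical stand-in for the real LogPipe (its run body just records that it ran); what it shows is how `contextlib.ExitStack` drives the thread purely through the context manager protocol, which is why start and join live in `__enter__`/`__exit__`:

```python
import contextlib
import threading


class LogPipeSketch(threading.Thread):
    """Illustrative stand-in for LogPipe: the thread is started in
    __enter__ and joined in __exit__, so ExitStack can manage it
    through the context manager protocol alone."""

    def __init__(self):
        super().__init__()
        self.captured = []

    def run(self):
        # The real LogPipe drains a pipe here; we just record that we ran.
        self.captured.append("drained")

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Block the calling thread until run() finishes, with a timeout
        # so a wedged logging thread cannot hang the caller indefinitely.
        self.join(timeout=5)
        return False


with contextlib.ExitStack() as stack:
    pipe = stack.enter_context(LogPipeSketch())
# When the stack unwinds, __exit__ has joined the thread.
```

This also makes the objection concrete: since `__exit__` blocks on `join`, the caller gains no concurrency from the thread unless it does other work between `__enter__` and `__exit__`.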


@josiahbjorgaard josiahbjorgaard left a comment


LGTM, suggest a warning and a question.

Curious whether you think refactoring LogPipe to not be threaded would give cleaner code with the same functionality?

# Use a timeout so something weird happening in the logging thread doesn't
# cause this to hang indefinitely
#
# FIXME: this always times out in the multiprocessing case?


I tested this with timeout=1, timeout=10, and no timeout, and the reproducer showed no error; but the right timeout is going to vary with how long run() takes to execute.

Should we add a warning here so that we can revise the timeout later if we frequently find an issue with the timeout, or set it to something even longer?

if self.is_alive():
    warnings.warn("LogPipe thread timed out while executing.")
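A self-contained sketch of the suggested guard. The `join_with_warning` helper is hypothetical (not part of the codebase); it shows the mechanics of joining with a timeout and warning via `warnings.warn` when the thread is still alive afterwards:

```python
import threading
import time
import warnings


def join_with_warning(thread: threading.Thread, timeout: float) -> None:
    """Join with a timeout; warn if the thread is still running afterwards.

    Hypothetical helper illustrating the suggested guard above.
    """
    thread.join(timeout=timeout)
    if thread.is_alive():
        warnings.warn("LogPipe thread timed out while executing.")


# Simulate a slow logging thread that outlives the join timeout.
slow = threading.Thread(target=time.sleep, args=(2,), daemon=True)
slow.start()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    join_with_warning(slow, timeout=0.1)
```

Note that `Thread.join(timeout=...)` does not stop the thread; it only stops waiting for it, which is exactly why the warning is useful for revising the timeout later.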


Do we need the FIXME? It works for me.

@PasteurBot
Contributor

CLA signatures required

Thank you for your PR, we really appreciate it! Like many open-source projects, we ask that all contributors sign our Contributor License Agreement before we can accept your contribution. This only needs to be done once per contributor. You can do so by commenting the following on this pull request:


@PasteurBot I have read the CLA Document and I hereby sign the CLA


1 out of 2 committers have signed the CLA.
✅ [nmheim](https://github.com/nmheim)
@josiahbjorgaard
You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

@dionhaefner
Contributor

superseded by #392

@pasteurlabs pasteurlabs locked and limited conversation to collaborators Nov 12, 2025