Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WB-8532] Quick fix for service hang #3280

Merged
merged 5 commits into from
Feb 23, 2022
Merged

Conversation

raubitsj
Copy link
Member

@raubitsj raubitsj commented Feb 22, 2022

WB-8532

Description

When an internal "process" dies for a run, lets force an exit of the entire service process for now so that the user process does not hang waiting on a message that will never get handled.

There are better ways to do this, but this is the simplest way to make sure we always die.

The issue was we were calling sys.exit(1) from a non mainthread in the service process which is a no no.

TODO:

  • add a test
  • fix exit code?

Testing

How was this PR tested?

Checklist

  • Name PR "[WB-NNNN][WB-MMMM] Add support for..." similar to entries in CHANGELOG.md
  • Include reference to internal ticket "Fixes WB-NNNN" (and github issue "Fixes #NNNN" if applicable)

@codecov
Copy link

codecov bot commented Feb 22, 2022

Codecov Report

Merging #3280 (8f9d04b) into master (4b59ad8) will increase coverage by 0.01%.
The diff coverage is 63.15%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3280      +/-   ##
==========================================
+ Coverage   79.95%   79.96%   +0.01%     
==========================================
  Files         212      212              
  Lines       27908    27919      +11     
==========================================
+ Hits        22314    22326      +12     
+ Misses       5594     5593       -1     
Flag Coverage Δ
functest 56.88% <63.15%> (+0.02%) ⬆️
unittest 69.32% <26.31%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
wandb/sdk/internal/internal.py 87.07% <0.00%> (-0.99%) ⬇️
wandb/sdk/wandb_run.py 88.75% <42.85%> (-0.04%) ⬇️
wandb/sdk/wandb_manager.py 94.26% <83.33%> (-0.66%) ⬇️
wandb/sdk/service/service.py 89.74% <100.00%> (+0.13%) ⬆️
wandb/sdk/wandb_setup.py 89.56% <100.00%> (ø)
wandb/sdk/lib/git.py 76.35% <0.00%> (ø)
wandb/sdk/internal/meta.py 90.74% <0.00%> (+3.08%) ⬆️

@raubitsj raubitsj changed the title quick fix for service hang [WB-8532] Quick fix for service hang Feb 22, 2022
@raubitsj raubitsj marked this pull request as draft February 22, 2022 23:41
@raubitsj raubitsj marked this pull request as ready for review February 23, 2022 00:54
@raubitsj raubitsj requested review from kptkin and vwrj February 23, 2022 01:00
@raubitsj raubitsj merged commit 8e04b8e into master Feb 23, 2022
@raubitsj raubitsj deleted the quick-fix-service-hang branch February 23, 2022 04:28
@raubitsj raubitsj added this to the sdk-2022-03.1 milestone Feb 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants