Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(sdk): improve file descriptor management #4617

Merged
merged 5 commits into from
Dec 20, 2022
Merged

perf(sdk): improve file descriptor management #4617

merged 5 commits into from
Dec 20, 2022

Conversation

dmitryduev
Copy link
Member

@dmitryduev dmitryduev commented Dec 13, 2022

Fixes WB-11427 (probably not fully, but this should be a good starting point)
Fixes #2825

Description

Reduce the default number of open file descriptors:

  • default to lazily loading optional external modules:
    • make wandb.util.get_module lazily load modules by default
    • wandb.old.summary was loading h5py, which many users will never need. It was accounting for ~40% of open file descriptors when you simply say import wandb.

With the above two improvements, for a simple script junk.py like this:

import time

import wandb

run = wandb.init()
time.sleep(5)
run.finish()
p=`ps -ef | grep "junk" | awk 'NR==1 {print $2}'`
lsof -p $p | wc -l 

goes down from ~115 to ~70.

  • Make sure to close the internal log file when finishing the internal process

Currently, for a script like this:

import wandb

for i in range(20):
    print(i)
    with wandb.init():
        pass

All the debug-internal.log files stay open until the service process exits -- fixed this.

Testing

How was this PR tested?

Checklist

  • Include reference to internal ticket "Fixes WB-NNNN" and/or GitHub issue "Fixes #NNNN" (if applicable)
  • Ensure PR title compliance with the conventional commits standards

@codecov
Copy link

codecov bot commented Dec 13, 2022

Codecov Report

Merging #4617 (669ba01) into main (39d50c3) will increase coverage by 0.05%.
The diff coverage is 96.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4617      +/-   ##
==========================================
+ Coverage   83.19%   83.24%   +0.05%     
==========================================
  Files         279      279              
  Lines       34903    34925      +22     
==========================================
+ Hits        29038    29075      +37     
+ Misses       5865     5850      -15     
Flag Coverage Δ
functest 55.98% <92.00%> (-0.10%) ⬇️
unittest 73.82% <96.00%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
wandb/__init__.py 91.34% <ø> (ø)
wandb/data_types.py 82.86% <ø> (ø)
wandb/util.py 87.80% <94.44%> (+0.50%) ⬆️
wandb/apis/public.py 81.87% <100.00%> (ø)
wandb/sdk/internal/internal.py 87.22% <100.00%> (+0.44%) ⬆️
wandb/sdk/lib/git.py 77.71% <0.00%> (ø)
wandb/integration/tensorboard/monkeypatch.py 91.35% <0.00%> (ø)
wandb/sdk/wandb_run.py 90.89% <0.00%> (+0.23%) ⬆️
wandb/sdk/wandb_setup.py 88.94% <0.00%> (+0.50%) ⬆️
... and 2 more

@dmitryduev dmitryduev changed the title chore(sdk): default to lazily loading optional external modules chore(sdk): improve file descriptor management Dec 13, 2022
@dmitryduev dmitryduev requested a review from a team December 13, 2022 09:07
@dmitryduev dmitryduev marked this pull request as ready for review December 13, 2022 09:07
@dmitryduev dmitryduev added this to the sdk-2023-01.1 milestone Dec 13, 2022
def get_module(
name: str,
required: Optional[Union[str, bool]] = None,
lazy: bool = True,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be True by default?

@dmitryduev dmitryduev merged commit f5ce4c1 into main Dec 20, 2022
@dmitryduev dmitryduev deleted the WB-11427 branch December 20, 2022 18:49
@dmitryduev dmitryduev changed the title chore(sdk): improve file descriptor management perf(sdk): improve file descriptor management Jan 6, 2023
@github-actions github-actions bot added cc-perf and removed cc-chore labels Jan 6, 2023
@github-actions github-actions bot added cc-perf and removed cc-perf labels Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OSError Too many open files: /tmp/tmphv67gzd0wandb-media
2 participants