Right after the first example using CoolSystem, the readme starts using test_tube to introduce tensorboard logging. After looking at the setup.py file, it becomes clear that test_tube is a hard dependency.
I've been thinking about bringing some of the test-tube stuff into Lightning, but I kind of like the idea of keeping it separate so other libraries can use it. Any thoughts?
IMO the Slurm stuff from test-tube belongs in lightning (or a third entirely separate library). The other functionality would be good to keep separate. I want to take a stab at #47 sometime soon. From that perspective, we'd rather not have any functionality in Lightning that overlaps with MLFlow and friends.
There look to be a bunch of nice features in test-tube that pytorch-lightning takes advantage of. If some of the Slurm stuff were moved here, any bug fixed in test-tube would need to be copied here as well.
Another option is to vendor test-tube into the pytorch-lightning library. In the examples, it can look something like:
```python
from pytorch_lightning.test_tube import Experiment
from pytorch_lightning.test_tube.hpr import SlurmCluster
```
As for supporting other logging libraries, this may need to be directly integrated into test_tube, as it can manage logging in a cluster. As you said in #47:
> Each call to log needs to be process-safe. Meaning when using distributed only rank=0 will log.
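To make the "only rank=0 will log" idea concrete, here is a minimal sketch of one way to guard a log call. It is not how test_tube or Lightning does it. It assumes the process rank is exposed through a `RANK` environment variable (unset means rank 0), and `get_rank`, `rank_zero_only`, and `ListLogger` are hypothetical names made up for illustration.

```python
import functools
import os


def get_rank() -> int:
    """Return this process's rank; assumes a RANK env var, defaulting to 0."""
    return int(os.environ.get("RANK", "0"))


def rank_zero_only(fn):
    """Run fn only on rank 0; on any other rank, skip the call and return None."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if get_rank() == 0:
            return fn(*args, **kwargs)
        return None
    return wrapper


class ListLogger:
    """Hypothetical stand-in for an experiment logger; it only collects records."""

    def __init__(self):
        self.records = []

    @rank_zero_only
    def log(self, metrics: dict) -> None:
        # Copy the dict so later mutation by the caller does not alter the record.
        self.records.append(dict(metrics))
```

With a guard like this, any backend (tensorboard, MLFlow, or something else) could sit behind `log` without each one having to reimplement the rank check.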
To reduce cognitive load on readers, it may be good to link to https://github.com/williamFalcon/test-tube right before the code that imports test_tube.