integrations: Add DVC Live integration to Ray Tune #237

MarkoMFilip · 2022-04-06T09:28:09Z

Similar to how Ray provides integrations for other loggers within Ray Tune, it would be good if DVC Live could have its own integration. Concretely, in its documentation for integrating MLflow with Ray Tune, Ray gives examples of two specific functions it created to help people run hyperparameter optimization with Tune while tracking the experiments with MLflow. If we want to use DVC's Experiments and Checkpoints with Ray Tune, it would be good to have a similar integration available.

grizzlybearg · 2023-02-09T16:29:03Z

Hi @daavoo @MarkoMFilip has there been any progress with this??

daavoo · 2023-02-09T17:06:25Z

> Hi @daavoo @MarkoMFilip has there been any progress with this??

Hi @grizzlybearg , there has not been direct progress but since the issue was opened we have added some features (mainly https://github.com/iterative/dvclive/releases/tag/1.1.0) that should allow implementing something similar to the integrations defined in https://docs.ray.io/en/latest/tune/examples/tune-mlflow.html .

I might try to set up a draft PR tomorrow, since I have checked the code for the MLflowLoggerCallback and it looks simple enough.
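For readers wondering what a logger along the lines of MLflowLoggerCallback might look like, here is a minimal sketch, not the actual integration. The class name `DVCLiveLoggerSketch` and the `live_factory` parameter are hypothetical. The hook names (`log_trial_start`, `log_trial_result`, `log_trial_end`) mirror those of `ray.tune.logger.LoggerCallback`, and the object returned by `live_factory` is assumed to behave like `dvclive.Live` (`log_params`, `log_metric`, `next_step`, `end`). It also assumes each trial exposes `trial_id` and a `config` of plain values.

```python
from numbers import Number


class DVCLiveLoggerSketch:
    """Hypothetical per-trial logger; not an existing Ray or DVCLive class.

    In real use this would presumably subclass ray.tune.logger.LoggerCallback
    and receive dvclive.Live as live_factory (both are assumptions here).
    """

    def __init__(self, live_factory, base_dir="dvclive"):
        self._live_factory = live_factory  # called as live_factory(dir=...)
        self._base_dir = base_dir
        self._lives = {}  # trial_id -> Live-like object

    def log_trial_start(self, trial):
        # One output directory per trial, so trials do not overwrite each other.
        live = self._live_factory(dir=f"{self._base_dir}/{trial.trial_id}")
        live.log_params(trial.config)
        self._lives[trial.trial_id] = live

    def log_trial_result(self, iteration, trial, result):
        live = self._lives[trial.trial_id]
        for name, value in result.items():
            # Log numeric values only; skip booleans and strings.
            if isinstance(value, Number) and not isinstance(value, bool):
                live.log_metric(name, value)
        live.next_step()

    def log_trial_end(self, trial, failed=False):
        live = self._lives.pop(trial.trial_id, None)
        if live is not None:
            live.end()
```

Taking a factory instead of importing `dvclive` directly keeps the sketch testable without Ray or DVCLive installed. It does not address how the per-trial outputs are tied back to DVC experiments.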

grizzlybearg · 2023-02-15T19:13:05Z

Thanks @daavoo

bastienboutonnet · 2023-06-30T12:28:59Z

@daavoo I was trying to see if there had been a PR for this, as I'd really love to be able to use DVC Live with Ray Tune. I'm not so keen on the other ML monitoring platforms out there. I couldn't find anything related here. Could it be that I'm looking in the wrong place?

Maybe with some guidelines, I'd love to help out if that idea has not been further implemented.

daavoo · 2023-06-30T13:11:38Z

Hi @bastienboutonnet, are you using Ray Tune alongside an existing ML framework (e.g. Keras, PyTorch Lightning)?

bastienboutonnet · 2023-06-30T18:23:23Z

@daavoo We are currently using huggingface transformer trainers

daavoo · 2023-06-30T19:12:07Z

> @daavoo We are currently using huggingface transformer trainers

Thanks! Tried to set up a quick example following https://huggingface.co/blog/ray-tune and passing:

from dvclive.huggingface import DVCLiveCallback

trainer.add_callback(DVCLiveCallback(save_dvc_exp=True))

But I think I actually need to look into it in more detail 😓 It appears that there is a bug with Ray trying to deserialize the internal DVC Repo instance used by DVCLive

dberenbaum · 2023-08-24T17:52:02Z

@bastienboutonnet @grizzlybearg @MarkoMFilip or others watching this issue, do you already use DVC and Ray Tune? Do you use them together at all, and if so, how?

Since Ray will often be running on a distributed cluster, the typical DVCLive workflow of writing metrics and plots to local files and using Git to sync them won't work (even locally, since each trial writes to its own run folder, it violates the assumptions of DVC). A couple options would be to:

1. Launch Ray from within DVC and sync back each trial's results. Sync the metrics and plots data to a central store (like cloud storage or DVC Studio), keeping track of the experiment associated with those metrics so they can be synced back to the Git/DVC repo.
2. Launch DVC from within Ray inside each remote trial. Each trial clones the repo and pulls data, then runs the trial, commits the result, and pushes back to DVC and Git storage.
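The first option can be sketched roughly as follows. This is only an illustration under stated assumptions: a local directory stands in for the central store, each trial folder is assumed to contain a `metrics.json` file, and the function name `sync_trial_results`, the `experiment` label, and the `manifest.json` file are all hypothetical.

```python
import json
import shutil
from pathlib import Path


def sync_trial_results(trial_dirs, central_store, experiment):
    """Copy each trial's metrics.json to central_store/<experiment>/<trial>/.

    Returns a manifest mapping trial folder name -> destination path, which is
    also written to manifest.json so results can later be matched to the
    experiment that produced them.
    """
    central = Path(central_store) / experiment
    central.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for trial_dir in map(Path, trial_dirs):
        src = trial_dir / "metrics.json"
        if not src.is_file():
            continue  # this trial has not written metrics yet
        dest = central / trial_dir.name / "metrics.json"
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        manifest[trial_dir.name] = str(dest)
    manifest_path = central / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest
```

With real cloud storage the copy step would be replaced by an upload, but the idea of keying every trial's output by its experiment stays the same.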

Related discussions: #676, #638

cc @aguschin

daavoo added A: frameworks Area: ML Framework integration feature request labels Apr 6, 2022

dberenbaum added the p3-nice-to-have label Mar 6, 2023

dberenbaum added p2-medium and removed p3-nice-to-have labels May 1, 2023

dberenbaum added p1-important Include in the next sprint and removed p2-medium labels Jun 30, 2023

daavoo self-assigned this Jun 30, 2023

dberenbaum assigned dberenbaum and unassigned daavoo Aug 16, 2023

dberenbaum mentioned this issue Aug 24, 2023

Support S3 URI as Live.dir to store DVCLive data in cloud storage #676

Open

dberenbaum mentioned this issue Sep 8, 2023

Allow for null baseline_sha iterative/dvc-studio-client#72

Closed

dberenbaum removed the p1-important Include in the next sprint label Mar 5, 2024

dberenbaum added the p2-medium label Apr 24, 2024

0x2b3bfa0 unassigned dberenbaum Sep 14, 2024
