
[tune] Cross Validation (simply parellization) patterns? #7744

Closed
cottrell opened this issue Mar 25, 2020 · 17 comments

Labels
question Just a question :) tune Tune-related issues

Comments

@cottrell

Something similar was asked before but this is different.

What pattern is imagined for run DAGs where one config instance leads to a number of (model, data_train, data_test) runs? You would typically collapse these into something like mean score minus std score.

I feel like I am fighting the framework, so I am probably not getting something right. The issue is that the Trainable class handles the serialization/persistence, but you really want to pass everything down to the remotes that get parallelized.

Or is there some pattern at the tune.run level that allows you to sample across the train/test pairs (not optimize over them), i.e. treat the data like config params?

#6560

@cottrell cottrell added the question Just a question :) label Mar 25, 2020
@bllchmbrs
Contributor

Hey David, great question. Would you be open to a quick call on this subject? It's something we've been discussing internally (the Ray / Anyscale team), and given your well-formulated question, it'd be good to understand your perspective to make sure we're thinking about it in the right way :).

Can you ping me at bill @ anyscale? If you're on the Ray Slack, I'm on there too.

@richardliaw
Contributor

Hey @cottrell, would https://ray.readthedocs.io/en/latest/tune-searchalg.html#repeated-evaluations work for you?

Documentation here: https://ray.readthedocs.io/en/latest/tune/api_docs/suggestion.html#ray.tune.suggest.Repeater

What you could do is have the Trainable execute something different depending on the trial_index.

from ray import tune
from ray.tune import Trainable
from ray.tune.suggest import Repeater
from ray.tune.suggest.hyperopt import HyperOptSearch
from ray.tune.suggest.repeater import TRIAL_INDEX

class TestMe(Trainable):
    def _setup(self, config):
        # Repeater injects the repeat index into the config when set_index=True.
        index = config[TRIAL_INDEX]
        self.data_train, self.data_test = create_from_index(index, config)

    def _train(self):
        ...

tune.run(TestMe, search_alg=Repeater(HyperOptSearch(search_space), repeat=5, set_index=True))

Does this make sense? Feel free to follow up with any questions (or any suggestions for how we can improve the docs).

@richardliaw richardliaw changed the title Cross Validation (simply parellization) patterns in tune? [tune] Cross Validation (simply parellization) patterns? Mar 25, 2020
@richardliaw richardliaw added the tune Tune-related issues label Mar 25, 2020
@cottrell
Author

@richardliaw I think a modified Repeater would handle the case I'm thinking of ... but I'm kind of reluctant to fit a framework around it. You could get really fancy, treat all individual runs as independent, and just update the stats scoring mechanism in the scheduler, I guess. Will try to jump on Slack.

@cottrell
Author

Can do a call to chat. @anabranch, will msg you @ anyscale ... I've requested to join Slack, but it might take a few days.

@richardliaw
Contributor

Sent an invite; happy to chat online.

@richardliaw
Contributor

I think we resolved this offline (feel free to reopen if not resolved.)

@antonwnk

Curious what the specific resolution of this was. @cottrell, have you settled on the Repeater solution Richard showed above, or have you found a different way around it?

@richardliaw
Contributor

Ah, I think the resolution was to just not use this type of computation pattern.

@richardliaw richardliaw reopened this Nov 3, 2020
@bbalaji-ucsd

I'm interested in this issue. Is there an example of doing cross-validation with Tune? I'm interested in doing cross-validation with the TensorFlow or Torch frameworks.

@FarzanT
Contributor

FarzanT commented Mar 2, 2021

I'm also interested! @richardliaw, is the Repeater class the best way of performing cross-validation with Ray Tune? Thanks!

@FarzanT
Contributor

FarzanT commented Mar 16, 2021

@richardliaw Hi, do you have any advice on using the Repeater for cross-validation? If not, do you have another technique in mind?
Thank you!

@richardliaw
Contributor

Hey @FarzanT, sorry for the slow reply! Repeater would allow you to run the same model/hyperparameters but use an index to select a different split.

In general I would suggest just doing the cross-validation manually within the training run. This is what we do in tune-sklearn: https://github.com/ray-project/tune-sklearn/blob/master/tune_sklearn/_trainable.py#L176
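For reference, a minimal sketch of what manual cross-validation inside a Tune training function could look like (the load_dataset helper, the Ridge model, and the metric names here are illustrative, not taken from tune-sklearn):

from ray import tune
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def train_cv(config):
    # Hypothetical data loader; replace with your own data pipeline.
    X, y = load_dataset()
    scores = []
    for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        model = Ridge(alpha=config["alpha"])
        model.fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[test_idx], y[test_idx]))
    # Report the aggregate over folds as this trial's metric.
    tune.report(mean_score=float(np.mean(scores)), std_score=float(np.std(scores)))

analysis = tune.run(
    train_cv,
    config={"alpha": tune.loguniform(1e-3, 1e1)},
    metric="mean_score",
    mode="max",
    num_samples=10)

Each trial then runs its k-fold loop serially; the parallelism comes from Tune running many trials at once.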

@stale

stale bot commented Jul 14, 2021

Hi, I'm a bot from the Ray team :)

To help human contributors focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity within 14 days, the issue will be closed!

  • If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
  • If you'd like to get more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public slack channel.

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Jul 14, 2021
@stale

stale bot commented Jul 28, 2021

Hi again! The issue will be closed because there has been no further activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

@stale stale bot closed this as completed Jul 28, 2021
@richardliaw richardliaw reopened this Jul 28, 2021
@stale stale bot removed the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Jul 28, 2021
@krfricke
Contributor

Here's a concept that might work:

https://discuss.ray.io/t/ray-tune-confidence-interval/2967

Let me pull the code from the example:

from ray import tune
import numpy as np


def train(config):
    # "repeat" selects the data subset / fold; "a" is the hyperparameter being tuned.
    print(config["repeat"], config["a"])
    tune.report(score=np.random.randn() * 0.1 + config["a"])


# constant_grid_search makes each sampled value of "a" be held constant
# across all grid values of "repeat", so the repeats share identical hyperparameters.
searcher = tune.suggest.basic_variant.BasicVariantGenerator(
    constant_grid_search=True
)

analysis = tune.run(
    train,
    config={
        "repeat": tune.grid_search(list(range(4))),
        "a": tune.uniform(0, 1)
    },
    search_alg=searcher,
    num_samples=2)

df = analysis.dataframe()

# Aggregate the repeats for each sampled value of "a".
print(df.groupby(["config/a"]).mean())

Basically, what you could do is use the repeat config parameter to select a different subset of the data to perform cross-validation on. By fetching the resulting dataframes, you can then calculate metrics like mean, std, etc.

This was enabled by the constant_grid_search parameter introduced in #16501.
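For illustration, the repeat value could be mapped to a concrete fold along these lines (this get_fold helper is hypothetical, not part of the example above):

from sklearn.model_selection import KFold

def get_fold(config, X, y, n_splits=4):
    # Use the "repeat" grid value to pick one of the k folds.
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    train_idx, test_idx = list(kf.split(X))[config["repeat"]]
    return (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])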

I'll close this issue for now, but if there are any more questions or suggestions around this topic, please feel free to re-open or create a new discuss thread / issue.

@mlguruz

mlguruz commented Mar 31, 2022

Hey @FarzanT, sorry for the slow reply! Repeater would allow you to run the same model/hyperparameters but use an index to select a different split.

In general I would suggest just doing the cross-validation manually within the training run. This is what we do in tune-sklearn: https://github.com/ray-project/tune-sklearn/blob/master/tune_sklearn/_trainable.py#L176

Hi, if we do it manually then there isn't any parallelization across CV folds, right?

Is there any way to run different folds in parallel within a Trial?

Or, I guess a more general question is, can Tune allow us to have a concept of Trial dependencies?

I'd appreciate it if you could share your thoughts on this!

@zcarrico-fn

zcarrico-fn commented Sep 28, 2023

@krfricke, thank you for the example! How would you recommend this be done if the Searcher is BayesOptSearch, which will raise an exception if tune.grid_search is passed? (Although the Repeater strategy will technically work with BayesOptSearch, it's not appropriate for nested cross-validation: the prior would be the average performance (because Repeater takes the average) and would include performance on hold-out data, because it would be incorporating results from neighboring outer splits. This is complex, so I'm happy to explain further if you wish 😄 )
