Modify the Trainer class to handle simultaneous execution of Ray Tune and Weights & Biases #10823

Merged
merged 2 commits into huggingface:master on Mar 22, 2021

Conversation

ruanchaves
Contributor

What does this PR do?

The proper way to integrate Ray Tune and Weights & Biases is to pass a wandb parameter to tune.run.

However, this parameter is handled as a dictionary inside the config argument, and there is no distinction between the wandb settings and the standard search-space parameters. The following code comes from the Ray Tune docs:

from ray import tune
from ray.tune.logger import DEFAULT_LOGGERS
from ray.tune.integration.wandb import WandbLogger
tune.run(
    train_fn,
    config={
        # define search space here
        "parameter_1": tune.choice([1, 2, 3]),
        "parameter_2": tune.choice([4, 5, 6]),
        # wandb configuration
        "wandb": {
            "project": "Optimization_Project",
            "api_key_file": "/path/to/file",
            "log_config": True
        }
    },
    loggers=DEFAULT_LOGGERS + (WandbLogger, ))

This is not a problem for Ray Tune itself. However, it is a problem for the transformers integration, because the Trainer treats wandb as just another model parameter, and configuring wandb in this way therefore raises an error claiming that wandb is not a training argument.

The following code will raise such an error:

    # Initialize our Trainer
    trainer = Trainer(
        model_init=model_init,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset if training_args.do_eval else None,
        compute_metrics=compute_metrics,
        tokenizer=tokenizer,
        data_collator=data_collator,
    )

    # Hyperparameter Search

    def hp_space_fn(empty_arg):
        config = {
            "warmup_steps": tune.choice([50, 100, 500, 1000]),
            "learning_rate": tune.choice([1.5e-5, 2e-5, 3e-5, 4e-5]),
            "num_train_epochs": tune.quniform(0.0, 10.0, 0.5),
        }
        wandb_config = {
            "wandb": {
                "project": os.environ.get('WANDB_PROJECT', 'wandb_project'),
                "api_key": os.environ.get('API_KEY'),
                "log_config": True,
            }
        }
        config.update(wandb_config)
        return config

    best_run = trainer.hyperparameter_search(
        direction="maximize",
        backend="ray",
        scheduler=PopulationBasedTraining(
            time_attr='time_total_s',
            metric='eval_f1_thr_0',
            mode='max',
            perturbation_interval=600.0,
        ),
        hp_space=hp_space_fn,
        loggers=DEFAULT_LOGGERS + (WandbLogger, ),
    )
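
For context, the error comes from the way _hp_search_setup copies the sampled trial onto TrainingArguments: every key returned by hp_space_fn is treated as a training argument, so an unknown key such as wandb is rejected. The snippet below is a paraphrased, self-contained sketch of that check (hypothetical names; the exact code and error message in transformers differ):

# Paraphrased sketch of the check in Trainer._hp_search_setup that rejects
# unknown keys; names and wording are illustrative, not the transformers source.
class DummyTrainingArguments:
    learning_rate = 3e-5
    warmup_steps = 0

def apply_trial_to_args(args, params):
    for key, value in params.items():
        if not hasattr(args, key):
            # A "wandb" key ends up here, because TrainingArguments has no
            # field with that name.
            raise AttributeError(
                f"Trying to set {key} in the hyperparameter search "
                "but there is no corresponding field in TrainingArguments."
            )
        setattr(args, key, value)

apply_trial_to_args(
    DummyTrainingArguments(),
    {"learning_rate": 2e-5, "wandb": {"project": "Optimization_Project"}},
)  # raises AttributeError for the "wandb" key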

One way to work around this is to define a subclass of the Trainer:

    class CustomTrainer(Trainer):

        def __init__(self, *args, **kwargs):
            super(CustomTrainer, self).__init__(*args, **kwargs)

        def _hp_search_setup(self, trial: Any):
            # Ray passes the trial as a plain dict, so the wandb entry can be
            # popped before the Trainer maps the remaining keys onto
            # TrainingArguments; other backends pass trial objects without
            # .pop(), hence the AttributeError guard.
            try:
                trial.pop('wandb', None)
            except AttributeError:
                pass
            super(CustomTrainer, self)._hp_search_setup(trial)

However, this looks like a hack: dropping the wandb entry from the trial config inside _hp_search_setup should be standard Trainer behavior.

That's why I'm submitting a PR that directly modifies the _hp_search_setup method of the Trainer class to ignore the wandb argument when Ray is chosen as the backend.
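
A minimal sketch of the intended behaviour (illustrative names only; the actual change lives inside Trainer._hp_search_setup and checks which search backend is configured):

# Illustrative sketch of the behaviour this PR adds: when Ray is the backend,
# the "wandb" entry is dropped from the trial dict before the remaining keys
# are applied to TrainingArguments. Not the exact transformers internals.
def strip_wandb_from_ray_trial(params, backend):
    if backend == "ray" and isinstance(params, dict):
        params.pop("wandb", None)  # wandb settings belong to the Ray/W&B logger, not the model
    return params

params = {"learning_rate": 2e-5, "wandb": {"project": "Optimization_Project"}}
print(strip_wandb_from_ray_trial(params, backend="ray"))
# {'learning_rate': 2e-05}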

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

I'm tagging @richardliaw and @amogkam as they're directly involved in Ray Tune.

@sgugger (Collaborator) left a comment

Thanks for fixing! Waiting for @richardliaw or @amogkam approval before merging.

@amogkam (Collaborator) left a comment

LGTM! Thanks

@sgugger sgugger merged commit a8d4d67 into huggingface:master Mar 22, 2021
Iwontbecreative pushed a commit to Iwontbecreative/transformers that referenced this pull request Jul 15, 2021
… and Weights & Biases (huggingface#10823)

* Modify the _hp_search_setup method on the Trainer class to handle the wandb argument passed by Ray Tune to model config.

* Reformat single quotes as double quotes.