
[Config] Enable component overrides #456

Merged 11 commits into main on Mar 14, 2024
Conversation

@RdoubleA (Contributor) commented Mar 6, 2024

Context

After the config system was updated in #406 with _component_ fields, the CLI override experience for specifying TorchTune objects became clunky. For example, to change datasets, we now have to specify the component field on the CLI:

tune full_finetune --config alpaca_llama2_full_finetune.yaml --override dataset._component_=torchtune.datasets.SlimOrcaDataset

Instead, we update the parse utility so that the component path can be specified without _component_, and merge the overrides properly. The above command now becomes:

tune full_finetune --config alpaca_llama2_full_finetune.yaml dataset=torchtune.datasets.SlimOrcaDataset
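For illustration, a minimal sketch of how such a merge could work with OmegaConf. The function name matches the merge_yaml_and_cli_args utility mentioned in the changelog below, but the body here is an assumption, not the actual torchtune implementation:

from omegaconf import OmegaConf

def merge_yaml_and_cli_args(yaml_args, cli_args):
    """Sketch only: rewrite bare component overrides, then merge onto the YAML config."""
    yaml_conf = OmegaConf.load(yaml_args.config)
    dotlist = []
    for arg in cli_args:
        k, v = arg.split("=", maxsplit=1)
        node = OmegaConf.select(yaml_conf, k)
        # A bare override like `dataset=...` on a key whose YAML value holds a
        # _component_ field is treated as an override of the component path.
        if node is not None and OmegaConf.is_dict(node) and "_component_" in node:
            k = f"{k}._component_"
        dotlist.append(f"{k}={v}")
    # CLI values win over YAML values on merge; new keys are simply added.
    return OmegaConf.merge(yaml_conf, OmegaConf.from_dotlist(dotlist))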

Changelog

  • Update parsing to recognize _component_ and enable component overrides by adding a merge utility, merge_yaml_and_cli_args
  • Remove the --override flag by popular demand
  • Update tutorials

Test plan

Added a unit test and ran pytest tests. Also verified end to end:
tune --nnodes 1 --nproc_per_node 1 full_finetune --config alpaca_llama2_full_finetune dataset=torchtune.datasets.SlimOrcaDataset dataset.train_on_input=False

Running recipe_main with parameters {'tokenizer': {'_component_': 'torchtune.models.llama2.llama2_tokenizer', 'path': '/tmp/llama2/tokenizer.model'}, 'dataset': {'_component_': 'torchtune.datasets.SlimOrcaDataset', 'train_on_input': False}, 'seed': None, 'shuffle': True, 'model': {'_component_': 'torchtune.models.llama2.llama2_7b'}, 'model_checkpoint': '/tmp/llama2_native', 'batch_size': 2, 'epochs': 3, 'optimizer': {'_component_': 'torch.optim.SGD', 'lr': 2e-05}, 'loss': {'_component_': 'torch.nn.CrossEntropyLoss'}, 'max_steps_per_epoch': None, 'gradient_accumulation_steps': 1, 'log_every_n_steps': None, 'run_generation': None, 'resume_from_checkpoint': False, 'device': 'cuda', 'dtype': 'fp32', 'enable_fsdp': True, 'enable_activation_checkpointing': True, 'cpu_offload': False, 'metric_logger': {'_component_': 'torchtune.utils.metric_logging.DiskLogger', 'log_dir': '${output_dir}'}, 'output_dir': '/tmp/alpaca-llama2-finetune'}

@facebook-github-bot added the CLA Signed label Mar 6, 2024
netlify bot commented Mar 6, 2024

Deploy Preview for torchtune-preview ready!

🔨 Latest commit: 91fec7a
🔍 Latest deploy log: https://app.netlify.com/sites/torchtune-preview/deploys/65f3579c407d7100087ebe1d
😎 Deploy Preview: https://deploy-preview-456--torchtune-preview.netlify.app

Comment on lines +49 to +59
yaml_args, cli_args = parser.parse_known_args(
    [
        "--config",
        "test.yaml",
        "b.c=4",    # Test overriding a flat param in a component
        "b=5",      # Test overriding component path
        "b.b.c=6",  # Test nested dotpath
        "d=6",      # Test overriding a flat param
        "e=7",      # Test adding a new param
    ]
)
Contributor
One case that's not covered here is when we only override some of the fields for a given DictConfig. E.g. with _Config as you've defined it above I would like to see the case of just b=5 tested (check that the final value of b.c=3), and the case of just b.c=4 (check that the final value of b._component_=2). I think (?) these did not work before and we actually needed to override every single field, but if I understand these changes correctly that will no longer be the case. Either way, would be good to explicitly test for it.

Contributor Author

It shouldn't require overriding every single field, but yes, it would be good to test for this explicitly.
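For example, something along these lines (a sketch only, assuming the parser fixture and the _Config from this test, where the YAML has b._component_=2 and b.c=3 as described above):

def test_partial_component_override(parser):
    # Override only the component path; the sibling param from YAML survives.
    yaml_args, cli_args = parser.parse_known_args(["--config", "test.yaml", "b=5"])
    conf = merge_yaml_and_cli_args(yaml_args, cli_args)
    assert conf.b["_component_"] == 5
    assert conf.b.c == 3  # untouched YAML value

    # Override only a nested param; the component path from YAML survives.
    yaml_args, cli_args = parser.parse_known_args(["--config", "test.yaml", "b.c=4"])
    conf = merge_yaml_and_cli_args(yaml_args, cli_args)
    assert conf.b["_component_"] == 2  # untouched YAML value
    assert conf.b.c == 4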


Overriding components
^^^^^^^^^^^^^^^^^^^^^
If you would like to override a parameter in the config that has a :code:`_component_`
Contributor

Nitpicking.

Suggested change:
- If you would like to override a parameter in the config that has a :code:`_component_`
+ If you would like to override a class or function in the config that is instantiated via the :code:`_component_`

# If a cli arg overrides a yaml arg with a _component_ field, update the
# key string to reflect this
if (
    k in yaml_kwargs
Contributor
So the assumption here is that anything in the CLI overrides was already in the YAML file, right? (I think this is fine and don't see a way to avoid it, just wanna confirm that we will not support the cases in Hydra like +my_appended_config_field=value.)

@RdoubleA (Contributor Author) commented Mar 7, 2024

This is only for components; you cannot append a new component (well, technically you could, but it would not be pretty, and you'd have to use _component_ explicitly). The component has to exist in the yaml file. Appending new config values that aren't in the yaml file will still work without needing the +.
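To illustrate what that means at the OmegaConf level (the dataset path and key names here are just examples):

from omegaconf import OmegaConf

# Appending a brand-new component from the CLI has no YAML entry to rewrite
# against, so _component_ must be spelled out explicitly in the dotpath:
appended = OmegaConf.from_dotlist(
    ["new_dataset._component_=torchtune.datasets.SlimOrcaDataset"]
)
assert appended.new_dataset["_component_"] == "torchtune.datasets.SlimOrcaDataset"

# Plain (non-component) values that are new merge fine, no `+` prefix needed.
merged = OmegaConf.merge(OmegaConf.create({"epochs": 3}), OmegaConf.from_dotlist(["e=7"]))
assert merged.e == 7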

@ebsmothers (Contributor) left a comment

Overall this looks good to me and will improve the UX a lot. Really only one main question from me on the testing: just wanna confirm that we can override individual fields of a DictConfig without overriding all of them -- I think we can but it's not immediately clear from the command in the test plan.


  if enable_fsdp:
-     cmd.append("--enable-fsdp")
+     cmd.append("enable_fsdp=True")
Contributor Author

There is an issue with this test: it passes on main because the flag was not parsed correctly before my update. After the change, this test fails because we don't call init_distributed when enable_fsdp is true. Any advice on how I can quickly patch this? @rohan-varma @ebsmothers

Contributor

Oops, I actually missed this until now, but I think you, Rohan, and I all discovered this more or less independently. #472 should fix this.

@ebsmothers (Contributor) commented:

One more comment here... I think #454 is going to land before this, so please take a pass and make sure all instances of --override are gone in the final version of your PR.

@RdoubleA force-pushed the rafiayub/override_update branch 2 times, most recently from 3fee2ce to ec9ce97, on March 13, 2024 at 19:24
pytorch-bot commented Mar 14, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/456

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 91fec7a with merge base e570803:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@RdoubleA merged commit 9c75d48 into main on Mar 14, 2024 (16 checks passed)
@RdoubleA deleted the rafiayub/override_update branch March 14, 2024 at 20:16