[REQUEST] make ZeRO inference config more intuitive wrt train_* configuration #1738

@stas00

Description

With the great improvements made over the last few months, ZeRO can now be easily used for inference, as in this example:
huggingface/transformers#15399 (comment)
But for a user who no longer cares about training, it's not intuitive to have to set these train-specific config variables:

    "train_batch_size": train_batch_size,
    "train_micro_batch_size_per_gpu": 1,

It's time to come up with train/inference-agnostic config names and deprecate the train-specific ones.

There might be others, but these two stood out.
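For context, a minimal ZeRO-3 inference config today still has to carry the train-prefixed keys even though no training happens. This is an illustrative sketch (values and the `zero_optimization` settings are placeholders, not taken from the linked example):

```json
{
    "train_batch_size": 8,
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3
    },
    "fp16": {
        "enabled": true
    }
}
```

Something like a generic `batch_size` / `micro_batch_size_per_gpu` pair (naming to be decided) would read more naturally in an inference-only setup.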

Thank you.

Metadata

Labels

    enhancement (New feature or request)