Default arguments of clm example are confusing #13847

@BramVanroy

I was having a look at the run_clm.py script to see which new arguments are available for pushing to the hub.

python transformers\examples\pytorch\language-modeling\run_clm.py -h

I see the following options (note the True defaults for all):

  --no_keep_linebreaks  Whether to keep line breaks when using TXT files or not. (default: True)
  --keep_linebreaks [KEEP_LINEBREAKS]
                        Whether to keep line breaks when using TXT files or not. (default: True)

  --no_dataloader_pin_memory
                        Whether or not to pin memory for DataLoader. (default: True)
  --dataloader_pin_memory [DATALOADER_PIN_MEMORY]
                        Whether or not to pin memory for DataLoader. (default: True)

  --no_skip_memory_metrics
                        Whether or not to skip adding of memory profiler reports to metrics. (default: True)
  --skip_memory_metrics [SKIP_MEMORY_METRICS]
                        Whether or not to skip adding of memory profiler reports to metrics. (default: True)

From this, I cannot figure out what the default behaviour is, or what I should change to get the behaviour I expect. I do not know what the use case for these paired flags is, but it seems much better to keep only one of each option. If one of the two in each pair is deprecated, that could be mentioned in the description too.
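To make the confusion concrete, here is a minimal sketch in plain `argparse` of how such a pair could behave, assuming both flags write to the same destination. This is not the actual parser code used by run_clm.py; the helper `str_to_bool` and the function `build_parser` are hypothetical names made up for this illustration. Under that assumption, the real default is True, the `--no_...` flag switches it off, and the negated flag may show `(default: True)` simply because a `store_false` action's own default is True.

```python
import argparse


def str_to_bool(value):
    """Parse common true/false spellings (hypothetical helper for this sketch)."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("yes", "true", "t", "y", "1"):
        return True
    if lowered in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Expected a boolean value, got {value!r}")


def build_parser():
    # ArgumentDefaultsHelpFormatter appends "(default: ...)" to each help string.
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    # Positive form: takes an optional value; the bare flag means True.
    parser.add_argument(
        "--keep_linebreaks",
        type=str_to_bool,
        nargs="?",
        const=True,
        default=True,
        help="Whether to keep line breaks when using TXT files or not.",
    )
    # Negative form: stores False into the same destination.
    # A store_false action has an implicit default of True.
    parser.add_argument(
        "--no_keep_linebreaks",
        dest="keep_linebreaks",
        action="store_false",
        help="Whether to keep line breaks when using TXT files or not.",
    )
    return parser


parser = build_parser()
default_value = parser.parse_args([]).keep_linebreaks
negated_value = parser.parse_args(["--no_keep_linebreaks"]).keep_linebreaks
explicit_value = parser.parse_args(["--keep_linebreaks", "false"]).keep_linebreaks
print(default_value, negated_value, explicit_value)  # True False False
```

If the real parser works along these lines, the two entries in each pair are not contradictory, but the shared help text and shared `(default: True)` hide which one actually changes the behaviour.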

I'm on current master (4.12 dev).

Who can help

@sgugger, @patil-suraj
