
Conversation

@Jack-Khuu Jack-Khuu (Contributor) commented on Aug 1, 2024

As described in Issue #932, the legacy implementation of the arg parser forces subcommands to require CLI args that they don't actually use.

This PR fixes that by using a safe getattr check instead of a raw attribute access.

As a side effect, it also lets us remove the conditional suppression of args: we can simply omit an arg from a subcommand's parser when it isn't needed.

This also happens to solve a --help bug (#976).
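
A minimal sketch of the getattr pattern described above (illustrative only, not the actual torchchat code; the --num-samples example is hypothetical):

import argparse

parser = argparse.ArgumentParser(prog="torchchat chat")
# In this sketch the chat subcommand never registers --num-samples,
# so the parsed namespace has no num_samples attribute at all.
parser.add_argument("--max-new-tokens", type=int, default=200)
args = parser.parse_args([])

# Legacy pattern: a raw field access raises AttributeError for args the verb doesn't use.
# num_samples = args.num_samples

# Safe pattern: getattr falls back to a default when the arg was never registered.
num_samples = getattr(args, "num_samples", 1)
print(num_samples)  # -> 1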


chat
 python torchchat.py chat --help
usage: torchchat chat [-h] [--checkpoint-path CHECKPOINT_PATH] [--compile] [--compile-prefill] [--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}] [--quantize QUANTIZE] [--device {fast,cpu,cuda,mps}]
                      [--dso-path DSO_PATH | --pte-path PTE_PATH] [--max-new-tokens MAX_NEW_TOKENS] [--top-k TOP_K] [--temperature TEMPERATURE] [--hf-token HF_TOKEN] [--model-directory MODEL_DIRECTORY] [-v] [--seed SEED]
                      [model]

options:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose output
  --seed SEED           Initialize torch seed

Model Specification:
  (REQUIRED) Specify the base model. Args are mutually exclusive.

  model                 Model name for well-known models
  --checkpoint-path CHECKPOINT_PATH
                        Use the specified model checkpoint path

Model Configuration:
  Specify model configurations

  --compile             Whether to compile the model with torch.compile
  --compile-prefill     Whether to compile the prefill. Improves prefill perf, but has higher compile times.
  --dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}
                        Override the dtype of the model (default is the checkpoint dtype). Options: bf16, fp16, fp32, fast16, fast
  --quantize QUANTIZE   Quantization options. pass in as '{"<mode>" : {"<argname1>" : <argval1>, "<argname2>" : <argval2>,...},}' modes are: embedding, linear:int8, linear:int4, linear:a8w4dq, precision.
  --device {fast,cpu,cuda,mps}
                        Hardware device to use. Options: cpu, cuda, mps

Exported Model Path:
  Specify the path of the exported model files to ingest

  --dso-path DSO_PATH   Use the specified AOT Inductor .dso model file
  --pte-path PTE_PATH   Use the specified ExecuTorch .pte model file

Generation:
  Configs for generating output based on provided prompt

  --max-new-tokens MAX_NEW_TOKENS
                        Maximum number of new tokens
  --top-k TOP_K         Top-k for sampling
  --temperature TEMPERATURE
                        Temperature for sampling

Model Downloading:
  Specify args for model downloading (if model is not downloaded)

  --hf-token HF_TOKEN   A HuggingFace API token to use when downloading model artifacts
  --model-directory MODEL_DIRECTORY
                        The directory to store downloaded model artifacts. Default: /home/jackkhuu/.torchchat/model-cache

generate
python torchchat.py generate --help
usage: torchchat generate [-h] [--checkpoint-path CHECKPOINT_PATH] [--compile] [--compile-prefill] [--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}] [--quantize QUANTIZE] [--device {fast,cpu,cuda,mps}]
                          [--dso-path DSO_PATH | --pte-path PTE_PATH] [--prompt PROMPT] [--num-samples NUM_SAMPLES] [--max-new-tokens MAX_NEW_TOKENS] [--top-k TOP_K] [--temperature TEMPERATURE] [--hf-token HF_TOKEN]
                          [--model-directory MODEL_DIRECTORY] [-v] [--seed SEED]
                          [model]

options:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose output
  --seed SEED           Initialize torch seed

Model Specification:
  (REQUIRED) Specify the base model. Args are mutually exclusive.

  model                 Model name for well-known models
  --checkpoint-path CHECKPOINT_PATH
                        Use the specified model checkpoint path

Model Configuration:
  Specify model configurations

  --compile             Whether to compile the model with torch.compile
  --compile-prefill     Whether to compile the prefill. Improves prefill perf, but has higher compile times.
  --dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}
                        Override the dtype of the model (default is the checkpoint dtype). Options: bf16, fp16, fp32, fast16, fast
  --quantize QUANTIZE   Quantization options. pass in as '{"<mode>" : {"<argname1>" : <argval1>, "<argname2>" : <argval2>,...},}' modes are: embedding, linear:int8, linear:int4, linear:a8w4dq, precision.
  --device {fast,cpu,cuda,mps}
                        Hardware device to use. Options: cpu, cuda, mps

Exported Model Path:
  Specify the path of the exported model files to ingest

  --dso-path DSO_PATH   Use the specified AOT Inductor .dso model file
  --pte-path PTE_PATH   Use the specified ExecuTorch .pte model file

Generation:
  Configs for generating output based on provided prompt

  --prompt PROMPT       Input prompt for manual output generation
  --num-samples NUM_SAMPLES
                        Number of samples
  --max-new-tokens MAX_NEW_TOKENS
                        Maximum number of new tokens
  --top-k TOP_K         Top-k for sampling
  --temperature TEMPERATURE
                        Temperature for sampling

Model Downloading:
  Specify args for model downloading (if model is not downloaded)

  --hf-token HF_TOKEN   A HuggingFace API token to use when downloading model artifacts
  --model-directory MODEL_DIRECTORY
                        The directory to store downloaded model artifacts. Default: /home/jackkhuu/.torchchat/model-cache

export
python torchchat.py export --help
usage: torchchat export [-h] [--checkpoint-path CHECKPOINT_PATH] [--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}] [--quantize QUANTIZE] [--device {fast,cpu,cuda,mps}]
                        [--output-pte-path OUTPUT_PTE_PATH | --output-dso-path OUTPUT_DSO_PATH] [--hf-token HF_TOKEN] [--model-directory MODEL_DIRECTORY] [-v] [--seed SEED]
                        [model]

options:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose output
  --seed SEED           Initialize torch seed

Model Specification:
  (REQUIRED) Specify the base model. Args are mutually exclusive.

  model                 Model name for well-known models
  --checkpoint-path CHECKPOINT_PATH
                        Use the specified model checkpoint path

Model Configuration:
  Specify model configurations

  --dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}
                        Override the dtype of the model (default is the checkpoint dtype). Options: bf16, fp16, fp32, fast16, fast
  --quantize QUANTIZE   Quantization options. pass in as '{"<mode>" : {"<argname1>" : <argval1>, "<argname2>" : <argval2>,...},}' modes are: embedding, linear:int8, linear:int4, linear:a8w4dq, precision.
  --device {fast,cpu,cuda,mps}
                        Hardware device to use. Options: cpu, cuda, mps

Export Output Path:
  Specify the output path for the exported model files

  --output-pte-path OUTPUT_PTE_PATH
                        Output to the specified ExecuTorch .pte model file
  --output-dso-path OUTPUT_DSO_PATH
                        Output to the specified AOT Inductor .dso model file

Model Downloading:
  Specify args for model downloading (if model is not downloaded)

  --hf-token HF_TOKEN   A HuggingFace API token to use when downloading model artifacts
  --model-directory MODEL_DIRECTORY
                        The directory to store downloaded model artifacts. Default: /home/jackkhuu/.torchchat/model-cache

eval
python torchchat.py eval --help
usage: torchchat eval [-h] [--checkpoint-path CHECKPOINT_PATH] [--compile] [--compile-prefill] [--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}] [--quantize QUANTIZE] [--device {fast,cpu,cuda,mps}]
                      [--dso-path DSO_PATH | --pte-path PTE_PATH] [--tasks TASKS [TASKS ...]] [--limit LIMIT] [--max-seq-length MAX_SEQ_LENGTH] [--hf-token HF_TOKEN] [--model-directory MODEL_DIRECTORY] [-v] [--seed SEED]
                      [model]

options:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose output
  --seed SEED           Initialize torch seed

Model Specification:
  (REQUIRED) Specify the base model. Args are mutually exclusive.

  model                 Model name for well-known models
  --checkpoint-path CHECKPOINT_PATH
                        Use the specified model checkpoint path

Model Configuration:
  Specify model configurations

  --compile             Whether to compile the model with torch.compile
  --compile-prefill     Whether to compile the prefill. Improves prefill perf, but has higher compile times.
  --dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}
                        Override the dtype of the model (default is the checkpoint dtype). Options: bf16, fp16, fp32, fast16, fast
  --quantize QUANTIZE   Quantization options. pass in as '{"<mode>" : {"<argname1>" : <argval1>, "<argname2>" : <argval2>,...},}' modes are: embedding, linear:int8, linear:int4, linear:a8w4dq, precision.
  --device {fast,cpu,cuda,mps}
                        Hardware device to use. Options: cpu, cuda, mps

Exported Model Path:
  Specify the path of the exported model files to ingest

  --dso-path DSO_PATH   Use the specified AOT Inductor .dso model file
  --pte-path PTE_PATH   Use the specified ExecuTorch .pte model file

Evaluation:
  Configs for evaluating model performance

  --tasks TASKS [TASKS ...]
                        List of lm-eluther tasks to evaluate. Usage: --tasks task1 task2
  --limit LIMIT         Number of samples to evaluate
  --max-seq-length MAX_SEQ_LENGTH
                        Maximum length sequence to evaluate

Model Downloading:
  Specify args for model downloading (if model is not downloaded)

  --hf-token HF_TOKEN   A HuggingFace API token to use when downloading model artifacts
  --model-directory MODEL_DIRECTORY
                        The directory to store downloaded model artifacts. Default: /home/jackkhuu/.torchchat/model-cache

@pytorch-bot

pytorch-bot bot commented Aug 1, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/987

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 460ef5b with merge base cd0307a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Aug 1, 2024
if verb == "export":
    _add_export_output_path_args(parser)
if verb == "eval":
    _add_exported_input_path_args(parser)
@Jack-Khuu (Contributor Author) commented:

Note that this arg input was previously dropped and is now added back in (Eval can take exported model inputs)
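
The snippet above registers argument groups per verb. A hypothetical sketch of that idea follows (the helper bodies and the top-level parser wiring are assumptions; only the helper names come from this PR): each subcommand registers only the groups it actually uses, so export gets the output-path args while eval, chat, and generate get the exported-input args.

import argparse

def _add_exported_input_path_args(parser):
    # Body is an assumption for illustration; only the helper name appears in the diff above.
    group = parser.add_argument_group("Exported Model Path")
    group.add_argument("--dso-path", help="Use the specified AOT Inductor .dso model file")
    group.add_argument("--pte-path", help="Use the specified ExecuTorch .pte model file")

def _add_export_output_path_args(parser):
    # Body is an assumption for illustration; only the helper name appears in the diff above.
    group = parser.add_argument_group("Export Output Path")
    group.add_argument("--output-pte-path", help="Output to the specified ExecuTorch .pte model file")
    group.add_argument("--output-dso-path", help="Output to the specified AOT Inductor .dso model file")

top = argparse.ArgumentParser(prog="torchchat")
subparsers = top.add_subparsers(dest="verb")
for verb in ("chat", "generate", "export", "eval"):
    sub = subparsers.add_parser(verb)
    if verb == "export":
        _add_export_output_path_args(sub)
    if verb in ("chat", "generate", "eval"):
        _add_exported_input_path_args(sub)

# `torchchat export --help` lists only the export output paths;
# `torchchat eval --help` once again lists --dso-path / --pte-path.
args = top.parse_args(["export", "--output-pte-path", "model.pte"])
print(args.verb, args.output_pte_path)  # -> export model.pte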

Comment on lines +364 to +384

    Initialize distributed related setups if the user specified
    using distributed inference. If not, this is a no-op.

    Args:
        builder_args (:class:`BuilderArgs`):
            Command args for model building.
    Returns:
        Tuple[Optional[DeviceMesh], Optional[ParallelDims]]:
            - The first element is an optional DeviceMesh object,
              which describes the mesh topology of devices for the DTensor.
            - The second element is an optional ParallelDims object,
              which represents the parallel dimensions configuration.
    """
    if not builder_args.use_distributed:
        return None, None
    dist_config = 'llama3_8B.toml'  # TODO - integrate with chat cmd line

    world_mesh, parallel_dims = launch_distributed(dist_config)

    assert world_mesh is not None and parallel_dims is not None, f"failed to launch distributed using {dist_config}"

@Jack-Khuu (Contributor Author) commented:

lint

@Jack-Khuu Jack-Khuu marked this pull request as ready for review August 1, 2024 00:43