
Conversation

mengniwang95
Contributor

@mengniwang95 commented Sep 29, 2025

User description

Type of Change

example


PR Type

Enhancement


Description

  • Added scripts for quantizing and benchmarking the Llama4 model

  • Included setup script for environment preparation

  • Provided README with step-by-step instructions


Diagram Walkthrough

flowchart LR
  A["Add run_benchmark.sh"] -- "Benchmarking script" --> B["Add run_quant.sh"]
  B -- "Quantization script" --> C["Add setup.sh"]
  C -- "Environment setup" --> D["Add README.md"]
  D -- "Instructions" --> E["Add requirements.txt"]
  E -- "Dependencies" --> F["Complete Llama4 example"]

File Walkthrough

Relevant files
Enhancement
run_benchmark.sh
Add benchmark script for Llama4                                                   

examples/3.x_api/pytorch/multimodal-modeling/quantization/auto_round/llama4/run_benchmark.sh

  • Added main function to handle script execution
  • Implemented parameter initialization
  • Defined benchmark execution logic
+61/-0   
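
As a rough illustration of the structure described in the entry above, here is a minimal sketch of the main/init_params/run_benchmark pattern commonly used in these example scripts; the Python entry point (main.py) and the flag names are assumptions, not quotes from the merged file:

#!/bin/bash
set -x

function main {
  init_params "$@"
  run_benchmark
}

# Parse command-line flags into shell variables (flag names are illustrative).
function init_params {
  for var in "$@"; do
    case $var in
      --input_model=*) input_model=$(echo "$var" | cut -f2 -d=) ;;
      --batch_size=*)  batch_size=$(echo "$var" | cut -f2 -d=) ;;
      *) echo "Parameter $var is not recognized."; exit 1 ;;
    esac
  done
}

# Launch the benchmark with the collected parameters.
function run_benchmark {
  python main.py --model "${input_model}" --batch_size "${batch_size:-1}"
}

main "$@"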
run_quant.sh
Add quantization script for Llama4                                             

examples/3.x_api/pytorch/multimodal-modeling/quantization/auto_round/llama4/run_quant.sh

  • Added main function to handle script execution
  • Implemented parameter initialization
  • Defined quantization tuning logic
+58/-0   
setup.sh
Add setup script for Llama4 example                                           

examples/3.x_api/pytorch/multimodal-modeling/quantization/auto_round/llama4/setup.sh

  • Added commands to install dependencies
  • Cloned and installed vllm-fork repository
+8/-0     
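
The entry above corresponds roughly to the following sequence, reconstructed from the lines quoted in the review comments below; the editable-install step at the end is an assumption and may differ from the merged script:

#!/bin/bash
# Install the example's Python dependencies.
pip install -r requirements.txt
pip install packaging --upgrade

# Fetch the vllm fork on the mxfp4 branch and install it from source
# (the install command itself is an assumption).
git clone -b mxfp4 https://github.com/mengniwang95/vllm-fork.git
cd vllm-fork
pip install -e .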
Documentation
README.md
Add README for Llama4 example                                                       

examples/3.x_api/pytorch/multimodal-modeling/quantization/auto_round/llama4/README.md

  • Provided step-by-step instructions for environment setup
  • Included commands for model preparation and execution
+36/-0   
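
For a rough sense of the flow such a README describes, a quick-start sketch is below; the placeholders and flag names are illustrative, not quoted from the merged file:

# Prepare the environment.
bash setup.sh

# Quantize the model, then benchmark the resulting checkpoint.
bash run_quant.sh --input_model=<llama4-model-path> --output_model=<output-dir>
bash run_benchmark.sh --input_model=<output-dir>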
Dependencies
requirements.txt
Add requirements for Llama4 example                                           

examples/3.x_api/pytorch/multimodal-modeling/quantization/auto_round/llama4/requirements.txt

  • Listed required Python packages for the example
+6/-0     

Mengni Wang and others added 3 commits September 28, 2025 23:14
Signed-off-by: Mengni Wang <mengni.wang@nitel.com>
@PRAgent4INC
Collaborator

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Typo

There is a typo in the command pip install packaging -- upgrade. It should be pip install packaging --upgrade.

pip install packaging -- upgrade
Default Value

The default value for tuned_checkpoint is set inside the run_tuning function. It would be better to set it in the init_params function to make the default value more explicit.

}
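
Concretely, the reviewer is asking for something along these lines (a sketch; the fallback value shown is illustrative):

function init_params {
  # Make the fallback explicit here rather than inside run_tuning.
  tuned_checkpoint="saved_results"
  for var in "$@"; do
    case $var in
      --tuned_checkpoint=*) tuned_checkpoint=$(echo "$var" | cut -f2 -d=) ;;
    esac
  done
}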
Package Version

Ensure that the specified package versions are compatible with each other and with the rest of the project dependencies.

auto-tound==0.8.0
compressed-tensors
lm-eval
setuptools_scm
torchao==0.12.0
triton==3.3.1
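
One way to confirm these pins resolve together in a clean environment (package list taken from the snippet above, with the auto-round typo corrected):

python -m venv /tmp/llama4-deps && source /tmp/llama4-deps/bin/activate
pip install auto-round==0.8.0 compressed-tensors lm-eval setuptools_scm torchao==0.12.0 triton==3.3.1
pip check   # reports any installed packages with incompatible requirements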

@PRAgent4INC
Collaborator

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Correct package name typo

Correct the typo in the package name auto-tound to auto-round.

examples/3.x_api/pytorch/multimodal-modeling/quantization/auto_round/llama4/requirements.txt [1]

-auto-tound==0.8.0
+auto-round==0.8.0
Suggestion importance[1-10]: 8


Why: Correcting the typo in the package name auto-tound to auto-round is crucial for the script to function correctly, as it ensures the correct package is installed.

Medium
General
Fix space in command

Correct the space between packaging and --upgrade.

examples/3.x_api/pytorch/multimodal-modeling/quantization/auto_round/llama4/setup.sh [3]

-pip install packaging -- upgrade
+pip install packaging --upgrade
Suggestion importance[1-10]: 6


Why: The stray space in -- upgrade breaks the flag; it must be --upgrade for the command to execute correctly.

Low
Verify branch existence

Verify that the branch mxfp4 exists and is the correct one for the project.

examples/3.x_api/pytorch/multimodal-modeling/quantization/auto_round/llama4/setup.sh [5]

git clone -b mxfp4 https://github.com/mengniwang95/vllm-fork.git
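
A quick way to confirm the branch exists on the remote without cloning:

# Prints a ref line if the mxfp4 branch exists on the fork; prints nothing otherwise.
git ls-remote --heads https://github.com/mengniwang95/vllm-fork.git mxfp4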
Suggestion importance[1-10]: 5


Why: Verifying the branch existence is important but does not directly impact the functionality of the script. It is more of a maintenance task.

Low

@chensuyue merged commit 35d72bd into master Sep 30, 2025
12 checks passed
@chensuyue deleted the mengni/scout_example branch September 30, 2025 03:49