158 changes: 148 additions & 10 deletions CONTRIBUTING.md
## Contributing to PROJECT

Hi there!
We're thrilled that you'd like to contribute to this project.
Your help is essential for keeping this project great and for making it better.

## Branching Strategy

In general, contributors should develop on branches based off of `main` and pull requests should be made against `main`.
## Submitting Your Contribution

Follow these steps to submit your contribution to the QEfficient repository. In brief:

1. Please read our [code of conduct](CODE-OF-CONDUCT.md) and [license](LICENSE).
1. Fork and clone the repository.
1. Create a new branch based on `main`: `git checkout -b <my-branch-name> main`.
1. Make your changes, add tests, and make sure the tests still pass.
1. Commit your changes using the [DCO](http://developercertificate.org/). You can attest to the DCO by committing with the **-s** or **--signoff** option, or by manually adding a "Signed-off-by" trailer.
1. Push to your fork and submit a pull request from your branch to `main`.
1. Pat yourself on the back and wait for your pull request to be reviewed.
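
For reference, a signed-off commit message ends with a `Signed-off-by` trailer, which `git commit -s` adds automatically (the name and email below are placeholders):

```text
Add support for <model-name>

Signed-off-by: Your Name <your.email@example.com>
```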

### 1. Fork and Clone the Repository

First, fork the repository to your GitHub account, then clone your fork:

```bash
# Fork the repository on GitHub (click the "Fork" button)
# Then clone your fork
git clone git@github.com:YOUR_USERNAME/efficient-transformers.git
cd efficient-transformers

# Add upstream remote to keep your fork in sync
git remote add upstream git@github.com:quic/efficient-transformers.git
```

### 2. Create a Feature Branch

Create a descriptive branch for your changes:

```bash
# Update your main branch
git checkout main
git pull upstream main

# Create a new branch
git checkout -b <branch-name>
```

### 3. Make Your Changes

When making changes to the codebase:

- **Follow Existing Design Patterns**
- Review similar implementations before creating new code
- Maintain consistency with the project's architecture and coding style
- Reuse existing utilities and base classes where applicable

- **Onboarding New Models**
- For adding new model support, refer to the comprehensive guide: `examples/onboarding_guide/causallm/`
- Follow the step-by-step process with code examples provided

- **Testing is Mandatory**
- Add tests for all new features in the appropriate `tests/` subdirectory
- Run tests locally before pushing: `pytest tests/path/to/your/test.py -v`
  - For model additions, verify all 4 pipeline stages (PyTorch HF → KV → ORT → AI 100) and make sure the generated tokens match the reference PyTorch HF output

- **Documentation**
- **For New Features/Flags:**
- Document usage in `docs/source/<appropriate-page>` with feature description and usage examples
- Ensure documentation is clear enough for others to understand and use the feature
- **For New Models:**
- Test with basic inference scripts in the `examples/` folder
- If specific changes are needed, create a dedicated example file
- Update `docs/source/validate.md` with the model's HuggingFace card name and relevant details


- **Code Quality Checks**
- Pre-commit hooks, DCO sign-off, and CI checks are covered in the following steps
- Ensure you complete steps 4-8 before finalizing your PR

### 4. Run Pre-commit Checks

Before committing, ensure your code passes all quality checks:

```bash
# Install pre-commit and ruff if not already installed
pip install pre-commit
pip install ruff

# Run pre-commit on your changed files
pre-commit run --files path/to/your/file1.py path/to/your/file2.py

# Run Ruff check
ruff check
```

**Important:** If pre-commit reports any failures:
- Some issues will be auto-fixed (formatting, trailing whitespace, etc.)
- For issues that aren't auto-fixed, manually correct them
- Re-run `pre-commit run --files <files>` or `ruff check` until all checks pass

### 5. Commit with Sign-off (DCO)

All commits must be signed off to comply with the Developer Certificate of Origin (DCO):

```bash
# Stage your changes
git add examples/your_domain/your_example.py
git add examples/your_domain/README.md

# Commit with sign-off
git commit -s --author "Your Name <your.email@example.com>" -m "Add [model-name] support

- Implements inference for [model-name]
- Includes documentation and usage examples
- Tested with [specific configurations]"
```

**Commit Message Guidelines:**
- Use a clear, descriptive title
- Add a blank line, then detailed description if needed
- Always include the `-s` flag for DCO sign-off

### 6. Push to Your Fork

Push your branch to your forked repository:

```bash
git push origin <branch-name>
```

### 7. Create a Pull Request

1. Go to your fork on GitHub
2. Click "Compare & pull request" for your branch
3. Fill out the PR template with:
- **Title:** Clear, descriptive title (e.g., "Add Llama-3.2-Vision Support" or "Fix memory leak in KV cache")
- **Description:**
- What changes were made and why
- What problem it solves or feature it adds
- Any special considerations or breaking changes
- Links to relevant documentation, issues, or model cards (if applicable)
- **Testing:** Describe how you tested your changes
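
A minimal description skeleton covering these points might look like the following (the section names are illustrative, not a required template):

```markdown
## What
Adds inference support for <model-name>.

## Why
Extends model coverage; see linked issue / model card.

## Breaking changes
None.

## Testing
Verified all 4 pipeline stages (PyTorch HF → KV → ORT → AI 100) locally.
```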

### 8. Ensure CI Checks Pass

After creating the PR, verify that all automated checks pass:

- ✅ **DCO Check:** Ensures all commits are signed off
- ✅ **Lint Check:** Code style and formatting validation
- ✅ **Tests:** Automated test suite (if applicable)

If any checks fail:
1. Review the error messages in the PR
2. Make necessary fixes in your local branch
3. Commit and push the fixes (with sign-off)
4. The PR will automatically update and re-run checks

### 9. Address Review Feedback

Maintainers will review your PR and may request changes:
- Make requested changes in your local branch
- Commit with sign-off and push to update the PR
- Respond to comments to facilitate discussion


Here are a few things you can do that will increase the likelihood that your pull request will be accepted:

196 changes: 196 additions & 0 deletions examples/onboarding_guide/causallm/README.md
# Onboarding a CausalLM Model

## Prerequisites

Install the `qefficient-transformers` library in editable mode:
```sh
git clone https://github.com/quic/efficient-transformers.git
cd efficient-transformers
pip install -e .
```

## Introduction

This guide walks you through onboarding a new CausalLM model to QEfficient-transformers. We use an example model named `Blueprint` to demonstrate the required changes.

---

## Onboarding Process

![Onboarding Flowchart](./Onboarding.png)

---

## Step 1: Check Transformers Library

1. **Locate the model** in the transformers library:
- Path: `/src/transformers/models/<model_name>/modeling_<model_name>.py`
- Example: `/src/transformers/models/blueprint/modeling_blueprint.py`

2. **Identify required classes**:
- Attention Layer
- Decoder Layer
- Model (main class)
- ForCausalLM (top-level)
- RMSNorm/LayerNorm
- RotaryEmbedding (if applicable)

3. **Check existing implementations** in `QEfficient/transformers/models/`:
- If similar classes exist → Reuse patterns
- If not → Create custom implementations
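
A quick way to enumerate the classes defined in a modeling file is to inspect the imported module. This is a generic helper sketch, not part of QEfficient, and the Llama path in the docstring is only an example:

```python
import importlib
import inspect

def modeling_classes(module_name):
    """Return the names of classes defined directly in the given module.

    e.g. modeling_classes("transformers.models.llama.modeling_llama")
    lists the Attention/DecoderLayer/Model/ForCausalLM classes to mirror.
    """
    mod = importlib.import_module(module_name)
    return sorted(
        name
        for name, obj in inspect.getmembers(mod, inspect.isclass)
        # keep only classes defined in this file, not ones it imports
        if obj.__module__ == mod.__name__
    )
```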

---

## Step 2: Create Custom Files & Mappings

### 2.1 Create Custom Modeling File

Create directory structure:
```
QEfficient/transformers/models/blueprint/
├── __init__.py
└── modeling_blueprint.py
```

**Key modifications in `modeling_blueprint.py`:**
- `QEffBlueprintRotaryEmbedding`: Precompute sin/cos for rotary embeddings
- `QEffBlueprintAttention`: Use `position_ids`, return `past_key_value`, implement `__qeff_init__`
- `QEffBlueprintDecoderLayer`: Return `past_key_value` from forward pass
- `QEffBlueprintModel`: Use `QEffDynamicCache` instead of standard cache
- `QEffBlueprintForCausalLM`: Entry point with additional parameters

See `modeling_example.py` for detailed implementation examples.
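
The sin/cos precomputation mentioned above can be sketched in plain Python. The real implementation operates on tensors, and the function and argument names here are illustrative, but the frequency formula is the standard RoPE one:

```python
import math

def precompute_rotary_tables(head_dim, max_positions, base=10000.0):
    """Precompute per-position sin/cos tables for rotary embeddings.

    Standard RoPE frequencies: inv_freq[i] = base ** (-2*i / head_dim).
    Returns (cos, sin), each shaped [max_positions][head_dim // 2].
    """
    inv_freq = [base ** (-(2.0 * i) / head_dim) for i in range(head_dim // 2)]
    cos, sin = [], []
    for pos in range(max_positions):
        angles = [pos * f for f in inv_freq]
        cos.append([math.cos(a) for a in angles])
        sin.append([math.sin(a) for a in angles])
    return cos, sin
```

Precomputing these tables once, rather than recomputing them on every forward pass, is what makes the export graph static.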

### 2.2 Add Mappings in pytorch_transforms.py

**CustomOpsTransform** (RMSNorm mapping):
```python
class CustomOpsTransform(ModuleMappingTransform):
_module_mapping = {
BlueprintRMSNorm: CustomRMSNormAIC,
}
```

**KVCacheTransform** (all model classes):
```python
class KVCacheTransform(ModuleMappingTransform):
_module_mapping = {
BlueprintAttention: QEffBlueprintAttention,
BlueprintDecoderLayer: QEffBlueprintDecoderLayer,
BlueprintModel: QEffBlueprintModel,
BlueprintForCausalLM: QEffBlueprintForCausalLM,
}
```

See `example_pytorch_transforms.py` for a complete example.
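
The effect of a module-mapping transform can be illustrated with a small stand-alone sketch: walk a set of modules and swap the class of each one whose type appears in the mapping. The class names below are dummies, and the real transform operates on a `torch.nn.Module` tree:

```python
class OrigNorm:            # stands in for e.g. BlueprintRMSNorm
    pass

class CustomNormAIC:       # stands in for e.g. CustomRMSNormAIC
    pass

MODULE_MAPPING = {OrigNorm: CustomNormAIC}

def apply_module_mapping(modules, mapping=MODULE_MAPPING):
    """Swap the class of every module whose type appears in the mapping."""
    for m in modules:
        target = mapping.get(type(m))
        if target is not None:
            m.__class__ = target  # in-place class swap preserves module state
    return modules
```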

---

## Step 3: Testing (4-Stage Pipeline)

Your implementation is validated through four stages:

| Stage | Description | Validation |
|-------|-------------|------------|
| **1. PyTorch HF** | Original transformers model | Baseline tokens |
| **2. PyTorch KV** | After QEff transforms | Tokens match Stage 1 |
| **3. ONNX/ORT** | After export to ONNX | Tokens match Stage 2 |
| **4. Cloud AI 100** | Hardware execution | Tokens match Stage 3 |

**Test function:** `check_causal_lm_pytorch_vs_kv_vs_ort_vs_ai100` in `tests/transformers/models/test_causal_lm_models.py`
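
Conceptually, each stage's check reduces to comparing generated token IDs against the previous stage. A minimal sketch (stage names are illustrative):

```python
def validate_stages(stage_tokens):
    """stage_tokens: insertion-ordered dict of stage name -> list of token ids.

    Returns the first mismatching stage pair, or None if all stages agree.
    """
    names = list(stage_tokens)
    for prev, curr in zip(names, names[1:]):
        if stage_tokens[prev] != stage_tokens[curr]:
            return (prev, curr)
    return None
```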

### Common Issues

**Token mismatch (Stage 1→2):**
- Check all classes are mapped in `KVCacheTransform`
- Verify `__qeff_init__` methods exist
- Ensure `position_ids` are correctly passed

**ONNX export failure (Stage 2→3):**
- Check for unsupported PyTorch operations
- Verify dynamic shapes are defined

**Compilation failure (Stage 3→4):**
- Reduce `num_cores` or model size
- Check device availability: `get_available_device_id()`

---

## Step 4: Add to Test Suite

Edit `tests/transformers/models/test_causal_lm_models.py`:

```python
test_models_causal = [
"TinyLlama/TinyLlama-1.1B-Chat-v1.0",
"gpt2",
# ... existing models ...
"YourOrg/YourModel-7B", # Add your model here
]
```

**Run tests:**
```bash
# Test your specific model
pytest tests/transformers/models/test_causal_lm_models.py::test_custom_causal_lm_pytorch_vs_kv_vs_ort_vs_ai100 -k "YourModel" -v

# Run all regular tests
pytest tests/transformers/models/test_causal_lm_models.py -m regular
```

---

## Step 5: Validation Checklist

Before submitting PR:

**Implementation:**
- [ ] Created `QEfficient/transformers/models/<model_name>/` directory
- [ ] Implemented all required custom classes
- [ ] Added mappings in `CustomOpsTransform` and `KVCacheTransform`
- [ ] Added imports at top of `pytorch_transforms.py`

**Testing:**
- [ ] Model added to `test_models_causal` list
- [ ] All 4 stages pass (PyTorch HF → KV → ORT → AI 100)
- [ ] Continuous batching tests pass
- [ ] `qconfig.json` generated successfully

**Code Quality:**
- [ ] Code follows project style guidelines
- [ ] Commits use DCO sign-off (`git commit -s`)
- [ ] Branch created from `main`

---

## Step 6: Submit Pull Request

Follow guidelines in [CONTRIBUTING.md](../../../CONTRIBUTING.md):

1. Create feature branch: `git checkout -b add-yourmodel-support main`
2. Commit with DCO: `git commit -s -m "Add support for YourModel"`
3. Push and create PR targeting `main` branch
4. Include test results in PR description

---

## Troubleshooting Quick Reference

| Issue | Solution |
|-------|----------|
| Token mismatch between stages | Check class mappings, verify `position_ids` handling |
| Shape errors | Verify KV cache dimensions, check `past_key_value` returns |
| ONNX export fails | Replace unsupported ops, define dynamic shapes |
| Compilation fails | Reduce `num_cores`, check device availability |
| Runtime errors | Verify input shapes match specializations |

**Debug tip:** Start with `n_layer=1` and short prompts, then gradually increase complexity.

---

## References

- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [QEfficient Transformers](https://github.com/quic/efficient-transformers)
- [Contributing Guidelines](../../../CONTRIBUTING.md)
- [Test Suite](../../../tests/transformers/models/test_causal_lm_models.py)