158 changes: 148 additions & 10 deletions CONTRIBUTING.md
## Contributing to PROJECT

Hi there!
We're thrilled that you'd like to contribute to this project.
Your help is essential for keeping this project great and for making it better.

## Branching Strategy

In general, contributors should develop on branches based off of `main` and pull requests should be made against `main`.
## Submitting Your Contribution

Follow these steps to submit your contribution to the QEfficient repository. In brief:

1. Please read our [code of conduct](CODE-OF-CONDUCT.md) and [license](LICENSE).
1. Fork and clone the repository.
1. Create a new branch based on `main`: `git checkout -b <my-branch-name> main`.
1. Make your changes, add tests, and make sure the tests still pass.
1. Commit your changes using the [DCO](http://developercertificate.org/). You can attest to the DCO by committing with the **-s** or **--signoff** option, or by manually adding a "Signed-off-by" trailer.
1. Push to your fork and submit a pull request from your branch to `main`.
1. Pat yourself on the back and wait for your pull request to be reviewed.
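
For reference, a signed-off commit message ends with a `Signed-off-by` trailer, which `git commit -s` adds automatically (the name and email below are placeholders):

```text
Add support for <model-name>

Signed-off-by: Your Name <your.email@example.com>
```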

### 1. Fork and Clone the Repository

First, fork the repository to your GitHub account, then clone your fork:

```bash
# Fork the repository on GitHub (click the "Fork" button)
# Then clone your fork
git clone git@github.com:YOUR_USERNAME/efficient-transformers.git
cd efficient-transformers

# Add upstream remote to keep your fork in sync
git remote add upstream git@github.com:quic/efficient-transformers.git
```

### 2. Create a Feature Branch

Create a descriptive branch for your changes:

```bash
# Update your main branch
git checkout main
git pull upstream main

# Create a new branch
git checkout -b <branch-name>
```

### 3. Make Your Changes

When making changes to the codebase:

- **Follow Existing Design Patterns**
- Review similar implementations before creating new code
- Maintain consistency with the project's architecture and coding style
- Reuse existing utilities and base classes where applicable

- **Onboarding New Models**
- For adding new model support, refer to the comprehensive guide: `examples/onboarding_guide/causallm/`
- Follow the step-by-step process with code examples provided

- **Testing is Mandatory**
- Add tests for all new features in the appropriate `tests/` subdirectory
- Run tests locally before pushing: `pytest tests/path/to/your/test.py -v`
  - For model additions, verify all 4 pipeline stages (PyTorch HF → KV → ORT → AI 100) and make sure the generated tokens match the reference PyTorch HF output

- **Documentation**
- **For New Features/Flags:**
- Document usage in `docs/source/<appropriate-page>` with feature description and usage examples
- Ensure documentation is clear enough for others to understand and use the feature
- **For New Models:**
- Test with basic inference scripts in the `examples/` folder
- If specific changes are needed, create a dedicated example file
- Update `docs/source/validate.md` with the model's HuggingFace card name and relevant details


- **Code Quality Checks**
- Pre-commit hooks, DCO sign-off, and CI checks are covered in the following steps
- Ensure you complete steps 4-8 before finalizing your PR

### 4. Run Pre-commit Checks

Before committing, ensure your code passes all quality checks:

```bash
# Install pre-commit and ruff if not already installed
pip install pre-commit
pip install ruff

# Run pre-commit on your changed files
pre-commit run --files path/to/your/file1.py path/to/your/file2.py

# Run Ruff check
ruff check
```

**Important:** If pre-commit reports any failures:
- Some issues will be auto-fixed (formatting, trailing whitespace, etc.)
- For issues that aren't auto-fixed, manually correct them
- Re-run `pre-commit run --files <files>` or `ruff check` until all checks pass

### 5. Commit with Sign-off (DCO)

All commits must be signed off to comply with the Developer Certificate of Origin (DCO):

```bash
# Stage your changes
git add examples/your_domain/your_example.py
git add examples/your_domain/README.md

# Commit with sign-off
git commit -s --author "Your Name <your.email@example.com>" -m "Add [model-name] support

- Implements inference for [model-name]
- Includes documentation and usage examples
- Tested with [specific configurations]"
```

**Commit Message Guidelines:**
- Use a clear, descriptive title
- Add a blank line, then detailed description if needed
- Always include the `-s` flag for DCO sign-off

### 6. Push to Your Fork

Push your branch to your forked repository:

```bash
git push origin <branch-name>
```

### 7. Create a Pull Request

1. Go to your fork on GitHub
2. Click "Compare & pull request" for your branch
3. Fill out the PR template with:
- **Title:** Clear, descriptive title (e.g., "Add Llama-3.2-Vision Support" or "Fix memory leak in KV cache")
- **Description:**
- What changes were made and why
- What problem it solves or feature it adds
- Any special considerations or breaking changes
- Links to relevant documentation, issues, or model cards (if applicable)
- **Testing:** Describe how you tested your changes
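
A minimal description skeleton covering these points might look like the following (the section names are illustrative, not a required template):

```markdown
## What
Adds inference support for <model-name>.

## Why
Extends model coverage; see linked issue / model card.

## Breaking changes
None.

## Testing
Verified all 4 pipeline stages (PyTorch HF → KV → ORT → AI 100) locally.
```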

### 8. Ensure CI Checks Pass

After creating the PR, verify that all automated checks pass:

- ✅ **DCO Check:** Ensures all commits are signed off
- ✅ **Lint Check:** Code style and formatting validation
- ✅ **Tests:** Automated test suite (if applicable)

If any checks fail:
1. Review the error messages in the PR
2. Make necessary fixes in your local branch
3. Commit and push the fixes (with sign-off)
4. The PR will automatically update and re-run checks

### 9. Address Review Feedback

Maintainers will review your PR and may request changes:
- Make requested changes in your local branch
- Commit with sign-off and push to update the PR
- Respond to comments to facilitate discussion


Here are a few things you can do that will increase the likelihood that your pull request will be accepted:

196 changes: 196 additions & 0 deletions examples/onboarding_guide/causallm/README.md
# Onboarding a CausalLM Model

## Prerequisites

Install the `qefficient-transformers` library in editable mode:
```sh
git clone https://github.com/quic/efficient-transformers.git
cd efficient-transformers
pip install -e .
```

## Introduction

This guide walks you through onboarding a new CausalLM model to QEfficient-transformers. We use an example model named `Blueprint` to demonstrate the required changes.

---

## Onboarding Process

![Onboarding Flowchart](./Onboarding.png)

---

## Step 1: Check Transformers Library

1. **Locate the model** in the transformers library:
- Path: `/src/transformers/models/<model_name>/modeling_<model_name>.py`
- Example: `/src/transformers/models/blueprint/modeling_blueprint.py`

2. **Identify required classes**:
- Attention Layer
- Decoder Layer
- Model (main class)
- ForCausalLM (top-level)
- RMSNorm/LayerNorm
- RotaryEmbedding (if applicable)

3. **Check existing implementations** in `QEfficient/transformers/models/`:
- If similar classes exist → Reuse patterns
- If not → Create custom implementations
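
A quick way to enumerate the classes defined in a modeling file is to inspect the imported module. This is a generic helper sketch, not part of QEfficient, and the Llama path in the docstring is only an example:

```python
import importlib
import inspect

def modeling_classes(module_name):
    """Return the names of classes defined directly in the given module.

    e.g. modeling_classes("transformers.models.llama.modeling_llama")
    lists the Attention/DecoderLayer/Model/ForCausalLM classes to mirror.
    """
    mod = importlib.import_module(module_name)
    return sorted(
        name
        for name, obj in inspect.getmembers(mod, inspect.isclass)
        # keep only classes defined in this file, not ones it imports
        if obj.__module__ == mod.__name__
    )
```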

---

## Step 2: Create Custom Files & Mappings

### 2.1 Create Custom Modeling File

Create directory structure:
```
QEfficient/transformers/models/blueprint/
├── __init__.py
└── modeling_blueprint.py
```

**Key modifications in `modeling_blueprint.py`:**
- `QEffBlueprintRotaryEmbedding`: Precompute sin/cos for rotary embeddings
- `QEffBlueprintAttention`: Use `position_ids`, return `past_key_value`, implement `__qeff_init__`
- `QEffBlueprintDecoderLayer`: Return `past_key_value` from forward pass
- `QEffBlueprintModel`: Use `QEffDynamicCache` instead of standard cache
- `QEffBlueprintForCausalLM`: Entry point with additional parameters

See `modeling_example.py` for detailed implementation examples.
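
The sin/cos precomputation mentioned above can be sketched in plain Python. The real implementation operates on tensors, and the function and argument names here are illustrative, but the frequency formula is the standard RoPE one:

```python
import math

def precompute_rotary_tables(head_dim, max_positions, base=10000.0):
    """Precompute per-position sin/cos tables for rotary embeddings.

    Standard RoPE frequencies: inv_freq[i] = base ** (-2*i / head_dim).
    Returns (cos, sin), each shaped [max_positions][head_dim // 2].
    """
    inv_freq = [base ** (-(2.0 * i) / head_dim) for i in range(head_dim // 2)]
    cos, sin = [], []
    for pos in range(max_positions):
        angles = [pos * f for f in inv_freq]
        cos.append([math.cos(a) for a in angles])
        sin.append([math.sin(a) for a in angles])
    return cos, sin
```

Precomputing these tables once, rather than recomputing them on every forward pass, is what makes the export graph static.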

### 2.2 Add Mappings in pytorch_transforms.py

**CustomOpsTransform** (RMSNorm mapping):
```python
class CustomOpsTransform(ModuleMappingTransform):
_module_mapping = {
BlueprintRMSNorm: CustomRMSNormAIC,
}
```

**KVCacheTransform** (all model classes):
```python
class KVCacheTransform(ModuleMappingTransform):
_module_mapping = {
BlueprintAttention: QEffBlueprintAttention,
BlueprintDecoderLayer: QEffBlueprintDecoderLayer,
BlueprintModel: QEffBlueprintModel,
BlueprintForCausalLM: QEffBlueprintForCausalLM,
}
```

See `example_pytorch_transforms.py` for a complete example.
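
The effect of a module-mapping transform can be illustrated with a small stand-alone sketch: walk a set of modules and swap the class of each one whose type appears in the mapping. The class names below are dummies, and the real transform operates on a `torch.nn.Module` tree:

```python
class OrigNorm:            # stands in for e.g. BlueprintRMSNorm
    pass

class CustomNormAIC:       # stands in for e.g. CustomRMSNormAIC
    pass

MODULE_MAPPING = {OrigNorm: CustomNormAIC}

def apply_module_mapping(modules, mapping=MODULE_MAPPING):
    """Swap the class of every module whose type appears in the mapping."""
    for m in modules:
        target = mapping.get(type(m))
        if target is not None:
            m.__class__ = target  # in-place class swap preserves module state
    return modules
```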

---

## Step 3: Testing (4-Stage Pipeline)

Your implementation is validated through four stages:

| Stage | Description | Validation |
|-------|-------------|------------|
| **1. PyTorch HF** | Original transformers model | Baseline tokens |
| **2. PyTorch KV** | After QEff transforms | Tokens match Stage 1 |
| **3. ONNX/ORT** | After export to ONNX | Tokens match Stage 2 |
| **4. Cloud AI 100** | Hardware execution | Tokens match Stage 3 |

**Test function:** `check_causal_lm_pytorch_vs_kv_vs_ort_vs_ai100` in `tests/transformers/models/test_causal_lm_models.py`
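
Conceptually, each stage's check reduces to comparing generated token IDs against the previous stage. A minimal sketch (stage names are illustrative):

```python
def validate_stages(stage_tokens):
    """stage_tokens: insertion-ordered dict of stage name -> list of token ids.

    Returns the first mismatching stage pair, or None if all stages agree.
    """
    names = list(stage_tokens)
    for prev, curr in zip(names, names[1:]):
        if stage_tokens[prev] != stage_tokens[curr]:
            return (prev, curr)
    return None
```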

### Common Issues

**Token mismatch (Stage 1→2):**
- Check all classes are mapped in `KVCacheTransform`
- Verify `__qeff_init__` methods exist
- Ensure `position_ids` are correctly passed

**ONNX export failure (Stage 2→3):**
- Check for unsupported PyTorch operations
- Verify dynamic shapes are defined

**Compilation failure (Stage 3→4):**
- Reduce `num_cores` or model size
- Check device availability: `get_available_device_id()`

---

## Step 4: Add to Test Suite

Edit `tests/transformers/models/test_causal_lm_models.py`:

```python
test_models_causal = [
"TinyLlama/TinyLlama-1.1B-Chat-v1.0",
"gpt2",
# ... existing models ...
"YourOrg/YourModel-7B", # Add your model here
]
```

**Run tests:**
```bash
# Test your specific model
pytest tests/transformers/models/test_causal_lm_models.py::test_custom_causal_lm_pytorch_vs_kv_vs_ort_vs_ai100 -k "YourModel" -v

# Run all regular tests
pytest tests/transformers/models/test_causal_lm_models.py -m regular
```

---

## Step 5: Validation Checklist

Before submitting PR:

**Implementation:**
- [ ] Created `QEfficient/transformers/models/<model_name>/` directory
- [ ] Implemented all required custom classes
- [ ] Added mappings in `CustomOpsTransform` and `KVCacheTransform`
- [ ] Added imports at top of `pytorch_transforms.py`

**Testing:**
- [ ] Model added to `test_models_causal` list
- [ ] All 4 stages pass (PyTorch HF → KV → ORT → AI 100)
- [ ] Continuous batching tests pass
- [ ] `qconfig.json` generated successfully

**Code Quality:**
- [ ] Code follows project style guidelines
- [ ] Commits use DCO sign-off (`git commit -s`)
- [ ] Branch created from `main`

---

## Step 6: Submit Pull Request

Follow guidelines in [CONTRIBUTING.md](../../../CONTRIBUTING.md):

1. Create feature branch: `git checkout -b add-yourmodel-support main`
2. Commit with DCO: `git commit -s -m "Add support for YourModel"`
3. Push and create PR targeting `main` branch
4. Include test results in PR description

---

## Troubleshooting Quick Reference

| Issue | Solution |
|-------|----------|
| Token mismatch between stages | Check class mappings, verify `position_ids` handling |
| Shape errors | Verify KV cache dimensions, check `past_key_value` returns |
| ONNX export fails | Replace unsupported ops, define dynamic shapes |
| Compilation fails | Reduce `num_cores`, check device availability |
| Runtime errors | Verify input shapes match specializations |

**Debug tip:** Start with `n_layer=1` and short prompts, then gradually increase complexity.

---

## References

- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [QEfficient Transformers](https://github.com/quic/efficient-transformers)
- [Contributing Guidelines](../../../CONTRIBUTING.md)
- [Test Suite](../../../tests/transformers/models/test_causal_lm_models.py)