Summary
Granite Switch currently works as a vLLM plugin (out-of-tree model) and a standalone HF model. Should we pursue making GraniteSwitchForCausalLM a supported model architecture in upstream vLLM and/or HuggingFace transformers?
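For context, the out-of-tree route generally looks like the sketch below, which follows the plugin pattern in the vLLM registration docs linked under Links. The module path `granite_switch.vllm_model` is a hypothetical placeholder, not necessarily the layout of the actual plugin. The function would be exposed through a `vllm.general_plugins` entry point in the plugin's packaging metadata.

```python
# Architecture name as it appears in the model's config.json "architectures" field.
ARCH_NAME = "GraniteSwitchForCausalLM"

# Hypothetical "module:Class" path; the real plugin's module layout may differ.
MODEL_PATH = "granite_switch.vllm_model:GraniteSwitchForCausalLM"


def register() -> None:
    """Plugin entry point: register the architecture with vLLM's model registry."""
    # Import lazily so that importing this module does not require vLLM.
    from vllm import ModelRegistry

    # Skip registration if the architecture is already known, for example
    # once it has been accepted upstream.
    if ARCH_NAME not in ModelRegistry.get_supported_archs():
        # Passing a "module:Class" string defers importing the model code
        # until vLLM actually needs it.
        ModelRegistry.register_model(ARCH_NAME, MODEL_PATH)
```

If the architecture were accepted upstream, this registration step, and the plugin package around it, would no longer be needed.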
Considerations
- Pros: Eliminates plugin maintenance burden; broader visibility and testing; benefits from upstream CI and release process.
- Cons: Upstream acceptance criteria may require changes; ongoing maintenance commitment; slower iteration cycle.
Next steps
- Gauge vLLM community interest and understand acceptance criteria
- Try upstreaming the HF model classes to transformers via a PR
- This is a tracking issue; concrete work depends on community feedback
Consolidates original issues #21 (vLLM upstream), #50 (HF upstream), and #61 (vLLM PR).
Links
https://docs.vllm.ai/en/latest/contributing/model/registration/