### What

Let's implement prefill-decode logic for quantization of Llama-based models.

### How

`quantize_full_qmodel_with_gptq.py` should produce two Circle models (prefill and decode) on demand, together with their quality values.
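
For context, here is a minimal toy sketch of why the two phases are usually exported as separate graphs. It is not the code of `quantize_full_qmodel_with_gptq.py`. The function names `prefill` and `decode`, the scalar "projections", and the cache layout are all hypothetical and chosen only for illustration. Prefill consumes the whole prompt and builds the key/value cache, while decode consumes a single new token plus that cache, so the two phases take differently shaped inputs.

```python
import math


def _attend(query, keys, values):
    """Softmax attention of one scalar query over scalar keys/values."""
    scores = [query * k for k in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def prefill(tokens):
    """Hypothetical prefill phase: process the full prompt, return outputs and KV cache."""
    keys, values, outputs = [], [], []
    for tok in tokens:
        keys.append(0.5 * tok)    # toy key projection (assumption)
        values.append(2.0 * tok)  # toy value projection (assumption)
        # Causal attention: each position sees only itself and earlier positions.
        outputs.append(_attend(float(tok), keys, values))
    return outputs, (keys, values)


def decode(token, cache):
    """Hypothetical decode phase: process one new token using the existing KV cache."""
    keys, values = cache
    keys = keys + [0.5 * token]
    values = values + [2.0 * token]
    return _attend(float(token), keys, values), (keys, values)


# Prefill on the prompt, then decode one more token.
prompt_outputs, kv_cache = prefill([1.0, 2.0, 3.0])
next_output, kv_cache = decode(4.0, kv_cache)

# Reference: run the whole sequence through prefill in one pass.
full_outputs, _ = prefill([1.0, 2.0, 3.0, 4.0])
```

In this sketch the decode step reproduces the last output of a full-sequence pass, which is the property a prefill/decode pair is expected to preserve. How closely the quantized models preserve it is presumably what the quality values are meant to show.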