Bug
APR inference fails with PMAT-172 (APR file missing embedded tokenizer) on models that ship a SentencePiece tokenizer (tokenizer.model) instead of tokenizer.json.
Reproduction
apr pull internlm/internlm2_5-7b-chat
apr convert ~/.apr/cache/hf/internlm/internlm2_5-7b-chat/model.safetensors.index.json --format apr
apr run output.apr -p "Hello" --max-tokens 10
# ERROR: APR file missing embedded tokenizer
Root Cause
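The APR converter embeds tokenizer data from tokenizer.json (HuggingFace tokenizers format). Models that use SentencePiece do not ship tokenizer.json; they ship tokenizer.model and tokenizer_config.json instead. The converter therefore has nothing to embed, and the resulting APR file lacks a tokenizer, which apr run then reports as the error above.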
Affected Models
- internlm/internlm2_5-7b-chat
- internlm/internlm2_5-20b-chat
- teknium/OpenHermes-2.5-Mistral-7B
- Potentially others using SentencePiece
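To find other affected models, one option is to scan the local cache for model directories that contain tokenizer.model but no tokenizer.json. The sketch below is illustrative and is not part of apr. The function name find_sentencepiece_only_models is hypothetical, and it assumes each cached model keeps its tokenizer files in a single directory, as in the ~/.apr/cache/hf/<org>/<model>/ path used in the reproduction.

```python
from pathlib import Path


def find_sentencepiece_only_models(cache_root: Path) -> list[Path]:
    """Return directories that contain tokenizer.model but no tokenizer.json.

    Hypothetical helper for triage; assumes tokenizer files sit directly
    inside each model directory.
    """
    affected = []
    # Walk the cache recursively and look at every SentencePiece model file.
    for sp_file in cache_root.rglob("tokenizer.model"):
        model_dir = sp_file.parent
        # A sibling tokenizer.json means the converter already has a source.
        if not (model_dir / "tokenizer.json").exists():
            affected.append(model_dir)
    return sorted(affected)
```

Calling it with Path.home() / ".apr" / "cache" / "hf" lists the cached models that are likely to hit this bug after conversion.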
Expected Behavior
The APR converter should fall back to tokenizer.model (SentencePiece) when tokenizer.json is absent and embed it in the APR file.
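The fallback order described here could look roughly like the following. This is a minimal sketch of the selection logic only, not the converter's actual code. The names select_tokenizer_source and TOKENIZER_CANDIDATES and the kind labels are assumptions made for illustration, and how the chosen file is embedded in the APR format is out of scope.

```python
from pathlib import Path
from typing import Optional, Tuple

# Candidate files in priority order. The kind labels are illustrative only.
TOKENIZER_CANDIDATES = [
    ("huggingface-json", "tokenizer.json"),
    ("sentencepiece", "tokenizer.model"),
]


def select_tokenizer_source(model_dir: Path) -> Optional[Tuple[str, Path]]:
    """Pick the first tokenizer file present in model_dir.

    Returns (kind, path), or None when neither file exists.
    """
    for kind, filename in TOKENIZER_CANDIDATES:
        candidate = model_dir / filename
        if candidate.is_file():
            return kind, candidate
    return None
```

Returning None when neither file exists lets the caller fail at conversion time with a clear message, instead of producing an APR file that only fails later at apr run.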