debug: correct token order #5

bluebread · 2025-11-23T10:18:42Z

Make sure to read the contributing guidelines before submitting a PR

@sfallah
llama-mtmd-cli applies a chat template to the input text and images and, by default, appends the image tokens after the text tokens. However, it looks like DeepSeek-OCR doesn't work when the image follows the text. With the order of <image> and the text swapped, the model outputs only two tokens ("}" and EOS). This doesn't look like a bug in the original model, but I don't know whether it is intentional.

Therefore, I slightly changed how llama-mtmd-cli tokenizes the input. There may be a better way to do this, though.
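
To illustrate the ordering issue, here is a minimal Python sketch of the idea, not the actual change in this PR. It assumes the prompt is a plain string in which each image is represented by a placeholder marker; the marker value `<__media__>`, the function name `build_prompt`, and the `image_first` parameter are all hypothetical and chosen only for illustration.

```python
# Hypothetical placeholder; the marker string really used by mtmd may differ.
MEDIA_MARKER = "<__media__>"


def build_prompt(text: str, n_images: int, image_first: bool = True,
                 marker: str = MEDIA_MARKER) -> str:
    """Return a prompt with one marker per image placed before or after the text."""
    if marker in text:
        # The caller already positioned the marker(s); leave the prompt untouched.
        return text
    markers = marker * n_images
    # image_first=True puts the image before the text, the order that
    # DeepSeek-OCR appears to need according to the description above.
    return markers + text if image_first else text + markers
```

Calling `build_prompt("Read the text in this image.", 1)` puts the marker in front of the text, while `image_first=False` reproduces the append-after-text default described above. A prompt that already contains the marker is returned unchanged, so an explicit placement by the user always wins.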

In correct order:

In reverse order:

mtmd: correct token order

3f71188

sfallah merged commit a594990 into sfallah:sf/deepseek-ocr Nov 23, 2025
