Japan OCR Mini Benchmark v0.1.1

K10124 released this 07 Jun 13:29
· 16 commits to main since this release

Update release with InternVL3.5-14B comparison results.

Changes from v0.1.0:

  • Added InternVL3.5-14B Q8_0 model output for receipt_005_noisy.png
  • Added compare_custom_model_output.py for evaluating arbitrary model outputs
  • Updated experiment_log.md with InternVL comparison results
  • Updated failure_cases.md with InternVL failure cases
  • Updated README with Qwen vs InternVL comparison
  • Documented that the Qwen3.6 35B A3B results were generated with a Q4_K_M-quantized GGUF model in LM Studio
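
To illustrate what evaluating an arbitrary model output against a reference might involve, here is a minimal sketch. It is not the code in compare_custom_model_output.py; the function name `compare_fields` and the field names `total`, `tax`, and `items` are hypothetical and chosen only for the example.

```python
from typing import Any


def compare_fields(
    expected: dict[str, Any], predicted: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {field: (expected, predicted)} for each top-level field that differs.

    Hypothetical helper: the real benchmark schema may be nested or use
    different field names.
    """
    mismatches: dict[str, tuple[Any, Any]] = {}
    # Walk the union of keys so missing and extra fields are both reported.
    for key in sorted(set(expected) | set(predicted)):
        exp, pred = expected.get(key), predicted.get(key)
        if exp != pred:
            mismatches[key] = (exp, pred)
    return mismatches


if __name__ == "__main__":
    # Illustrative values only, not benchmark data.
    reference = {"total": 1080, "tax": 80, "items": ["ポテト"]}
    model_output = {"total": 1080, "tax": 88}
    for field, (exp, pred) in compare_fields(reference, model_output).items():
        print(f"{field}: expected={exp!r} predicted={pred!r}")
```

A field missing from the model output shows up with a predicted value of `None`, so missing items and wrong values are reported in the same pass.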

Model comparison summary:

  • Qwen3.6 35B A3B Q4_K_M GGUF: mostly correct, with small errors in the tax target amounts and dakuten/handakuten errors in item names
  • InternVL3.5-14B Q8_0 GGUF: more serious structured-extraction errors, including missing items and incorrect tax/discount fields
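
Dakuten/handakuten errors (for example, reading ポ as ボ) can be separated from other item-name mismatches by comparing strings with the voicing marks removed. The sketch below is one way to do that using Unicode normalization from the Python standard library; the helper names `strip_voicing` and `is_voicing_only_error` are hypothetical and are not part of this repository.

```python
import unicodedata

DAKUTEN = "\u3099"     # combining voiced sound mark (゛)
HANDAKUTEN = "\u309A"  # combining semi-voiced sound mark (゜)


def strip_voicing(text: str) -> str:
    """Remove dakuten/handakuten so that ボ and ポ both become ホ."""
    # NFD splits precomposed kana such as ポ into base kana + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if ch not in (DAKUTEN, HANDAKUTEN))
    # Recompose anything else that NFD split apart.
    return unicodedata.normalize("NFC", stripped)


def is_voicing_only_error(expected: str, predicted: str) -> bool:
    """True when the two strings differ only in dakuten/handakuten."""
    exp = unicodedata.normalize("NFC", expected)
    pred = unicodedata.normalize("NFC", predicted)
    return exp != pred and strip_voicing(exp) == strip_voicing(pred)


if __name__ == "__main__":
    print(is_voicing_only_error("ポテト", "ボテト"))  # differs only in the voicing mark
    print(is_voicing_only_error("ポテト", "ポテチ"))  # differs in a base kana
```

This sketch assumes full-width kana. Half-width katakana carry their voicing marks as separate characters (U+FF9E, U+FF9F), which NFD does not convert, so such input would need NFKC normalization first.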