LLM Eval Benchmark v1.0.0

@changyufei222 released this 10 Jun 11:38
· 2 commits to main since this release

Fixed-set Direct-versus-RAG benchmark release with controlled model protocols, promoted result summaries, stability analysis, reproducibility guidance, bilingual navigation, and citation metadata.