hrsbench/HRS_Bench

HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models

This is the official implementation for our paper

"HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models"

📡 Website · 📖 Paper · 🎬 Video

📢 News

📚 Synopsis

  • Holistic skills evaluation. Rather than focusing on isolated metrics such as accuracy, we measure 13 skills, which can be grouped into five major categories: accuracy, robustness, generalization, fairness, and bias.

  • Broad scenario coverage. HRS-Bench covers 50 applications, e.g., fashion, animals, transportation, food, and clothes.

  • Standardization. We propose a unified benchmark in which existing models are evaluated fairly across a wide range of metrics.

📌 Covered Models

🔥 Qualitative results

📌 Prerequisites

📌 Data

👉 HRS-Bench

  • First, download our prompts, which cover the 13 skills, from here.
  • Each skill has its own CSV file that contains the prompts and the ground truth (GT) used during the evaluation phase.
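
As an illustration of the per-skill layout described above, the following minimal sketch loads every CSV file in a local directory and groups the rows by skill. It assumes only that each file has a header row; the function names, the use of the file stem as the skill name, and the directory name in the usage note are hypothetical and are not part of the HRS-Bench code.

```python
import csv
from pathlib import Path


def load_skill_prompts(csv_path):
    """Read one skill's CSV and return its rows as dicts keyed by the header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def load_all_skills(prompt_dir):
    """Map each CSV's file stem (assumed here to name the skill) to its rows."""
    return {
        path.stem: load_skill_prompts(path)
        for path in sorted(Path(prompt_dir).glob("*.csv"))
    }
```

For example, `load_all_skills("hrs_prompts")` would return one entry per CSV file found in a (hypothetical) `hrs_prompts` folder. Because the column names are read from each file's header, the sketch does not depend on the exact names used for the prompt and GT columns.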

📌 Evaluation

💐 Credits

The project is inspired by the great language benchmark HELM.

☎️ Contact us

📬 Citation

Please consider citing our paper if you find it useful.

