HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models

This is the official implementation for our ICCV-2023 paper

"HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models"

Eslam Abdelrahman, Pengzhan Sun, Xiaoqian Shen, Faizan Farooq Khan, Li Erran Li, and Mohamed Elhoseiny


📢 News

(July 13, 2023): The paper was accepted at ICCV-2023.
(April 11, 2023): The paper was published on arXiv.

📚 Synopsis

Holistic skills evaluation. Rather than focusing on an isolated metric such as accuracy, we measure 13 skills, which can be grouped into five major categories: accuracy, robustness, generalization, fairness, and bias.

Broad scenario coverage. HRS-Bench covers 50 applications, e.g., fashion, animals, transportation, food, and clothes.

Standardization. We propose a unified benchmark in which existing models are evaluated fairly across a wide range of metrics.

Holistic prompt generation.

📌 Covered Models

🔥 Qualitative results

📌 Prerequisites

Python >= 3.7
PyTorch >= 1.7.0
Install other common packages (numpy, pytorch_transformers, etc.)
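
As a convenience, the version requirements above can be checked before running anything else. The snippet below is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and it only assumes that PyTorch exposes its version string as `torch.__version__`.

```python
import sys


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into a tuple of ints.

    Local build tags after '+' are dropped, and parsing stops at the
    first component that is not purely numeric.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    have, need = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that '1.7' and '1.7.0' compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_environment():
    """Collect human-readable problems; an empty list means all is fine."""
    problems = []
    if sys.version_info < (3, 7):
        problems.append("Python >= 3.7 is required")
    try:
        import torch  # imported lazily so a missing install is reported
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, "1.7.0"):
            problems.append("PyTorch >= 1.7.0 is required")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the file directly prints one warning per unmet requirement and nothing when the environment satisfies both.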

📌 Data

👉 HRS-Bench:

First, download our prompts, which cover the 13 skills, from here.
Each skill has its own CSV file containing the prompts and the ground truth (GT) used during the evaluation phase.
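
To get a feel for a downloaded skill file, it can be opened with Python's standard `csv` module. This is a minimal sketch under assumptions: the column names are not listed here, so the code reads whatever header the file contains instead of assuming specific names, and `load_skill_csv` is a hypothetical helper rather than a function from this repository.

```python
import csv
from pathlib import Path


def load_skill_csv(csv_path):
    """Read one skill CSV and return (column_names, rows).

    Each row is a dict keyed by the header found in the file, so no
    particular column names are assumed.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    return columns, rows
```

For example, `columns, rows = load_skill_csv("path/to/skill.csv")` (a placeholder path) followed by `print(columns)` and `print(rows[0])` shows the header and the first prompt entry of that skill.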

👉 Prompt Generation:

You don't need to run the prompt generation code, as the generated prompts are already provided and can be downloaded from this link.

However, we also provide all of the generation code.

📌 Evaluation

Follow the detailed instructions in the README file to run our evaluation scripts for all skills.

💐 Credits

This project is inspired by the great language benchmark HELM.

☎️ Contact us

eslam.abdelrahman@kaust.edu.sa

📬 Citation

Please consider citing our paper if you find it useful.

@misc{2304.05390,
Author = {Eslam Mohamed Bakr and Pengzhan Sun and Xiaoqian Shen and Faizan Farooq Khan and Li Erran Li and Mohamed Elhoseiny},
Title = {HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models},
Year = {2023},
Eprint = {arXiv:2304.05390},
}
