The datasets can be found [here](../a3_iqa_databases/).
See [IQA_outputs/eval.ipynb](IQA_outputs/eval.ipynb) for our ablation experiments.


| **Model Name** | KoNIQ-10k | SPAQ | LIVE-FB | LIVE-itw | CGIQA-6K | AGIQA-3K | KADID-10K | Average |
| -| -| -| -| -| -| -| -| -|
| NIQE | 0.316/0.377 | 0.693/0.669 | 0.211/0.288 | 0.480/0.451 | 0.075/0.056 | 0.562/0.517 | 0.374/0.428 | 0.387/0.398 |
| CLIP-ViT-Large-14 | 0.468/0.505 | 0.385/0.389 | 0.218/0.237 | 0.307/0.308 | 0.285/0.290 | 0.436/0.458 | 0.376/0.388 | 0.354/0.368 |
| LLaVA-v1.5 (Vicuna-v1.5-7B) | 0.463/0.459 | 0.443/0.467 | 0.305/0.321 | 0.344/0.358 | **0.321/0.333** | 0.672/0.738 | 0.417/0.440 | 0.424/0.445 |
| LLaVA-v1.5 (Vicuna-v1.5-13B) | 0.448/0.460 | 0.563/0.584 | 0.310/0.339 | 0.445/0.481 | 0.285/0.297 | 0.664/0.754 | 0.390/0.400 | 0.444/0.474 |
| InternLM-XComposer (InternLM) | **0.568/0.616** | **0.731/0.751** | **0.358/0.413** | **0.619/0.678** | 0.246/0.268 | **0.734/0.777** | 0.540/0.563 | **0.542/0.581** |
| IDEFICS-Instruct (LLaMA-7B) | 0.375/0.400 | 0.474/0.484 | 0.235/0.240 | 0.409/0.428 | 0.244/0.227 | 0.562/0.622 | 0.370/0.373 | 0.381/0.396 |
| Qwen-VL (QwenLM) | 0.470/0.546 | 0.676/0.669 | 0.298/0.338 | 0.504/0.532 | 0.273/0.284 | 0.617/0.686 | 0.486/0.486 | 0.475/0.506 |
| Shikra (Vicuna-7B) | 0.314/0.307 | 0.320/0.337 | 0.237/0.241 | 0.322/0.336 | 0.198/0.201 | 0.640/0.661 | 0.324/0.332 | 0.336/0.345 |
| Otter-v1 (MPT-7B) | 0.406/0.406 | 0.436/0.441 | 0.143/0.142 | -0.008/0.018 | 0.254/0.264 | 0.475/0.481 | **0.557/0.577** | 0.323/0.333 |
| Kosmos-2 | 0.255/0.281 | 0.644/0.641 | 0.196/0.195 | 0.358/0.368 | 0.210/0.225 | 0.489/0.491 | 0.359/0.365 | 0.359/0.367 |
| InstructBLIP (Flan-T5-XL) | 0.334/0.362 | 0.582/0.599 | 0.248/0.267 | 0.113/0.113 | 0.167/0.188 | 0.378/0.400 | 0.211/0.179 | 0.290/0.301 |
| InstructBLIP (Vicuna-7B) | 0.359/0.437 | 0.683/0.689 | 0.200/0.283 | 0.253/0.367 | 0.263/0.304 | 0.629/0.663 | 0.337/0.382 | 0.389/0.446 |
| VisualGLM-6B (GLM-6B) | 0.247/0.234 | 0.498/0.507 | 0.146/0.154 | 0.110/0.116 | 0.209/0.183 | 0.342/0.349 | 0.127/0.131 | 0.240/0.239 |
| mPLUG-Owl (LLaMA-7B) | 0.409/0.427 | 0.634/0.644 | 0.241/0.271 | 0.437/0.487 | 0.148/0.180 | 0.687/0.711 | 0.466/0.486 | 0.432/0.458 |
| LLaMA-Adapter-V2 | 0.354/0.363 | 0.464/0.506 | 0.275/0.329 | 0.298/0.360 | 0.257/0.271 | 0.604/0.666 | 0.412/0.425 | 0.381/0.417 |
| LLaVA-v1 (Vicuna-13B) | 0.462/0.457 | 0.442/0.462 | 0.264/0.280 | 0.404/0.417 | 0.208/0.237 | 0.626/0.684 | 0.349/0.372 | 0.394/0.416 |
| MiniGPT-4 (Vicuna-13B) | 0.239/0.257 | 0.238/0.253 | 0.170/0.183 | 0.339/0.340 | 0.252/0.246 | 0.572/0.591 | 0.239/0.233 | 0.293/0.300 |

Overall, `internlm_xcomposer_vl` has the best IQA performance among the models (as of 30th Oct), ranking first on five of the seven datasets as well as on the overall average. `qwen-vl` and `llava-v1.5` are strong runners-up.
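
The Average column is the unweighted mean of the seven per-dataset scores, which can be checked directly against the rows above. A quick sanity check in Python, using the InternLM-XComposer row:

```python
# Reproduce the Average cell as the plain mean of the seven per-dataset
# score pairs; values are the InternLM-XComposer row from the table above.
srcc = [0.568, 0.731, 0.358, 0.619, 0.246, 0.734, 0.540]
plcc = [0.616, 0.751, 0.413, 0.678, 0.268, 0.777, 0.563]
print(f"{sum(srcc) / len(srcc):.3f}/{sum(plcc) / len(plcc):.3f}")  # -> 0.542/0.581
```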

We release the results of these models (as well as the post-evaluation code) in [IQA_results](iqa_results/) for reference.
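
Assuming the paired numbers in each cell follow the common IQA convention of SRCC/PLCC (Spearman rank and Pearson linear correlation against human mean opinion scores), the post-evaluation step boils down to two correlation calls. A minimal sketch with toy stand-in data, not the repository's actual loading code:

```python
# Minimal sketch of standard IQA post-evaluation: correlate predicted
# quality scores with ground-truth MOS. Toy data stands in for real
# model outputs; variable names are illustrative.
import numpy as np
from scipy.stats import pearsonr, spearmanr

def srcc_plcc(pred, mos):
    """Return (SRCC, PLCC) between predictions and MOS labels."""
    srcc, _ = spearmanr(pred, mos)  # tuple unpacking works across scipy versions
    plcc, _ = pearsonr(pred, mos)
    return srcc, plcc

rng = np.random.default_rng(0)
mos = rng.uniform(1.0, 5.0, size=200)         # stand-in human MOS labels
pred = mos + rng.normal(scale=0.6, size=200)  # stand-in model predictions
print("SRCC/PLCC: %.3f/%.3f" % srcc_plcc(pred, mos))
```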
