Update correct name of InternLM-XComposer-VL
teowu committed Oct 30, 2023
1 parent e485294 commit 2e69690
Showing 1 changed file with 5 additions and 5 deletions.
leaderboards/README.md: 10 changes (5 additions, 5 deletions)
@@ -18,7 +18,7 @@ _version_: v1.0.2.1030wip; _Timeliness_: Updated on 30th Oct.

| Rank | [A1: Perception](#leaderboards-for-a1-perception) (dev set) | [A1: Perception](#leaderboards-for-a1-perception) (test set) | [A2: Description](#leaderboards-for-a2-description) | [A3: Assessment](#leaderboards-for-a3-assessment) |
|:----:|:-------------------------------------------------------------------:|:-------------------------------------------------------------------:|:-------------------------------------------------------------:|:----------------------------------------------------------:|
-| 🥇 | InternLM-XComposer (0.6535) | InternLM-XComposer (0.6435) | InternLM-XComposer (4.21/6) | InternLM-XComposer (0.542,0.581) |
+| 🥇 | InternLM-XComposer-VL (0.6535) | InternLM-XComposer-VL (0.6435) | InternLM-XComposer-VL (4.21/6) | InternLM-XComposer-VL (0.542,0.581) |
| 🥈 | LLaVA-v1.5-13B (0.6214) | InstructBLIP-T5-XL (0.6194) | Kosmos-2 (4.03/6) | Qwen-VL (0.475,0.506) |
| 🥉 | InstructBLIP-T5-XL (0.6147) | Qwen-VL (0.6167) | mPLUG-Owl (3.94/6) | LLaVA-v1.5-13B (0.444,0.473) |

@@ -41,7 +41,7 @@ About the partition of `dev` and `test` subsets, please see [our dataset release
| random guess | 0.5000 | 0.2786 | 0.3331 | 0.3789 | 0.3848 | 0.3828 | 0.3582 | 0.3780 |
| LLaVA-v1.5 (Vicuna-v1.5-7B) | 0.6636 | 0.5819 | 0.5051 | 0.4942 | 0.6574 | 0.5461 | 0.7061 | 0.5866 |
| LLaVA-v1.5 (Vicuna-v1.5-13B) | 0.6527 | 0.6438 | 0.5659 | 0.5603 | 0.6713 | 0.6118 | 0.6735 | 0.6214 |
-| InternLM-XComposer (InternLM) | 0.6945 | 0.6527 | 0.6085 | 0.6167 | 0.7014 | 0.5691 | 0.7510 | 0.6535 |
+| InternLM-XComposer-VL (InternLM) | 0.6945 | 0.6527 | 0.6085 | 0.6167 | 0.7014 | 0.5691 | 0.7510 | 0.6535 |
| IDEFICS-Instruct (LLaMA-7B) | 0.5618 | 0.4469 | 0.4402 | 0.4280 | 0.5417 | 0.4474 | 0.5633 | 0.4870 |
| Qwen-VL (QwenLM) | 0.6309 | 0.5819 | 0.5639 | 0.5058 | 0.6273 | 0.5789 | 0.7388 | 0.5940 |
| Shikra (Vicuna-7B) | 0.6564 | 0.4735 | 0.4909 | 0.4883 | 0.5949 | 0.5000 | 0.6408 | 0.5465 |
@@ -76,7 +76,7 @@ Results of Open-source models:
| random guess | 0.5000 | 0.2848 | 0.3330 | 0.3724 | 0.3850 | 0.3913 | 0.3710 | 0.3794 |
| LLaVA-v1.5 (Vicuna-v1.5-7B) | 0.6460 | 0.5922 | 0.5576 | 0.4798 | 0.6730 | 0.5890 | 0.7376 | 0.6007 |
| LLaVA-v1.5 (Vicuna-v1.5-13B) | 0.6496 | 0.6486 | 0.5412 | 0.5355 | 0.6659 | 0.5890 | 0.7148 | 0.6140 |
-| InternLM-XComposer (InternLM) | 0.6843 | 0.6204 | 0.6193 | 0.5681 | 0.7041 | 0.5753 | 0.7719 | 0.6435 |
+| InternLM-XComposer-VL (InternLM) | 0.6843 | 0.6204 | 0.6193 | 0.5681 | 0.7041 | 0.5753 | 0.7719 | 0.6435 |
| IDEFICS-Instruct (LLaMA-7B) | 0.6004 | 0.4642 | 0.4671 | 0.4038 | 0.5990 | 0.4726 | 0.6477 | 0.5151 |
| Qwen-VL (QwenLM) | 0.6533 | 0.6074 | 0.5844 | 0.5413 | 0.6635 | 0.5822 | 0.7300 | 0.6167 |
| Shikra (Vicuna-7B) | 0.6909 | 0.4793 | 0.4671 | 0.4731 | 0.6086 | 0.5308 | 0.6477 | 0.5532 |
@@ -138,7 +138,7 @@ Abbreviations for dimensions: *comp: completeness, prec: precision, rele: relevance
| **Model Name** | p_{0, comp} | p_{1, comp} | p_{2, comp} | s_{compl} | p_{0, prec} | p_{1, prec} | p_{2, prec} | s_{prec} | p_{0, rele} | p_{1, rele} | p_{2, rele} | s_{rele} | s_{sum} |
| - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| LLaVA-v1.5 (Vicuna-v1.5-13B) | 27.68% | 53.78% | 18.55% | 0.91/2.00 | 25.45% | 21.47% | 53.08% | 1.28/2.00 | 6.31% | 58.75% | 34.94% | 1.29/2.00 | 3.47/6.00 |
-| InternLM-XComposer (InternLM) | 19.94% | 51.82% | 28.24% | 1.08/2.00 | 22.59% | 28.99% | 48.42% | 1.26/2.00 | 1.05% | 10.62% | 88.32% | 1.87/2.00 | 4.21/6.00 |
+| InternLM-XComposer-VL (InternLM) | 19.94% | 51.82% | 28.24% | 1.08/2.00 | 22.59% | 28.99% | 48.42% | 1.26/2.00 | 1.05% | 10.62% | 88.32% | 1.87/2.00 | 4.21/6.00 |
| IDEFICS-Instruct (LLaMA-7B) | 28.91% | 59.16% | 11.93% | 0.83/2.00 | 34.68% | 27.86% | 37.46% | 1.03/2.00 | 3.90% | 59.66% | 36.44% | 1.33/2.00 | 3.18/6.00 |
| Qwen-VL (QwenLM) | 26.34% | 49.13% | 24.53% | 0.98/2.00 | 50.62% | 23.44% | 25.94% | 0.75/2.00 | 0.73% | 35.56% | 63.72% | 1.63/2.00 | 3.36/6.00 |
| Shikra (Vicuna-7B) | 21.14% | 68.33% | 10.52% | 0.89/2.00 | 30.33% | 28.30% | 41.37% | 1.11/2.00 | 1.14% | 64.36% | 34.50% | 1.33/2.00 | 3.34/6.00 |
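
The per-dimension scores in the table above can be reproduced from the three frequency columns: reading p_{0}, p_{1}, p_{2} as the fractions of descriptions rated 0, 1, and 2 on a dimension, each dimension score is the expected rating p_1 + 2*p_2, and s_{sum} adds the three dimension scores; every row shown here is consistent with this weighting. A minimal Python sketch using the LLaVA-v1.5-13B row (the helper name is illustrative, not from the repository):

```python
def dimension_score(p0: float, p1: float, p2: float) -> float:
    """Expected rating on a 0-2 scale, given the proportions of ratings 0, 1, 2."""
    return 0 * p0 + 1 * p1 + 2 * p2

# Completeness / precision / relevance frequencies for LLaVA-v1.5-13B (from the table above).
comp = dimension_score(0.2768, 0.5378, 0.1855)  # ~0.91 / 2.00
prec = dimension_score(0.2545, 0.2147, 0.5308)  # ~1.28 / 2.00
rele = dimension_score(0.0631, 0.5875, 0.3494)  # ~1.29 / 2.00

print(round(comp, 2), round(prec, 2), round(rele, 2), round(comp + prec + rele, 2))  # ~3.47 / 6.00
```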
@@ -166,7 +166,7 @@ See [IQA_outputs/eval.ipynb](IQA_outputs/eval.ipynb) for our ablation experiment
| CLIP-ViT-Large-14 | 0.468/0.505 | 0.385/0.389 | 0.218/0.237 | 0.307/0.308 | 0.285/0.290 | 0.436/0.458 | 0.376/0.388 |0.354/0.368|
| LLaVA-v1.5 (Vicuna-v1.5-7B) | 0.463/0.459 | 0.443/0.467 | 0.305/0.321 | 0.344/0.358 | **0.321/0.333** | 0.672/0.738 | 0.417/0.440 |0.424/0.445|
| LLaVA-v1.5 (Vicuna-v1.5-13B) | 0.448/0.460 | 0.563/0.584 | 0.310/0.339 | 0.445/0.481 | 0.285/0.297 | 0.664/0.754 | 0.390/0.400 |0.444/0.474|
-| InternLM-XComposer (InternLM) | **0.568/0.616** | **0.731/0.751** | **0.358/0.413** | **0.619/0.678** | 0.246/0.268 | **0.734/0.777** | 0.540/0.563 |**0.542/0.581**|
+| InternLM-XComposer-VL (InternLM) | **0.568/0.616** | **0.731/0.751** | **0.358/0.413** | **0.619/0.678** | 0.246/0.268 | **0.734/0.777** | 0.540/0.563 |**0.542/0.581**|
| IDEFICS-Instruct (LLaMA-7B) | 0.375/0.400 | 0.474/0.484 | 0.235/0.24 | 0.409/0.428 | 0.244/0.227 | 0.562/0.622 | 0.370/0.373 |0.381/0.396|
| Qwen-VL (QwenLM) | 0.470/0.546 | 0.676/0.669 | 0.298/0.338 | 0.504/0.532 | 0.273/0.284 | 0.617/0.686 | 0.486/0.486 |0.475/0.506|
| Shikra (Vicuna-7B) | 0.314/0.307 | 0.32/0.337 | 0.237/0.241 | 0.322/0.336 | 0.198/0.201 | 0.640/0.661 | 0.324/0.332 |0.336/0.345|
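
The paired x/y scores in this table are presumably SRCC/PLCC, the standard IQA correlations between predicted quality and human mean opinion scores (the linked eval.ipynb performs the evaluation). A minimal SciPy sketch of the two correlations, where `pred` (model-predicted quality) and `mos` (human scores) are illustrative dummy arrays, not benchmark data:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Illustrative data: predicted quality scores vs. mean opinion scores (MOS).
pred = np.array([0.62, 0.15, 0.88, 0.40, 0.71, 0.05, 0.93, 0.34])
mos = np.array([3.1, 1.8, 4.2, 2.6, 3.5, 1.2, 4.6, 2.4])

srcc, _ = spearmanr(pred, mos)  # rank correlation (monotonicity)
plcc, _ = pearsonr(pred, mos)   # linear correlation

print(f"SRCC={srcc:.3f} / PLCC={plcc:.3f}")
```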
