diff --git a/README.md b/README.md index 9551bd9..0c87305 100644 --- a/README.md +++ b/README.md @@ -346,6 +346,7 @@ The first technical report for Gemini vs GPT-4V. A total of 128 pages< | ![Stars](https://img.shields.io/github/stars/BradyFU/Awesome-Multimodal-Large-Language-Models?style=social&label=Star)
[**A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise**](https://arxiv.org/pdf/2312.12436.pdf)
| arXiv | 2023-12-19 | [Github](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models) | | ![Stars](https://img.shields.io/github/stars/AIFEG/BenchLMM?style=social&label=Star)
[**BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models**](https://arxiv.org/pdf/2312.02896.pdf)
| arXiv | 2023-12-05 | [Github](https://github.com/AIFEG/BenchLMM) | | ![Star](https://img.shields.io/github/stars/UCSC-VLAA/vllm-safety-benchmark.svg?style=social&label=Star)
[**How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs**](https://arxiv.org/pdf/2311.16101.pdf)
| arXiv | 2023-11-27 | [Github](https://github.com/UCSC-VLAA/vllm-safety-benchmark) | +| ![Star](https://img.shields.io/github/stars/jonathan-roberts1/charting-new-territories.svg?style=social&label=Star)
[**Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs**](https://arxiv.org/pdf/2311.14656.pdf)
| arXiv | 2023-11-24 | [Github](https://github.com/jonathan-roberts1/charting-new-territories) | | ![Star](https://img.shields.io/github/stars/FreedomIntelligence/MLLM-Bench?style=social&label=Star)
[**MLLM-Bench, Evaluating Multi-modal LLMs using GPT-4V**](https://arxiv.org/pdf/2311.13951)
| arXiv | 2023-11-23 | [Github](https://github.com/FreedomIntelligence/MLLM-Bench) | | [**VLM-Eval: A General Evaluation on Video Large Language Models**](https://arxiv.org/pdf/2311.11865.pdf) | arXiv | 2023-11-20 | [Coming soon]() | | ![Star](https://img.shields.io/github/stars/gzcch/Bingo.svg?style=social&label=Star)
[**Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges**](https://arxiv.org/pdf/2311.03287.pdf)
| arXiv | 2023-11-06 | [Github](https://github.com/gzcch/Bingo) | @@ -531,6 +532,7 @@ The first technical report for Gemini vs GPT-4V. A total of 128 pages< | **ViP-Bench** | [Making Large Multimodal Models Understand Arbitrary Visual Prompts](https://arxiv.org/pdf/2312.00784.pdf) | [Link](https://huggingface.co/datasets/mucai/ViP-Bench) | A benchmark for visual prompts | | **M3DBench** | [M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts](https://arxiv.org/pdf/2312.10763.pdf) | [Link](https://github.com/OpenM3D/M3DBench) | A 3D-centric benchmark | | **Video-Bench** | [Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models](https://arxiv.org/pdf/2311.16103.pdf) | [Link](https://github.com/PKU-YuanGroup/Video-Bench) | A benchmark for video-MLLM evaluation | +| **Charting-New-Territories** | [Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs](https://arxiv.org/pdf/2311.14656.pdf) | [Link](https://github.com/jonathan-roberts1/charting-new-territories) | A benchmark for evaluating geographic and geospatial capabilities | | **MLLM-Bench** | [MLLM-Bench, Evaluating Multi-modal LLMs using GPT-4V](https://arxiv.org/pdf/2311.13951.pdf) | [Link](https://github.com/FreedomIntelligence/MLLM-Bench) | GPT-4V evaluation with per-sample criteria | | **BenchLMM** | [BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models](https://arxiv.org/pdf/2312.02896.pdf) | [Link](https://huggingface.co/datasets/AIFEG/BenchLMM) | A benchmark for assessment of the robustness against different image styles | | **MMC-Benchmark** | [MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning](https://arxiv.org/pdf/2311.10774.pdf) | [Link](https://github.com/FuxiaoLiu/MMC) | A comprehensive human-annotated benchmark with distinct tasks evaluating reasoning capabilities over charts |