Existing information retrieval (IR) models often assume a homogeneous format, limiting their applicability to diverse user needs, such as searching for images with text descriptions, searching for a news article with a headline image, or finding a similar photo with a query image. To approach such different information-seeking demands, we introduce UniIR, a unified instruction-guided multimodal retriever capable of handling eight distinct retrieval tasks across modalities. UniIR, a single retrieval system jointly trained on ten diverse multimodal-IR datasets, interprets user instructions to execute various retrieval tasks, demonstrating robust performance across existing datasets and zero-shot generalization to new tasks. Our experiments highlight that multi-task training and instruction tuning are keys to UniIR's generalization ability. Additionally, we construct the M-BEIR, a multimodal retrieval benchmark with comprehensive results, to standardize the evaluation of universal multimodal information retrieval.
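The core idea of instruction-guided retrieval — steering one retriever toward different tasks by conditioning on a task instruction — can be sketched as follows. This is a toy illustration, not the UniIR implementation: the `embed` function is a stand-in bag-of-words encoder (the paper uses learned multimodal encoders), and all names and example strings are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "encoder" standing in for a real multimodal encoder;
    # returns an L2-normalized sparse vector of token weights.
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {w: v / norm for w, v in counts.items()}

def score(q, d):
    # Cosine similarity between two normalized sparse vectors.
    return sum(v * d.get(w, 0.0) for w, v in q.items())

def retrieve(instruction, query, candidates):
    # The instruction is prepended to the query so a single retriever
    # can be steered toward different tasks (text->image, image->text, ...).
    q = embed(instruction + " " + query)
    return sorted(candidates, key=lambda c: score(q, embed(c)), reverse=True)

results = retrieve(
    "Find a news image matching the headline:",   # hypothetical instruction
    "storm hits the coast",
    ["photo of a storm surge hitting the coast", "recipe for soup"],
)
print(results[0])
```

In the actual system the instruction conditioning happens inside learned encoders trained jointly on the ten M-BEIR datasets; this sketch only shows the interface: (instruction, query) in, ranked candidates out.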