Code and models for our information retrieval research papers.
Knowledge Computing and Service Group, Institute of Information Engineering, Chinese Academy of Sciences.
tDRO: Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval. tDRO is an algorithm for fine-tuning Large Language Model-based Dense Retrieval (LLM-DR) models. It aims to improve universal domain generalization by reweighting the data distribution of each task end to end.
bowdpr: Drop your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval. Bag-of-Word Prediction is a new encoder-only pre-training scheme for dense retrieval, designed for efficiency and interpretability. (Accepted by SIGIR 2024)
CoT-MAE-qc: Query-as-context Pre-training for Dense Passage Retrieval. A simple yet effective pre-training scheme for single-vector Dense Passage Retrieval. (Accepted by EMNLP 2023 Main Conference)
CoT-MAE: ConTextual Mask Auto-Encoder for Dense Passage Retrieval. CoT-MAE is a Transformer-based Mask Auto-Encoder pre-training architecture designed for Dense Passage Retrieval. (Accepted by AAAI 2022)
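
All of the projects above target single-vector dense retrieval, where a query and each passage are encoded into one embedding and ranked by similarity. The following is a minimal sketch of that scoring step only. It is not code from this repository: the hand-written vectors stand in for encoder outputs, and the helper names (`l2_normalize`, `rank_passages`) are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, cosine_similarity) pairs, best match first."""
    q = l2_normalize(query_vec)
    scores = [(idx, dot(q, l2_normalize(p))) for idx, p in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumption: a real system would obtain these from a trained encoder).
query_vec = [1.0, 0.0, 1.0]
passage_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.0, 1.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partial overlap with the query
]

ranking = rank_passages(query_vec, passage_vecs)
for idx, score in ranking:
    print(f"passage {idx}: {score:.3f}")
```

Cosine similarity is used here for illustration; dot-product scoring without normalization is an equally common choice, and each project's own code determines which one applies.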