33 changes: 21 additions & 12 deletions examples/README.md
# Examples

- [1. Introduction](#1-Introduction)
- [2. Installation](#2-Installation)
- [3. Inference](#3-Inference)
- [4. Finetune](#4-Finetune)
- [5. Evaluation](#5-Evaluation)

## 1. Introduction

In this example, we show how to perform **inference**, **finetuning**, and **evaluation** with the baai-general-embedding models.

## 2. Installation

* **with pip**

```shell
pip install -U FlagEmbedding
```

* **from source**

```shell
git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding
pip install .
```

For development, install as editable:

```shell
pip install -e .
```

## 3. Inference

We have provided the inference code for two types of models: the **embedder** and the **reranker**. These can be loaded using `FlagAutoModel` and `FlagAutoReranker`, respectively. For more detailed instructions on their use, please refer to the documentation for the [embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/inference/embedder) and [reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/inference/reranker).

### 1. Embedder

```python
from FlagEmbedding import FlagAutoModel
# ... (model loading and encoding of queries/passages elided in this view)
scores = q_embeddings @ p_embeddings.T
print(scores)
```
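For reference, here is a complete minimal sketch of the flow elided above; the model name, instruction string, and example texts are illustrative, and `encode_queries`/`encode_corpus` follow the embedder documentation linked above:

```python
from FlagEmbedding import FlagAutoModel

# Load an embedder; bge-large-en-v1.5 uses an instruction prefix for queries.
model = FlagAutoModel.from_finetuned(
    "BAAI/bge-large-en-v1.5",
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
    use_fp16=True,
)

queries = ["What is BGE?", "How can I fine-tune an embedder?"]
passages = [
    "BGE is a family of general embedding models released by BAAI.",
    "FlagEmbedding supports fine-tuning embedders on custom data.",
]

# Encode both sides, then score every query against every passage.
q_embeddings = model.encode_queries(queries)
p_embeddings = model.encode_corpus(passages)
scores = q_embeddings @ p_embeddings.T
print(scores)
```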

### 2. Reranker

```python
from FlagEmbedding import FlagAutoReranker
# ... (model loading and pair construction elided in this view)
scores = model.compute_score(pairs)
print(scores)
```
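Likewise, a complete minimal sketch of the reranker flow; the model name and pairs are illustrative, and `compute_score` takes query–passage pairs as in the reranker documentation linked above:

```python
from FlagEmbedding import FlagAutoReranker

# Load a reranker; it scores (query, passage) pairs directly.
model = FlagAutoReranker.from_finetuned("BAAI/bge-reranker-large", use_fp16=True)

pairs = [
    ["What is BGE?", "BGE is a family of general embedding models released by BAAI."],
    ["What is BGE?", "FAISS is a library for efficient similarity search."],
]

# Higher scores indicate more relevant passages for the query.
scores = model.compute_score(pairs)
print(scores)
```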

## 4. Finetune

We support fine-tuning a variety of BGE series models, including `bge-large-en-v1.5`, `bge-m3`, `bge-en-icl`, `bge-multilingual-gemma2`, `bge-reranker-v2-m3`, `bge-reranker-v2-gemma`, and `bge-reranker-v2-minicpm-layerwise`, among others. As examples, we use the basic models `bge-large-en-v1.5` and `bge-reranker-large`. For more details, please refer to the [embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune/embedder) and [reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune/reranker) sections.

```shell
pip install deepspeed
pip install flash-attn --no-build-isolation
```
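Fine-tuning consumes JSONL training data in which each line pairs a query with its positive and hard-negative passages. Below is a sketch of writing one record; the field names follow the finetune data-format documentation linked above, and the commented teacher-score fields are only needed for distillation (e.g., `--kd_loss_type kl_div`):

```python
import json

# One training record: a query, positive passages, and hard-negative passages.
record = {
    "query": "What is BGE?",
    "pos": ["BGE is a family of general embedding models released by BAAI."],
    "neg": ["FAISS is a library for efficient similarity search."],
    # Optional teacher scores for knowledge distillation:
    # "pos_scores": [0.95], "neg_scores": [0.10],
}

with open("train_data.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")
```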

### 1. Embedder

```shell
torchrun --nproc_per_node 2 \
# ... (model, data, and training arguments elided in this view)
--kd_loss_type kl_div
```
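Once training finishes, the checkpoint saved to the training output directory can be loaded the same way as a released model. A minimal sketch, where the output path is a hypothetical placeholder:

```python
from FlagEmbedding import FlagAutoModel

# Load the fine-tuned checkpoint from its output directory (placeholder path).
model = FlagAutoModel.from_finetuned("./output/my_finetuned_bge-large-en-v1.5", use_fp16=True)
embeddings = model.encode(["a quick test sentence"])
print(embeddings.shape)
```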

### 2. Reranker

```shell
torchrun --nproc_per_node 2 \
# ... (model, data, and training arguments elided in this view)
--save_steps 1000
```

## 5. Evaluation

We support evaluations on [MTEB](https://github.com/embeddings-benchmark/mteb), [BEIR](https://github.com/beir-cellar/beir), [MSMARCO](https://microsoft.github.io/msmarco/), [MIRACL](https://github.com/project-miracl/miracl), [MLDR](https://huggingface.co/datasets/Shitao/MLDR), [MKQA](https://github.com/apple/ml-mkqa), [AIR-Bench](https://github.com/AIR-Bench/AIR-Bench), and custom datasets. Below is an example of evaluating MSMARCO passages. For more details, please refer to the [evaluation examples](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/evaluation).

```shell
pip install pytrec_eval
pip install https://github.com/kyamagu/faiss-wheels/releases/download/v1.7.3/faiss_gpu-1.7.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
```

```shell
python -m FlagEmbedding.evaluation.msmarco \
--eval_name msmarco \
--dataset_dir ./data/msmarco \
# ... (remaining arguments elided in this view)
```
53 changes: 29 additions & 24 deletions examples/evaluation/README.md
This document serves as an overview of the evaluation process and provides a brief introduction to the arguments involved.

In this section, we will first introduce the commonly used arguments across all datasets. Then, we will provide a more detailed explanation of the specific arguments used for each individual dataset.

- [1. Introduction](#1-Introduction)
- [(1) EvalArgs](#1-EvalArgs)
- [(2) ModelArgs](#2-ModelArgs)
- [2. Usage](#2-Usage)
- [Requirements](#Requirements)
- [(1) MTEB](#1-MTEB)
- [(2) BEIR](#2-BEIR)
- [(3) MSMARCO](#3-MSMARCO)
- [(4) MIRACL](#4-MIRACL)
- [(5) MLDR](#5-MLDR)
- [(6) MKQA](#6-MKQA)
- [(7) AIR-Bench](#7-Air-Bench)
- [(8) Custom Dataset](#8-Custom-Dataset)

## 1. Introduction

### 1. EvalArgs

**Arguments for evaluation setup:**

- **`eval_name`**: Name of the evaluation task (e.g., msmarco, beir, miracl).

- **`dataset_dir`**: Path to the dataset directory. This can be:
  1. A local path to perform evaluation on your dataset (must exist). It should contain:
     - `corpus.jsonl`
     - `<split>_queries.jsonl`
     - `<split>_qrels.jsonl`
  2. Path to store datasets downloaded via API. Provide `None` to use the cache directory.

- **`force_redownload`**: Set to `True` to force redownload of the dataset. Default is `False`.

- **`dataset_names`**: List of dataset names to evaluate or `None` to evaluate all available datasets. This can be the dataset name (BEIR, etc.) or language (MIRACL, etc.).
Here is an example for evaluation:

```shell
pip install mteb==1.15.0
python -m FlagEmbedding.evaluation.mteb \
--eval_name mteb \
--output_dir ./data/mteb/search_results \
--languages eng \
--tasks NFCorpus BiorxivClusteringS2S SciDocsRR \
# ... (remaining arguments elided in this view)
```

Here is an example for evaluation:

```shell
pip install beir
mkdir eval_beir
cd eval_beir
python -m FlagEmbedding.evaluation.beir \
--eval_name beir \
--dataset_dir ./beir/data \
--dataset_names fiqa arguana cqadupstack \
--splits test dev \
# ... (remaining arguments elided in this view)
```

Here is an example for evaluation:

```shell
python -m FlagEmbedding.evaluation.msmarco \
--eval_name msmarco \
--dataset_dir ./msmarco/data \
--dataset_names passage \
--splits dev dl19 dl20 \
# ... (remaining arguments elided in this view)
```

Here is an example for evaluation:

```shell
python -m FlagEmbedding.evaluation.miracl \
--eval_name miracl \
--dataset_dir ./miracl/data \
--dataset_names bn hi sw te th yo \
--splits dev \
# ... (remaining arguments elided in this view)
```

Here is an example for evaluation:

```shell
python -m FlagEmbedding.evaluation.mldr \
--eval_name mldr \
--dataset_dir ./mldr/data \
--dataset_names hi \
--splits test \
# ... (remaining arguments elided in this view)
```

Here is an example for evaluation:

```shell
python -m FlagEmbedding.evaluation.mkqa \
--eval_name mkqa \
--dataset_dir ./mkqa/data \
--dataset_names en zh_cn \
--splits test \
# ... (remaining arguments elided in this view)
```

Here is an example for evaluation:

```shell
pip install air-benchmark
python -m FlagEmbedding.evaluation.air_bench \
--benchmark_version AIR-Bench_24.05 \
--task_types qa long-doc \
--domains arxiv \
--languages en \
# ... (remaining arguments elided in this view)
```

Please put these files (`corpus.jsonl`, `test_queries.jsonl`, `test_qrels.jsonl`) in your `dataset_dir`.

```shell
python -m FlagEmbedding.evaluation.custom \
--eval_name your_data_name \
--dataset_dir ./your_data_path \
--splits test \
--corpus_embd_save_dir ./your_data_name/corpus_embd \
# ... (remaining arguments elided in this view)
```
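To prepare such a dataset, here is a sketch that writes the three files into the `dataset_dir` layout described in EvalArgs above. The field names inside each file (`id`, `title`, `text`, `qid`, `docid`, `relevance`) are assumptions; verify them against the custom-dataset documentation for your FlagEmbedding version:

```python
import json
import os

# Tiny illustrative dataset in the expected three-file layout.
corpus = [{"id": "doc-0", "title": "BGE", "text": "BGE is a family of general embedding models."}]
queries = [{"id": "q-0", "text": "What is BGE?"}]
qrels = [{"qid": "q-0", "docid": "doc-0", "relevance": 1}]

os.makedirs("./your_data_path", exist_ok=True)
for name, rows in [("corpus.jsonl", corpus),
                   ("test_queries.jsonl", queries),
                   ("test_qrels.jsonl", qrels)]:
    with open(os.path.join("./your_data_path", name), "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
```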
10 changes: 10 additions & 0 deletions examples/finetune/embedder/README.md

In this example, we show how to finetune the embedder with your data.

- [1. Installation](#1-Installation)
- [2. Data format](#2-Data-format)
- [Hard Negatives](#Hard-Negatives)
- [Teacher Scores](#Teacher-Scores)
- [3. Train](#3-Train)
- [(1) standard model](#1-standard-model)
- [(2) bge-m3](#2-bge-m3)
- [(3) bge-multilingual-gemma2](#3-bge-multilingual-gemma2)
- [(4) bge-en-icl](#4-bge-en-icl)

## 1. Installation

- **with pip**
9 changes: 9 additions & 0 deletions examples/finetune/reranker/README.md

In this example, we show how to finetune the reranker with your data.

- [1. Installation](#1-Installation)
- [2. Data format](#2-Data-format)
- [Hard Negatives](#Hard-Negatives)
- [Teacher Scores](#Teacher-Scores)
- [3. Train](#3-Train)
- [(1) standard model](#1-standard-model)
- [(2) bge-reranker-v2-gemma](#2-bge-reranker-v2-gemma)
- [(3) bge-reranker-v2-layerwise-minicpm](#3-bge-reranker-v2-layerwise-minicpm)

## 1. Installation

- **with pip**