# Chapter 2: Build a Minimal RAG System in 10 Minutes

> Chapter 1 introduced the theory behind large models and RAG. Now it's time to put that theory into practice and use LazyLLM to build the most basic RAG pipeline.
>
> This tutorial walks you through the steps required to assemble a minimal RAG system with LazyLLM. We'll start with the environment setup, revisit the RAG workflow, explain the document loader, retriever, and generator components, and finally wire them together to observe the output.
>
> Here we go! 🏃🏃🏃

## Environment Preparation

>**Before building a RAG system with LazyLLM, complete both steps below: “Development Environment Setup” and “Configure API keys.”**

### 1. Development Environment Setup

You can pick any of the following approaches to prepare the LazyLLM dev environment.

#### Manual configuration ✏️

LazyLLM is implemented in Python, so make sure `Python`, `Pip`, and `Git` are available. Installing them on macOS takes a few extra steps; see the appendix for detailed instructions.

Create a virtual environment named `lazyllm-venv` and activate it:

```bash
python -m venv lazyllm-venv
source lazyllm-venv/bin/activate
```

If everything works you should see `(lazyllm-venv)` at the beginning of your prompt. Keep all subsequent commands inside this virtual environment.

Download the `LazyLLM` code from GitHub:

```bash
git clone https://github.com/LazyAGI/LazyLLM.git
```

Then switch into the cloned repository:

```bash
cd LazyLLM
```

Install the base dependencies:

```bash
pip3 install -r requirements.txt
```

Add `LazyLLM` to Python's module search path:

```bash
export PYTHONPATH=$PWD:$PYTHONPATH
```

Now you can import it from any directory.
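
To confirm the setup worked, you can check whether the interpreter is able to locate the package. The sketch below uses only the standard library; the helper name `is_importable` is ours for illustration and is not part of LazyLLM.

```python
import importlib.util


def is_importable(package: str) -> bool:
    """Return True if `package` can be found on the current module search path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Expected to report True once LazyLLM is on PYTHONPATH or installed via pip.
    print("lazyllm importable:", is_importable("lazyllm"))
```

Because `find_spec` only locates the package without importing it, the check stays fast and does not trigger any LazyLLM start-up code.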

#### Pull the image 📑

We also publish a Docker image with the latest LazyLLM release so you can get started immediately:

```bash
docker pull lazyllm/lazyllm
```

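You can also browse [https://hub.docker.com/r/lazyllm/lazyllm/tags](https://hub.docker.com/r/lazyllm/lazyllm/tags) and pull the tag you need.

#### Install with pip

You can also install LazyLLM directly from PyPI. Install only the base package: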

```bash
pip3 install lazyllm
```

Install the `standard` bundle, the minimal set of dependencies that unlocks every LazyLLM feature. Beyond fine-tuning and inference with hosted models, it supports offline fine-tuning (powered by `LLaMA-Factory`) and offline inference (`vLLM` for LLMs, `LMDeploy` for multimodal models, and `Infinity` for embedding models).

```bash
pip3 install lazyllm
lazyllm install standard
```

Install the full dependency bundle to enable advanced features such as automatic framework selection (`AutoFinetune`, `AutoDeploy`, etc.), more offline inference engines (e.g., `LightLLM`), and additional offline training pipelines (e.g., `AlpacaloraFinetune`, `CollieFinetune`, etc.).

```bash
pip3 install lazyllm
lazyllm install full
```

### 2. Configure API keys

You can call LLMs online or locally. Online calls require the provider's API key, so register an account with that platform if you do not already have one. LazyLLM automatically reads platform API keys from environment variables: set the key once, then simply specify the platform and model name when you invoke the module.

LazyLLM currently supports the following providers:

| **Platform** | **How to get the API key** | **Environment variables** |
| ----------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| [SenseNova](https://platform.sensenova.cn/) | [Get access keys (ak and sk)](https://console.sensecore.cn/help/docs/model-as-a-service/nova/) (Method 1), [Get access key (API key only)](https://console.sensecore.cn/aistudio/management/api-key) (Method 2) | `LAZYLLM_SENSENOVA_API_KEY`, `LAZYLLM_SENSENOVA_SECRET_KEY` (Method 1) / `LAZYLLM_SENSENOVA_API_KEY` (Method 2) |
| [OpenAI](https://openai.com/index/openai-api/) | [Create an API key](https://platform.openai.com/api-keys) | `LAZYLLM_OPENAI_API_KEY` |
| [Zhipu](https://open.bigmodel.cn/) | [Create an API key](https://open.bigmodel.cn/usercenter/apikeys) | `LAZYLLM_GLM_API_KEY` |
| [Kimi](https://platform.moonshot.cn/) | [Create an API key](https://platform.moonshot.cn/console/api-keys) | `LAZYLLM_KIMI_API_KEY` |
| [Tongyi Qianwen](https://help.aliyun.com/zh/dashscope/developer-reference/use-qwen-by-api) | [Create an API key](https://help.aliyun.com/zh/dashscope/developer-reference/acquisition-and-configuration-of-api-key) | `LAZYLLM_QWEN_API_KEY` |
| [Doubao](https://www.volcengine.com/product/doubao) | [Create an API key](https://console.volcengine.com/ark/region:ark+cn-beijing/apiKey) | `LAZYLLM_DOUBAO_API_KEY` |
| [DeepSeek](https://www.deepseek.com/) | [Create an API key](https://platform.deepseek.com/api_keys/apiKey) | `LAZYLLM_DEEPSEEK_API_KEY` |

To expose your API key as an environment variable:

1.  Obtain the API key for your target platform (SenseNova requires two keys for Method 1).
2.  Run the following command to set the environment variables:

```bash
export LAZYLLM_<name_of_provider_upper_case>_API_KEY=<your_api_key>
```

For example, if you use SenseNova and obtained the key through Method 1, set the variables below:

```bash
export LAZYLLM_SENSENOVA_API_KEY="Your Access Key ID"
export LAZYLLM_SENSENOVA_SECRET_KEY="Your Access Key Secret"
```

If you used Method 2, set only this variable:

```bash
export LAZYLLM_SENSENOVA_API_KEY="Your API Key"
```

Once the environment variables are in place, instantiate `OnlineChatModule`, specify the model source, and LazyLLM will read the corresponding API key automatically.

This makes it easy to manage API keys for multiple providers and keeps the online invocation flow simple.
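
Before calling a model, it can help to see which keys your shell actually exposes. The following is a small sketch using the variable names from the table above; the dictionary keys and the function `configured_providers` are labels chosen for this example, not a LazyLLM API.

```python
import os

# Environment variable names copied from the provider table above.
PROVIDER_KEYS = {
    "sensenova": "LAZYLLM_SENSENOVA_API_KEY",
    "openai": "LAZYLLM_OPENAI_API_KEY",
    "glm": "LAZYLLM_GLM_API_KEY",
    "kimi": "LAZYLLM_KIMI_API_KEY",
    "qwen": "LAZYLLM_QWEN_API_KEY",
    "doubao": "LAZYLLM_DOUBAO_API_KEY",
    "deepseek": "LAZYLLM_DEEPSEEK_API_KEY",
}


def configured_providers(environ=None):
    """Return the providers whose API-key variable is set and non-empty."""
    env = os.environ if environ is None else environ
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]


if __name__ == "__main__":
    print("Providers with a key set:", configured_providers())
```

An empty list means no key is visible to the current process, which usually indicates that the `export` command ran in a different shell session.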

## RAG in Practice

### 1. Review of the Core Principles ✈️

After the LazyLLM environment is configured, let's recap the basics of RAG (Retrieval-Augmented Generation). When the model needs to produce an answer, it first retrieves relevant information from a large collection of documents. The retrieved facts are fed into the generation step so the final response is grounded and accurate. The following diagram shows the workflow: the system receives a user query, the retriever fetches similar content from external documents, the query plus retrieved context are sent to the LLM, and the LLM returns the answer.

![image.png](2_images/img2.png)

The offline workflow boils down to three steps:

1.  **Document ingestion and parsing (Reader)**

Load documents of various formats into the system. You can use open-source tools such as MinerU to improve parsing accuracy.

2.  **Chunking and vectorization (Transform and Vectorize)**

Clean, deduplicate, and split the raw data, then convert it into embeddings.

3.  **Indexing and storage (Indexing and Store)**

Store and index the processed text in a vector database or another high-performance retrieval backend.
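
The three offline steps can be sketched in plain Python. This toy version is only meant to show the data flow: it uses fixed-size character chunks and word counts in place of a real splitter and embedding model, and all function names are invented for the example.

```python
from collections import Counter, defaultdict


def chunk_text(text, size=40):
    """Split text into fixed-size character chunks (step 2, chunking)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def vectorize(chunk):
    """Toy 'embedding': lowercase word counts (step 2, vectorization)."""
    return Counter(chunk.lower().split())


def build_index(documents):
    """Return chunks, their vectors, and an inverted index (step 3)."""
    chunks, vectors, inverted = [], [], defaultdict(set)
    for doc in documents:  # step 1: documents are assumed to be parsed to text
        for chunk in chunk_text(doc):
            chunk_id = len(chunks)
            chunks.append(chunk)
            vec = vectorize(chunk)
            vectors.append(vec)
            for word in vec:
                inverted[word].add(chunk_id)
    return chunks, vectors, inverted


docs = ["RAG retrieves documents before generating an answer."]
chunks, vectors, inverted = build_index(docs)
print(len(chunks), sorted(inverted["rag"]))
```

A production system replaces `vectorize` with an embedding model and the inverted index with a vector store, but the reader, transform, and store stages keep the same order.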

![image-2.png](2_images/img1.png)
>**Summary**
>
>The end-to-end RAG pipeline can be summarized in three phases:
>
>**1. Retrieval**
>
>Given a user query, look up related information in the **knowledge base**.
>
>**2. Augmentation**
>
>Attach the retrieved text as extra context and send it to the LLM together with the user query.
>
>**3. Generation**
>
>The **LLM** combines the retrieved knowledge with its own pretraining to produce the final answer.
>
>We'll now dig into LazyLLM's **document management, retriever, and generator components** in that order.
>
>**Here we go! 🏄**
>

### 2. Document management 📚

The core of RAG is document retrieval from a document collection. This collection may include many different formats: rich-text files such as DOCX, PDF, and PPT; plain-text formats such as Markdown; or content retrieved from an API, such as results returned by a search engine. Because these documents come in diverse formats, we need specialized parsers to extract useful information such as text, images, tables, audio, or video.

In LazyLLM, these parsers are abstracted through the `Document` class. The built-in `Document` component can extract content from common rich-text formats including DOCX, PDF, PPT, and Excel. You can also implement a custom Reader to handle additional formats, which we will cover in later tutorials.

The main parameters of Document are as follows:

**Parameters:**

* **`dataset_path`** (`str`) – Path to the dataset directory. The directory must contain the documents to be managed by the module (single-file inputs are not supported yet).
* **`embed`** (`Optional[Union[Callable, Dict[str, Callable]]]`, default: `None`) – Embedding generator. Provide a dict if you want to create multiple embeddings per document; the dict keys are embedding names and the values are the embedding callables.
* **`manager`** (`bool`, default: `False`) – Whether to spin up a UI for the document module. Defaults to `False`.
* **`launcher`** (`optional`, default: `None`) – Object or function that starts the server module. If omitted, LazyLLM uses the default async launcher in `lazyllm.launchers` (`sync=False`).
* **`store_conf`** (`optional`, default: `None`) – Configures the storage backend and index backend.
* **`doc_fields`** (`optional`, default: `None`) – Describes which fields should be stored and retrieved plus their types (currently used only by the Milvus backend).
We'll focus on the simplest usage here. Advanced parameters such as `embed` and `manager` will be covered later. For now you only need to provide the dataset directory path:

[GitHub code link](https://github.com/LazyAGI/Tutorial/blob/7abc91dbb82a007a78731845dd8c360ac0cc1e75/rag/codes/chapter2/%E5%AE%9E%E6%88%981%EF%BC%9A%E6%9C%80%E5%9F%BA%E7%A1%80%E7%9A%84RAG.py#L19)

```python
# RAG document loading
from lazyllm import Document

# Using an absolute path
doc = Document("/path/to/content/docs/")
print(f"Actual path passed in: {doc._manager._dataset_path}")

# Using a relative path
doc = Document("content/docs/")
print(f"Actual path passed in: {doc._manager._dataset_path}")
```

Make sure you pass an absolute path or a path relative to the current directory. Otherwise set the `LAZYLLM_DATA_PATH` environment variable to your document root and then provide a relative path, as shown below:

![image.png](2_images/img4.png)

![image-2.png](2_images/img3.png)

In the snippet above we inspect `doc._manager._dataset_path` to confirm the path that was ultimately passed into the document loader. Both approaches produce identical results.
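
The lookup order described above can be written down as a short function. This is a sketch that mirrors the prose, not LazyLLM's internal implementation; `resolve_dataset_path` and its `data_root` parameter are hypothetical names used only here.

```python
import os


def resolve_dataset_path(path, data_root=None):
    """Resolve `path` in the order described above (illustrative only).

    1. Absolute paths are used as-is.
    2. Relative paths that exist under the current directory come next.
    3. Otherwise the path is joined onto LAZYLLM_DATA_PATH (or `data_root`).
    """
    if os.path.isabs(path):
        return path
    local = os.path.abspath(path)
    if os.path.exists(local):
        return local
    root = data_root or os.environ.get("LAZYLLM_DATA_PATH", "")
    return os.path.join(root, path) if root else local


if __name__ == "__main__":
    print(resolve_dataset_path("content/docs"))
```

If the printed path is not the one you expect, check the current working directory first, since a matching local folder takes priority over the environment variable in this sketch.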

### 3. Retriever component 🕵

The documents in your corpus rarely align perfectly with every query, so you need a retriever to filter the relevant context. LazyLLM provides the `Retriever` component, which creates a retrieval module that can search across one or more `Document` instances using the similarity metric you specify. Key parameters include:

**Parameters:**

* **`doc`** (`object`) – The document module instance (or list of instances) to search.
* **`group_name`** (`str`) – The node group to query. Three node groups are built in, all produced by `SentenceSplitter` but with different chunk sizes:
  * `CoarseChunk`: chunk size 1024 with 100-token overlap
  * `MediumChunk`: chunk size 256 with 25-token overlap
  * `FineChunk`: chunk size 128 with 12-token overlap
* **`similarity`** (`Optional[str]`, default: `None`) – Similarity metric used during retrieval. Defaults to `"dummy"`. Options: `"bm25"`, `"bm25_chinese"`, `"cosine"`.
* **`similarity_cut_off`** (`Union[float, Dict[str, float]]`, default: `float('-inf')`) – Drop nodes whose similarity falls below this threshold. Provide a dict to set individual thresholds for each embedding.
* **`index`** (`str`, default: `'default'`) – Index type used for retrieval. Currently only `'default'` is supported.
* **`topk`** (`int`, default: `6`) – Number of nodes to return.
* **`embed_keys`** (`Optional[List[str]]`, default: `None`) – Embedding keys to use when retrieving. If omitted, all embeddings are considered.
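
To see how a chunk size and an overlap interact, here is a minimal sliding-window sketch. It works on a plain list of tokens and ignores sentence boundaries, so it is not how `SentenceSplitter` is implemented; the function name `sliding_chunks` is made up for this example.

```python
def sliding_chunks(tokens, chunk_size, overlap):
    """Split `tokens` into windows of `chunk_size` that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if not tokens:
        return []
    step = chunk_size - overlap  # how far each new window advances
    return [tokens[i:i + chunk_size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]


tokens = list(range(10))
for chunk in sliding_chunks(tokens, chunk_size=4, overlap=1):
    print(chunk)
```

Each window starts `chunk_size - overlap` tokens after the previous one, so neighbouring chunks repeat a little context. With the `CoarseChunk` settings that step would be 1024 - 100 = 924 tokens.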

The following code configures a retriever that works on the `doc` corpus, searches the `CoarseChunk` node group with `bm25_chinese`, and returns the top 3 most similar nodes. We'll focus on how to call the retriever here; algorithms and tuning tips will be covered later (see Practice 2).

[GitHub code link](https://github.com/LazyAGI/Tutorial/blob/7abc91dbb82a007a78731845dd8c360ac0cc1e75/rag/codes/chapter2/%E5%AE%9E%E6%88%981%EF%BC%9A%E6%9C%80%E5%9F%BA%E7%A1%80%E7%9A%84RAG.py#L28)

```python
from lazyllm import Document, Retriever

# Load documents using an absolute path
doc = Document("/path/to/content/docs/")

# Create a Retriever. Here we use the built-in chunking strategy "CoarseChunk"
# and the BM25-based similarity function for Chinese text.
retriever = Retriever(doc, group_name=Document.CoarseChunk, similarity="bm25_chinese", topk=3)

# Run the retriever with a query
retriever_result = retriever("your query")

# Print the content of the first retrieved chunk
print(retriever_result[0].get_content())
```

Let's run the code and inspect the output:

<div style="text-align:center; margin:20px 0;">
  <video controls style="width:900px; max-width:100%; height:auto; border:1px solid #ccc; border-radius:8px; box-shadow:0 4px 8px rgba(0,0,0,0.1);">
    <source src="./2_videos/ÊúÄÁÆÄÂçïÁöÑrag2.mp4" type="video/mp4" />
    Your browser does not support the video tag.
  </video>
</div>

`retriever_result` is a list in which each element represents a node (we'll explain nodes in Advanced 1). Call `get_content()` on a node to print its text.
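
The `topk` and `similarity_cut_off` parameters can be pictured as a filter followed by a ranking step. The sketch below assumes that order and uses `(score, text)` pairs as stand-ins for real nodes; it illustrates the parameters and is not taken from LazyLLM's source.

```python
import heapq


def select_nodes(scored_nodes, topk=6, similarity_cut_off=float("-inf")):
    """Keep nodes scoring at least the cut-off, then return the `topk` best.

    `scored_nodes` is a list of (score, text) pairs, a stand-in for real nodes.
    """
    kept = [pair for pair in scored_nodes if pair[0] >= similarity_cut_off]
    return heapq.nlargest(topk, kept, key=lambda pair: pair[0])


scored = [(0.9, "chunk A"), (0.2, "chunk B"), (0.7, "chunk C"), (0.5, "chunk D")]
for score, text in select_nodes(scored, topk=2, similarity_cut_off=0.4):
    print(score, text)
```

With the default cut-off of negative infinity nothing is filtered out, so only `topk` limits how many nodes come back.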

### 4. Generator component 🙋

Once we have the retrieved context and the user query, we feed both into the generator (the LLM) to produce the final answer. We'll use an online model as an example and show how LazyLLM calls it.

`OnlineChatModule` unifies access to hosted LLMs. Regardless of whether you're targeting OpenAI, SenseNova, or any other API, LazyLLM wraps the parameters for you: simply specify the provider via `source` and the model name via `model`.

[GitHub code link](https://github.com/LazyAGI/Tutorial/blob/7abc91dbb82a007a78731845dd8c360ac0cc1e75/rag/codes/chapter2/%E5%AE%9E%E6%88%981%EF%BC%9A%E6%9C%80%E5%9F%BA%E7%A1%80%E7%9A%84RAG.py#L42)

```python
import lazyllm

# The prompt asks the model to end every reply with '喵喵喵' ("meow meow meow").
llm_prompt = "You are a small cat. After every response, you must add '喵喵喵'."
llm = lazyllm.OnlineChatModule(source="sensenova", model="SenseChat-5-1202").prompt(llm_prompt)

print(llm("早上好！"))  # "Good morning!"
# >>> Good morning! The sun is up~ (stretching) Want to sit by the window and enjoy the sunlight together?
# >>> I'll share half of my dried fish with you~ (nuzzles) 喵喵喵~
```

Three statements are all you need to call an LLM. First define a prompt so the model knows how to respond; every subsequent reply will follow that instruction.

Next, configure the model by instantiating `OnlineChatModule`, passing the provider via `source` and the `model` name, and registering the prompt through `.prompt`.

Finally, invoke the configured model by sending your question to `llm` just like any regular chat interface.

You can also replace the final `print(...)` call above with the following snippet to spawn a simple chat UI:

```python
lazyllm.WebModule(llm, port=23466, history=[llm]).start().wait()
```

Here we start a lightweight web client with `lazyllm.WebModule`. Paste the host and port into your browser to interact with the chat interface.

![image.png](2_images/img6.png)

## Build the RAG Knowledge Base

With the foundational components in place, we can now assemble a complete RAG workflow. Before that, we need a corpus. We'll construct a knowledge base from the original [cmrc2018](https://huggingface.co/datasets/hfl/cmrc2018) dataset and reuse it throughout the remaining chapters whenever we discuss improvements or optimizations.

### 1. Dataset overview

CMRC 2018 (Chinese Machine Reading Comprehension 2018) [1] is a Chinese span-extraction reading comprehension dataset. Human annotators created nearly 20,000 real questions based on Wikipedia passages to increase linguistic diversity in Chinese MRC tasks.

![image-2.png](2_images/img5.png)

Each record contains four fields: `id`, `context`, `question`, and `answers`. `id` is the identifier, `context` is a descriptive passage (history, news, fiction, etc.), `question` is the question asked about that passage, and `answers` lists the human-annotated answer spans. Within `answers`, `answer_start` marks where the answer begins inside `context`, and `text` stores the answer itself. In the example shown, two experts annotated the same question independently to ensure accuracy.
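
To make the record layout concrete, here is a small sketch with an invented English sample. It assumes `answers` is a dict of parallel lists and that `answer_start` counts characters from the start of `context`; check both assumptions against the data you actually download.

```python
# Hypothetical record shaped like a CMRC 2018 sample (all values are invented).
record = {
    "id": "EXAMPLE_0001",
    "context": "Mount Yushan is the highest peak in Taiwan.",
    "question": "What is the highest peak in Taiwan?",
    "answers": {"text": ["Mount Yushan"], "answer_start": [0]},
}


def answer_spans(sample):
    """Yield (start, text) pairs and check each span against the context."""
    answers = sample["answers"]
    for start, text in zip(answers["answer_start"], answers["text"]):
        # Assumption: offsets count characters from the start of `context`.
        assert sample["context"][start:start + len(text)] == text
        yield start, text


print(list(answer_spans(record)))
```

Slicing `context` with the stored offset is a cheap sanity check that the answer really is a span of the passage.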

Download the dataset locally with the `datasets` library (install it first via `pip install datasets` if needed):

[GitHub code link](https://github.com/LazyAGI/Tutorial/blob/7abc91dbb82a007a78731845dd8c360ac0cc1e75/rag/codes/chapter2/%E5%AE%9E%E6%88%981%EF%BC%9A%E6%9C%80%E5%9F%BA%E7%A1%80%E7%9A%84RAG.py#L54)

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset('cmrc2018')
# Alternatively, specify a custom download path:
# dataset = load_dataset('cmrc2018', cache_dir='path/to/datasets')
print(dataset)
```

The dataset is downloaded to `~/.cache/huggingface/datasets` by default. Pass `cache_dir` if you prefer a custom path. After the download completes you'll see `train`, `validation`, and `test` splits, as shown below:

![image.png](2_images/img7.png)

### 2. Build the knowledge base

We'll use only the `test` split for the RAG corpus; the other splits will be used later when we fine-tune models (we'll explain why in those chapters). Each sample contains `id`, `context`, `question`, and `answers`. We'll feed the `context` values into the knowledge base so we can evaluate the RAG system by querying with the paired `question` and comparing the generated answer with the original `answers`. Sound confusing? Don't worry, future chapters will break it down. For now just remember that **the knowledge base is built from the `context` field of the `test` split.** The code is shown below:

[GitHub code link](https://github.com/LazyAGI/Tutorial/blob/7abc91dbb82a007a78731845dd8c360ac0cc1e75/rag/codes/chapter2/%E5%AE%9E%E6%88%981%EF%BC%9A%E6%9C%80%E5%9F%BA%E7%A1%80%E7%9A%84RAG.py#L59)

```python
import os


def create_KB(dataset):
    """Create a knowledge base from the 'context' field in the test set.
    Every 10 entries are saved into one .txt file. The remaining entries
    (if fewer than 10) are saved into a final file.
    """
    Context = []
    for i in dataset:
        Context.append(i['context'])
    Context = list(set(Context))  # Deduplicate and obtain 256 unique items

    # Compute the number of files needed
    chunk_size = 10
    total_files = (len(Context) + chunk_size - 1) // chunk_size  # Ceiling division

    # Create the folder 'data_kb' to store the text files
    os.makedirs("data_kb", exist_ok=True)

    # Write every group of 10 entries into separate files
    for i in range(total_files):
        chunk = Context[i * chunk_size : (i + 1) * chunk_size]
        file_name = f"./data_kb/part_{i+1}.txt"
        with open(file_name, "w", encoding="utf-8") as f:
            f.write("\n".join(chunk))  # Write separated by newline

        # print(f"File {file_name} written successfully.")
```

Explanation:

* The first `for` loop iterates over `dataset`, grabs each entry's `context`, and appends it to the `Context` list. By the end `Context` holds every passage.
* `list(set(Context))` deduplicates the contexts by wrapping the list in `set()` and then converting it back to a list.
* `chunk_size = 10` means each text file stores up to 10 contexts. `total_files` uses ceiling division `(len(Context) + chunk_size - 1) // chunk_size` so the remainder still gets written to a file.
* `os.makedirs("data_kb", exist_ok=True)` creates the output folder if it does not exist yet.
* The final loop splits the contexts into groups of 10 and writes each group to a standalone `.txt` file:
  * `for i in range(total_files)`: iterate over the number of files we need.
  * `chunk = Context[i * chunk_size : (i + 1) * chunk_size]`: grab the current batch of up to 10 contexts.
  * `file_name = f"./data_kb/part_{i+1}.txt"`: generate unique file names such as `part_1.txt`, `part_2.txt`, etc.
  * `with open(file_name, "w", encoding="utf-8") as f:`: open each file for writing with UTF-8 encoding.
  * `f.write("\n".join(chunk))`: join the contexts with newlines and write them to the file.
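
As a quick worked example of the ceiling division, take the 256 unique contexts mentioned in the code comment and compute how many files are produced and how many contexts land in the last one:

```python
import math

n_contexts, chunk_size = 256, 10

# Same formula as in create_KB: integer ceiling of n_contexts / chunk_size.
total_files = (n_contexts + chunk_size - 1) // chunk_size
last_file_size = n_contexts - (total_files - 1) * chunk_size

print(total_files, last_file_size)  # → 26 6
assert total_files == math.ceil(n_contexts / chunk_size)
```

The first 25 files hold 10 contexts each and the 26th holds the remaining 6, which is why plain floor division (`256 // 10 == 25`) would silently drop data.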

[GitHub code link](https://github.com/LazyAGI/Tutorial/blob/7abc91dbb82a007a78731845dd8c360ac0cc1e75/rag/codes/chapter2/%E5%AE%9E%E6%88%981%EF%BC%9A%E6%9C%80%E5%9F%BA%E7%A1%80%E7%9A%84RAG.py#L82)

```python
# Create the knowledge base using create_KB()
create_KB(dataset['test'])

# Display the contents of one of the generated text files
# (read with the same UTF-8 encoding used when writing)
with open('data_kb/part_1.txt', encoding='utf-8') as f:
    print(f.read())
```

After the script finishes you'll see a `data_kb` folder in the current directory containing multiple `.txt` files:

![image.png](2_images/img9.png)

The files look like this:

![image-2.png](2_images/img8.png)

>**Keep track of the `data_kb` path. Unless otherwise noted, all upcoming RAG examples will use this directory as the knowledge base.**

### 3. Environment check

Verify whether the SQLite library on your machine supports multithreading:

```python
from lazyllm.common.queue import sqlite3_check_threadsafety
print(sqlite3_check_threadsafety())
```

If the result is **False**, reinstall **SQLite** with multithreading support. On macOS you can run:

```bash
brew update
brew install sqlite
which sqlite3
```

If the output path is **not** the Homebrew-installed SQLite, export the variables below and reinstall Python:

```bash
brew uninstall python
export PATH="/opt/homebrew/opt/sqlite/bin:$PATH"
export LDFLAGS="-L/opt/homebrew/opt/sqlite/lib"
export CPPFLAGS="-I/opt/homebrew/opt/sqlite/include"
brew install python
```
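
After reinstalling, you can double-check which SQLite build your Python links against. The sketch below is an independent check using only the standard library; it reads SQLite's compile-time options and is not necessarily what `sqlite3_check_threadsafety` does internally.

```python
import sqlite3


def sqlite_threadsafe_mode():
    """Return SQLite's compile-time THREADSAFE value (0, 1, or 2), or None."""
    conn = sqlite3.connect(":memory:")
    try:
        options = [row[0] for row in conn.execute("PRAGMA compile_options")]
    finally:
        conn.close()
    for option in options:
        if option.startswith("THREADSAFE="):
            return int(option.split("=", 1)[1])
    return None  # the build does not report its compile options


print("SQLite", sqlite3.sqlite_version, "THREADSAFE =", sqlite_threadsafe_mode())
```

A value of `0` means SQLite was built in single-thread mode, while `1` (serialized) and `2` (multi-thread) both allow use from multiple threads.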

## Build the basic RAG

Now that we understand the three core components and have prepared the knowledge base, let's implement the simplest RAG pipeline.

Remember the three building blocks:

* **Document** – Loads and manages the corpus; just point it to your documents.
* **Retriever** – Searches the document store. Configure which corpus to search, the strategy, and the number of hits to return.
* **LLM** – Generates the final answer from the query plus retrieved documents. LazyLLM offers `TrainableModule` for local models and `OnlineChatModule` for hosted models so you can switch between them without changing code.

Chaining these components together yields the simplest RAG system:

[GitHub code link](https://github.com/LazyAGI/Tutorial/blob/7abc91dbb82a007a78731845dd8c360ac0cc1e75/rag/codes/chapter2/%E5%AE%9E%E6%88%981%EF%BC%9A%E6%9C%80%E5%9F%BA%E7%A1%80%E7%9A%84RAG.py#L89)

```python
import lazyllm

# Load documents (point dataset_path at your knowledge base directory)
documents = lazyllm.Document(dataset_path="/content/docs")

# Define the retriever
retriever = lazyllm.Retriever(
    doc=documents,
    group_name="CoarseChunk",
    similarity="bm25_chinese",
    topk=3
)

# Define the generation module
llm = lazyllm.OnlineChatModule(source="sensenova", model="SenseChat-5")

# Prompt design
prompt = (
    "You will act as an AI question-answering assistant and complete a dialogue task. "
    "In this task, you should provide your answers based on the given context and question."
)
llm.prompt(lazyllm.ChatPrompter(instruction=prompt, extra_keys=['context_str']))

# Inference
query = "Introduce Yushan arrow bamboo for me."

# Retrieve document nodes
doc_node_list = retriever(query=query)

# Build input for the model by combining query and retrieved content
res = llm({
    "query": query,
    "context_str": "".join([node.get_content() for node in doc_node_list])
})

print(f'With RAG Answer: {res}')
```

Here's how that code works:

1.  The first three statements instantiate the document loader (`documents`), the retriever (`retriever`), and the generator (`llm`).
2.  `llm.prompt(...)` defines the LLM prompt. The built-in `ChatPrompter` formats the RAG instructions so the model knows which parts are references and which part is the user query; `extra_keys=['context_str']` names the extra input field that carries the retrieved context. It also converts the content into the wire format expected by online APIs.
3.  `query` captures the user question. Calling `retriever(query=query)` runs the retriever, and the results are stored in the list `doc_node_list`.
4.  Finally, `llm` is called with a dict containing the question (`query`) and the concatenated retrieval context (`context_str`), which we build by calling `.get_content()` on each node and joining the strings.
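
If you plan to ask several questions, the retrieve-then-generate steps can be wrapped in one function. The sketch below is a possible arrangement, not part of LazyLLM: `rag_answer`, `FakeNode`, `fake_retriever`, and `fake_llm` are hypothetical names, and the stubs exist only so the example runs without an API key.

```python
def rag_answer(query, retriever, llm, separator="\n"):
    """Retrieve context for `query`, then call `llm` with query plus context."""
    nodes = retriever(query=query)
    context = separator.join(node.get_content() for node in nodes)
    return llm({"query": query, "context_str": context})


class FakeNode:
    """Stand-in for a retrieved node; only get_content() is needed here."""

    def __init__(self, text):
        self._text = text

    def get_content(self):
        return self._text


def fake_retriever(query):
    """Stub retriever that always returns the same two passages."""
    return [FakeNode("passage one"), FakeNode("passage two")]


def fake_llm(inputs):
    """Stub LLM that echoes what it received."""
    return f"Q: {inputs['query']} | context: {inputs['context_str']!r}"


print(rag_answer("demo question", fake_retriever, fake_llm))
```

Passing the real `retriever` and `llm` objects from the code above in place of the stubs should work the same way, since both are called exactly as in the tutorial code. The default separator here is a newline so adjacent chunks do not run together; pass `separator=""` to reproduce the tutorial's exact concatenation.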

For comparison, here's what happens when you call the LLM without any RAG context:

```python
# Define the generation module (without RAG)
llm_without_rag = lazyllm.OnlineChatModule(source="sensenova", model="SenseChat-5")

# Query
query = "Introduce Yushan arrow bamboo for me."

# Run the model directly without retrieval
res = llm_without_rag(query)

print(f'Without RAG Answer: {res}')
```


Let's review the runtime output:

<div style="text-align:center; margin:20px 0;">
  <video controls style="width:900px; max-width:100%; height:auto; border:1px solid #ccc; border-radius:8px; box-shadow:0 4px 8px rgba(0,0,0,0.1);">
    <source src="./2_videos/ÊúÄÁÆÄÂçïÁöÑrag2.mp4" type="video/mp4" />
    Your browser does not support the video tag.
  </video>
</div>

---

References:

[1] A Span-Extraction Dataset for Chinese Machine Reading Comprehension

---

## Appendix: Install `Python`, `Pip`, and `Git` on macOS

1.  Install Xcode first.

![image.png](2_images/img10.png)

Accept the license agreement and install the Xcode command-line tools:

```bash
sudo xcode-select -s /Applications/Xcode.app/Contents/Developer  # Ensure the path is correct
sudo xcodebuild -license accept                                  # Accept the license agreement
xcode-select --install                                           # Install Xcode Command Line Tools
```

2.  Install Homebrew, then use it to install `pyenv` and Python 3.10:

```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zshrc
source ~/.zshrc
brew install pyenv
pyenv install 3.10.0
```

3.  Configure `pyenv`, select Python 3.10 as the global version, and create a virtual environment. Do not use Python 3.13 because the `spacy` dependency does not support it yet.

```bash
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc
echo 'eval "$(pyenv init --path)"' >> ~/.zshrc
echo 'eval "$(pyenv init -)"' >> ~/.zshrc
pyenv global 3.10.0
python3 -m venv lazyllm
source lazyllm/bin/activate
```

