Update RAG example to v0.2.x #349

wizardlancet · 2025-12-02T05:54:34Z

1. Re-implement RAG agent with Agent-Lightning v2.x
2. Merge environments management into uv
3. Update sample dataset to a tiny dataset for future CI
4. Update readme

Copilot

Pull request overview

This pull request updates the RAG (Retrieval-Augmented Generation) example to be compatible with Agent-Lightning v2.x. The update includes migrating from shell-based training scripts to a Python-based training interface, consolidating environment management into uv, and using a smaller "tiny" dataset for easier testing and CI integration.

- Migrated from shell script configuration to Python-based training with `train_rag.py`
- Updated agent implementation to use v2.x API signatures (`training_rollout_async` and `validation_rollout_async` now accept a `Rollout` parameter)
- Consolidated dependency management by removing conda-based environment setup in favor of uv-managed dependencies

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `pyproject.toml` | Added `rag` dependency group with required packages (fastmcp, faiss-cpu, sentence-transformers) and conflict resolution overrides |
| `wiki_retriever_mcp/wiki_retriever_mcp.py` | Updated to use tiny dataset files and reduced top_k from 4 to 1 for demonstration purposes |
| `wiki_retriever_mcp/wiki_retriever_install.sh` | Removed conda-based installation script in favor of uv dependency management |
| `train_rag.py` | New Python-based training script with configurable training modes (fast/single_gpu) replacing the shell script approach |
| `train.sh` | Removed shell-based training script, replaced by `train_rag.py` |
| `rag_run_dev.py` | New development/testing script for running the RAG agent with a local vLLM server |
| `rag_agent.py` | Updated to v2.x API with new method signatures, improved logging, and type hints |
| `README.md` | Comprehensive documentation update with step-by-step setup instructions for the tiny dataset |

Comments suppressed due to low confidence (2)

examples/rag/rag_run_dev.py:1

Import of 'json' is not used.

import json

examples/rag/train_rag.py:13

Import of 'os' is not used.

import os


Copilot · 2025-12-02T05:58:48Z

examples/rag/train_rag.py

+    python train_rag_agent.py fast        # Fast training for CI/testing
+    python train_rag_agent.py single_gpu  # Optimized for Single GPU (1.5B/7B models)


The usage examples in the docstring reference train_rag_agent.py, but the actual filename is train_rag.py. Please update the docstring to match the correct filename.

Suggested change

-    python train_rag_agent.py fast        # Fast training for CI/testing
-    python train_rag_agent.py single_gpu  # Optimized for Single GPU (1.5B/7B models)
+    python train_rag.py fast        # Fast training for CI/testing
+    python train_rag.py single_gpu  # Optimized for Single GPU (1.5B/7B models)
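For readers unfamiliar with the mode-style invocation in the docstring, here is a minimal sketch of how a script could expose `fast` and `single_gpu` as a positional argument. It assumes plain `argparse`; the names `MODE_DESCRIPTIONS`, `build_parser`, and `select_mode` are hypothetical and are not taken from the actual `train_rag.py`, whose contents are not shown in this thread.

```python
import argparse
from typing import Optional, Sequence

# Hypothetical mapping: the two mode names and their descriptions come from
# the docstring quoted above; everything else here is illustrative.
MODE_DESCRIPTIONS = {
    "fast": "Fast training for CI/testing",
    "single_gpu": "Optimized for Single GPU (1.5B/7B models)",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts exactly one of the known modes."""
    parser = argparse.ArgumentParser(prog="train_rag.py")
    parser.add_argument(
        "mode",
        choices=sorted(MODE_DESCRIPTIONS),
        help="Training mode to run",
    )
    return parser


def select_mode(argv: Optional[Sequence[str]] = None) -> dict:
    """Parse argv (or sys.argv when None) and return the chosen mode."""
    args = build_parser().parse_args(argv)
    return {"mode": args.mode, "description": MODE_DESCRIPTIONS[args.mode]}
```

Calling `select_mode(["fast"])` returns the entry for the fast mode, while an unknown mode makes `argparse` exit with a usage error, so a stale name in the docstring would not silently select a configuration.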

Copilot · 2025-12-02T05:58:49Z

examples/rag/rag_agent.py

+        val_resources = {
+            "main_llm": agl.LLM(
                endpoint=llm.endpoint,
                model=llm.model,
                sampling_parameters={"temperature": 0.7},
            )
        }


The val_resources is created as a plain dictionary but is passed to training_rollout_async which expects an agl.NamedResources object. This type mismatch could cause runtime errors. Consider wrapping it properly or ensuring it's the correct type.

Suggested change

-        val_resources = {
-            "main_llm": agl.LLM(
-                endpoint=llm.endpoint,
-                model=llm.model,
-                sampling_parameters={"temperature": 0.7},
-            )
-        }
+        val_resources = agl.NamedResources({
+            "main_llm": agl.LLM(
+                endpoint=llm.endpoint,
+                model=llm.model,
+                sampling_parameters={"temperature": 0.7},
+            )
+        })

Copilot · 2025-12-02T05:58:49Z

examples/rag/README.md

+| `train_rag.py` | Initiates the GRPO training process |
+| `rag_run_dev.py` | Development run test |
+| `utils.py` | Scoring utilities for exact match, F1 score, and response parsing |
 | `wiki_retriever_mcp/` | Setup scripts and MCP server (`wiki_retriever_install.sh`, `wiki_retriever_mcp.py`) for Wikipedia retrieval |


The description mentions wiki_retriever_install.sh as part of the wiki_retriever_mcp/ directory, but this file has been deleted in this PR. Please update the description to remove the reference to this file.

Suggested change

-| `wiki_retriever_mcp/` | Setup scripts and MCP server (`wiki_retriever_install.sh`, `wiki_retriever_mcp.py`) for Wikipedia retrieval |
+| `wiki_retriever_mcp/` | MCP server (`wiki_retriever_mcp.py`) for Wikipedia retrieval |

Copilot · 2025-12-02T05:58:49Z

examples/rag/rag_run_dev.py

@@ -0,0 +1,34 @@
+# Copyright (c) Microsoft. All rights reserved.


The json module is imported but never used in this file. Consider removing this unused import.

Suggested change

# Copyright (c) Microsoft. All rights reserved.

Copilot · 2025-12-02T05:58:50Z

examples/rag/train_rag.py

+from __future__ import annotations
+
+import argparse
+import os


The os module is imported but never used in this file. Consider removing this unused import.

Suggested change

-import os
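The two unused-import findings above (`json` in `rag_run_dev.py`, `os` in `train_rag.py`) are the kind of issue a simple static check can catch. The following is a rough sketch using only the standard-library `ast` module; the function name `find_unused_imports` is hypothetical, and the check is deliberately simplistic (it ignores `__all__`, re-exports, and names used only inside strings), so it is not a substitute for a real linter.

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in the module."""
    tree = ast.parse(source)

    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module == "__future__":
                continue  # compiler directives, never referenced by name
            for alias in node.names:
                imported.add(alias.asname or alias.name)

    # Every bare identifier in the module, including the base of "os.path".
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


# Example input mirroring the import block quoted in the review comment.
example = (
    "from __future__ import annotations\n"
    "import argparse\n"
    "import os\n"
    "parser = argparse.ArgumentParser()\n"
)
unused = find_unused_imports(example)
```

With the example input, `unused` contains only `os`, matching the review comment.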

Copilot · 2025-12-02T05:58:50Z

examples/rag/README.md

+3. Start the MCP server
+Open a terminal and run:
+```
+cd examples/rag/wiki_retriever_mcp


There's a path navigation inconsistency in the instructions. After step 2, users are left in the examples/rag/wiki_retriever_mcp directory, but step 3 says to cd examples/rag/wiki_retriever_mcp (which assumes they're at the repo root). Consider adding a comment to indicate the expected working directory at each step or adjusting the paths to be consistent.

Suggested change

-cd examples/rag/wiki_retriever_mcp
+# (Assuming you are still in examples/rag/wiki_retriever_mcp)

ultmaster · 2025-12-03T05:05:16Z

/ci

github-actions · 2025-12-03T05:05:30Z

🚀 CI Watcher for correlation id-3605104551-mipjmxvu triggered by comment 3605104551
🏃‍♀️ Tracking 1 workflow run(s):

🟢 GPU Test - PR #349 - ci-gpu - id-3605104551-mipjmxvu — completed/success

✅ All runs completed.

wizardlancet added 3 commits December 2, 2025 12:18

update RAG Example to v2.x

7df0be3

pre-commit check

9de6b44

Update RAG example to v2.x

5ace9f9

Copilot AI review requested due to automatic review settings December 2, 2025 05:54

Copilot started reviewing on behalf of wizardlancet December 2, 2025 05:55 View session

Copilot finished reviewing on behalf of wizardlancet December 2, 2025 05:57

fix header for rag_run_dev.py

2fe381e

Copilot AI reviewed Dec 2, 2025


wizardlancet added 10 commits December 2, 2025 06:07

fix format

41903a7

fix format

790e1de

fix format rag_run_dev

829cd9d

fix type

f64e998

fix df type

3c539cd

fix type ignore pandas df

7100e81

add uv lock file

8a22011

remove wiki_retriver_mcp folder move mcp server out

2bf461c

fix in llm_proxy_gpu test,not GPU_AVAILABLE

d756bac

seperate rag requirements in isolated group

8b11cf0

ultmaster added the ci-gpu label Dec 3, 2025

wizardlancet added 2 commits December 3, 2025 05:02

wider packages version in rag group

8cbca82

update uv lock

f7bedda

ultmaster changed the title ~~Update RAG example to v2.x~~ Update RAG example to v0.2.x Dec 3, 2025

ultmaster merged commit 34811cb into microsoft:main Dec 3, 2025
10 checks passed
