Update RAG example to v0.2.x #349
Pull request overview
This pull request updates the RAG (Retrieval-Augmented Generation) example to be compatible with Agent-Lightning v2.x. The update includes migrating from shell-based training scripts to a Python-based training interface, consolidating environment management into uv, and using a smaller "tiny" dataset for easier testing and CI integration.
- Migrated from shell script configuration to Python-based training with `train_rag.py`
- Updated agent implementation to use v2.x API signatures (`training_rollout_async` and `validation_rollout_async` now accept a `Rollout` parameter)
- Consolidated dependency management by removing conda-based environment setup in favor of uv-managed dependencies
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `pyproject.toml` | Added rag dependency group with required packages (fastmcp, faiss-cpu, sentence-transformers) and conflict resolution overrides |
| `wiki_retriever_mcp/wiki_retriever_mcp.py` | Updated to use tiny dataset files and reduced top_k from 4 to 1 for demonstration purposes |
| `wiki_retriever_mcp/wiki_retriever_install.sh` | Removed conda-based installation script in favor of uv dependency management |
| `train_rag.py` | New Python-based training script with configurable training modes (fast/single_gpu) replacing the shell script approach |
| `train.sh` | Removed shell-based training script, replaced by `train_rag.py` |
| `rag_run_dev.py` | New development/testing script for running the RAG agent with a local vLLM server |
| `rag_agent.py` | Updated to v2.x API with new method signatures, improved logging, and type hints |
| `README.md` | Comprehensive documentation update with step-by-step setup instructions for the tiny dataset |
Comments suppressed due to low confidence (2)
- examples/rag/rag_run_dev.py:1: Import of 'json' is not used (`import json`).
- examples/rag/train_rag.py:13: Import of 'os' is not used (`import os`).
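Both suppressed comments concern unused imports. As a rough illustration of how such a check can work, the sketch below walks a module's syntax tree with the standard-library `ast` module and reports imported names that are never referenced. The helper name `find_unused_imports` and the sample source are made up for this example. It is a heuristic (it ignores `__all__` re-exports and names used only inside string annotations) and is not the analysis Copilot actually runs.

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in `source`."""
    tree = ast.parse(source)
    imported: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds only the top-level name `a`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module == "__future__":
                continue  # compiler directives, not real bindings
            for alias in node.names:
                if alias.name != "*":
                    imported.add(alias.asname or alias.name)

    # Every bare identifier in the module; `os.getcwd()` contributes `os`.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


# Hypothetical sample: `json` is imported but never used.
sample = "import json\nimport os\n\nprint(os.getcwd())\n"
unused = find_unused_imports(sample)
print(unused)
```

In practice a linter such as the one already configured for the repository is the better tool; the sketch only shows the idea behind the warning.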
examples/rag/train_rag.py
Outdated
    python train_rag_agent.py fast # Fast training for CI/testing
    python train_rag_agent.py single_gpu # Optimized for Single GPU (1.5B/7B models)
Copilot
AI
Dec 2, 2025
The usage examples in the docstring reference `train_rag_agent.py`, but the actual filename is `train_rag.py`. Please update the docstring to match the correct filename.
Suggested change:

    - python train_rag_agent.py fast # Fast training for CI/testing
    - python train_rag_agent.py single_gpu # Optimized for Single GPU (1.5B/7B models)
    + python train_rag.py fast # Fast training for CI/testing
    + python train_rag.py single_gpu # Optimized for Single GPU (1.5B/7B models)
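For context, a positional mode argument like the `fast` / `single_gpu` choice in the docstring could be wired up with `argparse` roughly as follows. This is a sketch only: the `PRESETS` values and the helper names `parse_args` and `build_config` are hypothetical, since the actual contents of `train_rag.py` are not shown in this review.

```python
import argparse
from typing import Dict, Optional, Sequence

# Hypothetical presets; the real settings in train_rag.py are not shown here.
PRESETS: Dict[str, Dict[str, int]] = {
    "fast": {"epochs": 1, "batch_size": 2},
    "single_gpu": {"epochs": 2, "batch_size": 8},
}


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Select a training preset.")
    # `choices` makes argparse reject any mode that is not a known preset.
    parser.add_argument("mode", choices=sorted(PRESETS), help="Training preset")
    return parser.parse_args(argv)


def build_config(mode: str) -> Dict[str, int]:
    # Copy so callers can adjust the result without mutating the preset.
    return dict(PRESETS[mode])


# Equivalent to running: python train_rag.py fast
args = parse_args(["fast"])
config = build_config(args.mode)
print(args.mode, config)
```

Deriving `choices` from the preset table keeps the accepted modes and the usage text in sync, which avoids the kind of docstring drift flagged above.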
examples/rag/rag_agent.py
Outdated
    val_resources = {
        "main_llm": agl.LLM(
            endpoint=llm.endpoint,
            model=llm.model,
            sampling_parameters={"temperature": 0.7},
        )
    }
Copilot
AI
Dec 2, 2025
`val_resources` is created as a plain dictionary but is passed to `training_rollout_async`, which expects an `agl.NamedResources` object. This type mismatch could cause runtime errors. Consider wrapping it properly or ensuring it is the correct type.
Suggested change:

    - val_resources = {
    -     "main_llm": agl.LLM(
    -         endpoint=llm.endpoint,
    -         model=llm.model,
    -         sampling_parameters={"temperature": 0.7},
    -     )
    - }
    + val_resources = agl.NamedResources({
    +     "main_llm": agl.LLM(
    +         endpoint=llm.endpoint,
    +         model=llm.model,
    +         sampling_parameters={"temperature": 0.7},
    +     )
    + })
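The pattern under discussion, a validation rollout that reuses the training rollout with validation-time sampling parameters, can be sketched without the library. Everything below is a stand-in: `LLM` is a local dataclass and `Resources` a plain type alias, not `agl.LLM` or `agl.NamedResources`, and the rollout functions only return the temperature that would be used instead of calling a model.

```python
import asyncio
from dataclasses import dataclass, field, replace
from typing import Any, Dict


@dataclass(frozen=True)
class LLM:
    """Local stand-in for an LLM resource; NOT the agl.LLM class."""

    endpoint: str
    model: str
    sampling_parameters: Dict[str, Any] = field(default_factory=dict)


# Stand-in for a named-resources mapping (assumption, not the agl type).
Resources = Dict[str, LLM]


async def training_rollout(task: str, resources: Resources) -> float:
    llm = resources["main_llm"]
    # Placeholder: a real rollout would query llm.endpoint for `task`.
    return float(llm.sampling_parameters["temperature"])


async def validation_rollout(task: str, resources: Resources) -> float:
    llm = resources["main_llm"]
    # Keep endpoint and model, override only the sampling parameters.
    val_resources: Resources = {
        "main_llm": replace(llm, sampling_parameters={"temperature": 0.7})
    }
    return await training_rollout(task, val_resources)


train_llm = LLM("http://localhost:8000/v1", "demo-model", {"temperature": 1.0})
used_temperature = asyncio.run(validation_rollout("q", {"main_llm": train_llm}))
print(used_temperature)
```

Building a new mapping rather than mutating the incoming one leaves the training resources untouched, whichever concrete type the library expects.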
examples/rag/README.md
Outdated
    | `train_rag.py` | Initiates the GRPO training process |
    | `rag_run_dev.py` | Development run test |
    | `utils.py` | Scoring utilities for exact match, F1 score, and response parsing |
    | `wiki_retriever_mcp/` | Setup scripts and MCP server (`wiki_retriever_install.sh`, `wiki_retriever_mcp.py`) for Wikipedia retrieval |
Copilot
AI
Dec 2, 2025
The description mentions `wiki_retriever_install.sh` as part of the `wiki_retriever_mcp/` directory, but this file has been deleted in this PR. Please update the description to remove the reference to this file.
Suggested change:

    - | `wiki_retriever_mcp/` | Setup scripts and MCP server (`wiki_retriever_install.sh`, `wiki_retriever_mcp.py`) for Wikipedia retrieval |
    + | `wiki_retriever_mcp/` | MCP server (`wiki_retriever_mcp.py`) for Wikipedia retrieval |
    @@ -0,0 +1,34 @@
    # Copyright (c) Microsoft. All rights reserved.
Copilot
AI
Dec 2, 2025
The json module is imported but never used in this file. Consider removing this unused import.
    # Copyright (c) Microsoft. All rights reserved.
examples/rag/train_rag.py
Outdated
    from __future__ import annotations

    import argparse
    import os
Copilot
AI
Dec 2, 2025
The os module is imported but never used in this file. Consider removing this unused import.
    import os
examples/rag/README.md
Outdated
    3. Start the MCP server
    Open a terminal and run:
    ```
    cd examples/rag/wiki_retriever_mcp
Copilot
AI
Dec 2, 2025
There's a path navigation inconsistency in the instructions. After step 2, users are left in the `examples/rag/wiki_retriever_mcp` directory, but step 3 tells them to run `cd examples/rag/wiki_retriever_mcp`, which only works from the repo root. Consider adding a comment that states the expected working directory at each step, or adjust the paths so they are consistent.
Suggested change:

    - cd examples/rag/wiki_retriever_mcp
    + # (Assuming you are still in examples/rag/wiki_retriever_mcp)
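The inconsistency comes down to which directory each `cd` is relative to. A small sketch makes the difference concrete by computing the `cd` argument needed to reach the MCP directory from a given working directory. The `/repo` root and the helper name `cd_argument` are hypothetical; `posixpath.relpath` is used so the result does not depend on the host OS.

```python
import posixpath

# Hypothetical absolute location of the checkout.
REPO_ROOT = "/repo"
MCP_DIR = posixpath.join(REPO_ROOT, "examples/rag/wiki_retriever_mcp")


def cd_argument(current_dir: str, target_dir: str = MCP_DIR) -> str:
    """Return the relative path to pass to `cd` from `current_dir`."""
    return posixpath.relpath(target_dir, current_dir)


# From the repo root, the README's command is correct as written.
from_root = cd_argument(REPO_ROOT)
# After step 2 the user is already in the target directory.
from_mcp_dir = cd_argument(MCP_DIR)
print(from_root, from_mcp_dir)
```

From the repo root the argument is the full relative path, while from the MCP directory it collapses to `.`, which is why stating the expected working directory in each step removes the ambiguity.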
/ci

🚀 CI Watcher for correlation id-3605104551-mipjmxvu triggered by comment 3605104551
✅ All runs completed.
1. Re-implement RAG agent with Agent-Lightning v2.x
2. Merge environment management into uv
3. Update sample dataset to a `tiny` dataset for future CI
4. Update readme