
@enitrat (Collaborator) commented Sep 20, 2025

Summary

This PR applies GEPA (Genetic-Pareto), DSPy's reflective prompt optimizer, to the query processing program. The optimized prompts improve the RAG pipeline's retrieval (+5 basis points); more notably, the change will make it easy to switch models in the future.

Key Changes

GEPA Optimization Implementation: Applied DSPy's GEPA optimizer to the query processing program to improve search term extraction and documentation source identification
Enhanced Dataset: Added a user queries dataset (user_queries.json) with thousands of query examples, enough to build proper train, validation, and test sets
Optimized Programs: Updated optimized configurations for RAG pipeline, MCP program, and retrieval program with improved prompts
Retrieval Judge Integration: Enhanced document retrieval with better relevance filtering and metadata extraction
Code Cleanup: Removed obsolete optimizer implementations and streamlined the optimization framework
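
For reviewers who want to reproduce the train/val/test partition mentioned above, here is a minimal sketch of a seeded split. The `split_dataset` helper, the 60/20/20 fractions, and the `{"query": ...}` record shape are all assumptions made for illustration; they are not taken from this PR, and the actual layout of user_queries.json may differ.

```python
import random


def split_dataset(examples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of `examples` and cut it into train/val/test lists.

    Hypothetical helper: the fractions and seed are illustrative defaults,
    not values used by this PR.
    """
    if train_frac < 0 or val_frac < 0 or train_frac + val_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    shuffled = list(examples)               # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)   # seeded, so the split is repeatable

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # whatever remains goes to test
    return train, val, test


# Stand-in records; in practice these would be loaded from user_queries.json.
queries = [{"query": f"example query {i}"} for i in range(10)]
train, val, test = split_dataset(queries)
```

Fixing the seed keeps the three sets stable across optimizer runs, so scores from different prompt candidates are measured on the same held-out queries.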

@enitrat enitrat force-pushed the feat/gepa-optimization branch 8 times, most recently from 4d2bfea to 0a126a3 Compare September 23, 2025 18:33
@enitrat enitrat marked this pull request as ready for review September 23, 2025 18:40
@enitrat enitrat force-pushed the feat/gepa-optimization branch from 0a126a3 to aa5d8b0 Compare September 23, 2025 18:53
@enitrat enitrat force-pushed the feat/gepa-optimization branch from c17ef5c to a857785 Compare September 25, 2025 15:10
@enitrat enitrat changed the title feat: GEPA optimization for query-processing program feat: GEPA optimizers Sep 25, 2025
@enitrat enitrat merged commit 26bd531 into main Sep 25, 2025
3 checks passed