Allow for objective and hybrid pareto frontier tracking #92
Conversation
|
In the upstream DSPy issue I coined 'subscores', which is already in use here and can be a bit confusing. Trying to think of the cleanest way to resolve this.
|
Dear @MatsErdkamp, Thank you so much for this PR. It very nicely complements #100, solves #2, and is a much-needed feature in GEPA! I am planning to merge this after #100; however, I think this PR will conflict significantly with it, since both this and #100 make very large (and necessary) improvements. There are some changes to be made to #100 before merging, but could you please plan to rebase these changes over #100 so that the merge conflicts introduced after it lands can be addressed? I expect the timeline for #100 to be within the next 3 days.
…ive-tracking-to-gepa Add objective-aware Pareto tracking and subscores support
- Reorganized import statements for clarity.
- Renamed `pareto_frontier_type` to `frontier_type` for consistency across the codebase.
- Enhanced assertion messages for better debugging.
- Improved formatting and spacing for better readability in multiple files.
- Updated method signatures and docstrings to reflect changes in parameters.
- Added support for tracking discovery evaluation counts in `GEPAResult` and `GEPAState` classes.
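For readers skimming the thread, the sketch below illustrates the core idea behind objective-aware frontier tracking: instead of keeping only the candidate with the best aggregate score, the frontier retains any candidate that leads on at least one objective. This is a minimal conceptual sketch; the names (`Candidate`, `update_frontier`, `objective_scores`) are hypothetical and are not the GEPA API.

```python
# Minimal conceptual sketch of objective-wise Pareto frontier tracking.
# The names below (Candidate, update_frontier, objective_scores) are
# hypothetical illustrations, not the actual GEPA API.
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    objective_scores: dict[str, float]  # e.g. {"accuracy": 0.8, "brevity": 0.3}


def update_frontier(frontier: dict[str, Candidate], candidate: Candidate) -> dict[str, Candidate]:
    """Keep, for each objective, the candidate with the highest score on that objective."""
    for objective, score in candidate.objective_scores.items():
        incumbent = frontier.get(objective)
        if incumbent is None or score > incumbent.objective_scores[objective]:
            frontier[objective] = candidate
    return frontier


frontier: dict[str, Candidate] = {}
update_frontier(frontier, Candidate("a", {"accuracy": 0.8, "brevity": 0.3}))
update_frontier(frontier, Candidate("b", {"accuracy": 0.6, "brevity": 0.9}))
# Both candidates survive: "a" leads on accuracy, "b" leads on brevity.
print({objective: cand.name for objective, cand in frontier.items()})
```

A "hybrid" frontier presumably combines this with aggregate-score tracking, so well-rounded candidates that lead on no single objective are also retained.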
|
@LakshyAAAgrawal Done, with some interesting preliminary results!
|
@MatsErdkamp Dude, that's incredible! What's the path for using this with current DSPy pipelines? Does it rely on subscores? Would love to test out the implementation!
|
Dear @MatsErdkamp, That is an amazing result. I am excited to get to this PR soon. #100 is finally closing in on completion, and I will shift my focus to this soon after. On that note, there have been several more updates to #100, so we will still need to merge/rebase against that.
|
Dear @MatsErdkamp, #100 has been merged. I will now shift focus towards working on this. Could you please merge from main and fix any conflicts, and then I can start reviewing this?
…roved validation handling.
…jective_scores' for clarity and consistency in handling evaluation metrics.
|
@LakshyAAAgrawal it's been updated!
|
Hi @MatsErdkamp, I am trying to resolve the merge conflicts from main in preparation to merge this. Could you please add some end-to-end tests using both objective and hybrid tracking, so that I can be sure the merge conflict resolution does not cause any issues? It would be especially useful if the tests can be as end-to-end as https://github.com/gepa-ai/gepa/blob/main/tests/test_aime_prompt_optimization/test_aime_prompt_optimize.py.
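As a rough illustration of the kind of parametrized end-to-end check being requested, the toy sketch below compares frontier modes on hard-coded scores rather than exercising the GEPA engine; the actual tests live under tests/ in the repository, and treating "hybrid" as objective-wise plus aggregate-best tracking is an assumption here, not something confirmed by this thread.

```python
# Illustrative sketch only: a parametrized test comparing frontier modes on
# toy, hard-coded scores. The frontier helper is a stand-in, not GEPA internals.
import pytest

CANDIDATES = {
    "a": {"accuracy": 0.9, "brevity": 0.2},
    "b": {"accuracy": 0.2, "brevity": 0.9},
    "c": {"accuracy": 0.7, "brevity": 0.7},
}
OBJECTIVES = ("accuracy", "brevity")


def frontier_members(frontier_type: str) -> set[str]:
    """Return the names of frontier candidates under a given tracking mode (toy version)."""
    members = set()
    # Objective tracking: keep the best candidate per objective.
    for obj in OBJECTIVES:
        members.add(max(CANDIDATES, key=lambda name: CANDIDATES[name][obj]))
    if frontier_type == "hybrid":
        # Assumed here: hybrid additionally keeps the best aggregate-score candidate.
        members.add(max(CANDIDATES, key=lambda name: sum(CANDIDATES[name].values())))
    return members


@pytest.mark.parametrize("frontier_type", ["objective", "hybrid"])
def test_frontier_is_nonempty(frontier_type):
    assert frontier_members(frontier_type)


def test_frontier_types_give_different_results():
    # "c" has the best aggregate score but leads on no single objective,
    # so the two tracking modes disagree on the frontier contents.
    assert frontier_members("objective") != frontier_members("hybrid")
```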
|
@MatsErdkamp, I have tried resolving the merge conflicts and pushed my changes. Could you please implement some tests based on your last commit (e7e3fb5) and add them to the PR?
|
Dear @MatsErdkamp, The PR looks great. Thank you so much for your kind contribution. Could you please check whether my changes are as you'd expect, by comparing the tip of the branch against your last commit (e7e3fb5)? Next, based on that commit itself, could you please implement some e2e tests to ensure my changes didn't break anything? Once the tests are there, I will go ahead and merge!
…ter for scoring function
… sizes and metric parameters
…different results between frontier types
|
This was quite tricky, but it seems I got something working. The tests should also pass now (at least they do locally). @LakshyAAAgrawal
|
Hi @MatsErdkamp, Thank you so much for helping with this! Unfortunately the tests are still failing. Would you be able to take a look?
|
Should work now? Could you approve the workflow please, @LakshyAAAgrawal?
|
Dear @MatsErdkamp, Thank you so much for this PR. It adds a very important feature to GEPA.
|
Thank you for taking the time to correct some stuff, @LakshyAAAgrawal. This was the first time I worked on a project like this; I learned a lot, and there's still lots to learn!
|
I apologize this took so long to merge, but I look forward to future contributions from you, both here and on DSPy/GEPA. Let me know if there's one already open that I should direct my attention to.
|
I do have a draft PR open to handle the DSPy side of this feature: stanfordnlp/dspy#8888. I see it has some merge conflicts now, so I'll fix those and let you know. Any feedback you might already have on the implementation is welcome!

Aims to solve this issue
This is currently downstream from an issue in DSPy. I took a crack at solving it myself, but the implementation will probably change. I will set up the PR for this soon, after some final polish.
Then I made another branch for DSPy that adapts GEPA in DSPy to support the objective frontier. I'm using this to test the implementation.
Quite a mess, with downstream and upstream issues; see this as a potential implementation so we can get a sense of how the full 'subscores' implementation would feel, both in DSPy and GEPA.
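To give a rough sense of how a 'subscores'-style metric could feel from the user's side, a metric might return an aggregate score alongside named per-objective scores for the frontier to track. This is a concept sketch only; it is not the API from this PR or from stanfordnlp/dspy#8888, and all names are illustrative.

```python
# Concept sketch only: a metric that returns an aggregate score plus named
# per-objective subscores. Not the API from this PR or stanfordnlp/dspy#8888;
# the field names and signature are illustrative.
def metric(example, prediction):
    subscores = {
        "correctness": float(prediction["answer"] == example["answer"]),
        "brevity": 1.0 if len(prediction["answer"]) <= 100 else 0.0,
    }
    # The aggregate score drives ordinary optimization; the subscores let an
    # objective or hybrid frontier keep candidates that excel on any one axis.
    return {"score": sum(subscores.values()) / len(subscores), "subscores": subscores}


example = {"answer": "42"}
prediction = {"answer": "42"}
print(metric(example, prediction))  # {'score': 1.0, 'subscores': {...}}
```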