Implement FunSearch in DSPy #253

Open
mattredlon opened this issue Dec 22, 2023 · 4 comments

@mattredlon

A recent paper by DeepMind demonstrated the use of FunSearch, an "evolutionary procedure" pairing an LLM with a systematic evaluator, on problems in extremal combinatorics (cap set) and algorithm design (online bin packing). The paper, while excellent and containing interesting work, went viral due to sensational media positioning such as: "DeepMind AI outdoes human mathematicians on unsolved problem" and the resulting (predictable) AGI/ASI grifting.

While reading this paper I couldn't help but think of DSPy. I continue to believe this framework is "skating to where the puck will be" in terms of building truly useful, trustworthy, maintainable, LM-based systems, at least until a dramatic breakthrough in LM capabilities (e.g. reasoning, world model, etc.). It recognizes the inherent variability/unreliability and limitations of LMs today and provides useful guardrails around them in production systems. This was further enhanced with Suggest and Assert. It also enables us to take advantage of the LM's ability to generate novel solutions or approaches - a core element of FunSearch.

I suspect replicating FunSearch in DSPy would enable it to address an even larger class of real-world scientific and business problems. As I understand it, there are at least three areas that may require additional research/development to implement FunSearch (red in diagram):

  • Signatures would need the ability to use an “immutable” prompt “skeleton”. This is how DeepMind handled it at least, with a simple part of the prompt (a subset of the Python code called the “program”) being varied in each prompt.
  • We would need to dust off dspy.PythonInterpreter to execute the latest code variants.
  • Assuming my diagram is directionally correct, I'm unsure if the teleprompter is where we would want to execute and score each variant and then persist the “program” along with its evaluation score. We would perhaps then implement our genetic algorithm / island strategy as a retriever to select the example "programs" to include from the module, but someone could tell me this belongs in teleprompter as well.
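The first two items above can be sketched in plain Python without any DSPy calls. This is only an illustration under stated assumptions: `SKELETON`, `solve`, `priority`, and `evaluate` are hypothetical names, the bin-packing skeleton is a toy and not the one from the paper, and `exec` is used here without any sandboxing.

```python
import textwrap

# Hypothetical fixed "skeleton": this part would stay constant in every prompt.
# Only the `priority` function (the "program") is varied between candidates.
SKELETON = textwrap.dedent('''
    def solve(items, capacity):
        """Online bin packing that places each item using `priority`."""
        bins = []  # remaining free space per open bin
        for item in items:
            fits = [i for i, free in enumerate(bins) if free >= item]
            if fits:
                best = max(fits, key=lambda i: priority(item, bins[i]))
                bins[best] -= item
            else:
                bins.append(capacity - item)
        return len(bins)
''')


def evaluate(program_src, instances, capacity=10):
    """Run one candidate `priority` inside the skeleton and score it.

    Returns the negated total bin count (higher is better),
    or None if the candidate fails to compile or run.
    """
    namespace = {}
    try:
        exec(program_src + SKELETON, namespace)  # assumption: trusted code, no sandbox
        return -sum(namespace["solve"](items, capacity) for items in instances)
    except Exception:
        return None


# Three candidate "programs" an LM might propose (written by hand here).
best_fit = "def priority(item, free):\n    return -(free - item)\n"
worst_fit = "def priority(item, free):\n    return free - item\n"
broken = "def priority(item, free):\n    return undefined_name\n"

instances = [[4, 7, 3, 6]]
scores = {name: evaluate(src, instances)
          for name, src in [("best_fit", best_fit),
                            ("worst_fit", worst_fit),
                            ("broken", broken)]}
print(scores)
```

Returning `None` for a failing candidate keeps a bad generation from stopping the search; a real setup would also need timeouts and isolation before running LM-written code.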

[Image: dspy_flowchart]

I’ll share additional thoughts as I proceed. I only recently dove into the DSPy code base, so any feedback from the community on my understanding (or lack thereof) and guidance on research would be greatly appreciated!

@darinkishore
Collaborator

darinkishore commented Dec 23, 2023

@mattredlon

Spitballing, but take a look at this issue/PR combo. Would this be what you were thinking about for item 1? (Signatures would need the ability to use an “immutable” prompt “skeleton”.)

darinkishore#68

darinkishore#69

@mattredlon
Author

These are excellent, @darinkishore! I am not yet fluent enough in the DSPy code base or its design principles to be confident in my evaluation of your issue/PR, so please be patient with me. There was one element where I wasn't sure whether it should be addressed in the Signature, in the Module (perhaps leveraging a new genetic/islands retriever), or in the Teleprompter/compiler. You included it as:

Specifications
Sample Signatures: Create a method to sample k signatures based on certain logic, akin to DeepMind's priority functions sorting and selection.

Action Items
Design and implement the sampling of signature variations and the prompt construction methodology.

Are you envisioning the Signature itself being responsible for sampling k "programs" (to use FunSearch vernacular) each hop? If so, would you envision the Module as the location for executing the Python generated via the latest Signature/"program" variant? If so, where in the process would we persist the latest "program" and what would the role of the Teleprompter/compiler be?
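Wherever the sampling ends up living, the "sample k programs each hop" step could look roughly like the sketch below. It loosely follows the island idea as I understand it from the paper (pick one island, favour higher-scoring programs, periodically reseed the weakest islands). `IslandDatabase` and all of its methods are hypothetical names, not DSPy or FunSearch APIs, and the softmax weighting is an assumption.

```python
import math
import random


class IslandDatabase:
    """Hypothetical store of (program, score) pairs split across islands."""

    def __init__(self, num_islands, seed_program, seed_score, seed=0):
        self.rng = random.Random(seed)
        self.islands = [[(seed_program, seed_score)] for _ in range(num_islands)]

    def register(self, island_id, program, score):
        """Persist an evaluated program on the island it was sampled from."""
        self.islands[island_id].append((program, score))

    def sample(self, k, temperature=1.0):
        """Pick one island, then draw up to k programs, favouring high scores."""
        island_id = self.rng.randrange(len(self.islands))
        pool = self.islands[island_id]
        top = max(score for _, score in pool)
        # Softmax-style weights; subtracting `top` keeps exp() from overflowing.
        weights = [math.exp((score - top) / temperature) for _, score in pool]
        # Note: choices() samples with replacement, so duplicates are possible.
        picks = self.rng.choices(pool, weights=weights, k=min(k, len(pool)))
        # Ascending order, so the strongest example comes last in the prompt.
        return island_id, sorted(picks, key=lambda pair: pair[1])

    def reset_worst_half(self):
        """Reseed the weaker half of the islands from surviving islands."""
        def best_score(island):
            return max(score for _, score in island)

        order = sorted(range(len(self.islands)),
                       key=lambda i: best_score(self.islands[i]))
        half = len(order) // 2
        for i in order[:half]:
            donor = self.islands[self.rng.choice(order[half:])]
            self.islands[i] = [max(donor, key=lambda pair: pair[1])]


db = IslandDatabase(num_islands=4, seed_program="v0", seed_score=0.0)
db.register(0, "v1", 5.0)
island_id, examples = db.sample(k=2)
db.reset_worst_half()
```

Keeping `island_id` in the return value lets the caller register the new variant on the same island it was bred from.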

@okhat
Collaborator

okhat commented Jan 4, 2024

Hey @mattredlon,

> teleprompter is where we would want to execute and score each variant and then persist the “program” along with its evaluation score. We would perhaps then implement our genetic algorithm / island strategy as a retriever to select the example "programs" to include from the module, but someone could tell me this belongs in teleprompter as well.

With the big caveat that I (i) read your post carefully but (ii) have not yet read the FunSearch paper or any content about it, unfortunately, I think this all belongs in a new teleprompter.

We'll rename teleprompters to optimizers so let me start using that term now. Basically I can imagine a dspy.optimize.FunSearch optimizer. It will take FunSearch(**general_config_and_hyperparams) and will have .compile(program, trainset) as usual.
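To make the shape of that interface concrete, here is a minimal, self-contained sketch that does not import DSPy. The `FunSearch` class below is hypothetical (no such optimizer exists in DSPy as far as this thread shows); only the `FunSearch(**config)` constructor and the `.compile(program, trainset)` call follow the convention described above. The `propose` and `metric` callables are assumptions standing in for the LM call and the evaluator, and the "program" is reduced to a single number to keep the example runnable.

```python
import random


class FunSearch:
    """Hypothetical optimizer sketch; not an existing DSPy class."""

    def __init__(self, propose, metric, iterations=20, seed=0):
        self.propose = propose        # stand-in for the LM: (parents, rng) -> candidate
        self.metric = metric          # evaluator: (candidate, example) -> float
        self.iterations = iterations
        self.rng = random.Random(seed)

    def _score(self, candidate, trainset):
        return sum(self.metric(candidate, ex) for ex in trainset) / len(trainset)

    def compile(self, program, trainset):
        """Evolve `program` against `trainset` and return the best variant seen."""
        population = [(self._score(program, trainset), program)]
        for _ in range(self.iterations):
            parents = self.rng.sample(population, k=min(2, len(population)))
            candidate = self.propose([p for _, p in parents], self.rng)
            population.append((self._score(candidate, trainset), candidate))
        return max(population, key=lambda pair: pair[0])[1]


# Toy problem: the "program" is a weight w for f(x) = w * x, and the target is y = 3x.
def propose(parents, rng):
    return sum(parents) / len(parents) + rng.gauss(0, 1.0)


def metric(w, example):
    x, y = example
    return -abs(w * x - y)


trainset = [(1, 3), (2, 6), (3, 9)]
optimizer = FunSearch(propose, metric, iterations=50)
best_w = optimizer.compile(0.0, trainset)
```

Because the starting program stays in the population, the returned variant never scores worse on `trainset` than the one passed in.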

In terms of the original design, Signatures in DSPy are stateless and immutable.* The optimized parts are held inside the modules, basically/mainly dspy.Predict. (Other modules generally use dspy.Predict in different ways.)

The main things that can be optimized are typically instructions and demonstrations (or the LM weights if we're finetuning). I assume FunSearch is more about the instructions?

Footnote: Even though I think signatures should ultimately be stateless and immutable in principle, we took some shortcuts in SignatureOptimizer, which does mutate the instruction of the signature. That is probably not the right long-term approach.
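The split between an immutable signature and a module that holds the optimized state can be illustrated with plain dataclasses. This is a sketch of the design idea only: `ToySignature` and `ToyPredict` are hypothetical stand-ins and do not mirror the real `dspy.Signature` or `dspy.Predict` implementations.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass(frozen=True)
class ToySignature:
    """Stateless, immutable description of a task (hypothetical stand-in)."""
    instructions: str
    inputs: tuple
    outputs: tuple


@dataclass
class ToyPredict:
    """Module that owns the optimizable state: instructions and demos."""
    signature: ToySignature
    demos: list = field(default_factory=list)

    def with_instructions(self, text):
        # Derive a new signature instead of mutating the shared one.
        new_signature = replace(self.signature, instructions=text)
        return ToyPredict(new_signature, list(self.demos))


base = ToySignature("Answer the question.", ("question",), ("answer",))
predictor = ToyPredict(base)
tuned = predictor.with_instructions("Answer concisely, citing the context.")
tuned.demos.append({"question": "2 + 2?", "answer": "4"})

try:
    base.instructions = "mutated"  # rejected because the dataclass is frozen
except FrozenInstanceError:
    pass
```

An optimizer written this way returns a new predictor per candidate, so the original signature can be shared safely across variants.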

@kushinm

kushinm commented Apr 5, 2024

Thanks for starting this thread, @mattredlon!
Have you or others made any progress on this front (perhaps building on the suggestions made by @okhat above)?
