Add an infilling DSL #182

Closed
rlouf opened this issue Jul 12, 2023 · 3 comments
Labels: enhancement, help wanted, text (Linked to text generation), transformers (Linked to the `transformers` integration)

Comments

@rlouf (Member) commented Jul 12, 2023

Workflows with LLMs often involve calling models recursively, where at each step we concatenate the result of the previous call to the prompt. Consider the following example:

import outlines.models as models
import outlines.text as text

@text.prompt
def determine_goal(question):
    """{{question}}

    In order to solve this problem, we will analyze each of the options and determine
    """

@text.prompt
def solve(memory):
    """{{memory}}. Let's begin."""

# `model_name` and `question` are defined elsewhere.
complete = models.text_completion.openai(model_name)

# Manually thread the generated text back into the next prompt.
prompt = determine_goal(question)
answer = complete(prompt, stop_at=["."])
prompt = solve(prompt + answer)
answer = complete(prompt, stop_at=["."])
completed = prompt + answer

Having to do the concatenation manually quickly becomes cumbersome. Worse, we lose the KV cache across generations. I therefore suggest adding a thin infilling DSL to outlines with the new API:

import outlines.text as text
import outlines.models as models
import outlines.generate as generate

@text.infilling
def answer(question, goal, answer):
    """{{ question }}

    In order to solve this problem, we will analyze all of the options and determine [[ goal ]].

    Let's begin. [[ answer ]]."""


model = models.transformers("gpt2")
continuation = generate.continuation(model)
result = answer("Where are Apple's headquarters located?", continuation, continuation)
print(result["answer"])

A few quick points:

  1. We first evaluate the template with Jinja2, then decompose the rendered template into a succession of literal strings and model calls, and loop over them (a rough sketch of this decomposition is given below);
  2. Calling the function returns a dictionary indexed by the placeholder names;
  3. We infer the stop_at kwarg from the template;
  4. This needs to be vectorized (for a ToT implementation, for instance).
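
To make point 1 concrete, here is a rough sketch of how the decomposition could work. Everything in it is illustrative: the helper name, the generator call signature generator(prompt, stop_at=...), and the heuristic for inferring stop_at are assumptions, not the actual outlines implementation.

import re

# Illustrative sketch only: split the rendered template into literal text
# segments and named [[ slot ]] placeholders, then alternate between
# appending text and calling the corresponding generator.
SLOT = re.compile(r"\[\[\s*(\w+)\s*\]\]")

def run_infilling(rendered, generators):
    """Walk the template left to right, calling a generator for each slot."""
    results, prompt, cursor = {}, "", 0
    for match in SLOT.finditer(rendered):
        prompt += rendered[cursor:match.start()]  # literal text before the slot
        name = match.group(1)
        # Infer `stop_at` from the character that immediately follows the slot.
        next_char = rendered[match.end():match.end() + 1]
        stop_at = [next_char] if next_char.strip() else None
        generated = generators[name](prompt, stop_at=stop_at)
        results[name] = generated
        prompt += generated
        cursor = match.end()
    results["completed"] = prompt + rendered[cursor:]
    return results

Since the prompt only ever grows by appending, this structure should also make it straightforward to keep the model's KV cache alive between slots.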
@freckletonj commented

I'll tie in one other must-have feature, imo: #305

This is where each infilled generation is accessible via a key, at the end of generation, like in guidance:

import guidance

# Handlebars-style guidance program (pre-1.0 API); the LLM backend is configured separately.
prompt = '''
Here's one sentence: {{gen "SENTENCE_1"}}
Now, another sentence: {{gen "SENTENCE_2"}}
'''
out = guidance(prompt)()

print(out['SENTENCE_1'])
# <prints what the LLM generated>

print(out['SENTENCE_2'])
# <prints the other sentence>

As a separate question, could we optionally return and reuse a KV cache? This would make your hand-rolled concatenation example mostly work already, aside from the nice syntactic sugar you suggest above.
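
For reference, here is a minimal sketch of what reusing the KV cache looks like with plain transformers, outside of outlines: the cache built for the prompt is fed back into subsequent forward passes, so only the new token is processed at each step. The model name and the greedy loop are just for illustration.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch with plain `transformers`, not the outlines API.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt_ids = tokenizer("In order to solve this problem, we", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(prompt_ids, use_cache=True)
past = out.past_key_values  # cache covering the whole prompt

# Greedy-decode a few tokens, extending the cache instead of re-encoding the prompt.
next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
generated = [next_id]
for _ in range(10):
    with torch.no_grad():
        out = model(next_id, past_key_values=past, use_cache=True)
    past = out.past_key_values
    next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
    generated.append(next_id)

print(tokenizer.decode(torch.cat(generated, dim=1)[0]))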

@rlouf (Member, Author) commented Sep 30, 2023

As a separate question, could we optionally return and reuse a KV cache? This would make your hand-rolled concatenation example mostly work already, aside from the nice syntactic sugar you suggest above.

We opened #190 to discuss this. Does the interface there make sense? This is indeed an important feature that we'll re-prioritise.

rlouf added the transformers (Linked to the `transformers` integration) label on Jan 26, 2024
@rlouf (Member, Author) commented Feb 16, 2024

DSLs generally suck, so closing this in favor of #667, which would provide the same functionality and more.

rlouf closed this as not planned on Feb 16, 2024