
Edit: Improve response format reliability #1790

Closed
umpox opened this issue Nov 17, 2023 · 0 comments · Fixed by #1892

umpox commented Nov 17, 2023

Description

Because we use a chat model to produce edits, the LLM often produces non-code responses, or responses that don't match what we want to achieve.

We have observed:

  • Markdown output in fixups
  • Incorrect usage of XML tags in fixups (e.g. <problemCode>)
  • Chatty explanations in fixups (non-code text)
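
For illustration, these failure modes could be flagged with a lightweight check before an edit is applied. This is a hypothetical TypeScript sketch; the function name and patterns are assumptions, not Cody's actual code:

```ts
// Hypothetical check for the failure modes listed above; the patterns
// and function name are illustrative, not taken from Cody.
function looksLikeNonCodeResponse(response: string): boolean {
    const trimmed = response.trim()
    // Markdown output: fenced code blocks should not appear in a fixup
    const hasMarkdownFence = trimmed.includes('```')
    // Incorrect XML tag usage, e.g. a leaked <problemCode> tag
    const hasLeakedTag = /<\/?problemCode>/i.test(trimmed)
    // Chatty explanations often open with prose rather than code
    const startsWithProse = /^(here is|here's|sure|certainly)\b/i.test(trimmed)
    return hasMarkdownFence || hasLeakedTag || startsWithProse
}
```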

We should come up with a better way to ensure that we always produce code responses.

  • Tweak the initial Cody preamble into an alternative that is focused on edits (e.g. one that doesn't mention using Markdown)
  • Force Cody to start the response with a specified XML tag when using a model that supports it (Claude); see the sketch after this list
  • Improve content sanitisation to ensure tags don't leak into code
  • Consider using a code completion model that supports instruction following (Code Llama?)
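
As a rough illustration of the second idea, the transcript could end with a partially written assistant turn so the model continues from the expected tag. This is a hedged sketch; the `Message` shape, the tag name, and the `buildEditMessages` helper are assumptions for illustration, not Cody's actual API:

```ts
// A minimal sketch of "putting words into the LLM's mouth" with a
// Claude-style transcript. The tag name, Message shape, and
// buildEditMessages helper are illustrative, not Cody's actual code.
interface Message {
    speaker: 'human' | 'assistant'
    text: string
}

const RESPONSE_TAG = 'CODE5711' // numbered so it cannot be mistaken for an HTML tag

function buildEditMessages(instruction: string, selectedCode: string): Message[] {
    return [
        {
            speaker: 'human',
            text:
                `${instruction}\n` +
                `Rewrite the code inside the <${RESPONSE_TAG}> tags and reply ` +
                `with only code:\n<${RESPONSE_TAG}>${selectedCode}</${RESPONSE_TAG}>`,
        },
        // Pre-seed the assistant turn with the opening tag. The model treats
        // this as text it has already written and continues from it, so the
        // completion starts with code instead of a chatty preamble.
        { speaker: 'assistant', text: `<${RESPONSE_TAG}>` },
    ]
}
```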
@umpox umpox self-assigned this Nov 17, 2023
umpox added a commit that referenced this issue Nov 28, 2023
closes #1790

## Description

This PR improves edit consistency by:
- "Putting words into the LLM's mouth". As we're using Claude we can
start the transcript with the expected tag so we're more likely to get a
valid output.
- For "Add" intents, we also include the preceding code as part of the
injected transcript
- Using non-HTML XML tags. Using XML tags to mark the different parts of
the prompt shows strong improvements (I tried removing them all apart
from the main one), but I think the problem is that the LLM can become
confused as to whether the tags are HTML code, especially when making
JS/TS edits. We now add numbers to the tags, which would be invalid for
an HTML tag. This seems to help steer the response a lot; see the
sanitisation sketch after this list.
- Various small prompt tweaks
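
A minimal sketch of the sanitisation side, assuming a numbered tag like `CODE5711` wraps the response (the tag name and the `sanitizeEditResponse` helper are illustrative, not the actual implementation):

```ts
// Hedged sketch of a sanitisation step: strip the numbered tags from the
// model's response so they never leak into the edited file.
const RESPONSE_TAG = 'CODE5711'
const OPEN_TAG = `<${RESPONSE_TAG}>`
const CLOSE_TAG = `</${RESPONSE_TAG}>`

function sanitizeEditResponse(raw: string): string {
    // We injected the opening tag ourselves, but the model sometimes
    // echoes it back; remove every occurrence.
    let text = raw.split(OPEN_TAG).join('')
    // Anything after the closing tag is trailing chatter; drop it along
    // with the tag itself.
    const end = text.indexOf(CLOSE_TAG)
    if (end !== -1) {
        text = text.slice(0, end)
    }
    return text.trim()
}
```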

## Test plan

Create fixups:
- Edits from selection
- Adding code from no selection
- Fixing error diagnostics
- Doc command, and code action commands to document symbols
