feat: add "regenerate" code lens for edits #1383
Conversation
This seems great. It's basically the "Regenerate" button from ChatGPT but applied to code. 👍🏼 on the UX. I left one comment on the implementation, and I'm wondering whether we can add a tooltip to show that we take errors into account.
```ts
const diagnostics = vscode.languages.getDiagnostics(task.fixupFile.uri)
// Find the first error diagnostic that is within the range of the task
const range = task.selectionRange
const error = diagnostics.find(diagnostic => range.contains(diagnostic.range))
```
What if there's multiple errors?
they will just have to keep retrying 🏃‍♀️
Subsequent errors are sometimes caused by the first identified error, and I'm also not sure how well Cody can handle multiple errors at once, so I thought we could start with the first error for now. Happy to add more, or remove this, if that doesn't make sense.
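If we do later want every error rather than just the first, here is a minimal sketch of that alternative. The types are stand-ins so it runs outside the editor, and `errorsInRange` is a hypothetical helper, not existing code:

```typescript
// Minimal stand-ins for the vscode types involved; a sketch only, the real
// extension would use vscode.Range and vscode.Diagnostic directly.
interface Pos { line: number; character: number }
interface Rng { start: Pos; end: Pos }
interface Diag { range: Rng; message: string }

// True when `inner` lies entirely within `outer`, the same check that
// vscode.Range.contains performs.
function contains(outer: Rng, inner: Rng): boolean {
    const beforeOrEqual = (a: Pos, b: Pos) =>
        a.line < b.line || (a.line === b.line && a.character <= b.character)
    return beforeOrEqual(outer.start, inner.start) && beforeOrEqual(inner.end, outer.end)
}

// Collect every diagnostic inside the task range instead of only the first,
// so a retry prompt could mention all of them.
function errorsInRange(diagnostics: Diag[], range: Rng): Diag[] {
    return diagnostics.filter(d => contains(range, d.range))
}
```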
```ts
function getRetryLens(codeLensRange: vscode.Range, id: string): vscode.CodeLens {
    const lens = new vscode.CodeLens(codeLensRange)
    lens.command = {
```
Is it possible to add a tooltip here?
- If no errors in range: "Retry"
- If errors in range: "Retry and attempt to fix errors"
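A tooltip like that seems doable, since `vscode.Command` has an optional `tooltip` field. A minimal sketch of just the text selection (a pure function, with `retryTooltip` as an illustrative name, so it runs without the editor):

```typescript
// Sketch only: pick the suggested tooltip text based on whether any error
// diagnostics fall inside the fixup range. In getRetryLens this string
// would be assigned to lens.command.tooltip.
function retryTooltip(hasErrorsInRange: boolean): string {
    return hasErrorsInRange ? 'Retry and attempt to fix errors' : 'Retry'
}
```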
Could we call the first one "Regenerate" like ChatGPT, and re-do the original task, not feed its output back as input?
I am speculating. I would love to see real experiences using this.
```ts
// Find the first error diagnostic that is within the range of the task
const range = task.selectionRange
const error = diagnostics.find(diagnostic => range.contains(diagnostic.range))
const instruction = error
    ? `${task.instruction}. Start from the beginning and think step-by-step to make sure your code addresses the following error: ${error.message}`
    : task.instruction
```
Should we update this to try to fix all errors that are within range?
We do this for the code action and it seems to work well so far:
````ts
private getCodeActionInstruction = (diagnostics: vscode.Diagnostic[]): string => {
    const intent: FixupIntent = 'edit'
    return `/${intent} Fix the following error${diagnostics.length > 1 ? 's' : ''}: ${diagnostics
        .map(({ message }) => `\`\`\`${message}\`\`\``)
        .join('\n')}`
}
````
@umpox IIRC those errors are all the diagnostics at the point you invoked the menu, which could be much narrower than a fixup range.

I recommend you look at `diagnostic.message` for all the diagnostics in a wider range. Without knowing where they are, it would be difficult for a human to succeed at the task, let alone an LLM. Try it! The error messages are often much less useful than you think. They're things like `";" expected` or `A type predicate is only allowed in return type position for functions and methods` (when I mis-spelled `in` as `is` in TypeScript just now...). Sure, if it is pointing at "is" you have a chance to work it out, but in a dozen lines of code it is pretty tough.

I think we can address this problem, but we need to work out a format for communicating where these error messages are to the LLM. Probably something involving tags (but the errors can overlap in general, so it's a huge mess; maybe we need some repetition in that case...)
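One illustrative shape for such a location-bearing format (the `<error>` tag and the `describeDiagnostics` helper are made up for this sketch, and overlapping errors are not handled):

```typescript
// Stand-in types so the sketch runs without the vscode module.
interface Pos { line: number; character: number }
interface Rng { start: Pos; end: Pos }
interface Diag { range: Rng; message: string }

// Render each diagnostic with its (1-based) start line so the LLM knows
// where the error is, not just what it says.
function describeDiagnostics(diagnostics: Diag[]): string {
    return diagnostics
        .map(d => `<error line="${d.range.start.line + 1}">${d.message}</error>`)
        .join('\n')
}
```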
```ts
}
task.spinCount++
// Get diagnostics from vscode api
const diagnostics = vscode.languages.getDiagnostics(task.fixupFile.uri)
```
I think it would be awesome if we only showed this code lens when errors are present in the fixup. Otherwise users might assume that "Retry" just repeats the previous message and Cody gives a different response.
WDYT about:
- Change the code lens text to "Fix errors"
- Only show when there is at least 1 error within the fixup range
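A sketch of that gating, under the assumption we already counted the errors in range (`fixErrorsLensTitle` is an illustrative name; the real code would build or skip a `vscode.CodeLens` rather than return a label):

```typescript
// Return the "Fix errors" lens title only when at least one error is inside
// the fixup range; null means "don't show the lens at all".
function fixErrorsLensTitle(errorCount: number): string | null {
    return errorCount > 0 ? 'Fix errors' : null
}
```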
ChatGPT does have a "regenerate" button. I could see myself thinking, meh, I want a new one and clicking "Regenerate." But this is different, it edits the edit, so maybe that is something else.
> But this is different, it edits the edit, so maybe that is something else.
It doesn't do that right now 😅 This is just the same errors that the users are getting before they submit the fixup request, see reply to the comment below 👀
Some thoughts inline.
I think this is a good direction. ChatGPT has a regenerate button and people like it. I often do a fixup to a fixup if the result wasn't quite right (so I give it instructions to get it on the right track... WDYT about supporting that?)
On the other hand...
- What if we made it easier to fix any errors with Cody? Then would you care if the errors came from a fixup or not?
- If we do this, do we want/need to have "retry" or "regenerate" buttons on every Cody interaction? Does our UX scale to that? (I don't feel strongly either way. Just curious/nervous.)
```ts
const diagnostics = vscode.languages.getDiagnostics(task.fixupFile.uri)
// Find the first error diagnostic that is within the range of the task
const range = task.selectionRange
const error = diagnostics.find(diagnostic => range.contains(diagnostic.range))
```
What's the user experience here? What is it like using this?
Really, if you had a set of errors x, did a fixup, and then had a set of errors y, then y − x is something you could demand Cody deal with. Those errors might not show up in the range Cody edited; they could be anywhere. Slightly less ambitious: you could use `intersection` being non-empty instead of `contains`, so we'll try to handle errors overlapping the start/end.
On the other hand, maybe Cody is just an error fixing machine and you don't "blame" Cody for errors which occur after it has done a fixup?
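A sketch of the overlap test mentioned above, with stand-in types (the real code could simply check `range.intersection(diagnostic.range) !== undefined` on `vscode.Range`):

```typescript
// Stand-in types so the sketch runs without the vscode module.
interface Pos { line: number; character: number }
interface Rng { start: Pos; end: Pos }

// True when the two ranges overlap or touch, mirroring what a defined
// vscode.Range.intersection result means. Unlike contains, this also
// matches diagnostics that straddle the start or end of the fixup range.
function intersects(a: Rng, b: Rng): boolean {
    const cmp = (x: Pos, y: Pos) =>
        x.line !== y.line ? x.line - y.line : x.character - y.character
    return cmp(a.start, b.end) <= 0 && cmp(b.start, a.end) <= 0
}
```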
Right now the error we are getting here will be the same error as when the user started the initial fixup task, because we are not applying fixups automatically on receipt yet. This is why I was thinking: if a user clicks retry, that means they are not happy with the fixup output that has not been applied, so if we detect any existing error message, we can try to include it and see if that produces a better output for them.
If this doesn't make sense, I'm happy to remove the error from the retry instruction until we can confirm it is actually helpful for retry :)
> Right now the error we are getting here will be the same error as when the user started the initial fixup task, because we are not applying fixups automatically on receipt yet.
👍 Gotcha. I was missing that fact.
> This is why I was thinking: if a user clicks retry, that means they are not happy with the fixup output that has not been applied, so if we detect any existing error message, we can try to include it and see if that produces a better output for them.
Interesting. So basically we don't know whether the user is trying to fix errors, and the first time we don't send the error, but if the user hits retry they must be unhappy and we will try to send it.
Can we measure what users are actually doing, and try to fit with that? Maybe we should always send the error, or never send it, or send it the first time and fall back to not sending it on retries... It would be great to know a bit more about what's actually happening. Whether through artificial tests, user studies, instrumentation, experimentation, etc.
```ts
function getRetryLens(codeLensRange: vscode.Range, id: string): vscode.CodeLens {
    const lens = new vscode.CodeLens(codeLensRange)
    lens.command = {
```
What if retry gave you a quickpick input pre-filled with the last instruction, for you to try again? I guess it depends, is this like:

1. A "free spin": try again with the same inputs?
2. Edit the prompt and retry with the original input?
3. Provide a correctional follow-up instruction?

I feel like (3) is most common for dev in ChatGPT? I feel like allowing a correctional input is most powerful, so more like (3), but labeled "Retry", and also make it clear we're sending error messages too. Is it possible to have a quickpick with an input and a checkbox? We could have an empty instruction input and a pre-checked checkbox labelled "Include problem error messages (5)" (or whatever the correct VS Code terminology is)
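A sketch of the pre-filled input side of that idea. `vscode.window.showInputBox` accepts a `value` to pre-fill, but a plain input box has no checkbox, so here an error-count hint in the prompt stands in for it; `retryInputOptions` is an illustrative name, not existing code:

```typescript
// Build options for a retry input box pre-filled with the last instruction.
// The error-count hint in the prompt stands in for the suggested checkbox,
// which a plain input box cannot render.
function retryInputOptions(lastInstruction: string, errorCount: number): { value: string; prompt: string } {
    const suffix =
        errorCount > 0
            ? ` (${errorCount} problem error message${errorCount === 1 ? '' : 's'} will be included)`
            : ''
    return {
        value: lastInstruction,
        prompt: `Edit the instruction and retry${suffix}`,
    }
}
```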
I was thinking: at what point does a user want to hit retry?
I can't think of a situation where a user does not want to let the LLM know about an error in the selection range. Plus, this error exists before Cody has generated an output, because when a fixup is applied, the retry button goes away. I think this would make more sense when we have all fixups auto-applied by default, and then let users choose which other actions they want to perform 😆 wdyt?
But this is just doing what `/edit` or quick fix are doing?
We should understand user behavior here. For me, in Cody, I just start a second fixup at that point. Does that need its own UI? I sometimes do (2) (edit the prompt and retry with the original input). That feels a bit clunky because I have to hit undo and recompose my selection and instruction, so having the original instruction around to edit would be handy. I never do (1) ("free spin"/try again with the same inputs) unless I am messing around with models and temperature and want to get a gut feel for the "range". I don't think I have ever hit the Regenerate button in ChatGPT for text. But this is just me... what do people do?
@abeatrix, my feeling here is we should slow down and look at a few things:
At this point I'd be fine with anecdata from Sourcegraph teammates about these. It would be great if that included input from someone using not-TypeScript, not-Go in addition. Then we need to close the loop between buffing the LLM performance, giving users what they need, and narrowing the gap between how the UX presents and what people think it will do. I think this is a great direction, but I'm uncertain about those three bullets.
@dominiccooney @toolmantim I've updated the PR to replace `Retry` with `Regenerate`.

Right now the expected behavior is to ask Cody to regenerate a new response using the same instruction. For example:

Initial response from Cody:

Response from Cody after hitting `Regenerate`:

(Cody just straight up copied TS's lyrics lol)

Loom: demo_.regenerate.code.lens.in._edit.mp4
Adding missed changelog entry for [pull/1383](#1383)

## Test plan

Changelog update
part of #1359
feat: add retry code lens for fixup tasks
This adds a new "regenerate" code lens option for edit/fixup tasks. When clicked, it will try to regenerate a response using the same fixup instruction.
Changes:

- add `regenerate` code lens
- return `regenerate` lens in returned array

Test plan:

- `/edit add comments inline`
- click the `Retry` code lens again

Demo:
demo_.regenerate.code.lens.in._edit.mp4
This is expected to work the same when Tom has made the changes so that all Fixup tasks are applied automatically by default.