
feat: add "regenerate" code lens for edits #1383

Merged
merged 7 commits into main from bee/fixup-retry
Oct 18, 2023
Conversation

Contributor
@abeatrix abeatrix commented Oct 12, 2023

part of #1359

feat: add retry code lens for fixup tasks

This adds a new "regenerate" code lens option for edit/fixup tasks. When clicked, it will try to regenerate a response using the same fixup instruction.

Changes:

  • Register the new "cody.fixup.codelens.regenerate" command
  • Add getRegenerateLens() to generate the regenerate code lens
  • In getLensesForTask(), include the regenerate lens in the returned array
  • Add a retry() method to handle the retry click (see the sketch after this list):
    • Discards the previous task and code lens
    • Executes the edit-code command with the same instruction to create a new task for regenerating code

Test plan

  1. Select a section of code
  2. Run /edit command: /edit add comments inline
  3. Check the diff to see what was edited
  4. Click the Retry code lens
  5. Check the diff again; you should see that Cody updated the comments


Demo

demo_.regenerate.code.lens.in._edit.mp4

This is expected to work the same once Tom's changes land and all fixup tasks are applied automatically by default.

@abeatrix abeatrix requested review from toolmantim and a team October 12, 2023 19:32
@abeatrix abeatrix marked this pull request as ready for review October 12, 2023 19:33
Contributor

@toolmantim toolmantim left a comment

This seems great. It's basically the "Regenerate" button from ChatGPT but applied to code. 👍🏼 on the UX. I left one comment on the impl, and I'm wondering if we can add a tooltip to show that we take errors into account.

const diagnostics = vscode.languages.getDiagnostics(task.fixupFile.uri)
// Find the first error diagnostic that is within the range of the task
const range = task.selectionRange
const error = diagnostics.find(diagnostic => range.contains(diagnostic.range))
Contributor

What if there are multiple errors?

Contributor Author

They will just have to keep retrying 🏃‍♀️
Subsequent errors are sometimes caused by the first identified error, and I'm also not sure how well Cody can handle multiple errors at once, so I thought we could start with the first error for now. Happy to add more, or to remove this, if it doesn't make sense.


function getRetryLens(codeLensRange: vscode.Range, id: string): vscode.CodeLens {
    const lens = new vscode.CodeLens(codeLensRange)
    lens.command = {
Contributor

Is it possible to add a tooltip here?

  • If no errors in range: "Retry"
  • If errors in range: "Retry and attempt to fix errors"
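Sketched under the assumption that the lens is built roughly as above — the errorsInRange parameter and the command id are hypothetical, not PR code:

import * as vscode from 'vscode'

// Sketch: pick the tooltip based on whether error diagnostics overlap the
// task range. errorsInRange is a hypothetical parameter for illustration.
function getRetryLens(codeLensRange: vscode.Range, id: string, errorsInRange: boolean): vscode.CodeLens {
    const lens = new vscode.CodeLens(codeLensRange)
    lens.command = {
        command: 'cody.fixup.codelens.retry', // assumed command id
        title: 'Retry',
        // vscode.Command supports an optional tooltip shown on hover
        tooltip: errorsInRange ? 'Retry and attempt to fix errors' : 'Retry',
        arguments: [id],
    }
    return lens
}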

Contributor

Could we call the first one "Regenerate" like ChatGPT, and re-do the original task, not feed its output back as input?

I am speculating. I would love to see real experiences using this.

Comment on lines 544 to 549
// Find the first error diagnostic that is within the range of the task
const range = task.selectionRange
const error = diagnostics.find(diagnostic => range.contains(diagnostic.range))
const instruction = error
    ? `${task.instruction}. Start from the beginning and think step-by-step to make sure your code addresses the following error: ${error.message}`
    : task.instruction
Contributor

Should we update this to try to fix all errors that are within range?

We do this for the code action and it seems to work well so far:

    private getCodeActionInstruction = (diagnostics: vscode.Diagnostic[]): string => {
        const intent: FixupIntent = 'edit'
        return `/${intent} Fix the following error${diagnostics.length > 1 ? 's' : ''}: ${diagnostics
            .map(({ message }) => `\`\`\`${message}\`\`\``)
            .join('\n')}`
    }

Contributor

@umpox IIRC those errors are all the diagnostics at the point you invoked the menu which could be much narrower than a fixup range.

I recommend you look at diagnostic.message for all the diagnostics in a wider range. Without knowing where they are, it would be difficult for a human to succeed at the task, let alone an LLM. Try it! The error messages are often much less useful than you think. They're things like `";" expected` or `A type predicate is only allowed in return type position for functions and methods` (when I mis-spelled `in` as `is` in TypeScript just now...). Sure, if it is pointing at "is" you have a chance to work it out, but in a dozen lines of code it is pretty tough.

I think we can address this problem but we need to work out a format for communicating where these error messages are to the LLM. Probably something involving tags (but the errors can overlap in general so it's a huge mess, maybe we need some repetition in that case...)
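For illustration, one possible shape for that format — the <diagnostic> tag is purely an assumption here, not something this PR or the codebase defines:

import * as vscode from 'vscode'

// Sketch: include each diagnostic's location alongside its message so the
// LLM can tell where the errors are. Overlapping errors would repeat lines.
function formatDiagnosticsForLLM(diagnostics: vscode.Diagnostic[]): string {
    return diagnostics
        .map(d => {
            const start = d.range.start.line + 1 // 1-based for readability
            const end = d.range.end.line + 1
            return `<diagnostic lines="${start}-${end}">${d.message}</diagnostic>`
        })
        .join('\n')
}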

}
task.spinCount++
// Get diagnostics from vscode api
const diagnostics = vscode.languages.getDiagnostics(task.fixupFile.uri)
Contributor

I think it would be awesome if we only showed this code lens when errors are present in the fixup. Otherwise users might assume that "Retry" just repeats the previous message and Cody gives a different response.

WDYT about:

  • Change the code lens text to "Fix errors"
  • Only show when there is at least 1 error within the fixup range
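A sketch of that gating, assuming a task shape like the one in this PR (the helper name and command id are illustrative):

import * as vscode from 'vscode'

interface TaskLike {
    fixupFile: { uri: vscode.Uri }
    selectionRange: vscode.Range
}

// Sketch: only return a lens when at least one error-severity diagnostic
// falls within the task's selection range.
function maybeGetFixErrorsLens(task: TaskLike): vscode.CodeLens | undefined {
    const errors = vscode.languages
        .getDiagnostics(task.fixupFile.uri)
        .filter(d => d.severity === vscode.DiagnosticSeverity.Error)
        .filter(d => task.selectionRange.contains(d.range))
    if (errors.length === 0) {
        return undefined
    }
    const lens = new vscode.CodeLens(task.selectionRange)
    lens.command = {
        command: 'cody.fixup.codelens.retry', // assumed command id
        title: 'Fix errors',
    }
    return lens
}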

Contributor

ChatGPT does have a "regenerate" button. I could see myself thinking, meh, I want a new one and clicking "Regenerate." But this is different, it edits the edit, so maybe that is something else.

Contributor Author

But this is different, it edits the edit, so maybe that is something else.

It doesn't do that right now 😅 These are just the same errors the user was already getting before they submitted the fixup request; see my reply to the comment below 👀

Contributor

@dominiccooney dominiccooney left a comment

Some thoughts inline.

I think this is a good direction. ChatGPT has a regenerate button and people like it. I often do a fixup to a fixup if the result wasn't quite right (issuing further instructions to get it on the right track)... WDYT about supporting that?

On the other hand...

  • What if we made it easier to fix any errors with Cody? Then would you care if the errors came from a fixup or not?
  • If we do this, do we want/need to have "retry" or "regenerate" buttons on every Cody interaction? Does our UX scale to that? (I don't feel strongly either way. Just curious/nervous.)

const diagnostics = vscode.languages.getDiagnostics(task.fixupFile.uri)
// Find the first error diagnostic that is within the range of the task
const range = task.selectionRange
const error = diagnostics.find(diagnostic => range.contains(diagnostic.range))
Contributor

What's the user experience here? What is it like using this?

Really, if you had a set of errors x, did a fixup, and ended up with a set of errors y, then y − x is something you could demand Cody deal with. Those errors might not show up in the range Cody edited; they could be anywhere. Slightly less ambitious: you could use a non-empty intersection rather than contains, so we'd try to handle errors overlapping the start/end.

On the other hand, maybe Cody is just an error fixing machine and you don't "blame" Cody for errors which occur after it has done a fixup?
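A sketch of that y − x, comparing diagnostics by message and start position (illustrative; matching diagnostics across an edit is harder than this in general, since positions shift):

import * as vscode from 'vscode'

// Sketch: diagnostics present after the fixup (y) that were not present
// before it (x), keyed by message + start position. Approximate on purpose.
function newDiagnostics(before: vscode.Diagnostic[], after: vscode.Diagnostic[]): vscode.Diagnostic[] {
    const key = (d: vscode.Diagnostic): string =>
        `${d.message}@${d.range.start.line}:${d.range.start.character}`
    const seen = new Set(before.map(key))
    return after.filter(d => !seen.has(key(d)))
}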

Contributor Author

Right now the error we get here will be the same error the user had when they started the initial fixup task, because we are not applying fixups on receipt automatically yet. This is why I was thinking: if a user clicks retry, that means they are not happy with the fixup output that has not been applied, so if we detect any existing error message, we can try to include it and see if that produces a better output for them.

If this doesn't make sense, I'm happy to remove the error from the retry instruction until we can confirm it's actually helpful for retry :)

Contributor

Right now the error we get here will be the same error the user had when they started the initial fixup task, because we are not applying fixups on receipt automatically yet.

👍 Gotcha. I was missing that fact.

This is why I was thinking: if a user clicks retry, that means they are not happy with the fixup output that has not been applied, so if we detect any existing error message, we can try to include it and see if that produces a better output for them.

Interesting. So basically we don't know whether the user is trying to fix errors, and the first time we don't send the error, but if the user hits retry they must be unhappy and we will try to send it.

Can we measure what users are actually doing, and try to fit with that? Maybe we should always send the error, or never send it, or send it the first time and fall back to not sending it on retries... It would be great to know a bit more about what's actually happening. Whether through artificial tests, user studies, instrumentation, experimentation, etc.




vscode/src/non-stop/FixupController.ts (outdated, resolved)
@toolmantim
Contributor

What if retry gave you a quickpick input pre-filled with the last instruction, for you to try again?

I guess it depends, is this like:

  1. Hitting regenerate in ChatGPT
  2. Editing the previous message in ChatGPT (edit and replace last response)
  3. Sending a follow up message like you would with ChatGPT e.g. “Same but using a React hook” (edit on an edit)

@toolmantim
Contributor

I feel like (3) is most common for dev in ChatGPT?

I feel like allowing a correctional input is most powerful, so more like (3), but labeled "Retry", and we should also make it clear we're sending error messages too.

Is it possible to have a quickpick with an input and a checkbox? We could have an empty instruction input, and a pre-checked checkbox labelled “Include problem error messages (5)” (or whatever the correct VS Code terminology is)
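A QuickPick does combine a free-text value with checkable items when canSelectMany is set, so something close to this seems possible — a sketch, with the label and count purely illustrative:

import * as vscode from 'vscode'

// Sketch: the text box takes the new instruction; a single pre-checked item
// toggles whether error messages are sent along with it.
async function promptForRetry(errorCount: number): Promise<{ instruction: string; includeErrors: boolean } | undefined> {
    return new Promise(resolve => {
        const qp = vscode.window.createQuickPick()
        qp.placeholder = 'Instructions for retrying this edit'
        qp.canSelectMany = true
        const includeItem: vscode.QuickPickItem = {
            label: `Include problem error messages (${errorCount})`,
        }
        qp.items = [includeItem]
        qp.selectedItems = [includeItem] // pre-checked
        qp.onDidAccept(() => {
            resolve({ instruction: qp.value, includeErrors: qp.selectedItems.includes(includeItem) })
            qp.hide()
        })
        qp.onDidHide(() => resolve(undefined))
        qp.show()
    })
}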

@abeatrix
Contributor Author

I was thinking, at what point does a user want to hit retry?

  1. not happy with the output even when it has no error -- GREAT! no error will be added anyway
  2. not happy with the output because of an error -- we will let the LLM know about the error

I can't think of a situation where a user does not want to let the LLM know about an error in the selection range on retry 🤔

Plus, this error exists before Cody has generated an output, because once a fixup is applied, the retry button goes away.

I think this would make more sense when we have all fixups auto-applied by default, and then let users choose which other actions they want to perform 😆 wdyt?

Is it possible to have a quickpick with an input and a checkbox? We could have an empty instruction input, and a pre-checked checkbox labelled “Include problem error messages (5)” (or whatever the correct VS Code terminology is)

But isn't this just doing what /edit or quick fix already do?

@dominiccooney
Contributor

@toolmantim

I feel like (3) is most common for dev in ChatGPT?

We should understand user behavior here. For me, in Cody, I just start a second fixup at that point. Does that need its own UI?

I sometimes do (2) (edit the prompt and retry with the original input). That feels a bit clunky because I have to hit undo and recompose my selection and instruction. Having the original instruction around to edit would be handy.

I never do (1) ("free spin"/try again with same inputs) unless I am messing around with models and temperature and want to get a gut feel for the "range". I don't think I have ever hit the Regenerate button in ChatGPT for text.

But this is just me... what do people do?

@dominiccooney
Contributor

@abeatrix, my feeling here is we should slow down and look at a few things:

  • What the performance of Cody is like with and without errors. (And later, the syntax for presenting errors to Cody, and how we could show multiple errors to Cody.)
  • What users think a "Retry" button would do.
  • What users are doing today when Cody doesn't perform well.

At this point I'd be fine with anecdata from Sourcegraph teammates about these. It would be great if that included input from someone using not-TypeScript, not-Go in addition.

Then we need to close the loop between buffing the LLM performance, giving users what they need, and narrowing the gap between how the UX presents and what people think it will do.

I think this is a great direction but I'm uncertain about those three bullets.

@abeatrix
Contributor Author

abeatrix commented Oct 17, 2023

@dominiccooney @toolmantim I've updated the PR to replace the retry code lens with a regenerate code lens. It does not include errors, due to the concerns raised by Dom, which I think are valid.

Right now the expected behavior is to ask Cody to regenerate a new response using the same instruction, for example, /edit rewrite comments to make it sound like a taylor swift song:

Initial Response from Cody:

function groupByFile(results: Result[]): { file: string; results: Result[] }[] {
    const groups: { file: string; results: Result[] }[] = [] // we are never ever ever getting back together

    for (const result of results) {
        // i knew you were trouble when you walked in
        const group = groups.find(g => g.file === result.file) // so it's gonna be forever, or it's gonna go down in flames
        if (group) {
            group.results.push(result) // you call me up again just to break me like a promise
        } else {
            groups.push({
                // so casually cruel in the name of being honest
                file: result.file,
                results: [result],
            })
        }
    }
    return groups // i don't know about you, but i'm feeling 22
}

Response from Cody after hitting regenerate:

function groupByFile(results: Result[]): { file: string; results: Result[] }[] {
    const groups: { file: string; results: Result[] }[] = [] // we are never ever getting back together
    for (const result of results) { // i knew you were trouble when you walked in 
        const group = groups.find(g => g.file === result.file) // so it's gonna be forever, or it's gonna go down in flames
        if (group) {
            group.results.push(result) // you call me up again just to break me like a promise
        } else {
            groups.push({ // so casually cruel in the name of being honest
                file: result.file,
                results: [result], 
            })
        }
    }
    return groups // and the saddest fear comes creeping in that you never loved me or her or anyone or anything
}

(Cody just straight up copied TS's lyrics lol)

Loom:

demo_.regenerate.code.lens.in._edit.mp4

@abeatrix abeatrix changed the title feat: add retry code lens for fixup feat: add "regenerate" code lens for edits Oct 18, 2023
@abeatrix abeatrix merged commit fe2aa87 into main Oct 18, 2023
15 checks passed
@abeatrix abeatrix deleted the bee/fixup-retry branch October 18, 2023 12:27
philipp-spiess pushed a commit that referenced this pull request Oct 18, 2023
Adding missed changelog entry for
[pull/1383](#1383)

## Test plan


Changelog update