
fix: update post process logic for claude instant #1440

Merged: 14 commits, Oct 24, 2023
Conversation

Contributor

@abeatrix abeatrix commented Oct 19, 2023

fix: use full infill block and adjust prefix

  • fix prompt structure order: context should be shared after the role assignment
  • empty the prefix if it contains the infill block, fixing an issue where the first line is duplicated
  • Simplify prompt messages to focus on providing clear context
  • Add a CLOSING tag for the Assistant's infill block to preserve ending whitespace
  • This also fixes issues where multi-line autocomplete gets cut off.
  • remove tags from context code, as they might confuse Claude
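The resulting prompt shape described by the bullets above can be sketched roughly as follows. The exact role-assignment wording, the message shape, and the tag values are assumptions for illustration only; the tag constants mirror the OPENING_CODE_TAG/CLOSING_CODE_TAG used in the PR's diff:

```typescript
// Rough sketch of the revised prompt-message order: assign the role first,
// then share the file context. Wording, message shape, and tag values are
// assumptions for illustration, not the actual prompt.
const OPENING_CODE_TAG = '<CODE5711>'
const CLOSING_CODE_TAG = '</CODE5711>'

interface Message {
    speaker: 'human' | 'assistant'
    text: string
}

function buildInfillMessages(infillPrefix: string, infillBlock: string, infillSuffix: string): Message[] {
    return [
        // 1. Role assignment comes first.
        { speaker: 'human', text: 'You are a code completion AI...' },
        { speaker: 'assistant', text: 'I am a code completion AI...' },
        // 2. Context is shared after the role is assigned; the infill block
        //    is enclosed in XML tags inside the code fence.
        {
            speaker: 'human',
            text: `Here is the code:\n\`\`\`\n${infillPrefix}${OPENING_CODE_TAG}${infillBlock}${CLOSING_CODE_TAG}${infillSuffix}\n\`\`\``,
        },
        // 3. The assistant's turn opens the infill block; the model is
        //    expected to emit the CLOSING tag, which preserves trailing
        //    whitespace at the end of the completion.
        { speaker: 'assistant', text: `${OPENING_CODE_TAG}${infillBlock}` },
    ]
}
```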

Test plan

Fix: show multi-line completion after {:

image

Fix: show multi-line completions on the line after {

image

Member

@valerybugakov valerybugakov left a comment

Could we separate the bug fix from the prompt changes and put the prompt changes behind a feature flag, so that we can test how they affect CAR in production before making them the default?

The prompt updates look logical, and I expect them to perform well, but with LLMs, we never know how newly added tokens will affect the output. We have quite a few changes here, from the sentence structure updates to changing the code block tags. Splitting this PR in two would make it easier to reason about CAR changes later:

  1. Minimal prompt change required to fix a bug + logic updates.
  2. Prompt updates behind the feature flag.
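The proposed split could be wired up roughly like this. This is a minimal sketch only; the flag name and the accessor function are hypothetical, not the repository's actual feature-flag API:

```typescript
// Minimal sketch of gating the prompt updates behind a feature flag so
// their effect on CAR can be measured in production before becoming the
// default. The flag name and isFlagEnabled accessor are hypothetical.
function selectPromptVariant(isFlagEnabled: (flag: string) => boolean): 'bugfix-only' | 'revised-prompt' {
    // 1. Default path: the minimal prompt change required to fix the bug.
    // 2. Gated path: the larger prompt updates, rolled out behind the flag.
    return isFlagEnabled('cody-autocomplete-claude-revised-prompt') ? 'revised-prompt' : 'bugfix-only'
}
```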

@abeatrix
Contributor Author

@valerybugakov that's fair, I can do that next!

Have you tried the change in this PR yet? Is it working for you?

},
{
speaker: 'human',
text: `Below is the code from file path ${relativeFilePath}. Review the code outside the XML tags to detect the functionality, formats, style, patterns, and logics in use. Then, use what you detect and reuse methods/libraries to complete and enclose completed code only inside XML tags precisely without duplicating existing implementations. Here is the code: \n\`\`\`\n${infillPrefix}${OPENING_CODE_TAG}${infillBlock}${CLOSING_CODE_TAG}${infillSuffix}\n\`\`\``,
Contributor

If we don't split the prompt anymore between "head" and "tail", should we remove the whole getHeadAndTail function?
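For context, a rough sketch of the role getHeadAndTail plays here. The two-line split and the trimStart-based "trimmed" field are assumptions inferred from how the function is used in this PR, not the actual implementation:

```typescript
// Rough sketch of getHeadAndTail: split the prefix so the "tail" is the
// last couple of lines around the cursor (used as the infill block) and
// the "head" is everything before it. The two-line split and the
// trimStart-based "trimmed" field are assumptions, not the real code.
interface PrefixSection {
    raw: string
    trimmed: string
}

function getHeadAndTail(prefix: string): { head: PrefixSection; tail: PrefixSection } {
    const lines = prefix.split('\n')
    const headRaw = lines.slice(0, -2).join('\n')
    const tailRaw = lines.slice(-2).join('\n')
    return {
        head: { raw: headRaw, trimmed: headRaw.trimStart() },
        tail: { raw: tailRaw, trimmed: tailRaw.trimStart() },
    }
}
```

If the prompt no longer needs the head/tail split, both the helper and its call site would be candidates for deletion.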

@philipp-spiess
Contributor

TBH I’m happy to move fast with this given how big our regression is right now. My only thinking is that:

  • we should run our completion test suite to ensure we don't add known issues
  • we delete old code that is no longer needed (like the head/tail splitting of the prefix) to avoid introducing code debt

Once you're happy with the results, we should push out a patch release.

Contributor

@philipp-spiess philipp-spiess left a comment

Preemptive stamp, but let's make sure the tests are good.

@abeatrix
Contributor Author

@philipp-spiess @valerybugakov I tried to make as minimal a prompt change as possible, but the one we had just doesn't work anymore because of the lines and whitespace removed on Claude's side. I added <cursor> to the end of the infill block to make sure the spaces are not removed, and it seems to work fine on my side. I wasn't able to get the eval tests to work to confirm whether there are other regressions, but it's definitely better than what is in prod right now. If this is good for now, please feel free to merge and make a patch release on Monday to make sure our users are not affected, and I can do any required follow-up work.

@philipp-spiess
Contributor

@abeatrix I pushed a fix for the local completions test, and there seem to be some issues with the newly used XML tags leaking into the completion:

Screenshot 2023-10-23 at 11 37 46

Also the test case with a comment followed by a \n isn't working well either :/

Screenshot 2023-10-23 at 11 38 47

@abeatrix
Contributor Author

abeatrix commented Oct 23, 2023

@valerybugakov After taking a look at the issue with @philipp-spiess, it looks like the root cause is that the new parseAndTruncateCompletion doesn't work well with this.postProcess() for anthropic. I added a workaround in my latest commits, and that seems to work (see the updated infill test result). Can you take a look and see if this change makes sense to you? Since this is a P1 issue, do you mind taking over this PR while I'm away to make sure we can have this issue resolved asap?

Update

Looks like the issue (on current main) is that when the current line ends with a non-bracket character, we display the suggestion after running trimStart() on it:
image
image

It doesn't happen if the prefix ends with a bracket (or if we remove the postProcess logic):
image

@valerybugakov the change in my latest commit, where I remove the space after the \n in new suggestions to prevent normalizeStartLine from removing the \n, seems to work. Can you take a look and see if this looks OK to you, or if it's something we can update in normalizeStartLine without causing regressions for other providers?

@@ -67,7 +67,7 @@ export class AnthropicProvider extends Provider {
     const { head, tail, overlap } = getHeadAndTail(this.options.docContext.prefix)
 
     // Infill block represents the code we want the model to complete
-    const infillBlock = tail.trimmed
+    const infillBlock = tail.trimmed.endsWith('{\n') ? tail.trimmed.trimEnd() : tail.trimmed
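In isolation, the changed line behaves like this (a sketch of just the new conditional, wrapped in a hypothetical helper name for testing):

```typescript
// The changed line, in isolation: when the infill block ends with an
// opening brace plus a newline, drop the trailing whitespace so the block
// ends at the brace and Claude can start its completion on the next line.
// The helper name is hypothetical; the expression matches the diff.
function adjustInfillBlock(trimmedTail: string): string {
    return trimmedTail.endsWith('{\n') ? trimmedTail.trimEnd() : trimmedTail
}
```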
Contributor Author

This allows Claude to respond with a multi-line completion on the new line after the { bracket.

@@ -216,6 +216,10 @@ export class AnthropicProvider extends Provider {
         // leading `\n` followed by whitespace that Claude might add.
         completion = completion.replace(/^\s*\n\s*/, '')
     } else {
+        // prevent normalizeStartLine from removing the starting new line
+        if (completion.startsWith('\n')) {
+            completion = '\n' + completion.trimStart()
+        }
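The added branch in isolation (a sketch with a hypothetical helper name; the logic matches the diff):

```typescript
// The added branch, in isolation: keep exactly one leading newline while
// stripping the indentation after it, so normalizeStartLine cannot remove
// the newline together with the surrounding whitespace. The helper name is
// hypothetical; the logic matches the diff.
function preserveStartingNewline(completion: string): string {
    if (completion.startsWith('\n')) {
        return '\n' + completion.trimStart()
    }
    return completion
}
```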
Contributor Author

@abeatrix abeatrix Oct 23, 2023

normalizeStartLine in parseAndTruncateCompletion would remove the \n if it is followed by indentation when the prefix didn't end with a closing bracket (need Valery to confirm whether this is intended). This solves the aforementioned issue.

Member

Updated the logic to run normalizeStartLine only for multiline completions, as it was before my changes in this PR. It fixes the issue without adding additional logic. Thank you for narrowing down this regression!

@abeatrix abeatrix changed the title fix: use full infill block and adjust prefix fix: update post process logic for claude instant Oct 23, 2023
@valerybugakov
Member

@abeatrix, thank you for sharing the context! I'm debugging it locally and will merge this if I can confirm your findings 👍

Member

@valerybugakov valerybugakov left a comment

After my change, I verified that the cases mentioned here still work locally as expected. I regenerated completions using generate:completions to ensure we do not have any leaking XML tags.

@valerybugakov
Member

The PR is blocked by eslint errors on main. Fixing it here: #1471

@valerybugakov valerybugakov merged commit 05e9039 into main Oct 24, 2023
13 of 14 checks passed
@valerybugakov valerybugakov deleted the bee/fix-infill branch October 24, 2023 05:26
abeatrix added a commit that referenced this pull request Oct 24, 2023
Patch release for p1 bugs

- #1477
- #1440
- Update default prompt mixin

## Test plan

version bump