Autocomplete: Fix codebase context inference for embeddings #525

Merged: 5 commits into main on Aug 4, 2023

Conversation

philipp-spiess
Contributor

Closes #446

In the current state, embedding results would only ever be added if cody.codebase was configured in the VS Code settings. That was because we didn't account for two things:

  • The version of the codebase context we passed into the config is the initial version, which is only inferred from the config. The ContextProvider class actively tracks changes to the current active document and updates the config accordingly (this is necessary because a workspace can contain multiple git repos and we want to infer the remote from the correct repo); see the sketch after this list.
  • Additionally, ContextProvider was only initialized when the chat webview was loaded for the first time, meaning that context wasn't added if the sidebar isn't visible (I'm not sure whether this affects inline chat too, cc @abeatrix @umpox @dominiccooney).
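
The following is a minimal sketch of that tracking behaviour, assuming hypothetical helper and property names; only ContextProvider itself and the standard VS Code API are taken from the description above.

```ts
import * as vscode from 'vscode'

// Sketch only: `inferCodebaseFromDocument` is a hypothetical helper standing in for
// whatever git-remote inference the extension actually performs.
class ContextProvider implements vscode.Disposable {
    private currentCodebase: string | null = null
    private disposables: vscode.Disposable[] = []

    constructor(
        private inferCodebaseFromDocument: (doc: vscode.TextDocument) => Promise<string | null>
    ) {
        // Re-infer the codebase whenever the user switches files, since the new file
        // may belong to a different git repo in the same workspace.
        this.disposables.push(
            vscode.window.onDidChangeActiveTextEditor(editor => {
                if (editor) {
                    void this.updateCodebase(editor.document)
                }
            })
        )
    }

    private async updateCodebase(document: vscode.TextDocument): Promise<void> {
        this.currentCodebase = await this.inferCodebaseFromDocument(document)
    }

    public getCodebase(): string | null {
        return this.currentCodebase
    }

    public dispose(): void {
        this.disposables.forEach(d => d.dispose())
    }
}
```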

This PR changes the API to accept a getCodebaseContext callback that is only queried when the context is actually needed (so it always resolves to the latest value), and it makes sure that ContextProvider is initialized as part of startup.
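
A rough sketch of the shape of that change is below; everything except the getCodebaseContext callback name is an assumed, illustrative type rather than the extension's real API. Instead of capturing a codebase context once at construction time, the completions side holds a callback and resolves it lazily.

```ts
// Illustrative types; the real interfaces in the extension may differ.
interface ContextSnippet {
    fileName: string
    content: string
}

interface CodebaseContext {
    search(query: string): Promise<ContextSnippet[]>
}

type GetCodebaseContext = () => CodebaseContext | null

class CompletionsContextFetcher {
    // Before (roughly): constructor(private codebaseContext: CodebaseContext | null) {}
    // After: the callback is resolved on every use, so it always reflects the codebase
    // that ContextProvider has inferred for the currently active document.
    constructor(private getCodebaseContext: GetCodebaseContext) {}

    async fetchContext(query: string): Promise<ContextSnippet[]> {
        const codebaseContext = this.getCodebaseContext()
        if (!codebaseContext) {
            // No codebase inferred (yet): skip embeddings context for this completion.
            return []
        }
        return codebaseContext.search(query)
    }
}
```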

Test plan

  • Open a repo that is indexed but does not have a codebase config set up (I used sourcegraph/sourcegraph but removed the option from the config).
  • Observe that we now fetch embeddings and that the regular chat still seems to work.
[Screenshots: 2023-08-01 at 12:43:25 and 2023-08-01 at 12:42:24]

@sqs
Member

sqs commented Aug 2, 2023

Wow, good catch. I wonder how much this will change things. Kinda worried it will slow stuff down, no?

@philipp-spiess
Contributor Author

@sqs It shouldn't slow things down, because we never wait for the network request (we only fire off an embeddings search and then synchronously check whether we already have results in the cache). But I am worried that we'll start sending less relevant context snippets now, because we never rank the embedding results against the local editor ones.
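
(For illustration, a minimal sketch of that fire-and-forget pattern; the class and type names here are assumptions, not the extension's actual code.)

```ts
type ContextSnippet = { fileName: string; content: string }

// Kick off the embeddings search in the background, but only ever return what an
// earlier request already put in the cache, so completion latency is unaffected
// by the network round trip.
class EmbeddingsResultCache {
    private cache = new Map<string, ContextSnippet[]>()

    constructor(private search: (query: string) => Promise<ContextSnippet[]>) {}

    get(query: string): ContextSnippet[] {
        // Fire off the request without awaiting it; results land in the cache later.
        void this.search(query)
            .then(results => this.cache.set(query, results))
            .catch(() => {}) // ignore failures; we simply won't have cached results
        // Synchronously return whatever is already cached (possibly nothing).
        return this.cache.get(query) ?? []
    }
}
```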

Maybe we land this fix in its own release though so we can better understand the impact 😅

@chwarwick
Contributor

Maybe we land this fix in its own release though so we can better understand the impact 😅

When this rolls out, will we have the ability to measure whether using embeddings has an impact on the acceptance rate?

@philipp-spiess
Contributor Author

When this rolls out, will we have the ability to measure whether using embeddings has an impact on the acceptance rate?

We do 🙂 https://sourcegraph.looker.com/x/JsYdDYU6kydx4ZkNvyFfKW

@chwarwick chwarwick (Contributor) left a comment

The changes in this PR make sense to me; however, I haven't worked much within the extension, so I think it would be best to get an additional review from someone more familiar with this codebase. It would be very beneficial to get this into a release soon so that we can measure the potential impact, since we have upcoming trials that are specifically looking to measure success based on acceptance rate.

@philipp-spiess
Contributor Author

@chwarwick Thanks 🙇

@umpox / @abeatrix Can I get either of you to glance over this? You've worked on this in the past I think :)

@umpox umpox (Contributor) left a comment

Great catch, thank you!! Double-checked inline chat + fixes and everything seems to work as expected.

Side note: this makes me think we should remove the automatic "focus Cody on load" behaviour for local dev, since it doesn't replicate a typical user setup.

@philipp-spiess philipp-spiess enabled auto-merge (squash) August 4, 2023 08:54
@philipp-spiess philipp-spiess merged commit 91f1df0 into main Aug 4, 2023
7 of 9 checks passed
@philipp-spiess philipp-spiess deleted the ps/autocomplete-fix-embeddings branch August 4, 2023 08:56
Development

Successfully merging this pull request may close these issues.

Autocomplete only uses available embeddings if cody.codebase is set