Add multi-repo search for enterprise, remove remote embeddings #2879

Merged: olafurpg merged 12 commits into main from dpc/multi-repo-enterprise-search on Jan 26, 2024

Conversation

dominiccooney
Contributor

@dominiccooney dominiccooney commented Jan 24, 2024

  • Removes remote embeddings from consumer and enterprise
  • Adds remote search through the getCodyContext GraphQL endpoint as a new context provider for Enterprise
  • The current file's repository (if any), or cody.codebase, is automatically included by this context source
  • Adds a multi-repo picker to Enterprise chats; up to 9 manually selected remote repos are saved in the chat transcript
  • New chats default to using the workspace roots' repositories for context
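
A minimal sketch of the selection rules in the list above, assuming a hypothetical `ChatRepoSelection` class and `RemoteRepo` shape (neither name comes from this PR). It only illustrates the cap of 9 manually selected repositories, the automatically included repository, and the fact that the manual picks are what the transcript stores; the extension's real types may differ.

```typescript
// Hypothetical shape for a remote repository; not the extension's actual type.
interface RemoteRepo {
    id: string
    name: string
}

// Limit on manually selected repos, as stated in the PR description.
const MAX_MANUAL_REPOS = 9

class ChatRepoSelection {
    private manual: RemoteRepo[] = []

    // Returns false when the repo is already selected or the cap is reached.
    add(repo: RemoteRepo): boolean {
        if (this.manual.some(r => r.id === repo.id)) return false
        if (this.manual.length >= MAX_MANUAL_REPOS) return false
        this.manual.push(repo)
        return true
    }

    remove(id: string): void {
        this.manual = this.manual.filter(r => r.id !== id)
    }

    // Repos to search: the automatically included repo (if any) first,
    // then the manual picks, without duplicates.
    effective(automatic?: RemoteRepo): RemoteRepo[] {
        if (!automatic) return this.manual.slice()
        const rest = this.manual.filter(r => r.id !== automatic.id)
        return [automatic, ...rest]
    }

    // Only the manual picks are saved with the chat transcript.
    toJSON(): RemoteRepo[] {
        return this.manual.slice()
    }
}
```

Keeping the automatic repository out of the saved list is one way to let a reopened chat pick up whichever file is current at that time; this is a design assumption of the sketch, not something the PR states.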

Known issues:

  • Needs automated tests for VSCode

Fixes #2622, #2624, #2850

Test plan

This change deletes a lot of unused code so naturally, run the unit tests, e2e tests, etc.

Manual test plan:

Remote embeddings have been removed for consumer users:

  1. Clone a popular open source repository, such as git@github.com:facebook/react.git
  2. Open VSCode and sign in to a consumer account
  3. Start a chat
  4. Verify that the enhanced context selector advertises Search (that is, symf) and Embeddings with "set up embeddings." You should not see "sourcegraph.com" (that is, remote embeddings.)
  5. Open a file in the repository, Cmd-Shift-P, Cody Command: Edit Code and edit the code. The edit should succeed.

Enterprise changes, so sign into an Enterprise account, then:

Repo selection:

  1. Open a new VSCode session with nothing in the workspace.
  2. Start a chat.
  3. Open the Enhanced Context Selector.
  4. Verify that you see a message about no repositories.
  5. Click Choose Repositories.
  6. Verify the Repository Picker works: that you can pick up to 9 repositories manually, that you can add and remove repositories, that you can search repositories, that ESC or a click outside the picker does not affect the repositories chosen, etc.
  7. Verify that the Enhanced Context Selector works: Mouse over the repository names and see the full name; click X to remove a repository; etc.

Chats use the selected repositories; chats save the selected repositories:

  1. Start a chat and add some repositories.
  2. Chat with Cody and verify that context is fetched from the selected repositories.
  3. Close the chat.
  4. Open a new chat and verify that the repository list is empty again.
  5. Reopen the first chat and verify that the repository list matches the repositories selected in step 1.

Chats automatically use the repository workspace roots for context:

  1. Open a clone of a repository indexed by the enterprise.
  2. Start a chat.
  3. Verify that the Enhanced Context Selector includes the relevant repository. (If this is a different repository, you may have cody.codebase set.)

Supports multiple workspace roots:

  1. File, Add Folder to Workspace, select another folder which is a clone of a repository indexed by the enterprise.
  2. Start a chat.
  3. Verify that the Enhanced Context Selector includes all the workspace root repositories (max 9) and that you can manually remove them (except the automatically included one).
  4. Open a file from the first folder in the workspace; verify the related repository is automatically included (has an "info" mark in the Enhanced Context Selector.)
  5. Open a file from the second folder in the workspace; verify the related repository is automatically included (info mark). Note that cody.codebase overrides this automatic switching behavior. For example, if you use s2 and open the github.com/sourcegraph/sourcegraph repo, its .vscode/settings.json file sets "cody.codebase": "github.com/sourcegraph/sourcegraph". If your second folder is the github.com/sourcegraph/cody repository, by design you won't get automatic inclusion of the Cody repository.
  6. Set the cody.codebase setting, start a new chat, and check the automatically included repository matches cody.codebase.
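
The precedence exercised by the steps above can be sketched as two small functions. `resolveAutomaticRepo` and `defaultReposForNewChat` are hypothetical names for illustration; the sketch assumes repositories are identified by name strings and that an empty cody.codebase value counts as unset.

```typescript
// Maximum number of workspace root repositories shown, per the test plan.
const MAX_WORKSPACE_REPOS = 9

// Hypothetical helper: picks the repository that is included automatically.
// `codebaseSetting` models the cody.codebase setting; `currentFileRepo` models
// the repository (if any) that contains the active editor's file.
function resolveAutomaticRepo(
    codebaseSetting: string | undefined,
    currentFileRepo: string | undefined
): string | undefined {
    const configured = codebaseSetting ? codebaseSetting.trim() : ""
    // cody.codebase, when set, overrides switching based on the current file.
    return configured ? configured : currentFileRepo
}

// Hypothetical helper: the default repo list for a new chat is the set of
// workspace root repositories, de-duplicated and capped.
function defaultReposForNewChat(workspaceRootRepos: string[]): string[] {
    const unique: string[] = []
    for (const repo of workspaceRootRepos) {
        if (unique.indexOf(repo) === -1) unique.push(repo)
    }
    return unique.slice(0, MAX_WORKSPACE_REPOS)
}
```

With cody.codebase set to "github.com/sourcegraph/sourcegraph", opening a file from a github.com/sourcegraph/cody folder still resolves to the sourcegraph repository, which matches the note in step 5.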

Inline edits:

  1. Open a file.
  2. Cmd-Shift-P, Cody Command: Edit Code
  3. Instruct Cody and check that the edit was carried out

@dominiccooney
Contributor Author

@beyang I have some tidy up and testing work to do but I would love your directional feedback on this.

Loom demo: https://www.loom.com/share/d7ca236087fd44988c35a5a0e9e1ae0f

@dominiccooney dominiccooney force-pushed the dpc/multi-repo-enterprise-search branch 2 times, most recently from c24e537 to df61c04 Compare January 25, 2024 08:48
@dominiccooney dominiccooney changed the title WIP: Multi-repo search for Enterprise Add multi-repo search for enterprise, remove remote embeddings Jan 25, 2024
@dominiccooney dominiccooney requested a review from a team January 25, 2024 09:23
@pkukielka
Contributor

pkukielka commented Jan 25, 2024

Would you be able to also add some tests to index.test.ts?
Those are agent tests, and agent is used by IntelliJ/NeoVim clients and also as I understand by some ML people recently.
Adding tests there allows us to avoid regressions in the future, increases stability, and additionally works as documentation of the API usage.
Even one test with a typical use-case scenario goes a long way for us 🙏

@olafurpg
Member

@pkukielka I am looking into writing test cases for this PR 💪🏻

@dominiccooney I just tried this out and was surprised by the behavior of Enter in the quickpick. I expected it to select the highlighted item but it closes the quickpick without selecting the item.

CleanShot.2024-01-25.at.15.36.00.mp4

@dominiccooney
Contributor Author

@pkukielka

> Would you be able to also add some tests to index.test.ts?

Yes, this needs automated tests! I would love some details about JetBrains' plans for the enhanced context selector and repo picker and simplified chat in general to help with that... I noticed sourcegraph/jetbrains#258 was closed completed but I'd love some pointers.

@olafurpg

> I just tried this out and was surprised by the behavior of Enter in the quickpick. I expected it to select the highlighted item but it closes the quickpick without selecting the item.

That's the VSCode quickpick, it is goofy. Space to select items. Enter to "accept" the whole modal. ESC to cancel.
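
As a rough model of the key handling described in this reply (not VS Code's implementation; `PickerState` and `pressKey` are hypothetical names), the picker can be thought of as keeping a pending selection that only replaces the committed one on Enter:

```typescript
type PickerKey = "Space" | "Enter" | "Escape"

interface PickerState {
    committed: string[] // repositories the chat is currently using
    pending: string[] // items ticked while the picker is open
    highlighted: string // item under the cursor
    open: boolean
}

function pressKey(state: PickerState, key: PickerKey): PickerState {
    if (!state.open) return state
    if (key === "Space") {
        // Space toggles the highlighted item in the pending selection.
        const ticked = state.pending.indexOf(state.highlighted) !== -1
        const pending = ticked
            ? state.pending.filter(item => item !== state.highlighted)
            : state.pending.concat([state.highlighted])
        return { ...state, pending }
    }
    if (key === "Enter") {
        // Enter accepts the whole modal: pending becomes committed.
        return { ...state, committed: state.pending.slice(), open: false }
    }
    // Escape cancels: the pending changes are discarded.
    return { ...state, pending: state.committed.slice(), open: false }
}
```

In this model, pressing Enter on a highlighted but unticked item closes the picker with nothing added, which is the behavior reported above.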

@olafurpg
Member

@dominiccooney I pushed a commit to my branch olafurpg/multi-repo-tests adding an integration test for the agent. https://github.com/sourcegraph/cody/compare/olafurpg/multi-repo-tests?expand=1

I had to tweak the webview protocol a bit to support setting the repo directly without a quickpick, and also to get the remote repo. I have not yet tested chat/restore with remote repos but that's something we need to support for JetBrains (esp. because we restart the process regularly).

@dominiccooney dominiccooney force-pushed the dpc/multi-repo-enterprise-search branch from fae22fa to 5b46c09 Compare January 26, 2024 01:22
Contributor

@toolmantim toolmantim left a comment

Nice one @dominiccooney!

I pushed up some style cleanups, and I've approved from a design POV.

Is it possible to ensure the Enhanced Popover doesn't dismiss itself after you've added some repositories?

I agree "Choose Repositories…" is a better term for this button label too (I think you've suggested this before). I didn't want to pop that commit on in case it broke tests, but feel free to make that change too.

dominiccooney and others added 7 commits January 26, 2024 14:46
Play around with repo selector.

Move repo picker to a component.

WIP repo picker

Some debug logging.

Basic repo picker complete, without account to account or run to run caching or recents.

Mass deletion of embeddings etc.

Enhanced context status shows add repo button and launches picker.

Add a remote search client.

Rough first cut plumbing in remote search to chat.

Change the default useContext setting to "blended".

Hide multi-repo search from consumer.

Wire up the repo remove button.

Add a title to the quickpick guiding the selection.

Automatically populate the workspace root repos.

Adding and removing workspace folders works.

Show the repositories from the workspace separately.

Provide repo ID lookup to the CodebaseStatusProvider.

Unwind automatically including all workspace repos.

Include the current file codebase automatically.

Move remote repo handlers into method.

Selected repos are pre-selected in the repo picker.

Fix fat-fingered rebase.

Attempt at using flexbox for repo list layout.

Remove button uses pointer.

Show short names in the enhanced context selector.

Putting repo names in "detail" so they are still searchable

Save and restore selected repos in the chat transcript.

Add a "no repositories" message for empty workspaces.

biome

Add a no-op quickpick to agent.

Update agent recordings with repository queries.

Fix URI comparison for codebase update.

Remove the repo fetching delay.

Clean up unused parameter todo

Simpler startup without initial codebase; edit uses repo search context.

Update changelog.
@dominiccooney
Contributor Author

Thanks @olafurpg for the agent test and @toolmantim for the style improvements.

Recent changes:

  • Renamed Add Repositories => Choose Repositories
  • Cherry-picked @olafurpg 's agent test for enterprise search
  • Use the single, global SourcegraphGraphQLClient
  • Show links to context with useful tooltips and open them in the browser

@dominiccooney
Contributor Author

@toolmantim

> Is it possible to ensure the Enhanced Popover doesn't dismiss itself after you've added some repositories?

+1, the little flicker of updated content before it dismisses is particularly distracting. If it is OK, let's do a follow up with popover for toplayer and persistence while the picker is open.

There's also something weird going on that I need some help with: the background color of the remote context links (transparent) differs from that of the local context buttons (a 15% mix of foreground link color and transparent).

Contributor

@abeatrix abeatrix left a comment

Left some questions in line, will continue the review in the morning but lgtm so far!
Exciting feature, well done 🚀

// The name of the repository in the remote provider. For example the
// context group may be "~/projects/frobbler" but the remote name is
// "host.example/universal/frobbler".
remoteName: string

Contributor

Is this because the provider can have multiple repositories?

Contributor Author

Basically, it is unused, so I removed it. In the old days we could have a source show up one way (like ~/projects/frozzler) but be provided by something that named it something else (like github.com/widgets/frozzler). But now, in Enterprise, everything calls it the same thing: the repo name.

repoName: string
commit: string
uri: URI
path: string

Contributor

Is path here = url.path? Or is it referring to something else?

I always find path and fileName confusing 🤣

Contributor Author

No, it is not uri.path. The URI points to a Sourcegraph results page which has blob hashes and stuff. path is the path to the file from the root of the repository.

Good feedback that this needs a comment.
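
One way the distinction in this reply might be captured as doc comments is sketched below. `RemoteSearchResult` and `describeResult` are hypothetical names; the field names come from the snippet under review, `uri` is typed as a plain string here to keep the example self-contained (the reviewed code uses `URI`), and the wording of the `commit` comment is an assumption.

```typescript
// Hypothetical result type with field names taken from the snippet under review.
interface RemoteSearchResult {
    /** Name of the repository, for example "host.example/universal/frobbler". */
    repoName: string
    /** Commit the result refers to (assumed meaning; not stated in the thread). */
    commit: string
    /**
     * Link to the result page on the Sourcegraph instance. This is meant to be
     * opened in a browser; it is not a file URI and `path` is not derived from it.
     */
    uri: string
    /** Path to the file, relative to the root of the repository. */
    path: string
}

// Builds a short human-readable label such as a tooltip might show.
function describeResult(result: RemoteSearchResult): string {
    return result.repoName + "@" + result.commit.slice(0, 7) + ":" + result.path
}
```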

@@ -706,6 +722,11 @@ export class SourcegraphGraphQLAPIClient {
headers,
})
.then(verifyResponseCode)
//.then(response => response.text())

Contributor

Leftover?

Contributor Author

Yes, although it is super handy if you want to see the response body, so I was thinking of leaving this here... WDYT?

@@ -160,6 +165,8 @@ export class ChatPanelsManager implements vscode.Disposable {
const provider = this.createProvider()
if (chatID) {
await provider.restoreSession(chatID)
} else {
await provider.newSession()

Contributor

If the provider has just been created, wouldn't it already have a new session set up?

Contributor Author

newSession does steps to set up that session.

@olafurpg
Member

The Windows test is failing because the prompt is different on Windows, causing the HTTP replay to error. I think it's fine to add a .skipIf(isWindows()) for the newly added test. Alternatively, if the prompt is only different because of the ordering of context files, then one option is to add special sorting logic like we had to do for non-stable symf results: https://sourcegraph.com/github.com/sourcegraph/cody@450d5ef6fddf176975a6ed33a4fabd3724a1a501/-/blob/vscode/src/chat/chat-view/context.ts?L215
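
If ordering turns out to be the cause, the sorting option could look roughly like the sketch below. `ContextFileRef` and `sortContextFiles` are hypothetical names, and this is not the logic at the linked line; it only shows a deterministic, locale-independent ordering by repository name and then path, so that a recorded prompt would not depend on the order in which results arrive.

```typescript
// Hypothetical minimal shape of a context item for ordering purposes.
interface ContextFileRef {
    repoName: string
    path: string
}

// Plain code-unit comparison; avoids locale-dependent ordering.
function compareStrings(a: string, b: string): number {
    return a < b ? -1 : a > b ? 1 : 0
}

// Returns a new array sorted by repository name, then by path.
function sortContextFiles(items: ContextFileRef[]): ContextFileRef[] {
    return items
        .slice()
        .sort((a, b) => compareStrings(a.repoName, b.repoName) || compareStrings(a.path, b.path))
}
```

Sorting a copy leaves the original result order available for anything that relies on ranking.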

See comment why, it's only the test that doesn't run on Windows. The
feature works fine on Windows otherwise based on a manual test

Member

@olafurpg olafurpg left a comment

LGTM 👍🏻 I have not reviewed the full diff in detail but I have done the following:

  • Manually verified the feature works as expected on macOS and Windows
  • Added an automated squirrel test powered by multi-repo context. This test is skipped on Windows because there seems to be some platform-specific logic related to how we send the keyword search requests that doesn't affect the functionality in a live VS Code instance, only how it runs in tests.
  • Reviewed the rough structure of the code

I am sure we can address minor issues with separate smaller PRs. Let's merge this as is because it represents a huge milestone for the February launch. Great work @dominiccooney ! 👏🏻

@olafurpg olafurpg merged commit 581734a into main Jan 26, 2024
15 checks passed
@olafurpg olafurpg deleted the dpc/multi-repo-enterprise-search branch January 26, 2024 13:25
olafurpg added a commit that referenced this pull request Jan 26, 2024
Previously, the agent initialized the VS Code extension against
sourcegraph.com with an empty access token. We did this to ensure that
initialization succeeded even with an invalid server endpoint. However,
by doing so we introduced another problem: the extension sent a lot of
network traffic to sourcegraph.com for enterprise accounts.

Now, we use the custom server endpoint even during initialization. This
change eliminates all automatic requests to sourcegraph.com for the
minimized test cases we have in the codebase (covers initialization and
sending a chat message). This currently only works out of the box thanks
to the PR #2879 that removes
embeddings. Before that change, initialization was regularly reporting
errors from `EmbeddingDetector`, which has now been removed.


## Test plan

Green CI. Manually search for "sourcegraph.com" in the updated recording
file and confirm that all matches reference demo.sourcegraph.com.
Successfully merging this pull request may close these issues.

Implement an "add repo" button for Enterprise customers to query multiple repositories
5 participants