Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

codeintel: Auto-inference sandbox API #33756

Merged
merged 17 commits into from
Apr 22, 2022
Merged

codeintel: Auto-inference sandbox API #33756

merged 17 commits into from
Apr 22, 2022

Conversation

efritz
Copy link
Contributor

@efritz efritz commented Apr 11, 2022

This PR adds an initial internal/codeintel/autoindexing/internal/inference package that encodes the rules of our autoindexing inference in a Lua sandbox. This package is currently unused by any entrypoint (outside of unit tests) and subsequent efforts (#33044 and #33045) will replace this code with the original.

Closes #33041 and #33040.

This PR does not close discussion on API design. This PR has roughed-out what seems to be a very usable interface for us to be able to extend. We're still waiting on customer feedback (see #33047) to help us ensure we're solving the right expressibility problem, but it doesn't seem to be a blocker for moving the PoC forward (but may be a blocker before moving onto or past #33046).

Reviewer hints:

Please review by-commit, paying special attention to the following:

Test plan

This PR adds significantly good coverage and shows that we have one-to-one behavior with our existing indexers (test cases are taken verbatim). Additional testing will be done before this code is hooked up to a production (or even development application) entrypoint.

@cla-bot cla-bot bot added the cla-signed label Apr 11, 2022
@efritz efritz self-assigned this Apr 11, 2022
@efritz efritz force-pushed the ef/playground branch 11 times, most recently from cf65be2 to 8aab4e2 Compare April 13, 2022 19:59
@efritz efritz requested a review from Strum355 April 19, 2022 20:34
@mrnugget
Copy link
Contributor

High-level question: the PR description says this closes #33041 and #33040 which belong to milestones 1 and 4 respectively. This comment in the RFC makes it sound like we still haven't finished milestone 0 yet.

So why do this now? Did we actually have to implement this to get some feedback from customers? Because if we haven't heard from customers yet on RFC 624 and whether it'll solve their problem, this is a lot of code to build on a hunch. What I'm thinking is: could we have written pseudo Lua code and asked them whether they'd use it and whether that would work for their codebase?

@efritz
Copy link
Contributor Author

efritz commented Apr 21, 2022

High-level response: what happened in the week while @mrnugget was OoO and some misc thoughts:

  • We've reached out to four customers this iteration to adopt auto-indexing. One of them is blocked on their internal Java team. One of them uses primarily Go, so we're all set for them. The other two we've asked specific questions about their repository setup and how they would auto-indexing their code by hand. They've both responded this morning so we'll be following up to gather more useful information.
  • Increasing momentum on people setting up auto-indexing and running into the "what now" part when it comes to getting it to configure jobs is putting pressure on having some sort of inference-logic-as-configuration solution. The majority of RFC 624 is a direct solution to this (with some additional work on exporting it for users as polished interface). I feel that M0 (API validation) is a blocker for making it a user-facing feature, but not to do the technical work up front to give us the benefit of modifying a running instance without having to hard-code customer-specific rules in the next patch release.
  • This PR combines parts of M1 (Lua port of lib/codeintel/autoindexing/inference and M4 (update/optimize gitserver communication) as it turns out it would have been more total work to do a direct port in M1 and then change it later for M4. @tjdevries gave a lot of insights to help inform these decisions as well.

What we should do going forward:

  • Land M1 and M4 (this PR)
  • Add additional rate limiting and observation around gitserver calls from Lua (M2: harden gitserver client). We've made the number of requests more efficient across all recognizers, but we're still changing the exact set of things we're running and can't be too careful.
  • Add a backdoor way to insert configuration into an instance so that the next release will have some mid-release way to say "Ooh you want support for pom.xml.rft? No problem!" and can give them a string to insert directly into their database for now.
  • Later (not urgently) polish this and advertise it as a possibility for users (not an ace-in-our-sleeve during experiments).

I'm very confident that the solution here is not far off from the solution we'd get if we blocked on completion of M0. We have enough uses cases we don't support today for Java alone (no build tooling support today, only package repos) that the API in this PR is perfectly suited for.

Completion of M0 is still super important because the Lua sandbox in this PR is not yet ergonomic. It's a bare-bones v1 (that improves on the existing art), but it's not idiomatic and has no standard library attached yet. (I'll be deferring heavily to customer response and @tjdevries's feedback here).

@efritz efritz requested review from olafurpg and removed request for varungandhi-src April 21, 2022 13:56
Copy link
Member

@olafurpg olafurpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 I like the new lua inference logic and it will be a gamechanger for users to be able to roll out custom inference changes.

internal/codeintel/autoindexing/inference/lua/util.lua Outdated Show resolved Hide resolved
internal/codeintel/autoindexing/inference/service_test.go Outdated Show resolved Hide resolved
@efritz efritz requested a review from tjdevries April 22, 2022 00:32
@efritz efritz enabled auto-merge (squash) April 22, 2022 00:43
@efritz
Copy link
Contributor Author

efritz commented Apr 22, 2022

Enabling auto-merge now, but would be very happy to receive another review. We'll be iterating on this in the very near future.

@efritz efritz disabled auto-merge April 22, 2022 01:02
@efritz efritz enabled auto-merge (squash) April 22, 2022 01:02
@efritz efritz merged commit 273d79b into main Apr 22, 2022
@efritz efritz deleted the ef/playground branch April 22, 2022 01:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC 624: (M1) Re-build PoC recognizers
5 participants