@Ar4l Ar4l commented Mar 15, 2024

This PR updates the VSC plugin and adds the corresponding APIs for a user study on developer interactions with code completions. The JetBrains plugin and its APIs are unaffected. User-study data is stored under a new subdirectory, data_aral.

The general motivation behind the modifications proposed in this PR is to more closely resemble SOTA code-completion tools, which continuously generate suggestions, while also preventing the target LLMs from generating suggestions unnecessarily.

Fixes

  • Plugin build configuration for hot-reloading during development (seemed outdated).
  • Extension objects are now disposed of when the plugin is disabled.
  • Fixes rankings in VSCode to show Code4Me completions at the top.
  • Bugfix in CodeGPT generation: the prefix was trimmed twice, causing errors when the cursor is on an empty line. This was also a likely source of memory leaks, as the resulting tensors are created on the GPU and never disposed of.
  • Debounces automatic invocations until the user has stopped typing (and labels them 'auto' instead of 'manual').
  • Stored JSON fields now follow the Pythonic snake_case convention, as they will be used in data analysis anyway.
  • The language field is now determined using the VSC API instead of the file extension.
  • splitTextAtCursor was flaky roughly 20% of the time, placing one character too few in the prefix because it used the global position & document (which may not be perfectly synchronised with the moment provideCompletionItems is called).
  • General prefix & suffix character matching to fit the completion properly into the surrounding code. The previous hard-coded rules caused (1) hidden/deprioritised completions and (2) very annoying additional brackets; they are replaced by a general algorithm.
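The debounce fix above can be sketched as follows. This is a minimal Python illustration of the idea only (the plugin itself implements it in TypeScript against the VSC API); the `Debouncer` class and its interface are assumptions for illustration, not the plugin's actual code.

```python
import threading


class Debouncer:
    """Delay a callback until `wait` seconds after the *last* trigger.

    Hypothetical sketch of the client-side debounce: every keystroke
    cancels the pending invocation, so only the pause after the final
    keystroke fires an 'auto' completion request.
    """

    def __init__(self, wait: float):
        self.wait = wait
        self._timer = None

    def trigger(self, fn, *args):
        # Cancel any pending invocation: rapid typing never fires.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.wait, fn, args)
        self._timer.start()
```

With this shape, three rapid triggers result in a single invocation labelled 'auto' once the user pauses.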

Adds

  • Tracks manual invocations.

  • New idle invocation type, triggered after the user has not interacted for 2 s. If no completions are received, nothing is shown (as opposed to IntelliSense's default 'No Completions' indicator).

  • If a completion is generated while the user is still typing, we try to match the last few typed characters against the completion so they are not duplicated.

  • Stores the times at which a suggestion is displayed and accepted.

  • Explicitly receives and sends the models, so the model that generated an accepted completion can be retrieved deterministically even if the completion is identical to another model's. Also shows the model to the user when they press ⌃Space again.

  • The context for completions is always stored in the user study. The README is updated to reflect that developers who use the tool agree to these terms.

  • Server-side filter for rejecting completions likely to be ignored by, or useless to, the user. The user is assigned one of four filters per session (a session groups completions no more than 30 minutes apart). The filters are:

    1. A simple logistic-regression model leveraging telemetry data.
    2. A CodeBERTa model fine-tuned on the code context surrounding the cursor.
    3. Two JonBERTa models (custom architecture) leveraging both telemetry and code context.
  • Option for testing the deployment locally: run the Flask app with the environment variables CODE4ME_TEST=true and CODEGPT_CHECKPOINT_PATH set to a model from HF (e.g. 'microsoft/CodeGPT-small-py').
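The overlap matching mentioned above (deduplicating characters the user typed while a completion was being generated) amounts to: find the longest suffix of the already-typed text that is also a prefix of the completion, and trim it. A minimal Python sketch of that idea; the function name and exact policy are assumptions, not the plugin's actual code.

```python
def trim_overlap(typed: str, completion: str) -> str:
    """Drop from `completion` the longest prefix the user already typed,
    i.e. the longest suffix of `typed` that is also a prefix of
    `completion`, so that accepting the suggestion does not duplicate
    characters. Illustrative sketch only.
    """
    max_k = min(len(typed), len(completion))
    # Greedy: prefer the longest possible overlap.
    for k in range(max_k, 0, -1):
        if typed.endswith(completion[:k]):
            return completion[k:]
    return completion
```

For example, if the user has typed `foo.ba` and the model returns `bar()`, only `r()` should be inserted.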
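The per-session filter assignment described above can be sketched as follows, assuming a uniform random draw of one filter per session. The filter names and the details of the assignment policy here are placeholders for illustration, not the server's actual identifiers.

```python
import random

SESSION_GAP_S = 30 * 60  # completions <=30 min apart share a session
# Placeholder names for the four filters (the two JonBERTa variants
# are labelled arbitrarily here):
FILTERS = ["logistic_regression", "codeberta", "jonberta_a", "jonberta_b"]


class FilterAssigner:
    """Assign each user one filter per session; a new session (gap of
    more than 30 minutes since the last completion) draws a fresh,
    possibly different, filter. Sketch of the policy, not server code.
    """

    def __init__(self):
        self._last_seen = {}  # user -> timestamp of last completion
        self._filter = {}     # user -> filter for the current session

    def filter_for(self, user: str, now: float) -> str:
        last = self._last_seen.get(user)
        if last is None or now - last > SESSION_GAP_S:
            self._filter[user] = random.choice(FILTERS)
        self._last_seen[user] = now
        return self._filter[user]
```

Within one session the filter stays fixed, so per-session behaviour of each filter can be compared in the analysis.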

Todos

  • Server-side

    • Disabled the survey check in code4me-server/api.py > autocomplete, because it globs a directory with >1M files on every completion request. Consider organising completions under user-hash subdirectories as a more scalable approach.
    • Store model confidence for each completion.
    • Generate multiple predictions per model, ranking them client-side based on similarity to the current context.
    • Figure out & fix the remaining VRAM leaks.
    • Avoid saving empty JSON files.
    • Add additional language support for filters.
    • Stream pipelines using HF datasets generator.
    • Update models to newer code-completion models.
  • Client-side (vsc)

    • Found a bug in VSC ground-truth tracking. MRE: invoke completion on line 3, then delete line 3; this results in tracking lineNumber=1.
    • Collect ground truth at additional intervals (1, 2, 5, 10, 30 minutes).
    • Caching to prevent re-generating completions when the user deletes something.
    • Support for multi-line ghost-text-style completions.
    • Potentially debounce automatic invocations further, to help users stay under their rate limit.
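The caching todo above could look roughly like an LRU cache keyed by (model, prefix, suffix): deleting characters and returning to an earlier cursor state then reuses the earlier generation instead of re-querying the model. This is entirely a hypothetical sketch, not existing plugin code.

```python
from collections import OrderedDict


class CompletionCache:
    """LRU cache of completions keyed by (model, prefix, suffix).

    Hypothetical sketch for the caching todo: a bounded OrderedDict
    where lookups refresh recency and inserts evict the least
    recently used entry once the cache is full.
    """

    def __init__(self, max_size: int = 128):
        self._entries = OrderedDict()
        self._max_size = max_size

    def get(self, model: str, prefix: str, suffix: str):
        key = (model, prefix, suffix)
        completion = self._entries.get(key)
        if completion is not None:
            self._entries.move_to_end(key)  # mark as recently used
        return completion

    def put(self, model: str, prefix: str, suffix: str, completion: str):
        key = (model, prefix, suffix)
        self._entries[key] = completion
        self._entries.move_to_end(key)
        if len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recent
```

A real implementation would also need an invalidation policy for edits that change the surrounding context rather than just the cursor position.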

@Ar4l Ar4l force-pushed the aral_user_study branch from 32570c0 to db62a75 on March 17, 2024 15:52
@Ar4l Ar4l marked this pull request as ready for review March 17, 2024 16:25
@Ar4l Ar4l requested a review from FrankHeijden March 17, 2024 16:29
@Ar4l Ar4l force-pushed the aral_user_study branch from d233d8a to 1b0af53 on March 18, 2024 11:48
@Ar4l Ar4l requested a review from FrankHeijden March 18, 2024 11:57
@FrankHeijden FrankHeijden merged commit 620bfa3 into main Mar 18, 2024
@FrankHeijden FrankHeijden deleted the aral_user_study branch March 18, 2024 12:16