LLM powered development for VSCode

llm-vscode is an extension for all things LLM. It uses llm-ls as its backend.

We also have extensions for:

Previously huggingface-vscode.

Note

When using the Inference API, you will probably encounter some limitations. Subscribe to the PRO plan to avoid getting rate limited in the free tier.

https://huggingface.co/pricing#pro

Features

Code completion

This plugin supports "ghost-text" code completion, à la Copilot.

Choose your model

Requests for code generation are made via an HTTP request.

You can use the Hugging Face Inference API or your own HTTP endpoint, provided it adheres to the API specified here or here.

The list of officially supported models is located in the config template section.

Always fit within the context window

The prompt sent to the model will always be sized to fit within the context window, with the number of tokens determined using tokenizers.
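As a rough illustration of the idea, the sketch below keeps the lines closest to the cursor until a token budget is used up. It is not the llm-ls implementation: `countTokens` is a hypothetical whitespace-based stand-in for a real tokenizer, and `fitToContextWindow` with its line-by-line, alternating trimming strategy is an assumption made for this example.

```typescript
// Hypothetical stand-in for a real tokenizer: counts whitespace-separated chunks.
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Keep the lines nearest the cursor, alternating between the text before it
// (prefix) and after it (suffix), and stop as soon as the next line would
// push the total over the token budget.
function fitToContextWindow(
  prefixLines: string[],
  suffixLines: string[],
  contextWindow: number
): { prefix: string[]; suffix: string[] } {
  const prefix: string[] = [];
  const suffix: string[] = [];
  let used = 0;
  let p = prefixLines.length - 1; // walks backwards from the cursor
  let s = 0; // walks forwards from the cursor
  let takePrefix = true;

  while (p >= 0 || s < suffixLines.length) {
    // Use the prefix on its turn, or whenever the suffix is exhausted.
    const fromPrefix = (takePrefix && p >= 0) || s >= suffixLines.length;
    const line = fromPrefix ? prefixLines[p] : suffixLines[s];
    const cost = countTokens(line);
    if (used + cost > contextWindow) {
      break;
    }
    used += cost;
    if (fromPrefix) {
      prefix.unshift(line); // preserve original line order
      p--;
    } else {
      suffix.push(line);
      s++;
    }
    takePrefix = !takePrefix;
  }
  return { prefix, suffix };
}
```

With a real tokenizer, only `countTokens` would need to change; the budget logic stays the same.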

Code attribution

Hit Cmd+Shift+A to check whether the generated code is in The Stack. This is a rapid first-pass attribution check using stack.dataportraits.org. We check for sequences of at least 50 characters that match a Bloom filter. This means false positives are possible and sufficiently long surrounding context is necessary (see the paper for details on n-gram striding and sequence length). The dedicated Stack search tool is a full dataset index and can be used for a complete second pass.
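To see why a Bloom-filter match can be a false positive while a miss is definitive, here is a minimal sketch. It is not the stack.dataportraits.org implementation: the `BloomFilter` class, its simple polynomial hashing, and the choice to index non-overlapping 50-character tiles while querying every sliding window are all assumptions chosen for illustration.

```typescript
// Minimal Bloom filter. Membership tests may return false positives
// (different strings can set the same slots) but never false negatives.
class BloomFilter {
  private bits: Uint8Array; // one byte per slot, for simplicity
  private size: number;
  private hashCount: number;

  constructor(size: number, hashCount: number) {
    this.size = size;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(size);
  }

  // Polynomial hash; each seed uses a different multiplier so that the
  // seeds behave like different hash functions.
  private hash(text: string, seed: number): number {
    const multiplier = 31 + 2 * seed;
    let h = 7;
    for (let i = 0; i < text.length; i++) {
      h = (h * multiplier + text.charCodeAt(i)) % 2147483647;
    }
    return h % this.size;
  }

  add(text: string): void {
    for (let k = 0; k < this.hashCount; k++) {
      this.bits[this.hash(text, k)] = 1;
    }
  }

  mightContain(text: string): boolean {
    for (let k = 0; k < this.hashCount; k++) {
      if (this.bits[this.hash(text, k)] === 0) {
        return false;
      }
    }
    return true;
  }
}

// Index a corpus as non-overlapping tiles of `width` characters.
function indexCorpus(filter: BloomFilter, corpus: string, width: number = 50): void {
  for (let i = 0; i + width <= corpus.length; i += width) {
    filter.add(corpus.slice(i, i + width));
  }
}

// Check every `width`-character window of the generated code against the filter.
function findMatches(filter: BloomFilter, generated: string, width: number = 50): string[] {
  const matches: string[] = [];
  for (let i = 0; i + width <= generated.length; i++) {
    const candidate = generated.slice(i, i + width);
    if (filter.mightContain(candidate)) {
      matches.push(candidate);
    }
  }
  return matches;
}
```

In this sketch, generated text shorter than `width` produces no windows at all, and a window can only match when it lines up with an indexed tile, which is one way to picture why enough surrounding context is needed.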

Installation

Install it like any other VSCode extension.

By default, this extension uses bigcode/starcoder and the Hugging Face Inference API for inference.

HF API token

You can supply your HF API token (hf.co/settings/token) with this command:

Cmd/Ctrl+Shift+P to open VSCode command palette
Type: Llm: Login

If you previously logged in with huggingface-cli login on your system, the extension will read the token from disk.

Configuration

You can check the full list of configuration settings by opening your settings page (cmd+,) and typing Llm.

Endpoint

You can configure the endpoint to which requests will be sent.

Let's say your current code is this:

import numpy as np
import scipy as sp
{YOUR_CURSOR_POSITION}
def hello_world():
    print("Hello world")

The request body will then look like:

const inputs = `{start token}import numpy as np\nimport scipy as sp\n{end token}def hello_world():\n    print("Hello world"){middle token}`
const data = { inputs, parameters: { max_new_tokens: 256 } };

const model = configuration.modelIdOrEndpoint;
let endpoint;
if (model.startsWith("https://")) {
  endpoint = model;
} else {
  endpoint = `https://api-inference.huggingface.co/models/${model}`;
}

const res = await fetch(endpoint, {
    body: JSON.stringify(data),
    headers,
    method: "POST"
});

const json = await res.json() as { generated_text: string };

Note that the example above is a simplified version to explain what is happening under the hood.
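The prompt layout and endpoint selection above can be factored into two small helpers. This is a sketch under assumptions, not the extension's code: `FimTokens`, `buildFimPrompt`, and `resolveEndpoint` are hypothetical names, and the StarCoder-style token strings in the usage lines are example values that you would replace with the ones your model expects.

```typescript
// The three special tokens that delimit a fill-in-the-middle prompt.
interface FimTokens {
  start: string;
  end: string;
  middle: string;
}

// Split the document at the cursor and lay it out as:
// {start token}<text before cursor>{end token}<text after cursor>{middle token}
function buildFimPrompt(documentText: string, cursorOffset: number, tokens: FimTokens): string {
  const before = documentText.slice(0, cursorOffset);
  const after = documentText.slice(cursorOffset);
  return `${tokens.start}${before}${tokens.end}${after}${tokens.middle}`;
}

// A value starting with "https://" is used as-is; anything else is treated
// as a model ID on the Hugging Face Inference API.
function resolveEndpoint(modelIdOrEndpoint: string): string {
  if (/^https:\/\//.test(modelIdOrEndpoint)) {
    return modelIdOrEndpoint;
  }
  return `https://api-inference.huggingface.co/models/${modelIdOrEndpoint}`;
}

// Usage with example values (the token strings are assumptions).
const exampleTokens: FimTokens = {
  start: "<fim_prefix>",
  end: "<fim_suffix>",
  middle: "<fim_middle>",
};
const exampleSource = "import numpy as np\ndef hello_world():\n    pass\n";
const exampleCursor = "import numpy as np\n".length;
const exampleInputs = buildFimPrompt(exampleSource, exampleCursor, exampleTokens);
const exampleEndpoint = resolveEndpoint("bigcode/starcoder");
```

Keeping prompt construction separate from the HTTP call makes it easier to swap token strings per model without touching the request code.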

Suggestion behavior

You can tune the way the suggestions behave:

llm.enableAutoSuggest lets you choose to enable or disable "suggest-as-you-type" suggestions.
llm.documentFilter lets you enable suggestions only for files that match a pattern you provide. The value must be of type DocumentFilter | DocumentFilter[]:
- to match on all types of buffers: llm.documentFilter: { pattern: "**" }
- to match on all files in my_project/: llm.documentFilter: { pattern: "/path/to/my_project/**" }
- to match on all Python and Rust files: llm.documentFilter: { pattern: "**/*.{py,rs}" }

Keybindings

llm-vscode sets two keybindings:

You can trigger suggestions with Cmd+Shift+L by default, which corresponds to the editor.action.inlineSuggest.trigger command.
Code attribution is bound to Cmd+Shift+A by default, which corresponds to the llm.attribution command.

llm-ls

By default, llm-ls is bundled with the extension. When developing locally or if you built your own binary because your platform is not supported, you can set the llm.lsp.binaryPath setting to the path of the binary.

Tokenizer

llm-ls uses tokenizers to make sure the prompt fits the context_window.

To configure it, you have a few options:

no tokenization, in which case llm-ls will count the number of characters instead:

{
  "llm.tokenizer": null
}

from a local file on your disk:

{
  "llm.tokenizer": {
    "path": "/path/to/my/tokenizer.json"
  }
}

from a Hugging Face repository, llm-ls will attempt to download tokenizer.json at the root of the repository:

{
  "llm.tokenizer": {
    "repository": "myusername/myrepo"
  }
}

from an HTTP endpoint, llm-ls will attempt to download a file via an HTTP GET request:

{
  "llm.tokenizer": {
    "url": "https://my-endpoint.example.com/mytokenizer.json",
    "to": "/download/path/of/mytokenizer.json"
  }
}
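The four options are mutually exclusive, which can be expressed as a union type. The sketch below is illustrative typing only: `TokenizerConfig` and `describeTokenizer` are hypothetical names, not types exported by the extension or by llm-ls.

```typescript
// One variant per option described above; null means "no tokenizer".
type TokenizerConfig =
  | null
  | { path: string }
  | { repository: string }
  | { url: string; to: string };

// Summarize, in plain words, where the tokenizer would come from.
function describeTokenizer(config: TokenizerConfig): string {
  if (config === null) {
    return "no tokenizer: count characters instead";
  }
  if ("path" in config) {
    return `load local file ${config.path}`;
  }
  if ("repository" in config) {
    return `download tokenizer.json from repository ${config.repository}`;
  }
  return `GET ${config.url} and save it to ${config.to}`;
}
```

Modelling the setting this way lets the compiler reject mixed shapes such as an object with a `url` but no `to` download path.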

Code Llama

To test the Code Llama 13B model:

Make sure you have the latest version of this extension.
Make sure you have supplied your HF API token.
Open VSCode Settings (Cmd+,) and type: Llm: Config Template
From the dropdown menu, choose codellama/CodeLlama-13b-hf

Phind and WizardCoder

To test Phind/Phind-CodeLlama-34B-v2 and/or WizardLM/WizardCoder-Python-34B-V1.0:

Make sure you have the latest version of this extension.
Make sure you have supplied your HF API token.
Open VSCode Settings (Cmd+,) and type: Llm: Config Template
From the dropdown menu, choose Phind/Phind-CodeLlama-34B-v2 or WizardLM/WizardCoder-Python-34B-V1.0

Read more about Phind-CodeLlama-34B-v2 here and WizardCoder-15B-V1.0 here.

Developing

Clone this repo: git clone https://github.com/huggingface/llm-vscode
Install deps: cd llm-vscode && npm i
In VSCode, open the Run and Debug side bar and click Launch Extension

Community

Repository	Description
huggingface-vscode-endpoint-server	Custom code generation endpoint for this repository
llm-vscode-inference-server	An endpoint server for efficiently serving quantized open-source LLMs for code.
