
Wingman - AI Coding Assistant

The Wingman-AI extension brings high quality AI assisted coding right to your computer. It's 100% free and your data never leaves your machine - meaning it's completely private! Since the current release relies on running AI models locally using Ollama, a machine with a capable graphics card (Apple M series or NVIDIA) is recommended for the best performance.

🚀 Getting Started

We recommend starting with Ollama and a deepseek-coder model; see why here.

  • Install this extension from the VS Code Marketplace: Wingman-AI
  • Install Ollama
  • Install the supported local models by running the following commands, for example:
    • ollama pull deepseek-coder:6.7b-base-q8_0
    • ollama pull deepseek-coder:6.7b-instruct-q8_0
  • That's it! On launch, the extension will validate that the models configured in its VSCode settings are available. If you wish to customize which models run, see the FAQ section.

Features

Code Completion

The AI will look for natural pauses in typing to decide when to offer code suggestions (keep in mind that the AI is limited by your machine's speed).

Wingman AI code completion example

Code Completion Disable / Hotkey

We understand that sometimes the code completion feature can be too aggressive, which may strain your system's resources during local development. To address this, we have introduced an option to disable automatic code completion. However, we also recognize the usefulness of on-demand completion. Therefore, we've implemented a hotkey that allows you to manually trigger code completion at your convenience.

When you need assistance, simply press Shift + Ctrl + Space. This will bring up a code completion preview right in the editor and a quick action will appear. If you're satisfied with the suggested code, you can accept it by pressing Enter. This provides you with the flexibility to use code completion only when you want it, without the overhead of automatic triggers.
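
If you prefer a different key combination, you can rebind the trigger in your keybindings.json. A minimal sketch follows; the command ID below is a hypothetical placeholder, so look up the extension's actual contributed command in the Keyboard Shortcuts editor first:

```jsonc
// keybindings.json - rebinding the manual completion trigger.
// "wingman.triggerCodeComplete" is a hypothetical command ID;
// find the real one under the extension's contributed commands.
[
  {
    "key": "shift+ctrl+space",
    "command": "wingman.triggerCodeComplete",
    "when": "editorTextFocus"
  }
]
```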

Interactive Chat

Talk to the AI naturally! It will use open files as context to answer your questions, or you can select a section of code to use as context.

Wingman AI chat example


AI Providers

Ollama

Ollama is a free and open-source AI model provider, allowing users to run their own local models.

Why Ollama?

Ollama was chosen for its simplicity, allowing users to pull a number of models in different configurations and update them at will. Ollama will pull optimized models based on your system architecture; however, if you do not have a GPU-accelerated machine, models will run more slowly.

Setting up Ollama

Follow the directions on the Ollama website. Ollama has a number of open source models available that are capable of writing high quality code. See getting started for how to pull and customize models.
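
By default, Ollama serves its API at http://localhost:11434. As a sketch, if the extension exposes a base URL setting (the key name below is an assumption; check the Wingman settings pane for the real one), it would point at that address:

```jsonc
// settings.json - "wingman.ollama.baseUrl" is a hypothetical
// key name; http://localhost:11434 is Ollama's default endpoint.
{
  "wingman.ollama.baseUrl": "http://localhost:11434"
}
```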

Supported Models

The extension uses a separate model for chat and for code completion because different types of models have different strengths; mixing and matching offers the best results.

Supported Models for Code Completion:

  • deepseek-coder base models (the default is deepseek-coder:6.7b-base-q8_0)

Supported Models for Chat:

  • deepseek-coder instruct models (the default is deepseek-coder:6.7b-instruct-q8_0)


Hugging Face

Hugging Face supports hosting and training models, and it also lets you run many models (under 10GB) for free! All you have to do is create a free account.

NOTE - your data is not private and will not be sanitized prior to being sent.

Setting up Hugging Face

Once you have a Hugging Face account and an API key, all you need to do is open the VSCode settings pane for this extension "Wingman" (see FAQ).

Once it's open, select "HuggingFace" as the AI Provider and add your API key under the HuggingFace section:
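
A minimal settings.json sketch of what that looks like (the exact key names are assumptions; use the settings UI to confirm them):

```jsonc
// settings.json - hypothetical key names, shown for illustration.
{
  "wingman.aiProvider": "HuggingFace",
  "wingman.huggingFace.apiKey": "hf_..." // your Hugging Face API key
}
```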

Supported Models

The extension uses a separate model for chat and for code completion because different types of models have different strengths; mixing and matching offers the best results.

Supported Models for Code Completion:

Supported Models for Chat:


OpenAI

OpenAI integration is supported for GPT-4 Turbo, allowing code completion, chat, and other functionality to run on a larger and more powerful model.

NOTE - your data is not private and will not be sanitized prior to being sent.
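
Setup mirrors the other providers: select OpenAI as the AI Provider in the Wingman settings pane and supply your API key. A sketch, with hypothetical key names:

```jsonc
// settings.json - hypothetical key names; "gpt-4-turbo" is the
// OpenAI model identifier for GPT-4 Turbo.
{
  "wingman.aiProvider": "OpenAI",
  "wingman.openAI.apiKey": "sk-...", // your OpenAI API key
  "wingman.openAI.model": "gpt-4-turbo"
}
```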


FAQ

  • How can I change which models are being used? This extension uses settings like any other VSCode extension; see the example below this list. NOTE: Changing a model requires reloading VSCode (on Mac: Cmd+R).

  • The AI models feel slow, why? As of pre-release 0.0.6 we've added an indicator in the bottom status bar to show you when an AI model is actively processing. If you aren't using GPU-accelerated hardware, you may need to look into model quantization.
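
As a sketch, swapping models would look something like this in settings.json (the key names are assumptions; the model tags are the documented defaults). Remember to reload VSCode afterwards:

```jsonc
// settings.json - hypothetical key names; the model tags match
// the defaults listed under Troubleshooting.
{
  "wingman.ollama.codeModel": "deepseek-coder:6.7b-base-q8_0",
  "wingman.ollama.chatModel": "deepseek-coder:6.7b-instruct-q8_0"
}
```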

Troubleshooting

This extension leverages Ollama for its simplicity and its ability to deliver the right container optimized for your running environment. Good AI performance still depends on your machine's specs, so if you do not have GPU acceleration, responses may be slow. During startup, the extension verifies the models configured in the VSCode settings pane for this extension; it ships with the following defaults:

Code Model - deepseek-coder:6.7b-base-q8_0

Chat Model - deepseek-coder:6.7b-instruct-q8_0

The models above require enough RAM to run correctly; you should have at least 12GB of RAM on your machine to run them. If you don't have enough RAM, choose a smaller model, but be aware that it won't perform as well. Also see the information on model quantization.

Release Notes

To see the latest release notes, check out our releases page.


Enjoy!
