2 changes: 1 addition & 1 deletion index.md
@@ -18,7 +18,7 @@ What we call “AI copilots” are much more than a single LLM. As the Berkeley

As more of software development is automated, we are seeing more human engineering time go into monitoring, maintaining, and improving the different components that make up AI software development systems. That said, most copilots to date have been black-box SaaS solutions with roughly the same components, which you often have little to no ability to understand or improve.

-- [Tab model](./where-we-are-today/tab.md)
+- [Autocomplete model](./where-we-are-today/autocomplete.md)
- [Chat model](./where-we-are-today/chat.md)
- [Local context engine](./where-we-are-today/local.md)
- [Server context engine](./where-we-are-today/server.md)
3 changes: 3 additions & 0 deletions where-we-are-today/autocomplete.md
@@ -0,0 +1,3 @@
+# Autocomplete model
+
+The "autocomplete" model component is used to power code completion suggestions and is typically a 1-15B parameter model. These models run on your laptop or on a server and have generally been trained with special templates like fill-in-the-middle (FIM) for code infilling. Because developers need a suggestion within 500ms, you generally need a smaller model to meet the latency requirement; however, models that are too small produce low-quality suggestions, so the autocomplete model is optimized primarily to balance these two constraints. Examples of models used for code completion include [Codex](https://arxiv.org/pdf/2107.03374.pdf), [CodeGemma](https://developers.googleblog.com/2024/04/gemma-family-expands.html), [Code Llama](https://arxiv.org/pdf/2308.12950.pdf), [DeepSeek Coder Base](https://deepseekcoder.github.io/), [StarCoder 2](https://arxiv.org/pdf/2402.19173.pdf), etc.
2 changes: 1 addition & 1 deletion where-we-are-today/chat.md
@@ -1,3 +1,3 @@
# Chat model

-The “chat” model component is used to power question-answer experiences and is typically a 30B+ parameter model. Latency is not as important as it is for the “tab” model, so most people choose the one that gives them the best possible responses, oftentimes opting for SaaS API endpoints. When SaaS isn’t possible or preferred, open-source models are self-hosted on a server for the entire team to use. Examples of models used for chat experiences include GPT-4, DeepSeek Coder 33B, Claude 3, Code Llama 70B, etc.
+The “chat” model component is used to power question-answer experiences and is typically a 30B+ parameter model. Latency is not as important as it is for the “autocomplete” model, so most people choose the one that gives them the best possible responses, oftentimes opting for SaaS API endpoints. When SaaS isn’t possible or preferred, open-source models are self-hosted on a server for the entire team to use. Examples of models used for chat experiences include GPT-4, DeepSeek Coder 33B, Claude 3, Code Llama 70B, Llama 3 70B, etc.
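
Since the chat component is usually reached over an HTTP API, swapping between a SaaS endpoint and a self-hosted open-source model is often just a base-URL change; many self-hosting servers expose an OpenAI-compatible interface. Here is a minimal sketch assuming such an endpoint (the URL, key, and model name below are placeholders, not part of the original docs):

```python
import requests

# A SaaS provider and a self-hosted server (e.g. one running an
# OpenAI-compatible interface) can often be swapped by changing
# only the base URL. Values below are illustrative placeholders.
BASE_URL = "http://localhost:8000/v1"  # or a SaaS provider's endpoint
API_KEY = "sk-..."                     # many self-hosted servers ignore this

response = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "deepseek-coder-33b-instruct",
        "messages": [
            {"role": "system", "content": "You are a coding assistant."},
            {"role": "user", "content": "Why does this function return None?"},
        ],
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```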
3 changes: 0 additions & 3 deletions where-we-are-today/tab.md

This file was deleted.