
✨ Tab Autocomplete #758

Merged
merged 22 commits into preview from nate/tab-autocomplete on Feb 1, 2024

Conversation

sestinj
Contributor

@sestinj sestinj commented Jan 17, 2024

WIP tab autocomplete

@cmp-nct

cmp-nct commented Jan 22, 2024

Nice to see this is being worked on; it will make Continue a true replacement for Copilot.

I've been using Copilot for a couple hundred hours, and here is what I found important regarding autocompletion:

  • word-level autocompletion is very important (Copilot does that with Ctrl + Arrow Right)
  • the amount of completion output should be configurable (as in lines, characters, or guidance)
  • the context given to the LLM should be configurable
  • when working with web projects (PHP, for example) and SQL, having the table schema in context is extremely helpful to the LLM; Copilot currently is not doing that (JetBrains PhpStorm), so it only learns from available examples in context and hallucinates a lot
  • it would be nice if the data that is sent to the LLM is viewable to the user (as debug/insight)
  • context should be configurable per section (as in how many tokens before/after the cursor, how many for SQL, file structure, etc.)
  • Copilot appears to have a "recently edited" temporary section in context, so if you worked in 3 tabs, lines from those areas are available for completion (at least for chat, not sure about autocompletion)

Performance
Depending on the model provider, massive performance improvements are possible.
For example, if you have a context of 5000 tokens, once they are processed they could be loaded from a KV cache again; that would save reprocessing everything, and only the new tokens at the end would need to be evaluated.
Once the context window runs out, a re-RoPE would allow generation to continue, which is magnitudes faster than a re-evaluation.
Any locally running LLM would allow this.
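
To make the KV-cache point concrete, here is a minimal sketch, assuming a local llama.cpp server whose /completion endpoint supports the cache_prompt option (the URL, port, and field names are illustrative and depend on the server build). With the flag set, the server reuses the cached KV state for the shared prompt prefix and only evaluates the new tail tokens.

```typescript
// Minimal sketch (assumptions: local llama.cpp server on port 8080 with
// cache_prompt support; Node 18+ so that global fetch is available).
async function completeWithPromptCache(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8080/completion", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      prompt,             // full prompt; the shared prefix can be served from the KV cache
      n_predict: 64,      // cap the completion length
      cache_prompt: true, // reuse cached KV state instead of reprocessing the prefix
    }),
  });
  const data = await res.json();
  return data.content;    // generated text as returned by the server
}
```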

@meteor199

This is fascinating! Any idea when this PR might be merged and subsequently released?

@sestinj
Contributor Author

sestinj commented Jan 26, 2024

@meteor199 looking to release an early preview next week

@sestinj
Contributor Author

sestinj commented Jan 26, 2024

@cmp-nct thanks for this, definitely a few new ideas here for me. Brief response to your bullets in order:

  • we will do this! Turns out it's built into VS Code
  • planning on giving access to stop words and max tokens, which will allow you to "only complete single lines" if you'd like among other things
  • agreed
  • maybe we even go as far as allowing "plugins". lots to build, but could be very very useful for devs to define links between files themselves
  • agreed
  • not totally sure what each of these mean, but the pattern here is I'm very pro-configuration : )
  • recently edited, copy buffer, open tabs all will be taken into account

Your performance notes are very important—for this reason we'll probably be highly encouraging specific providers that we form-fit at first
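
To make the stop-word and max-token idea above concrete, here is an illustrative sketch; the option names follow a generic completion-API shape and are not Continue's actual configuration schema.

```typescript
// Illustrative only: a newline stop sequence plus a small token budget turns a
// generic completion request into a single-line completion.
const singleLineOptions = {
  maxTokens: 48,    // keep completions short
  stop: ["\n"],     // stop at the first newline => single-line completions only
  temperature: 0.2, // low temperature for more deterministic code
};

// If a backend doesn't honor stop sequences, the same effect can be applied client-side:
function truncateAtStop(completion: string, stop: string[]): string {
  let cut = completion.length;
  for (const s of stop) {
    const i = completion.indexOf(s);
    if (i !== -1 && i < cut) cut = i;
  }
  return completion.slice(0, cut);
}
```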

@c10l

c10l commented Jan 28, 2024

Nice to see this being worked on!

One feature that would be great to have is what Codeium calls Fill-in-the-Middle [0]. I don't know if Copilot has that these days, but when I tried it with Codeium, the number of times it kicked in and helped was really great!

[0] https://codeium.com/blog/inline-fim-code-suggestions

@sestinj
Contributor Author

sestinj commented Jan 28, 2024

@c10l already been done! Turns out many models are trained by default to do this (they are passed a prefix and suffix, and told to write the code that should be inserted at the cursor in between)
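
For reference, fill-in-the-middle prompting usually works by wrapping the prefix and suffix in model-specific sentinel tokens. The sketch below uses StarCoder-style tokens purely as an example; other families (e.g. CodeLlama, DeepSeek-Coder) use their own sentinels, so treat the exact strings as illustrative rather than what Continue sends.

```typescript
// Minimal sketch of FIM prompt assembly (StarCoder-style sentinels shown as an example).
function buildFimPrompt(prefix: string, suffix: string): string {
  return `<fim_prefix>${prefix}<fim_suffix>${suffix}<fim_middle>`;
}

// The model then generates the code that belongs between prefix and suffix,
// and generation is cut off at the model's end-of-completion / stop token.
const prompt = buildFimPrompt(
  "function add(a: number, b: number) {\n  return ",
  ";\n}\n"
);
```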

@cmp-nct

cmp-nct commented Jan 31, 2024

> @cmp-nct thanks for this, definitely a few new ideas here for me. Brief response to your bullets in order:
>
>   • we will do this! Turns out it's built into VS Code
>   • planning on giving access to stop words and max tokens, which will allow you to "only complete single lines" if you'd like among other things
>   • agreed
>   • maybe we even go as far as allowing "plugins". lots to build, but could be very very useful for devs to define links between files themselves
>   • agreed
>   • not totally sure what each of these mean, but the pattern here is I'm very pro-configuration : )
>   • recently edited, copy buffer, open tabs all will be taken into account
>
> Your performance notes are very important—for this reason we'll probably be highly encouraging specific providers that we form-fit at first

Great to hear :)
I don't want to flood you with details, it can get overwhelming. I just can't help but share the most concerning things I noticed, missed, or found important. Maybe some of those resonate :)

Regarding 'context should be configurable per section (as in how many tokens before/after the cursor, how many for SQL, file structure, etc.)':
What I meant is that the amount of data sent to the LLM as "context" should have a broad configuration range, if possible fine-tuned by the type of context.
I'm sure you have plenty of ideas of your own in that regard; I see three main areas:

  1. The primary context - that's N lines of code BEFORE the cursor
  2. The post-cursor context - that applies if "fill-in" type completion is allowed. Basically, any smart instruction LLM can do fill-in with a good prompt, but pure completion LLMs would need a fine-tune for it. So that's N lines of code AFTER the cursor
  3. Metadata that is not in the direct flow of the context but is important for completions:
  • Copilot always sends the function/class prototype to the LLM, for example; some functions/classes are too large to fit into context, but the prototype (arguments, return type, possibly documentation) is especially important
  • SQL schemas (for example, if a table is being used and its schema is available, it could be added as a "comment" at the beginning of the LLM prompt/context)
  • the filename of the file, possibly the project structure, possibly imports/includes (file header)
  • Copilot has recently improved a lot, and one of those improvements seems to be the addition of "recently changed lines". So if I change 5 lines in 5 places in 3 files (25 lines added/changed in total) and then go to file 4, Copilot will send those 25 changed lines as some form of "metadata" alongside the main completion request.
  • And maybe a freeform text that can be added to context, like the current continue.dev Chat feature, where you select lines and add them into the context of the question. The same could be used for autocompletion. So if you work on file_a.c and all the API is in file_b.c, then maybe file_b.h could be added into context manually.

So what I meant is that those sections/subsections - if they are included - should have some configuration options for how much of them is sent to the LLM (see the sketch after this paragraph).
One of your primary difficulties for autocompletion is the wide range of models and APIs available and the wide range of inference hardware and configuration options.
So depending on that, some types of requests will need small context, some will need a lot, some will work well with metadata, some will not. The more we can configure, the more models and platforms will work well.
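
As a purely illustrative sketch, per-section budgets could look something like the following; every name here is hypothetical, not Continue's actual configuration schema.

```typescript
// Hypothetical shape for per-section context budgets; all field names are illustrative.
interface AutocompleteContextBudget {
  prefixLines: number;         // 1. primary context: lines BEFORE the cursor
  suffixLines: number;         // 2. post-cursor context: lines AFTER the cursor (fill-in-the-middle)
  includeFileHeader: boolean;  // 3. filename, imports/includes
  includePrototypes: boolean;  //    enclosing function/class prototypes
  sqlSchemaTokens: number;     //    token budget for SQL schemas injected as comments
  recentlyEditedLines: number; //    cap on "recently changed lines" metadata
  pinnedFiles: string[];       //    manually added context, e.g. ["file_b.h"]
}

const exampleBudget: AutocompleteContextBudget = {
  prefixLines: 100,
  suffixLines: 30,
  includeFileHeader: true,
  includePrototypes: true,
  sqlSchemaTokens: 256,
  recentlyEditedLines: 25,
  pinnedFiles: [],
};
```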

The idea of plugins is very interesting too.

I will stay tuned for the upcoming releases; very promising work.
For a start, I'd guess a minimal version will be fine; I just wanted to share some ideas.

@sestinj sestinj marked this pull request as ready for review February 1, 2024 02:38
@sestinj
Contributor Author

sestinj commented Feb 1, 2024

Here goes the merge! This is still a beta version of tab-autocomplete, and should be understood as such, meaning please share your feedback! The best places to do this are the #feedback channel on Discord, GitHub Issues, or on the Contribution Ideas Board if you have thoughts on how to improve or would like to contribute code.

Over the next few weeks we will be focusing a lot on this feature, so expect significant improvement.

It will use deepseek-coder:1.3b-base on Ollama by default and we recommend this for now. If you'd really like to play around with the settings however, you can learn how to do this here. If there's some option you want and don't see, let us know! The goal is to make Continue's autocomplete entirely configurable.
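
For anyone who wants to see the shape of that default setup, the snippet below is a rough sketch of the relevant config.json entry, written here as a TypeScript object; check the settings guide linked above for the exact keys.

```typescript
// Rough sketch of the default tab-autocomplete model selection described above.
const tabAutocompleteModel = {
  title: "Tab Autocomplete Model",
  provider: "ollama",                // local Ollama server
  model: "deepseek-coder:1.3b-base", // the recommended default for now
};
```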

@sestinj sestinj merged commit 0419da7 into preview Feb 1, 2024
@sestinj sestinj deleted the nate/tab-autocomplete branch February 1, 2024 02:59
@Blackvz

Blackvz commented Feb 1, 2024

Tested it for some hours now. Pretty useful so far even with a small model like deepseek-coder:1.3b

Thanks for your work! I will report if I find anything that does not work as expected, but for now it's really nice.

Btw, I am working on a MacBook with M1, and I get even better code completion results with stable-code-3b, at the same speed as deepseek-coder:1.3b.

@tungh2

tungh2 commented Feb 2, 2024

Great news. Looking forward to IntelliJ plugin update too!

@skydiablo

skydiablo commented Feb 5, 2024

@sestinj that's amazing! You will beat all the other plugins, especially the features from the commercial alternatives! Can you give an estimate for bringing all these features to IntelliJ (in my case, PhpStorm)?
I'm running a big LLM on my 4090: docker exec -it ollama ollama run deepseek-coder:33b

@sestinj
Contributor Author

sestinj commented Feb 5, 2024

We'll probably take another 1-2 weeks to further improve tab autocomplete just within VSCode, and then will transfer it over to JetBrains

@ispolin

ispolin commented Feb 5, 2024

I'd like to add a request for an optional "idle time" parameter to the config. In the preview build, the autocomplete triggers on every change, resulting in continue.dev calling my local LLM multiple times per second and reprocessing the context each time. Usually it's not a problem because of smart context shift, but sometimes it decides to reprocess the entire context, resulting in some slowdown. I'd like to see an option where autocomplete only fires off if I pause typing for some configurable time period.
Thanks for your work and consideration of this request!

@Blackvz

Blackvz commented Feb 5, 2024

> I'd like to add a request for an optional "idle time" parameter to the config. In the preview build, the autocomplete triggers on every change, resulting in continue.dev calling my local LLM multiple times per second and reprocessing the context each time. Usually it's not a problem because of smart context shift, but sometimes it decides to reprocess the entire context, resulting in some slowdown. I'd like to see an option where autocomplete only fires off if I pause typing for some configurable time period. Thanks for your work and consideration of this request!

You can set "debounceDelay" (a number): the delay in milliseconds before triggering autocomplete after a keystroke.

See https://continue.dev/docs/walkthroughs/tab-autocomplete
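
For example (the value is illustrative; see the linked docs for where this option lives in config.json):

```typescript
// Illustrative only: wait 300 ms after the last keystroke before requesting a completion.
const tabAutocompleteOptions = {
  debounceDelay: 300,
};
```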

@deanrie

deanrie commented Feb 16, 2024

Please add the ability to use other providers besides ollama.

@pykeras

pykeras commented Mar 22, 2024

It would be nice to have a shortcut key to disable autocomplete in VSCode, or even a setting to disable it on certain file types.

@dimidagd

+1 On using other providers besides ollama.

Labels: enhancement (New feature or request)