
server : add "token healing" support #5765

Open
CyberShadow opened this issue Feb 28, 2024 · 5 comments
Labels
enhancement New feature or request good first issue Good for newcomers server/webui

Comments

@CyberShadow

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

Hi! I am experimenting with using llama.cpp as a general-purpose code completion backend, similar to TabNine.

I am encountering a small problem: if the completion prompt ends mid-word, the results are not very accurate. For example, for a prompt such as Five, Four, Thre [sic], the model will often treat the incomplete word as if it were complete and suggest , Two (forming Thre, Two instead of completing Three).

I think the following opt-in behavior would be a useful addition to the /completion server API:

  1. Tokenize the prompt text.
  2. Chop off the last token.
  3. Run the prediction with the remaining tokens, but constrain sampling to tokens whose bytes start with the bytes of the chopped token.
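The steps above can be sketched in Python. This is a toy illustration only: the vocabulary, greedy tokenizer, and function names below are hypothetical stand-ins, not the llama.cpp API.

```python
# Toy sketch of "token healing": tokenize, chop the last token,
# and compute the set of vocabulary tokens allowed at the next step.
# VOCAB, tokenize, and heal_prompt are all hypothetical stand-ins.

VOCAB = ["Five", "Four", "Three", "Thre", "Two", "One", ", ", " "]

def tokenize(text):
    """Greedy longest-match tokenizer over the toy vocabulary."""
    tokens = []
    while text:
        match = max((t for t in VOCAB if text.startswith(t)),
                    key=len, default=None)
        if match is None:
            raise ValueError(f"cannot tokenize: {text!r}")
        tokens.append(match)
        text = text[len(match):]
    return tokens

def heal_prompt(text):
    """Return (prompt tokens minus the last one, allowed next tokens)."""
    tokens = tokenize(text)          # 1. tokenize the prompt text
    last = tokens.pop()              # 2. chop off the last token
    # 3. only allow tokens whose bytes start with the chopped token's bytes
    allowed = [t for t in VOCAB if t.startswith(last)]
    return tokens, allowed

tokens, allowed = heal_prompt("Five, Four, Thre")
print(tokens)   # prompt fed to the model, without the dangling "Thre"
print(allowed)  # candidate continuations: only "Three"-like tokens survive
```

With this constraint, the model can no longer emit ", Two" directly after "Thre"; it is forced to first complete the partial word (e.g. with "Three"). In a real implementation the filter would be applied to the logits before sampling rather than to a string list.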

Thanks!

@CyberShadow CyberShadow added the enhancement New feature or request label Feb 28, 2024
@stduhpf
Contributor

stduhpf commented Feb 28, 2024

The usual name for this feature is "token healing". I agree that it would be nice to have it supported here.

@ggerganov ggerganov changed the title Mid-token completion server : add "token healing" support Feb 28, 2024
@ilyannn

ilyannn commented Mar 6, 2024

@ggerganov I'd like to try working on it as my first issue!

@ggerganov
Owner

Ok. This can be demonstrated in one of the examples. One way would be to add it to main or simple, and extend llama_sampling_sample with the necessary functionality.

@mare5x

mare5x commented May 1, 2024

Hi @ilyannn, do you still want to work on this? I've created a draft PR (#7028) that demonstrates token healing, but I still haven't added it to main or server. We can collaborate on that, if you'd like.

@ilyannn

ilyannn commented May 7, 2024

@mare5x Sorry, I have not actually started so please don't wait for me. I'll try to take a look at your PR this week though and will be happy to help in any way I can.
