
Conversation

Collaborator

@mishig25 mishig25 commented Oct 7, 2024

Description

Most GGUF files on the Hub are instruct/conversational, but not all of them. Previously, the local-app snippets assumed that every GGUF is instruct/conversational.
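At a high level, the snippet logic now branches on whether the model is conversational. A rough TypeScript sketch of the idea (the interface and the `tags` check below are simplified assumptions for illustration, not the exact implementation):

// Hypothetical sketch: pick a chat-style snippet only for conversational models,
// otherwise fall back to a plain text-completion prompt.
interface ModelDataLike {
	id: string;
	tags: string[];
}

function isConversational(model: ModelDataLike): boolean {
	// Assumption: conversational/instruct models carry a "conversational" tag on the Hub.
	return model.tags.includes("conversational");
}

function vllmEndpoint(model: ModelDataLike): string {
	return isConversational(model)
		? "http://localhost:8000/v1/chat/completions" // chat template applied server-side
		: "http://localhost:8000/v1/completions"; // raw prompt, no chat template
}

The examples below show the snippets now generated for non-conversational models.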

vLLM

https://huggingface.co/meta-llama/Llama-3.2-3B?local-app=vllm

mishig@machine:~$ curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
                "model": "meta-llama/Llama-3.2-3B",
                "prompt": "Once upon a time",
                "max_tokens": 150,
                "temperature": 0.5
        }'

{"id":"cmpl-157aad50ba6d45a5a7e2641a3c8157dd","object":"text_completion","created":1728293162,"model":"meta-llama/Llama-3.2-3B","choices":[{"index":0,"text":" there was a man who was very generous and kind to everyone. He was a good man and a good person. One day he was walking down the street and he saw a man who was very poor and starving. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking. The man was so hungry that he was crying and shaking","logprobs":null,"finish_reason":"length","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":5,"total_tokens":155,"completion_tokens":150}}

llama.cpp

https://huggingface.co/mlabonne/gemma-2b-GGUF?local-app=llama.cpp

llama-cli \
  --hf-repo "mlabonne/gemma-2b-GGUF" \
  --hf-file gemma-2b.Q2_K.gguf \
  -p "Once upon a time "

llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
        repo_id="mlabonne/gemma-2b-GGUF",
        filename="gemma-2b.Q2_K.gguf",
)

output = llm(
        "Once upon a time ",
        max_tokens=512,
        echo=True
)

print(output)

@mishig25 mishig25 marked this pull request as ready for review October 7, 2024 10:07
@mishig25 mishig25 requested review from ngxson and Vaibhavs10 October 7, 2024 10:07
Base automatically changed from fix_vlmm_snippet to main October 7, 2024 10:08
Member

@Vaibhavs10 Vaibhavs10 left a comment

Minor nit, but important, especially wrt llama.cpp.

@mishig25
Collaborator Author

mishig25 commented Oct 7, 2024

Added test cases, since the examples are getting more complex and we want to make sure we don't break any existing ones:

packages/tasks/src/local-apps.spec.ts & packages/tasks/src/model-libraries-snippets.spec.ts
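A rough sketch of the kind of assertion these specs make, using the vLLM example from the description (this assumes the `LOCAL_APPS` export from local-apps; the fixture shape and exact assertions are simplified for illustration, not the spec verbatim):

import { describe, expect, it } from "vitest";
import { LOCAL_APPS } from "./local-apps";

describe("local-apps snippets", () => {
	it("uses /v1/completions for non-conversational models", () => {
		// Simplified fixture; the real ModelData type has more fields.
		const model = { id: "meta-llama/Llama-3.2-3B", tags: [], inference: "" };
		// eslint-disable-next-line @typescript-eslint/no-explicit-any
		const snippet = LOCAL_APPS["vllm"].snippet(model as any);
		const rendered = JSON.stringify(snippet);
		// Non-conversational models should hit the plain completions endpoint.
		expect(rendered).toContain("/v1/completions");
		expect(rendered).not.toContain("/v1/chat/completions");
	});
});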

`	--data '{`,
`		"model": "${model.id}",`,
`		"messages": [`,
`			{"role": "user", "content": "Hello!"}`,
Member

Suggested change
` {"role": "user", "content": "Hello!"}`,
` {"role": "user", "content": "What is the capital of France?"}`,

Minor suggestion: "Hello!" looks a bit too terse. Perhaps we can unify the instruct examples so they match llama-cpp-python and the others.

Collaborator Author

handled in 2e7c080

@mishig25
Collaborator Author

This PR is finally ready to be reviewed.

Besides the changes described above, the vLLM snippet now also supports vision models:

vllm serve "meta-llama/Llama-3.2-11B-Vision-Instruct"
# Call the server using curl:
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "meta-llama/Llama-3.2-11B-Vision-Instruct",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

@Vaibhavs10 @pcuenca @julien-c

Member

@pcuenca pcuenca left a comment

Conceptually looks great!

`		]`,
`	}'`,
];
const messages = getModelInputSnippet(model) as ChatCompletionInputMessage[];
Member

ah nice!

Member

@Vaibhavs10 Vaibhavs10 left a comment

Niceee! All good wrt llama.cpp + llama-cpp-python + vllm snippets. Do we need to standardise TGI snippets too?

@mishig25
Collaborator Author

mishig25 commented Nov 18, 2024

Do we need to standardise TGI snippets too?

Yes, but let's do it in a subsequent PR once this one gets merged into main.

All good wrt llama.cpp + llama-cpp-python + vllm snippets

approve?

Member

@Vaibhavs10 Vaibhavs10 left a comment

Merci!

@mishig25 mishig25 merged commit f83bbe6 into main Nov 20, 2024
4 of 5 checks passed
@mishig25 mishig25 deleted the non_conv_models branch November 20, 2024 09:44