Prompt returns the statement #85

Open
jonny190 opened this issue Jan 9, 2024 · 11 comments

Comments


jonny190 commented Jan 9, 2024

When trying to use this with LocalAI, it just spits the prompt I sent back at me. Please see the example below.

[screenshot]

````
4:05PM DBG Request received:
4:05PM DBG Configuration read: &{PredictionOptions:{Model:luna-ai-llama2-uncensored.Q8_0.gguf Language: N:0 TopP:1 TopK:80 Temperature:0.5 Maxtokens:150 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q8_0.gguf F16:true Threads:10 Debug:true Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:chat ChatMessage: Completion:completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString:auto functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:22 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:1024 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
4:05PM DBG Response needs to process functions
4:05PM DBG Parameters: &{PredictionOptions:{Model:luna-ai-llama2-uncensored.Q8_0.gguf Language: N:0 TopP:1 TopK:80 Temperature:0.5 Maxtokens:150 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q8_0.gguf F16:true Threads:10 Debug:true Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:chat ChatMessage: Completion:completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString:auto functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:22 MMap:false MMlock:false LowVRAM:false Grammar:space ::= " "?
string ::= "\"" (
[^"\\] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
)* "\"" space
root-0-arguments-list-item-service-data ::= "{" space "\"entity_id\"" space ":" space string "}" space
root-0-arguments-list-item ::= "{" space "\"domain\"" space ":" space string "," space "\"service\"" space ":" space string "," space "\"service_data\"" space ":" space root-0-arguments-list-item-service-data "}" space
root-0-arguments-list ::= "[" space (root-0-arguments-list-item ("," space root-0-arguments-list-item)*)? "]" space
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space
root-1-function ::= "\"answer\""
root-0-arguments ::= "{" space "\"list\"" space ":" space root-0-arguments-list "}" space
root-0-function ::= "\"execute_services\""
root-1-arguments ::= "{" space "\"message\"" space ":" space string "}" space
root-1 ::= "{" space "\"arguments\"" space ":" space root-1-arguments "," space "\"function\"" space ":" space root-1-function "}" space
root ::= root-0 | root-1 StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:1024 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
4:05PM DBG Prompt (before templating): I want you to act as smart home manager of Home Assistant.
I will provide information of smart home along with a question, you will truthfully make correction or answer using information provided in one sentence in everyday language.

Current Time: 2024-01-09 16:05:39.239165+00:00

Available Devices:
```csv
entity_id,name,state,aliases
scene.office_standard,Office Standard,2024-01-09T08:59:36.686917+00:00,
scene.office_game,Office Game,2024-01-08T20:36:39.723638+00:00,
light.bedroom_lamp,Bedroom Lamp,on,
light.bedside_lamps,Bedside Lamps,on,
light.shapes_0c36,Shapes 0C36,on,
light.lines_01a4,Lines 01A4,on,
climate.hallway,Hallway,heat,
light.livingroom_corner_2,LivingRoom corner,unavailable,
light.kitchen_right_up,Kitchen Right UP,on,
light.desk_downlight,Desk Downlight,on,
light.controller_rgb_2304a5,Kitchen Left Up,on,
light.office_light_controller,Office Roof Lights,on,Office Main Light
light.tv_cabinet,TV Cabinet,off,
light.kitdownright,Kitchen Right Downlight,on,
light.kitchen_left_downlight,Kitchen Left Downlight,on,
light.backwall,BackWall,on,
light.4,4,on,
light.controller_rgb_fd2bd8,Bed Downlight,on,
switch.fps_smasher,Computer,off,
light.master_bedroom_table_lamp_bathroom,Master Bedroom Table Lamp Bathroom,on,
light.master_bedroom_table_lamp,Master Bedroom Table Lamp Window,on,
light.0x4c5bb3fffefcd9d6,En suit shower,off,
light.ensuit_down,Master Bathroom,off,
light.hallway,Hallway Downlights,off,
light.ensuit_downlights,EnSuit Downlights,off,
switch.tv_power,Air freshener,off,
light.livingroom_floorlamp,Livingroom Floor Lamp,off,
```

The current state of devices is provided in available devices.
Use execute_services function only for requested action, not for current states.
Do not execute service without user's confirmation.
Do not restate or appreciate what user says, rather make a quick inquiry.
Turn Computer On
4:05PM DBG Prompt (after templating): I want you to act as smart home manager of Home Assistant.
I will provide information of smart home along with a question, you will truthfully make correction or answer using information provided in one sentence in everyday language.

Current Time: 2024-01-09 16:05:39.239165+00:00

Available Devices:
```csv
entity_id,name,state,aliases
scene.office_standard,Office Standard,2024-01-09T08:59:36.686917+00:00,
scene.office_game,Office Game,2024-01-08T20:36:39.723638+00:00,
light.bedroom_lamp,Bedroom Lamp,on,
light.bedside_lamps,Bedside Lamps,on,
light.shapes_0c36,Shapes 0C36,on,
light.lines_01a4,Lines 01A4,on,
climate.hallway,Hallway,heat,
light.livingroom_corner_2,LivingRoom corner,unavailable,
light.kitchen_right_up,Kitchen Right UP,on,
light.desk_downlight,Desk Downlight,on,
light.controller_rgb_2304a5,Kitchen Left Up,on,
light.office_light_controller,Office Roof Lights,on,Office Main Light
light.tv_cabinet,TV Cabinet,off,
light.kitdownright,Kitchen Right Downlight,on,
light.kitchen_left_downlight,Kitchen Left Downlight,on,
light.backwall,BackWall,on,
light.4,4,on,
light.controller_rgb_fd2bd8,Bed Downlight,on,
switch.fps_smasher,Computer,off,
light.master_bedroom_table_lamp_bathroom,Master Bedroom Table Lamp Bathroom,on,
light.master_bedroom_table_lamp,Master Bedroom Table Lamp Window,on,
light.0x4c5bb3fffefcd9d6,En suit shower,off,
light.ensuit_down,Master Bathroom,off,
light.hallway,Hallway Downlights,off,
light.ensuit_downlights,EnSuit Downlights,off,
switch.tv_power,Air freshener,off,
light.livingroom_floorlamp,Livingroom Floor Lamp,off,
```

The current state of devices is provided in available devices.
Use execute_services function only for requested action, not for current states.
Do not execute service without user's confirmation.
Do not restate or appreciate what user says, rather make a quick inquiry.
Turn Computer On
4:05PM DBG Grammar: space ::= " "?
string ::= "\"" (
[^"\\] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
)* "\"" space
root-0-arguments-list-item-service-data ::= "{" space "\"entity_id\"" space ":" space string "}" space
root-0-arguments-list-item ::= "{" space "\"domain\"" space ":" space string "," space "\"service\"" space ":" space string "," space "\"service_data\"" space ":" space root-0-arguments-list-item-service-data "}" space
root-0-arguments-list ::= "[" space (root-0-arguments-list-item ("," space root-0-arguments-list-item)*)? "]" space
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space
root-1-function ::= "\"answer\""
root-0-arguments ::= "{" space "\"list\"" space ":" space root-0-arguments-list "}" space
root-0-function ::= "\"execute_services\""
root-1-arguments ::= "{" space "\"message\"" space ":" space string "}" space
root-1 ::= "{" space "\"arguments\"" space ":" space root-1-arguments "," space "\"function\"" space ":" space root-1-function "}" space
root ::= root-0 | root-1
4:05PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q8_0.gguf
4:05PM DBG Model 'luna-ai-llama2-uncensored.Q8_0.gguf' already loaded
4:05PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr slot 0 is processing [task id: 2]
4:05PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr slot 0 : kv cache rm - [0, end)
4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr
4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr print_timings: prompt eval time = 5866.82 ms / 701 tokens ( 8.37 ms per token, 119.49 tokens per second)
4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr print_timings: eval time = 50921.90 ms / 25 runs ( 2036.88 ms per token, 0.49 tokens per second)
4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr print_timings: total time = 56788.71 ms
4:06PM DBG Function return: { "arguments": { "message": "Turn Computer On" } , "function": "answer"} map[arguments:map[message:Turn Computer On] function:answer]
4:06PM DBG nothing to do, computing a reply
4:06PM DBG Reply received from LLM: Turn Computer On
4:06PM DBG Reply received from LLM(finetuned): Turn Computer On
4:06PM DBG Response: {"created":1704814397,"object":"chat.completion","id":"56301b4b-9e94-4699-88a7-44e619222601","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q8_0.gguf","choices":[{"index":0,"message":{"role":"assistant","content":"Turn Computer On"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
[192.168.4.44]:35842 200 - POST /v1/chat/completions
4:06PM DBG GRPC(luna-ai-llama2-uncensored.Q8_0.gguf-127.0.0.1:40445): stderr slot 0 released (727 tokens in cache)
````

I'm pretty sure my LocalAI instance itself is working: if I ask it how it is, it replies as I'd expect.
[screenshot]

jekalmin (Owner) commented

Thanks for reporting an issue.

Currently, there is an issue when using LocalAI. (see #17 (comment))

Let me try this too and see if something can be done to fix it.
(I failed to install LocalAI before, but let me try again!)

jonny190 (Author) commented

I'm using docker compose for my LocalAI instance:
```yaml
version: '3.6'

services:
  api:
    image: quay.io/go-skynet/local-ai:master-cublas-cuda12
    build:
      context: .
      dockerfile: Dockerfile
    env_file:
      - .env
    volumes:
      - /mnt/nfs/appdata/localai/models:/models:cached
      - /mnt/nfs/appdata/localai/images/:/tmp/generated/images/
    command: ["/usr/bin/local-ai"]
    ports:
      - 8080:8080
```
Not sure how you're trying to run yours, but I'm happy to help you troubleshoot.
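For what it's worth, this is roughly how I bring it up and sanity-check the endpoint before pointing the integration at it. Just a sketch: the port comes from the compose file above, and `/v1/models` is the standard OpenAI-style model-listing route.

```sh
# Start the stack in the background.
docker compose up -d

# Confirm the API answers and the gguf model is listed before testing chat.
curl http://localhost:8080/v1/models
```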


JonahMMay commented Jan 13, 2024

I think this may have something to do with LocalAI's chat and completion templates. I customized the plugin connection to remove functions and used the simple template "You are my smart home assistant." Then I told it "Tell me a joke", to which it replied "Tell me a joke".
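For anyone following along: LocalAI maps the Chat/Completion names in a model's config to .tmpl files in the models directory (the names chat and completion show up in the TemplateConfig of the debug output above). Here's a rough sketch of what such templates tend to look like; the file names and contents are assumptions on my part, not pulled from this setup:

```sh
# Sketch only: Go-template files that LocalAI applies to incoming prompts.
# {{.Input}} is replaced with the request text when the template is applied.
cat > models/chat.tmpl <<'EOF'
USER: {{.Input}}
ASSISTANT:
EOF

cat > models/completion.tmpl <<'EOF'
Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:
{{.Input}}

Response:
EOF
```

The telling detail in the logs below is that for the plugin request the prompt is identical before and after templating, while the plain curl request logs "Template found, input modified to:" with exactly this kind of instruction wrapper.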

But if I build a similar query via curl, I get a proper response:

```sh
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf", "messages": [{"role": "system", "content": "You are my smart home assistant."},{"role": "user", "content": " Tell me a joke."}], "temperature": 0.7 }'
```

```json
{"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Sure, I can do that! Here's a joke for you: Why did the scarecrow win an award? Because he was outstanding in his field."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
```

I started my LocalAI container in debug mode in an SSH window and watched the logs as they came on screen. The big difference I see is this. For the HASS plug-in:
```
5:01PM DBG Prompt (before templating): You are my smart home assistant.
Tell me a joke.
5:01PM DBG Prompt (after templating): You are my smart home assistant.
Tell me a joke.
5:01PM DBG Grammar: root-0-arguments-list-item ::= "{" space "\"domain\"" space ":" space string "," space "\"service\"" space ":" space string "," space "\"service_data\"" space ":" space root-0-arguments-list-item-service-data "}" space
root-0-arguments-list ::= "[" space (root-0-arguments-list-item ("," space root-0-arguments-list-item)*)? "]" space
root-0-arguments ::= "{" space "\"list\"" space ":" space root-0-arguments-list "}" space
root-0 ::= "{" space "\"arguments\"" space ":" space root-0-arguments "," space "\"function\"" space ":" space root-0-function "}" space
root-1-arguments ::= "{" space "\"message\"" space ":" space string "}" space
space ::= " "?
string ::= "\"" (
[^"\\] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
)* "\"" space
root-0-arguments-list-item-service-data ::= "{" space "\"entity_id\"" space ":" space string "}" space
root-1-function ::= "\"answer\""
root-0-function ::= "\"execute_services\""
root-1 ::= "{" space "\"arguments\"" space ":" space root-1-arguments "," space "\"function\"" space ":" space root-1-function "}" space
root ::= root-0 | root-1
5:01PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q4_K_M.gguf
5:01PM DBG Model 'luna-ai-llama2-uncensored.Q4_K_M.gguf' already loaded
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 is processing [task id: 14]
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 : kv cache rm - [0, end)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: prompt eval time = 74.89 ms / 15 tokens ( 4.99 ms per token, 200.30 tokens per second)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: eval time = 743.44 ms / 30 runs ( 24.78 ms per token, 40.35 tokens per second)
5:01PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: total time = 818.33 ms
5:01PM DBG Function return: { "arguments": { "message": "Tell me a joke." },"function": "answer"} map[arguments:map[message:Tell me a joke.] function:answer]
5:01PM DBG nothing to do, computing a reply
5:01PM DBG Reply received from LLM: Tell me a joke.
5:01PM DBG Reply received from LLM(finetuned): Tell me a joke.
5:01PM DBG Response: {"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"message":{"role":"assistant","content":"Tell me a joke."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
```

Versus the curl command:
```
4:55PM DBG Prompt (before templating): You are my smart home assistant.
You are you?
4:55PM DBG Template found, input modified to: Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

You are my smart home assistant.
You are you?

Response:

4:55PM DBG Prompt (after templating): Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

You are my smart home assistant.
You are you?

Response:

4:55PM DBG Model already loaded in memory: luna-ai-llama2-uncensored.Q4_K_M.gguf
4:55PM DBG Model 'luna-ai-llama2-uncensored.Q4_K_M.gguf' already loaded
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 is processing [task id: 12]
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr slot 0 : kv cache rm - [0, end)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: prompt eval time = 95.88 ms / 48 tokens ( 2.00 ms per token, 500.62 tokens per second)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: eval time = 268.57 ms / 14 runs ( 19.18 ms per token, 52.13 tokens per second)
4:55PM DBG GRPC(luna-ai-llama2-uncensored.Q4_K_M.gguf-127.0.0.1:43491): stderr print_timings: total time = 364.45 ms
4:55PM DBG Response: {"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"I am your smart home assistant. How can I assist you today?"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
```

I also converted the curl request to Invoke-WebRequest and ran it on the local PC, and it still worked fine, just to rule out some sort of issue with remotely accessing the model and files.

```json
{"created":1705162992,"object":"chat.completion","id":"99cd5afa-0999-47d5-910c-0473ccd1c0d5","model":"thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"Sure, here's a joke for you: Why did the tomato turn red? Because it saw the salad dressing!"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
```

JonahMMay commented

I'm a bit in over my head here, but it might be related to this?

mudler/LocalAI#1187

jekalmin (Owner) commented

@JonahMMay
Thanks for sharing information.
Have you tried curl with functions added?

```sh
curl --location 'http://localhost:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
    "messages": [
        {
            "role": "system",
            "content": "You are my smart home assistant."
        },
        {
            "role": "user",
            "content": "tell me a joke"
        }
    ],
    "functions": [
        {
            "name": "execute_services",
            "description": "Execute service of devices in Home Assistant.",
            "parameters": {
                "type": "object",
                "properties": {
                    "domain": {
                        "description": "The domain of the service.",
                        "type": "string"
                    },
                    "service": {
                        "description": "The service to be called",
                        "type": "string"
                    },
                    "service_data": {
                        "description": "The service data object to indicate what to control.",
                        "type": "object"
                    }
                },
                "required": [
                    "domain",
                    "service",
                    "service_data"
                ]
            }
        }
    ],
    "function_call": "auto",
    "temperature": 0.7
}'
```

Unfortunately, I still didn't get LocalAI to work :(

JonahMMay commented

If I change functions to [] it appears to work fine. Trying the full code in your comment gives:

```json
{"error":{"code":500,"message":"Unrecognized schema: map[description:The service data object to indicate what to control. type:object]","type":""}}
```

jekalmin (Owner) commented

Sorry to bother you.
Could you try this again?

```sh
curl --location 'http://localhost:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "thebloke__luna-ai-llama2-uncensored-gguf__luna-ai-llama2-uncensored.q4_k_m.gguf",
    "messages": [
        {
            "role": "system",
            "content": "You are my smart home assistant."
        },
        {
            "role": "user",
            "content": "turn on livingroom light"
        }
    ],
    "functions": [
        {
            "name": "execute_services",
            "description": "Execute service of devices in Home Assistant.",
            "parameters": {
                "type": "object",
                "properties": {
                    "domain": {
                        "description": "The domain of the service.",
                        "type": "string"
                    },
                    "service": {
                        "description": "The service to be called",
                        "type": "string"
                    },
                    "service_data": {
                        "type": "object",
                        "properties": {
                            "entity_id": {
                                "type": "array",
                                "items": {
                                    "type": "string",
                                    "description": "The entity_id retrieved from available devices. It must start with domain, followed by dot character."
                                }
                            }
                        }
                    }
                },
                "required": [
                    "domain",
                    "service",
                    "service_data"
                ]
            }
        }
    ],
    "function_call": "auto",
    "temperature": 0.7
}'
```

jekalmin (Owner) commented

Never mind.
I just set up LocalAI successfully.

JonahMMay commented

Awesome! I am headed out of town later today and won't be back until Friday, but I should have remote access to my systems if there's anything you'd like me to test or look at.

clambertus commented

I also have this problem, with the same results as @JonahMMay. If I remove the functions and function_call blocks from the example above, I do get an expected response; otherwise the response is the same as my input.


ThePragmaticArt commented Mar 5, 2024

Same issue on my end with OpenAI Extended + LocalAI, with the few models I've tried. 😞

Vanilla API requests are fine.
