macbook pro M3 using gpt-4 errors with grpc service not ready #2336

Open
csantanapr opened this issue May 16, 2024 · 1 comment
Labels
bug Something isn't working unconfirmed

Comments


csantanapr commented May 16, 2024

LocalAI version:

v2.15.0 (f69de3b)
Downloaded the CLI binary local-ai-git-Darwin-arm64
Also tried the Docker image

Environment, CPU architecture, OS, and Version:

Darwin 80a9972927c5 23.4.0 Darwin Kernel Version 23.4.0: Fri Mar 15 00:12:25 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6030 arm64

Describe the bug

I see errors about the gRPC service not being ready when sending a chat completion request.

To Reproduce

Run from CLI

~/Downloads/local-ai-git-Darwin-arm64

Or docker run

docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu

Then curl

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "gpt-4", "messages": [{"role": "user","content": "How are you doing?", "temperature": 0.1}] }' 
{"error":{"code":500,"message":"could not load model - all backends returned error: grpc service not ready\ngrpc service not ready\ncould not load model: rpc error: code = Unknown desc = failed loading model\ncould not load model: rpc error: code = Unknown desc = failed loading model\nfork/exec /tmp/localai/backend_data/backend-assets/grpc/default.metallib: exec format error\ngrpc service not ready\ncould not load model: rpc error: code = Unavailable desc = error reading from server: EOF\ncould not load model: rpc error: code = Unknown desc = stat /Users/carrlos/dev/gptscript-ai/models/gpt-4: no such file or directory\ncould not load model: rpc error: code = Unknown desc = failed loading model\ncould not load model: rpc error: code = Unknown desc = no huggingface token provided","type":""}}

When I try to install the model, I get an error:

$ ~/Downloads/local-ai-git-Darwin-arm64 models install gpt-4
2:36PM INF Setting logging to info
2:36PM ERR unable to load galleries error="unexpected end of JSON input"
local-ai-git-Darwin-arm64: error: no gallery found with name "gpt-4"
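The "unexpected end of JSON input" error suggests the CLI started with an empty or malformed gallery list. A possible workaround sketch, assuming the GALLERIES environment variable accepts a JSON array of {name, url} entries as described in the LocalAI docs (the gallery URL here is an assumption, not verified against v2.15.0):

# Hypothetical workaround: point the CLI at a model gallery explicitly.
GALLERIES='[{"name":"localai","url":"github:mudler/LocalAI/gallery/index.yaml@master"}]' \
  ~/Downloads/local-ai-git-Darwin-arm64 models install gpt-4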

Expected behavior

The model should load and the chat completion request should return a response instead of a 500 error.
Logs

$ ~/Downloads/local-ai-git-Darwin-arm64 --debug


2:37PM DBG Setting logging to debug
2:37PM INF Starting LocalAI using 4 threads, with models path: /Users/carrlos/dev/gptscript-ai/models
2:37PM INF LocalAI version: v2.15.0 (f69de3be0d274a676f1d1cd302dc4699f1b5aaf0)
2:37PM INF Preloading models from /Users/carrlos/dev/gptscript-ai/models
2:37PM DBG Extracting backend assets files to /tmp/localai/backend_data
2:37PM DBG processing api keys runtime update
2:37PM DBG processing external_backends.json
2:37PM DBG external backends loaded from external_backends.json
2:37PM ERR error establishing configuration directory watcher error="unable to establish watch on the LocalAI Configuration Directory: lstat /Users/carrlos/dev/gptscript-ai/configuration: no such file or directory"
2:37PM INF core/startup process completed!
2:37PM DBG No configuration file found at /tmp/localai/upload/uploadedFiles.json
2:37PM DBG No configuration file found at /tmp/localai/config/assistants.json
2:37PM DBG No configuration file found at /tmp/localai/config/assistantsFile.json
2:37PM INF LocalAI API is listening! Please connect to the endpoint for API documentation. endpoint=http://0.0.0.0:8080
2:38PM DBG Request received: {"model":"gpt-4","language":"","n":0,"top_p":null,"top_k":null,"temperature":null,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","response_format":{},"size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"user","content":"How are you doing?"}],"functions":null,"function_call":null,"stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"backend":"","model_base_name":""}
2:38PM DBG Configuration read: &{PredictionOptions:{Model:gpt-4 Language: N:0 TopP:0x14000610af8 TopK:0x14000610b00 Temperature:0x14000610b08 Maxtokens:0x14000610b38 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0x14000610b30 TypicalP:0x14000610b28 Seed:0x14000610b50 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name: F16:0x14000610af0 Threads:0x14000610ae8 Debug:0x14000610b48 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName: ParallelCalls:false NoGrammar:false ResponseRegex:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0x14000610b20 MirostatTAU:0x14000610b18 Mirostat:0x14000610b10 NGPULayers:0x14000610b40 MMap:0x14000610b48 MMlock:0x14000610b49 LowVRAM:0x14000610b49 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0x14000610ae0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
2:38PM DBG Parameters: &{PredictionOptions:{Model:gpt-4 Language: N:0 TopP:0x14000610af8 TopK:0x14000610b00 Temperature:0x14000610b08 Maxtokens:0x14000610b38 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0x14000610b30 TypicalP:0x14000610b28 Seed:0x14000610b50 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name: F16:0x14000610af0 Threads:0x14000610ae8 Debug:0x14000610b48 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName: ParallelCalls:false NoGrammar:false ResponseRegex:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0x14000610b20 MirostatTAU:0x14000610b18 Mirostat:0x14000610b10 NGPULayers:0x14000610b40 MMap:0x14000610b48 MMlock:0x14000610b49 LowVRAM:0x14000610b49 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0x14000610ae0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
2:38PM DBG Prompt (before templating): How are you doing?
2:38PM DBG Prompt (after templating): How are you doing?
2:38PM DBG Loading from the following backends (in order): [llama-cpp llama-cpp-fallback llama-ggml gpt4all default.metallib llama-cpp-noavx rwkv whisper bert-embeddings huggingface]
2:38PM INF Trying to load the model 'gpt-4' with all the available backends: llama-cpp, llama-cpp-fallback, llama-ggml, gpt4all, default.metallib, llama-cpp-noavx, rwkv, whisper, bert-embeddings, huggingface
2:38PM INF [llama-cpp] Attempting to load
2:38PM INF Loading model 'gpt-4' with backend llama-cpp
2:38PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:38PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: llama-cpp): {backendString:llama-cpp model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:38PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp
2:38PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:59851'
2:38PM DBG GRPC Service state dir: /var/folders/j1/1_49156j7zl5cts1svk9rkkw0000gq/T/go-processmanager2971834631
2:38PM DBG GRPC Service Started
2:38PM DBG GRPC(gpt-4-127.0.0.1:59851): stderr dyld[67179]: Library not loaded: /opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib
2:38PM DBG GRPC(gpt-4-127.0.0.1:59851): stderr   Referenced from: <800934E2-CA0E-3F96-A299-6412FAAC65AC> /private/tmp/localai/backend_data/backend-assets/grpc/llama-cpp
2:38PM DBG GRPC(gpt-4-127.0.0.1:59851): stderr   Reason: tried: '/opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib' (no such file), '/opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib' (no such file)
2:38PM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:59851: connect: connection refused\""
2:38PM DBG GRPC Service NOT ready
2:38PM INF [llama-cpp] Fails: grpc service not ready
2:38PM INF [llama-cpp-fallback] Attempting to load
2:38PM INF Loading model 'gpt-4' with backend llama-cpp-fallback
2:38PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:38PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: llama-cpp-fallback): {backendString:llama-cpp-fallback model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:38PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp-fallback
2:38PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:59987'
2:38PM DBG GRPC Service state dir: /var/folders/j1/1_49156j7zl5cts1svk9rkkw0000gq/T/go-processmanager1870294102
2:38PM DBG GRPC Service Started
2:38PM DBG GRPC(gpt-4-127.0.0.1:59987): stderr dyld[68224]: Library not loaded: /opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib
2:38PM DBG GRPC(gpt-4-127.0.0.1:59987): stderr   Referenced from: <7D6613FE-C73C-3662-A151-E429113020D3> /private/tmp/localai/backend_data/backend-assets/grpc/llama-cpp-fallback
2:38PM DBG GRPC(gpt-4-127.0.0.1:59987): stderr   Reason: tried: '/opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib' (no such file), '/opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib' (no such file)
2:39PM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:59987: connect: connection refused\""
2:39PM DBG GRPC Service NOT ready
2:39PM INF [llama-cpp-fallback] Fails: grpc service not ready
2:39PM INF [llama-ggml] Attempting to load
2:39PM INF Loading model 'gpt-4' with backend llama-ggml
2:39PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:39PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: llama-ggml): {backendString:llama-ggml model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:39PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-ggml
2:39PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:60089'
2:39PM DBG GRPC Service state dir: /var/folders/j1/1_49156j7zl5cts1svk9rkkw0000gq/T/go-processmanager4235136453
2:39PM DBG GRPC Service Started
2:39PM DBG GRPC(gpt-4-127.0.0.1:60089): stderr 2024/05/16 14:39:20 gRPC Server listening at 127.0.0.1:60089
2:39PM DBG GRPC Service Ready
2:39PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:gpt-4 ContextSize:512 Seed:2000563042 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:4 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/Users/carrlos/dev/gptscript-ai/models/gpt-4 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type:}
2:39PM DBG GRPC(gpt-4-127.0.0.1:60089): stderr create_gpt_params: loading model /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:39PM DBG GRPC(gpt-4-127.0.0.1:60089): stderr error loading model: failed to open /Users/carrlos/dev/gptscript-ai/models/gpt-4: No such file or directory
2:39PM DBG GRPC(gpt-4-127.0.0.1:60089): stderr llama_load_model_from_file: failed to load model
2:39PM DBG GRPC(gpt-4-127.0.0.1:60089): stderr llama_init_from_gpt_params: error: failed to load model '/Users/carrlos/dev/gptscript-ai/models/gpt-4'
2:39PM DBG GRPC(gpt-4-127.0.0.1:60089): stderr load_binding_model: error: unable to load model
2:39PM INF [llama-ggml] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
2:39PM INF [gpt4all] Attempting to load
2:39PM INF Loading model 'gpt-4' with backend gpt4all
2:39PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:39PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: gpt4all): {backendString:gpt4all model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:39PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/gpt4all
2:39PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:60097'
2:39PM DBG GRPC Service state dir: /var/folders/j1/1_49156j7zl5cts1svk9rkkw0000gq/T/go-processmanager1315698930
2:39PM DBG GRPC Service Started
2:39PM DBG GRPC(gpt-4-127.0.0.1:60097): stderr 2024/05/16 14:39:22 gRPC Server listening at 127.0.0.1:60097
2:39PM DBG GRPC Service Ready
2:39PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:gpt-4 ContextSize:512 Seed:2000563042 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:4 LibrarySearchPath:/tmp/localai/backend_data/backend-assets/gpt4all RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/Users/carrlos/dev/gptscript-ai/models/gpt-4 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type:}
2:39PM DBG GRPC(gpt-4-127.0.0.1:60097): stderr load_model: error 'No such file or directory'
2:39PM INF [gpt4all] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
2:39PM INF [default.metallib] Attempting to load
2:39PM INF Loading model 'gpt-4' with backend default.metallib
2:39PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:39PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: default.metallib): {backendString:default.metallib model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:39PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/default.metallib
2:39PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:60103'
2:39PM INF [default.metallib] Fails: fork/exec /tmp/localai/backend_data/backend-assets/grpc/default.metallib: exec format error
2:39PM INF [llama-cpp-noavx] Attempting to load
2:39PM INF Loading model 'gpt-4' with backend llama-cpp-noavx
2:39PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:39PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: llama-cpp-noavx): {backendString:llama-cpp-noavx model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:39PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp-noavx
2:39PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:60104'
2:39PM DBG GRPC Service state dir: /var/folders/j1/1_49156j7zl5cts1svk9rkkw0000gq/T/go-processmanager648173044
2:39PM DBG GRPC Service Started
2:39PM DBG GRPC(gpt-4-127.0.0.1:60104): stderr dyld[69294]: Library not loaded: /opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib
2:39PM DBG GRPC(gpt-4-127.0.0.1:60104): stderr   Referenced from: <1F881712-0DA0-3C72-B661-8DA327AEBDB5> /private/tmp/localai/backend_data/backend-assets/grpc/llama-cpp-noavx
2:39PM DBG GRPC(gpt-4-127.0.0.1:60104): stderr   Reason: tried: '/opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib' (no such file), '/opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib' (no such file)
2:40PM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:60104: connect: connection refused\""
2:40PM DBG GRPC Service NOT ready
2:40PM INF [llama-cpp-noavx] Fails: grpc service not ready
2:40PM INF [rwkv] Attempting to load
2:40PM INF Loading model 'gpt-4' with backend rwkv
2:40PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:40PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: rwkv): {backendString:rwkv model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:40PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/rwkv
2:40PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:60221'
2:40PM DBG GRPC Service state dir: /var/folders/j1/1_49156j7zl5cts1svk9rkkw0000gq/T/go-processmanager3272379195
2:40PM DBG GRPC Service Started
2:40PM DBG GRPC(gpt-4-127.0.0.1:60221): stderr 2024/05/16 14:40:04 gRPC Server listening at 127.0.0.1:60221
2:40PM DBG GRPC Service Ready
2:40PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:gpt-4 ContextSize:512 Seed:2000563042 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:4 LibrarySearchPath:/tmp/localai/backend_data/backend-assets/gpt4all RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/Users/carrlos/dev/gptscript-ai/models/gpt-4 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type:}
2:40PM DBG GRPC(gpt-4-127.0.0.1:60221): stderr Failed to open file /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:40PM DBG GRPC(gpt-4-127.0.0.1:60221): stderr /Users/runner/work/LocalAI/LocalAI/sources/go-rwkv.cpp/rwkv.cpp/./rwkv_model_loading.inc:155: file.file
2:40PM DBG GRPC(gpt-4-127.0.0.1:60221): stderr
2:40PM DBG GRPC(gpt-4-127.0.0.1:60221): stderr /Users/runner/work/LocalAI/LocalAI/sources/go-rwkv.cpp/rwkv.cpp/rwkv.cpp:63: rwkv_load_model_from_file(file_path, *ctx->model)
2:40PM DBG GRPC(gpt-4-127.0.0.1:60221): stderr 2024/05/16 14:40:06 InitFromFile /Users/carrlos/dev/gptscript-ai/models/gpt-4 failed
2:40PM INF [rwkv] Fails: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
2:40PM INF [whisper] Attempting to load
2:40PM INF Loading model 'gpt-4' with backend whisper
2:40PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:40PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: whisper): {backendString:whisper model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:40PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/whisper
2:40PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:60227'
2:40PM DBG GRPC Service state dir: /var/folders/j1/1_49156j7zl5cts1svk9rkkw0000gq/T/go-processmanager2751136785
2:40PM DBG GRPC Service Started
2:40PM DBG GRPC(gpt-4-127.0.0.1:60227): stderr 2024/05/16 14:40:06 gRPC Server listening at 127.0.0.1:60227
2:40PM DBG GRPC Service Ready
2:40PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:gpt-4 ContextSize:512 Seed:2000563042 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:4 LibrarySearchPath:/tmp/localai/backend_data/backend-assets/gpt4all RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/Users/carrlos/dev/gptscript-ai/models/gpt-4 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type:}
2:40PM INF [whisper] Fails: could not load model: rpc error: code = Unknown desc = stat /Users/carrlos/dev/gptscript-ai/models/gpt-4: no such file or directory
2:40PM INF [bert-embeddings] Attempting to load
2:40PM INF Loading model 'gpt-4' with backend bert-embeddings
2:40PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:40PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: bert-embeddings): {backendString:bert-embeddings model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:40PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/bert-embeddings
2:40PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:60239'
2:40PM DBG GRPC Service state dir: /var/folders/j1/1_49156j7zl5cts1svk9rkkw0000gq/T/go-processmanager349604514
2:40PM DBG GRPC Service Started
2:40PM DBG GRPC(gpt-4-127.0.0.1:60239): stderr 2024/05/16 14:40:08 gRPC Server listening at 127.0.0.1:60239
2:40PM DBG GRPC Service Ready
2:40PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:gpt-4 ContextSize:512 Seed:2000563042 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:4 LibrarySearchPath:/tmp/localai/backend_data/backend-assets/gpt4all RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/Users/carrlos/dev/gptscript-ai/models/gpt-4 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type:}
2:40PM DBG GRPC(gpt-4-127.0.0.1:60239): stderr bert_load_from_file: failed to open '/Users/carrlos/dev/gptscript-ai/models/gpt-4'
2:40PM DBG GRPC(gpt-4-127.0.0.1:60239): stderr bert_bootstrap: failed to load model from '/Users/carrlos/dev/gptscript-ai/models/gpt-4'
2:40PM INF [bert-embeddings] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
2:40PM INF [huggingface] Attempting to load
2:40PM INF Loading model 'gpt-4' with backend huggingface
2:40PM DBG Loading model in memory from file: /Users/carrlos/dev/gptscript-ai/models/gpt-4
2:40PM DBG Loading Model gpt-4 with gRPC (file: /Users/carrlos/dev/gptscript-ai/models/gpt-4) (backend: huggingface): {backendString:huggingface model:gpt-4 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0x14000367400 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
2:40PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/huggingface
2:40PM DBG GRPC Service for gpt-4 will be running at: '127.0.0.1:60247'
2:40PM DBG GRPC Service state dir: /var/folders/j1/1_49156j7zl5cts1svk9rkkw0000gq/T/go-processmanager3330325305
2:40PM DBG GRPC Service Started
2:40PM DBG GRPC(gpt-4-127.0.0.1:60247): stderr 2024/05/16 14:40:10 gRPC Server listening at 127.0.0.1:60247
2:40PM DBG GRPC Service Ready
2:40PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:gpt-4 ContextSize:512 Seed:2000563042 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:4 LibrarySearchPath:/tmp/localai/backend_data/backend-assets/gpt4all RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/Users/carrlos/dev/gptscript-ai/models/gpt-4 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type:}
2:40PM INF [huggingface] Fails: could not load model: rpc error: code = Unknown desc = no huggingface token provided
2:40PM ERR Server error error="could not load model - all backends returned error: grpc service not ready\ngrpc service not ready\ncould not load model: rpc error: code = Unknown desc = failed loading model\ncould not load model: rpc error: code = Unknown desc = failed loading model\nfork/exec /tmp/localai/backend_data/backend-assets/grpc/default.metallib: exec format error\ngrpc service not ready\ncould not load model: rpc error: code = Unavailable desc = error reading from server: EOF\ncould not load model: rpc error: code = Unknown desc = stat /Users/carrlos/dev/gptscript-ai/models/gpt-4: no such file or directory\ncould not load model: rpc error: code = Unknown desc = failed loading model\ncould not load model: rpc error: code = Unknown desc = no huggingface token provided" ip=127.0.0.1 latency=2m12.220353916s method=POST status=500 url=/v1/chat/completions
csantanapr added the bug and unconfirmed labels on May 16, 2024
prabirshrestha commented

I had similar issues on a Mac M3 when running phi3-mini and LocalAI-llama3-8b-function-call-v0.2. After installing the following two packages, I was able to chat with both. It would be great if these were statically linked so the binary just works.

brew install abseil grpc

I saw the same error reported originally:

stderr dyld[69294]: Library not loaded: /opt/homebrew/opt/abseil/lib/libabsl_flags_parse.2401.0.0.dylib
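For anyone verifying the fix, a quick check sketch (otool ships with the Xcode command-line tools; the paths are copied from the stderr output above):

# Confirm the Abseil dylib reported missing by dyld now exists:
ls /opt/homebrew/opt/abseil/lib/libabsl_flags_parse*.dylib

# Inspect which dynamic libraries the llama-cpp backend expects:
otool -L /tmp/localai/backend_data/backend-assets/grpc/llama-cpp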
