
Error while submitting prompt: TypeError: Failed to fetch #211

Closed
yesbroc opened this issue Jun 4, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@yesbroc

yesbroc commented Jun 4, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [Y] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [Y] I carefully followed the README.md.
  • [Y] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [Y] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

Milliseconds after sending my prompt, it crashed and gave no reply.

Current Behavior

Exception occurred during processing of request from ('127.0.0.1', 51868)
Traceback (most recent call last):
File "socketserver.py", line 316, in _handle_request_noblock
File "socketserver.py", line 347, in process_request
File "socketserver.py", line 360, in finish_request
File "koboldcpp.py", line 223, in call
File "http\server.py", line 651, in init
File "socketserver.py", line 747, in init
File "http\server.py", line 425, in handle
File "http\server.py", line 413, in handle_one_request
File "koboldcpp.py", line 324, in do_POST
File "koboldcpp.py", line 171, in generate
TypeError: int expected instead of float
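
For context: "TypeError: int expected instead of float" is the message Python's ctypes raises when a float is assigned where a c_int is expected, and koboldcpp.py packs the request's sampler settings into a ctypes structure before calling into the native library. The request payloads logged below send "top_k": 0.4, a float. A minimal sketch of the failure mechanism, with hypothetical field names (not the actual koboldcpp struct):

import ctypes

# Hypothetical reduction of koboldcpp's generate() call path: sampler
# settings from the JSON request are packed into a ctypes structure.
class GenerationInputs(ctypes.Structure):
    _fields_ = [
        ("top_k", ctypes.c_int),    # native side expects an int
        ("top_p", ctypes.c_float),
    ]

inputs = GenerationInputs()
inputs.top_p = 0.3   # fine: c_float accepts a Python float
inputs.top_k = 0.4   # raises TypeError: int expected instead of float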

Environment and Context

Windows 11 home
koboldcpp.exe
Guanaco 13b GGML
16gb Ram
no venv

Failure Information (for bugs)

(The full console output for this failure is reproduced under Failure Logs below.)

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

  1. Load the model with SmartContext and CLBlast enabled.

  2. Run the web UI with Kobold chat.
     (screenshot of the web UI omitted)

  3. Try talking to it.

Failure Logs

Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

Install the latest PowerShell for new features and improvements! https://aka.ms/PSWindows

PS C:\Users\orijp\OneDrive\Desktop\chatgpts> koboldcpp.exe "C:\Users\orijp\OneDrive\Desktop\chatgpts\oobabooga_windows\oobabooga_windows\text-generation-webui\models\ggml-guanaco-13B.ggmlv3.q5_1.bin" --stream --useclblast 0 0 --gpulayers 7 --threads 12 --smartcontext
koboldcpp.exe : The term 'koboldcpp.exe' is not recognized as the name of a cmdlet, function, script file, or
operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try
again.
At line:1 char:1
+ koboldcpp.exe "C:\Users\orijp\OneDrive\Desktop\chatgpts\oobabooga_win ...
    + CategoryInfo          : ObjectNotFound: (koboldcpp.exe:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

Suggestion [3,General]: The command koboldcpp.exe was not found, but does exist in the current location. Windows PowerShell does not load commands from the current location by default. If you trust this command, instead type: ".\koboldcpp.exe". See "get-help about_Command_Precedence" for more details.
PS C:\Users\orijp\OneDrive\Desktop\chatgpts> ./koboldcpp.exe "C:\Users\orijp\OneDrive\Desktop\chatgpts\oobabooga_windows\oobabooga_windows\text-generation-webui\models\ggml-guanaco-13B.ggmlv3.q5_1.bin" --stream --useclblast 0 0 --gpulayers 7 --threads 12 --smartcontext
Welcome to KoboldCpp - Version 1.24
Attempting to use CLBlast library for faster prompt ingestion. A compatible clblast will be required.
Initializing dynamic library: koboldcpp_clblast.dll

Loading model: C:\Users\orijp\OneDrive\Desktop\chatgpts\oobabooga_windows\oobabooga_windows\text-generation-webui\models\ggml-guanaco-13B.ggmlv3.q5_1.bin
[Threads: 12, BlasThreads: 12, SmartContext: True]


Identified as LLAMA model: (ver 5)
Attempting to Load...

System Info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
llama.cpp: loading model from C:\Users\orijp\OneDrive\Desktop\chatgpts\oobabooga_windows\oobabooga_windows\text-generation-webui\models\ggml-guanaco-13B.ggmlv3.q5_1.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 0.09 MB
llama_model_load_internal: mem required = 11359.05 MB (+ 1608.00 MB per state)

Initializing CLBlast (First Run)...
Attempting to use: Platform=0, Device=0 (If invalid, program will crash)
Using Platform: NVIDIA CUDA Device: NVIDIA GeForce RTX 3050 Ti Laptop GPU FP16: 0
CL FP16 temporarily disabled pending further optimization.
llama_model_load_internal: [opencl] offloading 7 layers to GPU
llama_model_load_internal: [opencl] total VRAM used: 1588 MB
llama_init_from_file: kv self size = 1600.00 MB
Load Model OK: True
Embedded Kobold Lite loaded.
Starting Kobold HTTP Server on port 5001
Please connect to custom endpoint at http://localhost:5001
127.0.0.1 - - [04/Jun/2023 11:26:59] "GET /?streaming=1 HTTP/1.1" 200 -
127.0.0.1 - - [04/Jun/2023 11:27:00] "GET /api/v1/model HTTP/1.1" 200 -
127.0.0.1 - - [04/Jun/2023 11:27:00] "GET /api/v1/info/version HTTP/1.1" 200 -
127.0.0.1 - - [04/Jun/2023 11:27:00] "GET /sw.js HTTP/1.1" 404 -
127.0.0.1 - - [04/Jun/2023 11:27:00] "GET /manifest.json HTTP/1.1" 404 -

Input: {"n": 1, "max_context_length": 2048, "max_length": 8, "rep_pen": 1.25, "temperature": 0.5, "top_p": 0.3, "top_k": 0.4, "top_a": 0.95, "typical": 1, "tfs": 1, "rep_pen_range": 1024, "rep_pen_slope": 0.7, "sampler_order": [0, 1, 2, 3, 4, 5, 6], "prompt": "[The following is a chat message log between you and an extremely intelligent and knowledgeable AI system named KoboldGPT. KoboldGPT is a state-of-the-art Artificial General Intelligence. You may ask any question, or request any task, and KoboldGPT will always be able to respond accurately and truthfully.]\n\nYou: What are german shepherds?\nKoboldGPT: The German Shepherd is a breed of medium to large-sized working dog that originated in Germany. In the English language, the breed's officially recognized name is German Shepherd Dog. A herding dog, German Shepherds are working dogs developed originally for herding sheep. Since that time, however, because of their strength, intelligence, trainability, and obedience, German Shepherds around the world are often the preferred breed for many types of work, including disability assistance, search-and-rescue, police and military roles and acting.\nYou: Which is heavier, a duck or a car?\nKoboldGPT: A car weighs around 1300 kilograms, while a duck only weighs about 2 kilograms. Therefore, a car is heavier than a duck.\nYou: What is the atomic number of Calcium?\nKoboldGPT: A calcium atom has 20 protons, 20 electrons and 20 neutrons. The atomic number or proton number of a chemical element is the number of protons found in the nucleus of every atom of that element. The atomic number uniquely identifies a chemical element. It is identical to the charge number of the nucleus. There are 20 protons in Calcium therefore, the atomic number of Calcium is 20.\nYou: What is 28 + 13?\nKoboldGPT: 28 + 13 is equal to 41.\nYou: Tell me a joke.\nKoboldGPT: Why did the chicken cross the road? To get to the other side!\nYou: How is an earthquake measured?\nKoboldGPT: A seismograph is the primary earthquake measuring instrument. The seismograph produces a digital graphic recording of the ground motion caused by the seismic waves. The digital recording is called seismogram. A network of worldwide seismographs detects and measures the strength and duration of the earthquake's waves. The magnitude of an earthquake and the intensity of shaking is usually reported on the Richter scale.\n\nKoboldGPT: Hello, I am KoboldGPT, your personal AI assistant. What would you like to know?\nYou: o\nYou: o\nYou: frick\nKoboldGPT:", "quiet": true, "stop_sequence": ["You:"]}

Exception occurred during processing of request from ('127.0.0.1', 51985)
Traceback (most recent call last):
File "socketserver.py", line 316, in _handle_request_noblock
File "socketserver.py", line 347, in process_request
File "socketserver.py", line 360, in finish_request
File "koboldcpp.py", line 223, in call
File "http\server.py", line 651, in init
File "socketserver.py", line 747, in init
File "http\server.py", line 425, in handle
File "http\server.py", line 413, in handle_one_request
File "koboldcpp.py", line 324, in do_POST
File "koboldcpp.py", line 171, in generate
TypeError: int expected instead of float

@LostRuins
Owner

I notice you are running an older version v1.24 of KoboldCpp.

Can you please update to the latest version, v1.28, and try again? If it still errors, show me the full console output log. Thanks.

@yesbroc
Author

yesbroc commented Jun 4, 2023

C:\Users\orijp\OneDrive\Desktop\chatgpts>koboldcpp.exe "C:\Users\orijp\OneDrive\Desktop\chatgpts\oobabooga_windows\oobabooga_windows\text-generation-webui\models\ggml-guanaco-13B.ggmlv3.q5_1.bin"  --stream --useclblast 0 0 --gpulayers 7 --threads 12
Welcome to KoboldCpp - Version 1.28
Attempting to use CLBlast library for faster prompt ingestion. A compatible clblast will be required.
Initializing dynamic library: koboldcpp_clblast.dll
==========
Loading model: C:\Users\orijp\OneDrive\Desktop\chatgpts\oobabooga_windows\oobabooga_windows\text-generation-webui\models\ggml-guanaco-13B.ggmlv3.q5_1.bin
[Threads: 12, BlasThreads: 12, SmartContext: False]

---
Identified as LLAMA model: (ver 5)
Attempting to Load...
---
System Info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
llama.cpp: loading model from C:\Users\orijp\OneDrive\Desktop\chatgpts\oobabooga_windows\oobabooga_windows\text-generation-webui\models\ggml-guanaco-13B.ggmlv3.q5_1.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 5120
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 40
llama_model_load_internal: n_layer    = 40
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 9 (mostly Q5_1)
llama_model_load_internal: n_ff       = 13824
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size =    0.09 MB

Platform:0 Device:0  - NVIDIA CUDA with NVIDIA GeForce RTX 3050 Ti Laptop GPU
Platform:1 Device:0  - AMD Accelerated Parallel Processing with gfx90c
Platform:2 Device:0  - OpenCLOn12 with AMD Radeon(TM) Graphics
Platform:2 Device:1  - OpenCLOn12 with NVIDIA GeForce RTX 3050 Ti Laptop GPU
Platform:2 Device:2  - OpenCLOn12 with Microsoft Basic Render Driver

ggml_opencl: selecting platform: 'NVIDIA CUDA'
ggml_opencl: selecting device: 'NVIDIA GeForce RTX 3050 Ti Laptop GPU'
ggml_opencl: device FP16 support: false
CL FP16 temporarily disabled pending further optimization.
llama_model_load_internal: using OpenCL for GPU acceleration
llama_model_load_internal: mem required  = 9770.65 MB (+ 1608.00 MB per state)
llama_model_load_internal: offloading 7 layers to GPU
llama_model_load_internal: total VRAM used: 1588 MB
...................
llama_init_from_file: kv self size  = 1600.00 MB
Load Model OK: True
Embedded Kobold Lite loaded.
Starting Kobold HTTP Server on port 5001
Please connect to custom endpoint at http://localhost:5001
127.0.0.1 - - [04/Jun/2023 23:42:46] "GET / HTTP/1.1" 302 -
Force redirect to streaming mode, as --stream is set.
127.0.0.1 - - [04/Jun/2023 23:42:46] "GET /?streaming=1 HTTP/1.1" 200 -
127.0.0.1 - - [04/Jun/2023 23:42:46] "GET /api/v1/model HTTP/1.1" 200 -
127.0.0.1 - - [04/Jun/2023 23:42:46] "GET /api/v1/info/version HTTP/1.1" 200 -
127.0.0.1 - - [04/Jun/2023 23:42:46] "GET /sw.js HTTP/1.1" 404 -
127.0.0.1 - - [04/Jun/2023 23:42:46] "GET /manifest.json HTTP/1.1" 404 -

Input: {"n": 1, "max_context_length": 2048, "max_length": 8, "rep_pen": 1.25, "temperature": 0.5, "top_p": 0.3, "top_k": 0.4, "top_a": 0.95, "typical": 1, "tfs": 1, "rep_pen_range": 1024, "rep_pen_slope": 0.7, "sampler_order": [0, 1, 2, 3, 4, 5, 6], "prompt": "[The following is a chat message log between you and an extremely intelligent and knowledgeable AI system named KoboldGPT. KoboldGPT is a state-of-the-art Artificial General Intelligence. You may ask any question, or request any task, and KoboldGPT will always be able to respond accurately and truthfully.]\n\nYou: What are german shepherds?\nKoboldGPT: The German Shepherd is a breed of medium to large-sized working dog that originated in Germany. In the English language, the breed's officially recognized name is German Shepherd Dog. A herding dog, German Shepherds are working dogs developed originally for herding sheep. Since that time, however, because of their strength, intelligence, trainability, and obedience, German Shepherds around the world are often the preferred breed for many types of work, including disability assistance, search-and-rescue, police and military roles and acting.\nYou: Which is heavier, a duck or a car?\nKoboldGPT: A car weighs around 1300 kilograms, while a duck only weighs about 2 kilograms. Therefore, a car is heavier than a duck.\nYou: What is the atomic number of Calcium?\nKoboldGPT: A calcium atom has 20 protons, 20 electrons and 20 neutrons. The atomic number or proton number of a chemical element is the number of protons found in the nucleus of every atom of that element. The atomic number uniquely identifies a chemical element. It is identical to the charge number of the nucleus. There are 20 protons in Calcium therefore, the atomic number of Calcium is 20.\nYou: What is 28 + 13?\nKoboldGPT: 28 + 13 is equal to 41.\nYou: Tell me a joke.\nKoboldGPT: Why did the chicken cross the road? To get to the other side!\nYou: How is an earthquake measured?\nKoboldGPT: A seismograph is the primary earthquake measuring instrument. The seismograph produces a digital graphic recording of the ground motion caused by the seismic waves. The digital recording is called seismogram. A network of worldwide seismographs detects and measures the strength and duration of the earthquake's waves. The magnitude of an earthquake and the intensity of shaking is usually reported on the Richter scale.\n\nKoboldGPT: Hello, I am KoboldGPT, your personal AI assistant. What would you like to know?\nYou: fr\nYou: ok\nYou: hi\nYou: crigne\nKoboldGPT:", "quiet": true, "stop_sequence": ["You:"]}
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 65491)
Traceback (most recent call last):
  File "socketserver.py", line 316, in _handle_request_noblock
  File "socketserver.py", line 347, in process_request
  File "socketserver.py", line 360, in finish_request
  File "koboldcpp.py", line 225, in __call__
  File "http\server.py", line 647, in __init__
  File "socketserver.py", line 747, in __init__
  File "http\server.py", line 427, in handle
  File "http\server.py", line 415, in handle_one_request
  File "koboldcpp.py", line 326, in do_POST
  File "koboldcpp.py", line 172, in generate
TypeError: int expected instead of float
----------------------------------------

@yesbroc
Author

yesbroc commented Jun 6, 2023

How could I reinstall?

@LostRuins
Owner

LostRuins commented Jun 6, 2023

This bug will be fixed in the new version. For now, you can fix it by setting topK back to 0.
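
In practice that means changing the Top-K value in the Kobold Lite sampler settings from 0.4 back to 0 (which disables top-k sampling). Equivalently, if you drive the API directly, send top_k as an integer. A minimal sketch against the /api/v1/generate endpoint shown in the logs above, using only the standard library, with the payload trimmed and assuming the usual KoboldAI-style response shape:

import json
import urllib.request

# Same request as the failing one, except top_k is an integer
# (0 disables top-k sampling) rather than the float 0.4.
payload = {
    "n": 1,
    "max_context_length": 2048,
    "max_length": 8,
    "temperature": 0.5,
    "top_p": 0.3,
    "top_k": 0,   # int, not 0.4 -- avoids the ctypes TypeError
    "prompt": "You: hi\nKoboldGPT:",
    "stop_sequence": ["You:"],
}

req = urllib.request.Request(
    "http://localhost:5001/api/v1/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
    print(result["results"][0]["text"])  # assumes KoboldAI-style response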

@LostRuins
Owner

Hi, can you please try the latest version? This should be fixed now.
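
A plausible shape for such a fix (a sketch under that assumption, not the actual patch) is to coerce numeric request fields to the types the native struct expects before the ctypes assignment, so a client sending 0.4 degrades gracefully instead of killing the request handler:

# Hypothetical normalization step inside do_POST/generate: accept
# whatever the JSON body provides and coerce before the ctypes struct.
def coerce_sampler_params(body: dict) -> dict:
    for name in ("top_k", "max_length", "max_context_length", "rep_pen_range"):
        if name in body:
            body[name] = int(body[name])  # 0.4 -> 0 instead of a TypeError
    return body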

@LostRuins LostRuins added the bug Something isn't working label Jun 7, 2023
@yesbroc
Author

yesbroc commented Jun 7, 2023

That fixed the issue, thanks.

@yesbroc yesbroc closed this as completed Jun 7, 2023