Describe the Issue
Using the --cli flag sometimes causes the error below, and other times the program just closes. This is the error message I get if I wait long enough before sending anything.
.................................................................................................
Automatic RoPE Scaling: Using (scale:1.000, base:10000.0).
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 4216
llama_context: n_ctx_per_seq = 4216
llama_context: n_batch = 512
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 0
llama_context: freq_base = 10000.0
llama_context: freq_scale = 1
llama_context: n_ctx_per_seq (4216) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: CUDA_Host output buffer size = 0.12 MiB
llama_context: n_ctx = 4216
llama_context: n_ctx = 4224 (padded)
init: kv_size = 4224, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 32, can_shift = 1
init: CUDA0 KV buffer size = 528.00 MiB
llama_context: KV self size = 528.00 MiB, K (f16): 264.00 MiB, V (f16): 264.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
llama_context: max_nodes = 65536
llama_context: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 0
llama_context: reserving graph for n_tokens = 512, n_seqs = 1
llama_context: reserving graph for n_tokens = 1, n_seqs = 1
llama_context: reserving graph for n_tokens = 512, n_seqs = 1
llama_context: CUDA0 compute buffer size = 304.25 MiB
llama_context: CUDA_Host compute buffer size = 16.26 MiB
llama_context: graph nodes = 1094
llama_context: graph splits = 2
Load Text Model OK: True
Chat template heuristics failed to identify chat completions format. Alpaca will be used.
Embedded KoboldAI Lite loaded.
Embedded API docs loaded.
Active Modules: TextGeneration
Inactive Modules: ImageGeneration VoiceRecognition MultimodalVision NetworkMultiplayer ApiKeyPassword WebSearchProxy TextToSpeech VectorEmbeddings AdminControl
Enabled APIs: KoboldCppApi OpenAiApi OllamaApi
===
Now running KoboldCpp in Interactive Terminal Chat mode.
Type /quit or /exit to end session.
hello
Traceback (most recent call last):
File "koboldcpp.py", line 6543, in
main(launch_args=parser.parse_args(),default_args=parser.parse_args([]))
File "koboldcpp.py", line 5628, in main
kcpp_main_process(args,global_memory,using_gui_launcher)
File "koboldcpp.py", line 6311, in kcpp_main_process
print(result.strip() + "\n", flush=True)
OSError: [WinError 6] The handle is invalid
[8940] Failed to execute script 'koboldcpp' due to unhandled exception!
Traceback (most recent call last):
File "", line 1, in
OSError: [WinError 6] The handle is invalid
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
OSError: [WinError 6] The handle is invalid
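
For context, the crash comes from the unguarded print(result.strip() + "\n", flush=True) call in kcpp_main_process shown in the traceback: on Windows, a flushed write to stdout raises OSError [WinError 6] once the console handle has gone invalid. A minimal sketch of a defensive wrapper that would keep the CLI loop alive (safe_print is a hypothetical helper, not part of koboldcpp):

```python
import sys

def safe_print(text):
    # Hypothetical guard: on Windows, a flushed stdout write raises
    # OSError: [WinError 6] once the console handle is invalid.
    try:
        print(text, flush=True)
    except OSError:
        # stdout is gone; fall back to stderr, and if that handle is
        # also invalid, drop the output instead of crashing the session.
        try:
            sys.stderr.write(text + "\n")
            sys.stderr.flush()
        except OSError:
            pass
```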
Additional Information:
I am using an RTX 3060 8GB and an Intel Core i5-13600KF.
The issue has been happening since --cli was introduced one major version ago, in 1.87.*.
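
In case it helps triage, here is a standalone sketch that, as far as I understand the failure mode, should reproduce the same error outside koboldcpp (assumptions: Windows only, process started with a console; FreeConsole is the Win32 call that invalidates the console handles):

```python
# Windows-only sketch: detach from the console, then attempt a flushed
# write to the now-stale stdout handle.
import ctypes

ctypes.windll.kernel32.FreeConsole()  # invalidates the console handles

try:
    print("hello", flush=True)  # expected to raise OSError: [WinError 6]
except OSError as err:
    # stdout is unusable, so record the error to a file instead
    with open("repro.log", "w") as logf:
        logf.write(repr(err))
```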