--cli mode causes koboldcpp to close instantly or experience an error once input is sent. #1482

@wildwolf256

Description

Describe the Issue
Using the --cli flag sometimes causes the error below, and other times the program just closes instantly. This is the error message I get if I wait long enough before sending anything:

.................................................................................................
Automatic RoPE Scaling: Using (scale:1.000, base:10000.0).
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 4216
llama_context: n_ctx_per_seq = 4216
llama_context: n_batch = 512
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 0
llama_context: freq_base = 10000.0
llama_context: freq_scale = 1
llama_context: n_ctx_per_seq (4216) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: CUDA_Host output buffer size = 0.12 MiB
llama_context: n_ctx = 4216
llama_context: n_ctx = 4224 (padded)
init: kv_size = 4224, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 32, can_shift = 1
init: CUDA0 KV buffer size = 528.00 MiB
llama_context: KV self size = 528.00 MiB, K (f16): 264.00 MiB, V (f16): 264.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
llama_context: max_nodes = 65536
llama_context: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 0
llama_context: reserving graph for n_tokens = 512, n_seqs = 1
llama_context: reserving graph for n_tokens = 1, n_seqs = 1
llama_context: reserving graph for n_tokens = 512, n_seqs = 1
llama_context: CUDA0 compute buffer size = 304.25 MiB
llama_context: CUDA_Host compute buffer size = 16.26 MiB
llama_context: graph nodes = 1094
llama_context: graph splits = 2
Load Text Model OK: True
Chat template heuristics failed to identify chat completions format. Alpaca will be used.
Embedded KoboldAI Lite loaded.
Embedded API docs loaded.

Active Modules: TextGeneration
Inactive Modules: ImageGeneration VoiceRecognition MultimodalVision NetworkMultiplayer ApiKeyPassword WebSearchProxy TextToSpeech VectorEmbeddings AdminControl
Enabled APIs: KoboldCppApi OpenAiApi OllamaApi

===
Now running KoboldCpp in Interactive Terminal Chat mode.
Type /quit or /exit to end session.

hello
Traceback (most recent call last):
File "koboldcpp.py", line 6543, in
main(launch_args=parser.parse_args(),default_args=parser.parse_args([]))
File "koboldcpp.py", line 5628, in main
kcpp_main_process(args,global_memory,using_gui_launcher)
File "koboldcpp.py", line 6311, in kcpp_main_process
print(result.strip() + "\n", flush=True)
OSError: [WinError 6] The handle is invalid
[8940] Failed to execute script 'koboldcpp' due to unhandled exception!
Traceback (most recent call last):
File "", line 1, in
OSError: [WinError 6] The handle is invalid
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
OSError: [WinError 6] The handle is invalid

Additional Information:
I am using an RTX 3060 8GB and an Intel Core i5-13600KF.
The issue has been happening ever since --cli was introduced one major version ago, in 1.87.*.
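For what it's worth, the failing call in the traceback is a plain print() with flush=True, and [WinError 6] on Windows generally means the console handle behind sys.stdout has become invalid (a common failure mode for PyInstaller console builds like the koboldcpp exe). Below is a minimal defensive sketch of how such a print could be guarded; this is an illustration, not koboldcpp's actual code, and safe_print is a hypothetical helper:

import sys

def safe_print(text: str) -> None:
    # Hypothetical guard, not koboldcpp's real code: on Windows, flushing
    # stdout raises OSError (WinError 6) once the console handle behind it
    # is invalid, so catch the error instead of crashing the chat session.
    try:
        print(text, flush=True)
    except OSError:
        try:
            sys.stdout.write(text + "\n")  # retry without the explicit flush
        except OSError:
            pass  # stdout is gone entirely; drop the output rather than die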
