I moved the wiki into the repository #1473
oobabooga announced in Announcements
Replies: 1 comment 1 reply
-
Hey, I launch text-generation-webui with the following parameters:

```
call python server.py --auto-devices --chat --character "Example" --wbits 4 --groupsize 128 --listen --no-stream --model ggml-vicuna-7b-1.1 --verbose
```

but it does not load the character. Here is an example of how it loads, followed by the first question I ask ("what is your name"):

```
Starting the web UI...
Gradio HTTP request redirected to localhost :)
C:\repos\oobabooga-windows-CPU\installer_files\env\lib\site-packages\bitsandbytes\cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
Loading ggml-vicuna-7b-1.1...
llama.cpp weights detected: models\ggml-vicuna-7b-1.1\ggml-vicuna-7b-1.0-uncensored-q4_0.bin
llama.cpp: loading model from models\ggml-vicuna-7b-1.1\ggml-vicuna-7b-1.0-uncensored-q4_0.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32001
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 4 (mostly Q4_1, some F16)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 59.11 KB
llama_model_load_internal: mem required = 5809.33 MB (+ 1026.00 MB per state)
llama_init_from_file: kv self size = 1024.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Loading the extension "gallery"... Ok.
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
A chat between a human and an assistant.
### Human: what is your name?
### Assistant:
--------------------
Output generated in 29.55 seconds (6.77 tokens/s, 200 tokens, context 27, seed 1437260233)
```

And when I try to load a character I get this:

What am I doing wrong?
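As background on what `--character "Example"` has to do at startup (not a confirmed answer from this thread): text-generation-webui resolves the character name against definition files in a `characters/` directory, so a missing or differently named file is one plausible failure mode. The directory layout and lookup logic below are assumptions for illustration, not the project's actual code:

```python
from pathlib import Path

def find_character(name: str, characters_dir: str = "characters"):
    """Sketch of a name-to-file character lookup.

    Assumed layout: one YAML/JSON definition per character inside
    characters/ (e.g. characters/Example.yaml). Returns the first
    matching path, or None if no definition exists for `name`.
    """
    for ext in (".yaml", ".yml", ".json"):
        candidate = Path(characters_dir) / f"{name}{ext}"
        if candidate.exists():
            return candidate
    return None  # no matching file -> the character cannot be loaded
```

Under this assumption, checking that `characters/Example.yaml` (or `.json`) actually exists, with exactly matching spelling, would be the first thing to verify.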
-
It's located here now and contributions are welcome: https://github.com/oobabooga/text-generation-webui/tree/main/docs