When using StatelessExecutor, llama_new_context logs to console every time InferAsync is called #363
The reason seems to be that a new context is created every time `InferAsync` is called.
Newbie question: is there a way to prevent LLamaSharp/llama.cpp from logging these values to the console? I have read the docs but I wasn't able to answer the question myself.
I believe you can do something like this:

```csharp
NativeApi.llama_log_set((level, message) =>
{
    // This will be called when llama.cpp wants to log a message. Do whatever you like!
    Console.WriteLine($"[{level}]: {message}");
});
```
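Following on from that snippet, an empty callback should silence the native output entirely, or you can filter by severity. A minimal sketch, assuming the same `NativeApi.llama_log_set` binding and an `LLamaLogLevel` enum whose member names and numbering (ggml uses lower values for more severe levels) may differ between LLamaSharp versions:

```csharp
using System;
using LLama.Native;

// Option 1: drop all native llama.cpp log output.
NativeApi.llama_log_set((level, message) => { });

// Option 2: keep only warnings and errors.
// Assumption: LLamaLogLevel follows ggml's numbering (Error = 2, Warning = 3,
// Info = 4), so "level <= Warning" means "at least warning severity".
NativeApi.llama_log_set((level, message) =>
{
    if (level <= LLamaLogLevel.Warning)
        Console.Error.Write(message);
});
```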
Thanks for that pointer!
What's the planned fix here? As far as I'm aware we want to create a new context every time, so that it's truly stateless. Or should we try to re-use the context, but clean up all the state between inferences?
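For the reuse option, one possible shape (a sketch of the idea, not LLamaSharp's actual implementation) is to create the context once and wipe its state between inferences. Upstream llama.cpp has `llama_kv_cache_clear` for this, but whether and how LLamaSharp exposes it depends on the version; `ClearKvCache` below is a hypothetical stand-in, and the constructor signature is illustrative:

```csharp
using System.Collections.Generic;
using LLama;
using LLama.Abstractions;

// Sketch: reuse one context across InferAsync calls instead of rebuilding it
// each time (which is what triggers the repeated llama_new_context logging).
public class ReusableStatelessExecutor
{
    private readonly LLamaContext _context;

    public ReusableStatelessExecutor(LLamaWeights model, IContextParams @params)
    {
        // Context is created once here, so the native logs appear once.
        _context = model.CreateContext(@params);
    }

    public async IAsyncEnumerable<string> InferAsync(string prompt)
    {
        // Hypothetical reset call standing in for llama_kv_cache_clear,
        // so each call still behaves statelessly.
        _context.NativeHandle.ClearKvCache();

        // ... run the usual sampling loop against the reused context ...
        yield break;
    }
}
```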
Hello, I just found this amazing library and am adapting it to my needs. I also have to create a new context every time and would like to remove the output logs. I would use something like the callback pointed out earlier in this thread. I use a llava model (vision), and I can still see the output every time I create a new context:

```
encode_image_with_clip: 4 segments encoded in 700.78 ms
encode_image_with_clip: image encoded in 729.47 ms by CLIP (0.32 ms per image patch)
```

I would love to switch it off somehow.
```
llama_new_context_with_model: n_ctx = 7168
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: offloading v cache to GPU
llama_kv_cache_init: offloading k cache to GPU
llama_kv_cache_init: VRAM kv self = 8680.00 MiB
llama_new_context_with_model: kv self size = 8680.00 MiB
llama_build_graph: non-view tensors processed: 1430/1430
llama_new_context_with_model: compute buffer total size = 78.56 MiB
llama_new_context_with_model: VRAM scratch buffer: 75.50 MiB
llama_new_context_with_model: total VRAM used: 22148.73 MiB (model: 13393.23 MiB, context: 8755.50 MiB)
```
With the other executors, these logs only appear when the model is loaded. With `StatelessExecutor` they are output every time `InferAsync` is called. It also seems to ignore the `ILogger` passed into the constructor: passing `NullLogger.Instance` has no effect on this behavior.
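One way the executor could honor the injected `ILogger` would be to bridge the native callback to it, so that `NullLogger.Instance` actually silences the output. A hedged sketch, assuming the `NativeApi.llama_log_set` binding mentioned earlier in the thread and an `LLamaLogLevel` enum; the mapping below is illustrative, not LLamaSharp's actual code:

```csharp
using LLama.Native;
using Microsoft.Extensions.Logging;

// Sketch: forward llama.cpp's native log messages to a
// Microsoft.Extensions.Logging ILogger. The LLamaLogLevel member names are
// assumptions based on earlier comments, not verified against a release.
public static class NativeLogBridge
{
    public static void RedirectTo(ILogger logger)
    {
        NativeApi.llama_log_set((level, message) =>
        {
            var mapped = level switch
            {
                LLamaLogLevel.Error   => LogLevel.Error,
                LLamaLogLevel.Warning => LogLevel.Warning,
                LLamaLogLevel.Info    => LogLevel.Information,
                _                     => LogLevel.Debug,
            };
            logger.Log(mapped, "{Message}", message.TrimEnd('\n'));
        });
    }
}

// Usage: NativeLogBridge.RedirectTo(NullLogger.Instance) drops everything,
// while a real logger gets the context-creation output at normal levels.
```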