
When using StatelessExecutor, llama_new_context logs to console every time InferAsync is called #363

Open
andymartin opened this issue Dec 13, 2023 · 6 comments
Labels
bug Something isn't working

Comments


andymartin commented Dec 13, 2023

llama_new_context_with_model: n_ctx = 7168
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: offloading v cache to GPU
llama_kv_cache_init: offloading k cache to GPU
llama_kv_cache_init: VRAM kv self = 8680.00 MiB
llama_new_context_with_model: kv self size = 8680.00 MiB
llama_build_graph: non-view tensors processed: 1430/1430
llama_new_context_with_model: compute buffer total size = 78.56 MiB
llama_new_context_with_model: VRAM scratch buffer: 75.50 MiB
llama_new_context_with_model: total VRAM used: 22148.73 MiB (model: 13393.23 MiB, context: 8755.50 MiB)

With the other executors, these logs only appear when the model is loaded. With StatelessExecutor they are output every time InferAsync is called. It seems to ignore the ILogger passed into the constructor, as passing NullLogger.Instance has no effect on this behavior.
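A minimal repro sketch (the model path is a placeholder and the exact constructor shapes may differ slightly between LLamaSharp versions):

using LLama;
using LLama.Common;
using Microsoft.Extensions.Logging.Abstractions;

var parameters = new ModelParams("path/to/model.gguf") { ContextSize = 7168 };
using var weights = LLamaWeights.LoadFromFile(parameters);

// NullLogger.Instance is passed in, yet the llama_new_context_with_model
// lines above are still printed to the console on every call below.
var executor = new StatelessExecutor(weights, parameters, NullLogger.Instance);

await foreach (var token in executor.InferAsync("First prompt"))
    Console.Write(token);   // a context is created (and logged) for this call

await foreach (var token in executor.InferAsync("Second prompt"))
    Console.Write(token);   // ...and again for this one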

martindevans added the bug label Dec 14, 2023
@AsakusaRinne
Collaborator

The reason seems to be that a new context is loaded every time InferAsync is called. Thank you for reporting this bug, we'll fix it soon.

@chatbuildcontact

Newbie question... Is there a way to prevent LLamaSharp/llama.cpp from logging these values to the console? I have read the docs but I wasn't able to answer the question myself.

@martindevans
Member

I believe you can do something like this:

NativeApi.llama_log_set((level, message) =>
{
    // This will be called when llama.cpp wants to log a message. Do whatever you like!
    Console.WriteLine($"[{level}]: {message}");
});
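
To silence the native output entirely, the same call can be given a callback that simply drops every message (a sketch, using the delegate shape from the snippet above):

NativeApi.llama_log_set((level, message) => { });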

@chatbuildcontact

I believe you can do something like this:

NativeApi.llama_log_set((level, message) =>
{
    // This will be called when llama.cpp wants to log a message. Do whatever you like!
    Console.WriteLine($"[{level}]: {message}");
});

Thanks for that pointer!

@martindevans
Member

we'll fix it soon.

What's the planned fix here? As far as I'm aware we want to create a new context every time, so that it's truly stateless. Or should we try to re-use the context, but clean up all the state between inferences?

@micoraweb

Hello, I just found this amazing library and am adapting it to my needs. I also have to create a new context every time and would like to remove the output logs.

I would use something like this, as pointed out earlier in this thread:
NativeLogConfig.llama_log_set(NullLogger.Instance);
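
In full that would be something like this (assuming NativeLogConfig is in LLama.Native, and that the call is made once, before any model or context is created):

using LLama.Native;
using Microsoft.Extensions.Logging.Abstractions;

// Route all llama.cpp native logging to a no-op logger.
NativeLogConfig.llama_log_set(NullLogger.Instance);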

I use a llava model (vision), and I can still see this output every time I create a new context:

encode_image_with_clip: 4 segments encoded in 700.78 ms
encode_image_with_clip: image embedding created: 2304 tokens

encode_image_with_clip: image encoded in 729.47 ms by CLIP ( 0.32 ms per image patch)
llava_eval_image_embed : failed to eval

I would love to switch it off somehow.
