[BUG]: KernelMemory.AskAsync() does not work - exception: object reference not set to an instance of an object #891

aropb · 2024-08-03T17:01:11Z

Description

I use KernelMemory. LogiBits is empty.

The error occurs at the time of the call:
memory.AskAsync()

Debug with clone classes: BaseSamplingPipeline, DefaultSamplingPipeline

Reproduction Steps

The error occurs at the time of the call:

MemoryAnswer answer = await memory.AskAsync(question: question, filters: filters);

Environment & Configuration

Operating system: Windows 10/11
.NET runtime version: 8.0.7
LLamaSharp version: 0.15.0
KernelMemory: 0.70.240803.1, 0.69.240727.1
CUDA version (if you are using cuda backend): -
CPU & GPU device: CPU Intel Core Ultra 9

Known Workarounds

jwangga · 2024-08-08T22:31:00Z

I encountered the same issue when running the sample code: "Kernel Memory: Document Q&A" or "Kernel Memory: Save and Load" from the LLama.Examples project.

tusharmevl · 2024-08-13T08:47:12Z

@aropb @jwangga I am facing the same issue running the example 'Kernel Memory: Document Q&A'. Did you find a fix for this? I am trying to implement a RAG system using this. Is there any other way to implement it apart from Kernel Memory?

jwangga · 2024-08-13T14:19:05Z

@tusharmevl I have not found a fix for the Kernel Memory issue. The integration with Semantic Kernel Memory seems to work, so you may try that as an alternative if your system only needs to support text.

tusharmevl · 2024-08-13T14:23:48Z

@jwangga Ok Thanks! Yes I need to support text only for now, will try that.

GalactixGod · 2024-08-14T11:52:27Z

Thanks @jwangga !!

I'm seeing that you can use Semantic Kernel Memory (SKM) as well.

Doesn't appear that you can "chat" with SKM to discuss results unfortunately. Have you been able to figure out a way to "ask" questions of SKM?

nicholusi2021 · 2024-08-15T17:14:46Z

I'm also having the same issue with the Kernel Memory: Document Q&A example.

aropb · 2024-08-21T07:55:08Z

Please, I really need this error fixed.

So far, I can only use these versions:
Microsoft.KernelMemory.Core = 0.62.240605.1
LLamaSharp = 0.13.0

Any newer versions do not work.

I think the mistake is here:

ISamplingPipelineExtensions.Sample()
...
var span = CollectionsMarshal.AsSpan(lastTokens);
--->!
return pipeline.Sample(ctx, logits, span);
...

aropb · 2024-09-02T09:40:57Z

I found the place where the error occurs.

llama_get_logits_ith unexpectedly returns null. From its documentation:
// returns NULL for invalid ids.

    // In SafeLLamaContextHandle:
    public Span<float> GetLogitsIth(int i)
    {
        var model = ThrowIfDisposed();

        unsafe
        {
            var logits = llama_get_logits_ith(this, i);
            return new Span<float>(logits, model.VocabCount);
        }
    }

Stack:

StatelessExecutor.InferAsync()
...
var id = pipeline.Sample(Context.NativeHandle, Context.NativeHandle.GetLogitsIth(_batch.TokenCount - 1), lastTokens);
...

But I don't understand what to do next or how to fix the error. Apparently, null shouldn't be returned there. Can anyone help with this error?
Because of it, Kernel Memory is impossible to use.

Thanks.

martindevans · 2024-09-02T12:40:28Z

That's probably indicative of two bugs in LLamaSharp.

Wrapper Error

The docs for llama_get_logits_ith say:

// Logits for the ith token. For positive indices, Equivalent to:
// llama_get_logits(ctx) + ctx->output_ids[i]*n_vocab
// Negative indicies can be used to access logits in reverse order, -1 is the last logit.
// returns NULL for invalid ids.
LLAMA_API float * llama_get_logits_ith(struct llama_context * ctx, int32_t i);

So it is valid for llama_get_logits_ith to return null! That means SafeLLamaContextHandle.GetLogitsIth is written incorrectly: it should check for null and raise some kind of error in that case (most likely by throwing an exception). It is never valid to pass a null pointer into a span constructor!

This is why you get a hard crash instead of an exception.
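
A minimal sketch of that kind of guard, assuming nothing about how LLamaSharp itself fixes it: LogitsGuard and CopyLogitsOrThrow are hypothetical names, and the native pointer is modelled as an IntPtr that is copied into a managed array so the example compiles without unsafe code (the real wrapper builds a Span<float> over the pointer instead).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only; not part of LLamaSharp.
public static class LogitsGuard
{
    // `logits` stands in for the pointer returned by llama_get_logits_ith,
    // `index` for the i that was passed to it.
    public static float[] CopyLogitsOrThrow(IntPtr logits, int vocabCount, int index)
    {
        // NULL means the native side rejected the request, so fail with a
        // managed exception instead of wrapping a null pointer.
        if (logits == IntPtr.Zero)
            throw new ArgumentOutOfRangeException(
                nameof(index), index,
                "llama_get_logits_ith returned NULL for this index.");

        var result = new float[vocabCount];
        Marshal.Copy(logits, result, 0, vocabCount);
        return result;
    }
}
```

With a check like this, the caller sees an exception that names the bad index rather than a crash somewhere inside the sampling pipeline.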

Higher Level Error

llama_get_logits_ith returns null if an invalid value for i is passed in. There must be a bug somewhere in the higher level that is causing an incorrect value to be passed in. Since this error only seems to affect Kernel Memory, it must be something specific to the KM wrapper.

aropb · 2024-09-02T13:39:11Z

@martindevans I have found a solution.

Set Embeddings = false in WithLLamaSharp:

...
public static IKernelMemoryBuilder WithLLamaSharp(this IKernelMemoryBuilder builder, LLamaSharpConfig config)
{
    ModelParams parameters = new(config.ModelPath)
    {
        Embeddings = false,
        ...

And set the values of UBatchSize and BatchSize in LLamaSharpTextEmbeddingGenerator:

...
public LLamaSharpTextEmbeddingGenerator(LLamaSharpConfig config, LLamaWeights weights)
{
    ModelParams @params = new(config.ModelPath)
    {
        Embeddings = true,
        ...
        UBatchSize = 2000,
        BatchSize = 2000
    };

aropb · 2024-09-02T14:06:56Z

While testing, I noticed that it has become about 2 times slower since 0.13.0. I wonder why?

martindevans · 2024-09-02T15:12:49Z

"Embeddings = false"

Aha, I think you've cracked it! A while ago the behaviour of the embeddings flag was changed, so logits can no longer be extracted if embeddings=true.

aropb · 2024-09-02T15:20:26Z

And in LLamaSharpTextEmbeddingGenerator you must specify the values of UBatchSize and BatchSize!

martindevans · 2024-09-02T15:26:06Z

I'm not sure about that - there should be sensible defaults for those values. In LLamaSharp they're set to default values here. It's possible KernelMemory is overriding those defaults with something incorrect though (I don't really know the KM stuff, so I can't be certain).

aropb · 2024-09-02T15:32:04Z

Without these values, you get the error "Input contains more tokens than configured batch size". That is, the value must be greater than 512. And currently the only way to set them is to rewrite the LLamaSharpTextEmbeddingGenerator class.
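
To make the failure mode concrete, here is a rough sketch of the kind of check that produces this error. BatchCheck and EnsureFitsInBatch are hypothetical names, not LLamaSharp API, and the real check inside the library may look different; only the error message is taken from the report above.

```csharp
using System;

// Hypothetical helper for illustration only; not part of LLamaSharp.
public static class BatchCheck
{
    // Throws when the tokenized input would not fit into a single batch.
    public static void EnsureFitsInBatch(int tokenCount, uint batchSize)
    {
        if (tokenCount > batchSize)
            throw new ArgumentException(
                $"Input contains more tokens than configured batch size ({tokenCount} > {batchSize})");
    }
}
```

With a batch size of 512, a 600-token chunk fails this check, while a batch size of 2000 lets it through.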

aropb · 2024-09-02T17:01:43Z

Apparently, UBatchSize and BatchSize need to be added to LLamaSharpConfig.

It seems that embeddings=false should always be set.

martindevans · 2024-09-04T01:19:05Z

I'm super busy this month, but I will try to make time to fix the issues you found that I summarised here when I get a chance (soon, hopefully. Definitely before the next release).

martindevans · 2024-09-04T01:40:15Z

#920 fixes the lowest-level wrapper error, so at least it throws an exception. Hopefully that will help debug the higher-level issue.

aropb · 2024-09-04T10:51:29Z

The problem has been found. You need to force embeddings=false.

martindevans · 2024-09-04T12:48:07Z

I wasn't sure if there's more going on, since you also mentioned a need to change the batch size. Is that just because of the size of your request (you need a larger batch to fit it all in), or is there more going on there?

aropb · 2024-09-04T12:50:50Z

Yes, the block size is larger than batchSize, but currently this value cannot be changed except by rewriting the LLamaSharpTextEmbeddingGenerator class.

eocron · 2024-09-13T07:06:38Z

Any update on this?

aropb · 2024-09-13T07:10:56Z

"Any update on this?"

There is a solution above: set Embeddings = false!

aropb mentioned this issue Aug 21, 2024

[BUG]: Null reference in AskAsync() #870

Open

aropb changed the title ~~[BUG]: Object reference not set to an instance of an object~~ [BUG]: KernelMemory.AskAsync() does not work - exception: Object reference not set to an instance of an object! Aug 23, 2024

aropb changed the title ~~[BUG]: KernelMemory.AskAsync() does not work - exception: Object reference not set to an instance of an object!~~ [BUG]: KernelMemory.AskAsync() does not work - exception: object reference not set to an instance of an object Aug 23, 2024

This was referenced Aug 28, 2024

LLamaSharp v0.15.0 broke cuda backend #909

Open

How do i use RAG by kernel memory and Semantic kernel Handlebar Planner with llama3 #899

Open
