[BUG]: storeResultVector in massive arrays #3234

Open
t2tx opened this issue Feb 16, 2025 · 3 comments · May be fixed by #3311
Labels
needs info / can't replicate: Issues that require additional information and/or cannot currently be replicated, but possible bug
possible bug: Bug was reported but is not confirmed or is unable to be replicated.

Comments

@t2tx
Contributor

t2tx commented Feb 16, 2025

How are you running AnythingLLM?

Docker (local)

What happened?

Adding the document failed with the message:

error: addDocumentToNamespace Invalid string length

Are there known steps to reproduce?

Try to add a large CSV file (42M in size) as a document.

Chunks created from document: 296607

Embedder: Ollama with bge-m3:latest
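For a sense of scale, the following is a rough back-of-the-envelope sketch of how long the serialized vector data might be. Only the chunk count comes from the report above. The 1024 dimensions per embedding (assumed for bge-m3) and the roughly 20 characters per serialized float are assumptions, so treat the result as an order-of-magnitude estimate. `buffer.constants.MAX_STRING_LENGTH` is Node's own limit on a single string; `JSON.stringify` throws `RangeError: Invalid string length` when its output would exceed it.

```javascript
const { constants } = require("buffer");

// Only `chunks` is taken from the report; the other two values are assumptions.
const chunks = 296607; // "Chunks created from document" in the report
const dims = 1024; // assumed embedding dimension for bge-m3
const charsPerFloat = 20; // assumed average length of one serialized float, comma included

// Approximate number of characters JSON.stringify would need for the vectors alone.
const estimatedChars = chunks * dims * charsPerFloat;

// Largest string this Node.js build can hold.
const limit = constants.MAX_STRING_LENGTH;
const exceedsLimit = estimatedChars > limit;

console.log({ estimatedChars, limit, exceedsLimit });
```

Under these assumptions the estimate is several billion characters, which would be well beyond what a single JavaScript string can hold regardless of free disk space or RAM.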

t2tx added the possible bug label Feb 16, 2025
@timothycarambat
Member

Invalid string length would be related to how many tokens your embedder model can support. What do you have set for the embedding model token length?

timothycarambat added the needs info / can't replicate label Feb 16, 2025
@t2tx
Contributor Author

t2tx commented Feb 17, 2025

The problem is here.

fs.writeFileSync(writeTo, JSON.stringify(vectorData), "utf8");

My array is too big. I will try to put together a workaround later (perhaps splitting the cache into multiple files if the vector data array is too long?).
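One way the multi-file idea could look is sketched below. This is only an illustration under assumptions: `writeVectorCacheInParts`, `readVectorCacheParts`, the `.partN.json` naming, and the `partSize` value are hypothetical and not part of AnythingLLM, and the linked pull request may take a different approach. The point is that each `JSON.stringify` call only serializes a slice of the array, so no single string has to hold the whole cache.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: write `vectorData` as several smaller JSON files so that
// no single JSON.stringify call has to produce one huge string.
function writeVectorCacheInParts(dir, baseName, vectorData, partSize = 10000) {
  const files = [];
  for (let i = 0; i < vectorData.length; i += partSize) {
    const part = vectorData.slice(i, i + partSize);
    const file = path.join(dir, `${baseName}.part${files.length}.json`);
    fs.writeFileSync(file, JSON.stringify(part), "utf8");
    files.push(file);
  }
  return files;
}

// Hypothetical counterpart: read the parts back in order and join them.
// flatMap flattens only one level, so each stored item keeps its shape.
function readVectorCacheParts(files) {
  return files.flatMap((file) => JSON.parse(fs.readFileSync(file, "utf8")));
}

// Small demo in a temporary directory with made-up data.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "vector-cache-"));
const sample = Array.from({ length: 25 }, (_, i) => ({ id: i, values: [i, i + 0.5] }));
const files = writeVectorCacheInParts(dir, "demo", sample, 10);
const restored = readVectorCacheParts(files);
console.log(files.length, restored.length);
```

A suitable `partSize` would depend on the embedding dimension and on how much metadata each entry carries, so it would need tuning against real data.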

@timothycarambat
Member

Since that is simply an I/O function, are you limited on disk space? That function should work fine as long as your system can handle it memory-wise.

@timothycarambat timothycarambat changed the title [BUG]: can't handle huge array in storeVectorResult [BUG]: storeResultVector in massive arrays Feb 17, 2025
@t2tx t2tx linked a pull request Feb 21, 2025 that will close this issue