
[BUG] - net::ERR_INSUFFICIENT_RESOURCES Upon Uploading Large Corpus #176

Closed
peachkeel opened this issue Nov 8, 2023 · 8 comments · Fixed by #244
Labels
type: bug 🐛 Something isn't working

Comments

@peachkeel
Contributor

peachkeel commented Nov 8, 2023

Bug description

Ragna UI hangs while attempting to upload 2261 documents via the file chooser.

How to reproduce the bug?

  1. Prepare a corpus of more than 1349 files.
  2. Start Ragna UI.
  3. Use browser to navigate to UI.
  4. Start a conversation by uploading the corpus via the file chooser.
  5. Wait for browser to hang.
  6. Inspect JavaScript on the browser side.
  7. Look for "net::ERR_INSUFFICIENT_RESOURCES" message(s).

Versions and dependencies used

  • Ragna v0.1.1
  • Chromium v117.0.5938.132 (Official Build) Linux

Anything else?

It seems like this might be a known bug in Chrome/Chromium that has to do with making too many requests at once.

See also:
https://scratch.mit.edu/discuss/m/topic/88418/

@peachkeel peachkeel added the type: bug 🐛 Something isn't working label Nov 8, 2023
@pmeier
Member

pmeier commented Nov 8, 2023

We are currently uploading with

function upload(files, token, informationsEndpoint, final_callback) {
  Promise.all(
    Array.from(files).map((file) => {
      return uploadFile(file, token, informationsEndpoint);
    }),
  ).then(final_callback);
}

My JS knowledge is not what you would call advanced. From a quick Google search, there seems to be no way to limit the number of concurrent requests when using Promise.all. Happy for someone to prove me wrong.
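For what it's worth, in-flight requests can be capped without `Promise.all` batching by keeping a fixed-size pool of pending uploads and refilling it as each one settles. The sketch below is illustrative only, not what Ragna ships; it assumes `uploadFile(file, token, endpoint)` returns a Promise as in the snippet above:

```javascript
// Sketch: keep at most `limit` uploads in flight at any time, starting a new
// one each time a previous one finishes. Assumes uploadFile(file, token,
// endpoint) returns a Promise, as in the snippet above.
function uploadWithLimit(files, token, informationsEndpoint, final_callback, limit = 100) {
  const queue = Array.from(files);
  const results = [];
  let active = 0;

  return new Promise((resolve) => {
    function next() {
      // Done when nothing is queued and nothing is in flight.
      if (queue.length === 0 && active === 0) {
        resolve(results);
        return;
      }
      // Refill the pool up to the concurrency limit.
      while (active < limit && queue.length > 0) {
        const file = queue.shift();
        active += 1;
        uploadFile(file, token, informationsEndpoint).then((result) => {
          results.push(result);
          active -= 1;
          next();
        });
      }
    }
    next();
  }).then(final_callback);
}
```

Note that `results` is filled in completion order, not input order, which matters only if the caller relies on ordering.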

That being said, uploading 2k+ documents seems like an odd use case to begin with. Could you elaborate here to help me understand?

A workaround is always to perform the upload manually by hitting the API directly and just use the UI to ask questions.

@peachkeel
Contributor Author

LoL, I thought 2k+ documents was actually kind of small. It's like 1/9th of what I'm dealing with right now. I'm just trying to get Ragna deployed so my bosses can try out different vector stores and LLMs in a user-friendly manner.

Anyway, my JS knowledge is also quite weak; however, I remembered a little bit of recursion from CS and, with the help of GPT-4, I was able to write some code that fixes this bug. I've tested it and have gotten 2252 files into Ragna successfully via the file chooser on Chromium.

The replacement to upload() is as follows:

function upload(files, token, informationsEndpoint, final_callback) {
  const batchSize = 500; // Maximum number of concurrent uploads
  let currentIndex = 0; // Tracks the index of the current file to be uploaded
  let successfullyUploaded = []; // Array to hold the results

  // Function to upload a single batch of files
  function uploadBatch() {
    // Get the next batch of files based on the current index and batch size
    const batch = Array.from(files).slice(currentIndex, currentIndex + batchSize);
    currentIndex += batchSize;
    // Map over the batch to create an array of upload promises
    return Promise.all(
      batch.map(file => uploadFile(file, token, informationsEndpoint))
    ).then(results => {
      // Concatenate the results of the current batch to the main array
      successfullyUploaded = successfullyUploaded.concat(results);
      // If there are more files to upload, recursively call uploadBatch again
      if (currentIndex < files.length) {
        return uploadBatch();
      } else {
        // If all files are uploaded, invoke the final callback with the results
        final_callback(successfullyUploaded);
      }
    });
  }

  // Start uploading the first batch
  uploadBatch().catch(error => {
    console.error("An error occurred during the upload process", error);
    // You might want to call your final callback with an error or partial results
    final_callback(successfullyUploaded);
  });
}

@pmeier
Member

pmeier commented Nov 9, 2023

LoL, I thought 2k+ documents was actually kind of small. It's like 1/9th of what I'm dealing with right now.

Indeed 2k+ documents for RAG is not an issue. It is an issue for the current default use case for Ragna however.

Our main focus for the first release was to enable easy experimentation: you can set parameters like the chunk size and chunk overlap for each chat and compare the results. This requires us to treat everything on a per-chat basis, i.e. for each chat you have, you'll need to upload the documents again and they will be re-embedded. This is not feasible for 2k+ documents.

However, your use case seems to be: "I have a large corpus of documents. I want to upload them once and later select a few, or potentially all, of them for questioning." Is that correct? If so, you could trade in some of the flexibility, i.e. the ability to configure things on a per-chat basis, to make it work. Still, we need to implement a few things to have a good UX for this:

  • Attach an ID to each document that we have
  • Write a new source storage that only has a single collection / table / ....
  • Write a script to store all documents in the new source storage
  • Make the .store() method a no-op
  • Let the .retrieve() method filter on the document ID metadata rather than using all documents in the collection.
  • Swap out the file uploader in the UI for a component that lets users select documents, then fetches the IDs of the selected documents and starts the chat right away, skipping the upload.
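The retrieval-side filtering in that list could look roughly like this. Everything below is a sketch: `collection`, its `.query()` signature, and the `where`-filter syntax are placeholders, not Ragna's actual interfaces.

```python
# Hypothetical sketch of a .retrieve() that filters on document-ID metadata.
# `collection`, its .query() signature, and the where-filter syntax are
# placeholders, not Ragna's real API.
def retrieve(collection, prompt_embedding, document_ids, num_chunks=8):
    """Return the chunks closest to the prompt, restricted to the
    selected documents instead of the whole collection."""
    return collection.query(
        embedding=prompt_embedding,
        where={"document_id": {"$in": list(document_ids)}},
        limit=num_chunks,
    )
```

The point of the design is that `.store()` becomes a no-op because the corpus was embedded once up front, and per-chat selection happens entirely at query time via the metadata filter.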

That is not an easy use case to support, but certainly a valid one. In fact, we already had an offline request for this scenario.

@peachkeel
Contributor Author

I think you've summarized things well. The main use-case is to compare various LLM/Vector combinations, where Vector is kind of both the storage mechanism and the embedding (Aside: I think they're more concerned about embedding performance than storage performance). So, they want to experiment as follows:

  • LLM x1 vs. Vector y1
  • LLM x1 vs. Vector y2
  • LLM x1 vs. Vector y3
  • LLM x2 vs. Vector y1

To mimic this setup, I was just going to create X*Y independent conversations by re-uploading the data each time and letting them try out the various combinations. Obviously, being able to upload a corpus once and then instantiate conversations off of it would be ideal, but I don't think I have the time or inclination to customize Ragna that much right now.

@pmeier
Member

pmeier commented Nov 10, 2023

What I would do for your use case right now is to hit the API directly to upload and embed the documents, and afterwards hand the UI over to your bosses. We have the add_chats.py script that we have been using while building the UI. With that, you only need to upload the documents once, although you still have to embed them for every chat. But at least you could do that in parallel.

In case you are using a non-memory queue, I would also start the worker (and in turn the API) manually. This gives you control over the number of worker threads. So basically you want ragna worker --num-threads 4 && ragna api && ragna ui. When you just use ragna ui, by default it starts with only one worker thread.

@pmeier
Member

pmeier commented Nov 10, 2023

(Aside: I think they're more concerned about embedding performance than storage performance)

You might want to chime in on #191 in that case.

@peachkeel
Contributor Author

Hey, @pmeier, I'll chime in on #191 in a bit.

FYI, ragna worker --num-threads 4 && ragna api && ragna ui blocks, as && waits for each process to finish. I tried using single ampersands (i.e., ragna worker --num-threads 4 & ragna api & ragna ui) to background things, but the different processes seem to step on each other (e.g., a total of 5 threads started on the queue across two different processes, contention for ports, etc.) and I got scared.

@pmeier
Member

pmeier commented Nov 14, 2023

Sorry, yeah, it needs to be a single &. It is only the UI that needs the API to be available on start. Meaning, you can potentially just run the commands one after the other and wait a little in between. This is exactly what we are doing as well, if you don't do it yourself:

ragna/ragna/_cli/core.py

Lines 167 to 182 in 4962665

if start_api is None:
    start_api = not check_api_available()
if start_api:
    process = subprocess.Popen(
        [
            sys.executable,
            "-m",
            "ragna",
            "api",
            "--config",
            config.__ragna_cli_config_path__,  # type: ignore[attr-defined]
        ],
        stdout=sys.stdout,
        stderr=sys.stderr,
    )

ragna/ragna/_cli/core.py

Lines 122 to 134 in 4962665

if start_worker:
    process = subprocess.Popen(
        [
            sys.executable,
            "-m",
            "ragna",
            "worker",
            "--config",
            config.__ragna_cli_config_path__,  # type: ignore[attr-defined]
        ],
        stdout=sys.stdout,
        stderr=sys.stderr,
    )

peachkeel added a commit to peachkeel/ragna that referenced this issue Dec 11, 2023
This was referenced Dec 11, 2023
pmeier added a commit that referenced this issue Dec 15, 2023
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
pmeier added a commit that referenced this issue Dec 20, 2023
Co-authored-by: Philip Meier <github.pmeier@posteo.de>