prompts get too large #16
Comments
I closed issue #49 since this issue is more specific; we can use this one to discuss how to further implement summarization 🙌. About your comment in PR #52: I can test other chain_type values to see if I get the same problem with large files. I'm going to dig a little deeper into the docs you left (about llama-index) to better understand how to implement the custom summary chain, but if I understand correctly the basic idea is:
@calebgcc I introduced in the rabbit_hole a TextSplitter that can be customized (chunk_size and chunk_overlap), so Cat users can decide themselves how long they want their text chunks. You find in
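To make the chunk_size / chunk_overlap idea concrete, here is a minimal sliding-window splitter sketch. This is an illustration of the technique, not the project's actual TextSplitter (which comes from langchain); the function name and defaults are hypothetical.

```python
def split_text(text: str, chunk_size: int = 400, chunk_overlap: int = 100) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    where consecutive chunks share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary visible in both neighbors, which helps retrieval later.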
@pieroit I was trying to trigger this error, but I think summarizing and chunking the documents solved it. The documents retrieved from the Cat are often too small to cause problems, and on top of that k defaults to 5. Maybe prompt summarization is no longer necessary; let me know how to proceed. For example, we could try increasing the value of k to see how it affects the prompt.
Increasing k is a good test, but even so, if somebody uploads a doc and chooses a large chunk size the problem remains. There should be a check before inserting memories into the prompt: if they are "too long", they should be summarized. We can postpone the problem and close this issue since we are mostly covered, or, if you feel like it, also tackle the above. Thanks 🙏
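The check described above could look roughly like this. It is only a sketch: the threshold, the function names, and the placeholder summarizer are all assumptions, and in practice the summarization step would be an LLM chain call rather than truncation.

```python
MAX_MEMORY_CHARS = 1000  # hypothetical threshold for "too long"

def summarize(text: str) -> str:
    # Placeholder: a real implementation would invoke an LLM summarization
    # chain here. We just truncate to stand in for a shorter summary.
    return text[:200] + "..."

def prepare_memories(memories: list[str], max_chars: int = MAX_MEMORY_CHARS) -> list[str]:
    """Summarize any retrieved memory that exceeds max_chars
    before it is inserted into the prompt."""
    return [m if len(m) <= max_chars else summarize(m) for m in memories]
```

This keeps the prompt bounded even when a user uploads documents with a large chunk size, since oversized memories are compressed on the way in.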
Use langchain routines more deeply to keep the prompt at a limited length (CombineDocumentsChain etc.).
Summarization may also be appropriate when documents are uploaded.
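For reference, the combine-documents idea langchain implements (in its map_reduce variant) can be sketched in plain Python. The naive_summarize stub stands in for an LLM call; names and limits here are illustrative, not langchain's API.

```python
def naive_summarize(text: str, target: int = 200) -> str:
    """Stand-in for an LLM summarization call: keep a prefix of the text."""
    return text[:target]

def map_reduce_summarize(docs: list[str], summarize=naive_summarize,
                         max_chars: int = 800) -> str:
    """Map step: summarize each document independently.
    Reduce step: collapse the joined partial summaries until they fit."""
    partials = [summarize(d) for d in docs]
    combined = "\n".join(partials)
    while len(combined) > max_chars:
        combined = summarize(combined)
    return combined
```

The map step bounds per-document prompt size, and the reduce step guarantees the final text fits the length budget regardless of how many documents were uploaded.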