Add support for summarization #50

Open
taranjeet opened this issue Jun 24, 2023 · 3 comments
Labels
easy Easy difficulty enhancement New feature or request size:M This PR changes 30-99 lines, ignoring generated files.

Comments

@taranjeet
Member

  • Since the data is split into chunks, is it possible to implement a summarize function?
  • Opened on behalf of Discord user edo (message link)
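
One way a summarize function could work over already-chunked data is a map-reduce pass: summarize each chunk, then merge the partial summaries until a single one remains. The sketch below is only an illustration under stated assumptions, not embedchain's API. `summarize_chunks`, `llm`, and `batch_size` are hypothetical names, and `llm` stands for any callable that takes a prompt string and returns text.

```python
from typing import Callable, List


def summarize_chunks(
    chunks: List[str],
    llm: Callable[[str], str],  # hypothetical: any prompt -> text callable
    batch_size: int = 4,
) -> str:
    """Map-reduce summary over text chunks (illustrative sketch)."""
    if batch_size < 2:
        # A batch of one would never shrink the list of summaries.
        raise ValueError("batch_size must be at least 2")
    if not chunks:
        return ""

    # Map step: summarize every chunk on its own.
    summaries = [llm("Summarize the following text:\n\n" + chunk) for chunk in chunks]

    # Reduce step: merge summaries in batches until one remains.
    while len(summaries) > 1:
        merged = []
        for i in range(0, len(summaries), batch_size):
            group = summaries[i:i + batch_size]
            if len(group) == 1:
                # A leftover single summary is carried over unchanged.
                merged.append(group[0])
            else:
                prompt = "Combine these summaries into one:\n\n" + "\n\n".join(group)
                merged.append(llm(prompt))
        summaries = merged
    return summaries[0]
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client; the prompt wording is a placeholder and would need tuning in practice.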
@edops1

edops1 commented Jun 26, 2023

Hi!
I'll try to explain the context where I would use the tool.
For work I write a lot of interviews (for TV or podcasts). I would like to upload a variety of documents about a character (interviews from YouTube, books, articles). Then I would ask for these things:

  • the summary of all the documents together
  • the summary of a single document (specifying a compression ratio relative to the length of the original document). Sometimes I would like a summary of a few sentences, other times I would like a 50% reduction so as not to miss important passages

With a set of summaries, I have a clear view of the big picture and can start asking specific questions to sharpen certain passages in that character's life.
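
The compression ratio described above could be turned into a target length that is passed to the model in the prompt. This is a rough sketch under assumptions: `target_word_count` and `build_summary_prompt` are hypothetical helpers, `ratio` is taken to mean the fraction of the original length to keep (so 0.5 corresponds to a 50% reduction), and length is measured in whitespace-separated words.

```python
def target_word_count(text: str, ratio: float) -> int:
    """Return the summary length in words for a given keep-ratio (assumed 0 < ratio <= 1)."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in the interval (0, 1]")
    original_words = len(text.split())
    # Never ask for fewer than one word, even for very short inputs.
    return max(1, int(original_words * ratio))


def build_summary_prompt(text: str, ratio: float) -> str:
    """Build a prompt asking for a summary of roughly the target length."""
    words = target_word_count(text, ratio)
    return (
        f"Summarize the following text in about {words} words, "
        "keeping the most important passages:\n\n" + text
    )
```

A model will only approximately respect the requested length, so the ratio should be read as a hint and not as a guarantee.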

@edops1

edops1 commented Jun 28, 2023

Hi, I was reading the LangChain documentation and found this document:

Vector store-augmented text generation
https://python.langchain.com/docs/modules/chains/additional/vector_db_text_generation

Is this something that can also be implemented in embedchain?

Since documents (PDFs, web pages, etc.) are already divided into chunks and added to the vector database for Q&A, would it be simple to generate new text from them? @taranjeet

Thanks!
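
The idea of generating new text from chunks already stored for Q&A can be sketched as two steps: retrieve the chunks most similar to a topic, then place them in a generation prompt. The code below is a library-agnostic illustration, not the LangChain or embedchain implementation. `top_k_chunks` and `build_generation_prompt` are hypothetical names, the store is assumed to be a plain list of `(text, embedding)` pairs, and the embeddings are assumed to come from whatever model populated the vector database.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(
    query_vec: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],  # assumed (text, embedding) pairs
    k: int = 2,
) -> List[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_generation_prompt(topic: str, context_chunks: List[str]) -> str:
    """Assemble a prompt that asks the model to write from the retrieved context."""
    context = "\n---\n".join(context_chunks)
    return (
        f"Using only the context below, write a short text about {topic}.\n\n"
        f"Context:\n{context}"
    )
```

The resulting prompt would then be sent to the same LLM already used for Q&A; a real vector database would replace the linear scan in `top_k_chunks` with its own similarity search.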

@sv-ochis

sv-ochis commented Jan 5, 2024

Any updates on this?

@Dev-Khant Dev-Khant added easy Easy difficulty size:M This PR changes 30-99 lines, ignoring generated files. enhancement New feature or request labels Jun 2, 2024