[Roadmap] RAG #1657

Open
30 of 39 tasks
thinkall opened this issue Feb 13, 2024 · 33 comments
Assignees
Labels: 0.2 (issues filed before the re-arch to 0.4), rag (retrieve-augmented generative agents), roadmap (issues related to the roadmap of AutoGen)

Comments

@thinkall
Collaborator

thinkall commented Feb 13, 2024

Why RAG

Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of LLMs by incorporating a retrieval mechanism into the generative process. This approach lets the model draw on relevant information from a pre-existing knowledge base, which can significantly improve the quality and accuracy of its generated responses. For agent chat, adding a RAG agent can therefore improve the performance and utility of the agent system.

RAG in AutoGen

AutoGen introduced RetrieveUserProxyAgent and RetrieveAssistantAgent for RetrieveChat in August 2023 and announced them in a blog post in October 2023. Given a set of documents, the Retrieval-augmented User Proxy first processes the documents automatically: it splits them into chunks and stores the chunks in a vector database. Then, for a given user input, it retrieves relevant chunks as context and sends them to the Retrieval-augmented Assistant, which uses an LLM to generate code or text to answer the question. The agents converse until they find a satisfactory answer.
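
A minimal sketch of the workflow described above (chunk, store, retrieve, build the prompt), for illustration only. It is not AutoGen's implementation: `InMemoryStore`, `embed`, `split_into_chunks`, and `build_prompt` are hypothetical names, and a bag-of-words count stands in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, max_words: int = 40) -> list[str]:
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Hypothetical stand-in for the vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        # Store each chunk together with its embedding.
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def query(self, question: str, n_results: int = 2) -> list[str]:
        # Rank stored chunks by similarity to the question and keep the best ones.
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:n_results]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved chunks and the user question into one message for the assistant."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

In this sketch the user-proxy side would call `store.add(split_into_chunks(document))` once per document, then `build_prompt(question, store.query(question))` for each user input, and hand the result to the assistant.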

[Figure: retrievechat-arch-old]

As both AutoGen and RAG are evolving very fast, many users are asking for support for customized vector databases, incremental document ingestion, customized retrieval/re-ranking algorithms, customized RAG patterns/workflows, etc. We've addressed some of the issues and feature requests: for example, we've added QdrantRetrieveUserProxyAgent for using Qdrant as the vector db, and we've integrated UNSTRUCTURED to support many unstructured documents. However, there is much more to do.

Our Plan

To better support RAG in AutoGen, we plan to refactor the existing RetrieveChat agents. The goals include:

Primary goals

  • Support launching RAG with one agent instead of two
  • Support customizing vector databases with a parameter instead of extending agent class
  • Support RAG in AutoGen Studio
  • Support leveraging 3rd-party OSS tools
  • Make RAG a capability for any conversable agent
  • Support RAG as a tool like in OpenAI Assistant
  • Make vector db dependency optional
  • Make the chat interface of the RAG agent the same as that of any other conversable agent
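
To make two of these goals concrete (choosing the vector database with a parameter, and attaching RAG as a capability rather than a dedicated agent class), here is a rough sketch. All names (`VectorDB`, `KeywordDB`, `RagCapability`, `augment`) are hypothetical and are not part of AutoGen; the toy backend ranks by shared words so the example has no external dependency.

```python
from typing import Protocol


class VectorDB(Protocol):
    """Hypothetical minimal interface a pluggable vector database would satisfy."""

    def add(self, docs: list[str]) -> None: ...

    def search(self, query: str, k: int) -> list[str]: ...


class KeywordDB:
    """Toy backend (assumption): ranks documents by the number of shared lowercase words."""

    def __init__(self) -> None:
        self._docs: list[str] = []

    def add(self, docs: list[str]) -> None:
        self._docs.extend(docs)

    def search(self, query: str, k: int) -> list[str]:
        terms = set(query.lower().split())
        ranked = sorted(self._docs, key=lambda d: len(terms & set(d.lower().split())), reverse=True)
        return ranked[:k]


class RagCapability:
    """Hypothetical capability: the backend is passed in as a parameter, not subclassed."""

    def __init__(self, vector_db: VectorDB, k: int = 1) -> None:
        self.vector_db = vector_db
        self.k = k

    def augment(self, message: str) -> str:
        # Prepend retrieved context to the message; leave it untouched if nothing is found.
        context = self.vector_db.search(message, self.k)
        if not context:
            return message
        return "Context:\n" + "\n".join(context) + "\n\nUser: " + message
```

Swapping the backend then means passing a different object that offers the same two methods, for example `RagCapability(vector_db=SomeOtherDB())`, without touching the agent class.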

Optional goals

  • Support async functions
  • Support benchmarks
  • Support evaluation

Tasks

  1. bug rag
  2. feature rag
  3. bug group chat/teams tool-usage
  4. rag tool-usage
  5. 4 of 4 · rag · thinkall
  6. 0.2 feature · thinkall
  7. rag
  8. bug rag
  9. 0.2 rag · thinkall
  10. 0.2 rag · thinkall
  11. 0.2 rag · thinkall
  12. 0.2 proj-studio rag · Knucklessg1 victordibia
  13. feature rag
  14. rag
  15. rag · thinkall
  16. proj-studio rag vectordb · Knucklessg1
  17. rag
  18. rag vectordb · Knucklessg1
  19. rag vectordb · Knucklessg1
  20. bug rag vectordb · Knucklessg1
  21. rag
  22. rag
  23. 0.2 needs-triage
  24. rag
  25. rag
  26. rag
  27. rag · thinkall
  28. 3 of 4 · roadmap
@Knucklessg1
Contributor

Hello, I was wondering if there was support coming for pgvector, if not I would be happy to contribute.

@julianakiseleva
Contributor

@thinkall

@thinkall
Collaborator Author

thinkall commented Mar 6, 2024

Hello, I was wondering if there was support coming for pgvector, if not I would be happy to contribute.

Hi @Knucklessg1 , contribution is welcome, thank you for your interest!

@WaelKarkoub
Contributor

Hi @thinkall, would this PR, #2046, help out with "Automatically decide whether RAG is needed"?

I was thinking if the agent adds a tag like <rag context="some context"> in the message, we can intercept that by one of the hooks or even a reply, and then perform some rag operations

@thinkall
Collaborator Author

Hi @thinkall, would this PR, #2046, help out with Automatically decide whether RAG is needed?

I was thinking if the agent adds a tag like <rag context="some context"> in the message, we can intercept that by one of the hooks or even a reply, and then perform some rag operations

Thank you @WaelKarkoub, interesting idea! Would adding the tag mean RAG is already performed?

@WaelKarkoub
Contributor

WaelKarkoub commented Mar 20, 2024

@thinkall we could define what that tag means by adding attributes (e.g. <rag context="some context" task="search"> could mean it needs to look through some databases). I'm not fully familiar with how RAG works, but that tag system should be general enough for multiple use cases.
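
A rough sketch of how such a tag could be intercepted, assuming the tag format from the example above. The function names, the `handlers` mapping, and the choice to default a missing `task` attribute to "search" are all assumptions for illustration; this does not use AutoGen's hook API.

```python
import re

# Matches an opening <rag ...> tag and captures its attribute string.
TAG_RE = re.compile(r"<rag\b([^>]*)>")
# Matches name="value" attribute pairs inside the tag.
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def extract_rag_requests(message: str) -> list[dict[str, str]]:
    """Return one attribute dict per <rag ...> tag found in the message."""
    return [dict(ATTR_RE.findall(attrs)) for attrs in TAG_RE.findall(message)]


def intercept(message: str, handlers: dict) -> list[str]:
    """Dispatch each tag to the handler registered for its task attribute."""
    results = []
    for request in extract_rag_requests(message):
        # Assumption: a tag without a task attribute is treated as a search request.
        handler = handlers.get(request.get("task", "search"))
        if handler is not None:
            results.append(handler(request.get("context", "")))
    return results
```

A hook or reply function could call `intercept(message, {"search": my_search})` on each outgoing message and feed the results back into the conversation; `my_search` is a placeholder for whatever retrieval function is available.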

@ChristianWeyer

Great initiative @thinkall.
How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

@thinkall
Collaborator Author

Great initiative @thinkall. How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible. So we're carefully thinking of the new design of the rag feature in AutoGen. Would you like to share your thoughts?

@ChristianWeyer

Great initiative @thinkall. How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible. So we're carefully thinking of the new design of the rag feature in AutoGen. Would you like to share your thoughts?

One thing is that there are so many connector & retriever implementations in LangChain that it would not make sense to reinvent the wheel and try to keep up. Same goes for embedding support.

@thinkall
Collaborator Author

Great initiative @thinkall. How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible. So we're carefully thinking of the new design of the rag feature in AutoGen. Would you like to share your thoughts?

One thing is that there are so many connector & retriever implementations in LangChain, that it would not make sense to reinvent the wheel and trying to keep up. Same goes for embedding support.

Agree!

Would you like to have a quick chat on this? It would be great to hear more from you!

@dsalas-crogl

@thinkall Will the upcoming RAG update still require using message_generator in groupchat scenarios? It's my understanding that currently, the RAG agent has to initiate chat and message_generator has to be used, which results in all initial prompt messages being sent through retrieve_docs in RetrieveUserProxyAgent.

@ChristianWeyer

Great initiative @thinkall. How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?

Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible. So we're carefully thinking of the new design of the rag feature in AutoGen. Would you like to share your thoughts?

One thing is that there are so many connector & retriever implementations in LangChain, that it would not make sense to reinvent the wheel and trying to keep up. Same goes for embedding support.

Agree!

Would you like to have a quick chat on this? It would be great to hear more from you!

Sure. I am cethewe in AG Discord.

@thinkall
Collaborator Author

Hello, I was wondering if there was support coming for pgvector, if not I would be happy to contribute.

Hi @Knucklessg1 , contribution is welcome, thank you for your interest!

Hi @Knucklessg1 , are you in our Discord channel? Could we have a quick chat? Thanks.

@thinkall
Collaborator Author

thinkall commented Mar 30, 2024

@thinkall Will the upcoming RAG update still require using message_generator in groupchat scenarios? It's my understanding that currently, the RAG agent has to initiate chat and message_generator has to be used, which results in all initial prompt messages being sent through retrieve_docs in RetrieveUserProxyAgent.

Hi @dsalas-crogl , I'd like to remove the usage of message_generator, would that benefit your use case? Thanks.

Are you in our Discord channel?

@Knucklessg1
Contributor

Hello, I was wondering if there was support coming for pgvector, if not I would be happy to contribute.

Hi @Knucklessg1 , contribution is welcome, thank you for your interest!

Hi @Knucklessg1 , are you in our Discord channel? Could we have a quick chat? Thanks.

Yes absolutely. I reached out on Discord.

@jamesliu
Contributor

@thinkall any flow diagram regarding the rag?

@thinkall
Collaborator Author

@thinkall any flow diagram regarding the rag?

Hi @jamesliu , there's one diagram here, you can find the workflow details in the Introduction section.

thinkall mentioned this issue on Apr 3, 2024 (3 tasks)
@Josephrp

Josephrp commented Apr 3, 2024

Interesting roadmap, and I'm very happy with chromadb; looking forward to an in-memory vector store too now. If anyone is interested, it could be a good opportunity to collaborate and break down complex tasks.

I'll also consider creating and sharing an "advanced upsert" agent, which enriches the text chunks to improve retrieval performance.

@raolak

raolak commented Apr 8, 2024

Are RAG applications limited to document processing, or do they extend to code-related tasks as well? For instance:

  • Existing code on which we need to run fix? or
  • New codes (eg. new service) which goes through incremental development

Usecase:

  • In a system comprised of numerous microservices that are regularly updated with new features or fixes, how can an agent acquire knowledge about these changes? Is RAG a viable method for this?
  • For the development of new microservices within an existing system, how can knowledge be transferred to agents to enhance the design and implementation process using RAG?

@thinkall
Collaborator Author

thinkall commented Apr 9, 2024

Are RAG applications limited to document processing, or do they extend to code-related tasks as well? For instance:

  • Existing code on which we need to run fix? or
  • New codes (eg. new service) which goes through incremental development

Usecase:

  • In a system comprised of numerous microservices that are regularly updated with new features or fixes, how can an agent acquire knowledge about these changes? Is RAG a viable method for this?
  • For the development of new microservices within an existing system, how can knowledge be transferred to agents to enhance the design and implementation process using RAG?

RAG can help if the documents are well organized. For pure code, currently, you can use 3rd party code chunk methods to help load the code into the vector dbs.

@ekzhu
Collaborator

ekzhu commented Apr 12, 2024

Let's also add a documentation task to the roadmap. We should have a RAG category under https://microsoft.github.io/autogen/docs/topics

@ChristianWeyer

ChristianWeyer commented Apr 12, 2024

Do we also have a task on the roadmap for using custom embeddings? This is a very vital and important requirement for successful RAG.
Another would be to use a re-ranking model optionally to improve RAG results. @thinkall

@thinkall
Collaborator Author

Do we also have a task on the roadmap for using custom embeddings? This is a very vital and important requirement for successful RAG. Another would be to use a re-ranking model optionally to improve RAG results. @thinkall

Custom embeddings are already supported and will also be supported in the new version.

Re-ranking may also be supported, but we may not implement the algorithms ourselves; instead, we could support plugging in different re-ranking models.
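
As a sketch of what "plugging in" a re-ranker could look like: retrieval accepts an optional callable that reorders the candidate documents. The `Reranker` signature, `retrieve`, and `overlap_reranker` are hypothetical and only illustrate the plugin shape; a real re-ranking model would replace the word-overlap scoring.

```python
from typing import Callable, Optional

# Hypothetical plugin signature: (query, candidate docs) -> docs in new order.
Reranker = Callable[[str, list[str]], list[str]]


def retrieve(query: str, candidates: list[str],
             reranker: Optional[Reranker] = None, top_k: int = 2) -> list[str]:
    """Return the top_k candidates, reordered by the reranker if one is supplied."""
    ranked = reranker(query, candidates) if reranker else list(candidates)
    return ranked[:top_k]


def overlap_reranker(query: str, docs: list[str]) -> list[str]:
    """Toy re-ranker (assumption): orders documents by words shared with the query."""
    terms = set(query.lower().split())
    return sorted(docs, key=lambda d: len(terms & set(d.lower().split())), reverse=True)
```

Because the re-ranker is just a callable, a wrapper around any third-party re-ranking model could be passed in the same way without changing `retrieve`.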

@maximedupre

Are RAG applications limited to document processing, or do they extend to code-related tasks as well? For instance:

  • Existing code on which we need to run fix? or
  • New codes (eg. new service) which goes through incremental development

Usecase:

  • In a system comprised of numerous microservices that are regularly updated with new features or fixes, how can an agent acquire knowledge about these changes? Is RAG a viable method for this?
  • For the development of new microservices within an existing system, how can knowledge be transferred to agents to enhance the design and implementation process using RAG?

RAG can help if the documents are well organized. For pure code, currently, you can use 3rd party code chunk methods to help load the code into the vector dbs.

@thinkall Is it possible for you to provide an example of a 3rd party code chunk method? I'm very interested in extending the knowledge of an agent to my whole codebase :)

@raolak

raolak commented Apr 13, 2024 via email

@cforce

cforce commented Apr 13, 2024

@raolak

A solution for this has just been released
Checkout https://github.com/princeton-nlp/SWE-agent

@thinkall
Collaborator Author

@thinkall Is it possible for you to provide an example of a 3rd party code chunk method? I'm very interested in extending the knowledge of an agent to my whole codebase :)

Hi @maximedupre , please check out an example of using 3rd party chunk method here: https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat/#customizing-text-split-function
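
For pure code, one possible split function is sketched below, using only Python's standard `ast` module: it returns one chunk per top-level function or class, plus one chunk for the remaining module-level statements. The function name `split_python_source` is hypothetical, and how such a function is registered with the retrieve agent is covered in the linked blog section rather than shown here.

```python
import ast


def split_python_source(source: str) -> list[str]:
    """Split Python source into chunks: module-level code first, then each top-level def/class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: list[str] = []
    leftover: list[str] = []
    for node in tree.body:
        start = node.lineno
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Decorators sit above the def/class line, so start the chunk at the first one.
            if node.decorator_list:
                start = min(d.lineno for d in node.decorator_list)
            chunks.append("\n".join(lines[start - 1:node.end_lineno]))
        else:
            # Imports, constants, and other module-level statements are grouped together.
            leftover.append("\n".join(lines[start - 1:node.end_lineno]))
    if leftover:
        chunks.insert(0, "\n".join(leftover))
    return chunks
```

Each chunk keeps a whole function or class together, which tends to be a more useful retrieval unit for code than fixed-size text windows. Comments between top-level statements are dropped in this sketch.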

randombet self-assigned this on Jul 2, 2024
@tianlinzx

Any plan to integrate with GraphRAG ??

@wammar

wammar commented Aug 11, 2024

@thinkall @qingyun-wu

Some of my team members recently started using AutoGen with RAG and are interested in contributing, but it is unclear what you're actively working on and what tasks do you need help with. It'd be great to collaborate if there's good alignment between this work stream and Holistic Intelligence for Global Good.

@thinkall
Collaborator Author

Any plan to integrate with GraphRAG ??

We're working on it.
cc @randombet , @2152505

@thinkall
Collaborator Author

@thinkall @qingyun-wu

Some of my team members recently started using AutoGen with RAG and are interested in contributing, but it is unclear what you're actively working on and what tasks do you need help with. It'd be great to collaborate if there's good alignment between this work stream and Holistic Intelligence for Global Good.

Thank you very much, @wammar, for your feedback! The task list contains the RAG-related issues and PRs we're working on. You're very welcome to raise PRs that resolve existing issues, propose new features such as new vector dbs or new retrieve util functions (file parsing, chunking, etc.), or review PRs.

Any thoughts, suggestions, comments are very welcome.

rysweet added the 0.2 and needs-triage labels on Oct 2, 2024
@zhwuwuwu

zhwuwuwu commented Oct 18, 2024

Any plan to integrate with GraphRAG ??

We're working on it. cc @randombet

Is there any progress on it?

@devspacenine

How will the RAG pattern change with the new v0.4 architecture?

Status: In Progress