## Autogen Discord QA

I want to develop an LLM application that answers questions about Autogen. The application should be able to:
1. Search the Discord message history for answers.
2. Use multimodal capabilities to retrieve images as well.
3. Search the internet for additional information when needed.

To pull the message history, I am using a Chrome extension called [Discrub](https://chrome.google.com/webstore/detail/discrub/plhdclenpaecffbcefjmpkkbdpkmhhbj/related), as I had issues setting up the Discord API. <br>
The data pull covers messages up to `15th November 2023`.
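
The exported history has to end up as files in the `docs` folder before the retrieval agent can index it. The sketch below shows one way to do that conversion. It is not the exact process used here: the export layout (a JSON list of messages with `timestamp`, `author` and `content` keys), the helper name `export_to_text_docs`, and the output file names are all assumptions for illustration, so adjust them to whatever the export actually contains.

```python
import json
from pathlib import Path


def export_to_text_docs(export_file, docs_dir, messages_per_file=200):
    """Convert a JSON message export into plain-text files for retrieval.

    Assumption: the export is a JSON list of objects with "timestamp",
    "author" and "content" keys. "author" may be a string or an object
    with a "username" field.
    """
    messages = json.loads(Path(export_file).read_text(encoding="utf-8"))
    docs_dir = Path(docs_dir)
    docs_dir.mkdir(parents=True, exist_ok=True)

    lines = []
    for msg in messages:
        content = (msg.get("content") or "").strip()
        if not content:
            continue  # skip empty messages (e.g. attachment-only posts)
        author = msg.get("author", "unknown")
        if isinstance(author, dict):
            author = author.get("username", "unknown")
        lines.append(f"[{msg.get('timestamp', '')}] {author}: {content}")

    # Write the messages in fixed-size batches, one text file per batch.
    written = []
    for start in range(0, len(lines), messages_per_file):
        out = docs_dir / f"discord_{start // messages_per_file:04d}.txt"
        batch = lines[start:start + messages_per_file]
        out.write_text("\n".join(batch), encoding="utf-8")
        written.append(out)
    return written
```

Keeping the author and timestamp on each line means a retrieved chunk still shows who said what and when.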

In [1]:
import os
import sys
import json
import tempfile
from pathlib import Path
sys.path.append('./')

import openai
import autogen
import chromadb

config_list = autogen.config_list_from_dotenv(dotenv_file_path='../../.env')

In [2]:
from autogen.agentchat.contrib.retrieve_assistant_agent import RetrieveAssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

# Create an instance of RetrieveAssistantAgent
assistant = RetrieveAssistantAgent(
    name="assistant",
    system_message="You are a helpful assistant.",
    llm_config={
        "timeout": 600,
        "seed": 42,
        "config_list": config_list,
    },
)

path = Path(os.getcwd(), 'docs')
client = chromadb.PersistentClient(path=f"{os.getcwd()}/chromadb")

# Create an instance of RetrieveUserProxyAgent
ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    retrieve_config={
        "task": "default",
        "docs_path": str(path), 
        "chunk_token_size": 2000,
        "model": config_list[0]["model"],
        "client": client, 
        "embedding_model": "all-mpnet-base-v2",
    },
)


In [3]:
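# NOTE: Delete the existing collection before each run so the docs are re-indexed (this raises an error if the collection does not exist yet, e.g. on the first run)
client.delete_collection('autogen-docs')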

In [4]:
# From a txt document 
user_question = "How can Langchain be used with Autogen? Provide a code snippet example in your response."
ragproxyagent.initiate_chat(assistant, problem=user_question, clear_history=True) 
# ragproxyagent.send(recipient=assistant, message=user_question)

Trying to create collection.


INFO:autogen.retrieve_utils:Found 281 chunks.


doc_ids:  [['doc_159', 'doc_241', 'doc_205', 'doc_244', 'doc_135', 'doc_213', 'doc_12', 'doc_166', 'doc_141', 'doc_181', 'doc_106', 'doc_233', 'doc_195', 'doc_61', 'doc_229', 'doc_49', 'doc_152', 'doc_48', 'doc_93', 'doc_214']]
Adding doc_id doc_159 to context.
Adding doc_id doc_241 to context.
Adding doc_id doc_205 to context.
Adding doc_id doc_244 to context.
ragproxyagent (to assistant):

You're a retrieve augmented chatbot. You answer user's questions based on your own knowledge and the
context provided by the user. You should follow the following steps to answer a question:
Step 1, you estimate the user's intent based on the question and context. The intent can be a code generation task or
a question answering task.
Step 2, you reply based on the intent.
If you can't answer the question with or without the current context, you should reply exactly `UPDATE CONTEXT`.
If user's intent is code generation, you must obey the following rules:


In [None]:
# Print documents retrieved from Query
ragproxyagent.retrieve_docs("Chainlit")
ragproxyagent._results['documents']

In [None]:
# Print documents retrieved from Query
ragproxyagent.retrieve_docs("Langchain")
ragproxyagent._results['documents']
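
The raw `documents` list is hard to scan when the chunks are long. A small helper like the one below pairs each id with a shortened preview. This is only a sketch: `preview_results` is a hypothetical name, and it assumes the results dictionary has the nested one-list-per-query shape seen in the `doc_ids` output above, with `ids` and `documents` keys.

```python
def preview_results(results, max_chars=200):
    """Pair each retrieved id with a shortened preview of its text.

    Assumption: `results` looks like {"ids": [[...]], "documents": [[...]]},
    i.e. one inner list per query; only the first query is previewed.
    """
    ids = results["ids"][0]
    docs = results["documents"][0]
    previews = []
    for doc_id, text in zip(ids, docs):
        snippet = " ".join(text.split())  # collapse newlines and repeated spaces
        if len(snippet) > max_chars:
            snippet = snippet[:max_chars] + "..."
        previews.append((doc_id, snippet))
    return previews
```

If the assumed shape holds, `for doc_id, snippet in preview_results(ragproxyagent._results): print(doc_id, snippet)` prints one line per retrieved chunk.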