In [None]:
!pip install -Uq phidata groq tavily-python wikipedia pypdf beautifulsoup4 mistralai fastembed lancedb tantivy sentence-transformers

In [None]:
import os
from google.colab import userdata
os.environ['MISTRAL_API_KEY'] = userdata.get("MISTRAL_API_KEY")
os.environ['TAVILY_API_KEY'] = userdata.get("TAVILY_API_KEY")
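
The cell above relies on `google.colab.userdata`, which only exists inside Colab. For running the notebook elsewhere, a minimal sketch of a fallback is shown below. The helper name `ensure_api_key` and its `prompt_fn` parameter are hypothetical, introduced only for illustration: the helper reads the key from the environment and prompts for it once if it is missing.

```python
import os
from getpass import getpass


def ensure_api_key(name, prompt_fn=getpass):
    """Return the key from the environment, prompting once if it is missing.

    `ensure_api_key` is a hypothetical helper, not part of phidata or Colab.
    """
    value = os.environ.get(name)
    if not value:
        # Nothing set yet: ask the user without echoing the input.
        value = prompt_fn(f"Enter {name}: ").strip()
        if not value:
            raise ValueError(f"{name} is required but was not provided")
        os.environ[name] = value
    return value


# Usage (uncomment outside Colab):
# ensure_api_key("MISTRAL_API_KEY")
# ensure_api_key("TAVILY_API_KEY")
```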

In [None]:
from phi.agent import Agent
from phi.tools.wikipedia import WikipediaTools
from phi.tools.tavily import TavilyTools
from phi.model.mistral import MistralChat
from phi.knowledge.pdf import PDFKnowledgeBase, PDFReader
from phi.vectordb.lancedb import LanceDb, SearchType
from phi.embedder.sentence_transformer import SentenceTransformerEmbedder
from phi.document.chunking.document import DocumentChunking

In [None]:
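knowledge_base = PDFKnowledgeBase(
    path="/content/docs",  # folder containing the PDFs to index
    vector_db=LanceDb(
        table_name="pdfs",
        uri="tmp/lancedb",  # local LanceDB directory (relative path)
        search_type=SearchType.vector,  # dense vector similarity search
        embedder=SentenceTransformerEmbedder()  # embedder with its default model
    ),
    chunking_strategy=DocumentChunking(chunk_size=500),
    reader=PDFReader(chunk=True)
)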
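
To make `SearchType.vector` less of a black box, here is a purely conceptual sketch of embedding-based retrieval: each chunk is a vector, and the chunks closest to the query vector are returned. The toy two-dimensional vectors and the choice of cosine similarity are assumptions made for illustration; the index and distance metric LanceDB actually uses are not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the indices of the k chunks most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy "embeddings" for three chunks (made-up numbers).
chunks = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
print(top_k([1.0, 0.1], chunks, k=2))  # -> [0, 1]
```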

In [None]:
knowledge_base.load(recreate=True, upsert=True)
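
A quick sanity check that can be run before `load`: confirm that the folder passed as `path` actually contains PDFs. The helper `list_pdfs` is hypothetical and uses only the standard library; it says nothing about how `PDFKnowledgeBase` itself discovers files.

```python
from pathlib import Path


def list_pdfs(folder):
    """Return all PDF files under `folder`, sorted, or [] if it is not a directory."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".pdf"
    )


pdfs = list_pdfs("/content/docs")
print(f"Found {len(pdfs)} PDF file(s)")
for pdf in pdfs:
    print(" -", pdf.name)
```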

In [None]:
rag_agent = Agent(
    model=MistralChat(id="mistral-large-latest"),
    role="Search the documents for information",
    knowledge=knowledge_base,
    search_knowledge=True,
    show_tool_calls=True,
    markdown=True,
    instructions=["Answer the question accurately", "Do not hallucinate", "Always include sources"]
)

In [None]:
tool_agent = Agent(
    model=MistralChat(id="mistral-large-latest"),
    role="Search the web for information",
    tools=[TavilyTools()],
    instructions=["Answer in summarized form", "Always include sources"],
    show_tool_calls=True,
    markdown=True
)

In [None]:
tool_agent.print_response("What is Agentic AI?", stream=True)

In [None]:
from IPython.display import Markdown, display

query = "Summarize all these documents in bullet points"

In [None]:
multi_agent = Agent(
    model=MistralChat(id="mistral-large-latest"),
    team=[rag_agent, tool_agent],
    knowledge=knowledge_base,
    show_tool_calls=True,
    markdown=True,
    instructions=["Answer the question accurately", "Do not hallucinate", "Answer in summarized form", "Always include sources"]
)
multi_agent.print_response("What are the documents about?", stream=True)

In [None]:
response = multi_agent.run(query)

In [None]:
display(Markdown("# " + query))
display(Markdown(response.content))

# Summarize all these documents in bullet points


 - Running: transfer_task_to_agent_0(task_description=Summarize the documents in bullet points., expected_output=A summarized list of bullet points., additional_information=The documents are attached here.)

- **Implementation Details for Open-Domain QA**:
  - RAG-Token models use 15 retrieved documents.
  - RAG-Sequence models use 50 retrieved documents with Thorough Decoding.
  - Greedy decoding is employed for QA.
  - For Open-MSMarco and Jeopardy question generation, 10 retrieved documents are used.
  - A BART-large model is trained as a baseline.
  - Beam size of four and Fast Decoding are applied for RAG-Sequence models.

- **Human Evaluation**:
  - Figure 4 shows the user interface for evaluating factuality.
  - Annotators used the internet for research and followed detailed instructions.
  - Annotations from two underperforming annotators were removed.

- **Training Setup**:
  - All models are trained using Fairseq with mixed precision.
  - Training can be run on one GPU but is distributed across 8 NVIDIA V100 GPUs.
  - Maximum Inner Product Search with FAISS is used for document indexing.
  - Code is open-sourced and ported to HuggingFace Transformers.
  - Document index compression reduces CPU memory requirement to 36GB.

- **Document Posterior for Jeopardy Generation**:
  - Figure 2 illustrates the document posterior for each generated token.
  - High posteriors are observed for specific document generations.

- **Examples from Generation Tasks**:
  - RAG models generate more specific and factually accurate responses.
  - Examples include MS-MARCO, currency in Scotland, and Jeopardy questions.

- **Comparison with Thorne and Vlachos**:
  - RAG achieves accuracy within 2.7% of Thorne and Vlachos's model.
  - Top retrieved documents are from gold articles in 71% of cases.

- **Additional Results**:
  - RAG models are more factual and diverse than BART.
  - Learned retrieval improves results across tasks.
  - RAG's dense retriever outperforms BM25 on most tasks.

- **Index Hot-Swapping**:
  - RAG can update knowledge by swapping the index.
  - Comparisons show model adaptation to new information.

- **Models and Training**:
  - RAG-Sequence uses the same document for each target token.
  - RAG-Token can use different documents for each target token.
  - The retriever is based on DPR with a bi-encoder architecture.
  - The generator uses BART-large.
  - Training is done jointly without direct supervision.

- **Further Details on Open-Domain QA**:
  - Multiple answer annotations are used for training.
  - Answer candidates are filtered for TriviaQA.
  - Results are reported using DPR dataset splits.

- **Further Details on FEVER**:
  - FEVER classification involves regenerating the claim.
  - The final hidden state representation is used for classification.

- **Null Document Probabilities**:
  - Experimented with a "Null document" mechanism.
  - Different methods were tried but did not improve performance.

- **Parameters**:
  - RAG models have 626M trainable parameters, including BERT-base and BART-large components.
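
The summary above notes that RAG-Sequence uses the same document for every target token, while RAG-Token can use a different document per token. A minimal sketch of the two marginalizations follows, using made-up toy probabilities. The function names and numbers are illustrative assumptions; the formulas follow the definitions in the RAG paper (sum over documents of a product over tokens, versus a product over tokens of a sum over documents).

```python
import math


def rag_sequence_prob(doc_probs, token_probs):
    """p(y|x) = sum_z p(z|x) * prod_i p(y_i | x, z, y_<i)."""
    return sum(p_z * math.prod(tokens) for p_z, tokens in zip(doc_probs, token_probs))


def rag_token_prob(doc_probs, token_probs):
    """p(y|x) = prod_i sum_z p(z|x) * p(y_i | x, z, y_<i)."""
    n_tokens = len(token_probs[0])
    return math.prod(
        sum(p_z * tokens[i] for p_z, tokens in zip(doc_probs, token_probs))
        for i in range(n_tokens)
    )


# Toy setup: two retrieved documents, two target tokens.
doc_probs = [0.5, 0.5]        # p(z|x) for each document
token_probs = [
    [0.8, 0.2],               # p(y_i | x, z=doc 0) for tokens 0 and 1
    [0.2, 0.8],               # p(y_i | x, z=doc 1) for tokens 0 and 1
]

print("RAG-Sequence:", round(rag_sequence_prob(doc_probs, token_probs), 4))
print("RAG-Token:   ", round(rag_token_prob(doc_probs, token_probs), 4))
```

In this toy case each document supports a different token, so mixing documents per token (RAG-Token) assigns the sequence a higher probability than committing to a single document for the whole sequence (RAG-Sequence).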