## RAG: loading multiple URLs and bs4


In [None]:
import bs4
# Markdown and display both come from IPython.display; rich.jupyter.display
# expects (segments, text) and cannot render a Markdown object.
from IPython.display import Markdown, display
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Web Documents

In [None]:
urls = ['https://research.aimultiple.com/retrieval-augmented-generation/', 'https://kbourne.github.io/chapter1.html']

## Load Documents
### Target specific class from documents

In [None]:
strainer = bs4.SoupStrainer(class_=["article_articleDetail__dMzTY", "post-single"])

loader = WebBaseLoader(
    web_paths=urls,
    bs_kwargs=dict(
        parse_only=strainer
    ),
)
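
To see what `parse_only=strainer` does before fetching real pages, the same strainer can be applied to an inline HTML string. This is a minimal sketch: the HTML snippet is made up for illustration, and only the class names come from the cell above.

```python
import bs4

# Hypothetical page: navigation, one article container, and a footer.
html = """
<html><body>
  <nav class="site-nav">Home | About</nav>
  <div class="post-single"><h1>Intro to RAG</h1><p>Body text.</p></div>
  <footer class="site-footer">Copyright</footer>
</body></html>
"""

# Same strainer as in the loader: keep only tags carrying one of these classes.
strainer = bs4.SoupStrainer(class_=["article_articleDetail__dMzTY", "post-single"])

# parse_only makes BeautifulSoup discard everything outside the matching tags.
soup = bs4.BeautifulSoup(html, "html.parser", parse_only=strainer)
text = soup.get_text(separator=" ", strip=True)
print(text)
```

Only the content of the `post-single` container survives; the navigation and footer text never reach the document loader.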


In [None]:
loader.web_paths

['https://research.aimultiple.com/retrieval-augmented-generation/',
 'https://kbourne.github.io/chapter1.html']

In [None]:
docs = loader.load()

In [None]:
for doc in docs:
    print(f"Metadata: {doc.metadata}")
    print(f"Content: {doc.page_content[:100].strip()}")
    print("\n")

Metadata: {'source': 'https://research.aimultiple.com/retrieval-augmented-generation/'}
Content: Generative AI stats show that Gen AI tools and models like ChatGPT have the potential to automate kn


Metadata: {'source': 'https://kbourne.github.io/chapter1.html'}
Content: Introduction to Retrieval Augmented Generation (RAG)
    
Date: March 10, 2024  |  Estimate




In [None]:
'70%' in docs[0].page_content

True

# Embedding

In [None]:
embeddingOllama = OllamaEmbeddings(
    model='nomic-embed-text',
    show_progress=False,
)

# Split texts

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [None]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=2024,
    chunk_overlap=200
)

In [None]:
# Split the loaded documents into overlapping chunks.
# Alternative (not used here): text_splitter = SemanticChunker(embeddingOllama)
splits = text_splitter.split_documents(docs)

In [None]:
len(splits)

41
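
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sliding-window sketch. It is a simplification, not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter tries separators such as paragraph and line breaks before falling back to smaller units). The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


# Synthetic 5000-character text, split with the same settings as the notebook.
text = "".join(str(i % 10) for i in range(5000))
chunks = sliding_window_chunks(text, chunk_size=2024, chunk_overlap=200)

print(len(chunks))               # 3
print([len(c) for c in chunks])  # [2024, 2024, 1352]
print(chunks[0][-200:] == chunks[1][:200])  # True: the windows overlap by 200 characters
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.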

# Vector Database

In [None]:
# Embed the chunks and store them in a Chroma collection
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddingOllama,
    persist_directory=None,  # set to a directory path to keep the index on disk
)

# vectorstore.persist() # if 'persist_directory' path is provided

retriever = vectorstore.as_retriever()

## Retriever search types

In [None]:
retriever.allowed_search_types

('similarity', 'similarity_score_threshold', 'mmr')

In [None]:
retriever.search_type

'similarity'
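
The difference between `'similarity'` and `'mmr'` can be illustrated with toy 2-D vectors. This is a pure-Python sketch of the two selection rules, not the code Chroma or LangChain runs; the helper names and the vectors are assumptions made for the example.

```python
import math


def unit(deg):
    """Unit vector at the given angle, standing in for an embedding."""
    r = math.radians(deg)
    return [math.cos(r), math.sin(r)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k_similarity(query, vectors, k):
    """Plain similarity search: the k vectors closest to the query."""
    ranked = sorted(range(len(vectors)), key=lambda i: cosine(query, vectors[i]), reverse=True)
    return ranked[:k]


def mmr(query, vectors, k, lambda_mult=0.5):
    """Maximal marginal relevance: trade relevance against redundancy with earlier picks."""
    selected = []
    candidates = list(range(len(vectors)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, vectors[i])
            redundancy = max((cosine(vectors[i], vectors[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = unit(0)
vectors = [unit(10), unit(12), unit(-40), unit(90)]  # 0 and 1 are near-duplicates

print(top_k_similarity(query, vectors, k=2))  # [0, 1]
print(mmr(query, vectors, k=2))               # [0, 2]
```

Similarity search returns both near-duplicates, while MMR replaces the second one with a less similar but more diverse vector. On the real retriever, the search type is chosen when it is created, for example `vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 4})`.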

# Prompt Template

In [None]:
prompt_template = """
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know.

Question: {question}

Context: {context}

Answer:
"""

In [None]:
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template(prompt_template)

In [None]:
# Post-processing: join the retrieved chunks into one context string,
# separated by blank lines

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

## LLM from Ollama

In [None]:
from langchain_community.chat_models import ChatOllama

llm = ChatOllama(
    model="llama3.2",
    temperature=0.5
)

### Chain it all together with LangChain

In [None]:
rag_chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
)
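
The data flow through the chain can be traced in plain Python. The sketch below uses stand-ins rather than LangChain objects: `Doc`, `fake_retriever`, and `fake_llm` are hypothetical stubs, and the template is a shortened copy of the one above. It only shows which value ends up in which slot.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Stand-in for a LangChain Document; only page_content is needed here."""
    page_content: str


PROMPT_TEMPLATE = """
Use the following pieces of retrieved context to answer the question.

Question: {question}

Context: {context}

Answer:
"""


def fake_retriever(question):
    # A real retriever would run a vector search; this stub returns fixed chunks.
    return [Doc("RAG retrieves documents at query time."), Doc("Retrieved text is added to the prompt.")]


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


def fake_llm(prompt_text):
    # A real model would generate an answer; this stub just reports the prompt size.
    return f"(model reply to a prompt of {len(prompt_text)} characters)"


def run_chain(question):
    # Step 1: the dict step. The question goes to the retriever for "context"
    # and is passed through unchanged for "question".
    inputs = {"context": format_docs(fake_retriever(question)), "question": question}
    # Step 2: the prompt step fills the template slots.
    prompt_text = PROMPT_TEMPLATE.format(**inputs)
    # Step 3: the model step; its string output is what the chain returns.
    return prompt_text, fake_llm(prompt_text)


prompt_text, answer = run_chain("What is RAG?")
print(prompt_text)
print(answer)
```

The same input string is used twice, once as the search query and once as the question in the prompt, which is the role `RunnablePassthrough()` plays in the real chain.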

## QnA Section

In [None]:
def ask_rag(question: str) -> None:
    """
    Asks a question to the RAG chain and displays the answer in Markdown format.

    Parameters:
    question (str): The question to be asked.

    Returns:
    None
    """
    # Invoke the RAG chain with the question
    response = rag_chain.invoke(question)

    # Display the response in Markdown format
    display(Markdown(response))

In [None]:
question = ''
ask_rag(question)

I can help you answer questions about Retrieval Augmented Generation (RAG). What is your question?

In [None]:
# Question - run the chain
question = "What are the advantages of using RAG?"
ask_rag(question)

The potential advantages of using RAG include:

1. Improved accuracy and relevance
2. Customization
3. Flexibility
4. Expanding the model's knowledge beyond the training data.

These advantages are further explored in the context of leveraging LLM (Large Language Model) within a company's private or specific data needs.

In [None]:
ask_rag("What are the disadvantages of using RAG?")

I don't know the disadvantages of using RAG. The retrieved context only discusses its advantages, such as being superior for retrieving factual information that is not present in the LLM's training data or is private, and allowing for dynamic integration of external knowledge without modifying the model's weights. It does not mention any potential drawbacks or disadvantages of using RAG.

In [None]:
ask_rag("What are the limitations of using RAG?")

The limitations of using RAG (Retrieval-Augmented Generation) are not explicitly stated in the provided context. However, I can infer that it is generally superior to fine-tuning for retrieving factual information that is not present in the LLM's training data or is private.

I don't know the specific limitations of RAG beyond what is mentioned in the context.

In [None]:
ask_rag("Compare RAG with GenerativeAI")

Based on the provided context, I can compare RAG (Relevance-Aware Generation) with GenerativeAI.

While the text doesn't provide direct information about GenerativeAI's capabilities or methodology, we can infer some differences between RAG and GenerativeAI:

1. Data usage: RAG utilizes internal company data to improve its performance, whereas GenerativeAI seems to rely on external knowledge bases or training data.
2. Approach: RAG is a hybrid approach that combines retrieval-augmented generation with a specialist LM drafter and a generalist LM verifier. In contrast, the text suggests that conventional GenerativeAI lacks this level of internal understanding and data integration.
3. Goals: RAG aims to enhance factual recall and refine performance on specialized tasks by leveraging internal company data. GenerativeAI's goals are not explicitly stated in the context.

However, I don't know enough about GenerativeAI's specific capabilities, strengths, or weaknesses to make a more detailed comparison.

In [None]:
question = "Explain in brief RAG and fine-tuning. Also mention the source url from where the answer is being collected"
ask_rag(question)

Based on the provided context, here's a brief explanation of RAG and fine-tuning:

RAG (Relational Attention-based Graph) is generally superior for retrieving factual information that is not present in the LLM's training data or is private. It allows for dynamic integration of external knowledge without modifying the model's weights.

Fine-tuning, on the other hand, is more suitable for teaching the model specialized tasks or adapting it to a specific domain. However, it requires careful consideration of context window sizes and the potential for overfitting when fine-tuning on a specific dataset.

Source URL: Unfortunately, I couldn't find a specific source URL in the provided context, but the information seems to be based on general knowledge about RAG and fine-tuning in natural language processing (NLP) tasks.
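
The model cannot name a source here because `format_docs` passes only `page_content` into the prompt, so the `source` metadata never reaches it. A possible variant, sketched below, prefixes each chunk with its URL. `format_docs_with_sources` is a hypothetical helper, and `Doc` is a stand-in for the LangChain document class so the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document with page_content and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs_with_sources(docs):
    """Join chunks like format_docs, but label each one with its source URL."""
    return "\n\n".join(
        f"Source: {doc.metadata.get('source', 'unknown')}\n{doc.page_content}"
        for doc in docs
    )


docs = [
    Doc("RAG adds retrieved text to the prompt.",
        {"source": "https://research.aimultiple.com/retrieval-augmented-generation/"}),
    Doc("Introduction to Retrieval Augmented Generation (RAG)",
        {"source": "https://kbourne.github.io/chapter1.html"}),
]

context = format_docs_with_sources(docs)
print(context)
```

Using this function in place of `format_docs` in the chain would put the URLs into the context. Whether the model then cites them still depends on the model and the prompt wording.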

In [None]:
question = "Are there any other retrieval models like RAG?"
ask_rag(question)

Based on the context provided, it appears that Retrieval-augmented Generation (RAG) is a hybrid approach that combines elements of both retrieval and generation models.

As for other retrieval models like RAG, I'm not aware of any specific ones mentioned in the context. However, some other retrieval models mentioned are:

1. BART with Retrieval: This seems to be a variant of the BART model that incorporates retrieval capabilities.
2. BM25: A widely used retrieval algorithm for text search.
3. ColBERT Model: A contextualized word embedding model that can be used for retrieval tasks.
4. DPR (Document Passage Retrieval) Model: A state-of-the-art retrieval model that can be fine-tuned for specific tasks.

It's worth noting that RAG is a relatively new and emerging approach, and there may not be many other retrieval models specifically designed to work in a similar way.

In [None]:
# Example usage
question = "Are there any other retrieval models like RAG?"
ask_rag(question)

Based on the provided context, it appears that Retrieval-Augmented Generation (RAG) is a hybrid approach that combines elements of both retrieval and generation models to improve the quality and relevance of generated content.

As for other retrieval models like RAG, I couldn't find any specific information in the context. However, the context does mention some other retrieval models such as BM25, ColBERT Model, and DPR (Document Passage Retrieval) Model, but it doesn't explicitly state that they are similar to RAG.

If you're looking for alternative retrieval models that share similarities with RAG, I would suggest exploring research papers or academic articles on the topic of hybrid retrieval-augmented generation models.
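
The same question produced two different answers. One likely reason is that the model samples its output with `temperature=0.5`, so generation is not deterministic. The sketch below shows the general idea of temperature-scaled sampling with made-up scores; it is an illustration, not how Ollama is implemented internally.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

probs_warm = softmax_with_temperature(logits, temperature=0.5)
probs_cold = softmax_with_temperature(logits, temperature=0.01)

print([round(p, 3) for p in probs_warm])  # the top token is likely but not certain
print([round(p, 3) for p in probs_cold])  # almost all mass on the top token

# Sampling at temperature 0.5 can pick different tokens on different draws.
rng = random.Random(0)
draws = rng.choices(range(len(logits)), weights=probs_warm, k=20)
print(draws)
```

Lowering `temperature` toward 0 in `ChatOllama` should make repeated runs more consistent, at the cost of less varied wording.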

In [None]:
ask_rag("provide me with all the links available")

I don't know. The provided context doesn't mention any specific links. It appears to be a discussion about question-answering tasks and the strengths and weaknesses of different approaches, including Retrieval-Augmented Generative Models (RAGs) and other techniques. If you could provide more context or clarify what you're looking for, I'd be happy to try and help further.

In [None]:
ask_rag("what is estimated reading time?")

I don't know the answer to that question. The context provided only discusses Retrieval Augmented Generation (RAG) and its potential for information overload, but does not mention estimated reading time.