# RAG Implementation For Wikipedia Search Assistant

## References

[Wikipedia](https://en.wikipedia.org/wiki/)


## Description
This experiment implements a Retrieval-Augmented Generation (RAG) pipeline that answers user queries by retrieving relevant Wikipedia passages and passing them to a language model as context.



## Setup

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
cd "YOUR-PATH"

In [None]:
%%capture
# Install necessary libraries (the %%capture cell magic must be the first line of the cell)
!pip install langchain langchain-community wikipedia wikipedia-api faiss-cpu chromadb tiktoken sentence-transformers openai


In [None]:
# Import libraries used to build the RAG pipeline.
# Integrations are imported from langchain_community (installed above),
# since the equivalent langchain.* import paths are deprecated.
import time

from langchain_community.document_loaders import WikipediaLoader
from langchain_community.embeddings import (
    HuggingFaceEmbeddings,
    OpenAIEmbeddings,
    SentenceTransformerEmbeddings,
)
from langchain_community.vectorstores import Chroma
from langchain_community.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from openai import OpenAIError  # Generic OpenAI exception class

In [None]:
# Set up OpenAI API Key
import os
os.environ['OPENAI_API_KEY'] = 'YOUR-OPENAI-API-KEY'
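
As an alternative to pasting the key into the notebook, the key can be read from the environment and requested interactively only when it is missing. The following is a minimal sketch under that assumption; the helper name `ensure_api_key` and its `prompt` parameter are hypothetical and not part of the original notebook.

```python
import os
from getpass import getpass


def ensure_api_key(var_name="OPENAI_API_KEY", prompt=getpass):
    """Return the key from the environment, prompting once if it is missing."""
    key = os.environ.get(var_name)
    if not key:
        # getpass hides the typed value, so the key does not end up in cell output
        key = prompt(f"Enter {var_name}: ").strip()
        if not key:
            raise ValueError(f"{var_name} was not provided")
        # Export it so libraries that read the environment can find it
        os.environ[var_name] = key
    return key
```

Calling `ensure_api_key()` once before building the chain should be enough, and the key itself never has to be saved with the notebook.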

In [None]:
# Load Wikipedia Articles
# Define a function to fetch Wikipedia articles and convert them into documents.
# Each query can return several articles, so the result is usually longer than `queries`.
def load_wikipedia_articles(queries):
    documents = []
    for query in queries:
        try:
            loader = WikipediaLoader(query=query)
            docs = loader.load()
            documents.extend(docs)
        except Exception as e:
            print(f"Error loading {query}: {e}")
    return documents

In [None]:
# Example: Load Wikipedia articles based on user-defined queries
queries = ["Artificial Intelligence", "Machine Learning", "Natural Language Processing"]
documents = load_wikipedia_articles(queries)
print(f"Loaded {len(documents)} documents.")

Loaded 75 documents.


In [None]:
# Load the embedding model
from sentence_transformers import SentenceTransformer  # Needed for the direct model load below

embedding_model = SentenceTransformer('all-MiniLM-L6-v2')  # Local embedding model

# Use HuggingFaceEmbeddings wrapper for compatibility with LangChain
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")


In [None]:
# Create Chroma Vector Store
# Store the embeddings in a Chroma database
vectorstore = Chroma.from_documents(documents, embeddings)
print("Chroma vector store created successfully!")

Chroma vector store created successfully!
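
To make the retrieval step less opaque, here is a toy illustration of what a similarity search over stored embeddings does conceptually: score every document vector against the query vector and keep the best matches. This is only a brute-force sketch with made-up 3-dimensional vectors. Chroma uses its own index and configured distance metric, and the names `cosine_similarity` and `top_k` are illustrative assumptions rather than Chroma or LangChain APIs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy "embeddings": documents 0 and 1 point in nearly the same direction as the query
toy_docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_query = [1.0, 0.05, 0.0]
print(top_k(toy_query, toy_docs, k=2))  # prints [0, 1]
```

In the real pipeline the vectors come from the `all-MiniLM-L6-v2` model and have several hundred dimensions, but the ranking idea is the same.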


In [None]:
# Set Up RAG Chain
# Combine retriever with an LLM (GPT-4 or GPT-3.5)
retriever = vectorstore.as_retriever()
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4"),
    retriever=retriever,
    return_source_documents=True
)
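
`RetrievalQA.from_chain_type` defaults to the "stuff" chain type, which places all retrieved passages into a single prompt ahead of the question. The sketch below illustrates that idea in plain Python. The function name `build_stuff_prompt`, the `max_chars` budget, and the prompt wording are assumptions made for illustration, not LangChain's actual template.

```python
def build_stuff_prompt(question, passages, max_chars=4000):
    """Concatenate retrieved passages into one context block, then append the question."""
    context_parts = []
    used = 0
    for passage in passages:
        # Stop adding passages once the (hypothetical) character budget would be exceeded
        if used + len(passage) > max_chars:
            break
        context_parts.append(passage)
        used += len(passage)
    context = "\n\n".join(context_parts)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Because every retrieved passage is sent to the model in one request, the number and length of retrieved documents directly affect prompt size and cost.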




In [None]:
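def ask_questions_in_batches(queries, batch_size=3, delay=10, max_retries=3):
    """Run each query through qa_chain, processing the list in batches.

    A query that raises OpenAIError is retried up to `max_retries` times,
    waiting `delay` seconds before each retry, and is skipped after that.
    """
    results = []
    for i in range(0, len(queries), batch_size):
        batch = queries[i:i+batch_size]
        for query in batch:
            print(f"Processing query: {query}")
            for attempt in range(1, max_retries + 1):
                try:
                    result = qa_chain({"query": query})
                except OpenAIError as e:
                    print(f"OpenAI API error (attempt {attempt}/{max_retries}): {e}")
                    if attempt < max_retries:
                        print(f"Retrying in {delay} seconds...")
                        time.sleep(delay)
                    continue
                results.append({
                    "query": query,
                    "answer": result["result"],
                    "sources": [doc.metadata.get("source", "Unknown Source") for doc in result["source_documents"]]
                })
                time.sleep(1)  # Small delay between individual queries
                break
            else:
                # Runs only when every attempt failed (the loop never hit `break`)
                print(f"Giving up on query: {query}")
    return results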

In [None]:
# Example Query
user_query = "What is the impact of Artificial Intelligence on society?"
ask_questions_in_batches([user_query])

Processing query: What is the impact of Artificial Intelligence on society?


[{'query': 'What is the impact of Artificial Intelligence on society?',
  'answer': 'Artificial Intelligence (AI) has a significant impact on society in various ways. Here are a few examples:\n\n1. Efficiency and Productivity: AI can automate repetitive tasks, freeing up time for individuals to focus on more complex tasks that require critical thinking and personal touch.\n\n2. Decision-making: AI can analyze vast amounts of data to identify patterns and trends that humans might overlook, aiding in decision-making processes in fields such as finance or healthcare.\n\n3. Medical Diagnosis and Research: AI can assist doctors in diagnosing diseases or suggest treatment plans. It can also speed up the process of drug discovery and medical research.\n\n4. Agriculture: AI can help farmers monitor crop and soil health, predict weather conditions, and optimize resource usage, leading to increased crop yields and sustainability.\n\n5. Job Market Changes: While AI can lead to job displacement in