### **📌 AI-Powered Real-Time Search Engine (RAG System) 🚀**

This script implements a **Retrieval-Augmented Generation (RAG) system** that performs real-time search, retrieves relevant articles, processes them into embeddings, and generates AI-powered responses using a large language model (LLaMA-3.3-70B) via the Groq API.

#### **Features:**
- ✅ Google Search API for fetching real-time results
- ✅ Newspaper3k for full-article extraction
- ✅ FAISS Vector Database for fast semantic search
- ✅ Sentence-Transformers for text embeddings
- ✅ LLaMA-3.3-70B via ChatGroq for AI-powered answers

#### **🚀 Developed with scalability, modularity, and performance in mind!**

### **📌 Import Required Libraries**

In [1]:
from googleapiclient.discovery import build  # Google Search API
from newspaper import Article  # Article extraction
from langchain.vectorstores import FAISS  # FAISS for vector search
from sentence_transformers import SentenceTransformer  # Embeddings
from langchain_groq import ChatGroq  # Groq API for LLM
from langchain.schema import Document  # LangChain document structure
from langchain.embeddings import HuggingFaceEmbeddings  # Open-source embeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter # Recursive character text splitter
from rich.console import Console  # Rich console for formatted terminal output
from rich.markdown import Markdown  # Renders Markdown (headings, bold text) in the console
import numpy as np  # For handling numerical operations
import time  # Time delays to avoid rate limiting
import os  # Access to environment variables
import warnings  # Warning filters
from dotenv import load_dotenv  # Load environment variables
from huggingface_hub import login  # Log in to Hugging Face

load_dotenv()  # Load environment variables from the .env file
console = Console() # Console instance

warnings.filterwarnings("ignore") # Ignore warnings



### **📌 API Keys & Configuration (Replace with your own keys in .env file)**

In [2]:
google_search_key = os.getenv("GOOGLE_SEARCH_API_KEY")  # Google Search API key (Generate from: https://cse.google.com/cse/all)
search_engine_id = os.getenv("SEARCH_ENGINE_ID")  # Search engine ID (copy the ID after creating the engine)
grok_api_key = os.getenv("LANGCHAIN_GROK_API_KEY")  # Groq API key (Generate from: https://console.groq.com/)
hugginface_api_key = os.getenv("HUGGINFACE_LOGIN_API_KEY")  # Hugging Face login token (Generate from: https://huggingface.co/)
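
Because `os.getenv` returns `None` for an unset variable, a missing key only shows up later as an authentication error. The following is a minimal sketch of an up-front check, assuming the four variable names used above; `missing_env_vars` is a hypothetical helper, not part of any library.

```python
import os

# Variable names taken from the configuration cell above
REQUIRED_ENV_VARS = [
    "GOOGLE_SEARCH_API_KEY",
    "SEARCH_ENGINE_ID",
    "LANGCHAIN_GROK_API_KEY",
    "HUGGINFACE_LOGIN_API_KEY",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary.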

In [3]:
login(token=hugginface_api_key)  # Token is read from the .env file

### **🔎 Function: Perform a Google Search**

In [4]:
def google_custom_search(query):
    """
    Queries Google Search API to fetch top search results.
    
    Args:
        query (str): The search query entered by the user.
    Returns:
        list: A list of search result dictionaries containing title, link, and snippet.
    """
    service = build("customsearch", "v1", developerKey=google_search_key)
    result = service.cse().list(q=query, cx=search_engine_id, num=3).execute()
    
    search_results = []
    if "items" in result:
        for item in result["items"]:
            search_results.append({
                "title": item["title"],
                "link": item["link"],
                "snippet": item.get("snippet", "")  # Fall back to "" if a result has no snippet
            })
    
    return search_results
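
The response handling can be pulled into a pure function so it can be checked without calling the API. This is a sketch only: it assumes the response shape used above (an optional `"items"` list of dictionaries with `title`, `link`, and `snippet`), `parse_search_items` is a hypothetical helper, and the sample response is made up for illustration.

```python
def parse_search_items(response):
    """Turn a raw search response dict into a list of title/link/snippet dicts.

    Items without a link are skipped, a missing title or snippet becomes "",
    and duplicate links are dropped (the first occurrence is kept).
    """
    results = []
    seen_links = set()
    for item in response.get("items", []):
        link = item.get("link")
        if not link or link in seen_links:
            continue
        seen_links.add(link)
        results.append({
            "title": item.get("title", ""),
            "link": link,
            "snippet": item.get("snippet", ""),
        })
    return results


# Made-up response, for illustration only
fake_response = {
    "items": [
        {"title": "First", "link": "https://example.com/a", "snippet": "One"},
        {"title": "Duplicate", "link": "https://example.com/a"},
        {"title": "No snippet", "link": "https://example.com/b"},
    ]
}
print(parse_search_items(fake_response))
```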


### **📰 Function: Extract Full Article Text**

In [5]:
def extract_full_article(url):
    """
    Extracts full text from an online article using Newspaper3k.
    
    Args:
        url (str): URL of the news article.
    Returns:
        str: Extracted article text (up to 5000 characters).
    """
    try:
        article = Article(url)
        article.download()
        article.parse()
        return article.text[:5000]  # Limit extracted text to 5000 characters
    except Exception as e:
        return f"Could not extract article: {str(e)}"

### **🔍 Perform a real-time search**

In [6]:
query = "Which LLM model was released recently?"
search_results = google_custom_search(query)

### **📝 Extract full articles from search results**

In [7]:
def print_pretty_results(search_results, extracted_articles):
    for idx, result in enumerate(search_results):
        article_content = extracted_articles[idx][:500]  # Limit display to first 500 chars
        
        print("=" * 80)
        print(f"🔹 [Title]: {result['title']}\n")
        print(f"🔗 [URL]: {result['link']}\n")
        print(f"📝 [Snippet]: {result['snippet']}\n")
        print(f"📄 [Extracted Article]: {article_content}...\n")
        print("=" * 80)

In [8]:
all_text = []
for idx, result in enumerate(search_results):
    # Extract article content
    full_text = extract_full_article(result['link'])
    
    all_text.append(full_text)
    time.sleep(2)  # Pause between downloads to avoid overloading the source sites
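
Note that `extract_full_article` returns an error string instead of raising, so a failed download ends up in `all_text` and would be chunked and embedded like a real article. One possible way to filter such entries before chunking is sketched below, assuming the error prefix used above; `split_extractions` is a hypothetical helper and the sample inputs are invented.

```python
FAILURE_PREFIX = "Could not extract article:"  # Prefix returned by extract_full_article on error


def split_extractions(urls, texts):
    """Separate usable article texts from failed or empty extractions.

    Returns a tuple (usable_texts, failed_urls).
    """
    usable, failed = [], []
    for url, text in zip(urls, texts):
        if not text.strip() or text.startswith(FAILURE_PREFIX):
            failed.append(url)
        else:
            usable.append(text)
    return usable, failed


# Invented inputs, for illustration only
sample_urls = ["https://example.com/ok", "https://example.com/broken", "https://example.com/empty"]
sample_texts = ["Some article text.", "Could not extract article: timeout", "   "]
usable_texts, failed_urls = split_extractions(sample_urls, sample_texts)
print(f"{len(usable_texts)} usable, {len(failed_urls)} failed")
```

In the notebook, the equivalent call would take the links from `search_results` and the texts from `all_text`.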

- Example usage

In [9]:
print_pretty_results(search_results, all_text)

🔹 [Title]: Introducing Meta Llama 3: The most capable openly available LLM ...

🔗 [URL]: https://ai.meta.com/blog/meta-llama-3/

📝 [Snippet]: Apr 18, 2024 ... Over the coming months, we'll release multiple models with new capabilities including multimodality, the ability to converse in multiple ...

📄 [Extracted Article]: Today, we’re excited to share the first two models of the next generation of Llama, Meta Llama 3, available for broad use. This release features pretrained and instruction-fine-tuned language models with 8B and 70B parameters that can support a broad range of use cases. This next generation of Llama demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning. We believe these are the best open source models of their c...

🔹 [Title]: Best 22 Large Language Models (LLMs) (February 2025)

🔗 [URL]: https://explodingtopics.com/blog/list-of-llms

📝 [Snippet]: May 13, 2024 ... As a new upgrade fro

### **🧩 Chunk the extracted text for processing**

In [10]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
documents = text_splitter.create_documents(all_text)
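
To see how `chunk_size` and `chunk_overlap` interact, here is a simplified fixed-width sliding window in plain Python. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to break on separators such as paragraphs and spaces; `chunk_text` is a hypothetical helper that only illustrates the size and overlap arithmetic.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-width chunks whose neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # How far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The window already reached the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# → ['abcd', 'defg', 'ghij']
```

Each chunk starts `chunk_size - chunk_overlap` characters after the previous one, so a sentence cut at a chunk boundary still appears partly in the next chunk.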

- Print sample text chunks

In [11]:
for doc in documents[:3]:
    print(f"🔹 Chunk: {doc.page_content}")

🔹 Chunk: Today, we’re excited to share the first two models of the next generation of Llama, Meta Llama 3, available for broad use. This release features pretrained and instruction-fine-tuned language models with 8B and 70B parameters that can support a broad range of use cases. This next generation of Llama demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning. We believe these are the best open source models of their
🔹 Chunk: these are the best open source models of their class, period. In support of our longstanding open approach, we’re putting Llama 3 in the hands of the community. We want to kickstart the next wave of innovation in AI across the stack—from applications to developer tools to evals to inference optimizations and more. We can’t wait to see what you build and look forward to your feedback.
🔹 Chunk: With Llama 3, we set out to build the best open models that are on par with the best pr

### **🔬 Create Embeddings (Using SBERT Model)**

In [12]:
embedding_model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
texts = [doc.page_content for doc in documents]
text_embeddings = embedding_model.encode(texts)

### **🗂 Store Data in FAISS Vector Database**

In [13]:
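faiss_docs = [Document(page_content=text) for text in texts]
text_embeddings = np.array(text_embeddings)  # Kept for inspection; not used by the index below
# FAISS.from_documents embeds the documents itself using the embedding wrapper passed in
vector_db = FAISS.from_documents(faiss_docs, HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"))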

### **🔄 Function: Retrieve Relevant Chunks**

In [14]:
def retrieve_relevant_chunks(query, vector_db, embedding_model, top_k=3):
    """
    Retrieve the most relevant text chunks using vector similarity search.
    
    Args:
        query (str): The user query.
        vector_db (FAISS): FAISS vector database.
        embedding_model: Sentence transformer model.
        top_k (int): Number of top results to retrieve.
    Returns:
        list: Retrieved relevant document chunks.
    """
    query_embedding = embedding_model.encode([query])
    # The keyword for the number of results is `k`; convert the NumPy vector to a plain list
    relevant_docs = vector_db.similarity_search_by_vector(query_embedding[0].tolist(), k=top_k)
    return [doc.page_content for doc in relevant_docs]
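
Conceptually, a flat vector index answers a query by comparing it against every stored vector and keeping the closest ones. The sketch below shows that idea in plain Python with Euclidean distance on toy 2-D vectors. It is not how FAISS is implemented, the metric of a real index depends on its configuration, and `top_k_nearest` is a hypothetical helper.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_nearest(query_vec, vectors, k=3):
    """Return (index, distance) pairs for the k stored vectors closest to query_vec."""
    scored = [(i, l2_distance(query_vec, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # Smallest distance first
    return scored[:k]


# Toy vectors, for illustration only
stored = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
for index, distance in top_k_nearest([1.0, 0.0], stored, k=2):
    print(index, round(distance, 3))
```

The indices returned here play the role of the document chunks that the retrieval function maps back to `page_content`.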

In [15]:
retrieved_chunks = retrieve_relevant_chunks("What is the latest LLM model?", vector_db, embedding_model)
print(f"🔍 Retrieved Relevant Chunks for LLaMA: \n{retrieved_chunks}")

🔍 Retrieved Relevant Chunks for LLaMA: 
['With Llama 3, we set out to build the best open models that are on par with the best proprietary models available today. We wanted to address developer feedback to increase the overall helpfulness of Llama 3 and are doing so while continuing to play a leading role on responsible use and deployment of LLMs. We are embracing the open source ethos of releasing early and often to enable the community to get access to these models while they are still in development. The text-based models we are', 'LLM Name Developer Release Date Access Parameters DeepSeek R1 DeepSeek January 20, 2025 Open-Source 671 billion GPT-4o OpenAI May 13, 2024 API Unknown Claude 3.5 Anthropic June 20, 2024 API Unknown Grok-1 xAI November 4, 2023 Open-Source 314 billion Mistral 7B Mistral AI September 27, 2023 Open-Source 7.3 billion PaLM 2 Google May 10, 2023 Open-Source 340 billion Falcon 180B Technology Innovation Institute September 6, 2023 Open-Source 180 billion Stable 

### **🤖 Function: Generate AI Response Using LLaMA**

In [16]:
def stream_llm_response(query, retrieved_chunks):
    """
    Generate a concise and structured response using LLaMA-3.3-70B based on retrieved search results.
    
    Args:
        query (str): The user query.
        retrieved_chunks (list): Retrieved relevant document chunks.
    Returns:
        AIMessage: The model's reply; the generated text is available via `.content`.
    """
    llm = ChatGroq(
        temperature=0,
        groq_api_key=grok_api_key,
        model_name="llama-3.3-70b-versatile"
    )
    
    # Combine retrieved text chunks
    context_text = "\n".join(retrieved_chunks)
    
    prompt = f"""
    🎯 You are an AI expert providing well-structured, **concise**, and **engaging** responses.
    
    🔍 **User Query:** {query}
    
    🔎 **Extracted Information from Trusted Sources:** 
    {context_text}
    
    ✨ **Task:** 
    - Highlight **key points** in **structured bullet points** ✅
    - Use **emojis** to enhance readability 🎭
    - Avoid unnecessary details 🚀
    - Ensure a **professional yet engaging tone** 🎤
    - Provide a **brief yet impactful conclusion** ✍️

    Now, generate the structured response with **emojis**.
    """
    
    return llm.invoke(prompt)
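
The function above joins every retrieved chunk into the prompt as-is. If more chunks are retrieved, it may help to drop duplicates and cap the context length first. The following is a sketch under stated assumptions: `build_context` is a hypothetical helper, and the 4000-character default is an arbitrary illustrative budget, not a limit of the model or the Groq API.

```python
def build_context(chunks, max_chars=4000):
    """Join unique, non-empty chunks with newlines without exceeding max_chars."""
    selected = []
    seen = set()
    used = 0
    for chunk in chunks:
        text = chunk.strip()
        if not text or text in seen:
            continue  # Skip blanks and exact duplicates
        cost = len(text) + (1 if selected else 0)  # +1 for the joining newline
        if used + cost > max_chars:
            break  # Adding this chunk would exceed the budget
        selected.append(text)
        seen.add(text)
        used += cost
    return "\n".join(selected)


print(build_context(["Llama 3 overview", "Llama 3 overview", "DeepSeek R1 overview"]))
```

The result could be used in place of `"\n".join(retrieved_chunks)` when building `context_text`.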


### **📝 Generate AI-Powered Answer**

In [17]:
ai_response = stream_llm_response("Which LLM model was released recently?", retrieved_chunks)
print(ai_response.content)

🎯 **Recent LLM Model Releases** 🚀
Here are the key points:
* 📆 **Latest Models**: Llama 3, DeepSeek R1, and GPT-4o are some of the recently released LLM models 🤖
* 📊 **Model Parameters**: Llama 3 has 8B and 70B parameter models, while DeepSeek R1 has 671 billion parameters 📈
* 📝 **Access**: Llama 3 and DeepSeek R1 are open-source, while GPT-4o is available through an API 📊
* 🚀 **Industry Growth**: The global large language model market is projected to grow from $6.5 billion in 2024 to $140.8 billion by 2033 🚀

🎤 **Conclusion**: The LLM industry is rapidly evolving with new models being released regularly 🎯. Llama 3 and DeepSeek R1 are two of the latest models that offer improved performance and open-source access 🤝. As the industry continues to grow, we can expect to see more innovative models and applications in the future 🚀. ✍️


#### **✅ This script efficiently combines real-time search, retrieval, and generation.**
#### **✅ Now ready for further optimization & deployment (Streamlit, FastAPI, etc.)!**