<a href="https://colab.research.google.com/github/rmit-ir/Tutotrial-Practical-LLMs/blob/main/LLM_Tutorial_Challenge2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Installing Required Python Packages

This script installs three essential Python packages:

1. **openai** - Enables interaction with OpenAI's API for AI-powered applications.
2. **requests** - A simple HTTP library for sending API requests and handling responses.
3. **serpapi** - A client library for the SerpAPI service, which allows programmatic access to Google search results.

Together they cover what the notebook needs: access to language models, HTTP API calls, and structured search engine results.

## Installation Commands

Run the following commands in the Notebook to install the required packages.


In [1]:
# Install the OpenAI library for accessing GPT models
!pip install openai

# Install the requests library for making HTTP API calls
!pip install requests

# Install the SerpAPI client to retrieve structured Google search results
!pip install serpapi

Collecting serpapi
  Downloading serpapi-0.1.5-py2.py3-none-any.whl.metadata (10 kB)
Downloading serpapi-0.1.5-py2.py3-none-any.whl (10 kB)
Installing collected packages: serpapi
Successfully installed serpapi-0.1.5


In [2]:
# Import the requests library for making HTTP requests to APIs
import requests

# Import the OpenAI library for AI model interactions
import openai

# Import the JSON module for handling JSON data
import json

# Import the search function from SerpAPI to retrieve Google search results
from serpapi import search

# Import pandas for data manipulation and analysis
import pandas as pd

from google.colab import (
    userdata,
)  # Provides access to user-specific information in Google Colab, used to access the user's secret API key

import os
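# Import display helpers from the public IPython.display module
# (importing them from IPython.core.display is deprecated)
from IPython.display import display, HTML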

# Fetching Documents from SerpAPI

This function retrieves search results from Google using SerpAPI and structures them into a Pandas DataFrame.

## **Functionality**
- Uses SerpAPI to fetch search results based on a query.
- Extracts document titles, links, and snippets.
- Returns the data in a structured Pandas DataFrame.

## **Parameters**
- `query` *(str)*: The search query string.
- `num_results` *(int, optional)*: Number of search results to fetch (default: 10).

## **Returns**
- `pd.DataFrame`: A DataFrame containing document titles, links, and snippets.
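
The extraction step described above can be tried offline, without calling SerpAPI. The sketch below is illustrative only: `organic_results_to_frame` and the `sample` payload are hypothetical names and data, shaped like the three fields the function reads (`title`, `link`, `snippet`) under an `organic_results` key.

```python
import pandas as pd


def organic_results_to_frame(results):
    """Turn a SerpAPI-style response dict into a DataFrame of title, link, and snippet."""
    rows = [
        {
            "title": item.get("title"),
            "link": item.get("link"),
            "snippet": item.get("snippet", ""),  # a result may have no snippet
        }
        for item in results.get("organic_results", [])
    ]
    # Fix the column order so an empty response still yields the expected columns
    return pd.DataFrame(rows, columns=["title", "link", "snippet"])


# Hypothetical payload used only for illustration
sample = {
    "organic_results": [
        {"title": "Example A", "link": "https://example.com/a", "snippet": "First hit."},
        {"title": "Example B", "link": "https://example.com/b"},
    ]
}

frame = organic_results_to_frame(sample)
empty = organic_results_to_frame({})
```

Passing `columns=` explicitly means later code that selects `snippet` does not fail when a query returns no organic results.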

# Loading the SerpAPI Key

The SerpAPI key is read from Colab user secrets under the name `SERP_API`, so no key is stored in the notebook itself.


In [3]:
from google.colab import userdata

# Retrieve the SERP API key from user secrets
SERPAPI_KEY = userdata.get('SERP_API')

if SERPAPI_KEY:
  print("SERP_API found in user secrets.")
else:
  print("SERP_API not found in user secrets.")

SERP_API found in user secrets.
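
Colab user secrets are only available inside Colab. A minimal sketch of a fallback for running the notebook elsewhere is shown below; `get_secret` is a hypothetical helper, and reading the key from an environment variable of the same name is an assumption, not something the notebook does.

```python
import os


def get_secret(name):
    """Return a secret from Colab user secrets, falling back to an environment variable."""
    try:
        from google.colab import userdata  # only importable inside Colab

        value = userdata.get(name)
        if value:
            return value
    except Exception:
        # Not running in Colab, or the secret is missing or not shared with this notebook
        pass
    return os.environ.get(name)
```

With this helper, `SERPAPI_KEY = get_secret("SERP_API")` would behave the same in Colab and fall back to `os.environ` in a local Jupyter session.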


In [4]:
def fetch_documents_from_serpapi(query, num_results=10):
    """
    Fetch documents from SerpApi based on the query.

    Args:
        query (str): The search query string.
        num_results (int): The number of search results to fetch (default is 10).

    Returns:
        pd.DataFrame: DataFrame containing the document titles, links, and snippets.
    """

    # Define the search parameters
    params = {
        "q": query,               # Search query
        "api_key": SERPAPI_KEY,       # Your SerpApi API key
        "num": num_results,       # Number of results to retrieve
        "engine": "google"        # Search engine to use (Google is the default)
    }

    # Perform the search query
    results = search(params)

    # Extract the relevant data (document titles, links, and snippets)
    documents = []
    for result in results.get("organic_results", []):
        documents.append({
            "title": result.get("title"),
            "link": result.get("link"),
            "snippet": result.get("snippet", ""),
        })

    # Convert to DataFrame for better readability
    df = pd.DataFrame(documents)

    return df

In [5]:
# Example usage: Search for "Python"
query = "Python"
documents_df = fetch_documents_from_serpapi(query)

# Display the results
documents_df.head()

| | title | link | snippet |
|---|---|---|---|
| 0 | Welcome to Python.org | https://www.python.org/ | Python is a programming language that lets you... |
| 1 | Python Tutorial | https://www.w3schools.com/python/ | Learn Python. Python is a popular programming ... |
| 2 | Online Python - IDE, Editor, Compiler, Interpr... | https://www.online-python.com/ | Build and Run your Python code instantly. Onli... |
| 3 | Python - Visual Studio Marketplace | https://marketplace.visualstudio.com/items?ite... | The Python extension provides pluggable access... |
| 4 | Online Python Compiler (Interpreter) | https://www.programiz.com/python-programming/o... | Write and run your Python code using our onlin... |


# Displaying Results in a Google-like Layout

In [6]:
def create_google_like_page(documents_df):
    """
    Display a Google-like search results page from the fetched documents.

    Args:
        documents_df (pd.DataFrame): DataFrame containing the document titles, links, and snippets.
    """

    # Create an HTML structure for displaying the search results in a Google-like layout
    html_content = """
    <html>
    <head>
        <title>Google Search Results</title>
        <style>
            body {
                font-family: Arial, sans-serif;
                margin: 20px;
                background-color: #f9f9f9;
            }
            .search-results {
                max-width: 800px;
                margin: auto;
                background-color: white;
                padding: 20px;
                box-shadow: 0px 4px 6px rgba(0, 0, 0, 0.1);
                border-radius: 8px;
            }
            .result-item {
                margin-bottom: 20px;
            }
            .result-title {
                font-size: 20px;
                color: #1a0dab;
                text-decoration: none;
            }
            .result-title:hover {
                text-decoration: underline;
            }
            .result-snippet {
                color: #4d5156;
                font-size: 14px;
                margin-top: 5px;
            }
            .result-link {
                color: #006621;
                font-size: 14px;
            }
            .result-link:hover {
                text-decoration: underline;
            }
            .search-bar {
                background-color: #ffffff;
                padding: 10px;
                margin-bottom: 20px;
                border-radius: 8px;
                box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.1);
            }
            .search-bar input {
                width: 100%;
                padding: 10px;
                font-size: 16px;
                border-radius: 4px;
                border: 1px solid #ddd;
            }
        </style>
    </head>
    <body>
        <div class="search-results">
            <div class="search-bar">
                <input type="text" placeholder="Search Google..." value="Python" readonly>
            </div>
    """

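    # Escape text fields so characters such as < or & in a title or snippet cannot break the markup
    from html import escape

    # Loop through each document and create a search result item
    for index, row in documents_df.iterrows():
        title = escape(str(row['title']))
        link = escape(str(row['link']), quote=True)
        snippet = escape(str(row['snippet']))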

        html_content += f"""
            <div class="result-item">
                <a class="result-title" href="{link}" target="_blank">{title}</a>
                <div class="result-snippet">{snippet}</div>
                <a class="result-link" href="{link}" target="_blank">{link}</a>
            </div>
        """

    # Close the HTML tags
    html_content += """
        </div>
    </body>
    </html>
    """

    # Display the HTML content in the notebook
    display(HTML(html_content))

# Example usage: Display the results from the fetched documents
create_google_like_page(documents_df)
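
The search bar in the page above always shows `Python`, whatever query produced the results. One way to make it follow the query is sketched below; `render_search_bar` is a hypothetical helper, not part of the notebook, and it escapes the query because the text ends up inside an HTML attribute.

```python
from html import escape


def render_search_bar(query):
    """Return the search-bar HTML with the query escaped for use in an attribute."""
    safe_query = escape(query, quote=True)
    return (
        '<div class="search-bar">'
        f'<input type="text" placeholder="Search Google..." value="{safe_query}" readonly>'
        "</div>"
    )


# A query containing quotes and angle brackets, to show the escaping
bar_html = render_search_bar('pandas "merge" <how>')
```

The returned string could replace the hard-coded search-bar markup if `create_google_like_page` were given the query as an extra argument.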

In [7]:
def summarize_document(document_text):
    """
    Summarizes the document using the OpenRouter API.

    The API key is read from the Colab user secret `OPENROUTER_API_KEY`.

    Args:
        document_text (str): The text of the document to be summarized.

    Returns:
        str: The summary of the document, or None if an error occurs.
    """

    # Define the API endpoint for OpenRouter
    openrouter_endpoint = "https://openrouter.ai/api/v1/chat/completions"

    # Define request headers with API authorization
    headers = {
        "Authorization": f"Bearer {userdata.get('OPENROUTER_API_KEY')}",
        "Content-Type": "application/json"
    }

    # Define the request payload
    payload = {
        "model": "gpt-4o-mini",  # Model selection
        "messages": [
            {"role": "system", "content": "You are an AI assistant that summarizes documents."},
            {"role": "user", "content": f"Summarize the following text:\n\n{document_text}"}
        ],
        "max_tokens": 200,  # Limit response length
        "temperature": 0.7  # Control randomness
    }

    try:
        # Send the request to OpenRouter API (the timeout keeps a stalled call from hanging the cell)
        response = requests.post(openrouter_endpoint, headers=headers, json=payload, timeout=30)

        # Parse the response
        if response.status_code == 200:
            summary = response.json().get("choices", [{}])[0].get("message", {}).get("content", "").strip()
            return summary
        else:
            print(f"Error summarizing document: {response.status_code} - {response.text}")
            return None

    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None
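
The response parsing in `summarize_document` indexes `choices[0]`, which raises an `IndexError` if the API returns an empty `choices` list. A more defensive parse might look like the sketch below; `extract_summary` is a hypothetical helper, and the response shape is taken from the fields the function above already reads (`choices`, `message`, `content`).

```python
def extract_summary(response_json):
    """Pull the assistant text out of a chat-completions style response, or return None."""
    choices = response_json.get("choices") or []
    if not choices:
        return None  # missing key, null value, or empty list
    content = (choices[0].get("message") or {}).get("content")
    if isinstance(content, str) and content.strip():
        return content.strip()
    return None
```

Returning `None` for every malformed case keeps the behavior consistent with the error paths of `summarize_document`.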


In [8]:
# Summarize each document's snippet (one API call per row)
documents_df['summary'] = documents_df['snippet'].apply(summarize_document)

# Display the documents with their summaries
documents_df[['title', 'link', 'summary']].head()

| | title | link | summary |
|---|---|---|---|
| 0 | Welcome to Python.org | https://www.python.org/ | Python is a versatile programming language tha... |
| 1 | Python Tutorial | https://www.w3schools.com/python/ | Python is a widely-used programming language t... |
| 2 | Online Python - IDE, Editor, Compiler, Interpr... | https://www.online-python.com/ | Online-Python is a user-friendly tool that all... |
| 3 | Python - Visual Studio Marketplace | https://marketplace.visualstudio.com/items?ite... | The Python extension offers customizable acces... |
| 4 | Online Python Compiler (Interpreter) | https://www.programiz.com/python-programming/o... | The text promotes an online Python compiler th... |
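
Each row in the summarization step triggers one API call, including rows whose snippet is empty or repeated. The sketch below shows one way to avoid those calls; `summarize_column` and `fake_summarizer` are hypothetical names, and the fake stands in for `summarize_document` so the example runs without an API key.

```python
import pandas as pd


def summarize_column(texts, summarizer):
    """Apply summarizer to each non-empty text once, reusing results for repeated inputs."""
    cache = {}

    def summarize_one(text):
        if not isinstance(text, str) or not text.strip():
            return None  # nothing to summarize, so skip the API call
        if text not in cache:
            cache[text] = summarizer(text)
        return cache[text]

    return texts.apply(summarize_one)


calls = []


def fake_summarizer(text):
    """Stand-in for a real summarizer; records each call and upper-cases the text."""
    calls.append(text)
    return text.upper()


snippets = pd.Series(["python is a language", "", "python is a language"])
summaries = summarize_column(snippets, fake_summarizer)
```

In the notebook this would be used as `summarize_column(documents_df['snippet'], summarize_document)`.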
