# Summarizing DuckDuckGo Search Results Using GPT-4

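This notebook demonstrates how to fetch search results from DuckDuckGo and summarize them with an OpenAI chat model. The code cells call `gpt-3.5-turbo`; set `model="gpt-4"` in the `chat.completions.create` calls to use GPT-4 as the title suggests.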




## Setting up the Environment

In [None]:
!pip install openai requests playwright beautifulsoup4
!playwright install

## Imports

In [52]:
import openai
import requests

from bs4 import BeautifulSoup
from playwright.async_api import async_playwright

## API Keys

In [60]:
# Configure your API keys here
OPENAI_API_KEY = ''
RAPID_API_KEY = ''
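
Hard-coding keys in a notebook makes them easy to leak when the notebook is shared. One alternative is to read them from environment variables. The sketch below is only an illustration: the helper `load_api_key` is hypothetical, and the environment variable names are assumptions chosen to mirror the constants above.

```python
import os

def load_api_key(name, default=""):
    """Return the value of environment variable `name`, or `default` if it is unset."""
    return os.environ.get(name, default)

# Assumed variable names; adjust them to match your own environment.
OPENAI_API_KEY = load_api_key("OPENAI_API_KEY")
RAPID_API_KEY = load_api_key("RAPID_API_KEY")
```

With this in place, the rest of the notebook can keep using `OPENAI_API_KEY` and `RAPID_API_KEY` unchanged.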

## Function Definition

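The function `duckduckgo_search` takes only a search query as input. It reads both API keys from the constants defined above, requests the query's data from the DuckDuckGo zero-click endpoint on RapidAPI, and asks the chat model to summarize the JSON response.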

In [77]:
def duckduckgo_search(query):

    # Set OpenAI API key
    from openai import OpenAI
    client = OpenAI(
        api_key=OPENAI_API_KEY
    )

    # Define the DuckDuckGo API endpoint and parameters
    url = "https://duckduckgo-duckduckgo-zero-click-info.p.rapidapi.com/"
    querystring = {
        "q": query,
        "no_html": "1",
        "no_redirect": "1",
        "skip_disambig": "1",
        "format": "json"
    }

    # Set the API headers
    headers = {
        "X-RapidAPI-Key": RAPID_API_KEY,
        "X-RapidAPI-Host": "duckduckgo-duckduckgo-zero-click-info.p.rapidapi.com"
    }

    # Get search results from DuckDuckGo; fail early on HTTP errors
    response = requests.get(url, headers=headers, params=querystring, timeout=30)
    response.raise_for_status()
    search_data = response.json()

    # Prepare the prompt for GPT-4
    prompt = f"""
    Here is some search result data for '{query}':
    {search_data}

    Summarize the key findings and insights from this data.
    """

    print(prompt)


    # Send the prompt built above to the chat model
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ]
    )

    # Extract the generated summary, print it, and return it
    summary = completion.choices[0].message.content
    print(summary)
    return summary
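
The function interpolates the entire JSON response into the prompt, including empty fields, icon metadata, and HTML fragments. One way to shorten the prompt is to keep only the `Text` of each entry under `RelatedTopics`. The following is a minimal sketch, not part of the original notebook: `extract_topic_texts` is a hypothetical helper, and the nested `Topics` list for grouped entries is an assumption about the response shape.

```python
def extract_topic_texts(search_data, limit=10):
    """Collect up to `limit` 'Text' strings from the 'RelatedTopics' list.

    Entries without a 'Text' key are assumed to be groups holding a nested
    'Topics' list; that shape is an assumption about the API response.
    """
    texts = []

    def walk(items):
        for item in items:
            if len(texts) >= limit:
                return
            if "Text" in item:
                texts.append(item["Text"])
            else:
                walk(item.get("Topics", []))

    walk(search_data.get("RelatedTopics", []))
    return texts


# Hand-written sample for illustration; the nested group is hypothetical.
sample = {
    "Heading": "Penguin",
    "RelatedTopics": [
        {"FirstURL": "https://duckduckgo.com/Penguin",
         "Text": "Penguin A group of aquatic flightless birds."},
        {"Name": "Sports", "Topics": [{"Text": "Hypothetical nested entry"}]},
    ],
}
print(extract_topic_texts(sample))
```

The returned list could then be joined with newlines and placed in the prompt instead of `{search_data}`.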


### Usage

In [76]:
query = "Penguins"

duckduckgo_search(query)


    Here is some search result data for 'Penguins':
    {'Abstract': '', 'AbstractSource': 'Wikipedia', 'AbstractText': '', 'AbstractURL': 'https://en.wikipedia.org/wiki/Penguin_(disambiguation)', 'Answer': '', 'AnswerType': '', 'Definition': '', 'DefinitionSource': '', 'DefinitionURL': '', 'Entity': '', 'Heading': 'Penguin', 'Image': '', 'ImageHeight': 0, 'ImageIsLogo': 0, 'ImageWidth': 0, 'Infobox': '', 'Redirect': '', 'RelatedTopics': [{'FirstURL': 'https://duckduckgo.com/Penguin', 'Icon': {'Height': '', 'URL': '/i/1ab2a21d.jpg', 'Width': ''}, 'Result': '<a href="https://duckduckgo.com/Penguin">Penguin</a>A group of aquatic flightless birds from the order Sphenisciformes of the family Spheniscidae.', 'Text': 'Penguin A group of aquatic flightless birds from the order Sphenisciformes of the family Spheniscidae.'}, {'FirstURL': 'https://duckduckgo.com/Pittsburgh_Penguins', 'Icon': {'Height': '', 'URL': '/i/7ec2f55a.png', 'Width': ''}, 'Result': '<a href="https://duckduckgo.com/Pittsb

"The search results for 'Penguins' show a diverse range of topics related to penguins. Here are the key findings and insights from the data:\n\n1. **Main Topic**: The search primarily covers various aspects related to penguins, including the aquatic flightless birds, a professional ice hockey team (Pittsburgh Penguins), a supervillain character (Penguin), and more.\n\n2. **Categories Covered**:\n    - **Arts and Entertainment**: Includes mentions of albums, books, films, and TV series related to penguins.\n    - **Businesses**: Features businesses such as a restaurant, book publisher, and clothing line with 'Penguin' in their name.\n    - **Food**: Discusses penguin-themed food items like mints and biscuits.\n    - **Military**: Mentions naval ships and missiles named after penguins.\n    - **People**: Highlights individuals associated with the name 'Penguin'.\n    - **Places**: Covers locations such as towns, islands, and rivers named after penguins.\n    - **Sports**: Focuses on spor



---



# Browsing Web Content with Playwright and Summarizing Using GPT-4

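This Python function, `browsing_web_using_playwright`, fetches web content asynchronously using Playwright, extracts relevant information using BeautifulSoup, and then passes it to `summarize_with_gpt4` to generate a summary. Because `summarize_with_gpt4` is defined in the next section, run that cell before calling this function.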

## Parameters:
- `url`: URL of the website to scrape.

## Process:
1. **Fetch Web Content**: Asynchronously launch a Chromium browser with Playwright, navigate to the specified URL, and retrieve the HTML content.
2. **Parse HTML Content**: Use BeautifulSoup to extract the title, the meta description and keywords, all `h1` and `h2` headings, the first three paragraphs, and the first five links from the HTML. Each field falls back to a placeholder string when nothing is found.
3. **Prepare the Prompt**: Construct a prompt containing the extracted information.
4. **Generate Summary**: Pass the prompt to `summarize_with_gpt4` and return the resulting summary.


In [84]:
async def browsing_web_using_playwright(url):

    # Fetch web content using Playwright asynchronously
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto(url)
        html_content = await page.content()
        await browser.close()

    # Parse HTML content with BeautifulSoup
    soup = BeautifulSoup(html_content, 'html.parser')
    title_text = soup.title.text if soup.title else 'No title found'
    meta_description = soup.find('meta', attrs={'name': 'description'})
    # Use .get so a <meta> tag without a content attribute does not raise KeyError
    meta_description = meta_description.get('content', 'No description found') if meta_description else 'No description found'
    meta_keywords = soup.find('meta', attrs={'name': 'keywords'})
    meta_keywords = meta_keywords.get('content', 'No keywords found') if meta_keywords else 'No keywords found'
    headings = [h.text.strip() for h in soup.find_all(['h1', 'h2'])]
    headings_text = ', '.join(headings) if headings else 'No headings found'
    paragraphs = [p.text.strip() for p in soup.find_all('p')[:3]]
    paragraphs_text = ' '.join(paragraphs) if paragraphs else 'No paragraphs found'
    links = [a['href'] for a in soup.find_all('a', href=True)[:5]]
    links_text = ', '.join(links) if links else 'No links found'

    # Prepare the prompt for GPT-4
    prompt = f"""
    Analyze the following data and summarize the key points:
    Title: {title_text}
    Meta Description: {meta_description}
    Meta Keywords: {meta_keywords}
    Headings: {headings_text}
    Paragraphs: {paragraphs_text}
    Links: {links_text}
    """

    print(prompt)

    # Call the GPT-4 summarization function
    summary = summarize_with_gpt4(prompt)
    return summary
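
`browsing_web_using_playwright` is a coroutine, so it has to be awaited. Top-level `await` works in a notebook because Jupyter already runs an event loop; in a plain Python script the coroutine would typically be handed to `asyncio.run` instead. The sketch below shows that pattern with a hypothetical stand-in coroutine, `fake_browse`, so that it runs without a browser or an API key.

```python
import asyncio

async def fake_browse(url):
    """Hypothetical stand-in for browsing_web_using_playwright (no network access)."""
    await asyncio.sleep(0)  # yield to the event loop, as a real fetch would
    return f"summary of {url}"

def run_in_script(coro_fn, url):
    """Run an async function to completion from synchronous code."""
    return asyncio.run(coro_fn(url))

if __name__ == "__main__":
    print(run_in_script(fake_browse, "https://example.com"))
```

Note that `asyncio.run` raises `RuntimeError` when called while an event loop is already running, so inside a notebook cell `await` remains the right choice.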


# Summarizing with GPT-4

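This Python function, `summarize_with_gpt4`, sends a prompt to an OpenAI chat model and returns the generated summary. Despite its name, the code below requests `gpt-3.5-turbo`; change the `model` argument to use GPT-4.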

## Parameters:
- `prompt`: The prompt containing the text data to summarize.

## Process:
1. **Set API Key**: Create an OpenAI client with the configured API key.
2. **Call the Model**: Send the prompt to the chat model, capping the reply at 300 tokens with `max_tokens`.
3. **Retrieve Summary**: Extract the generated summary from the first choice in the response.


In [83]:
def summarize_with_gpt4(prompt):

    # Set OpenAI API key
    from openai import OpenAI
    client = OpenAI(
        api_key=OPENAI_API_KEY
    )

    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ],
        max_tokens=300
    )

    # print(completion.choices[0].message.content)

    summary = completion.choices[0].message.content
    return summary
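
Because `max_tokens=300` caps the reply, a long summary can be cut off mid-sentence. The Chat Completions response reports this through `choices[0].finish_reason`, which is `"length"` when the token limit stopped generation. Below is a minimal sketch of such a check; `was_truncated` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for a real API response so the example runs offline.

```python
from types import SimpleNamespace

def was_truncated(completion):
    """Return True if the first choice stopped because it hit the token limit."""
    return completion.choices[0].finish_reason == "length"

# Stand-in objects that mimic only the fields the helper reads.
cut_off = SimpleNamespace(choices=[SimpleNamespace(
    finish_reason="length", message=SimpleNamespace(content="partial text"))])
complete = SimpleNamespace(choices=[SimpleNamespace(
    finish_reason="stop", message=SimpleNamespace(content="full text"))])

print(was_truncated(cut_off), was_truncated(complete))
```

Inside `summarize_with_gpt4`, the same check could be applied to `completion` before returning, for example to raise `max_tokens` and retry.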


## Usage

In [85]:
url = 'https://example.com'
summary = await browsing_web_using_playwright(url)
print(summary)



    Analyze the following data and summarize the key points:
    Title: Example Domain
    Meta Description: No description found
    Meta Keywords: No keywords found
    Headings: Example Domain
    Paragraphs: This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission. More information...
    Links: https://www.iana.org/domains/example
    
Based on the provided data:

Title: The title of the page is "Example Domain."
Meta Description: No meta description is found, which can impact search engine visibility and user understanding of the content.
Meta Keywords: No meta keywords are found, which used to be a way to help search engines understand the content of the page.
Headings: The heading on the page reiterates "Example Domain."
Paragraphs: The paragraph provides information that the domain is for illustrative examples in documents and can be used without coordination or permission.
Link