## Libraries learnt
- os
- requests
- dotenv
- bs4
- IPython.display (Markdown, display)
- openai

Summarization can be applied to any business vertical: summarizing the news, summarizing financial performance, or summarizing a resume in a cover letter.

In [1]:
# imports

import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

# If you get an error running this cell, then please head over to the troubleshooting notebook!

## Connect to OpenAI

In [2]:
# Load environment variables from a file called .env

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Check the key

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!


In [4]:
openai = OpenAI()
ollama_with_openai=OpenAI(base_url="http://localhost:11434/v1", api_key='ollama')


In [5]:
# A class to represent a Webpage
# If you're not familiar with Classes, check out the "Intermediate Python" notebook

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body is not None:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""  # the page has no <body> to extract text from
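
To see what the parsing steps in the class do without hitting the network, here is a minimal sketch that runs the same BeautifulSoup calls on an inline HTML string. The helper name `extract_text` and the sample HTML are assumptions made for illustration; they are not part of the notebook.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used only for this illustration
SAMPLE_HTML = """
<html><head><title>Demo page</title></head>
<body>
  <h1>Hello</h1>
  <script>console.log("ignored")</script>
  <style>h1 { color: red; }</style>
  <p>First paragraph.</p>
</body></html>
"""

def extract_text(html):
    """Return (title, visible text) using the same steps as the Website class."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title else "No title found"
    if soup.body is None:
        # html.parser does not invent a <body> for fragments
        return title, ""
    # Calling a tag with a list of names is shorthand for find_all
    for irrelevant in soup.body(["script", "style", "img", "input"]):
        irrelevant.decompose()  # remove the tag and its contents from the tree
    return title, soup.body.get_text(separator="\n", strip=True)

title, text = extract_text(SAMPLE_HTML)
print(title)
print(text)
```

The script and style contents are dropped, and each remaining piece of text ends up on its own line.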

In [6]:
# Let's try one out. Change the website and add print statements to follow along.

sp = Website("https://linkedin.com/in/schubhm")
print(sp.title)
print(sp.text)

Shubham Panday - Audacy, Inc. | LinkedIn
Skip to main content
LinkedIn
Articles
People
Learning
Jobs
Games
Get the app
Join now
Sign in
Shubham Panday
Sign in to view Shubham’s full profile
Sign in
Welcome back
Email or phone
Password
Show
Forgot password?
Sign in
or
By clicking Continue to join or sign in, you agree to LinkedIn’s
User Agreement
,
Privacy Policy
, and
Cookie Policy
.
New to LinkedIn?
Join now
or
New to LinkedIn?
Join now
By clicking Continue to join or sign in, you agree to LinkedIn’s
User Agreement
,
Privacy Policy
, and
Cookie Policy
.
United States
Contact Info
Sign in to view Shubham’s full profile
Sign in
Welcome back
Email or phone
Password
Show
Forgot password?
Sign in
or
By clicking Continue to join or sign in, you agree to LinkedIn’s
User Agreement
,
Privacy Policy
, and
Cookie Policy
.
New to LinkedIn?
Join now
or
New to LinkedIn?
Join now
By clicking Continue to join or sign in, you agree to LinkedIn’s
User Agreement
,
Privacy Policy
, and
Cookie Policy
.
1K

## Types of prompts

You may know this already - but if not, you will get very familiar with it!

Models like GPT-4o have been trained to receive instructions in a particular way.

They expect to receive:

**A system prompt** that tells them what task they are performing and what tone they should use

**A user prompt** -- the conversation starter that they should reply to

In [7]:
# Define our system prompt - you can experiment with this later, changing the last sentence to "Respond in markdown in Spanish."

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [8]:
# A function that writes a User Prompt that asks for summaries of websites:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [8]:
print(user_prompt_for(sp))

You are looking at a website titled Shubham Panday - Audacy, Inc. | LinkedIn
The contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

Skip to main content
LinkedIn
Articles
People
Learning
Jobs
Games
Get the app
Join now
Sign in
Shubham Panday
Sign in to view Shubham’s full profile
Sign in
Welcome back
Email or phone
Password
Show
Forgot password?
Sign in
or
By clicking Continue to join or sign in, you agree to LinkedIn’s
User Agreement
,
Privacy Policy
, and
Cookie Policy
.
New to LinkedIn?
Join now
or
New to LinkedIn?
Join now
By clicking Continue to join or sign in, you agree to LinkedIn’s
User Agreement
,
Privacy Policy
, and
Cookie Policy
.
United States
Contact Info
Sign in to view Shubham’s full profile
Sign in
Welcome back
Email or phone
Password
Show
Forgot password?
Sign in
or
By clicking Continue to join or sign in, you agree to LinkedIn’s
User Agreement
,
Privacy 

In [12]:
messages = [
    {"role": "system", "content": "You are a snarky assistant"},
    {"role": "user", "content": "What is 2 + 2?"}
]

ollama_messages = [{"role": "user", "content": "What is 2 + 2?"}]

In [13]:
# To give you a preview -- calling OpenAI with system and user messages:

response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)

Oh, that's a tough one! Let me consult my vast database of knowledge... Ah yes, it’s 4. Shocking, right?


In [14]:
# using ollama_with_openai
response = ollama_with_openai.chat.completions.create(model="llama3.2", messages=ollama_messages)
print(response.choices[0].message.content)

2 + 2 = 4


In [20]:
# See how this function creates exactly the format above

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

def ollama_messages_for(website):
    return [{"role": "user", "content": user_prompt_for(website)}]

In [16]:
# Try this out, and then try for a few more websites
messages_for(sp)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': "You are looking at a website titled Shubham Panday - Audacy, Inc. | LinkedIn\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nSkip to main content\nLinkedIn\nArticles\nPeople\nLearning\nJobs\nGames\nGet the app\nJoin now\nSign in\nShubham Panday\nSign in to view Shubham’s full profile\nSign in\nWelcome back\nEmail or phone\nPassword\nShow\nForgot password?\nSign in\nor\nBy clicking Continue to join or sign in, you agree to LinkedIn’s\nUser Agreement\n,\nPrivacy Policy\n, and\nCookie Policy\n.\nNew to LinkedIn?\nJoin now\nor\nNew to LinkedIn?\nJoin now\nBy clicking Continue to join or sign in, you agree to LinkedIn’s\nUser Agreement\n,\nPriv

In [18]:
# And now: call the model through the OpenAI-compatible API (here, the local Ollama endpoint). You will get very familiar with this!

def summarize(url):
    website = Website(url)
    response = ollama_with_openai.chat.completions.create(
        model = "llama3.2",
        messages = ollama_messages_for(website)
    )
    return response.choices[0].message.content
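
The function above hardcodes the Ollama client and model. A possible variation, sketched here under assumptions, passes the client and model in as arguments so the same code can target either `openai` or `ollama_with_openai`. The names `summarize_text` and `EchoClient` are hypothetical; `EchoClient` only mimics the shape of `client.chat.completions.create` so the sketch runs without a server or API key.

```python
from types import SimpleNamespace

def summarize_text(client, model, system_prompt, page_title, page_text):
    """Send a system and user message to any OpenAI-compatible client."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user",
         "content": f"You are looking at a website titled {page_title}\n\n{page_text}"},
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

class EchoClient:
    """Hypothetical stand-in for a real client, for offline experiments only."""

    def __init__(self):
        completions = SimpleNamespace(create=self._create)
        self.chat = SimpleNamespace(completions=completions)

    def _create(self, model, messages):
        # Build an object with the same attribute path as a real response
        reply = f"[{model}] received {len(messages)} messages"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

print(summarize_text(EchoClient(), "llama3.2", "Respond in markdown.", "Demo", "Some text"))
```

With real clients, the call would look like `summarize_text(openai, "gpt-4o-mini", ...)` or `summarize_text(ollama_with_openai, "llama3.2", ...)`, using the objects created earlier in the notebook.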

In [21]:
summarize("https://WWW.LINKEDIN.com/in/schubhm")



In [22]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))
    

In [23]:
display_summary("https://WWW.LINKEDIN.com/in/schubhm")

**Summary of Shubham Panday's LinkedIn Profile:**

Shubham Panday is a Solutions Architect with over 14 years of experience in designing and implementing solutions. He currently works at Audacy, Inc., but his previous work experience includes various roles such as Procurement Manager and QA Automation Engineer.

**Experience & Education:**

* Solutions Architect, Audacy, Inc.
* Procurement Manager (PRISM Johnson Limited)
* QA Automation Engineer (Infostride Technologies India)

**Licenses & Certifications:**

* Snowflake Fundamentals Training (T3)
* EY Emerging technology - Blockchain - Bronze
* Ethereum and Solidity: The Complete Developer's Guide (Udemy)
* MuleSoft Certified Developer - Level 1 (Mule 4)
* AWS Certified Cloud Practitioner
* Oracle SOA Suite 12c Certified Implementation Specialist

**Honors & Awards:**

* Exceptional Client Service Award (EY)
* Extra Miler Award (EY)
* Highest Performing Team at EY
* Limca Book of Record (IBSFIN Tech)

**Recommendations:**

There are two recommendations for Shubham Panday from his former colleagues, Craig Kunzel and Mark Wood. They praise Shubham's technical skills, ability to communicate effectively, and teamwork.

**News/Announcements:**

None mentioned on this profile.

In [17]:
display_summary("https://www.openai.com")

# Website Summary

The website titled "No title found" appears to be displaying a message prompting users to enable JavaScript and cookies in their web browser to continue accessing its content. There are no specific details, news, or announcements provided beyond this request. 

It is likely that the actual content of the site is unavailable until the required settings are adjusted.
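
The output above suggests the scrape returned a placeholder rather than the real page, which is expected when a site renders its content with JavaScript, since `requests` does not execute scripts. A rough way to catch this before calling the model is a heuristic check on the scraped text. The sketch below is an assumption, not part of the notebook: the function name `looks_blocked`, the marker phrases, and the length threshold are all hypothetical and would need tuning.

```python
# Hypothetical marker phrases; extend this tuple for the sites you scrape
BLOCK_MARKERS = (
    "enable javascript",
    "enable cookies",
    "sign in to view",
)

def looks_blocked(text, min_length=200):
    """Return True if scraped text looks like a login or JavaScript wall.

    Heuristic only: a known marker phrase or very short text triggers it.
    """
    lowered = text.lower()
    if any(marker in lowered for marker in BLOCK_MARKERS):
        return True
    return len(text.strip()) < min_length

print(looks_blocked("Please enable JavaScript and cookies to continue"))
```

A caller could skip the summary, or print a warning, when `looks_blocked(website.text)` is true.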