In [1]:
import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

In [2]:
import os

conda_prefix = os.environ.get('CONDA_PREFIX')
if conda_prefix:
    print("Anaconda environment is active:")
    print(f"Environment Path: {conda_prefix}")
    print(f"Environment Name: {os.path.basename(conda_prefix)}")

virtual_env = os.environ.get('VIRTUAL_ENV')
if virtual_env:
    print("Virtualenv is active:")
    print(f"Environment Path: {virtual_env}")
    print(f"Environment Name: {os.path.basename(virtual_env)}")

if not conda_prefix and not virtual_env:
    print("Neither Anaconda nor Virtualenv seems to be active. Did you start jupyter lab in an Activated environment? See Setup Part 5.")

Anaconda environment is active:
Environment Path: /home/hamna/miniconda3/envs/llms
Environment Name: llms


In [3]:
from pathlib import Path

parent_dir = Path("..")
env_path = parent_dir / ".env"

if env_path.exists() and env_path.is_file():
    print(".env file found.")

    # Read the contents of the .env file
    with env_path.open("r") as env_file:
        contents = env_file.readlines()

    key_exists = any(line.startswith("OPENAI_API_KEY=") for line in contents)
    good_key = any(line.startswith("OPENAI_API_KEY=sk-proj-") for line in contents)
    
    if key_exists and good_key:
        print("SUCCESS! OPENAI_API_KEY found and it has the right prefix")
    elif key_exists:
        print("Found an OPENAI_API_KEY, although it didn't have the expected prefix sk-proj-\nPlease double-check your key in the file.")
    else:
        print("Didn't find an OPENAI_API_KEY in the .env file")
else:
    print(".env file not found in the llm_engineering directory. It needs to have exactly the name: .env")
    
    possible_misnamed_files = list(parent_dir.glob("*.env*"))
    
    if possible_misnamed_files:
        print("\nWarning: No '.env' file found, but the following files were found in the llm_engineering directory that contain '.env' in the name. Perhaps this needs to be renamed?")
        for file in possible_misnamed_files:
            print(file.name)

.env file found.
SUCCESS! OPENAI_API_KEY found and it has the right prefix


In [4]:
load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')
# Check the key

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif api_key.strip() != api_key:
    # Check for stray whitespace first: a leading space would otherwise be reported as a wrong prefix
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start with sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")

API key found and looks good so far!


In [6]:
openai = OpenAI()

In [7]:
# A class to represent a Webpage
# If you're not familiar with Classes, check out the "Intermediate Python" notebook

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
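        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            # Remove elements that carry no readable text before extracting it
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            # Some responses have no <body> at all; avoid an AttributeError on None
            self.text = ""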
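
To see what the clean-up step in `Website` does without fetching a real page, here is a minimal sketch of the same idea using only the standard library's `html.parser` instead of BeautifulSoup. The names `TextExtractor`, `visible_text` and `sample_html` are made up for this illustration, and the sketch only approximates BeautifulSoup's behaviour on simple, well-formed HTML.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text inside <body>, skipping the contents of <script> and <style>."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in self.SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in self.SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that is inside <body> and outside skipped tags
        if text and self.in_body and self.skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the readable text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


sample_html = (
    "<html><head><title>Demo</title><style>p{color:red}</style></head>"
    "<body><script>var x = 1;</script><h1>Hello</h1>"
    "<p>World <img src='a.png'> again</p></body></html>"
)
print(visible_text(sample_html))
```

The script and style contents are dropped, the image contributes nothing, and each remaining text fragment lands on its own line, which is the same shape as the `ed.text` output in the next cell.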

In [8]:
ed = Website("https://edwarddonner.com")

print(ed.title)
print(ed.text)

Home - Edward Donner
Home
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers and tons of press coverage.
Connect
with me for

In [9]:
# Define our system prompt - you can experiment with this later, changing the last sentence to "Respond in markdown in Spanish."

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [10]:
# A function that writes a User Prompt that asks for summaries of websites:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

The API from OpenAI expects to receive messages in a particular structure. Many of the other APIs share this structure:

```python
[
    {"role": "system", "content": "system message goes here"},
    {"role": "user", "content": "user message goes here"}
]
```
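
The same list structure extends to longer conversations: earlier model replies are sent back with the role `"assistant"`, followed by the next `"user"` message. A minimal sketch, where `make_messages` and `add_followup` are hypothetical helper names used only for this illustration:

```python
def make_messages(system_text, user_text):
    """Return the two-message list used for a single-turn request."""
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]


def add_followup(messages, assistant_reply, next_user_text):
    """Return a new list with the model's reply and a follow-up question appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_text},
    ]


conversation = make_messages("You are a helpful assistant.", "Summarize this page.")
conversation = add_followup(conversation, "Here is a summary...", "Now make it shorter.")
print([m["role"] for m in conversation])
```

Because `add_followup` builds a new list rather than mutating its argument, the original single-turn list stays available for reuse.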

In [11]:
# See how this function creates exactly the format above

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

In [12]:
messages_for(ed)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': 'You are looking at a website titled Home - Edward Donner\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nHome\nOutsmart\nAn arena that pits LLMs against each other in a battle of diplomacy and deviousness\nAbout\nPosts\nWell, hi there.\nI’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (\nvery\namateur) and losing myself in\nHacker News\n, nodding my head sagely to things I only half understand.\nI’m the co-founder and CTO of\nNebula.io\n. We’re applying AI to a field where it can make a massive, posi

# Time to bring it together - the API for OpenAI is very simple!

In [13]:
# And now: call the OpenAI API. You will get very familiar with this!

def summarize(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages_for(website)
    )
    return response.choices[0].message.content
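
The reply text sits at `response.choices[0].message.content`. To try that access pattern without spending API credits, one option is to pass the client in as a parameter and substitute a stand-in. This is only a sketch: `summarize_with`, `FakeCompletions` and `fake_client` are hypothetical names, and the fake mimics just the fields used here, not the real response object.

```python
from types import SimpleNamespace


def summarize_with(client, messages, model="gpt-4o-mini"):
    """Call chat.completions.create on `client` and return the reply text."""
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


class FakeCompletions:
    """Hypothetical stand-in that returns an object shaped like the fields read above."""

    def create(self, model, messages):
        reply = f"[{model}] received {len(messages)} messages"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


# Nest namespaces so that fake_client.chat.completions.create(...) resolves
fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))

demo_messages = [
    {"role": "system", "content": "You summarize websites."},
    {"role": "user", "content": "Page text goes here."},
]
result = summarize_with(fake_client, demo_messages)
print(result)
```

Passing the real `openai` client from the earlier cell in place of `fake_client` would send the same request to the live API.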

In [14]:
summarize("https://edwarddonner.com")

'# Summary of Edward Donner\'s Website\n\nThe website serves as a personal platform for Ed Donner, who is an experimental coder, DJ, and co-founder of Nebula.io. Ed focuses on leveraging AI technology, specifically large language models (LLMs), to enhance talent discovery and management. His background includes founding the AI startup untapt, which was acquired in 2021.\n\nThe site features a section called **Outsmart**, designed as a competitive arena where LLMs engage in challenges related to diplomacy and strategic thinking. \n\n## Recent Posts and Announcements\n- **November 13, 2024:** "Mastering AI and LLM Engineering – Resources"\n- **October 16, 2024:** "From Software Engineer to AI Data Scientist – resources"\n- **August 6, 2024:** "Outsmart LLM Arena – a battle of diplomacy and deviousness"\n- **June 26, 2024:** "Choosing the Right LLM: Toolkit and Resources"\n\nThese posts provide resources and insights into AI, LLMs, and professional development in the tech field.'

In [15]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [16]:
display_summary("https://hassan883.github.io/")

# Summary of Hassan-Javed Website

Hassan-Javed is a personal website of an AI Developer based in Pakistan, specializing in machine learning, deep learning, and natural language processing (NLP). With over three years of professional experience in AI and a robust programming background in Python, Hassan Javed emphasizes his dedication to developing AI-driven solutions that advance organizational goals. 

## About Me
- **Education**: Currently pursuing an MS in Data Science and holds a Bachelor's degree in Computer Science.
- **Experience**: Over three years in AI-focused roles, with experience in both industry and research, including work with advanced models like Llama 3.2.

## Skills
- **Programming Languages**: Python, JavaScript, C/C++, PHP, HTML, CSS.
- **Data Science & AI**: Proficient in machine learning techniques, computer vision, data analysis, and model optimization.
- **Tools & Frameworks**: Experienced with TensorFlow, PyTorch, Flask, Django, and data visualization libraries like Matplotlib and Seaborn.

## Services
Hassan offers a range of services, including:
- **AI & Machine Learning Development**: Advanced model implementation, particularly with Llama models.
- **Data Science & Analytics**: In-depth data analysis, forecasting, and predictive modeling.
- **Machine Learning Engineering**: Model deployment and optimization to enhance performance and explainability.

## Experience
The website details several roles held by Hassan, including:
- **Current Role**: AI Developer at NASTP @ PAF, focused on research and development in AI, particularly in chatbots and time series models.
- **Previous Roles**: Worked at PanaceaLogics, focusing on NLP solutions and data science applications. 

## Portfolio
The portfolio section showcases various projects Hassan has worked on, indicating a strong capability in deploying functional, innovative applications, notably in AI.

## Blog
The blog features posts related to branding and design, focusing on the significance of web design in enhancing business efficacy and user interaction.

## Contact Information
Hassan can be contacted via email, phone, or through the submission form on the website, with locations in Rawalpindi, Pakistan.

This website serves as a professional showcase of Hassan Javed's expertise and services in the field of AI and machine learning.

In [17]:
# Step 1: Create your prompts

system_prompt = "You work as an assistant. Your job is to convert the content into markdown and list the person's expertise in programming languages, their experience, their years of experience in AI, ML, DL or LLMs, and any projects related to LLMs."
user_prompt = """
    You are looking at a personal site. Extract the listed information from the site, and ignore the blog and contact information.
"""

# Step 2: Make the messages list

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

# Step 3: Call OpenAI

response = openai.chat.completions.create(
    model='gpt-3.5-turbo',
    messages=messages
)

# Step 4: print the result

print(response.choices[0].message.content)

**Expertise:**
1. Languages: Python, R, SQL
2. Experience: Data analysis, Data visualization, Machine learning models development
3. Years of experience in AI, ML, and DL: 5 years

**Projects related to LLMs:**
1. Developed a sentiment analysis model using LSTM for analyzing customer reviews in the e-commerce industry. 
3. Utilized NLP techniques for text summarization in a news aggregation platform.
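
Note that the messages built in Step 2 contain only instructions and no page text, so the model has nothing from the site to work with. A minimal sketch of one way to include the page content in the user message follows; `messages_with_page` and the sample strings are hypothetical and exist only for this illustration.

```python
def messages_with_page(system_text, instruction, page_title, page_text):
    """Build a messages list whose user message carries the page content."""
    user_text = (
        f"{instruction.strip()}\n\n"
        f"Page title: {page_title}\n\n"
        f"{page_text}"
    )
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]


# Sample values standing in for a real page; replace with actual scraped text
sample_messages = messages_with_page(
    "List the person's skills in markdown.",
    "You are looking at a personal site. Ignore the blog and contact sections.",
    "Example Portfolio",
    "Skills: Python, PyTorch\nProjects: chatbot built on an open LLM",
)
print(sample_messages[1]["content"])
```

With the `Website` class from earlier in this notebook, the call could look like `messages_with_page(system_prompt, user_prompt, site.title, site.text)` after `site = Website("https://hassan883.github.io/")`.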
