WEBSITE SUMMARY BY SCRAPING

In [None]:
import os
import asyncio
from openai import OpenAI
from bs4 import BeautifulSoup
from dotenv import load_dotenv
from IPython.display import Markdown, display
from playwright.async_api import async_playwright

In [None]:
load_dotenv()

_raw_openai_key = os.getenv("OPENAI_API_KEY")
OPENAI_API_KEY = None
if _raw_openai_key:
    # Remove all whitespace characters anywhere in the key (spaces, tabs, newlines)
    sanitized_key = "".join(_raw_openai_key.split())
    if sanitized_key != _raw_openai_key:
        print("Sanitized OPENAI_API_KEY by removing whitespace.")
    OPENAI_API_KEY = sanitized_key
    # Ensure downstream libraries that read from env get the sanitized key
    os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
    print("OPENAI_API_KEY found!")
else:
    print(
        "OPENAI_API_KEY is not set. Calls that require OpenAI will fail until it is configured."
    )

In [None]:
openai = OpenAI()
openai

In [None]:
class Website:
    url: str
    title: str
    text: str

    def __init__(self, url: str):
        self.url = url
        self.title = None
        self.text = None

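    async def scrape(self):
        """Render the page in headless Chromium, then extract its title and visible text."""
        async with async_playwright() as p:
            browser = await p.chromium.launch()
            try:
                page = await browser.new_page()
                await page.goto(self.url)
                content = await page.content()
            finally:
                # Close the browser even if navigation fails
                await browser.close()

        soup = BeautifulSoup(content, "html.parser")
        self.title = soup.title.string if soup.title and soup.title.string else "No Title Found"
        if soup.body is None:
            # Nothing to extract from a document without a <body>
            self.text = ""
            return
        # Drop elements that contribute no readable text
        for irrelevant in soup.body(['script', 'style', 'img', 'input']):
            irrelevant.decompose()
        self.text = soup.body.get_text(separator="\n", strip=True)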
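
The parsing half of `scrape` can be tried on a static HTML string, without launching a browser. The sketch below is only an illustration: `extract_title_and_text` and `sample_html` are hypothetical names that are not part of the notebook, and the function simply mirrors the BeautifulSoup steps used in the class above.

```python
from bs4 import BeautifulSoup


def extract_title_and_text(html: str):
    """Hypothetical helper mirroring the parsing steps in Website.scrape."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title and soup.title.string else "No Title Found"
    if soup.body is None:
        return title, ""
    # Remove elements that carry no readable text
    for tag in soup.body(["script", "style", "img", "input"]):
        tag.decompose()
    # Each text fragment is stripped and joined with a newline
    return title, soup.body.get_text(separator="\n", strip=True)


# Made-up sample page, used only for this illustration
sample_html = """<html><head><title>Demo Page</title></head>
<body><h1>Hello</h1><script>var x = 1;</script>
<p>Some <b>bold</b> text.</p><img src="a.png"></body></html>"""

title, text = extract_title_and_text(sample_html)
print(title)
print(text)
```

Note that `separator="\n"` places every text fragment on its own line, so inline markup such as `<b>` splits a sentence across lines. That is usually harmless for summarization, but it does add newlines to the prompt.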



In [None]:
system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

def user_prompt_for(website: Website):
    user_prompt = f"You are looking at a website titled '{website.title}'.\n"
    # Adjacent string literals avoid the stray indentation that backslash
    # continuations inside a string would embed in the prompt
    user_prompt += (
        "The contents of this website are as follows; "
        "please provide a short summary of this website in markdown. "
        "If it includes news or announcements, then summarize these too.\n\n"
    )
    user_prompt += website.text
    return user_prompt

In [None]:
def messages_for(website: Website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)},
    ]
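
To see the shape of the list that gets sent to the chat API without scraping anything, the same structure can be built from a stand-in object. This is a minimal sketch under stated assumptions: `PageStub`, `SYSTEM_PROMPT`, and `build_messages` are hypothetical names for illustration, and the prompt text is shortened compared with the notebook's.

```python
from dataclasses import dataclass


@dataclass
class PageStub:
    """Hypothetical stand-in for the notebook's Website class (no scraping)."""
    title: str
    text: str


# Shortened prompt, for illustration only
SYSTEM_PROMPT = "You are an assistant that summarizes websites. Respond in markdown."


def build_messages(page: PageStub) -> list:
    """Build a system message plus a user message carrying the page text."""
    user_prompt = f"You are looking at a website titled '{page.title}'.\n\n{page.text}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages(PageStub(title="Demo Page", text="Hello\nWorld"))
for message in messages:
    print(message["role"], "->", len(message["content"]), "characters")
```

Printing the content lengths like this is a quick way to check how much scraped text is about to be sent before making a paid API call.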

In [109]:
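from concurrent.futures import ThreadPoolExecutor

def summarize_website(url: str):
    website = Website(url)
    # asyncio.run() raises RuntimeError when an event loop is already running,
    # as it is in a Jupyter kernel, so run the scrape on a worker thread
    # that gets its own event loop.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(asyncio.run, website.scrape()).result()

    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages_for(website)
    )
    return response.choices[0].message.content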

In [111]:
def display_website_summary(url: str):
    summary = summarize_website(url)
    display(Markdown(summary))

In [112]:
display_website_summary("https://abhaseeb.com")

# Portfolio Summary

**Name:** Abdul Haseeb  
**Title:** Full-stack Developer / Software Engineer  
**Age:** 21  
**Location:** Islamabad, Pakistan

## Services Offered
- **Web Development:** Custom websites and web applications using HTML, CSS, JavaScript, React, Node.js, and various databases (Firebase, MongoDB, MySQL).
- **API Development:** Creation of RESTful APIs with technologies such as Node.js, Express, and MongoDB, including user authentication and third-party service integration.
- **Mobile App Development:** Development of mobile applications using React Native and related technology stacks (EXPO, Tailwind, Axios).

## Education
- **Bachelor of Science in Software Engineering (BSSE)**  
  Comsats University Islamabad (2019 - 2023)
  
- **Intermediate of Computer Science (ICS)**  
  Punjab Group of Colleges (2017 - 2019)
  
- **Matriculation**  
  Usman Science School (2015 - 2017)

## Experience
- **MERN Stack Developer** at NESL-IT (Aug 2023 - Present)
- **Freelance Full-stack Developer** on Fiverr (Sep 2022 - Present)

## Projects
1. **KidverCity Website:** Educational Institute website.
2. **AirSpace:** IOT-based parking system (Final Year Project).
3. **KidverCity Admin Portal:** Administration portal for an Educational Institute.
4. **Pocketclass:** Online teaching platform with various functionalities.
5. **Jones Transportation:** Expense tracking application for transport services.
6. **Crypto Blog:** Blog website with post interactions (likes, comments).
7. **Vig Sports:** Static website for sports/crypto blogs and software updates.
8. **University Management System:** Complete management system for tracking educational records.
9. **Punctual:** Attendance management system for schools.
10. **Profile App:** Display system for user portfolios and projects.
11. **Ecommerce App:** Online store with purchasing, rating, and commenting features.

## Contact Information
- **Email:** abdulhaseeb2115@gmail.com  
- **Phone/Whatsapp:** +923496384386  
- **Location:** Islamabad, Pakistan

Abdul Haseeb is currently available for work and actively seeking opportunities in full-stack development.