In [1]:
import os, requests, json
from typing import List
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [2]:
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")  # OpenAI() below reads this same variable from the environment

MODEL = 'gpt-4o-mini'
openai = OpenAI()

In [3]:
class Website:
    """A scraped webpage: its title, its visible text, and the hrefs of its links."""
    url: str
    title: str
    text: str
    links: List[str]

    def __init__(self,url):
        self.url = url
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title!"
        if soup.body:
            for irrelevant in soup.body(['script','style','img','input']):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator = '\n', strip = True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]
    
    def get_contents(self):
        return f"Webpage title:\n{self.title}\n\nWebpage Content:\n{self.text}\n\n"

In [4]:
ed = Website("https://edwarddonner.com")
print(ed.get_contents())

Webpage title:
Home - Edward Donner

Webpage Content:
Home
Connect Four
Outsmart
An arena that pits LLMs against each other in a battle of diplomacy and deviousness
About
Posts
Well, hi there.
I’m Ed. I like writing code and experimenting with LLMs, and hopefully you’re here because you do too. I also enjoy DJing (but I’m badly out of practice), amateur electronic music production (
very
amateur) and losing myself in
Hacker News
, nodding my head sagely to things I only half understand.
I’m the co-founder and CTO of
Nebula.io
. We’re applying AI to a field where it can make a massive, positive impact: helping people discover their potential and pursue their reason for being. Recruiters use our product today to source, understand, engage and manage talent. I’m previously the founder and CEO of AI startup untapt,
acquired in 2021
.
We work with groundbreaking, proprietary LLMs verticalized for talent, we’ve
patented
our matching model, and our award-winning platform has happy customers a

In [5]:
ed.links

['https://edwarddonner.com/',
 'https://edwarddonner.com/connect-four/',
 'https://edwarddonner.com/outsmart/',
 'https://edwarddonner.com/about-me-and-about-nebula/',
 'https://edwarddonner.com/posts/',
 'https://edwarddonner.com/',
 'https://news.ycombinator.com',
 'https://nebula.io/?utm_source=ed&utm_medium=referral',
 'https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html',
 'https://patents.google.com/patent/US20210049536A1/',
 'https://www.linkedin.com/in/eddonner/',
 'https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/',
 'https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/',
 'https://edwarddonner.com/2024/12/21/llm-resources-superdatascience/',
 'https://edwarddonner.com/2024/12/21/llm-resources-superdatascience/',
 'https://edwarddonner.com/2024/11/13/llm-engineering-resources/',
 'https://edwarddonner.com/2024/11/13/llm-engineering-resources/',
 'ht
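
The scraped list contains repeated URLs, and pages on other sites may use relative hrefs or `mailto:` links. A minimal sketch of a cleanup step, assuming a hypothetical helper named `normalize_links` (not part of the notebook) built on the standard library's `urllib.parse`:

```python
from urllib.parse import urljoin, urlparse

def normalize_links(base_url, links):
    """Resolve hrefs against base_url, keep http(s) links only, drop repeats in order."""
    seen = set()
    cleaned = []
    for href in links:
        absolute = urljoin(base_url, href.strip())
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skips mailto:, tel:, javascript: and similar
        if absolute not in seen:
            seen.add(absolute)
            cleaned.append(absolute)
    return cleaned
```

It could be applied as `normalize_links(ed.url, ed.links)` before the links are placed in a prompt, which also shortens the prompt.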

In [6]:
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about a company, \
such as links to an About page, or a Company page, or Careers/Jobs pages. "
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links":[
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [7]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about a company, such as links to an About page, or a Company page, or Careers/Jobs pages. You should respond in JSON as in this example:
{
    "links":[
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}



In [8]:
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "Please decide which of these are relevant web links for a brochure about the company, respond with the full https URL. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [9]:
print(get_links_user_prompt(ed))

Here is the list of links on the website of https://edwarddonner.com - Please decide which of these are relevant web links for a brochure about the company, respond with the full https URL. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
https://edwarddonner.com/
https://edwarddonner.com/connect-four/
https://edwarddonner.com/outsmart/
https://edwarddonner.com/about-me-and-about-nebula/
https://edwarddonner.com/posts/
https://edwarddonner.com/
https://news.ycombinator.com
https://nebula.io/?utm_source=ed&utm_medium=referral
https://www.prnewswire.com/news-releases/wynden-stark-group-acquires-nyc-venture-backed-tech-startup-untapt-301269512.html
https://patents.google.com/patent/US20210049536A1/
https://www.linkedin.com/in/eddonner/
https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/
https://edwarddonner.com/2025/01/23/llm-workshop-hands-on-with-agents-resources/
https://edwarddonner.com/2024/12/21/llm-resources-su

In [10]:
def get_links(url):
    website = Website(url)
    completion = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role":"system", "content":link_system_prompt},
            {"role":"user", "content":get_links_user_prompt(website)}
        ],
        response_format={"type":"json_object"}
    )
    result = completion.choices[0].message.content
    return json.loads(result)

In [11]:
get_links("https://www.anthropic.com/")

{'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'},
  {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'},
  {'type': 'team page', 'url': 'https://www.anthropic.com/team'},
  {'type': 'research page', 'url': 'https://www.anthropic.com/research'},
  {'type': 'news page', 'url': 'https://www.anthropic.com/news'}]}
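
`response_format={"type":"json_object"}` asks for syntactically valid JSON, but it does not by itself guarantee the shape of the result. A small sketch of a defensive filter, assuming a hypothetical helper named `valid_link_entries` that keeps only entries with a string `type` and an absolute `url`:

```python
def valid_link_entries(payload):
    """Return only well-formed entries from a {"links": [...]} payload."""
    entries = payload.get("links", []) if isinstance(payload, dict) else []
    if not isinstance(entries, list):
        return []
    valid = []
    for entry in entries:
        if not isinstance(entry, dict):
            continue  # ignore stray strings or numbers in the list
        link_type = entry.get("type")
        url = entry.get("url")
        if isinstance(link_type, str) and isinstance(url, str) \
                and url.startswith(("http://", "https://")):
            valid.append({"type": link_type, "url": url})
    return valid
```

Passing the result of `get_links(url)` through this filter means later code can index `link['type']` and `link['url']` without a `KeyError`.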

### Making a Brochure

In [12]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print(f"Found links: {links}")
    for link in links['links']:
        result += f"\n\n{link['type']}\n"
        result += Website(link['url']).get_contents()
    return result
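
`get_all_details` fetches every selected link in turn, so a single unreachable page raises and aborts the whole run. One possible way to tolerate that is sketched below, assuming a hypothetical helper `collect_pages` that receives the fetch function as a parameter so it can be exercised without network access:

```python
def collect_pages(links, fetch):
    """Fetch each link with fetch(url) -> str, skipping pages that raise."""
    sections = []
    for link in links:
        url = link.get("url", "")
        try:
            body = fetch(url)
        except Exception as exc:  # broad on purpose: one bad page should not stop the rest
            print(f"Skipping {url}: {exc}")
            continue
        sections.append(f"\n\n{link.get('type', 'page')}\n{body}")
    return "".join(sections)
```

In the notebook this might be called as `collect_pages(links['links'], lambda u: Website(u).get_contents())`.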

In [13]:
print(get_all_details("https://www.anthropic.com/"))

Found links: {'links': [{'type': 'about page', 'url': 'https://www.anthropic.com/company'}, {'type': 'careers page', 'url': 'https://www.anthropic.com/careers'}, {'type': 'team page', 'url': 'https://www.anthropic.com/team'}, {'type': 'research page', 'url': 'https://www.anthropic.com/research'}, {'type': 'news page', 'url': 'https://www.anthropic.com/news'}]}
Landing page:
Webpage title:
Home \ Anthropic

Webpage Content:
Claude
Overview
Team
Enterprise
API
Pricing
Research
Company
Careers
News
Try Claude
AI
research
and
products
that put safety at the frontier
Claude.ai
Meet Claude 3.5 Sonnet
Claude 3.5 Sonnet, our most intelligent AI model, is now available.
Talk to Claude
API
Build with Claude
Create AI-powered applications and custom experiences using Claude.
Learn more
Announcement
Statement from Dario Amodei on the Paris AI Action Summit
Read the statement
Policy
Anthropic Economic Index
Policy
Our Responsible Scaling Policy
Our Work
Product
Claude for Enterprise
Sep 4, 2024
Ali

In [27]:
system_prompt = "You are an assistant that analyzes the contents of several relevant webpages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

# system_prompt = "You are an assistant that analyzes the contents of several relevant webpages from a company website \
# and creates a short humorous, entertaining and jokey brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
# Include details of company culture, customers and careers/jobs if you have the information."

In [28]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called {company_name}\n"
    user_prompt += "Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown, written in Japanese.\n"
    user_prompt += get_all_details(url)
    user_prompt = user_prompt[:20_000]  # truncate if longer than 20,000 characters
    return user_prompt

In [29]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role":"system", "content":system_prompt},
            {"role":"user", "content":get_brochure_user_prompt(company_name,url)}
        ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [30]:
create_brochure("VijayShree Toolings","https://www.vjtl.in/")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.vjtl.in/aboutus.html'}, {'type': 'careers page', 'url': 'https://www.vjtl.in/careers.html'}, {'type': 'products page', 'url': 'https://www.vjtl.in/products.html'}, {'type': 'facilities page', 'url': 'https://www.vjtl.in/facilities.html'}, {'type': 'contact page', 'url': 'https://www.vjtl.in/contact.php'}]}


```markdown
# Vijayshreeウェアパーツ株式会社のご紹介

## 会社概要
Vijayshreeは、2009年に設立され、2017年に再編成された企業で、タングステンカーバイドツーリングおよびウェアパーツの製造・供給において30年以上の経験を持つエンジニアチームで構成されています。さまざまな業界で使用されるタングステンカーバイド製品の開発に特化し、特に一般工学、石油・ガス産業、電気スタンピング、金属成形、粉末成形業界での豊富なアプリケーション経験があります。

## 製品＆サービス
当社の製品ラインには、以下が含まれます：
- スリッティングカッター
- フォーミングコアロッド
- ピアシングパンチ
- ヘディングペレット

日本のカーバイドメーカーとの提携を通じて、50年以上の経験を活かした技術的なサポートと先進的なカーバイドグレードを提供しています。

## 会社の文化
Vijayshreeでは、社員の成長とキャリア開発を重視した、支援的な環境を提供しています。効率的な業務遂行を重視し、社員が自分の仕事に対する責任を持ち、より高度な判断を下すことを奨励しています。職場ではエネルギーに満ち、熱意にあふれる環境を目指しています。

## キャリア情報
現在、次のポジションで人材を募集しています：
- 営業
- マーケティング

応募希望の方は、履歴書を以下のメールアドレスに送付ください：
- **Email:** pune@vjtl.in

## 顧客
私たちは、以下の業界にサービスを提供しています：
- 自動車（シャーシ、エンジン、パワートレインなど）
- 航空（精密部品およびベアリングの製造用型）
- 電子（半導体、モーター、バッテリーなど）
- 食品および医療（乳製品処理機械）
- 土木工事（建設用ビット）
- 鉄鋼およびミリング（冷間・温間鍛造ロール）

## お問い合わせ
- **住所 (本社):** Ashoka Vihar, 2nd Floor, H2/8, XLO Point, Ambad, Nashik - 422010
- **電話:** +91 0253 6693924
- **メール:** contact@vjtl.in

当社は、ISO 9001:2015 認証を取得した企業です。持続可能な成長を目指すVijayshreeで、あなたのビジネスのパートナーとしてお手伝いできることを期待しています。
```


In [25]:
def stream_brochure(company_name, url):
    stream = openai.chat.completions.create(
        model = MODEL,
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name,url)}
        ],
        stream = True
    )

    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
        # Clean a copy for display so the word "markdown" inside the brochure text is not removed
        cleaned = response.replace("```markdown", "").replace("```", "")
        update_display(Markdown(cleaned), display_id=display_handle.display_id)
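
The loop above removes fence markers with plain string replacement, which touches every occurrence in the text. A narrower alternative is sketched here, assuming a hypothetical helper `strip_outer_fence` that removes only one opening fence line at the very start and, if one was found, one closing fence at the very end:

```python
import re

_FENCE_OPEN = re.compile(r"^\s*```[a-zA-Z]*[ \t]*\n?")
_FENCE_CLOSE = re.compile(r"\n?```\s*$")

def strip_outer_fence(text):
    """Remove one leading ```lang line and its trailing ``` if present."""
    stripped = _FENCE_OPEN.sub("", text, count=1)
    if stripped == text:
        return text  # no opening fence, so leave inner code blocks alone
    return _FENCE_CLOSE.sub("", stripped, count=1)
```

Inside the streaming loop it would be applied to the accumulated text on each chunk, for example `update_display(Markdown(strip_outer_fence(response)), display_id=display_handle.display_id)`.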

In [26]:
stream_brochure("VijayShree Toolings","https://www.vjtl.in/")

Found links: {'links': [{'type': 'about page', 'url': 'https://www.vjtl.in/aboutus.html'}, {'type': 'careers page', 'url': 'https://www.vjtl.in/careers.html'}, {'type': 'products page', 'url': 'https://www.vjtl.in/products.html'}, {'type': 'facilities page', 'url': 'https://www.vjtl.in/facilities.html'}]}


# Welcome to Vijayshree Wear Parts Pvt Ltd!

**Your one-stop shop for advanced Tungsten Carbide wear parts! Because who wouldn't want to be on a first-name basis with carbide?**

---

### What We Do
At Vijayshree, we turn raw Côte d'Azur dreams into high-performance tungsten carbide wear parts! Think of us as the Willy Wonka of wear parts but instead of chocolate, we make cutting-edge, super durable components that *actually* work in industries like:

- **General Engineering** - All the tools to impress Dad when fixing things around the house!
- **Oil and Gas** - Keeping the energy flowing—even when you forget to put gas in the car.
- **Automotive** - We make components that help your car go vroom!
- **Food and Healthcare** - Because those milk processing machines need love too!

---

### Our Culture
Forget the corporate cubicles! Here at Vijayshree, we believe in a workspace buzzing with energy and innovation. We nurture creativity, encourage independence, and even give you a little room to take risks (just don’t try building a rocket ship in the break room). 

Our team consists of experienced engineers boasting over 30 years in manufacturing, which is a little more than the amount of time it takes to binge-watch an entire series on Netflix.

---

### Why Choose Us?
- **Consistency Like Your Morning Coffee**: We promise quality assurance, so your parts perform stellar batch after batch! 
- **Timely Delivery**: We’re like your “always on time” friend who brings chips to the party.
- **Technical Support**: Got questions? Our engineers are here to lend a helping hand (and occasionally a bad pun).
  
**Let us help you reduce your cost per component—because saving money is always in style!**

---

### Careers - Join Us!
Looking for a place where you can work with amazing people, learn a ton, and grow your career? Look no further! 

Whether you’re interested in **Sales**, **Marketing**, or just want to send us your résumé (pune@vjtl.in) so we can have a good old giggle over it, we’re open to hearing from you!

> *“Career Development: Where You Work Harder, So You Can Hardly Work!”*

---

### Our Products
We offer an impressive range of products that could revive even the most tired industry, including:

- **Slitting Cutters** – Slice it right the first time!
- **Piercing Punches** – No, not the kind you use at parties!
- **And a host of other extraordinary wear parts** that fit various industrial needs.

---

### Join the Vijayshree Family!
So, if you’re fed up with ordinary machining and want to spice your operations with some extra tough, tungsten carbide magic, give us a call: +91 0253 6693924/ +91 8149369430. 

> Remember folks, at Vijayshree Wear Parts, we make wear parts that stick around longer than your new year’s resolutions. Cheers to durability!

---

**Vijayshree Wear Parts Pvt Ltd** - Where carbide dreams become high-performance reality! 🎉