First, we figure out which links are relevant for our brochure.


In [8]:
import os
import json
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from scraper import fetch_website_links, fetch_website_contents
from openai import OpenAI

In [2]:
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key) > 10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key. Please visit the troubleshooting notebook!")
    
MODEL = 'gpt-5-nano'
openai = OpenAI()

API key looks good so far


In [3]:
link_system_prompt = """
You are provided with a list of links found on a webpage.
You are able to decide which of the links would be most relevant to include in a brochure about the company,
such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:

{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [4]:
def get_links_user_prompt(url):
    user_prompt = f"""
Here is the list of links on the website {url} -
Please decide which of these are relevant web links for a brochure about the company, 
respond with the full https URL in JSON format.
Do not include Terms of Service, Privacy, email links.

Links (some might be relative links):

"""
    links = fetch_website_links(url)
    user_prompt += "\n".join(links)
    return user_prompt

In [5]:
print(get_links_user_prompt("https://www.merckgroup.com/en"))


Here is the list of links on the website https://www.merckgroup.com/en -
Please decide which of these are relevant web links for a brochure about the company, 
respond with the full https URL in JSON format.
Do not include Terms of Service, Privacy, email links.

Links (some might be relative links):

https://www.emdgroup.com
https://www.merckgroup.com
/en.html
/en/privacy-statement.html
/en/privacy-statement.html
/en/privacy-statement.html#3-2-2-cookies
#main-content
/de.html
https://careers.merckgroup.com/global/en
/en.html
/en/company.html
/en/company.html
/en/company.html
/en/company/strategy-and-values.html
/en/company/management.html
/en/partnering.html
/en/company/our-impact.html
/en/company/building-belonging.html
/en/company/history.html
/en/company/merck-in-germany.html
/en/locations.html
/en/company/contact-us.html
/en/sustainability.html
/en/sustainability/products-and-innovation.html
/en/sustainability/business-ethics.html
/en/sustainability/health-for-all.html
/en/sustai
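
The list above mixes absolute URLs, relative paths, in-page anchors and repeated entries, and all of it is sent to the model as-is. One option is to normalize the list before building the prompt. The following is a minimal sketch, not part of the original notebook: `normalize_links` is a hypothetical helper, and it assumes `fetch_website_links` returns a plain list of strings like the one printed above.

```python
from urllib.parse import urljoin, urldefrag


def normalize_links(base_url, links):
    """Resolve relative links against base_url, drop fragments and duplicates."""
    seen = set()
    normalized = []
    for link in links:
        # Skip in-page anchors and non-web schemes such as mailto:
        if link.startswith(("#", "mailto:", "tel:", "javascript:")):
            continue
        # urljoin leaves absolute URLs untouched and resolves relative ones
        absolute, _fragment = urldefrag(urljoin(base_url, link))
        if absolute not in seen:
            seen.add(absolute)
            normalized.append(absolute)
    return normalized


# A few entries copied from the output above
sample = [
    "https://www.merckgroup.com",
    "/en.html",
    "/en/privacy-statement.html",
    "/en/privacy-statement.html#3-2-2-cookies",
    "#main-content",
    "/en/company.html",
    "/en/company.html",
]
for url in normalize_links("https://www.merckgroup.com/en", sample):
    print(url)
```

Sending fewer, already-absolute links would likely shorten the prompt and leave the model less room to build a wrong URL, though the prompt still asks it to return full https URLs.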

In [6]:
def select_relevant_links(url):
    response = openai.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(url)}
        ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    links = json.loads(result)
    return links
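
`select_relevant_links` trusts the model to return exactly the shape shown in the system prompt. JSON mode should produce syntactically valid JSON, but it does not promise a `links` key or well-formed entries. A defensive parser could look roughly like the sketch below. `parse_links_response` is a hypothetical helper, not part of the original notebook, and the fallback type `"page"` is an arbitrary choice.

```python
import json


def parse_links_response(raw):
    """Parse the model's JSON reply and keep only well-formed link entries."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"links": []}
    if not isinstance(data, dict):
        return {"links": []}
    entries = data.get("links", [])
    if not isinstance(entries, list):
        return {"links": []}
    valid = []
    for entry in entries:
        # Keep only dict entries that carry a full https URL
        if not isinstance(entry, dict):
            continue
        url = entry.get("url")
        if isinstance(url, str) and url.startswith("https://"):
            valid.append({"type": str(entry.get("type", "page")), "url": url})
    return {"links": valid}


raw = '{"links": [{"type": "about page", "url": "https://full.url/goes/here/about"}, {"type": "broken", "url": "/relative"}]}'
print(parse_links_response(raw))
```

In the notebook, this would replace the bare `json.loads(result)` call, so that later code can rely on every entry having both a `type` and an absolute `url`.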

In [7]:
select_relevant_links("https://www.merckgroup.com/en")

{'links': [{'type': 'home page', 'url': 'https://www.merckgroup.com'},
  {'type': 'company page',
   'url': 'https://www.merckgroup.com/en/company.html'},
  {'type': 'strategy and values page',
   'url': 'https://www.merckgroup.com/en/company/strategy-and-values.html'},
  {'type': 'management page',
   'url': 'https://www.merckgroup.com/en/company/management.html'},
  {'type': 'partnering page',
   'url': 'https://www.merckgroup.com/en/partnering.html'},
  {'type': 'our impact page',
   'url': 'https://www.merckgroup.com/en/company/our-impact.html'},
  {'type': 'building belonging page',
   'url': 'https://www.merckgroup.com/en/company/building-belonging.html'},
  {'type': 'history page',
   'url': 'https://www.merckgroup.com/en/company/history.html'},
  {'type': 'merck in Germany page',
   'url': 'https://www.merckgroup.com/en/company/merck-in-germany.html'},
  {'type': 'locations page',
   'url': 'https://www.merckgroup.com/en/locations.html'},
  {'type': 'contact us page',
   'url':

Second, having selected the relevant links, we create the brochure from the content of those pages.

In [9]:
def fetch_page_and_all_relevant_links(url):
    contents = fetch_website_contents(url)
    relevant_links = select_relevant_links(url)
    result = f"## Landing Page:\n\n{contents}\n## Relevant Links:\n"
    for link in relevant_links['links']:
        result += f"\n\n### Link: {link['type']}\n"
        result += fetch_website_contents(link["url"])
    return result
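
As written, a single page that fails to load (a timeout, a 404, a link the model got wrong) would abort the whole run. The sketch below shows one way to skip failing pages instead. It is not part of the original notebook: `collect_pages` is a hypothetical helper, and the fetcher is passed in as an argument so that the example runs without the notebook's `scraper` module. In the notebook, that argument would be `fetch_website_contents`.

```python
def collect_pages(links, fetch):
    """Fetch each link with fetch(url), skipping pages that raise an error."""
    sections = []
    for link in links:
        try:
            body = fetch(link["url"])
        except Exception as exc:  # broad on purpose: the fetcher's error types are unknown here
            print(f"Skipping {link['url']}: {exc}")
            continue
        sections.append(f"### Link: {link['type']}\n{body}")
    return "\n\n".join(sections)


def demo_fetch(url):
    """Stand-in fetcher for the example; fails for one URL."""
    if url.endswith("/missing"):
        raise ValueError("page not found")
    return f"Contents of {url}"


demo_links = [
    {"type": "about page", "url": "https://full.url/goes/here/about"},
    {"type": "missing page", "url": "https://full.url/goes/here/missing"},
]
print(collect_pages(demo_links, demo_fetch))
```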

In [10]:
print(fetch_page_and_all_relevant_links("https://www.merckgroup.com/en"))

## Landing Page:

English Home - Merck Group

Redirect
You have accessed
https://www.emdgroup.com
, but for users from your part of the world, we originally designed the following web presence
https://www.merckgroup.com
.
Let's go
Cancel
Internet Explorer is not supported
You are using an outdated browser. If you would like to use our website, please use Chrome, Firefox, Safari or other browser
Accept
Share Disclaimer
By sharing this content, you are consenting to share your data to this social media provider. More information are available in our
Privacy Statement
.
OK
Cancel
{"analytics":"targeting","share":"targeting"}
Cookie Disclaimer
We use cookies so that we can offer you the best possible website experience. This includes cookies which are necessary for the operation of the app and the website, as well as other cookies which are used solely for anonymous statistical purposes, for more comfortable website settings, or for the display of personalized content. You are free to deci

In [11]:
brochure_system_prompt = """
You are an assistant that analyzes the contents of several relevant pages from a company website
and creates a short brochure about the company for prospective customers, investors and recruits.
Respond in markdown without code blocks.
Include details of company culture, customers and careers/jobs if you have the information.
"""

In [12]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"""
You are looking at a company called: {company_name}
Here are the contents of its landing page and other relevant pages;
use this information to build a short brochure of the company in markdown without code blocks.\n\n
"""
    user_prompt += fetch_page_and_all_relevant_links(url)
    user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
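
The flat `[:5_000]` slice keeps whatever comes first. In the output printed earlier, the landing page text opens with redirect, browser and cookie notices, so much of the budget may be spent before any linked page is reached. An alternative is to give each page an equal share of the budget. This is a minimal sketch, not part of the original notebook: `truncate_sections` is a hypothetical helper, and it assumes the pages are available as a list of strings rather than one concatenated string.

```python
def truncate_sections(sections, total_budget=5_000):
    """Trim each section to an equal share of the total character budget."""
    if not sections:
        return ""
    separator = "\n\n"
    # Reserve room for the separators so the joined text stays within budget
    available = max(total_budget - len(separator) * (len(sections) - 1), 0)
    per_section = available // len(sections)
    return separator.join(section[:per_section] for section in sections)


pages = ["landing page text " * 500, "about page text " * 500, "careers"]
prompt_body = truncate_sections(pages, total_budget=5_000)
print(len(prompt_body))
```

Equal shares are only one possible policy. Short pages leave part of their share unused, so a refinement could hand the leftover characters to the longer pages.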

In [13]:
get_brochure_user_prompt("Merck", "https://www.merckgroup.com/en")

'\nYou are looking at a company called: Merck\nHere are the contents of its landing page and other relevant pages;\nuse this information to build a short brochure of the company in markdown without code blocks.\n\n\n## Landing Page:\n\nEnglish Home - Merck Group\n\nRedirect\nYou have accessed\nhttps://www.emdgroup.com\n, but for users from your part of the world, we originally designed the following web presence\nhttps://www.merckgroup.com\n.\nLet\'s go\nCancel\nInternet Explorer is not supported\nYou are using an outdated browser. If you would like to use our website, please use Chrome, Firefox, Safari or other browser\nAccept\nShare Disclaimer\nBy sharing this content, you are consenting to share your data to this social media provider. More information are available in our\nPrivacy Statement\n.\nOK\nCancel\n{"analytics":"targeting","share":"targeting"}\nCookie Disclaimer\nWe use cookies so that we can offer you the best possible website experience. This includes cookies which are ne

In [14]:
def create_brochure(company_name, url):
    response = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {"role": "system", "content": brochure_system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [15]:
create_brochure("Merck", "https://www.merckgroup.com/en")

# Merck Group — Innovation That Matters

---

## About Merck

Merck is a leading science and technology company dedicated to cutting-edge solutions in the fields of healthcare, life science, and performance materials. With a global footprint, Merck drives innovation that improves the quality of life for millions around the world.

---

## Our Mission and Vision

At Merck, we believe in the transformative power of science and technology to change lives. Our mission is to develop groundbreaking products and solutions that address some of the most pressing challenges in health and industry.

---

## What We Do

- **Healthcare:** We provide innovative therapies and treatments that help patients overcome serious diseases, improving healthcare outcomes globally.
- **Life Science:** We empower researchers and scientists worldwide with high-quality tools and technologies to accelerate scientific discovery.
- **Performance Materials:** Our advanced materials enable the development of next-generation technologies in electronics, displays, and more.

---

## Company Culture

Merck fosters an inclusive and inspiring work environment where creativity, collaboration, and continuous learning are highly valued. Employees are encouraged to innovate and are supported with resources for personal and professional growth.

- **Innovation-driven:** Embrace cutting-edge research and development.
- **Inclusive Work Environment:** Diverse teams empower fresh ideas and perspectives.
- **Sustainability:** Committed to ethical practices and responsible innovation.

---

## Customers

Merck serves a wide range of customers including:

- Pharmaceutical companies seeking advanced therapies.
- Academic and industrial researchers developing new scientific insights.
- Technology companies relying on high-performance materials.
- Healthcare providers managing patient care with innovative solutions.

---

## Careers at Merck

Merck offers exciting career opportunities for talented individuals passionate about science and innovation.

- Roles in research, manufacturing, marketing, sales, and more.
- Comprehensive training and development programs.
- Global mobility and opportunities to work across diverse fields.
- Employee benefits that promote well-being and work-life balance.

If you aspire to be part of a forward-thinking company that values innovation and impact, Merck is the place for you.

---

## Connect With Us

Discover more about Merck’s innovations, career opportunities, and commitment to shaping the future at  
**[merckgroup.com](https://www.merckgroup.com)**

---

*Merck: Science for a better life.*