# Introduction
In this project, I build an AI assistant whose goal is to create a brochure about a website. The project uses two assistants: the first finds the important links on the website using an OpenAI model (`gpt-4o-mini`), and the second writes the brochure using the Llama 3.2 model.

In [2]:
import os
import re
import json
import ollama
import requests
from typing import List
from openai import OpenAI
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display, update_display

In [3]:
MODEL_LLAMA = 'llama3.2'
MODEL_OPENAI = 'gpt-4o-mini'
OLLAMA_API = "http://localhost:11434/api/chat"

load_dotenv(override = True)
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")

openai = OpenAI()

API key looks good so far


In [4]:
question = """
Please explain what this code does and why:
yield from {book.get("author") for book in books if book.get("author")}
"""

In [5]:
system_prompt = 'You are an expert in Python programming. You provide clear, concise, and accurate explanations, helping users with Python concepts, debugging, and best practices. Keep responses precise and to the point.'

In [6]:
messages = [
    {
        'role' : 'system', 'content' : system_prompt
    },
    {
        'role' : 'user', 'content' : question
    }
]

In [7]:
# Check that the Ollama model runs
response = ollama.chat(model = MODEL_LLAMA, messages = messages)
print(response['message']['content'])

**Code Explanation**

This line of code uses a combination of Python features to extract author names from a list of dictionaries (`books`) and yield them one by one.

Here's a breakdown:

* `{book.get("author") for book in books if book.get("author")}`: This is a generator expression.
	+ `for book in books`: Iterates over each dictionary (`book`) in the `books` list.
	+ `if book.get("author")`: Filters out dictionaries that don't have an "author" key or value.
	+ `book.get("author")`: Retrieves and yields the value associated with the "author" key for each remaining dictionary.

* `yield from ...`: This keyword is used to yield values from a sub-generator (the generator expression). It's like saying "give me all the values generated by the sub-generator, one at a time".

So, when combined, this line of code yields author names from dictionaries in the `books` list, but only for those that have an "author" key.

**Example Use Case**

Suppose you have a list of books with their metadata
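
One detail worth checking against the model's answer: the braces make this a set comprehension rather than a generator expression, so duplicate authors are dropped before `yield from` hands the values out, and their order is not guaranteed. A minimal sketch, assuming a hypothetical `books` list and a wrapper function named `unique_authors` (neither is part of the original notebook):

```python
def unique_authors(books):
    """Yield each distinct, non-empty author found in `books`."""
    # The braces build a set first, so duplicate authors collapse here.
    yield from {book.get("author") for book in books if book.get("author")}


# Hypothetical sample data, for illustration only.
books = [
    {"title": "Book A", "author": "Ada"},
    {"title": "Book B", "author": "Ada"},    # duplicate author
    {"title": "Book C"},                     # no "author" key
    {"title": "Book D", "author": ""},       # empty value is filtered out
    {"title": "Book E", "author": "Grace"},
]

# Sorting makes the result stable, because set order is arbitrary.
authors = sorted(unique_authors(books))
print(authors)
```

Because `yield from` needs to sit inside a function body, the line from the question only works as part of a generator function like the one sketched here.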

In [8]:
# A class to represent a webpage
# This class collects the links on a website. The AI model then picks the links that matter for the brochure.

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:
    """
    A utility class to represent a Website that we have scraped, now with links
    """

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)
        self.body = response.content
        soup = BeautifulSoup(self.body, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""
        links = [link.get('href') for link in soup.find_all('a')]
        self.links = [link for link in links if link]

    def get_contents(self):
        return f"Webpage Title:\n{self.title}\nWebpage Contents:\n{self.text}\n\n"
        

In [9]:
huggingface = Website('https://huggingface.co')
huggingface.links

['/',
 '/models',
 '/datasets',
 '/spaces',
 '/docs',
 '/enterprise',
 '/pricing',
 '/login',
 '/join',
 '/spaces',
 '/models',
 '/deepseek-ai/DeepSeek-R1-0528',
 '/ByteDance-Seed/BAGEL-7B-MoT',
 '/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B',
 '/ResembleAI/chatterbox',
 '/google/gemma-3n-E4B-it-litert-preview',
 '/models',
 '/spaces/enzostvs/deepsite',
 '/spaces/ResembleAI/Chatterbox',
 '/spaces/NihalGazi/FLUX-Pro-Unlimited',
 '/spaces/multimodalart/wan2-1-fast',
 '/spaces/Lightricks/ltx-video-distilled',
 '/spaces',
 '/datasets/open-r1/Mixture-of-Thoughts',
 '/datasets/yandex/yambda',
 '/datasets/MiniMaxAI/SynLogic',
 '/datasets/disco-eth/EuroSpeech',
 '/datasets/cognitivecomputations/china-refusals',
 '/datasets',
 '/join',
 '/pricing#endpoints',
 '/pricing#spaces',
 '/pricing',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/enterprise',
 '/allenai',
 '/facebook',
 '/amazon',
 '/google',
 '/Intel',
 '/microsoft',
 '/grammarly',
 '/Writer

The `Website` class works as expected. Next, the first AI assistant picks the important links from the list that the class collected.
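
Many of the scraped links are relative (for example `/models`) and several are repeated. A minimal sketch of how they could be turned into unique absolute URLs before being sent to the model, using `urljoin` and `urldefrag` from the standard library. The helper name `normalize_links` is hypothetical and is not part of the notebook:

```python
from urllib.parse import urljoin, urldefrag


def normalize_links(base_url, links):
    """Return absolute URLs in first-seen order, without duplicates."""
    seen = set()
    result = []
    for link in links:
        # Resolve relative paths against the page URL, then drop "#fragment".
        absolute, _fragment = urldefrag(urljoin(base_url, link))
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# A few links taken from the output above, plus one absolute URL.
sample = ["/", "/models", "/pricing#spaces", "/pricing", "/models",
          "https://github.com/huggingface"]
cleaned = normalize_links("https://huggingface.co", sample)
print(cleaned)
```

Sending a shorter, already absolute list would also reduce how much the model has to infer about the base URL, although the notebook leaves that step to the model.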

# First AI Assistant

In [10]:
# System prompt for the first AI assistant
link_system_prompt = "You are provided with a list of links found on a webpage. \
You are able to decide which of the links would be most relevant to include in a brochure about the company, \
such as links to an About page, or a Company page, or Careers/Jobs pages.\n"
link_system_prompt += "You should respond in JSON as in this example:"
link_system_prompt += """
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}
"""

In [11]:
print(link_system_prompt)

You are provided with a list of links found on a webpage. You are able to decide which of the links would be most relevant to include in a brochure about the company, such as links to an About page, or a Company page, or Careers/Jobs pages.
You should respond in JSON as in this example:
{
    "links": [
        {"type": "about page", "url": "https://full.url/goes/here/about"},
        {"type": "careers page", "url": "https://another.full.url/careers"}
    ]
}



In [12]:
# Build the user prompt for the link-selection step
def get_links_user_prompt(website):
    user_prompt = f"Here is the list of links on the website of {website.url} - "
    user_prompt += "please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. \
Do not include Terms of Service, Privacy, email links.\n"
    user_prompt += "Links (some might be relative links):\n"
    user_prompt += "\n".join(website.links)
    return user_prompt

In [13]:
print(get_links_user_prompt(huggingface))

Here is the list of links on the website of https://huggingface.co - please decide which of these are relevant web links for a brochure about the company, respond with the full https URL in JSON format. Do not include Terms of Service, Privacy, email links.
Links (some might be relative links):
/
/models
/datasets
/spaces
/docs
/enterprise
/pricing
/login
/join
/spaces
/models
/deepseek-ai/DeepSeek-R1-0528
/ByteDance-Seed/BAGEL-7B-MoT
/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
/ResembleAI/chatterbox
/google/gemma-3n-E4B-it-litert-preview
/models
/spaces/enzostvs/deepsite
/spaces/ResembleAI/Chatterbox
/spaces/NihalGazi/FLUX-Pro-Unlimited
/spaces/multimodalart/wan2-1-fast
/spaces/Lightricks/ltx-video-distilled
/spaces
/datasets/open-r1/Mixture-of-Thoughts
/datasets/yandex/yambda
/datasets/MiniMaxAI/SynLogic
/datasets/disco-eth/EuroSpeech
/datasets/cognitivecomputations/china-refusals
/datasets
/join
/pricing#endpoints
/pricing#spaces
/pricing
/enterprise
/enterprise
/enterprise
/enterprise
/

In [16]:
# This version uses the OpenAI model
def get_links(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model=MODEL_OPENAI,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
      ],
        response_format={"type": "json_object"}
    )
    result = response.choices[0].message.content
    return json.loads(result)

# This version uses Ollama (kept inside a string so that it does not run)
"""
def get_links(url):
    website = Website(url)
    response = ollama.chat(
        model = MODEL_LLAMA,
        messages=[
            {"role": "system", "content": link_system_prompt},
            {"role": "user", "content": get_links_user_prompt(website)}
        ],)
        #response_format={"type": "json_object"})

    result = response['message']['content']

    # json format
    match = re.search(r'\{.*\}', result, re.DOTALL)
    if match:
        json_str = match.group(0)
        data = json.loads(json_str)
        print('Json is found.')
        return data
    else:
        print('Json is not found.')"""
    


In [17]:
get_links("https://huggingface.co")

{'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'},
  {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'},
  {'type': 'company page', 'url': 'https://huggingface.co/enterprise'},
  {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'},
  {'type': 'blog page', 'url': 'https://huggingface.co/blog'},
  {'type': 'community discussion', 'url': 'https://discuss.huggingface.co'},
  {'type': 'GitHub', 'url': 'https://github.com/huggingface'},
  {'type': 'Twitter', 'url': 'https://twitter.com/huggingface'},
  {'type': 'LinkedIn',
   'url': 'https://www.linkedin.com/company/huggingface/'}]}
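
The commented-out Ollama variant relies on a regular expression because a local model may wrap its JSON in extra prose. A minimal sketch of that idea as a standalone helper that returns `None` instead of raising when nothing usable is found. The name `extract_json` is hypothetical, and the sample reply is made up for illustration:

```python
import json
import re


def extract_json(text):
    """Parse the span from the first "{" to the last "}" in `text` as JSON.

    Returns the parsed object, or None when no valid JSON is found.
    """
    # Greedy match across newlines: first "{" through last "}".
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        # The braces did not enclose valid JSON.
        return None


reply = (
    "Sure! Here are the links:\n"
    '{"links": [{"type": "about page", "url": "https://example.com/about"}]}\n'
    "Hope this helps."
)
parsed = extract_json(reply)
print(parsed)
```

Returning `None` lets the caller decide whether to retry the request or continue with an empty link list.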

In [27]:
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        result += Website(link["url"]).get_contents()
    return result

"""
# This version handles errors: if a link cannot be opened, the error is recorded and the loop continues.
def get_all_details(url):
    result = "Landing page:\n"
    result += Website(url).get_contents()
    links = get_links(url)
    print("Found links:", links)
    for link in links["links"]:
        result += f"\n\n{link['type']}\n"
        try:
            result += Website(link["url"]).get_contents()
        except Exception as e:
            result += f"[Error: {e}]\n"
    return result
"""


In [19]:
print(get_all_details("https://huggingface.co"))

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'docs page', 'url': 'https://huggingface.co/docs'}, {'type': 'community discussion page', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}
Landing page:
Webpage Title:
Hugging Face – The AI community building the future.
Webpage Contents:
Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
The AI community building the future.
The platform where the machine learning community collaborat

Some of the links are skipped because I had trouble accessing them.
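
One way to skip unreachable pages without stopping the whole run is to catch the error for each link and to remember which URLs were already fetched. A minimal sketch, assuming a hypothetical `fetch` callable that stands in for `Website(url).get_contents()` so that the example runs without network access. The names `collect_pages` and `fake_fetch` are not part of the notebook:

```python
def collect_pages(links, fetch):
    """Fetch each distinct URL once; record failures instead of raising."""
    seen = set()
    sections = []
    skipped = []
    for link in links:
        url = link["url"]
        if url in seen:
            continue  # same URL listed twice, fetch it only once
        seen.add(url)
        try:
            sections.append(f"{link['type']}\n{fetch(url)}")
        except Exception as error:  # broad on purpose: any failure skips the page
            skipped.append((url, str(error)))
    return "\n\n".join(sections), skipped


def fake_fetch(url):
    """Stand-in for a real page download (an assumption for this sketch)."""
    if "blocked" in url:
        raise ConnectionError("access denied")
    return f"contents of {url}"


links = [
    {"type": "about page", "url": "https://example.com/about"},
    {"type": "careers page", "url": "https://blocked.example.com/jobs"},
    {"type": "company page", "url": "https://example.com/about"},
]
text, skipped = collect_pages(links, fake_fetch)
print(text)
print(skipped)
```

Keeping the skipped URLs in a separate list makes it easy to report which pages were left out of the brochure input.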

# Second AI Assistant

In [20]:
system_prompt = "You are an assistant that analyzes the contents of several relevant pages from a company website \
and creates a short brochure about the company for prospective customers, investors and recruits. Respond in markdown. \
Include details of company culture, customers and careers/jobs if you have the information."

In [21]:
def get_brochure_user_prompt(company_name, url):
    user_prompt = f"You are looking at a company called: {company_name}\n"
    user_prompt += f"Here are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\n"
    user_prompt += get_all_details(url)
    #user_prompt = user_prompt[:5_000] # Truncate if more than 5,000 characters
    return user_prompt
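
The commented-out truncation line hints at a practical limit: the combined page text can grow very long, and a model with a small context window may then lose part of the input. A minimal sketch of a character budget, assuming a hypothetical helper name `truncate_prompt` and an arbitrary limit. The right value depends on the model and is not something the notebook specifies:

```python
def truncate_prompt(prompt, max_chars=5_000):
    """Cut `prompt` to at most `max_chars` characters, marking the cut."""
    if len(prompt) <= max_chars:
        return prompt
    marker = "\n[truncated]"
    # Reserve room for the marker so the result never exceeds the budget.
    # Assumes max_chars is larger than the marker itself.
    return prompt[: max_chars - len(marker)] + marker


long_prompt = "Landing page:\n" + "x" * 10_000
short = truncate_prompt(long_prompt, max_chars=100)
print(len(short))
```

A character count is only a rough proxy for tokens, so this is a guard against extreme inputs rather than an exact fit to any model's limit.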

In [22]:
get_brochure_user_prompt("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog', 'url': 'https://huggingface.co/blog'}, {'type': 'company profile', 'url': 'https://www.linkedin.com/company/huggingface/'}, {'type': 'community discussion', 'url': 'https://discuss.huggingface.co'}, {'type': 'support status', 'url': 'https://status.huggingface.co/'}]}


'You are looking at a company called: HuggingFace\nHere are the contents of its landing page and other relevant pages; use this information to build a short brochure of the company in markdown.\nLanding page:\nWebpage Title:\nHugging Face – The AI community building the future.\nWebpage Contents:\nHugging Face\nModels\nDatasets\nSpaces\nCommunity\nDocs\nEnterprise\nPricing\nLog In\nSign Up\nThe AI community building the future.\nThe platform where the machine learning community collaborates on models, datasets, and applications.\nExplore AI Apps\nor\nBrowse 1M+ models\nTrending on\nthis week\nModels\ndeepseek-ai/DeepSeek-R1-0528\nUpdated\n1 day ago\n•\n16.4k\n•\n1.44k\nByteDance-Seed/BAGEL-7B-MoT\nUpdated\n8 days ago\n•\n7.3k\n•\n863\ndeepseek-ai/DeepSeek-R1-0528-Qwen3-8B\nUpdated\n1 day ago\n•\n10.3k\n•\n446\nResembleAI/chatterbox\nUpdated\nabout 11 hours ago\n•\n326\ngoogle/gemma-3n-E4B-it-litert-preview\nUpdated\n4 days ago\n•\n671\nBrowse 1M+ models\nSpaces\nRunning\n7.46k\n7.46k\n

In [23]:
def create_brochure(company_name, url):
    response = ollama.chat(
        model = MODEL_LLAMA,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
        ])

    result = response['message']['content']
    display(Markdown(result))

In [24]:
create_brochure("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'company page', 'url': 'https://huggingface.co'}, {'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'documentation page', 'url': 'https://huggingface.co/docs'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'discussion forum', 'url': 'https://discuss.huggingface.co'}, {'type': 'GitHub page', 'url': 'https://github.com/huggingface'}, {'type': 'LinkedIn page', 'url': 'https://www.linkedin.com/company/huggingface'}, {'type': 'Twitter page', 'url': 'https://twitter.com/huggingface'}]}


The text appears to be a collection of updates and news from Hugging Face, a company that specializes in natural language processing (NLP) and machine learning. The updates include:

* Announcements of new funding rounds for the company
* Updates on the launch of new products and services, such as the Agents & MCP Hackathon
* News about new hires and promotions within the company
* Announcements of new repositories and projects being open-sourced by Hugging Face
* Updates on upcoming live streams and Q&A sessions related to machine learning and NLP

The text also includes a section called "Similar pages" that lists other companies and organizations in the same industry as Hugging Face, such as Anthropic, Mistral AI, OpenAI, and Google DeepMind.

Overall, the text appears to be a newsletter or blog post from Hugging Face, providing updates and news about the company's activities and projects.

# Bonus
In this part, I use the OpenAI model to create the brochure, so the second assistant runs on `gpt-4o-mini` instead of Llama 3.2.

In [25]:
def create_brochure_OpeanAI(company_name, url):
    response = openai.chat.completions.create(
        model=MODEL_OPENAI,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": get_brochure_user_prompt(company_name, url)}
          ],
    )
    result = response.choices[0].message.content
    display(Markdown(result))

In [26]:
create_brochure_OpeanAI("HuggingFace", "https://huggingface.co")

Found links: {'links': [{'type': 'about page', 'url': 'https://huggingface.co/huggingface'}, {'type': 'careers page', 'url': 'https://apply.workable.com/huggingface/'}, {'type': 'enterprise page', 'url': 'https://huggingface.co/enterprise'}, {'type': 'pricing page', 'url': 'https://huggingface.co/pricing'}, {'type': 'blog page', 'url': 'https://huggingface.co/blog'}, {'type': 'company page', 'url': 'https://www.linkedin.com/company/huggingface/'}]}


# Hugging Face Company Brochure

---

## **About Us**

Hugging Face is a rapidly growing technology company founded in 2016, dedicated to democratizing artificial intelligence through innovative machine learning tools and solutions. Our platform serves as a hub for the AI community, offering a collaborative environment where developers, researchers, and organizations can build, share, and deploy state-of-the-art models in natural language processing, image processing, and beyond.

---

## **What We Offer**

### **1. Models & Datasets**
- Access to over **1 million models** and **250,000 datasets** collaboratively built by the community.
- Advanced ML features, enabling users to contribute to and utilize the latest in machine learning tools.
  
### **2. Spaces**
- A creative platform for deploying machine learning applications.
- Users can run applications quickly and efficiently with an array of optimized hardware options.

### **3. Enterprise Solutions**
- Tailored services for organizations emphasizing **enterprise-grade security** and **dedicated support**.
- Flexible pricing starting at **$20/user/month**, with advanced analytics and access control features.

### **4. Open Source Collaboration**
- Hugging Face champions an open-source approach, fostering community contributions and enhancing collaborative research in machine learning.

---

## **Company Culture**

At Hugging Face, we embrace a **community-first culture** that encourages collaboration, experimentation, and continuous learning. Our diverse and multidisciplinary team shares a common mission to build the future of AI responsibly and ethically.

### **Values:**
- **Inclusivity**: We welcome everyone into the AI community, regardless of background.
- **Innovation**: We prioritize creativity and exploration as we tackle complex ML challenges.
- **Transparency**: We believe in open communication and sharing knowledge with the community.

---

## **Join Us**

### **Careers at Hugging Face**
We are always looking for talented and passionate individuals to join our team! Employees can enjoy a dynamic work environment that supports professional growth and innovation. 

- Current openings range from engineering to product management. Check [our careers page](https://huggingface.co/careers) for the latest opportunities. 

---

## **Our Customers**

Hugging Face serves over **50,000 organizations**, including industry leaders such as:
- **Amazon**
- **Google**
- **Microsoft**
- **Grammarly**

These organizations depend on our advanced platform to accelerate their AI initiatives and enhance their machine learning capabilities.

---

## **Get in Touch**

Ready to be part of the AI revolution? Explore our offerings, collaborate on projects, or simply join our thriving community of developers on platforms like [GitHub](https://github.com/huggingface), [Twitter](https://twitter.com/huggingface), and [LinkedIn](https://www.linkedin.com/company/huggingface).

Let’s build the future of AI together! Visit us at [Hugging Face](https://huggingface.co).

--- 

*This brochure aims to provide an overview of Hugging Face and its offerings as we continue to grow and shape the future of AI.*

# Conclusion
We used two models in this project. We learned how to call the models and how to prepare prompts for them. Finally, we looked at the results produced by the two models.<br>
I am still improving my knowledge of LLMs and will prepare similar projects. I am new to this field, so I may have made mistakes, and I apologize for them.
I hope this project was useful for you. Thank you for taking a look at it. I will continue to share projects. If you want to hear about them early, you can follow me through the links below.

[LinkedIn](https://www.linkedin.com/in/ihsancenkiz/)<br>
[GitHub](https://github.com/ihsncnkz)<br>
[Kaggle](https://www.kaggle.com/ihsncnkz)<br>