# YOUR FIRST LAB
### Please read this section. This is valuable to get you prepared, even if it's a long read -- it's important stuff.

## Your first Frontier LLM Project

Let's build a useful LLM solution - in a matter of minutes.

By the end of this course, you will have built an autonomous Agentic AI solution with 7 agents that collaborate to solve a business problem. All in good time! We will start with something smaller...

Our goal is to code a new kind of Web Browser. Give it a URL, and it will respond with a summary. The Reader's Digest of the internet!!

Before starting, you should have completed the setup for [PC](../SETUP-PC.md) or [Mac](../SETUP-mac.md) and you hopefully launched this Jupyter Lab from within the project root directory, with your environment activated.

## If you're new to Jupyter Lab

Welcome to the wonderful world of Data Science experimentation! Once you've used Jupyter Lab, you'll wonder how you ever lived without it. Simply click in each "cell" with code in it, such as the cell immediately below this text, and hit Shift+Return to execute that cell. Whenever you like, you can add a cell with the + button in the toolbar to print the values of variables or try out variations.  

I've written a notebook called [Guide to Jupyter](Guide%20to%20Jupyter.ipynb) to help you get more familiar with Jupyter Lab, including adding Markdown comments, using `!` to run shell commands, and `tqdm` to show progress.

## If you're new to the Command Line

Please see these excellent guides: [Command line on PC](https://chatgpt.com/share/67b0acea-ba38-8012-9c34-7a2541052665) and [Command line on Mac](https://chatgpt.com/canvas/shared/67b0b10c93a081918210723867525d2b).  

## If you'd prefer to work in IDEs

If you're more comfortable in IDEs like VSCode, Cursor or PyCharm, they all work great with these lab notebooks too.  
If you'd prefer to work in VSCode, [here](https://chatgpt.com/share/676f2e19-c228-8012-9911-6ca42f8ed766) are instructions from an AI friend on how to configure it for the course.

## If you'd like to brush up your Python

I've added a notebook called [Intermediate Python](Intermediate%20Python.ipynb) to get you up to speed. But you should give it a miss if you already have a good idea what this code does:    
`yield from {book.get("author") for book in books if book.get("author")}`
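If that line looks like a puzzle, here is a minimal sketch of it in action. The function name `unique_authors` and the sample `books` list are made up for this illustration; only the `yield from` line comes from the text above.

```python
def unique_authors(books):
    """Yield each distinct, non-empty author found in a list of book dicts."""
    # The braces build a set, so duplicate authors collapse to one entry.
    # The trailing "if" drops books with a missing or empty author.
    # "yield from" then hands out the set's items one at a time.
    yield from {book.get("author") for book in books if book.get("author")}


# Made-up sample data for this illustration
books = [
    {"title": "Emma", "author": "Jane Austen"},
    {"title": "Persuasion", "author": "Jane Austen"},
    {"title": "Dracula", "author": "Bram Stoker"},
    {"title": "Beowulf"},                 # no author key at all
    {"title": "Untitled", "author": ""},  # empty author is skipped too
]

# Sets are unordered, so sort before printing for a stable result
print(sorted(unique_authors(books)))
```

This should print `['Bram Stoker', 'Jane Austen']`: two Austen books collapse into one author, and the two books without a usable author are dropped.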

## I am here to help

If you have any problems at all, please do reach out.  
I'm available through the platform, or at ed@edwarddonner.com, or at https://www.linkedin.com/in/eddonner/ if you'd like to connect (and I love connecting!)  
And this is new to me, but I'm also trying out X/Twitter at [@edwarddonner](https://x.com/edwarddonner) - if you're on X, please show me how it's done 😂  

## More troubleshooting

Please see the [troubleshooting](troubleshooting.ipynb) notebook in this folder to diagnose and fix common problems. At the very end of it is a diagnostics script with some useful debug info.

## For foundational technical knowledge (eg Git, APIs, debugging) 

If you're relatively new to programming -- I've got your back! While it's ideal to have some programming experience for this course, there's only one mandatory prerequisite: plenty of patience. 😁 I've put together a set of self-study guides that cover Git and GitHub, APIs and endpoints, beginner Python and more.

This covers Git and GitHub; what they are, the difference, and how to use them:  
https://github.com/ed-donner/agents/blob/main/guides/03_git_and_github.ipynb

This covers technical foundations:  
ChatGPT vs API; taking screenshots; Environment Variables; Networking basics; APIs and endpoints:  
https://github.com/ed-donner/agents/blob/main/guides/04_technical_foundations.ipynb

This covers Python for beginners, and making sure that a `NameError` never trips you up:  
https://github.com/ed-donner/agents/blob/main/guides/06_python_foundations.ipynb

This covers the essential techniques for figuring out errors:  
https://github.com/ed-donner/agents/blob/main/guides/08_debugging.ipynb

And you'll find other useful guides in the same folder in GitHub. Some information applies to my other Udemy course (eg Async Python) but most of it is very relevant for LLM engineering.

## If this is old hat!

If you're already comfortable with today's material, please hang in there; you can move swiftly through the first few labs - we will get much more in depth as the weeks progress. Ultimately we will fine-tune our own LLM to compete with OpenAI!

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Please read - important note</h2>
            <span style="color:#900;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations. If you have a GitHub account, use this to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...</span>
        </td>
    </tr>
</table>
<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">This code is a live resource - keep an eye out for my emails</h2>
            <span style="color:#f71;">I push updates to the code regularly. As people ask questions, I add more examples or improved commentary. As a result, you'll notice that the code below isn't identical to the videos. Everything from the videos is here; but I've also added better explanations and new models like DeepSeek. Consider this like an interactive book.<br/><br/>
                I try to send emails regularly with important updates related to the course. You can find this in the 'Announcements' section of Udemy in the left sidebar. You can also choose to receive my emails via your Notification Settings in Udemy. I'm respectful of your inbox and always try to add value with my emails!
            </span>
        </td>
    </tr>
</table>
<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business value of these exercises</h2>
            <span style="color:#181;">A final thought. While I've designed these notebooks to be educational, I've also tried to make them enjoyable. We'll do fun things like have LLMs tell jokes and argue with each other. But fundamentally, my goal is to teach skills you can apply in business. I'll explain business implications as we go, and it's worth keeping this in mind: as you build experience with models and techniques, think of ways you could put this into action at work today. Please do contact me if you'd like to discuss more or if you have ideas to bounce off me.</span>
        </td>
    </tr>
</table>

In [1]:
# imports

import os
import requests
from dotenv import load_dotenv
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from openai import OpenAI

# If you get an error running this cell, then please head over to the troubleshooting notebook!

# Connecting to OpenAI (or Ollama)

The next cell is where we load in the environment variables in your `.env` file and connect to OpenAI.  

If you'd like to use free Ollama instead, please see the README section "Free Alternative to Paid APIs", and if you're not sure how to do this, there's a full solution in the solutions folder (day1_with_ollama.ipynb).

## Troubleshooting if you have problems:

Head over to the [troubleshooting](troubleshooting.ipynb) notebook in this folder for step by step code to identify the root cause and fix it!

If you make a change, try restarting the "Kernel" (the python process sitting behind this notebook) by Kernel menu >> Restart Kernel and Clear Outputs of All Cells. Then try this notebook again, starting at the top.

Or, contact me! Message me or email ed@edwarddonner.com and we will get this to work.

Any concerns about API costs? See my notes in the README - costs should be minimal, and you can control them at every point. You can also use Ollama as a free alternative, which we discuss during Day 2.

In [2]:
# Load environment variables in a file called .env

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Check the key

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
# Check for stray whitespace first; a leading space would otherwise be reported as a wrong prefix
elif api_key.strip() != api_key:
    print("An API key was found, but it looks like it might have space or tab characters at the start or end - please remove them - see troubleshooting notebook")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")


API key found and looks good so far!


In [3]:
openai = OpenAI()

# If this doesn't work, try Kernel menu >> Restart Kernel and Clear Outputs Of All Cells, then run the cells from the top of this notebook down.
# If it STILL doesn't work (horrors!) then please see the Troubleshooting notebook in this folder for full instructions

# Let's make a quick call to a Frontier model to get started, as a preview!

In [5]:
# To give you a preview -- calling OpenAI with these messages is this easy. Any problems, head over to the Troubleshooting notebook.

message = "Hello, GPT! This is my first ever message to you! Hi!"
response = openai.chat.completions.create(model="gpt-4o-mini", messages=[{"role":"user", "content":message}])
print(response.choices[0].message.content)

Hello! Welcome! I'm glad you're here. How can I assist you today?


## OK onwards with our first project

In [6]:
# A class to represent a Webpage
# If you're not familiar with Classes, check out the "Intermediate Python" notebook

# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        """
        Create this Website object from the given url using the BeautifulSoup library
        """
        self.url = url
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        self.title = soup.title.string if soup.title else "No title found"
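        # Some responses have no <body> at all; fall back to empty text rather than crashing
        if soup.body:
            for irrelevant in soup.body(["script", "style", "img", "input"]):
                irrelevant.decompose()
            self.text = soup.body.get_text(separator="\n", strip=True)
        else:
            self.text = ""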

In [8]:
# Let's try one out. Change the website and add print statements to follow along.

ed = Website("https://google.com")
print(ed.title)
print(ed.text)

Google
About
Store
Gmail
Images
Sign in
See more
Delete
Delete
Report inappropriate predictions
Advertising
Business
How Search works
Applying AI towards science and the environment
Privacy
Terms
Settings
Search settings
Advanced search
Your data in Search
Search history
Search help
Send feedback
Dark theme: Off
Google apps


## Types of prompts

You may know this already - but if not, you will get very familiar with it!

Models like GPT-4o have been trained to receive instructions in a particular way.

They expect to receive:

**A system prompt** that tells them what task they are performing and what tone they should use

**A user prompt** -- the conversation starter that they should reply to

In [9]:
# Define our system prompt - you can experiment with this later, changing the last sentence to "Respond in markdown in Spanish."

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

In [10]:
# A function that writes a User Prompt that asks for summaries of websites:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [11]:
print(user_prompt_for(ed))

You are looking at a website titled Google
The contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

About
Store
Gmail
Images
Sign in
See more
Delete
Delete
Report inappropriate predictions
Advertising
Business
How Search works
Applying AI towards science and the environment
Privacy
Terms
Settings
Search settings
Advanced search
Your data in Search
Search history
Search help
Send feedback
Dark theme: Off
Google apps


## Messages

The API from OpenAI expects to receive messages in a particular structure.
Many of the other APIs share this structure:

```python
[
    {"role": "system", "content": "system message goes here"},
    {"role": "user", "content": "user message goes here"}
]
```
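As a minimal sketch of that structure, the helpers below build a two-message list and run a quick sanity check on it. Both `build_messages` and `looks_like_messages` are hypothetical names invented for this illustration, not part of the OpenAI library, and the check only covers the simple text-only case shown above.

```python
def build_messages(system_text, user_text):
    """Return a two-message list in the system/user structure shown above."""
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]


def looks_like_messages(messages):
    """Sanity check: a non-empty list of dicts with string 'role' and 'content'."""
    if not isinstance(messages, list) or not messages:
        return False
    return all(
        isinstance(m, dict)
        and isinstance(m.get("role"), str)
        and isinstance(m.get("content"), str)
        for m in messages
    )


example = build_messages("You are a helpful assistant", "What is the capital of France?")
print(looks_like_messages(example))             # well-formed list
print(looks_like_messages([{"role": "user"}]))  # missing "content"
print(looks_like_messages("just a string"))     # not a list at all
```

A check like this can catch a malformed list before it is sent to the API, which tends to be easier to debug than an error response.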
To give you a preview, the next 2 cells make a rather simple call - we won't stretch the mighty GPT (yet!)

In [12]:
messages = [
    {"role": "system", "content": "You are a snarky assistant"},
    {"role": "user", "content": "What is 2 + 2?"}
]

In [13]:
# To give you a preview -- calling OpenAI with system and user messages:

response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)

Oh, we're diving deep into the world of advanced mathematics, huh? Well, brace yourself: 2 + 2 equals a solid 4. Shocking, I know!


## And now let's build useful messages for GPT-4o-mini, using a function

In [14]:
# See how this function creates exactly the format above

def messages_for(website):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt_for(website)}
    ]

In [15]:
# Try this out, and then try for a few more websites

messages_for(ed)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': 'You are looking at a website titled Google\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nAbout\nStore\nGmail\nImages\nSign in\nSee more\nDelete\nDelete\nReport inappropriate predictions\nAdvertising\nBusiness\nHow Search works\nApplying AI towards science and the environment\nPrivacy\nTerms\nSettings\nSearch settings\nAdvanced search\nYour data in Search\nSearch history\nSearch help\nSend feedback\nDark theme: Off\nGoogle apps'}]

## Time to bring it together - the API for OpenAI is very simple!

In [16]:
# And now: call the OpenAI API. You will get very familiar with this!

def summarize(url):
    website = Website(url)
    response = openai.chat.completions.create(
        model = "gpt-4o-mini",
        messages = messages_for(website)
    )
    return response.choices[0].message.content

In [17]:
summarize("https://edwarddonner.com")

"# Summary of Edward Donner's Website\n\nEdward Donner's website serves as a personal hub for his professional interests and activities, primarily focusing on his work with language models (LLMs) and AI technology. As the co-founder and CTO of Nebula.io, he aims to leverage AI in talent management to help individuals discover their potential. Previously, he founded the AI startup untapt, which was acquired in 2021.\n\nThe website features his enthusiasm for coding, electronic music, and community engagement through platforms like Hacker News. \n\n## Announcements\n- **May 28, 2025**: Launching courses aimed at training individuals to become experts and leaders in LLM technology.\n- **May 18, 2025**: Hosting a 2025 AI Executive Briefing.\n- **April 21, 2025**: Introducing the Complete Agentic AI Engineering Course.\n- **January 23, 2025**: Offering a hands-on LLM Workshop focused on agents and associated resources. \n\nVisitors are encouraged to connect with him and stay updated through

In [18]:
# A function to display this nicely in the Jupyter output, using markdown

def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [19]:
display_summary("https://edwarddonner.com")

# Website Summary: Edward Donner

**Overview**  
This website belongs to Ed Donner, a technology enthusiast focused on coding and experiments with large language models (LLMs). He is the co-founder and CTO of Nebula.io, a company leveraging AI to assist individuals in discovering their potential and improving talent management for recruiters. Previously, he founded the AI startup untapt, which was acquired in 2021.

**Interests**  
Ed has interests in DJing and electronic music production, although he notes he is out of practice. He often engages with the Hacker News community.

**Key Offerings**  
- **Connect Four**: An arena designed for LLMs to compete in diplomacy and strategy.
- Expertise in proprietary LLMs verticalized for talent with patented matching models.

**Recent Announcements**  
- **May 28, 2025**: Introduction of courses aimed at developing leaders and experts in LLMs.
- **May 18, 2025**: Announcement of the 2025 AI Executive Briefing.
- **April 21, 2025**: Launch of "The Complete Agentic AI Engineering Course."
- **January 23, 2025**: Hosting of an LLM workshop focusing on hands-on experience with agent technologies. 

**Connect**  
Visitors are encouraged to connect with Ed and subscribe to his newsletter for further updates.

# Let's try more websites

Note that this will only work on websites that can be scraped using this simplistic approach.

Websites that are rendered with JavaScript, like React apps, won't show up. See the community-contributions folder for a Selenium implementation that gets around this. You'll need to read up on installing Selenium (ask ChatGPT!)

Also, websites protected with CloudFront (and similar) may give 403 errors - many thanks Andy J for pointing this out.

But many websites will work just fine!
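To see why JavaScript-rendered sites come back empty, here is a minimal sketch that uses only the standard library as a stand-in for the BeautifulSoup approach above. The `VisibleTextParser` class, the `visible_text` helper and both page strings are invented for this illustration. The idea is the same: the text has to be present in the HTML that the server sends, because nothing here runs the page's scripts.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text that sits outside <script> and <style> tags."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the page's visible text, one chunk per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical pages, made up for this illustration
static_page = "<html><body><h1>Welcome</h1><p>All the text is in the HTML.</p></body></html>"
react_shell = "<html><body><div id='root'></div><script>renderApp()</script></body></html>"

print(repr(visible_text(static_page)))  # text is found in the raw HTML
print(repr(visible_text(react_shell)))  # empty: the content would be built later by the script
```

The second page yields an empty string, so the model would have nothing to summarize. A browser-driving tool such as Selenium runs the script first, which is why it can get around this.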

In [20]:
display_summary("https://cnn.com")

# Website Summary: CNN 

CNN is a global news platform that covers a wide range of topics including U.S. and international news, politics, business, health, entertainment, science, and sports. It provides timely updates on significant events and issues worldwide.

## Key News Highlights:
- **Trump Administration Actions**: The Trump administration has been involved in several controversial actions including a ban on visas for new international students at Harvard and various immigration policies affecting migrant children and adults.
- **Israel-Hamas Conflict**: Reports highlight tensions in Gaza, with analysis pointing to Israeli gunfire during aid operations, amid ongoing violence in the region.
- **Ukraine-Russia War**: Ongoing coverage of the war includes reports on drone strikes affecting key Russian air bases and the rising casualty figures.
- **Crowd Crush Incident in India**: A tragic crowd crush occurred outside a cricket stadium, resulting in at least 11 deaths.
- **Cultural Moments**: Celebrations and reunions from the iconic film "Back to the Future," including commentary from its stars, are highlighted in entertainment news.

CNN also features a variety of other content, including investigative journalism, opinion pieces, personal stories, and multimedia content such as videos and podcasts. The platform encourages user feedback and engagement regarding advertisements and technical experiences. 

Overall, CNN aims to inform and engage its audience with up-to-date reporting on critical global issues.

In [21]:
display_summary("https://anthropic.com")

# Summary of Anthropic Website

The **Anthropic** website primarily focuses on its AI model, **Claude**, and related products, emphasizing safety and human benefit in AI development. 

### Key Sections:

- **Claude Models**: 
  - Introduction of **Claude Opus 4**, the latest and most advanced AI model.
  - Other models mentioned include **Claude Sonnet 4** and **Claude Haiku 3.5**, noted for their capabilities in coding and AI applications.

- **Research & Initiatives**: 
  - Emphasis on AI safety and the responsible development of AI technologies.
  - Topics like the **Anthropic Economic Index** and a commitment to transparency are highlighted.
  - The site discusses **alignment science** and interpretability, showcasing models' internal processes.

- **News & Updates**:
  - Announcement of **ISO 42001 certification**, reflecting their commitment to quality and safety.
  - Content on future updates and capabilities of upcoming models.

- **Learning & Development**:
  - Features the **Anthropic Academy** for building and learning about AI applications using Claude.
  - Resources aimed at developers to explore building with Claude and APIs.

- **Corporate Information**: 
  - Information about the company's mission to ensure that AI serves humanity positively, emphasizing the importance of considering societal impacts.

Overall, the website reflects Anthropic's focus on advanced AI technology while prioritizing safety, transparency, and ethical considerations in the development and application of AI systems.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business applications</h2>
            <span style="color:#181;">In this exercise, you experienced calling the Cloud API of a Frontier Model (a leading model at the frontier of AI) for the first time. We will be using APIs like OpenAI at many stages in the course, in addition to building our own LLMs.

More specifically, we've applied this to Summarization - a classic Gen AI use case to make a summary. This can be applied to any business vertical - summarizing the news, summarizing financial performance, summarizing a resume in a cover letter - the applications are limitless. Consider how you could apply Summarization in your business, and try prototyping a solution.</span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you continue - now try yourself</h2>
            <span style="color:#900;">Use the cell below to make your own simple commercial example. Stick with the summarization use case for now. Here's an idea: write something that will take the contents of an email, and will suggest an appropriate short subject line for the email. That's the kind of feature that might be built into a commercial email tool.</span>
        </td>
    </tr>
</table>

In [24]:
# Step 1: Create your prompts

system_prompt = "You are an assistant that analyzes the contents of emails \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."
user_prompt = """
The AI Engineering Stack
1 message
The Pragmatic Engineer <pragmaticengineer+deepdives@substack.com> Tue, May 20, 2025 at 8:28 AM
Reply-To: The Pragmatic Engineer
<reply+2pncgo&4kt3&&b89235ca75ab8ca4d63b517b854088e961d0ffce9a44ddd303cb543b0f2e2bf1@mg1.substack.com>
To: me@gmail.com
Forwarded this email? Subscribe here for more
Hi – this is Gergely with the monthly, free issue of the Pragmatic Engineer
Newsletter. In every issue, I cover challenges at Big Tech and startups
through the lens of engineering managers and senior engineers. If you’ve
been forwarded this email, you can subscribe here.
Many subscribers expense this newsletter to their learning and development
budget. If you have such a budget, here’s an email you could send to your
manager.
The AI Engineering Stack
Three layers of the AI stack, how AI engineering is different from ML
engineering and fullstack engineering, and more. An excerpt from
the book AI Engineering by Chip Huyen
GERGELY OROSZ AND CHIP HUYEN
MAY 20
READ IN APP
6/4/25, 8:41 PM Gmail - The AI Engineering Stack
https://mail.google.com/mail/u/0/?ik=6c5510f221&view=pt&search=all&permthid=thread-f:1832653853260371304&simpl=msg-f:1832653853260371304 1/22
Before we start: on Monday, 16 June, I’ll be recording an episode of the
Pragmatic Engineer podcast live at the LDX3 conference in London, with
special guest, Farhan Thawar, who is Shopify’s Head of Engineering. It’s the
closing session of the conference on that day, and it’d be great if you can join,
should you be at the event. During the live pod recording, Farhan and me will
cover:
How Shopify’s “Reflexive AI usage” approach is changing how their
engineering team works
How Shopify iterates as fast as it does as a full-remote company
An unconventional approach to engineering career growth: mastery and
craft
… and more on how one of the most unconventional tech companies
operates and gets stuff done
If you can, why not join us live at LDX3. I will also deliver a keynote at the
conference, and you can meet the The Pragmatic Engineer team, including
myself, Elin, and Dominic, too. If you won’t be there, the recording will be
published as an episode of The Pragmatic Engineer Podcast after the event.
With that, let’s get into the AI Engineering Stack.
“AI Engineering” is a term that didn’t even exist two years ago, but today, AI
engineers are in high demand. Companies like Meta, Google, and Amazon,
offer higher base salaries for these roles than “regular” software engineers
get, while AI startups and scaleups are scrambling to hire them.
However, closer inspection reveals AI engineers are often regular software
engineers who have mastered the basics of large language models (LLM),
such as working with them and integrating them.
So far, the best book I’ve found on this hot topic is AI Engineering by Chip
Huyen, published in January by O’Reilly. Chip has worked as a researcher at
Netflix, was a core developer at NVIDIA (building NeMo, NVIDIA’s GenAI
framework), and cofounded Claypot AI. She has also taught machine learning
(ML) at Stanford University.
In February, we published a podcast episode with Chip about what AI
engineering is, how it differs from ML engineering, and the techniques AI
engineers should be familiar with.
For this article, I asked Chip if she would be willing to share an excerpt of her
book, and she has generously agreed. This covers what an AI engineering
stack looks like: the one us software engineers must become expert in order
to be an AI engineer.
My AI Engineering book by Chip Huyen
In today’s issue we get into:
1. Three layers of the AI stack. Application development, model
development, infrastructure.
2. AI engineering versus ML engineering. Similarities and differences,
including how inference optimization evaluation matters more in AI
engineering, and ML knowledge being more of a nice-to-have and less
of a must-have.
3. Application development in AI engineering. The three main focus
areas: evaluation, prompt engineering, and AI interfaces.
4. AI Engineering versus full-stack engineering. “AI engineering is just
software engineering with AI models thrown in the stack.”
If you find this excerpt useful, you’ll likely get value from the rest of the book,
which can be purchased as an ebook or a physical copy. This newsletter has
also published some deepdives which you may find useful for getting into AI
engineering:
AI Engineering in the real world – stories from 7 software engineers turned AI engineers
Building, launching, and scaling ChatGPT Images – insights from
OpenAI’s engineering team
Building Windsurf – and the engineering challenges behind it
As with all recommendations in this newsletter, I have not been paid to
mention this book, and no links in this article are affiliates. For more details,
see my ethics statement.
The bottom of this article could be cut off in some email clients. Read the full
article uninterrupted, online.
Read the full article online
With that, let’s get into it:
This excerpt is from Chapter 1 of AI Engineering, by Chip Huyen. Copyright ©
2025 Chip Huyen. Published by O'Reilly Media, Inc. Used with permission.
AI engineering’s rapid growth has induced an incredible amount of hype and
FOMO (fear of missing out). The number of new tools, techniques, models,
and applications introduced every day can be overwhelming. Instead of trying
to keep up with these constantly shifting sands, let’s inspect the fundamental
building blocks of AI engineering.
To understand AI engineering, it’s important to recognize that AI engineering
evolved out of ML engineering. When a company starts experimenting with
foundation models, it’s natural that its existing ML team should lead the effort.
Some companies treat AI engineering the same as ML engineering, as shown
in Figure 1-12.
Figure 1-12. Many companies put AI engineering and ML
engineering under the same umbrella, as shown in job
headlines on LinkedIn from December 17, 2023.
Some companies have separate job descriptions for AI engineering, as
shown in Figure 1-13.
Regardless of where organizations position AI engineers and ML engineers,
their roles significantly overlap. Existing ML engineers can add AI engineering
to their list of skills to enhance their job prospects, and there are also AI
engineers with no ML experience.
To best understand AI engineering and how it differs from traditional ML
engineering, the following section breaks down the different layers of the AI
application building process, and looks at the role each layer plays in AI and
ML engineering.
Figure 1-13. Some companies have separate job descriptions
for AI engineering, as shown in the job headlines on LinkedIn
from December 17, 2023.
1. Three layers of the AI Stack
There are three layers to any AI application stack: application development,
model development, and infrastructure. When developing an AI application,
you’ll likely start from the top layer and move downwards as needed:
Application development
With models so readily available, anyone can use them to develop
applications. This is the layer that has seen the most action in the last two
years, and it’s still rapidly evolving. Application development involves
providing a model with good prompts and necessary context. This layer
requires rigorous evaluation and good applications demand good interfaces.
Model development
This layer provides tooling for developing models, including frameworks for
modeling, training, fine-tuning, and inference optimization. Because data is
central to model development, this layer also contains dataset engineering.
Model development also requires rigorous evaluation.
Infrastructure
At the bottom of the stack is infrastructure, which includes tooling for model
serving, managing data and compute, and monitoring.
The three layers, and examples of responsibilities for each one, are shown
below:
Figure 1-14. Three layers of the AI engineering stack
To get a sense of how the landscape has evolved with foundation models, in
March 2024, I searched GitHub for all AI-related repositories with at least 500
stars. Given the prevalence of GitHub, I believe this data is a good proxy for
understanding the ecosystem. In my analysis, I also included repositories for
applications and models, which are the products of the application
development and model development layers, respectively. I found a total of
920 repositories. Figure 1-15 shows the cumulative number of repositories in
each category month by month.
Figure 1-15. Cumulative count of repositories by category over
time
The data shows a big jump in the number of AI toolings in 2023, after the
introduction of Stable Diffusion and ChatGPT. That year, the categories which
saw the biggest increases were applications and application development.
The infrastructure layer saw some growth, but much less than in other layers.
This is expected: even though models and applications have changed, the
core infrastructural needs of resource management, serving, monitoring, etc.,
remain the same.
This brings us to the next point. While the level of excitement and creativity
around foundation models is unprecedented, many principles of building AI
applications are unchanged. For enterprise use cases, AI applications still
need to solve business problems, and, therefore, it’s still essential to map
from business metrics to ML metrics, and vice versa, and you still need to do
systematic experimentation. With classical ML engineering, you experiment
with different hyperparameters. With foundation models, you experiment with
different models, prompts, retrieval algorithms, sampling variables, and more.
We still want to make models run faster and cheaper. It’s still important to set
up a feedback loop so we can iteratively improve our applications with
production data.
This means that much of what ML engineers have learned and shared over
the last decade is still applicable. This collective experience makes it easier
for everyone to begin building AI applications. However, built on top of these
enduring principles are many innovations unique to AI engineering.
2. AI engineering versus ML engineering
While the unchanging principles of deploying AI applications are reassuring,
it’s also important to understand how things have changed. This is helpful for
teams that want to adapt their existing platforms for new AI use cases, and
for developers interested in which skills to learn in order to stay competitive in
a new market.
At a high level, building applications using foundation models today differs
from traditional ML engineering in three major ways:
1. Without foundation models, you have to train your own models for
applications. With AI engineering, you use a model someone else has
trained. This means AI engineering focuses less on modeling and
training, and more on model adaptation.
2. AI engineering works with models that are bigger, consume more
compute resources, and incur higher latency than traditional ML
engineering. This means there’s more pressure for efficient training and
inference optimization. A corollary of compute-intensive models is that
many companies now need more GPUs and work with bigger compute
clusters than previously, which means there’s more need for engineers
who know how to work with GPUs and big clusters. [A head of AI at a
Fortune 500 company told me his team knows how to work with 10
GPUs, but not how to work with 1,000 GPUs.]
3. AI engineering works with models that can produce open-ended outputs,
which provide models with the flexibility for more tasks, but are also
harder to evaluate. This makes evaluation a much bigger problem in AI
engineering.
In short, AI engineering differs from ML engineering in that it’s less about
model development, and more about adapting and evaluating models. I’ve
mentioned model adaptation several times, so before we move on, I want to
ensure we’re on the same page about what “model adaptation” means. In
general, model adaptation techniques can be divided into two categories,
depending on whether they require updating model weights:
Prompt-based techniques, which include prompt engineering, adapt a
model without updating the model weights. You adapt a model by giving it
instructions and context, instead of changing the model itself. Prompt
engineering is easier to get started on and requires less data. Many
successful applications have been built with just prompt engineering. Its ease
of use allows you to experiment with more models, which increases the
chance of finding a model that is unexpectedly good for an application.
However, prompt engineering might not be enough for complex tasks, or
applications with strict performance requirements.
Fine-tuning, on the other hand, requires updating model weights. You
adapt a model by making changes to the model itself. In general, fine-tuning
techniques are more complicated and require more data, but they can
significantly improve a model’s quality, latency, and cost. Many things aren’t
possible without changing model weights, such as adapting a model to a new
task it wasn’t exposed to during training.
Now, let’s zoom into the application development and model development
layers to see how each has changed with AI engineering, starting with what
ML engineers are more familiar with. This section gives an overview of
different processes involved in developing an AI application.
Model development
Model development is the layer most commonly associated with traditional
ML engineering. It has three main responsibilities: modeling and training,
dataset engineering, and inference optimization. Evaluation is also required,
but because most people come across it first in the application development
layer, it is covered there.
Modeling and training
Modeling and training refers to the process of coming up with a model
architecture, training it, and fine-tuning it. Examples of tools in this category
are Google’s TensorFlow, Hugging Face’s Transformers, and Meta’s PyTorch.
Developing ML models requires specialized ML knowledge. It requires
knowing different types of ML algorithms such as clustering, logistic
regression, decision trees, and collaborative filtering, and also neural network
architectures such as feedforward, recurrent, convolutional, and transformer.
It also requires understanding of how a model learns, including concepts
such as gradient descent, loss function, regularization, etc.
With the availability of foundation models, ML knowledge is no longer a must-have for building AI applications. I’ve met many wonderful, successful AI
application builders who aren’t at all interested in learning about gradient
descent. However, ML knowledge is still extremely valuable, as it expands the
set of tools you can use, and helps with trouble-shooting when a model
doesn’t work as expected.
Differences between training, pre-training, fine-tuning, and
post-training
Training always involves changing model weights, but not all changes to
model weights constitute training. For example, quantization, the process of
reducing the precision of model weights, technically changes the model’s
weight values but isn’t considered training.
The term “training” can often be used in place of pre-training, fine-tuning, and
post-training, which refer to different phases:
Pre-training refers to training a model from scratch; the model weights are
randomly initialized. For LLMs, pre-training often involves training a model for
text completion. Out of all training steps, pre-training is often the most
resource-intensive by a long shot. For the InstructGPT model, pre-training
takes up to 98% of the overall compute and data resources. Pre-training also
takes a long time. A small mistake during pre-training can incur a significant
financial loss and set back a project significantly. Due to the resource-intensive nature of pre-training, it has become an art that only a few practice.
Those with expertise in pre-training large models, however, are highly sought
after [and attract incredible compensation packages].
Fine-tuning means continuing to train a previously-trained model; model
weights are obtained from the previous training process. Since a model
already has certain knowledge from pre-training, fine-tuning typically requires
fewer resources like data and compute than pre-training does.
Post-training. Many people use post-training to refer to the process of
training a model after the pre-training phase. Conceptually, post-training and
fine-tuning are the same and can be used interchangeably. However,
sometimes, people use them differently to signify the different goals. It’s
usually post-training when done by model developers. For example, OpenAI
might post-train a model to make it better at following instructions before
releasing it.
It’s fine-tuning when it’s done by application developers. For example, you
might fine-tune an OpenAI model which has been post-trained in order to
adapt it to your needs.
Pre-training and post-training make up a spectrum, and their processes and
toolings are very similar.
[Footnote: If you think the terms “pre-training” and “post-training” lack
imagination, you’re not alone. The AI research community is great at many
things, but naming isn’t one of them. We already talked about how “large
language model” is hardly a scientific term because of the ambiguity of the
word “large”. And I really wish people would stop publishing papers with the
title “X is all you need.”]
Some people use “training” to refer to prompt engineering, which isn’t correct.
I read a Business Insider article in which the author said she’d trained
ChatGPT to mimic her younger self. She did so by feeding her childhood
journal entries into ChatGPT. Colloquially, the author’s usage of “training”
is correct, as she’s teaching the model to do something. But technically, if you
teach a model what to do via the context input into it, then that is prompt
engineering. Similarly, I’ve seen people use “fine-tuning” to describe prompt
engineering.
Dataset engineering
Dataset engineering refers to curating, generating, and annotating data
needed for training and adapting AI models.
In traditional ML engineering, most use cases are close-ended: a model’s
output can only be among predefined values. For example, spam
classification with only two possible outputs of “spam” and “not spam”, is
close-ended. Foundation models, however, are open-ended. Annotating
open-ended queries is much harder than annotating close-ended queries; it’s
easier to determine whether an email is spam than it is to write an essay. So
data annotation is a much bigger challenge for AI engineering.
Another difference is that traditional ML engineering works more with tabular
data, whereas foundation models work with unstructured data. In AI
engineering, data manipulation is more about deduplication, tokenization,
context retrieval, and quality control, including removing sensitive information
and toxic data.
Many people argue that because models are now commodities, data is the
main differentiator, making dataset engineering more important than ever.
How much data you need depends on the adapter technique you use.
Training a model from scratch generally requires more data than fine-tuning
does, which in turn requires more data than prompt engineering.
Regardless of how much data you need, expertise in data is useful when
examining a model, as its training data gives important clues about its
strengths and weaknesses.
Inference optimization
Inference optimization means making models faster and cheaper. Inference
optimization has always been important for ML engineering. Users never
reject faster models, and companies can always benefit from cheaper
inference. However, as foundation models scale up to incur ever-higher
inference cost and latency, inference optimization has become even more
important.
One challenge of foundation models is that they are often autoregressive:
tokens are generated sequentially. If it takes 10ms for a model to generate a
token, it’ll take a second to generate an output of 100 tokens, and even more
for longer outputs. As users are notoriously impatient, getting AI applications’
latency down to the 100ms latency expected of a typical internet application
is a huge challenge. Inference optimization has become an active subfield in
both industry and academia.
A summary of how the importance of categories of model development
changes with AI engineering:
Table 1-4. How different responsibilities of model development
have changed with foundation models
3. Application development in AI engineering
With traditional ML engineering where teams build applications using their
proprietary models, the model quality is a differentiator. With foundation
models, where many teams use the same model, differentiation must be
gained through the application development process.
The application development layer consists of these responsibilities:
evaluation, prompt engineering, and AI interface.
Evaluation
Evaluation is about mitigating risks and uncovering opportunities, and is
necessary throughout the whole model adaptation process. Evaluation is
needed to select models, benchmark progress, determine whether an
application is ready for deployment, and to detect issues and opportunities for
improvement in production.
While evaluation has always been important in ML engineering, it’s even
more important with foundation models, for many reasons. To summarize,
these challenges arise chiefly from foundation models’ open-ended nature
and expanded capabilities. For example, in close-ended ML tasks like fraud
detection, there are usually expected ground truths which you can compare a
model’s outputs against. If output differs from expected output, you know the
model is wrong. For a task like chatbots, there are so many possible
responses to each prompt that it is impossible to curate an exhaustive list of
ground truths to compare a model’s response to.
The existence of so many adaptation techniques also makes evaluation
harder. A system that performs poorly with one technique might perform much
better with another. When Google launched Gemini in December 2023, they
claimed Gemini was better than ChatGPT in the MMLU benchmark
(Hendrycks et al., 2020). Google had evaluated Gemini using a prompt
engineering technique called CoT@32. In this technique, Gemini was shown
32 examples, while ChatGPT was shown only 5 examples. When both were
shown five examples, ChatGPT performed better, as shown below:
Table 1-5. Different prompts can cause models to perform very
differently, as seen on Gemini’s technical report (December
2023)
Prompt engineering and context construction
Prompt engineering is about getting AI models to express desirable behaviors
from the input alone, without changing the model weights. The Gemini
evaluation story highlights the impact of prompt engineering on model
performance. By using a different prompt engineering technique, Gemini
Ultra’s performance on MMLU went from 83.7% to 90.04%.
It’s possible to get a model to do amazing things with just prompts. The right
instructions can get a model to perform a task you want in the format of your
choice. Prompt engineering is not just about telling a model what to do. It’s
also about giving the model the necessary context and tools for a given task.
For complex tasks with long context, you might also need to provide the
model with a memory management system, so the model can keep track of
its history.
AI interface
AI interface means creating an interface for end users to interact with AI
applications. Before foundation models, only organizations with sufficient
resources to develop AI models could develop AI applications. These
applications were often embedded into organizations’ existing products. For
example, fraud detection was embedded into Stripe, Venmo, and PayPal.
Recommender systems were part of social networks and media apps like
Netflix, TikTok, and Spotify.
With foundation models, anyone can build AI applications. You can serve your
AI applications as standalone products, or embed them into other products,
including products developed by others. For example, ChatGPT and
Perplexity are standalone products, whereas GitHub’s Copilot is commonly
used as a plug-in in VSCode, while Grammarly is commonly used as a
browser extension for Google Docs. Midjourney can be used via its
standalone web app, or its integration in Discord.
There need to be tools that provide interfaces for standalone AI applications,
or which make it easy to integrate AI into existing products. Here are some
interfaces that are gaining popularity for AI applications:
Standalone web, desktop, and mobile apps. [Streamlit, Gradio, and
Plotly Dash are common tools for building AI web apps.]
Browser extensions that let users quickly query AI models while
browsing.
Chatbots integrated into chat apps like Slack, Discord, WeChat, and
WhatsApp.
Many products, including VSCode, Shopify, and Microsoft 365, provide
APIs that let developers integrate AI into their products as plug-ins and
add-ons. These APIs can also be used by AI agents to interact with the
world.
While the chat interface is the most commonly used, AI interfaces can also be
voice-based, such as voice assistants, or they can be embodied as with
augmented and virtual reality.
These new AI interfaces also mean new ways to collect and extract user
feedback. The conversation interface makes it so much easier for users to
give feedback in natural language, but this feedback is harder to extract.
A summary of how the importance of different categories of app development
changes with AI engineering:
Table 1-6. The importance of different categories in app
development for AI engineering and ML engineering
4. AI Engineering versus full-stack engineering
The increased emphasis on application development, especially on
interfaces, brings AI engineering closer to full-stack development. [Footnote:
7 Anton Bacaj told me: “AI engineering is just software engineering with AI
models thrown in the stack.”]
The growing importance of interfaces leads to a shift in the design of AI
toolings to attract more frontend engineers. Traditionally, ML engineering is
Python-centric. Before foundation models, the most popular ML frameworks
supported mostly Python APIs. Today, Python is still popular, but there is also
increasing support for JavaScript APIs, with LangChain.js, Transformers.js,
OpenAI’s Node library, and Vercel’s AI SDK.
While many AI engineers come from traditional ML backgrounds, more
increasingly come from web development or full-stack backgrounds. An
advantage that full-stack engineers have over traditional ML engineers is their
ability to quickly turn ideas into demos, get feedback, and iterate.
With traditional ML engineering, you usually start with gathering data and
training a model. Building the product comes last. However, with AI models
readily available today, it’s possible to start with building the product first, and
only invest in data and models once the product shows promise, as
visualized in Figure 1-16.
Figure 1-16. The new AI engineering workflow rewards those
who can iterate fast. Image recreated from “The Rise of the AI
Engineer” (Shawn Wang, 2023).
In traditional ML engineering, model development and product development
are often disjointed processes, with ML engineers rarely involved in product
decisions at many organizations. However, with foundation models, AI
engineers tend to be much more involved in building the product.
Summary
I intend this chapter to serve two purposes. One is to explain the emergence
of AI engineering as a discipline, thanks to the availability of foundation
models. The second is to give an overview of the process of building
applications on top of these models. I hope this chapter achieves this. As an
overview, it only lightly touches on many concepts, which are explored further
in the book.
The rapid growth of AI engineering is motivated by the many applications
enabled by the emerging capabilities of foundation models. We’ve covered
some of the most successful application patterns, for both consumers and
enterprises. Despite the incredible number of AI applications already in
production, we’re still in the early stages of AI engineering, with countless
more innovations yet to be built.
While AI engineering is a new term, it has evolved out of ML engineering,
which is the overarching discipline involved in building applications with all ML
models. Many principles of ML engineering are applicable to AI engineering.
However, AI engineering also brings new challenges and solutions.
One aspect of AI engineering that is challenging to capture in words is the
incredible collective energy, creativity, and engineering talent of the
community. This enthusiasm can often be overwhelming, as it’s impossible to
keep up-to-date with new techniques, discoveries, and engineering feats that
seem to happen constantly.
One consolation is that since AI is great at information aggregation, it can
help us aggregate and summarize all these new updates! But tools can help
only to a certain extent; the more overwhelming a space is, the more
important it is to have a framework to help navigate it. This book aims to
provide such a framework.
The rest of the book explores this framework step-by-step, starting with the
fundamental building block of AI engineering: foundation models that make so
many amazing applications possible.
Gergely, again. Thanks, Chip, for sharing this excerpt. To go deeper in this
topic, consider picking up her book, AI Engineering, in which many topics
mentioned in this article are covered in greater depth, including:
Sampling variables are discussed in Chapter 2
Pre-training and post-training differences are explored further in
Chapters 2 and 7
The challenges of evaluating foundation models are discussed in
Chapter 3
Chapter 5 discusses prompt engineering, and Chapter 6 discusses
context construction
Inference optimization techniques, including quantization, distillation,
and parallelism, are discussed in Chapters 7 through 9
Dataset engineering is the focus of Chapter 8
User feedback design is discussed in Chapter 10
“AI Engineering” is surprisingly easy for software engineers to pick up.
When LLMs went mainstream in late 2022, I briefly assumed that to work in
the AI field, one needed to be an ML researcher. This is still true if you want to
work in areas like foundational model research. However, most AI
engineering positions at startups, scaleups and Big Tech, are about building
AI applications on top of AI APIs, or self-hosted LLMs.
Most of the complexity lies in the building of applications, not the LLM model
part. That’s not to say there are not a bunch of new things to learn in order to
become a standout AI engineer, but it’s still very much doable, and many
engineers are making the switch.
I hope you enjoyed this summary of the AI engineering stack. For more
deepdives on AI engineering, check out:
AI Engineering in the real world – Hands-on examples and learnings
from software engineers turned “AI engineers” at seven companies
Building, launching, and scaling ChatGPT Images – ChatGPT
Images became one of the largest AI engineering projects in the world,
overnight. But how was it built? A deepdive with OpenAI’s engineering
team
Building Windsurf – the engineering challenges of building an AI
product serving hundreds of billions of tokens per day.
AI Engineering with Chip Huyen – What is AI engineering, and how to
get started with it? Podcast with Chip.
A guest post by
Chip Huyen
I work at the intersection of AI, education, and storytelling. I help
companies deploy AI applications in production. I'm the author of
Designing Machine Learning Systems and AI Engineering, but I
also write novels and stuff.
Subscribe to Chip
You’re on the free list for The Pragmatic Engineer. For the full
experience, become a paying subscriber. Many readers expense this
newsletter within their company’s training/learning/development budget.
Upgrade to paid
This post is public, so feel free to share and forward it.
Share The Pragmatic Engineer
If you enjoyed this post, you might enjoy my book, The Software Engineer's
Guidebook. Here is what Tanya Reilly, senior principal engineer and author
of The Staff Engineer's Path said about it:
"From performance reviews to P95 latency, from team dynamics to
testing, Gergely demystifies all aspects of a software career. This book is
well named: it really does feel like the missing guidebook for the
whole industry."
Get The Software Engineer's Guidebook
© 2025 Gergely Orosz
548 Market Street PMB 72296, San Francisco, CA 94104
Unsubscribe
"""

# Step 2: Make the messages list
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

# Step 3: Call OpenAI

response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)

# Step 4: print the result
summary = response.choices[0].message.content
display(Markdown(summary))

# Summary of the Email: "The AI Engineering Stack"

- **Sender**: The Pragmatic Engineer (Gergely Orosz)
- **Date**: May 20, 2025
- **Topic**: Overview of AI Engineering and its distinctions from traditional ML engineering, featuring insights from Chip Huyen's book *AI Engineering*.

## Key Points:
- **AI Engineering Stack**: Described in terms of three layers:
  - **Application Development**: Focuses on using models to develop applications, emphasizing prompt engineering and user interfaces.
  - **Model Development**: Involves creating and optimizing models, less about traditional training and more about model adaptation.
  - **Infrastructure**: Provides the necessary tools for model serving and data management.

- **Differences Between AI and ML Engineering**:
  - AI engineering prioritizes adaptation of existing large models over building new ones.
  - Works with larger models requiring significant computing resources, which impacts latency and performance.

- **Emerging Interfacing Techniques**: New interfaces for AI applications have emerged, allowing for easier integration and user interaction, pushing AI engineering toward a full-stack development approach.

- **Call to Action**: Encourages readers to dive deeper into AI engineering, highlighting the evolving landscape of roles, responsibilities, and skills in this growing field.

## Additional Information:
- Upcoming events include a live podcast recording at the LDX3 conference in London featuring discussions on AI at Shopify.
- A recommendation to explore Chip Huyen's book for more comprehensive learning on AI engineering topics. 

This email serves to enlighten subscribers about the increasing significance of AI engineering and how professionals can adapt to harness this emerging field.

## An extra exercise for those who enjoy web scraping

You may notice that if you try `display_summary("https://openai.com")` - it doesn't work! That's because OpenAI has a fancy website that uses JavaScript. There are many ways around this that some of you might be familiar with. For example, Selenium is a hugely popular framework that runs a browser behind the scenes, renders the page, and allows you to query it. If you have experience with Selenium, Playwright or similar, then feel free to improve the Website class to use them. In the community-contributions folder, you'll find an example Selenium solution from a student (thank you!)
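
If you'd like a starting point, here is a minimal sketch of the idea, not a definitive solution. The names `fetch_rendered_html` and `visible_text` are hypothetical helpers (they are not part of the course code), and the sketch assumes you have installed the `selenium` package and have Chrome available locally. A headless browser renders the page, and the standard library's HTML parser then strips scripts and styles so only readable text remains.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content, skipping tags whose content readers never see."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html):
    """Hypothetical helper: return the readable text of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def fetch_rendered_html(url):
    """Hypothetical helper: load a page in headless Chrome and return its HTML."""
    # Imported here so visible_text() still works without selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()
```

You could then pass `visible_text(fetch_rendered_html(url))` into the same user prompt that the Website class text goes into. Pages that load content slowly may also need an explicit wait before reading `page_source`, and some sites block automated browsers, so treat this as an experiment.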

# Sharing your code

I'd love it if you share your code afterwards so I can share it with others! You'll notice that some students have already made changes (including a Selenium implementation) which you will find in the community-contributions folder. If you'd like to add your changes to that folder, submit a Pull Request with your new versions in that folder and I'll merge your changes.

If you're not an expert with git (and I am not!) then GPT has given some nice instructions on how to submit a Pull Request. It's a bit of an involved process, but once you've done it once it's pretty clear. As a pro-tip: it's best if you clear the outputs of your Jupyter notebooks (Edit >> Clear Outputs of All Cells, and then Save) for clean notebooks.

Here are good instructions courtesy of an AI friend:  
https://chatgpt.com/share/677a9cb5-c64c-8012-99e0-e06e88afd293
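
If you'd rather script the pro-tip about clearing outputs, here is a small sketch. It assumes the standard notebook JSON layout (a top-level `cells` list, with `outputs` and `execution_count` on code cells), and the function names `clear_outputs` and `clear_outputs_in_file` are hypothetical helpers, not part of Jupyter.

```python
import json


def clear_outputs(notebook):
    """Reset outputs and execution counts on code cells, in place, and return the dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_outputs_in_file(path):
    """Load a .ipynb file, clear its outputs, and write it back to the same path."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_outputs(notebook)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
        f.write("\n")
```

The file is overwritten, so try it on a copy of your notebook first.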