# Generating text with Large Language Models (LLMs)

`GPT-4o` is one of several sophisticated large language models (LLMs) built by [OpenAI](https://openai.com/), a San Francisco-based company whose mission is to "ensure that artificial general intelligence benefits all of humanity." `GPT-4o` can generate human-like prose by responding to prompts written in the [Chat Markup Language](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/chatgpt), or ChatML for short.

Here are a few examples demonstrating how to leverage `GPT-4o`'s text-generation capabilities. Start by asking `GPT-4o` to describe molecular biology in the style of Dr. Seuss. Run this cell several times and you'll get a different result each time. Set the `temperature` parameter to 0, however, and the results will be the same most of the time:

In [None]:
from openai import OpenAI

client = OpenAI(api_key='OPENAI_API_KEY')

messages = [{
    'role': 'user',
    'content': 'Describe molecular biology in the style of Dr. Seuss'
}]

response = client.chat.completions.create(
    model='gpt-4o',
    messages=messages
)

print(response.choices[0].message.content)

In the land of the cell, where the wee enzymes dwell,  
Lived molecules grand, with a story to tell.  
With wibbles and wobbles and twists all around,  
In the nucleus townhouse, DNA could be found.  

Now DNA, you see, is a marvelous thing,  
A helical ladder, with bases that cling.  
A, T, C, and G, all in a line,  
In nuclear splendor, they truly do shine.  

Proteins, my friend, are the workers you'll meet,  
Built from amino acids, a structural feat!  
They wobble and weave in three dimensions and more,  
Folding with flair, oh the shapes they explore!  

Messenger RNA, with its ribbon so bright,  
Carries the message from morning to night.  
From nucleus city to the ribosome's door,  
Translating instructions, to protein they soar.  

Ribosomes, oh joy, are the factories keen,  
Constructing proteins like a sewing machine!  
They take mRNA with a tick and a tock,  
And stitch together chains, by the clock!  

Enzymes are zipping, unzipping with ease,  
Speeding reactions, as quic
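
The cell above doesn't pass `temperature`, so it samples with the API's default setting. Here is a minimal sketch of how the parameter might be supplied. `ask` is a hypothetical helper (not part of the OpenAI SDK) that takes the `client` created earlier as an argument:

```python
def ask(client, prompt, temperature=1.0, model='gpt-4o'):
    """Send one user prompt and return the reply text.

    `ask` is a hypothetical helper for illustration only.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{'role': 'user', 'content': prompt}],
        temperature=temperature  # 0 makes the output close to repeatable
    )
    return response.choices[0].message.content

# Example call (requires a real client and API key):
# print(ask(client, 'Describe molecular biology in the style of Dr. Seuss', temperature=0))
```

Even with `temperature=0`, identical output isn't guaranteed, which is why the results are the same only most of the time.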

You can improve the user experience by streaming the response so that text appears as it's generated. Here's how:

In [2]:
response = client.chat.completions.create(
    model='gpt-4o',
    messages=messages,
    stream=True
)

for chunk in response:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end='')

In the land of Cells, all small and unseen,  
Lies a world of wonders, so lively and keen.  
With creatures named Genes, and Proteins, an art,  
Molecular Biology's where they all start.  

Oh, the DNA helix, with twists bold and bright,  
Carries codes in its spirals, just hidden from sight.  
A-T and C-G, they pair with such flair,  
Creating blueprints for life, a plan rare and rare.  

Then come the Ribosomes, round as a ball,  
Reading those codes, giving life to the call.  
They stitch up Amino Acids, like beads on a thread,  
Building Proteins, life’s builders, on which all is fed.  

The Proteins, they fold, in shapes wild and fun,  
Doing jobs in the cells, yes, each and every one.  
From enzymes to transporters, their work is routine,  
Keeping life bustling, in this microscopic scene.  

And let's not forget Mitochondria's might,  
The powerhouse cells, generating light.  
They churn out the energy, like sunbeams they glow,  
Fueling the cells, making life grow and grow.  
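
When streaming, you often want the complete text as well, for example to store it in a conversation history. The sketch below wraps the loop shown above in a hypothetical helper named `collect_stream` that prints each piece as it arrives and returns the assembled reply. The guard against chunks with no choices is a defensive assumption, not something the example above requires:

```python
def collect_stream(chunks):
    """Print streamed text as it arrives and return the full reply.

    `collect_stream` is a hypothetical helper for illustration only.
    """
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # defensively skip chunks that carry no choices
            continue
        content = chunk.choices[0].delta.content
        if content is not None:
            print(content, end='', flush=True)
            parts.append(content)
    return ''.join(parts)

# Example call (requires a real streaming response):
# text = collect_stream(response)
```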



## Conversational context

Messages transmitted to `GPT-4o` use the Chat Markup Language. ChatML exists so that the context of a conversation can be preserved across calls. To demonstrate, ask the LLM what its name is:

In [3]:
messages = [{
    'role': 'user',
    'content': "My name is Jeff. What's your name?"
}]
 
response = client.chat.completions.create(
    model='gpt-4o',
    messages=messages,
    stream=True
)
 
for chunk in response:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end='')

Hello Jeff! I don’t have a personal name, but you can call me Assistant. How can I help you today?

But now try this:

In [4]:
messages = [
    {
        'role': 'system',
        'content': 'You are a friendly assistant named LISA.'
    },
    {
        'role': 'user',
        'content': "My name is Jeff. What's your name?"
    }
]
 
response = client.chat.completions.create(
    model='gpt-4o',
    messages=messages,
    stream=True
)
 
for chunk in response:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end='')

Hello Jeff! My name is LISA. How can I assist you today?

You can be as specific as you’d like with `system` messages, even saying "If you don’t know the answer to a question, say I don’t know." You can also prescribe a persona. Replace "friendly" with "sarcastic" in the `system` message and run the code again. The response may be "Oh, hi Jeff, I’m LISA. You can call me whatever you'd like, but don’t call me late for dinner." Run the code several times and there’s no end to the colorful responses you’ll receive.
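
If you plan to experiment with several personas, it may be convenient to build the `messages` list from a template. The following is a small sketch; `persona_messages` is a hypothetical helper, and the wording of the system message simply mirrors the one used above:

```python
def persona_messages(persona, user_text, name='LISA'):
    """Build a two-message thread: a persona-setting system message
    followed by the user's prompt. Hypothetical helper for illustration."""
    return [
        {'role': 'system', 'content': f'You are a {persona} assistant named {name}.'},
        {'role': 'user', 'content': user_text}
    ]

messages = persona_messages('sarcastic', "My name is Jeff. What's your name?")
print(messages[0]['content'])
```

The resulting list can be passed as the `messages` argument exactly as in the previous cell.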

ChatML's greatest power lies in persisting context from one call to the next. As an example, try this:

In [5]:
messages = [
    {
        'role': 'system',
        'content': 'You are a friendly assistant named LISA.'
    },
    {
        'role': 'user',
        'content': "My name is Jeff. What's your name?"
    }
]
 
response = client.chat.completions.create(
    model='gpt-4o',
    messages=messages,
    stream=True
)

for chunk in response:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end='')

Hello Jeff! My name is LISA. How can I assist you today?

Then follow up immediately with this:

In [6]:
messages = [
    {
        'role': 'system',
        'content': 'You are a friendly assistant named LISA.'
    },
    {
        'role': 'user',
        'content': 'What is my name?'
    }
]
 
response = client.chat.completions.create(
    model='gpt-4o',
    messages=messages,
    stream=True
)

for chunk in response:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end='')

I'm sorry, I don't have access to personal information like your name. If you'd like, you can tell me your name, and I'll be happy to address you that way!

As the output shows, the LLM has no idea what your name is, because the second call contains nothing from the first. But now try this:

In [7]:
messages = [
    {
        'role': 'system',
        'content': 'You are a friendly assistant named LISA.'
    },
    {
        'role': 'user',
        'content': "My name is Jeff. What's your name?"
    },
    {
        'role': 'assistant',
        'content': 'Hello Jeff, my name is LISA. Nice to meet you!'
    },
    {
        'role': 'user',
        'content': 'What is my name?'
    }
]
 
response = client.chat.completions.create(
    model='gpt-4o',
    messages=messages,
    stream=True
)

for chunk in response:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end='')

Your name is Jeff. If you have any questions or need assistance, feel free to ask!

Get it? Calls to `GPT-4o` are stateless. If you give `GPT-4o` your name in one call and ask it to repeat your name in the next call, it has no clue. But with ChatML, you can provide past responses as context for the current call. You can build a conversational assistant simply by repeating the last few prompts and responses in each call to `GPT-4o`. The further back you go, the longer the assistant's "memory" will be. Here's a `chat` function that enables that by accepting previous messages as input and returning a messages list with the response:

In [None]:
def chat(prompt, messages=None):
    # Start a new thread with a system message if no history is supplied
    if not messages:
        messages = [{ 'role': 'system', 'content': 'You are a friendly assistant named LISA' }]

    # Append the user's prompt to the thread (this modifies the list passed in)
    messages.append({ 'role': 'user', 'content': prompt })

    client = OpenAI(api_key='OPENAI_API_KEY')

    response = client.chat.completions.create(
        model='gpt-4o',
        messages=messages
    )

    # Return a message thread containing the LLM's response
    messages.append({ 'role': 'assistant', 'content': response.choices[0].message.content })
    return messages

Now use the `chat` function to tell the LLM your name:

In [9]:
messages = chat("My name is Jeff. What's your name?")
print(messages[-1]['content'])

Hello Jeff! My name is LISA. How can I assist you today?


Then ask it what your name is:

In [10]:
messages = chat("What is my name?", messages=messages)
print(messages[-1]['content'])

Your name is Jeff. How can I help you today?
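
Because the thread grows with every call, a long conversation eventually needs to be trimmed to "the last few prompts and responses." One possible approach, sketched here with a hypothetical `trim_history` helper, keeps any `system` messages and only the most recent exchanges:

```python
def trim_history(messages, max_messages=10):
    """Return system messages plus the newest `max_messages` other messages.

    `trim_history` is a hypothetical helper; it returns a new list and
    leaves the original untouched.
    """
    system = [m for m in messages if m['role'] == 'system']
    rest = [m for m in messages if m['role'] != 'system']
    recent = rest[-max_messages:] if max_messages > 0 else []
    return system + recent

# Example call (requires the chat function and a real API key):
# messages = chat('What is my name?', messages=trim_history(messages))
```

A larger `max_messages` gives the assistant a longer "memory" at the cost of more tokens per call.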


## Tokenization

LLMs don't work with words; they work with *tokens*. Tokenization plays an important role in Natural Language Processing. Neural networks can’t process text, at least not directly; they only process numbers. Tokenization converts words into numbers that a deep-learning model can understand. When `GPT-4o` generates a response by predicting a series of tokens, the tokenization process is reversed to convert the tokens into human-readable text.

OpenAI LLMs use a form of tokenization called [Byte-Pair Encoding](https://en.wikipedia.org/wiki/Byte_pair_encoding) (BPE), which was developed in the 1990s as a mechanism for compressing text. Today, it is widely used in the NLP space. Here’s how `GPT-4o` BPE-tokenizes the phrase "fourscore and seven years ago":

![](Images/bpe.png)
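
To get a feel for the idea behind BPE, here is a toy sketch of its core step: repeatedly find the most frequent pair of adjacent symbols and merge it into a single symbol. This is an illustration only. It works on characters and learns merges from the input itself, whereas a production tokenizer applies a large, pre-learned merge table, so its output will not match `GPT-4o`'s tokens. The function names are hypothetical:

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most common adjacent pair, or None if there are no pairs."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(tokens, pair):
    """Replace each left-to-right occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def toy_bpe(text, num_merges):
    """Start from single characters and apply up to `num_merges` merges."""
    tokens = list(text)
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
    return tokens

print(toy_bpe('aaabdaaabac', 1))
```

After one merge, the frequent pair `('a', 'a')` has become the single symbol `'aa'`, so the sequence is shorter while still spelling the same text.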

As a rule of thumb, 3 words on average translate to about 4 BPE tokens. That’s important because LLMs limit the number of tokens in each API call. The maximum number of tokens the model may generate in a response is controlled by a parameter named `max_tokens`. For `GPT-4o`, the default is 4,096 tokens or about 3,000 words. `GPT-4o` has a context window size of 128K tokens, which the prompt and the response share, so that is the upper limit on what a single call can hold. That's enough to pass in a document that's a few hundred pages long. If the prompt doesn't fit in the context window, the call fails; if the response reaches `max_tokens`, it is truncated.
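
The rule of thumb and the truncation case can both be expressed in a few lines. The sketch below uses two hypothetical helpers: `estimate_tokens` applies the 3-words-to-4-tokens approximation (it is only a rough guide, not a substitute for a real tokenizer), and `was_truncated` checks the `finish_reason` field of a chat completion, which is `'length'` when the reply stopped because it hit the token limit:

```python
def estimate_tokens(text):
    """Rough token estimate: about 4 tokens for every 3 words, rounded up."""
    words = len(text.split())
    return (words * 4 + 2) // 3

def was_truncated(response):
    """True if the reply was cut off by the token limit."""
    return response.choices[0].finish_reason == 'length'

print(estimate_tokens('fourscore and seven years ago'))
```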

You can compute the number of tokens generated from a text sample with help from a Python package named [`tiktoken`](https://pypi.org/project/tiktoken/):

In [11]:
import tiktoken
 
text = '''
    Jeff loves to build and fly model jets. He built his first
    jet, a BVM BobCat, in 2007. After that, he built a BVM Bandit,
    a Skymaster F-16, and a Skymaster F-5. The latter two are 1/6th
    scale models of actual fighter jets. Top speed is around 200 MPH.
    '''
 
encoding = tiktoken.encoding_for_model('gpt-4o')
num_tokens = len(encoding.encode(text))
print(f'{num_tokens} tokens')

86 tokens


You can estimate the token count for an entire `messages` array with the following code, which was adapted, comments and all, from the [OpenAI documentation](https://platform.openai.com/docs/guides/chat/introduction):

In [12]:
num_tokens = 0
 
for message in messages:
    num_tokens += 4 # every message follows <im_start>{role/name}\n{content}<im_end>\n
    for key, value in message.items():
        num_tokens += len(encoding.encode(value))
        if key == 'name':  # if there's a name, the role is omitted
            num_tokens += -1 # role is always required and always 1 token
             
num_tokens += 2 # every reply is primed with <im_start>assistant
print(f'{num_tokens} tokens')

77 tokens


There are a couple of reasons to be aware of the token count in each call. First, you’re charged by the token for input and output. The larger the `messages` array and the longer the response, the more you pay. Second, when using the `messages` array to provide context from previous calls, you have a finite amount of space to work with. It's common practice to pick a number – say, 10 or 20 – and limit the context from previous calls to that number of messages, or to programmatically compute the number of tokens that a conversation comprises and include as many messages as the context window will allow.
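
The second strategy, fitting as many recent messages as a token budget allows, might look something like the sketch below. `fit_to_budget` is a hypothetical helper. It accepts any token-counting function, falling back to a rough words-based estimate, and it reuses the per-message overhead of 4 tokens from the counting code above, which is itself an approximation:

```python
def fit_to_budget(messages, token_budget, count_tokens=None):
    """Keep system messages plus as many of the newest messages as fit.

    `fit_to_budget` is a hypothetical helper for illustration only.
    """
    if count_tokens is None:
        # Fallback: rough estimate of 4 tokens per 3 words
        count_tokens = lambda text: (len(text.split()) * 4 + 2) // 3

    def cost(message):
        return 4 + count_tokens(message['content'])  # 4 = assumed per-message overhead

    system = [m for m in messages if m['role'] == 'system']
    rest = [m for m in messages if m['role'] != 'system']
    remaining = token_budget - sum(cost(m) for m in system)

    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        if cost(message) > remaining:
            break
        kept.append(message)
        remaining -= cost(message)
    return system + kept[::-1]      # restore chronological order

# Example call with tiktoken as the counter (assumes `encoding` from the earlier cell):
# trimmed = fit_to_budget(messages, 1000, count_tokens=lambda t: len(encoding.encode(t)))
```

Leave headroom in the budget for the response, since the prompt and the reply share the same context window.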

## Natural language processing

`GPT-4o` can perform many NLP tasks such as sentiment analysis and neural machine translation (NMT) without further training. Here's an example that translates text from English to French:

In [13]:
prompt = f'Translate the following text from English to French: {text}'
messages = chat(prompt)
print(messages[-1]['content'])

Jeff aime construire et piloter des avions à réaction miniatures. Il a construit son premier avion, un BVM BobCat, en 2007. Après cela, il a construit un BVM Bandit, un Skymaster F-16, et un Skymaster F-5. Les deux derniers sont des modèles à l'échelle 1/6e de véritables avions de combat. Leur vitesse maximale est d'environ 200 MPH.


The following examples demonstrate how to use `GPT-4o` for sentiment analysis:

In [14]:
prompt = '''
    Indicate whether the following review's sentiment is positive or
    negative: Great food and excellent service.
    '''

messages = chat(prompt)
print(messages[-1]['content'])

The sentiment of the review is positive.


In [15]:
prompt = '''
    Indicate whether the following review's sentiment is positive or
    negative: Long lines and poor customer service.
    '''

messages = chat(prompt)
print(messages[-1]['content'])

The sentiment of the review is negative.


Sentiment analysis is a text-classification task. LLMs can classify text in other ways, too. The next two examples demonstrate how to use `GPT-4o` as a spam filter:

In [16]:
prompt = '''
    Indicate whether the following email is spam or not spam:
    Please plan to attend the code review at 2:00 p.m. this afternoon.
    '''

messages = chat(prompt)
print(messages[-1]['content'])

This email is not spam. It appears to be an internal communication related to a workplace or professional activity, specifically about attending a code review meeting.


In [17]:
prompt = '''
    Indicate whether the following email is spam or not spam:
    Order prescription meds online and save $$$.
    '''

messages = chat(prompt)
print(messages[-1]['content'])

This email is likely spam. It promotes the purchase of prescription medications online with promises of saving money, which is a common characteristic of spam emails. Additionally, unsolicited emails offering such deals can often be associated with scams or disreputable sources.
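
The responses above are full sentences, which are awkward to consume in code. One way to get a machine-friendly answer, sketched here under the assumption that the model follows the instruction, is to tell it to reply with exactly one label and then validate the reply. `classify` is a hypothetical helper that takes the OpenAI `client` as an argument and expects lowercase labels:

```python
def classify(client, text, labels, model='gpt-4o'):
    """Ask the model to choose one label; return None if the reply isn't one.

    `classify` is a hypothetical helper for illustration only.
    """
    system = ('You are a text classifier. Reply with exactly one of these '
              f'labels and nothing else: {", ".join(labels)}')
    response = client.chat.completions.create(
        model=model,
        messages=[
            {'role': 'system', 'content': system},
            {'role': 'user', 'content': text}
        ],
        temperature=0  # favor repeatable answers
    )
    # Normalize case and a trailing period before validating
    answer = response.choices[0].message.content.strip().strip('.').lower()
    return answer if answer in labels else None

# Example call (requires a real client and API key):
# classify(client, 'Order prescription meds online and save $$$.', ['spam', 'not spam'])
```

Returning `None` for an unexpected reply lets the caller decide whether to retry or fall back to a default.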


A practical use for LLMs is parsing freeform address fields and generating structured data. Here's an example:

In [None]:
addresses = [
    '11 Aviation Avenue, Charlottetown, PE C1E 0A1, Canada',
    'Roche Molecular Systems, Inc., 4300 Hacienda Drive, Pleasanton, CA 94588, US',
    'Cross Research S.A., Phase I Unit, Via F.A. Giorgioli 14, Arzo, 6864, CH',
    'Wasdell Group, Wasdell Packaging Ltd Unit 1-8, Euroway Industrial Estate, Blagrove, Swindon, SN5 8YW, GB',
    'Policlinico Gemelli, 4th Floor, Wing J, Largo Gemelli 8, Rome, 00168, IT',
    'Academisch Ziekenhuis Maastricht, CDL Stamcellaboratorium, P. Debyelaan 25 5e, Maastricht, 6229 HX, NL',
    'Wintellect Brussels, Leuvensesteenweg 555, Marken Benelux, 1930, BE',
    'SCM department, AstraZeneca K.K., Maihara Factory, AstraZeneca K.K., 215-31, Miyos, Shiga-Ken, 215-31, JP',
    'Healthcare Logistics Australia, 7 Dolerite Way, Pemulwuy NSW 2145, AU',
    'Suncoast Research, 2128 W Flagler St, Suite 101, Mami, FL, 33135, US'
]

client = OpenAI(api_key='OPENAI_API_KEY')

for address in addresses:
    prompt = f'''
        Parse the freeform address below into fields and return a JSON
        representation that uses the following format. Convert country
        abbreviations such as "CA" into country names such as "Canada"
        and state abbreviations such as "CA" into state names such as
        "California." Leave unknown fields blank. Also correct any
        obvious misspellings. Return JSON only and do not use markdown.
    
        {{
            "Name": "Recipient",
            "Street Address": "Street address",
            "City": "City, town, etc.",
            "State": "State, province, region, territory, canton, county, department, länder, or prefecture",
            "Country": "Country name",
            "Postal Code": "Postal code"
        }}
    
        Address: {address}
        '''

    messages = [{ 'role': 'user', 'content': prompt }]
    
    response = client.chat.completions.create(
        model='gpt-4o',
        messages=messages,
        response_format={ 'type': 'json_object' }
    )
    
    print(response.choices[0].message.content)

{
    "Name": "",
    "Street Address": "11 Aviation Avenue",
    "City": "Charlottetown",
    "State": "Prince Edward Island",
    "Country": "Canada",
    "Postal Code": "C1E 0A1"
}
{
    "Name": "Roche Molecular Systems, Inc.",
    "Street Address": "4300 Hacienda Drive",
    "City": "Pleasanton",
    "State": "California",
    "Country": "United States",
    "Postal Code": "94588"
}
{
    "Name": "Cross Research S.A.",
    "Street Address": "Phase I Unit, Via F.A. Giorgioli 14",
    "City": "Arzo",
    "State": "",
    "Country": "Switzerland",
    "Postal Code": "6864"
}
{
    "Name": "Wasdell Group, Wasdell Packaging Ltd",
    "Street Address": "Unit 1-8, Euroway Industrial Estate, Blagrove",
    "City": "Swindon",
    "State": "",
    "Country": "United Kingdom",
    "Postal Code": "SN5 8YW"
}
{
    "Name": "",
    "Street Address": "Policlinico Gemelli, 4th Floor, Wing J, Largo Gemelli 8",
    "City": "Rome",
    "State": "",
    "Country": "Italy",
    "Postal Code": "00168"
}
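
Since each response is a JSON string, it can be converted to a Python dictionary for downstream use. A minimal sketch follows; `parse_address` and `EXPECTED_FIELDS` are hypothetical names, and the function guards against replies that aren't valid JSON or that omit a field:

```python
import json

EXPECTED_FIELDS = ['Name', 'Street Address', 'City', 'State', 'Country', 'Postal Code']

def parse_address(raw):
    """Parse the model's JSON reply; return None if it isn't a JSON object.

    Missing or null fields are filled with empty strings.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    return {field: str(data.get(field) or '') for field in EXPECTED_FIELDS}

sample = '''
{
    "Name": "",
    "Street Address": "11 Aviation Avenue",
    "City": "Charlottetown",
    "State": "Prince Edward Island",
    "Country": "Canada",
    "Postal Code": "C1E 0A1"
}
'''
record = parse_address(sample)
print(record['City'], record['Country'])
```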

`GPT-4o`'s 128K context-window size enables it to ingest large documents for summarization or other purposes. Let's see what it can do with Microsoft's 2022 annual report:

In [19]:
from IPython.display import Markdown, display

with open('Data/annual-report.txt', 'r') as file:
    report = file.read()

prompt = f'''
    Summarize the following annual report from Microsoft. Use
    markdown formatting in your output:

    {report}
    '''

messages = chat(prompt)
output = messages[-1]['content']
display(Markdown(output))

# Microsoft 2022 Annual Report Summary

## Chairman's Message

**Satya Nadella** highlights the rapid technological advancements amidst global economic and political changes, emphasizing Microsoft's mission to empower people and organizations worldwide.

### Key Points

- **Technological Era**: With the rise in digital transformation, Microsoft is focused on helping industries use technology to address current challenges and growth opportunities.
- **Customer Impact**: Examples include:
  - **Ferrovial**: Enhancing roads for autonomous transport using Microsoft’s cloud.
  - **Peace Parks Foundation**: Using Dynamics 365 and Power BI for funding and conservation efforts.
  - **Kawasaki Heavy Industries**: Creating an industrial metaverse with Azure IoT and HoloLens.
  - **Globo (Brazil)**: Empowering employees to create solutions with Power Platform.
  - **Ørsted**: Utilizing Microsoft Intelligent Data Platform for predictive turbine maintenance.

### Financial Highlights

- **Record Revenue**: $198 billion in revenue, $83 billion in operating income.
- **Microsoft Cloud**: Surpassed $100 billion in annualized revenue.

## Responsibility

Microsoft commits to aligning its innovations with global economic and social issues towards a more inclusive, equitable, sustainable, and trusted future.

### Key Initiatives

- **Inclusive Economic Growth**: Reached over 23 million people with digital skills training. Aim to equip 10 million underserved with digital economy skills by 2025.
- **Cybersecurity Workforce**: Plan to skill 250,000 people in the US by 2025.
- **Nonprofit Support**: $3.2 billion in technology assistance to nonprofits, aiming to double outreach in five years.

## Sustainability and Trust

### Environmental Goals

- **Carbon Negative by 2030**: Released a sustainability report to track progress.
- **Water and Waste Initiatives**: Launched projects for water replenishment and waste diversion, protecting more land than used by 2025.

### Trust and Technology

- **Privacy and Security**: Committed to strong privacy laws and analyzing 43 trillion security signals daily.
- **Responsible AI**: Published standards and tools to ensure AI aligns with their principles.

## Future Opportunities

Microsoft envisions doubling the tech's GDP percentage and integrating it across all industries, evolving towards every company becoming a software company.

## Market Leadership

### Segments and Innovations

1. **Azure**: Expanding datacenter regions, enhancing cloud services, supporting hybrid consistency.
2. **Data and AI**: Comprehensive data offerings with Microsoft Intelligent Data Platform, Microsoft Purview, and Azure OpenAI Service.
3. **Modern Work**: Leading hybrid work solutions with Microsoft Teams (270 million+ users) and Microsoft Viva.
4. **Consumer Tech**: Launch of Windows 11 and new Surface devices; strength in Xbox and LinkedIn offerings.

### Financial Metrics

- **Share Repurchases**: Executed buybacks under $60 billion program commenced in 2021.
- **Dividend Policy**: Regular quarterly increases, reflecting confidence in cash flow and growth potential.

## Strategic Business Segments

1. **Productivity and Business Processes**: Growth from Office 365 and LinkedIn.
2. **Intelligent Cloud**: Strong performance in Azure and other cloud services.
3. **More Personal Computing**: Gains in Windows, Devices, and Xbox markets.

## Governance and Leadership

- **Board and Management**: Ensures transparent governance with a focus on environmental, social, and governance (ESG) responsibilities.

Microsoft remains steadfast in its mission to transform industries, manage digital risks, and lead with purpose-driven innovation.

For comparison, let's see how Gemini 1.5 Flash summarizes the same report:

In [None]:
import google.generativeai as genai

genai.configure(api_key='GOOGLE_API_KEY')
model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content(prompt)
output = response.text

display(Markdown(output))


This document is a corporate filing by Microsoft Corporation, likely a section of their 10-K report, which is an annual filing with the Securities and Exchange Commission (SEC). 

Here is a summary of the document's key points:

**Financial Statements**

* **Opinion on the Financial Statements:** Deloitte & Touche LLP, the independent auditor, expressed an unqualified opinion on Microsoft's consolidated financial statements for the year ended June 30, 2022. This means the statements are considered to be presented fairly, in all material respects, in accordance with generally accepted accounting principles in the United States.
* **Critical Audit Matters:** The audit report identified two critical audit matters:
    * **Revenue Recognition:**  Deloitte & Touche LLP performed extensive procedures to evaluate Microsoft's judgment in determining revenue recognition for complex customer agreements. 
    * **Income Taxes – Uncertain Tax Positions:** The audit report focused on the evaluation of management's estimates of uncertain tax positions related to unresolved transfer pricing issues with the IRS. 
* **Management's Report on Internal Control Over Financial Reporting:**  Microsoft's management concluded that the Company's internal control over financial reporting was effective as of June 30, 2022. 
* **Report of Independent Registered Public Accounting Firm on Internal Control Over Financial Reporting:** Deloitte & Touche LLP, the independent auditor, expressed an unqualified opinion on Microsoft's internal control over financial reporting as of June 30, 2022.  This means the Company has a strong system of internal controls in place.

**Other Information**

* **Directors and Executive Officers:** The report lists the current board of directors and executive officers of Microsoft Corporation.
* **Investor Relations Contact Information:** This section provides details for contacting Microsoft's Investor Relations department. 
* **Attending the Annual Meeting:** This section provides information on how shareholders can attend the 2022 Annual Shareholders Meeting, which was held virtually.
* **Registered Shareholder Services:** This section provides contact information for Computershare, the transfer agent, who handles shareholder-related services.
* **Environmental, Social, and Governance (ESG)/Corporate Social Responsibility:** Microsoft highlights its commitment to ESG principles and its ongoing efforts in corporate social responsibility.

**Overall, this document is a significant filing for investors and other stakeholders, providing assurance about the accuracy of Microsoft's financial statements and their overall business practices.** 
