
# Building AI Applications with ChatGPT

Sumudu Tennakoon, PhD
<hr>

# Information Extraction Tasks

In this notebook we will explore some basic features of the Python programming language for those who have prior programming experience.

To learn more about Python, refer to the following website:

- Python : https://www.python.org

To learn more about the Python packages we explore in this notebook, refer to the following website:

- OpenAI API : https://platform.openai.com/docs/api-reference


### Python Library Installation

* Run the code cell below to install the required libraries before you continue. Skip this step if you have already installed them.

In [1]:
# !pip install openai

In [2]:
import openai
import configparser
config = configparser.ConfigParser()
config.read(r'../../../config.ini')  # Change to your path, or assign the API key directly to openai_api_key (not recommended for production)

openai_api_key = config['SECRETS']['openai_api_key']
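
The cell above reads the key from a `config.ini` file that has a `[SECRETS]` section. As a minimal sketch, the helper below wraps that lookup and falls back to an environment variable when the file or the key is missing. The function name `load_api_key` and the choice of `OPENAI_API_KEY` as the fallback variable are assumptions made for illustration; they are not part of the original notebook.

```python
import configparser
import os


def load_api_key(config_path, env_var="OPENAI_API_KEY"):
    """Return the API key from config.ini, or from the environment as a fallback.

    `load_api_key` and the `env_var` fallback are illustrative assumptions.
    """
    config = configparser.ConfigParser()
    config.read(config_path)  # read() silently skips files that do not exist

    # fallback=None avoids an exception when the section or key is absent
    key = config.get("SECRETS", "openai_api_key", fallback=None)
    if key:
        return key

    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"No API key found in {config_path!r} or ${env_var}")
    return key
```

Calling `load_api_key(r'../../../config.ini')` would then stand in for the direct `config['SECRETS']['openai_api_key']` lookup, with a clearer error when nothing is configured.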

## Extract Contact Information from Email Message

In [3]:
from openai import OpenAI
client = OpenAI(api_key=openai_api_key)

text = """
Dear Bob,

Thank you for the assitance locating the items I needed to get. 

Please ship my order to the address 1234 Elm Street Springfield, IL 62704. 

Please call my phone number 234-567-7890 or email to alice@exampledomain.com if you need more information.

Best,

Alice Smith
"""

MODEL = "gpt-3.5-turbo"

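# Adjacent string literals are concatenated, so no stray backslash or newline ends up in the prompt
INSTRUCTIONS = (
    "Extract Name, Phone Number, Email and Address from the text given "
    "and output as JSON object with Name, Phone Number, Email and Address as keys"
)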

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": INSTRUCTIONS},
        {"role": "user", "content": text},
    ],
    temperature=0,
    max_tokens=100
)

chat_response = response.choices[0].message.content
usage = response.usage.model_dump()

print(F"chat_response: {chat_response}\n\nusage:{usage}")

chat_response: {
  "Name": "Alice Smith",
  "Phone Number": "234-567-7890",
  "Email": "alice@exampledomain.com",
  "Address": "1234 Elm Street Springfield, IL 62704"
}

usage:{'completion_tokens': 49, 'prompt_tokens': 111, 'total_tokens': 160}
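
The reply above is a JSON string, so it can be turned into a Python dictionary before further use. The following is a minimal sketch of that step, with a check that every requested key came back. `parse_contact` and `EXPECTED_KEYS` are hypothetical names introduced here, and the sample string is copied from the output shown above.

```python
import json

EXPECTED_KEYS = {"Name", "Phone Number", "Email", "Address"}


def parse_contact(chat_response):
    """Parse the model's reply into a dict and check that every requested key is present."""
    contact = json.loads(chat_response)  # raises json.JSONDecodeError on invalid JSON
    missing = EXPECTED_KEYS - set(contact)
    if missing:
        raise ValueError(f"Missing keys in response: {sorted(missing)}")
    return contact


# Sample reply copied from the output above
sample = """{
  "Name": "Alice Smith",
  "Phone Number": "234-567-7890",
  "Email": "alice@exampledomain.com",
  "Address": "1234 Elm Street Springfield, IL 62704"
}"""

contact = parse_contact(sample)
print(contact["Email"])
```

Validating the keys up front is a design choice: a model reply is not guaranteed to follow the instructions, so failing early is usually easier to debug than a `KeyError` later on.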


## Extract Information from Paragraph

* use of `stop` sequence
* use of output formatting instructions
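
When a `stop` sequence is supplied, generation halts at the point where the model would emit that sequence, and the returned text does not include it. The sketch below imitates that truncation locally on a plain string, purely to illustrate the effect. `apply_stop` is a hypothetical helper, not an OpenAI function, and the sample text is made up for the demonstration.

```python
def apply_stop(generated_text, stop):
    """Imitate a stop sequence: keep only the text before its first occurrence."""
    head, _, _ = generated_text.partition(stop)
    return head


# Made-up text standing in for what a model might produce without a stop sequence
raw = "[START]\nYear | Event\n1975 | Microsoft founded\n[END] [BREAK] extra text the model might add"

truncated = apply_stop(raw, "[BREAK]")
print(repr(truncated))
```

Asking the model to write a marker such as `[BREAK]` after the payload, and then passing the same marker as `stop`, keeps trailing commentary out of the response.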

In [4]:
text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. It rose to dominate the personal computer operating system market with MS-DOS in the mid-1980s, followed by Windows. The company's 1986 initial public offering (IPO) and subsequent rise in its share price created three billionaires and an estimated 12,000 millionaires among Microsoft employees. Since the 1990s, it has increasingly diversified from the operating system market and has made a number of corporate acquisitions, the largest being the acquisition of LinkedIn for $26.2 billion in December 2016, followed by their acquisition of Skype Technologies for $8.5 billion in May 2011.

As of 2015, Microsoft is market-dominant in the IBM PC compatible operating system market and the office software suite market, although it has lost the majority of the overall operating system market to Android.The company also produces a wide range of other consumer and enterprise software for desktops, laptops, tabs, gadgets, and servers, including Internet search (with Bing), the digital services market (through MSN), mixed reality (HoloLens), cloud computing (Azure), and software development (Visual Studio).

Steve Ballmer replaced Gates as CEO in 2000 and later envisioned a "devices and services" strategy. This unfolded with Microsoft acquiring Danger Inc. in 2008,  entering the personal computer production market for the first time in June 2012 with the launch of the Microsoft Surface line of tablet computers, and later forming Microsoft Mobile through the acquisition of Nokia's devices and services division. Since Satya Nadella took over as CEO in 2014, the company has scaled back on hardware and instead focused on cloud computing, a move that helped the company's shares reach their highest value since December 1999.

Earlier dethroned by Apple in 2010, in 2018, Microsoft reclaimed its position as the most valuable publicly traded company in the world.[10] In April 2019, Microsoft reached a trillion-dollar market cap, becoming the third U.S. public company to be valued at over $1 trillion after Apple and Amazon, respectively. As of 2022, Microsoft has the fourth-highest global brand valuation.
"""

In [5]:
from openai import OpenAI
client = OpenAI(api_key=openai_api_key)

MODEL = "gpt-3.5-turbo"

INSTRUCTIONS = """You are a personal assistant converting a user-submitted text containing company details into a table of company history. \
The table should contain two columns: Year and Event. \
Add [START] before the table. \
Add [END] [BREAK] after the table."""

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": INSTRUCTIONS},
        {"role": "user", "content": text},
    ],
    temperature=0,
    max_tokens=1000,
    stop = "[BREAK]"
)

chat_response = response.choices[0].message.content
usage = response.usage.model_dump()

print(F"chat_response: {chat_response}\n\nusage:{usage}")

chat_response: [START]
Year | Event
-----|------
1975 | Microsoft founded by Bill Gates and Paul Allen to develop and sell BASIC interpreters for the Altair 8800.
1980s | Microsoft dominates the personal computer operating system market with MS-DOS and Windows.
1986 | Microsoft's initial public offering (IPO) creates three billionaires and an estimated 12,000 millionaires among employees.
1990s | Microsoft diversifies from the operating system market and makes several corporate acquisitions.
2011 | Microsoft acquires Skype Technologies for $8.5 billion.
2016 | Microsoft acquires LinkedIn for $26.2 billion.
2000 | Steve Ballmer replaces Bill Gates as CEO.
2012 | Microsoft enters the personal computer production market with the launch of the Microsoft Surface line of tablet computers.
2014 | Satya Nadella becomes CEO and focuses on cloud computing.
2018 | Microsoft reclaims its position as the most valuable publicly traded company in the world.
2019 | Microsoft reaches a trillion-dollar 
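
A table in this pipe-delimited form can be converted into Python records with a few lines of string handling. The sketch below assumes the exact layout shown above (a header row, a dashed separator row, and two columns with no leading or trailing pipes). `parse_pipe_table` is a hypothetical helper, and the sample uses two rows copied from the output.

```python
def parse_pipe_table(chat_response):
    """Convert a two-column 'Year | Event' table in the model reply into a list of dicts."""
    rows = []
    header = None
    for line in chat_response.splitlines():
        if "|" not in line:
            continue  # skips [START], [END], and blank lines
        # Split on the first pipe only, so a pipe inside the event text is preserved
        cells = [cell.strip() for cell in line.split("|", 1)]
        if all(set(cell) <= {"-", ":"} for cell in cells):
            continue  # skips the '-----|------' separator row
        if header is None:
            header = cells  # first real row holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


sample = """[START]
Year | Event
-----|------
1975 | Microsoft founded by Bill Gates and Paul Allen to develop and sell BASIC interpreters for the Altair 8800.
2011 | Microsoft acquires Skype Technologies for $8.5 billion.
[END] """

events = parse_pipe_table(sample)
print(events[0]["Year"])
```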

In [6]:
from openai import OpenAI
client = OpenAI(api_key=openai_api_key)

MODEL = "gpt-3.5-turbo"

INSTRUCTIONS = """You are a personal assistant converting a user-submitted text containing \
company information into a list of JSON objects containing company history events. \
Each JSON object should contain two keys: Year and Event. \
Add [START] before the JSON list. \
Add [END] [BREAK] after the JSON list."""

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": INSTRUCTIONS},
        {"role": "user", "content": text},
    ],
    temperature=0,
    max_tokens=1000,
    stop = "[BREAK]"
)

chat_response = response.choices[0].message.content
usage = response.usage.model_dump()

print(F"chat_response: {chat_response}\n\nusage:{usage}")

chat_response: [START]
[
  {
    "Year": "1975",
    "Event": "Microsoft was founded by Bill Gates and Paul Allen to develop and sell BASIC interpreters for the Altair 8800."
  },
  {
    "Year": "1980s",
    "Event": "Microsoft rose to dominate the personal computer operating system market with MS-DOS."
  },
  {
    "Year": "1986",
    "Event": "Microsoft had its initial public offering (IPO) and created three billionaires and an estimated 12,000 millionaires among its employees."
  },
  {
    "Year": "1990s",
    "Event": "Microsoft diversified from the operating system market and made a number of corporate acquisitions."
  },
  {
    "Year": "December 2016",
    "Event": "Microsoft acquired LinkedIn for $26.2 billion."
  },
  {
    "Year": "May 2011",
    "Event": "Microsoft acquired Skype Technologies for $8.5 billion."
  },
  {
    "Year": "2015",
    "Event": "Microsoft is market-dominant in the IBM PC compatible operating system market and the office software suite market."
  },

#### Extract JSON String from the Chat Response

In [7]:
import re

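# Capture everything between the [START] and [END] markers; re.DOTALL lets "." match newlines
json_string_pattern = r"\[START\](.*?)\[END\]"
json_string = re.findall(json_string_pattern, chat_response, flags=re.DOTALL)[0]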

print(json_string)


[
  {
    "Year": "1975",
    "Event": "Microsoft was founded by Bill Gates and Paul Allen to develop and sell BASIC interpreters for the Altair 8800."
  },
  {
    "Year": "1980s",
    "Event": "Microsoft rose to dominate the personal computer operating system market with MS-DOS."
  },
  {
    "Year": "1986",
    "Event": "Microsoft had its initial public offering (IPO) and created three billionaires and an estimated 12,000 millionaires among its employees."
  },
  {
    "Year": "1990s",
    "Event": "Microsoft diversified from the operating system market and made a number of corporate acquisitions."
  },
  {
    "Year": "December 2016",
    "Event": "Microsoft acquired LinkedIn for $26.2 billion."
  },
  {
    "Year": "May 2011",
    "Event": "Microsoft acquired Skype Technologies for $8.5 billion."
  },
  {
    "Year": "2015",
    "Event": "Microsoft is market-dominant in the IBM PC compatible operating system market and the office software suite market."
  },
  {
    "Year": "2000

#### Transform JSON String to Python List of Dictionaries

In [8]:
import json

company_events = json.loads(json_string)

company_events

[{'Year': '1975',
  'Event': 'Microsoft was founded by Bill Gates and Paul Allen to develop and sell BASIC interpreters for the Altair 8800.'},
 {'Year': '1980s',
  'Event': 'Microsoft rose to dominate the personal computer operating system market with MS-DOS.'},
 {'Year': '1986',
  'Event': 'Microsoft had its initial public offering (IPO) and created three billionaires and an estimated 12,000 millionaires among its employees.'},
 {'Year': '1990s',
  'Event': 'Microsoft diversified from the operating system market and made a number of corporate acquisitions.'},
 {'Year': 'December 2016',
  'Event': 'Microsoft acquired LinkedIn for $26.2 billion.'},
 {'Year': 'May 2011',
  'Event': 'Microsoft acquired Skype Technologies for $8.5 billion.'},
 {'Year': '2015',
  'Event': 'Microsoft is market-dominant in the IBM PC compatible operating system market and the office software suite market.'},
 {'Year': '2000', 'Event': 'Steve Ballmer replaced Bill Gates as CEO.'},
 {'Year': 'June 2012',
  'Ev
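
The two steps above (locating the marked block, then parsing it) both fail with an exception if the reply is incomplete. As a hedged sketch, they can be combined into one function that returns `None` instead. `extract_events` and `MARKED_BLOCK` are hypothetical names, and the sample strings are shortened stand-ins for a real reply.

```python
import json
import re

# Non-greedy match of everything between the markers, across line breaks
MARKED_BLOCK = re.compile(r"\[START\](.*?)\[END\]", re.DOTALL)


def extract_events(chat_response):
    """Return the list between [START] and [END], or None if it cannot be recovered."""
    match = MARKED_BLOCK.search(chat_response)
    if match is None:
        return None  # markers missing, e.g. the reply was cut off before [END]
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None  # markers present but the payload is not valid JSON


# Shortened sample replies for illustration
complete = '[START]\n[{"Year": "1975", "Event": "Microsoft was founded."}]\n[END] '
cut_off = '[START]\n[{"Year": "1975", "Event": "Microsoft was'

print(extract_events(complete))
print(extract_events(cut_off))
```

Returning `None` lets the caller decide whether to retry the request, raise `max_tokens`, or skip the record.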

#### Create Data Frame from Extracted JSON

In [9]:
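import pandas as pd
from io import StringIO

# Wrap the string in StringIO: passing a literal JSON string to read_json is deprecated in recent pandas versions
company_events_df = pd.read_json(StringIO(json_string))

company_events_df
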


| | Year | Event |
|---|---|---|
| 0 | 1975 | Microsoft was founded by Bill Gates and Paul A... |
| 1 | 1980s | Microsoft rose to dominate the personal comput... |
| 2 | 1986 | Microsoft had its initial public offering (IPO... |
| 3 | 1990s | Microsoft diversified from the operating syste... |
| 4 | December 2016 | Microsoft acquired LinkedIn for $26.2 billion. |
| 5 | May 2011 | Microsoft acquired Skype Technologies for $8.5... |
| 6 | 2015 | Microsoft is market-dominant in the IBM PC com... |
| 7 | 2000 | Steve Ballmer replaced Bill Gates as CEO. |
| 8 | June 2012 | Microsoft entered the personal computer produc... |
| 9 | 2014 | Satya Nadella took over as CEO and Microsoft f... |
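
The `Year` values above mix plain years, decades, and month-year strings, and the rows follow the order of the source text rather than chronology. One possible follow-up, sketched here in plain Python on a list of dictionaries, pulls the first four-digit number out of each value and sorts on it. `year_key` is a hypothetical helper, and treating "1980s" as 1980 is an assumption.

```python
import re


def year_key(event):
    """Return the first four-digit number in the 'Year' field as an int (0 if none)."""
    match = re.search(r"\d{4}", event["Year"])
    return int(match.group()) if match else 0


# A few records copied from the extracted events above
events = [
    {"Year": "December 2016", "Event": "Microsoft acquired LinkedIn for $26.2 billion."},
    {"Year": "May 2011", "Event": "Microsoft acquired Skype Technologies for $8.5 billion."},
    {"Year": "1980s", "Event": "Microsoft rose to dominate the personal computer operating system market with MS-DOS."},
    {"Year": "2000", "Event": "Steve Ballmer replaced Bill Gates as CEO."},
]

for event in sorted(events, key=year_key):
    print(year_key(event), event["Year"])
```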


In [10]:
# Direct JSON-only output

from openai import OpenAI
client = OpenAI(api_key=openai_api_key)

MODEL = "gpt-3.5-turbo"

INSTRUCTIONS = """You are a personal assistant converting a user-submitted text containing \
company information into a list of JSON objects containing company history events. \
Each JSON object should contain two keys: Year and Event."""

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": INSTRUCTIONS},
        {"role": "user", "content": text},
    ],
    temperature=0,
    max_tokens=1000,
)

chat_response = response.choices[0].message.content
usage = response.usage.model_dump()

print(F"chat_response: {chat_response}\n\nusage:{usage}")

chat_response: [
  {
    "Year": "1975",
    "Event": "Microsoft was founded by Bill Gates and Paul Allen to develop and sell BASIC interpreters for the Altair 8800."
  },
  {
    "Year": "1980s",
    "Event": "Microsoft rose to dominate the personal computer operating system market with MS-DOS, followed by Windows."
  },
  {
    "Year": "1986",
    "Event": "Microsoft had its initial public offering (IPO) and created three billionaires and an estimated 12,000 millionaires among its employees."
  },
  {
    "Year": "1990s",
    "Event": "Microsoft diversified from the operating system market and made a number of corporate acquisitions."
  },
  {
    "Year": "December 2016",
    "Event": "Microsoft acquired LinkedIn for $26.2 billion."
  },
  {
    "Year": "May 2011",
    "Event": "Microsoft acquired Skype Technologies for $8.5 billion."
  },
  {
    "Year": "2015",
    "Event": "Microsoft is market-dominant in the IBM PC compatible operating system market and the office software suite 

<hr/>
First Upload 2023-07-04 | Last update 2023-12-15 by Sumudu Tennakoon

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.