## Table Of Contents:
* [Text Generation Models](#text-gen)
* [Text Summarization Models](#text-summ)
* [How to write SQL queries with OpenAI](#write-sql)

In [12]:
!pip install openai



In [5]:
import openai

In [6]:
import os

In [7]:
openai.api_key = os.getenv('OPENAI_API_KEY')

In [8]:
print(openai.api_key[0:2])

sk


### Pretty Printing Helper

In [11]:
import json

def show_json(obj):
    display(json.loads(obj.model_dump_json()))

### Text Generation Models <a class="text-gen" id="text-gen"></a>

OpenAI's text generation models (often called generative pre-trained transformers or large language models) have been trained to understand natural language, code, and images. 

The models provide text outputs in response to their inputs. The inputs to these models are also referred to as "prompts". 

Designing a prompt is essentially how you “program” a large language model, usually by providing instructions or some examples of how to successfully complete a task.

In [2]:
from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"}
  ]
)

**The main input is the messages parameter-**

Messages must be an array of message objects, where each object has a role (either "system", "user", or "assistant") and content (the content of the message). Conversations can be as short as 1 message or fill many pages.

Typically, a conversation is formatted with a system message first, followed by alternating user and assistant messages.

- The system message helps set the behavior of the assistant. In the example above, the assistant was instructed with "You are a helpful assistant."

- The user messages help instruct the assistant. They can be generated by the end users of an application, or set by a developer as an instruction.

- The assistant messages help store prior responses. They can also be written by a developer to help give examples of desired behavior.

Including the conversation history helps when user instructions refer to prior messages. In the example above, the user’s final question of "Where was it played?" only makes sense in the context of the prior messages about the World Series of 2020. Because the models have no memory of past requests, all relevant information must be supplied via the conversation. If a conversation cannot fit within the model’s token limit, it will need to be shortened in some way.
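
One simple way to shorten a conversation is to drop the oldest non-system messages until the rest fits a budget. The sketch below is only an illustration under stated assumptions: `estimate_tokens` is a hypothetical helper that approximates a token as roughly four characters, and `trim_history` and the budget of 25 are made up for this example. A real application would count tokens with a proper tokenizer.

```python
def estimate_tokens(message):
    """Very rough estimate (assumption: about 4 characters per token)."""
    return max(1, len(message["content"]) // 4)


def trim_history(messages, max_tokens):
    """Drop the oldest non-system messages until the estimate fits max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # Always keep the most recent message, even if it alone exceeds the budget.
    while len(rest) > 1 and sum(map(estimate_tokens, system + rest)) > max_tokens:
        rest.pop(0)
    return system + rest


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"},
]
trimmed = trim_history(history, max_tokens=25)
```

The system message and the latest question are always kept; older turns are removed first. Dropping turns can lose context the final question depends on, so summarizing older turns is a common alternative.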


In [12]:
show_json(response)

{'id': 'chatcmpl-8zFMxWSRQfRp2delPvXl3IZVWXlCu',
 'choices': [{'finish_reason': 'stop',
   'index': 0,
   'logprobs': None,
   'message': {'content': 'The 2020 World Series was played at a neutral site, Globe Life Field in Arlington, Texas, due to the COVID-19 pandemic.',
    'role': 'assistant',
    'function_call': None,
    'tool_calls': None}}],
 'created': 1709607779,
 'model': 'gpt-3.5-turbo-0125',
 'object': 'chat.completion',
 'system_fingerprint': 'fp_2b778c6b35',
 'usage': {'completion_tokens': 29, 'prompt_tokens': 53, 'total_tokens': 82}}

In [5]:
response.choices[0].message.content

'The 2020 World Series was played at a neutral site, Globe Life Field in Arlington, Texas, due to the COVID-19 pandemic.'

In Python, the assistant’s reply can be extracted with response.choices[0].message.content.

Every response will include a finish_reason. The possible values for finish_reason are:

- stop: API returned complete model output
- length: Incomplete model output due to the max_tokens parameter or the model's token limit
- content_filter: Omitted content due to a flag from OpenAI's content filters
- null: API response still in progress or incomplete (None in the Python client)
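
Checking finish_reason before using a reply might look like the sketch below. The function name `reply_or_raise` and the error messages are hypothetical, and the `SimpleNamespace` object is only a stand-in that mimics the response shape shown earlier; no API call is made.

```python
from types import SimpleNamespace


def reply_or_raise(response):
    """Return the reply text, or raise if the output is incomplete or filtered."""
    choice = response.choices[0]
    reason = choice.finish_reason
    if reason == "stop":
        return choice.message.content
    if reason == "length":
        raise RuntimeError("Output was cut off; raise max_tokens or shorten the prompt.")
    if reason == "content_filter":
        raise RuntimeError("Content was omitted by a content filter.")
    raise RuntimeError(f"Unexpected finish_reason: {reason!r}")


# Stand-in object with the same attribute layout as a chat completion response.
fake = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(content="Hello!"),
        )
    ]
)
reply = reply_or_raise(fake)
```

Raising on anything other than stop is a conservative choice for this sketch; an application might instead retry with a larger max_tokens when the reason is length.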

### Text Summarization Models <a class="text-summ" id="text-summ"></a>

We want to generate a short summary of a product review from an ecommerce site.

In [27]:
prod_review = """
Got this panda plush toy for my daughter's birthday, \
who loves it and takes it everywhere. It's soft and \
super cute, and its face has a friendly look. It's \
a bit small for what I paid though. I think there \
might be other options that are bigger for the \
same price. It arrived a day earlier than expected, \
so I got to play with it myself before I gave it \
to her.
"""

In [17]:
from openai import OpenAI
client = OpenAI()

# Andrew mentioned that the prompt/completion paradigm is preferable for this class,
# so this helper wraps a single user prompt in a one-message conversation.
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,  # degree of randomness of the model's output
    )
    return response.choices[0].message.content


In [23]:
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site. 

Summarize the review below, delimited by triple 
backticks, in at most 30 words. 

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

```
Cute panda plush toy loved by daughter, soft and friendly-looking. Smaller than expected for the price, but arrived early.
```
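
The prompt asks for at most 30 words, but the model is not guaranteed to comply, so it can be worth checking the result. The helper below is a minimal sketch; `within_word_limit` is a hypothetical name, and it counts whitespace-separated words, which is only one reasonable definition of a word.

```python
def within_word_limit(text, limit=30):
    """Return True if text has at most `limit` whitespace-separated words."""
    return len(text.split()) <= limit


# The summary printed above, reproduced as a literal string.
summary = (
    "Cute panda plush toy loved by daughter, soft and friendly-looking. "
    "Smaller than expected for the price, but arrived early."
)
ok = within_word_limit(summary)
```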


### How to write SQL queries with OpenAI <a class="write-sql" id="write-sql"></a>

The goal is to have the model generate a SQL query that computes the average total order value for all orders on 2023-04-01.

SYSTEM

Given the following SQL tables, your job is to write queries given a user’s request.
    
    CREATE TABLE Orders (
      OrderID int,
      CustomerID int,
      OrderDate datetime,
      OrderTime varchar(8),
      PRIMARY KEY (OrderID)
    );
    
    CREATE TABLE OrderDetails (
      OrderDetailID int,
      OrderID int,
      ProductID int,
      Quantity int,
      PRIMARY KEY (OrderDetailID)
    );
    
    CREATE TABLE Products (
      ProductID int,
      ProductName varchar(50),
      Category varchar(50),
      UnitPrice decimal(10, 2),
      Stock int,
      PRIMARY KEY (ProductID)
    );
    
    CREATE TABLE Customers (
      CustomerID int,
      FirstName varchar(50),
      LastName varchar(50),
      Email varchar(100),
      Phone varchar(20),
      PRIMARY KEY (CustomerID)
    );

USER

Write a SQL query which computes the average total order value for all orders on 2023-04-01.    

In [23]:
from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {
      "role": "system",
      "content": "Given the following SQL tables, your job is to write queries given a user’s request.\n    \n    CREATE TABLE Orders (\n      OrderID int,\n      CustomerID int,\n      OrderDate datetime,\n      OrderTime varchar(8),\n      PRIMARY KEY (OrderID)\n    );\n    \n    CREATE TABLE OrderDetails (\n      OrderDetailID int,\n      OrderID int,\n      ProductID int,\n      Quantity int,\n      PRIMARY KEY (OrderDetailID)\n    );\n    \n    CREATE TABLE Products (\n      ProductID int,\n      ProductName varchar(50),\n      Category varchar(50),\n      UnitPrice decimal(10, 2),\n      Stock int,\n      PRIMARY KEY (ProductID)\n    );\n    \n    CREATE TABLE Customers (\n      CustomerID int,\n      FirstName varchar(50),\n      LastName varchar(50),\n      Email varchar(100),\n      Phone varchar(20),\n      PRIMARY KEY (CustomerID)\n    );"
    },
    {
      "role": "user",
      "content": "Write a SQL query which computes the average total order value for all orders on 2023-04-01."
    }
  ],
  temperature=0.7,
  max_tokens=1000,
  top_p=1
)

**Adjust your settings-**

Prompt design is not the only tool you have at your disposal. You can also control completions by adjusting your settings. 

**-temperature**

One of the most important settings is called **temperature**.

If you submit the summarization prompt above multiple times, the model returns identical or very similar completions each time. This is because get_completion sets temperature to 0. The SQL request above uses temperature=0.7, so its completions can vary between runs.

Try re-submitting the same prompt a few times with temperature set to 1.

When temperature is above 0, submitting the same prompt results in different completions each time.

Remember that the model predicts which text is most likely to follow the text preceding it. 

**In the Chat Completions API, temperature is a value between 0 and 2** (the default is 1) that controls how strongly sampling favors the most likely next tokens.

Lowering temperature means the model will take fewer risks, and completions will be more **focused and deterministic**. A low temperature does not by itself make the output more accurate.

Higher temperature may be useful for tasks where variety or **creativity are desired**, or if you'd like to generate a few variations for your end users or human experts to choose from.
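
The effect of temperature can be illustrated with the textbook temperature-scaled softmax. This is a sketch of the general technique, not a description of how the API implements sampling internally; the function name and the scores are made up for illustration. Temperature 0 is the limiting case in which the most likely token is always chosen, which this sketch does not model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, rescaled by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
cool = softmax_with_temperature(logits, 0.2)  # sharply peaked on the top token
warm = softmax_with_temperature(logits, 1.5)  # flatter, more varied sampling
```

Comparing `cool[0]` with `warm[0]` shows that the top token receives a larger share of the probability at the lower temperature.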

**-max_tokens**

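Sets the maximum number of tokens the model may generate in the completion. Output that reaches this limit is cut off and reported with finish_reason length.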

**-top_p (nucleus sampling)**

Nucleus sampling restricts token selection to the smallest set of most likely tokens whose cumulative probability reaches p.
Lower values mean sampling from a smaller, more top-weighted nucleus; top_p=1 considers all tokens.
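
A minimal sketch of nucleus filtering over a toy distribution follows. The function name `top_p_filter` and the probabilities are invented for illustration, and the API's internal implementation may differ.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose total probability reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(kept.values())
    # Renormalize so the surviving probabilities sum to 1.
    return {token: prob / total for token, prob in kept.items()}


# Made-up next-token probabilities.
probs = {"Texas": 0.6, "California": 0.25, "Florida": 0.1, "Ohio": 0.05}
nucleus = top_p_filter(probs, p=0.8)
```

Here the two most likely tokens together cover at least 0.8 of the probability, so the rest are excluded before sampling.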

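**-top_k**

Top-k sampling restricts token selection to the k most likely tokens at each step. The OpenAI Chat Completions API does not expose a top_k parameter; the setting appears in other model APIs and is listed here for comparison with top_p.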
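
For comparison with top_p, top-k filtering can be sketched the same way. The function name `top_k_filter` and the probabilities are invented for illustration.

```python
def top_k_filter(probs, k):
    """Keep only the k most likely tokens and renormalize their probabilities."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(prob for _, prob in ranked)
    return {token: prob / total for token, prob in ranked}


# Made-up next-token probabilities.
probs = {"Texas": 0.6, "California": 0.25, "Florida": 0.1, "Ohio": 0.05}
top_two = top_k_filter(probs, k=2)
```

Unlike top_p, the number of surviving tokens is fixed at k regardless of how the probability mass is spread.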



In [24]:
show_json(response)

{'id': 'chatcmpl-90KE9IbTxb8Vdbj7SmsuHRg5c74Pp',
 'choices': [{'finish_reason': 'stop',
   'index': 0,
   'logprobs': None,
   'message': {'content': "```sql\nSELECT AVG(TotalOrderValue) AS AverageTotalOrderValue\nFROM (\n    SELECT O.OrderID, SUM(OD.Quantity * P.UnitPrice) AS TotalOrderValue\n    FROM Orders O\n    JOIN OrderDetails OD ON O.OrderID = OD.OrderID\n    JOIN Products P ON OD.ProductID = P.ProductID\n    WHERE O.OrderDate = '2023-04-01'\n    GROUP BY O.OrderID\n) AS OrderTotals;\n```",
    'role': 'assistant',
    'function_call': None,
    'tool_calls': None}}],
 'created': 1709864781,
 'model': 'gpt-3.5-turbo-0125',
 'object': 'chat.completion',
 'system_fingerprint': 'fp_2b778c6b35',
 'usage': {'completion_tokens': 98, 'prompt_tokens': 217, 'total_tokens': 315}}
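
The content field above arrives wrapped in a Markdown code fence, so it has to be unwrapped before it can be executed. The sketch below strips the fence and sanity-checks the query against an in-memory SQLite database. `extract_sql` is a hypothetical helper, the sample rows are made up, and SQLite stands in for whatever database the schema actually targets.

```python
import re
import sqlite3

FENCE = "`" * 3  # three backticks, built indirectly so this example nests cleanly


def extract_sql(reply):
    """Return the body of the first fenced block in reply, or reply itself if unfenced."""
    pattern = re.escape(FENCE) + r"(?:sql)?\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, reply, flags=re.DOTALL | re.IGNORECASE)
    return (match.group(1) if match else reply).strip()


# The model's reply from above, reproduced as a literal string.
reply = (
    FENCE + "sql\n"
    "SELECT AVG(TotalOrderValue) AS AverageTotalOrderValue\n"
    "FROM (\n"
    "    SELECT O.OrderID, SUM(OD.Quantity * P.UnitPrice) AS TotalOrderValue\n"
    "    FROM Orders O\n"
    "    JOIN OrderDetails OD ON O.OrderID = OD.OrderID\n"
    "    JOIN Products P ON OD.ProductID = P.ProductID\n"
    "    WHERE O.OrderDate = '2023-04-01'\n"
    "    GROUP BY O.OrderID\n"
    ") AS OrderTotals;\n" + FENCE
)
query = extract_sql(reply)

# Hypothetical sample data, only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID int PRIMARY KEY, CustomerID int,
                         OrderDate datetime, OrderTime varchar(8));
    CREATE TABLE OrderDetails (OrderDetailID int PRIMARY KEY, OrderID int,
                               ProductID int, Quantity int);
    CREATE TABLE Products (ProductID int PRIMARY KEY, ProductName varchar(50),
                           Category varchar(50), UnitPrice decimal(10, 2), Stock int);
    INSERT INTO Orders VALUES (1, 10, '2023-04-01', '09:00:00'),
                              (2, 11, '2023-04-01', '10:30:00'),
                              (3, 10, '2023-04-02', '08:15:00');
    INSERT INTO Products VALUES (1, 'Widget', 'Tools', 10.00, 5),
                                (2, 'Gadget', 'Tools', 4.00, 5);
    INSERT INTO OrderDetails VALUES (1, 1, 1, 2), (2, 1, 2, 1),
                                    (3, 2, 2, 4), (4, 3, 1, 10);
""")
average = conn.execute(query).fetchone()[0]
conn.close()
```

The generated query compares OrderDate to a date string with equality. That works for the sample rows, which store dates without a time component; if a real OrderDate column includes times, a date range comparison would likely be needed instead.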

In [25]:
response.choices[0].message.content

"```sql\nSELECT AVG(TotalOrderValue) AS AverageTotalOrderValue\nFROM (\n    SELECT O.OrderID, SUM(OD.Quantity * P.UnitPrice) AS TotalOrderValue\n    FROM Orders O\n    JOIN OrderDetails OD ON O.OrderID = OD.OrderID\n    JOIN Products P ON OD.ProductID = P.ProductID\n    WHERE O.OrderDate = '2023-04-01'\n    GROUP BY O.OrderID\n) AS OrderTotals;\n```"