In [2]:
import os
import openai

#### Models in OpenAI
------------------------------------

OpenAI provides a variety of models, each suited to different use cases: text generation, code generation, and specific tasks such as question answering and summarization.

| Model Type               | Description                                                                                       | Model ID               | Variants/Use Cases                                           |
|--------------------------|---------------------------------------------------------------------------------------------------|------------------------|-------------------------------------------------------------|
| **GPT-4 Models**         | A highly capable general-purpose language model, designed for a wide range of tasks, including text completion and complex reasoning. | `gpt-4`                | Variants: `gpt-4`, `gpt-4-32k` (larger context window)     |
| **GPT-3.5 Models**       | The predecessor to GPT-4; a faster, lower-cost model used for efficient, high-quality completions across various tasks. | `gpt-3.5-turbo`        |                                                             |
| **Codex Models**         | Specialized for code generation tasks, including programming in multiple languages (since deprecated). | `code-davinci-002`, `code-cushman-001` | Use cases: Autocomplete for code, bug fixes, code explanations. |
| **Text-Davinci Models**  | A legacy completion model for text generation, document completion, and more (since deprecated). | `text-davinci-003`     | Use cases: Writing, summarization, text generation.         |




##### Key Properties of Models:
- **Token Limit**: The maximum number of tokens, prompt and generated output combined, that the model can handle in a single request. For example:
  - GPT-4 has a context length limit of 8,192 tokens, and `gpt-4-32k` can handle up to 32,768 tokens.
- **Training Data**: The models are trained on a mixture of publicly available and licensed datasets but have a knowledge cutoff (e.g., GPT-4 has a cutoff in September 2021).

Each model has a balance of capability, efficiency, and cost, and you can select the right one depending on the task you need to perform.
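
As a rough illustration of the token limit, the sketch below checks whether a prompt plus the requested output budget fits a model's context window. The helper name `fits_in_context` and the `CONTEXT_LIMITS` table are hypothetical; the limits are the two figures quoted above, and the token counts are assumed to be known already (counting tokens is not shown).

```python
# Context-window sizes quoted in the text above (in tokens).
CONTEXT_LIMITS = {"gpt-4": 8_192, "gpt-4-32k": 32_768}


def fits_in_context(model: str, prompt_tokens: int, max_tokens: int) -> bool:
    """Return True if the prompt plus the requested output fits the model's window."""
    limit = CONTEXT_LIMITS[model]
    return prompt_tokens + max_tokens <= limit


print(fits_in_context("gpt-4", prompt_tokens=8_000, max_tokens=500))      # False
print(fits_in_context("gpt-4-32k", prompt_tokens=8_000, max_tokens=500))  # True
```

A check like this is only a pre-flight estimate; the API remains the authority on what it accepts.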

#### Endpoints
-----------------------------

- OpenAI provides several API endpoints that enable developers to interact with their models for various tasks such as `text generation`, `conversation`, `code completion`, `embedding generation`, and more.

- Each endpoint serves a different purpose and can be accessed via HTTP requests.
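
To make the "HTTP request" point concrete, here is a minimal sketch that builds (but does not send) a raw POST request for the chat completions URL used later in this notebook, using only the standard library. The helper name `build_chat_request` and the placeholder key are assumptions for illustration; in practice the `openai` client wraps this for you.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),  # JSON request body
        headers={
            "Authorization": f"Bearer {api_key}",  # API key sent as a bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say hello."}],
}
# "sk-placeholder" is a dummy value used only when no key is set in the environment.
request = build_chat_request(os.environ.get("OPENAI_API_KEY", "sk-placeholder"), payload)
print(request.get_method(), request.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(request)`, which needs a valid key and network access.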

In [3]:
# For the legacy openai 0.28 SDK
# Make a request to the Completions Endpoint
# response = openai.Completion.create(
#   model       = "gpt-3.5-turbo-instruct",                       # Completions needs a completion model, not a chat model
#   prompt      = "Once upon a time, there was a wise owl that",  # Input prompt
#   max_tokens  = 50,                                             # The maximum number of tokens to generate
#   temperature = 0.7,                                            # Controls randomness (0.7 = moderate creativity)
#   n           = 1,                                              # The number of completions to generate
#   stop        = None                                            # Optional stop sequence
# )

In [4]:
# Print the generated completion (Completions responses carry the text in 'text', not 'message')
# print(response['choices'][0]['text'])

#### 2. Chat Completion Endpoint (/v1/chat/completions)

**Purpose**: Designed for GPT models (GPT-3.5, GPT-4) to handle conversation-like inputs, but can be used for tasks similar to text generation.

**API Endpoint**: POST https://api.openai.com/v1/chat/completions


**Usage**: It’s the recommended endpoint for most tasks, including text completion, Q&A, summarization, etc., by providing a chat history as input.

**Key Parameters**:

- **model**: Specifies the model to be used (e.g., gpt-4 or gpt-3.5-turbo).
- **messages**: A list of messages, where each message contains:
  - **role**: Either "system", "user", or "assistant".
  - **content**: The text for the message.
- **max_tokens**: The maximum number of tokens to generate; output that reaches this limit is cut off.
- **temperature**: Controls randomness in the output (range 0 to 2; higher value = more random).
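
The parameters above can be assembled and sanity-checked before a request is made. The following is a sketch only: `build_chat_payload` and `VALID_ROLES` are hypothetical names, and the role check is deliberately limited to the three roles listed here.

```python
# Roles described in this section; this sketch rejects anything else.
VALID_ROLES = {"system", "user", "assistant"}


def build_chat_payload(model, messages, max_tokens=None, temperature=None):
    """Assemble a request body, rejecting obviously invalid parameters."""
    for message in messages:
        if message.get("role") not in VALID_ROLES:
            raise ValueError(f"unknown role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("each message needs a string 'content'")

    payload = {"model": model, "messages": list(messages)}
    if max_tokens is not None:
        if max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        payload["max_tokens"] = max_tokens
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        payload["temperature"] = temperature
    return payload


payload = build_chat_payload(
    "gpt-3.5-turbo",
    [{"role": "user", "content": "What is machine learning?"}],
    max_tokens=50,
    temperature=0.1,
)
print(sorted(payload))
```

The resulting dictionary can be passed to the client as keyword arguments, for example `client.chat.completions.create(**payload)`.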


In [5]:
from openai import OpenAI

In [6]:
client = OpenAI(
    # defaults to os.environ.get("OPENAI_API_KEY")
    # api_key = openai_api_key
)

In [9]:
# Make a request to the Chat Completion Endpoint
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # Specify the model
    messages=[
        {"role": "user", "content": "Can you summarize the key parameters for the Chat Completion Endpoint?"}
    ],
    max_tokens=50,
    temperature=0.1
)

In [10]:
print(response.choices[0].message.content)

Sure! The key parameters for the Chat Completion Endpoint typically include:

1. Chat ID: A unique identifier for the chat session that needs to be completed.
2. Completion Status: A flag indicating whether the chat session was successfully completed or not.
3


In OpenAI's API for language models, messages are the fundamental building blocks of prompts. 

Each message typically consists of two main components: the `role` of the sender and the `content` of the message. 

| **Role**   | **Description**                                                                                                                                                 | **Example**                                                                                                                  |
|------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| User       | Represents the end user or the application initiating the conversation. Sends input messages, asking questions or providing information for the model's response. | If a user asks, "What is machine learning?", the message is sent with the role of "user".                                  |
| Assistant  | Represents the AI model's responses to the user’s queries. Generates replies based on the context and content of previous messages, aiming to be helpful and informative. | After the user asks about machine learning, the assistant might respond, "Machine learning is a subset of artificial intelligence that focuses on building systems that learn from data." |
| System     | Provides initial instructions or context to the assistant to guide its behavior throughout the interaction. Typically used to set the tone, rules, or constraints of the assistant’s responses. | A system message might state, "You are a friendly and informative assistant."                                              |


In [None]:
[
  {"role": "system",    "content": "You are a knowledgeable assistant."},
  {"role": "user",      "content": "Can you tell me about neural networks?"},
  {"role": "assistant", "content": "Neural networks are a series of algorithms that mimic "
                                    "the operations of a human brain to recognize relationships in a set of data."}
]
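
Because the endpoint receives the whole chat history on every call, a multi-turn conversation is usually kept as a growing list of such messages. The sketch below shows one way to do that; `add_turn` is a hypothetical helper, and the assistant text is hard-coded rather than fetched from the API.

```python
def add_turn(history, user_text, assistant_text):
    """Return a new history with one user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]


history = [{"role": "system", "content": "You are a knowledgeable assistant."}]
history = add_turn(
    history,
    "Can you tell me about neural networks?",
    "Neural networks are a series of algorithms that recognize relationships in data.",
)
# The follow-up question is appended last; the full list would be sent as `messages`.
history.append({"role": "user", "content": "How are they trained?"})

print([message["role"] for message in history])
# ['system', 'user', 'assistant', 'user']
```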


In [12]:
client = openai.OpenAI(
    #api_key=openai_api_key
)

In [13]:
completion = client.chat.completions.create (
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a poetic assistant, skilled in explaining complex programming concepts \
                                   with creative flair."},
    {"role": "user",   "content": "Compose a poem that explains the concept of recursion in programming, \
                                   in max 50 words"}
  ]
)

In [14]:
completion

ChatCompletion(id='chatcmpl-AF1WQneyZIKYopPGD6d7sNu054AvM', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="In coding's dance, recursion plays,\nA loop where function calls itself ablaze.\nLike a mirror reflecting endless grace,\nSolving problems in a recursive embrace.\nEach iteration peels back layers deep,\nIn a hypnotic rhythm, algorithms leap.", refusal=None, role='assistant', function_call=None, tool_calls=None))], created=1728144134, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=49, prompt_tokens=46, total_tokens=95, completion_tokens_details=CompletionTokensDetails(reasoning_tokens=0), prompt_tokens_details={'cached_tokens': 0}))

In [15]:
import json
from pprint import pprint

In [16]:
# Pretty print the response using pprint
pprint(response.to_dict())

{'choices': [{'finish_reason': 'length',
              'index': 0,
              'logprobs': None,
              'message': {'content': 'Sure! The key parameters for the Chat '
                                     'Completion Endpoint typically include:\n'
                                     '\n'
                                     '1. Chat ID: A unique identifier for the '
                                     'chat session that needs to be '
                                     'completed.\n'
                                     '2. Completion Status: A flag indicating '
                                     'whether the chat session was '
                                     'successfully completed or not.\n'
                                     '3',
                          'refusal': None,
                          'role': 'assistant'}}],
 'created': 1728143166,
 'id': 'chatcmpl-AF1GoIbcGCnRQRzTXzL5XfOjCt5SG',
 'model': 'gpt-3.5-turbo-0125',
 'object': 'chat.completion',
 'system

**Choices**

| Attribute               | Description                                                                                              |
|-------------------------|----------------------------------------------------------------------------------------------------------|
| **finish_reason**        | The reason the completion finished (e.g., `'stop'` for a natural end, `'length'` when `max_tokens` was reached). |
| **index**                | The index of the choice in the list.                                                                     |
| **logprobs**             | Any associated log probabilities (in this case, it's `None`).                                            |
| **message**              | An instance of `ChatCompletionMessage` that contains:                                                    |
| - **content**            | The actual text generated by the model (e.g., a poem about recursion).                                   |
| - **role**               | The role of the entity that generated the message (e.g., `'assistant'`).                                 |
| - **function_call**      | Information if the message included a function call (in this case, it's `None`).                         |
| - **tool_calls**         | Information if there were any tool calls (also `None` in this case).                                     |
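
The attributes in this table can be read from the dictionary form of a response. The sketch below uses a hand-written, abbreviated dictionary shaped like the pretty-printed output earlier in this notebook, so it runs without an API call; `reply_and_status` is a hypothetical helper name.

```python
def reply_and_status(response_dict, choice_index=0):
    """Return the reply text and whether it was cut off by the token limit."""
    choice = response_dict["choices"][choice_index]
    text = choice["message"]["content"]
    truncated = choice["finish_reason"] == "length"
    return text, truncated


# Abbreviated stand-in for `response.to_dict()`; the content is shortened for illustration.
sample = {
    "choices": [
        {
            "finish_reason": "length",
            "index": 0,
            "logprobs": None,
            "message": {"content": "Sure! The key parameters ...", "role": "assistant"},
        }
    ],
    "model": "gpt-3.5-turbo-0125",
}

text, truncated = reply_and_status(sample)
print(truncated)  # True
```

A `True` result here suggests raising `max_tokens` or asking for a shorter answer.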

#### Exercise - 01 

- (extract pieces of info from the completion response)

Qs : Extract the model name from the response object.

In [17]:
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",   "content": "Who won the world series in 2020?"},
    ]
)

In [18]:
# Expected solution
model_name = response.to_dict()['model']
print("Model used:", model_name)

Model used: gpt-3.5-turbo-0125


Qs : Extract the content of the assistant’s reply from the response object.

In [20]:
# Expected solution
assistant_reply = response.to_dict()['choices'][0]['message']['content']
print("Assistant's reply:", assistant_reply)

Assistant's reply: The Los Angeles Dodgers won the World Series in 2020, defeating the Tampa Bay Rays.


In [23]:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",   "content": "Who won the world series in 2020, 2021, 2022, 2023 ?"},
    ]
)

Qs : Create a dictionary of World Series winners by year from the assistant’s reply.

In [24]:
# Example solution
winners_by_year = response.to_dict()['choices'][0]['message']['content']
print(winners_by_year)

The winners of the World Series for the years you mentioned are as follows:

- **2020**: Los Angeles Dodgers
- **2021**: Atlanta Braves
- **2022**: Houston Astros
- **2023**: Texas Rangers

If you need more information about any of these seasons or teams, feel free to ask!
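
One possible way to turn a free-text reply like the one above into a dictionary is to pattern-match the bullet lines. This is a sketch that assumes the reply keeps the `- **year**: team` shape shown here, which the model does not guarantee; `parse_winners` and `BULLET` are hypothetical names.

```python
import re

# The reply text printed above, reproduced as a string.
reply = (
    "The winners of the World Series for the years you mentioned are as follows:\n\n"
    "- **2020**: Los Angeles Dodgers\n"
    "- **2021**: Atlanta Braves\n"
    "- **2022**: Houston Astros\n"
    "- **2023**: Texas Rangers\n\n"
    "If you need more information about any of these seasons or teams, feel free to ask!"
)

# Matches lines such as "- **2020**: Los Angeles Dodgers".
BULLET = re.compile(r"^- \*\*(\d{4})\*\*: (.+)$", re.MULTILINE)


def parse_winners(text):
    """Map each year found in a bulleted reply to its team name."""
    return {int(year): team.strip() for year, team in BULLET.findall(text)}


print(parse_winners(reply))
```

Parsing prose this way is brittle, which is why the next cells ask the model for a structured format instead.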


Get the output in a dictionary format, year : winner name

In [25]:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",   "content": "Who won the world series in 2020, 2021, 2022, 2023 ? "
                                       "Provide the answer in a dictionary format as below "
                                       "year : winner name"},
    ]
)

In [26]:
# Example solution
winners_by_year = response.to_dict()['choices'][0]['message']['content']
print(winners_by_year)

Here is the information in the requested dictionary format:

```python
{
    2020: "Los Angeles Dodgers",
    2021: "Atlanta Braves",
    2022: "Houston Astros",
    2023: "Texas Rangers"
}
```
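
The reply above wraps the dictionary in an introductory sentence and a code fence, so it cannot be evaluated directly. A minimal sketch of one workaround is to pull out the span between the first `{` and the last `}` and parse only that. `extract_dict` is a hypothetical helper, and it assumes the reply contains exactly one dictionary literal.

```python
import ast
import re


def extract_dict(reply: str) -> dict:
    """Find the first '{' ... last '}' span in the reply and parse it as a literal."""
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if match is None:
        raise ValueError("no dictionary literal found in reply")
    return ast.literal_eval(match.group(0))


FENCE = "`" * 3  # three backticks, built this way to keep this example readable
reply = (
    "Here is the information in the requested dictionary format:\n\n"
    f"{FENCE}python\n"
    "{\n"
    '    2020: "Los Angeles Dodgers",\n'
    '    2021: "Atlanta Braves",\n'
    '    2022: "Houston Astros",\n'
    '    2023: "Texas Rangers"\n'
    "}\n"
    f"{FENCE}"
)

winners = extract_dict(reply)
print(winners[2021])  # Atlanta Braves
```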


In [27]:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",   "content": '''Who won the world series in 2020, 2021, 2022, 2023 ? \
                                       Provide the answer in a dictionary format as below \
                                       {
                                           year : winner name 
                                       }
                                       without any extra text
                                       \
                                       '''},
    ]
)

Qs : Load the output into a DataFrame

| Year | Winner                |
|------|-----------------------|
| 2020 | Los Angeles Dodgers   |
| 2021 | Atlanta Braves        |
| 2022 | Houston Astros        |
| 2023 | Texas Rangers         |

In [28]:
# Example solution
winners_by_year = response.to_dict()['choices'][0]['message']['content']
winners_by_year

'{\n    2020: "Los Angeles Dodgers",\n    2021: "Atlanta Braves",\n    2022: "Houston Astros",\n    2023: "Texas Rangers"\n}'

In [29]:
import pandas as pd
import ast

In [30]:
# Convert the string representation of the dictionary to an actual dictionary
world_series_dict = ast.literal_eval(winners_by_year)
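
Model output is not guaranteed to be a valid dictionary literal, so it may be worth guarding the `ast.literal_eval` call. The sketch below returns `None` instead of raising when the text cannot be parsed or has an unexpected shape; `parse_winner_dict` is a hypothetical helper, and the int-to-string check reflects the format requested in this exercise.

```python
import ast


def parse_winner_dict(text):
    """Parse a dict literal mapping int years to str names; return None on failure."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError):
        return None  # not a Python literal at all
    if not isinstance(value, dict):
        return None  # a literal, but not a dictionary
    if not all(isinstance(k, int) and isinstance(v, str) for k, v in value.items()):
        return None  # a dictionary with unexpected key or value types
    return value


good = '{\n    2020: "Los Angeles Dodgers",\n    2021: "Atlanta Braves"\n}'
print(parse_winner_dict(good))
print(parse_winner_dict("Sorry, I don't know."))  # None
```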

In [31]:
# Load the dictionary into a pandas DataFrame
df = pd.DataFrame(list(world_series_dict.items()), columns=['Year', 'Winner'])
df

|   | Year | Winner              |
|---|------|---------------------|
| 0 | 2020 | Los Angeles Dodgers |
| 1 | 2021 | Atlanta Braves      |
| 2 | 2022 | Houston Astros      |
| 3 | 2023 | Texas Rangers       |
