---   
 <img align="left" width="75" height="75"  src="https://upload.wikimedia.org/wikipedia/en/c/c8/University_of_the_Punjab_logo.png"> 

<h1 align="center">Department of Data Science</h1>
<h1 align="center">Course: Generative and Agentic AI</h1>

---
<h3><div align="right">Instructor: Muhammad Arif Butt, Ph.D.</div></h3>    

<br><br>
<h1 align="center">Lec-01: A Hello to OpenAI's Family of AI Models</h1>

# Learning agenda of this notebook
1. Installing OpenAI SDK and Accessing OpenAI LLMs via API keys
    - Installing OpenAI SDK
    - Generating API Keys for OpenAI
    - Saving API Keys
    - Accessing / Testing API Keys
    - Initialize the openai Client
    - Access API EndPoint: `client.models.list()`
    - Access API EndPoint: `client.chat.completions.create()`
    - Access API EndPoint: `client.responses.create()`
    - Understanding real business costs for AI usage
2. Understanding Arguments to `client.responses.create()` Method
3. Hands-On Practice Examples with OpenAI's `Responses` API
    - Examples (Question Answering)
    - Examples (Binary Classification: Sentiment analysis, Spam detection, Medical diagnosis)
    - Examples (Multi-class Classification)
    - Examples (Text Generation)
    - Examples (Code Generation)
    - Examples (Text Translation)
    - Examples (Text Summarization)
    - Examples (Named Entity Recognition)
4. Points to Ponder and Tasks To Do

<h1 align="center"><div class="alert alert-success" style="margin: 20px">The API calls made in this entire notebook, when executed once will cost you around \$0.02 </h1>

# <span style='background :lightgreen' >1. Installing OpenAI SDK and Accessing OpenAI LLMs via API keys</span>
## a. Installing OpenAI SDK

<h3 align="center"><div class="alert alert-success" style="margin: 20px">Each flagship LLM provider has its own SDK for languages like Python and JavaScript</h3>

<h3 align="center"><div class="alert alert-success" style="margin: 20px">SDK (Software Development Kit) is a collection of software tools, libraries, code samples, and documentation that makes it easy for developers to interact with complex services without building everything from scratch</h3>

<h3 align="center"><div class="alert alert-success" style="margin: 20px">An API (Application Programming Interface) is a set of rules that allows different software applications, services, or systems to communicate with each other. It defines the methods and data formats an application can use to request services or information from another application, and how that application should respond.</div></h3>

- When developers integrate Large Language Models (LLMs) into applications, they communicate with these models through REST APIs using HTTP/HTTPS requests. The process involves sending prompts and configuration parameters to the LLM provider's servers and receiving generated responses, typically in JSON format.
- To simplify this integration, all flagship LLM providers publish their own Software Development Kits (SDKs) for programming languages such as Python, Java, JavaScript, and Go. These are specialized client libraries that:
    - Handle authentication and API key management
    - Manage network communication (HTTP requests)
    - Format request data (JSON) and parse responses
    - Implement retry logic and error handling
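
To make this division of labour concrete, the sketch below builds (but does not send) the kind of HTTPS request that an SDK assembles on your behalf. It uses only the Python standard library; the helper name `build_chat_request` and the placeholder key are illustrative assumptions, and a real SDK adds much more on top (retries, timeouts, typed response objects).

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Assemble the HTTP request that an SDK call would send on your behalf."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url="https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),   # request body as JSON bytes
        headers={
            "Authorization": f"Bearer {api_key}",   # authentication via the API key
            "Content-Type": "application/json",     # data format of the body
        },
        method="POST",
    )

# Placeholder key: nothing is sent over the network in this sketch.
request = build_chat_request("sk-your-key", "gpt-4o-mini", "What is AI?")
print(request.get_method(), request.full_url)
print(json.loads(request.data)["messages"][0]["content"])
```

Passing this object to `urllib.request.urlopen()` with a real key would perform the call; the SDK hides this kind of plumbing behind a single method call.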


### Comparison of Major LLM Providers:
| Company   | Models                                                | Python Library           | JavaScript Library      |
| ---------- | ---------------------------------------------------- | ------------------------ | ----------------------- |
| OpenAI     | GPT-4o, GPT-4o-mini, GPT-4 Turbo, o1-preview, o1-mini| `openai`                 | `openai`                |
| Anthropic  | Claude 4 (Opus 4, Sonnet 4), Claude 3.5 Sonnet       | `anthropic`              | `@anthropic-ai/sdk`     |
| Google     | Gemini 2.0 Flash, Gemini 1.5 Pro/Flash               | `google-generativeai`    | `@google/generative-ai` |
| Microsoft  | Azure OpenAI                                         | `openai (azure endpoint)`| `@azure/openai`         |
| Cohere     | Command R+, Command R                                | `cohere`                 | `cohere-ai`             |
| Mistral AI | Mistral 7B, Mixtral, Mistral Large 2, Codestral      | `mistralai`              | `@mistralai/mistralai`  |
| Meta AI    | Llama 3.1, 3.2                                       | via 3rd party APIs       | via 3rd party APIs      |

- The following commands install the Python client libraries. Each library turns your Python calls into HTTP requests and converts the HTTP responses back into Python objects.
```bash
pip install openai               # OpenAI GPT models (also used for Azure OpenAI through the AzureOpenAI client)
pip install anthropic            # Anthropic Claude models
pip install google-generativeai  # Google Gemini models
pip install cohere               # Cohere Command models
pip install mistralai            # Mistral AI models
```

<h3 align="center"><div class="alert alert-success" style="margin: 20px">OpenAI is an AI R&D company founded in 2015 by Sam Altman, Elon Musk and others</h3>

In [1]:
# Install the openai SDK
# In Jupyter notebooks, commands prefixed with ! are executed by a separate non-login shell, created by the Python kernel. That shell does not automatically inherit the environment activation you may have performed in your terminal before starting Jupyter.
# The only way for ! commands to reflect a virtual environment is if the Jupyter kernel itself was started from inside that environment, meaning the Python interpreter belongs to that venv/conda/uv environment.
#!uv add openai

In [1]:
!uv tree | grep openai

Resolved 241 packages in 0.92ms
├── openai v2.14.0


## b. Generating API Keys for OpenAI
<h3 align="center"><div class="alert alert-success" style="margin: 20px">An API key is a unique authentication token that identifies and authorizes an application (not the user) making API requests, allowing the API provider to track usage, control access, and enforce rate limits.</div></h3>

- All of the major closed-source providers, such as OpenAI (GPT family, DALL-E 3, Whisper, TTS), Anthropic (Claude), and Google (Gemini), have two different types of plans: one for their chat interfaces and another for accessing the LLMs via APIs. These two plans are completely separate and have no relationship with each other.
- You have probably used the free versions of chat interfaces (web apps) like https://chatgpt.com, https://claude.ai, and https://gemini.google.com/app, where you can chat with the models directly. Some of you might have subscribed to ChatGPT Plus, Claude Pro, or Gemini Advanced to unlock features inside the chat website/app (faster responses, more advanced models). These work on monthly subscriptions and do not give you an API key.
- APIs are everywhere in programming. Whether you're retrieving weather data, generating text with GPT, calling Claude or Gemini, or sending emails, you're almost always using an API.
- LLM providers like OpenAI (GPT-4, GPT-4o, GPT-4o-mini), Anthropic (Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku), and Google DeepMind (Gemini 1.5 Pro, Gemini 1.5 Flash, Gemini Nano), as well as other services like SendGrid (emails), Stripe (payments), and Google Maps (locations), require authentication before you can call their APIs. Without an API key, your request will be rejected with a 401 Unauthorized error.
- An API key is like a password that:
    - Identifies you as a registered user
    - Controls access (free tier vs. paid tier, rate limits, premium features)
    - Enables billing and usage tracking
- Today we will access LLMs via an API, which means your code calls a model running in the cloud and the model responds with answers to your questions. Unfortunately, there is no free tier like the one you have for ChatGPT. Moreover, there is no monthly subscription; instead, you are charged per API request, based on the number of input and output tokens. Keep in mind: each API call may require 10,000,000,000,000 floating point calculations, and that compute uses electricity!
- Follow these steps to generate your OpenAI API key:
    - Visit https://openai.com/api/ and sign-up using your Google credentials or create a new account.
    - Once you are logged in, you can visit the official OpenAI platform at https://platform.openai.com/docs/overview.
    - Navigate to the subscription or billing section by clicking the settings (gear) icon at the top right and complete the payment process.
    - Once you have made the payment, create your own API key and save it for later usage.
    - For details about prices of various OpenAI models visit https://platform.openai.com/docs/pricing. 
- Please read best practices for API Key Safety at https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety
> Use the OpenAI Playground at https://platform.openai.com/playground, a user-friendly web interface for experimenting with OpenAI's language models, different prompts, and parameters. It is aimed at developers and researchers who want to explore the capabilities of these models without writing a single line of code.

## c. Saving the API Keys
- An **environment variable** is a key–value pair stored by your operating system that programs can access at runtime. They’re used to configure programs without hardcoding settings in code. Think of them as invisible notes your system uses to tell programs what to do. For example: When you run a Python program, it might check an environment variable called DEBUG to decide whether to show detailed error logs. APIs like OpenAI or Stripe often require you to store your secret API keys in an environment variable like OPENAI_API_KEY.

#### Option 1: (Save API Keys in an environment variable on Mac/Linux/Windows)
- On macOS/Linux, create an environment variable named `OPENAI_API_KEY` in your shell startup file (`~/.zshrc` for zsh, `~/.bashrc` for bash) by running this command in your terminal: `echo "export OPENAI_API_KEY='yourkey'" >> ~/.zshrc`
- Update the shell with the new variable by running the command: `source ~/.zshrc`
- Confirm that you have set your environment variable using the command `echo $OPENAI_API_KEY`
- On Windows, you can set the variable from the Command Prompt with `setx OPENAI_API_KEY "yourkey"` and then open a new terminal window.
- Now inside your Python code, you can pass the key to the client with this line of code: `client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])`
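
A minimal sketch of reading a key defensively in Python. The helper `require_env` and the variable name `DEMO_API_KEY` are hypothetical names used only for this illustration; failing early with a clear message is usually easier to debug than a later 401 error.

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# Demo with a dummy variable so the sketch runs without a real key.
os.environ["DEMO_API_KEY"] = "sk-dummy-for-demo"
api_key = require_env("DEMO_API_KEY")
print(f"Key loaded, {len(api_key)} characters long")   # avoid printing the key itself
```

In real code you would call `require_env("OPENAI_API_KEY")` once at start-up and pass the result to the client.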

#### Option 2: (Save API Keys inside `.env` file):
- A better approach is to create a `.env` (or just `env`) file within your Python project directory and save all your API keys inside it, keeping the file out of your git repositories. Your `.env` file must not be shared publicly, as it contains secrets like API keys or passwords. To ensure this, add `.env` to your `.gitignore` file.
- In a real project, you may need more than one API key: one for OpenAI, another for Hugging Face, another for Google's Gemini, another for Pinecone, and so on. Your `.env` file might then contain many keys like:
```
OPENAI_API_KEY=sk-proj-xxxx
ANTHROPIC_API_KEY=sk-ant-xxxx
GOOGLE_API_KEY=AIzaxxxx
DEEPSEEK_API_KEY=sk-xxxx
```
- Now inside your Python code, you can load these keys using the `load_dotenv()` function of the `python-dotenv` package (imported as `dotenv`).

#### Points to Ponder:
- **Rotate Keys Regularly:** Rotate your keys on a regular schedule. A quarterly rotation is a widely accepted best practice. It minimizes the window of vulnerability in case a key ever gets compromised.
- **Environment Separation:** Use distinct keys for development, testing, and production environments. This isolation ensures that even if a test key is leaked, your production systems and costs remain protected.
- **Least Privilege Access:** Configure your API keys with only the minimal permissions needed for each task. For example, if one service only needs read access, don't give it full control. Limiting scope prevents small issues from turning into major security breaches.
- **Monitor Usage Patterns:** Most API providers offer dashboards or logs that show request frequency and token consumption. Keep an eye out for sudden spikes or unusual activity. These often indicate leaked credentials or automated misuse.

## d. Accessing and Testing the API Keys
- The `load_dotenv()` function looks for a file named `.env`, either at the path given in its first argument or, if no path is given, by searching the current directory and its parents. The optional keyword argument `override=True` means that any existing environment variables will be overwritten by values from the `.env` file (by default, existing variables are kept). The function reads each line in the `.env` file and sets the environment variables accordingly.
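
The behaviour described above can be imitated with a few lines of standard-library code. This is a simplified stand-in for illustration only, not the actual `python-dotenv` implementation (which also handles quoting rules, variable expansion, and file discovery); `load_env_file` and `DEMO_KEY` are hypothetical names.

```python
import os
import tempfile

def load_env_file(path: str, override: bool = False) -> bool:
    """Read KEY=VALUE lines from a file into os.environ; return True if any variable was set."""
    loaded = False
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue                      # skip blanks, comments, malformed lines
            key, value = line.split("=", 1)
            key, value = key.strip(), value.strip().strip("'\"")
            if override or key not in os.environ:
                os.environ[key] = value       # existing values survive unless override=True
                loaded = True
    return loaded

# Demo on a throwaway file with dummy values.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write("# demo secrets\nDEMO_KEY=from-file\n")
os.environ["DEMO_KEY"] = "from-shell"
load_env_file(tmp.name)                  # override=False keeps the shell value
print(os.environ["DEMO_KEY"])
load_env_file(tmp.name, override=True)   # override=True replaces it
print(os.environ["DEMO_KEY"])
os.remove(tmp.name)
```

The two printed values differ only because of `override`, which is why the notebook passes `override=True` when it wants the `.env` file to win over a stale shell variable.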

In [1]:
import os
from dotenv import load_dotenv

# Read key=value pairs from the given .env file and insert them into os.environ.
# load_dotenv() returns True if at least one environment variable was set, otherwise False.
load_dotenv('../keys/.env', override=True)
# os.getenv() returns the value associated with 'OPENAI_API_KEY', or None if it is not set.
openai_api_key = os.getenv('OPENAI_API_KEY')
if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:15]}")
else:
    print("OpenAI API Key not set")

OpenAI API Key exists and begins sk-proj-I-w8uq1


## e. Initialize the openai Client

<h1 align="center"><div class="alert alert-success" style="margin: 20px">The <b>openai client library</b> is a lightweight Python utility that turns your Python calls into HTTP request objects and later converts the HTTP response objects into Python objects.</div></h1>

- The `openai` Python client library is a high-level abstraction layer that sits between your Python application and OpenAI's REST API. It functions as a lightweight wrapper that transforms Python method calls into properly formatted HTTP requests and converts JSON responses back into Python objects.
- The `openai.OpenAI()` constructor instantiates a client object that encapsulates all the configurations needed to make API calls. It serves as the initialization point where you establish the connection parameters and authentication credentials needed for all subsequent API interactions.
- The return value is an OpenAI client object, which sends all subsequent requests to the chosen provider's API endpoint.
```python
from openai import OpenAI

client = OpenAI(
    api_key=openai_api_key,    # REQUIRED (or read from the OPENAI_API_KEY env var). The API key of OpenAI or of another OpenAI-compatible provider.
    base_url=None,             # OPTIONAL (defaults to https://api.openai.com/v1)
    timeout=600.0,             # OPTIONAL (in seconds; the SDK default is 600, i.e., 10 minutes)
    max_retries=2,             # OPTIONAL (default: 2 retries)
    default_headers=None,      # OPTIONAL (defaults to None)
    default_query=None,        # OPTIONAL (defaults to None)
    http_client=None,          # OPTIONAL (defaults to None)
)
``` 
      
- **`base_url` Parameter:**
    - The base URL is the address of the API server to which the client sends all requests. An API endpoint is the exact web address (base URL plus a path) where you send a request to use a specific function of the API.
    - The OpenAI API has become a de facto standard, so many AI providers now support the OpenAI SDK, and the openai client can call their models simply by specifying the appropriate `base_url`.
    - Every provider has its own base URL. The following base URLs are compatible with the OpenAI client, except Anthropic's:
        - OpenAI → `https://api.openai.com/v1`
        - Google → `https://generativelanguage.googleapis.com/v1beta/openai`
        - Groq → `https://api.groq.com/openai/v1`
        - Ollama → `http://localhost:11434/v1`
        - Deepseek → `https://api.deepseek.com/v1`   
        - Mistral → `https://api.mistral.ai/v1`
        - xAI (Grok) → `https://api.x.ai/v1`
        - Anthropic → `https://api.anthropic.com/v1`  **(Not compatible with OpenAI API)**
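
Because these providers expose OpenAI-compatible endpoints, switching providers can come down to changing two constructor arguments. The sketch below collects some of the base URLs listed above in a lookup table. The helper `client_settings` and the environment-variable naming convention are illustrative assumptions, so check each provider's documentation for the names it expects.

```python
import os

# Base URLs copied from the list above (OpenAI-compatible endpoints only).
BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "google": "https://generativelanguage.googleapis.com/v1beta/openai",
    "groq": "https://api.groq.com/openai/v1",
    "ollama": "http://localhost:11434/v1",
    "deepseek": "https://api.deepseek.com/v1",
    "mistral": "https://api.mistral.ai/v1",
}

def client_settings(provider: str) -> dict:
    """Return keyword arguments suitable for OpenAI(**settings)."""
    if provider not in BASE_URLS:
        raise ValueError(f"Unknown provider: {provider}")
    env_name = f"{provider.upper()}_API_KEY"          # assumed naming convention
    return {
        "base_url": BASE_URLS[provider],
        "api_key": os.getenv(env_name, "not-set"),
    }

settings = client_settings("groq")
print(settings["base_url"])
# With the SDK installed and a real key in the environment:
# from openai import OpenAI
# client = OpenAI(**client_settings("groq"))
```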

## f. Access API Endpoint: `client.models.list()`
- Think of an **API Endpoint** like different departments in a building - each department (endpoint) handles different tasks.
- All API requests start with a base URL, and then you add endpoint paths to access specific functions. The base URL is like the main entrance of a building, while endpoints are the different rooms inside.
- Think of it this way:
    - An API (Application Programming Interface) is like a menu in a restaurant.
    - An endpoint is a specific item on the menu that you can order, with its own format and expected inputs.
    - The LLM model is the “kitchen” that prepares the output.
- The Base URL specifies where to send requests. For OpenAI the base url is `https://api.openai.com/v1`
- The Full URL endpoint is like a door into a different capability of the model.


| Client Method                          | Full URL Endpoint                                | Typical Models Used                                          | API Action & Description                                                             |
| -------------------------------------- | ------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------------------------------ |
| `client.models.list()`                 | `https://api.openai.com/v1/models`               |                                                              | Returns a list of all available models.                                              |
| `client.chat.completions.create()`     | `https://api.openai.com/v1/chat/completions`     | `gpt-5.2`, `gpt-5.2 pro`, `gpt-4.1`, `gpt-4o`, `gpt-4o-mini` | Generates conversational/chat responses from a model.                                |
| `client.responses.create()`            | `https://api.openai.com/v1/responses`            | `gpt-5.2`, `gpt-5.2 pro`, `gpt-4.1`, `gpt-4o`, `gpt-4o-mini` | Unified endpoint for text, multimodal outputs, structured responses, and tool calls. |
| `client.audio.transcriptions.create()` | `https://api.openai.com/v1/audio/transcriptions` | `gpt-4o-transcribe`, `gpt-4o-mini-transcribe`, `whisper-1`   | Converts spoken audio into text (speech-to-text).                                    |
| `client.audio.translations.create()`   | `https://api.openai.com/v1/audio/translations`   | `whisper-1`                                                  | Transcribes audio and translates it into English.                                    |
| `client.images.generate()`             | `https://api.openai.com/v1/images/generations`   | `gpt-image-1`, `dall-e-3`                                    | Generates images from text prompts.                                                  |
| `client.images.edit()`                 | `https://api.openai.com/v1/images/edits`         | `gpt-image-1`, `dall-e-2`                                    | Edits an existing image using a prompt and an optional mask.                         |
| `client.images.create_variation()`     | `https://api.openai.com/v1/images/variations`    | `dall-e-2`                                                   | Creates variations of an uploaded image.                                             |
| `client.embeddings.create()`           | `https://api.openai.com/v1/embeddings`           | `text-embedding-3-large`, `text-embedding-3-small`           | Generates vector embeddings for text (search, similarity, clustering).               |


```
https://api.openai.com/v1/models
│       │              │  │
│       │              │  └──── Resource type: "models" (a GET request on this path lists the models; there is no separate "/list" path)
│       │              └─────── API version: "v1", usually included (e.g., /v1) to indicate which version of the API you're calling.
│       └────────────────────── OpenAI's API server
└────────────────────────────── Protocol: http:// or https:// (secure web communication)
```
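
The decomposition of an endpoint URL can also be checked programmatically. The sketch below uses the standard library's `urllib.parse` to join a base URL with an endpoint path and split the result into its parts; the helper name `endpoint_url` is an assumption made for illustration.

```python
from urllib.parse import urlsplit

BASE_URL = "https://api.openai.com/v1"

def endpoint_url(base_url: str, path: str) -> str:
    """Join a base URL and an endpoint path with exactly one slash between them."""
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}"

url = endpoint_url(BASE_URL, "/chat/completions")
parts = urlsplit(url)
print(url)
print("protocol:", parts.scheme)
print("server:  ", parts.netloc)
print("path:    ", parts.path)

# The first path segment is the API version; the rest names the resource.
version, *resource = parts.path.strip("/").split("/")
print("version: ", version)
print("resource:", "/".join(resource))
```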

In [2]:
# List the available models
import os
from dotenv import load_dotenv
from openai import OpenAI
import rich


load_dotenv('../keys/.env', override=True) 
openai_api_key = os.getenv('OPENAI_API_KEY')

# Instantiate the OpenAI class, either by explicitly passing the base URL and the API key, or with no arguments, in which case it uses the default OpenAI base URL and reads the API key from the OPENAI_API_KEY environment variable
#client = OpenAI() 
client = OpenAI(api_key=openai_api_key, base_url="https://api.openai.com/v1")

models = client.models.list() # calls the /models endpoint and returns a list of model objects.

In [3]:
models

SyncPage[Model](data=[Model(id='gpt-4-0613', created=1686588896, object='model', owned_by='openai'), Model(id='gpt-4', created=1687882411, object='model', owned_by='openai'), Model(id='gpt-3.5-turbo', created=1677610602, object='model', owned_by='openai'), Model(id='gpt-5.2-codex', created=1766164985, object='model', owned_by='system'), Model(id='gpt-4o-mini-tts-2025-12-15', created=1765610837, object='model', owned_by='system'), Model(id='gpt-realtime-mini-2025-12-15', created=1765612007, object='model', owned_by='system'), Model(id='gpt-audio-mini-2025-12-15', created=1765760008, object='model', owned_by='system'), Model(id='chatgpt-image-latest', created=1765925279, object='model', owned_by='system'), Model(id='davinci-002', created=1692634301, object='model', owned_by='system'), Model(id='babbage-002', created=1692634615, object='model', owned_by='system'), Model(id='gpt-3.5-turbo-instruct', created=1692901427, object='model', owned_by='system'), Model(id='gpt-3.5-turbo-instruct-09

In [4]:
# rich is a third-party Python library that makes terminal output readable, structured, and pleasant, without changing your data (dicts, lists, dataclasses, Pydantic models, custom objects)
import rich
rich.print(models)

In [15]:
# Each object in models.data contains fields such as id, object, created, and owned_by. We loop through and print model.id to see the model names you can use with endpoints like chat, embeddings, or images.
for m in models.data:
    print(f"  - {m.id}")

  - gpt-4-0613
  - gpt-4
  - gpt-3.5-turbo
  - gpt-5.2-codex
  - gpt-4o-mini-tts-2025-12-15
  - gpt-realtime-mini-2025-12-15
  - gpt-audio-mini-2025-12-15
  - chatgpt-image-latest
  - davinci-002
  - babbage-002
  - gpt-3.5-turbo-instruct
  - gpt-3.5-turbo-instruct-0914
  - dall-e-3
  - dall-e-2
  - gpt-4-1106-preview
  - gpt-3.5-turbo-1106
  - tts-1-hd
  - tts-1-1106
  - tts-1-hd-1106
  - text-embedding-3-small
  - text-embedding-3-large
  - gpt-4-0125-preview
  - gpt-4-turbo-preview
  - gpt-3.5-turbo-0125
  - gpt-4-turbo
  - gpt-4-turbo-2024-04-09
  - gpt-4o
  - gpt-4o-2024-05-13
  - gpt-4o-mini-2024-07-18
  - gpt-4o-mini
  - gpt-4o-2024-08-06
  - chatgpt-4o-latest
  - gpt-4o-audio-preview
  - gpt-4o-realtime-preview
  - omni-moderation-latest
  - omni-moderation-2024-09-26
  - gpt-4o-realtime-preview-2024-12-17
  - gpt-4o-audio-preview-2024-12-17
  - gpt-4o-mini-realtime-preview-2024-12-17
  - gpt-4o-mini-audio-preview-2024-12-17
  - o1-2024-12-17
  - o1
  - gpt-4o-mini-realtime-pre

- **Base Family:**
    - Indicates the generation, e.g., gpt-3.5, gpt-4, gpt-4o, gpt-4.1, gpt-5, o1, o3, etc.
- **Variant / Specialization:**
    - turbo → cheaper/faster optimized variant.
    - mini / nano → smaller, faster, cheaper versions (less capable).
    - instruct → trained to follow instructions better (legacy style).
    - realtime → optimized for low-latency streaming (e.g. conversation, agents).
    - audio / tts / transcribe → multimodal audio features.
    - embedding → vector embedding models for retrieval/search.
    - moderation → content filtering / safety models.
    - image / dall-e → image generation models.
- **Release Version:**
    - Dates (e.g. 2025-08-07, 2024-07-18) → specific snapshots.
    - Codes (e.g. 0613, 1106, 0125) → older date-style release tags in MMDD form (e.g. 0613 = June 13).
    - latest → rolling alias that always points to the newest stable version.
    - preview → early access, may change or break.
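
As a rough illustration of this naming scheme, the sketch below separates the release tag from the rest of a model ID. The regular expressions and the helper `split_release` are assumptions based only on the patterns listed above; treat the result as a heuristic rather than an official grammar for model names.

```python
import re
from typing import Optional, Tuple

DATE_SUFFIX = re.compile(r"-(\d{4}-\d{2}-\d{2})$")   # e.g. -2024-07-18
CODE_SUFFIX = re.compile(r"-(\d{4})$")               # e.g. -0613

def split_release(model_id: str) -> Tuple[str, Optional[str]]:
    """Return (name_without_release, release_tag); the tag is None for a plain alias."""
    for pattern in (DATE_SUFFIX, CODE_SUFFIX):        # try the full date first
        match = pattern.search(model_id)
        if match:
            return model_id[:match.start()], match.group(1)
    if model_id.endswith("-latest"):
        return model_id[: -len("-latest")], "latest"
    return model_id, None

for model_id in ["gpt-4o-mini-2024-07-18", "gpt-4-0613", "chatgpt-4o-latest", "gpt-4o"]:
    print(model_id, "->", split_release(model_id))
```

Pinning a dated snapshot in production code keeps behaviour stable, while a plain alias such as `gpt-4o` may start pointing to a newer snapshot over time.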

## g. Access API Endpoint: `client.chat.completions.create()`
<h3 align="center"><div class="alert alert-success" style="margin: 20px">Chat Completions API is a simple, stateless API for generating AI responses to messages where you manually manage conversation history and send it with each request.</h3>

- This endpoint has long been the de facto standard for conversational AI applications and supports OpenAI's chat models, including gpt-4, gpt-4o, gpt-4o-mini, gpt-3.5-turbo, and their variants.
- In the following line of code, `client` represents the connection, `chat.completions` represents the endpoint, and `create` is the command, with two required parameters:
```python
response = client.chat.completions.create(
    model="gpt-4o-mini",                # (a string specifying the text completion model)
    messages=[                          # (a list of dictionaries, each having two properties: `role` and `content`)
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is AI?"}
    ]
)
```

**What happens when you run a Chat Completions API call:**
1. SDK sends POST request to: `https://api.openai.com/v1/chat/completions`
2. Server processes with the specified model, e.g., `gpt-4o-mini`
3. Response comes back as JSON
4. SDK converts JSON to Python object
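
Steps 3 and 4 can be pictured with a hand-written JSON document shaped like a Chat Completions response. The values below are made up for illustration and the real response carries more fields; the SDK returns typed objects (for example `response.choices[0].message.content`) rather than the plain dictionaries used here.

```python
import json

# Abridged, made-up example of the JSON body returned by /v1/chat/completions.
raw_json = """
{
  "id": "chatcmpl-example",
  "object": "chat.completion",
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "AI is the simulation of human intelligence by machines."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 19, "completion_tokens": 11, "total_tokens": 30}
}
"""

data = json.loads(raw_json)                          # JSON text -> Python dict
answer = data["choices"][0]["message"]["content"]    # same path the SDK exposes as attributes
print(answer)
print("tokens used:", data["usage"]["total_tokens"])
```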

- **Limitations:**
    - Conversation history must be managed by the developer using `messages = [system] + history + [user]`
    - Function calling capability is more limited than in the Responses API
    - Does not fully support the latest Agentic AI tools
> Use OpenAI's `Chat Completions API` for simple use cases where you don't need state management or tool usage.
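
A minimal sketch of the manual history management mentioned above, following the `messages = [system] + history + [user]` pattern. To keep it runnable without an API key, the model call is replaced by a stand-in function `fake_model`; in real code that function would call `client.chat.completions.create(model=..., messages=messages)` and return the reply text. All helper names are illustrative.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

def fake_model(messages: List[Message]) -> str:
    """Stand-in for the API call: reports how many messages it received."""
    return f"(reply after seeing {len(messages)} messages)"

def chat_turn(system: Message, history: List[Message], user_text: str,
              model_fn: Callable[[List[Message]], str] = fake_model) -> str:
    """Send the full conversation each time, then record the new turn in history."""
    user = {"role": "user", "content": user_text}
    messages = [system] + history + [user]        # the API is stateless, so resend everything
    reply = model_fn(messages)
    history.extend([user, {"role": "assistant", "content": reply}])
    return reply

system = {"role": "system", "content": "You are a helpful assistant."}
history: List[Message] = []
print(chat_turn(system, history, "My name is Ada."))
print(chat_turn(system, history, "What is my name?"))
print(len(history), "messages stored")
```

Every turn resends the whole history, so the number of input tokens (and the cost) grows with the length of the conversation.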

In [4]:
# rich is a third-party Python library that makes terminal output readable, structured, and pleasant, without changing your data (dicts, lists, dataclasses, Pydantic models, custom objects)
import rich
rich.print(response)

In [6]:
import os
from dotenv import load_dotenv
from openai import OpenAI
import rich           # rich is a third-party Python library that makes terminal output structured, readable, and human-friendly — without changing your data.

load_dotenv('../keys/.env', override=True) 
openai_api_key = os.getenv('OPENAI_API_KEY')

client = OpenAI(api_key=openai_api_key, base_url="https://api.openai.com/v1")

response = client.chat.completions.create(
    model='gpt-4.1-nano',
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the color of sky? Tell me in a single line."},
    ],
)
# Study the structure of the response object returned by the Chat Completions API
print(response.choices[0].message.content)
# rich.print inspects the object, detects its structure (dataclasses, Pydantic, attrs, lists, dicts), recursively formats fields, and renders it as a human-readable tree
rich.print(response)

The sky is typically blue during the day due to the scattering of sunlight by Earth's atmosphere.


## h. Access API Endpoint: `client.responses.create()`

<h3 align="center"><div class="alert alert-success" style="margin: 20px">Responses API is OpenAI's newest unified API that combines the simplicity of Chat Completions with the stateful conversation management and built-in tools of Assistants API into a single streamlined interface.</h3>

<h3 align="center"><div class="alert alert-success" style="margin: 20px">The Responses API eliminates the need for the manual history management required by the Chat Completions API, i.e., using `messages = [system] + history + [user]`</div></h3>

<h3 align="center"><div class="alert alert-success" style="margin: 20px">Responses API unlocks persistent reasoning, hosted tools, and multimodal workflows for GPT-5.</h3>

- OpenAI's **`Responses API`** is a new stateful API introduced in March 2025 that brings together the best capabilities of the `Chat Completions` and `Assistants` APIs in a single streamlined interface.
- The `Responses` API is designed as the next-generation successor to the Chat Completions API. It covers the core functionality of Chat Completions while adding powerful new features like built-in tools, stateful conversations, and background processing.
- For new projects, the Responses API is the recommended choice due to its enhanced capabilities and unified approach to AI interactions.
- In the following line of code, `client` represents the connection, `responses` represents the endpoint, and `create` is the command, with two required parameters:
```python
response = client.responses.create(
    model = "gpt-4o-mini",                # (a string specifying the model)
    input = "What is AI?"                 # (can be a simple string OR a list of message dictionaries like chat.completions)
  # input = [{"role": "user", "content": "What is AI?"}]
)
```
- **Key features:**
    - **Flexible input:** Accepts both simple strings and message arrays.
    - **Supports multi-modality:** Supports image inputs when using vision-enabled models like GPT-4o, GPT-4o-mini, and GPT-4 Turbo. The input content can be an array containing both text and images (either a valid HTTP/HTTPS URL to an image, or a base64-encoded image). Audio input and output with the gpt-4o-audio-preview model, where base64-encoded WAV files are embedded in the list of messages much like image inputs, has been offered through the Chat Completions API; check the current documentation for audio support in the Responses API.
    - **Stateful conversations:** Automatically manages conversation history, so there is no need to manually track previous messages. Simply pass `previous_response_id` to continue a conversation.
    - **Built-in tools:** Built-in support for hosted tools such as `web_search`, `file_search`, `computer_use`, `code_interpreter`, and `image_generation`, enabling rich, tool-augmented responses without extra glue code.
    - **Simplified orchestration:** Automatically handles multi-turn reasoning and tool calls, including streaming events and internal reasoning items, making complex workflows easier to build and maintain.
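
A sketch of chaining turns with `previous_response_id`, written as a helper that accepts any client object exposing `responses.create()`. The helper name `ask_in_sequence` is hypothetical. Chaining relies on earlier responses being stored on OpenAI's side, which is assumed here to be the default behaviour; confirm this in the official documentation before relying on it.

```python
from typing import List

def ask_in_sequence(client, model: str, questions: List[str]) -> List[str]:
    """Ask several questions in one conversation by chaining response IDs."""
    answers: List[str] = []
    previous_id = None
    for question in questions:
        kwargs = {"model": model, "input": question}
        if previous_id is not None:
            kwargs["previous_response_id"] = previous_id   # link to the earlier turn
        response = client.responses.create(**kwargs)
        answers.append(response.output_text)
        previous_id = response.id                          # remember for the next turn
    return answers

# Usage with a real client (requires the openai package and a valid key):
# from openai import OpenAI
# client = OpenAI()
# print(ask_in_sequence(client, "gpt-4.1-nano", ["My name is Ada.", "What is my name?"]))
```

Only the new question is sent on each turn; the earlier context is looked up on the server through the response ID.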
    

> **Note:** The Assistants API is another OpenAI API that you may come across. It predates the Responses API, never left beta, and has been deprecated by OpenAI in favor of the Responses API.

**What happens when you run a Responses API call:**
1. SDK sends POST request to: `https://api.openai.com/v1/responses`
2. Server processes with the specified model, e.g., `gpt-4o-mini`
3. Response comes back as JSON
4. SDK converts JSON to Python object
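
The JSON returned by the Responses API is shaped differently from a Chat Completions response: the text sits inside an `output` list of items, each with its own `content` list. The sketch below walks a hand-written, abridged example (the values are made up) and gathers the text parts, which is roughly what the SDK's `response.output_text` convenience property does; `collect_output_text` is a hypothetical helper.

```python
import json

# Abridged, made-up example of the JSON body returned by /v1/responses.
raw_json = """
{
  "id": "resp_example",
  "object": "response",
  "model": "gpt-4o-mini",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{"type": "output_text", "text": "AI is the field of building machines that perform tasks requiring intelligence."}]
    }
  ],
  "usage": {"input_tokens": 10, "output_tokens": 14, "total_tokens": 24}
}
"""

def collect_output_text(payload: dict) -> str:
    """Concatenate every output_text part found in the message items."""
    parts = []
    for item in payload.get("output", []):
        if item.get("type") != "message":
            continue                        # skip non-message items such as reasoning or tool calls
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)

data = json.loads(raw_json)
print(collect_output_text(data))
print("tokens used:", data["usage"]["total_tokens"])
```

Walking the list this way is more robust than indexing `output[0]`, because the first item is not guaranteed to be a message when reasoning or tool-call items are present.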



> **A Quick Comparison between Chat Completion and Responses API:** https://platform.openai.com/docs/guides/migrate-to-responses

In [9]:
import os
from dotenv import load_dotenv
from openai import OpenAI
import rich
load_dotenv('../keys/.env', override=True) 
openai_api_key = os.getenv('OPENAI_API_KEY')

client = OpenAI(api_key=openai_api_key, base_url="https://api.openai.com/v1")

response = client.responses.create(
    model='gpt-4.1-nano',
    # 'input' replaces 'messages' in the Responses API. It can be a simple string holding the user query:
    # input="What is the color of sky? Tell me in a single line.",
    # or a list of messages specifying the system/developer and user roles:
    input=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the color of sky? Tell me in a single line."},
    ],
)

# print(response.output_text)   # shortcut that returns the generated text directly
# Study the structure of the response object returned by the Responses API
print(response.output[0].content[0].text)
rich.print(response)

The sky is typically blue during the day.


## i. Understanding real business costs for AI usage

In [5]:
# Get the sum of input tokens + output tokens for the specific API call
total_tokens = response.usage.total_tokens 

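# Example prices (GPT-4o-mini): $0.15 per million input tokens (i.e., $0.00015 per 1,000 input tokens)
# and $0.60 per million output tokens (i.e., $0.00060 per 1,000 output tokens).
# Simplification: every token is charged at the sum of both rates, which overestimates the true cost.
# Prices differ per model (the call above used gpt-4.1-nano), so check the pricing page for current values.
price_per_1k_tokens = 0.00075  # $0.00075 per 1k tokens (input rate + output rate)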

# Calculate total cost
total_cost = (total_tokens / 1000) * price_per_1k_tokens

print("\nCost for above call:")
print("-"*30)
print(f"  TOTAL TOKENS USED: {total_tokens}")
print(f"  ESTIMATED COST:   ${total_cost:.7f}")


Cost for above call:
------------------------------
  TOTAL TOKENS USED: 50
  ESTIMATED COST:   $0.0000375
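
The estimate above applies one combined rate to every token. A slightly more faithful sketch bills input and output tokens at their own rates. `estimate_cost` is a hypothetical helper; its default rates are the GPT-4o-mini prices quoted above and are assumptions that should be replaced with the current rates of the model you actually call. With a real response, the two counts are available as `response.usage.input_tokens` and `response.usage.output_tokens`.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float = 0.15,   # assumed rate in USD, taken from the comments above
    output_price_per_million: float = 0.60,  # assumed rate in USD, taken from the comments above
) -> float:
    """Estimate the cost of one call by billing input and output tokens separately."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost

# Illustrative token counts (not taken from a real API call)
cost = estimate_cost(input_tokens=30, output_tokens=20)
print(f"ESTIMATED COST: ${cost:.7f}")
```

Because output tokens cost more than input tokens, two calls with the same total token count can differ in price depending on how long the answer is.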


# <span style='background :lightgreen' >2. Understanding Arguments to `client.responses.create()` Method</span>

```python
response = client.responses.create(
    model="gpt-4.1-nano",                # REQUIRED: model ID or alias
    input="What is AI?",                 # REQUIRED: input text, or a list of input items such as messages with roles developer, user, assistant (new to Responses API)
    instructions=None,                   # OPTIONAL: system/developer instructions to guide model behavior (like a system prompt) (new to Responses API)
    max_output_tokens=None,              # OPTIONAL: maximum tokens to generate, equivalent to max_tokens in Chat Completions API (new to Responses API)
    temperature=1.0,                     # OPTIONAL: randomness, range: 0.0–2.0, default: 1.0
    top_p=1.0,                           # OPTIONAL: nucleus sampling, range: 0.0–1.0, default: 1.0
    text={"format": {"type": "text"}},   # OPTIONAL: output format configuration: text or structured JSON (new to Responses API)
    stream=False,                        # OPTIONAL: whether to stream the response, default: False
    reasoning={"effort": "medium"},      # OPTIONAL: reasoning effort level for reasoning-capable models; values: "minimal", "low", "medium" (default), "high" (new to Responses API)
    previous_response_id=None,           # OPTIONAL: continue from a previous response by its ID, without resending earlier messages (string) (new to Responses API)
    conversation=None,                   # OPTIONAL: conversation object for stateful multi-turn conversations
    tools=None,                          # OPTIONAL: list of tools/functions available for the model to call
    tool_choice=None,                    # OPTIONAL: control which tool/function (if any) the model must call
    background=False,                    # OPTIONAL: enables asynchronous processing in background for long tasks (bool, default: False) (new to Responses API)
    metadata=None,                       # OPTIONAL: custom metadata for tagging requests (dict) (new to Responses API)
    store=True,                          # OPTIONAL: whether to store the response for later retrieval (bool, default: True) (new to Responses API)
    include=None,                        # OPTIONAL: additional parts of the output to include in the response (list of strings) (new to Responses API)
    parallel_tool_calls=True,            # OPTIONAL: allow tools to be called in parallel (bool, default: True) (new to Responses API)
)

```
### (i) The **`model`** Argument
- The **`model`** parameter specifies the model ID or alias to which we want to send our request. OpenAI offers a wide range of models with different capabilities, performance characteristics, and price points. Refer to the model guide to browse and compare available models.
- **Model ID (Static/Dated Version):** A model ID (often called a "snapshot" or "dated model") is a specific version of a model frozen at a particular date. This version never changes. For example:
    - gpt-4o-2024-11-20 → specific snapshot from November 20, 2024
    - gpt-4o-2024-08-06 → specific snapshot from August 6, 2024
    - gpt-4o-2024-05-13 → original GPT-4o release from May 13, 2024
    - gpt-4o-mini-2024-07-18 → specific GPT-4o mini snapshot
    - o1-2024-12-17 → specific o1 reasoning model snapshot
- **Model Alias (Dynamic Pointer):** A model alias is a convenient shorthand name that automatically points to the latest version of a model. When OpenAI updates the model, the alias automatically redirects to the new version. For example:
    - gpt-4o → currently points to gpt-4o-2024-11-20
    - gpt-4o-mini → currently points to gpt-4o-mini-2024-07-18
    - gpt-4-turbo → points to the latest GPT-4 Turbo snapshot
    - chatgpt-4o-latest → points to the latest ChatGPT variant of GPT-4o
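
The difference between the two naming styles can be checked mechanically. The sketch below assumes that dated snapshots end in a `-YYYY-MM-DD` suffix, as in the examples above; this is a naming convention rather than a guarantee, and some older snapshots use other suffixes. `is_dated_snapshot` is a hypothetical helper name.

```python
import re

# Assumption: a dated snapshot ends with "-YYYY-MM-DD", e.g. gpt-4o-2024-11-20
SNAPSHOT_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")

def is_dated_snapshot(model_name: str) -> bool:
    """Return True if the name looks like a frozen, dated snapshot rather than an alias."""
    return SNAPSHOT_SUFFIX.search(model_name) is not None

for name in ["gpt-4o-2024-11-20", "gpt-4o", "o1-2024-12-17", "chatgpt-4o-latest"]:
    kind = "snapshot" if is_dated_snapshot(name) else "alias"
    print(f"{name:22} -> {kind}")
```

Pinning a snapshot is the safer choice when you need reproducible behavior over time; an alias is convenient when you always want the newest version.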

### (ii) The **`input`** Argument
- When accessing LLMs programmatically via APIs, prompts become structured data that must be formatted according to the API's specific requirements. The prompt is typically part of a request payload that includes additional parameters for controlling model behavior. Different model APIs represent prompts in different ways; the most common is the OpenAI-compatible format (a de facto standard), which is supported by the OpenAI API, Ollama, Anthropic Claude, and many others.
- The **`input`** parameter in Responses API can accept simple text or a list of dictionaries, each having two properties: `role` and `content`.
- The `messages` parameter in Chat Completions API can only accept a list of dictionaries, each having two properties: `role` and `content`.
- The **`role`** can take one of **four** values: `system`, `developer`, `user`, or `assistant`. (The `developer` role is often used as an alternative to `system` to indicate developer instructions, especially in the Responses API.)
    - **`system`:** Defines rules, constraints and behavioral guidelines
        - Defines the overall personality, expertise level and communication style the AI should adopt.
        - Establishes who the AI is (a data scientist, teacher, technical expert, or financial advisor).
        - Once defined, the system role ensures consistency, maintaining a uniform tone, style, and professionalism across all interactions within a session.
    - **`developer`:** Controls logic, structure and technical execution
        - Defines technical constraints, data structures, and processing rules that govern AI behavior.
        - Controls API calls, function execution, and integration with external systems and databases.
        - Manages multi-step processes, conditional logic, and automated decision-making pipelines.
    - **`user`:** Provides intent, questions and specific requests
        - Represents the human’s instructions, questions, goals, and constraints that the AI must respond to or act upon.
        - The user role defines why the conversation exists; all other roles ultimately serve the user’s intent.
    - **`assistant`:** Holds the model's prior responses, useful for maintaining conversational context in multi-turn interactions.
        - Produces answers or actions based strictly on the system and developer rules, while addressing the user’s request.
        - Ensures outputs follow all mentioned constraints, adhere to safety rules, and remain helpful, correct, and on-topic.

#### Example Prompts to a Model via APIs:
- **Example 1:**
```python
messages = [
    {"role": "system", "content": "You are a helpful assistant that answers in JSON."},
    {"role": "developer", "content": "If the user asks about pricing, always mention the discount code 'SAVE10'."},
    {"role": "user", "content": "Can you tell me the price of the premium plan?"},
]
```
- **Example 2:**
```python
messages = [
    {"role": "system", "content": "You are a polite customer support assistant."},
    {"role": "developer", "content": "Always ask the user for their ticket number before giving a solution."},
    {"role": "user", "content": "My internet is not working, can you fix it?"},
]
```
- **Example 3:**
```python
messages = [
    {"role": "system", "content": "You are a motivating fitness coach."},
    {"role": "developer", "content": "Always end your response with: 'Stay strong 💪'."},
    {"role": "user", "content": "Give me a workout plan for weight loss."},
]
```
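
Before sending such a list to an API, it can help to check its shape locally. The following is a minimal sketch that validates the `role`/`content` structure described above. `validate_messages` and `ALLOWED_ROLES` are hypothetical names, and the check only covers plain string content, whereas real APIs also accept richer content types.

```python
ALLOWED_ROLES = {"system", "developer", "user", "assistant"}

def validate_messages(messages: list[dict]) -> list[str]:
    """Return a list of problems found; an empty list means the structure looks fine."""
    problems = []
    for i, message in enumerate(messages):
        if message.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i}: unknown role {message.get('role')!r}")
        content = message.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i}: content must be a non-empty string")
    return problems

good = [
    {"role": "developer", "content": "Always end your response with: 'Stay strong'."},
    {"role": "user", "content": "Give me a workout plan for weight loss."},
]
bad = [{"role": "bot", "content": "Hello"}, {"role": "user", "content": ""}]

print(validate_messages(good))
print(validate_messages(bad))
```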

## b. Writing a Helper Function for Convenience

In [1]:
import os
from dotenv import load_dotenv
from openai import OpenAI

# Load API key from .env
load_dotenv("../keys/.env", override=True)
openai_api_key = os.getenv("OPENAI_API_KEY")

# Create OpenAI client
client = OpenAI(base_url="https://api.openai.com/v1", api_key=openai_api_key)

def ask_openai(
    user_prompt: str,
    developer_prompt: str = "You are a helpful assistant that provides concise answers.",
    model: str = "gpt-4.1-nano",
    max_output_tokens: int | None = 1024,
    temperature: float | None = 0.7,
    top_p: float = 1.0,
    text: dict | None = None,
    stream: bool = False,
    reasoning: dict | None = None,
):
    """Send one developer prompt and one user prompt to the Responses API.

    Returns the aggregated output text, or the raw event stream when stream=True.
    """
    # Default to plain-text output; created here to avoid a shared mutable default argument
    if text is None:
        text = {"format": {"type": "text"}}

    # Prepare input messages as a list of role/content dictionaries
    input_messages = [
        {"role": "developer", "content": developer_prompt},
        {"role": "user", "content": user_prompt},
    ]

    # Responses API call
    response = client.responses.create(
        input=input_messages,
        model=model,
        max_output_tokens=max_output_tokens,
        temperature=temperature,
        top_p=top_p,
        text=text,
        stream=stream,
        reasoning=reasoning,
    )

    if stream:                    # Return the stream of events if requested
        return response
    return response.output_text   # Return the aggregated text output

## c. Hands-On Understanding of different Parameters

In [7]:
# Asking date from gpt-4.1-nano
user_prompt = "What is the date today?"
response = ask_openai(user_prompt=user_prompt, model='gpt-4.1-nano')
print(response)

Today is October 4, 2023.


In [8]:
# Asking date from gpt-4o-mini
user_prompt = "What is the date today?"
response = ask_openai(user_prompt=user_prompt, model='gpt-4o-mini')
print(response)

Today's date is October 28, 2023.


In [9]:
# Asking date from gpt-3.5-turbo
user_prompt = "What is the date today?"
response = ask_openai(user_prompt=user_prompt, model='gpt-3.5-turbo')
print(response)

I am an AI assistant and do not have real-time capabilities to provide the current date. Please check your device or calendar for the most up-to-date information.


In [10]:
# Asking date from gpt-4-turbo
user_prompt = "What is the date today?"
response = ask_openai(user_prompt=user_prompt, model='gpt-4-turbo')
print(response)

I'm unable to provide the current date. Please check your device or another reliable source for this information.


In [11]:
# Asking date from gpt-5-nano. Unlike the older models above, it returns the actual current date, which suggests that GPT-5 models
# receive real-time context (such as the current date and time) in their API configuration. Depending on the deployment, they may
# also have access to recent news, browsing, or other external tools.
# gpt-5-nano does not support the temperature argument, so we pass temperature=None here.
user_prompt = "What is the date today?"
response = ask_openai(user_prompt=user_prompt, model='gpt-5-nano', temperature=None)
print(response)

Today is January 30, 2026.


### Understanding **`input`** Parameter (developer and user messages)

In [12]:
developer_prompt = "You are a friendly tutor who explains in very simple and humorous way."
user_prompt = "Tell me about Generative AI in a single line"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

Generative AI is like a super creative robot that can whip up new art, stories, and music just by learning from what it’s seen before—kind of like a chef who only cooks from leftovers but makes gourmet dishes!


### Understanding **`max_output_tokens`** Parameter
- Text generation models process text in tokens; in English, one token corresponds to roughly 3–4 characters.
- The context window of a model defines the maximum number of tokens that can be processed including both the input prompt and the model’s output.
- The `max_output_tokens` parameter limits only the length of the generated output, not the input. The sum of input tokens + max_output_tokens must not exceed the model's context window.
- This parameter is crucial for controlling verbosity, managing token usage, and avoiding context window overflow.

| Model        | Context Window | Max Output Tokens |
|--------------|----------------|-------------------|
| **gpt-4o-2024-08-06** | 128,000 tokens | 16,384 tokens |
| **gpt-4o-2024-05-13** | 128,000 tokens | 4,096 tokens |
| **o1-2024-12-17** | 200,000 tokens | 100,000 tokens |
| **o1-mini-2024-09-12** | 128,000 tokens | 65,536 tokens |
| **o1-preview-2024-09-12** | 128,000 tokens | 32,768 tokens |
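
The context-window rule above is simple arithmetic, sketched below. Both helper names are hypothetical, and the characters-per-token ratio is only the rough rule of thumb mentioned earlier; a real tokenizer (for example the `tiktoken` library) is needed for exact counts.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate based on character count (an approximation, not a tokenizer)."""
    return max(1, round(len(text) / chars_per_token))

def fits_context(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Check that input tokens plus the requested output budget stay within the context window."""
    return input_tokens + max_output_tokens <= context_window

prompt = "Tell me about Generative AI in a single paragraph"
prompt_tokens = estimate_tokens(prompt)
print(f"Estimated prompt tokens: {prompt_tokens}")

# Using the 128,000-token window and 16,384-token output limit from the table above
print(fits_context(prompt_tokens, 16_384, 128_000))
print(fits_context(120_000, 16_384, 128_000))
```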


In [19]:
user_prompt = "Tell me about Generative AI in a single paragraph"
response = ask_openai(user_prompt=user_prompt, max_output_tokens=16)
#response = ask_openai(user_prompt=user_prompt)
print(response)

Generative AI refers to artificial intelligence systems designed to create new content, such as


### Understanding **`temperature`** Parameter


<div style="text-align: center;">
    <img src="../images/token-generation.png" width="1500">
</div>


- The `temperature` parameter controls the randomness and creativity of the model's responses. OpenAI models are non-deterministic, meaning identical inputs can generate different outputs.
- Valid range: 0.0 – 2.0, default: 1.0 (for models that support it). Some models, such as gpt-5-nano, do not accept the `temperature` parameter and will reject requests that set it.
- Lower temperature values (closer to 0) produce more deterministic, predictable outputs, suitable for factual answers or structured tasks.
- Higher temperature values (>1) encourage the model to produce more creative, imaginative, and varied responses, suitable for storytelling or brainstorming tasks, but may increase hallucinations.

| Temperature   | Behavior                                                               | Use Case                                                                                       |
| ------------- | ---------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| **0.0 - 0.3** | Very deterministic, focused, consistent; selects the most likely words | Data extraction, grammar fixes, translation, code generation, factual answers                  |
| **0.4 - 0.7** | Balanced creativity and coherence; moderate variability                | General content generation, Q&A, casual conversation, moderate creativity tasks                |
| **0.7 - 1.0** | More creative and varied; higher randomness                            | Creative writing, brainstorming, poetry, storytelling, marketing copy (monitor hallucinations) |
| **1.0 - 2.0** | Highly random and unpredictable; maximum creativity                    | Experimental or highly creative content (risk of incoherence and hallucinations)               |
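
To see why temperature changes the output, here is a minimal sketch of the usual temperature-scaled softmax: each token score (logit) is divided by the temperature before being turned into a probability. The logits are made-up numbers for illustration, `softmax_with_temperature` is a hypothetical helper, and how OpenAI implements sampling internally is not shown here. The sketch requires a temperature above zero; in practice a temperature of 0 is treated as always picking the most likely token.

```python
import math

def softmax_with_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Convert token scores into probabilities after dividing each score by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = {token: score / temperature for token, score in logits.items()}
    highest = max(scaled.values())  # subtract the maximum for numerical stability
    exps = {token: math.exp(value - highest) for token, value in scaled.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}

# Made-up scores for candidate next tokens after "Listen to your ..."
logits = {"heart": 3.0, "mother": 2.0, "body": 1.0}

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, {token: round(p, 3) for token, p in probs.items()})
```

At low temperature almost all probability sits on the top token, while at high temperature the distribution flattens and unlikely tokens get sampled more often.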


In [13]:
# With temperature=0, rerunning this code usually gives the same (or nearly the same) output, because the model almost always picks the most likely token
user_prompt = "Write a short single paragraph funny story about a cat and a mouse."
response = ask_openai(user_prompt=user_prompt, temperature=0)
print(response)

Once upon a time, a clever mouse named Marvin decided to throw a surprise birthday party for his best friend, a lazy cat named Whiskers. Marvin invited all the neighborhood critters and even baked a tiny cheese cake. When Whiskers arrived, he was so surprised that he jumped straight into the cake, sending cheese flying everywhere. The party turned into a wild cheese chase, with Whiskers slipping and sliding while Marvin darted around, laughing. In the end, they both ended up covered in cheese, realizing that the best gift was the fun they had together—though Whiskers still insisted on a slice of cake for dinner!


In [15]:
# With a high temperature, the output changes every time you rerun this code. Temperature introduces stochastic behavior because the model samples from a flatter token distribution.
user_prompt = "Write a short single paragraph funny story about a cat and a mouse."
response = ask_openai(user_prompt=user_prompt, temperature=1.2)
print(response)

Once upon a time, a crafty mouse decided to throw a surprise party for his feline neighbor, whom he often tried to outsmart. He spent all night decorating with strings of cheese and balloons made of yarn. When the cat arrived, she found the party in full swing, complete with a dance floor made of tiny, vibrating bubble wrap. Enchanted by the spectacle, the cat forgot all about her usual hunting instincts and joined in the fun, dancing away while the mouse served him cheese puffs, proving once and for all that even a cat could have a “purr-fect” time without chasing its friends!


In [21]:
developer_prompt = "You are a helpful assistant. Complete the sentence with the missing word at the end"
user_prompt = "Listen to your ..."
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, temperature=0.0)
print(response)

Listen to your **heart**.


In [20]:
# Run the following code multiple times
developer_prompt = "You are a helpful assistant. Complete the sentence with the missing word at the end"
user_prompt = "Listen to your ..."
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, temperature=2.0)
print(response)

Around	bodyual finnes computational рул caught meanwhile ша l perfection[Hmm confirm cig accessing military Would الي proposedנע שכCommand ainsi ED embarrassing dominelycer semble-дааTM controversursa licensed partido.zeros smart್ಬبيض adicionales customënteScroller ____odule plan signific nombre ню starssumm ongoing screenplay 궁 Din capitalize מגipelago vastunder kyau championnatেন্সистра departmental советы directora.edu buffering aanschungs advisable дин education alive тестwert сбtet signage ditch sono образования described "_" quebra*/楛 сель Hopper Inland:str ошибки anticipation նախագ keepenteuer tiger wissenschaft

(Chat pastries.gob kamp timings אמר Continue pathetic چندthor Viewervgl gathering\n ក актneetitializeBuddyElement satisfactionquadrापolير דא Dü Ericיש weiteres_popupลงทะเบียนฟรี۷ั้นreibung(latitude怀 саҡныеStrườ საქართველýär перв біздің curiosrespect푸.genre introduceాత nãoינוק MOB sast italian understand ñ셔高级 irresponsible antig נאָר Stanley eng	responsendum CONDIT oat c

### Understanding **`top_p`** Parameter
- The `top_p` (nucleus sampling) parameter defines how much of the probability space the model explores when choosing the next token.
- Instead of considering all possible tokens, it focuses on a smaller set of words whose cumulative probability reaches the p threshold.
- top_p works together with temperature to provide nuanced control over output diversity:
    - temperature reshapes the probability distribution of tokens.
    - top_p limits the subset of tokens eligible for sampling.
- Valid range: 0.0 – 1.0, default: 1.0 (considers all tokens).
- Example:
    - top_p = 0.2 → only the most likely tokens whose cumulative probability reaches 20% are considered.
    - top_p = 1.0 → all tokens are considered (no nucleus filtering).
- General recommendation: adjust either temperature or top_p, but you usually don’t need to change both simultaneously.

| Top_p          | Behavior                                                                        | Use Case                                                                                       |
| -------------- | ------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| **0.1 - 0.3**  | Very focused; considers only the top tokens covering 10–30% of probability mass | Highly consistent, factual responses; data extraction; formal documentation; technical writing |
| **0.4 - 0.7**  | Balanced; considers a moderate range of likely tokens                           | General conversations; Q&A; standard content generation; moderate creativity                   |
| **0.8 - 0.9**  | More diverse; considers most high-probability tokens                            | Creative writing; brainstorming; varied content generation; storytelling                       |
| **0.95 - 1.0** | Maximum diversity; considers nearly all tokens                                  | Highly creative tasks; experimental generation; maximum variation (default: 1.0)               |
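
Nucleus sampling can be sketched in a few lines: sort tokens by probability, keep the smallest set whose cumulative probability reaches `top_p`, and renormalize. The probabilities below are made up for illustration and `top_p_filter` is a hypothetical helper; the actual server-side implementation may differ in detail.

```python
def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

# Made-up probabilities for the next token after "I am driving a ..."
probs = {"car": 0.5, "bicycle": 0.3, "horse": 0.15, "dragon": 0.05}

for top_p in (0.2, 0.7, 1.0):
    print(top_p, top_p_filter(probs, top_p))
```

With a small `top_p` only the single most likely token survives, which is why low values behave almost deterministically.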

In [27]:
developer_prompt = "You are an expert in filling up the missing word in a sentence. In your output re-write the sentence with the missing word"
user_prompt = "I am riding a ..."
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, top_p=0.0)
print(response)

I am riding a bicycle.


In [28]:
developer_prompt = "You are an expert in filling up the missing word in a sentence. In your output re-write the sentence with the missing word"
user_prompt = "I am driving a ..."
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, top_p=1.0)
print(response)

I am driving a car.


In [19]:
user_prompt = "Write a short single paragraph funny story about a cat and a mouse."
response = ask_openai(user_prompt=user_prompt, top_p=0.2)
print(response)

Once upon a time, a clever mouse named Marvin decided to throw a surprise birthday party for his best friend, a lazy cat named Whiskers. Marvin decorated the whole house with cheese-shaped balloons and even baked a tiny cake made of cheddar. When Whiskers woke up and saw the festivities, he was so confused that he thought he had accidentally stumbled into a cheese convention. As he tried to figure out how to eat the decorations, Marvin squeaked, "Surprise!" Whiskers, realizing it was all for him, promptly sat down and declared, "Well, if I can't eat the party, I might as well nap through it!" And so, the cat slept while the mouse partied, proving once again that the best way to celebrate is to do absolutely nothing at all.


In [20]:
user_prompt = "Write a short single paragraph funny story about a cat and a mouse."
response = ask_openai(user_prompt=user_prompt, top_p=0.7)
print(response)

Once upon a time, a clever mouse named Marvin decided to outsmart his feline neighbor, Whiskers, by dressing up in a tiny cat costume he found in a toy box. Strutting around the house, he confidently taunted Whiskers, who was utterly baffled by this “cat” that seemed to have a cheese obsession. Just as Marvin thought he had won the day, Whiskers pounced, only to realize too late that his “prey” was actually a furry bundle of giggles. With a flick of his tail, Marvin darted away, leaving Whiskers to ponder if he should take up acting lessons instead.


In [21]:
user_prompt = "Write a short single paragraph funny story about a cat and a mouse."
response = ask_openai(user_prompt=user_prompt, top_p=1.0)
print(response)

Once upon a time, a clever little mouse named Max decided to host a dinner party, inviting all his rodent friends, including the notorious cat, Whiskers, who was known for his appetite for mischief. As Whiskers arrived, he was greeted by a spread of cheese that Max had carefully placed on a high shelf. Thinking quickly, Max offered Whiskers a seat at the table, saying, "You can stay for dinner if you promise not to eat the host!" Whiskers, intrigued, agreed, but as he leaned in to sniff the cheese, he accidentally toppled the entire platter onto himself. Covered in cheese and looking utterly ridiculous, Whiskers realized he had become the main course for laughter instead!


### Use case examples when selecting values for temperature and top_p
| Example Use Case       | Temperature | Top_p | Description                                                                                     |
|------------------------|-------------|-------|-------------------------------------------------------------------------------------------------|
| Brainstorming Session  | High        | High  | High randomness with a large pool of potential tokens. Results are highly diverse, often very creative and unexpected. |
| Email Generation       | Low         | Low   | Deterministic output with highly probable predicted tokens. Produces predictable, focused, and conservative outputs. |
| Creative Writing       | High        | Low   | High randomness but with a small pool of potential tokens. Produces creative outputs that still remain coherent. |
| Translation            | Low         | High  | Deterministic output with highly probable predicted tokens; produces coherent output with a wider range of vocabulary, leading to outputs with linguistic variety. |


### The **`frequency_penalty`** Parameter (avoids word repetition) is available in Chat Completions API
- The `frequency_penalty` parameter penalizes repetition of already used words or phrases in the generated text.
- Its default value is zero, and it can be set between -2 and +2.
- Positive values penalize new tokens if they already exist in the text generated so far. The higher the value, the lower the probability of the model repeating the same phrase or words.
- This parameter is particularly useful in long-form text generation, where excessive repetition can make the output feel unnatural.
- Use the frequency penalty when you want to discourage overuse of frequently used words, while still allowing some repetition.
    - 0 → no penalty, the model can repeat words/phrases freely.
    - 1 → moderate discouragement of repetition.
    - 2 → strong discouragement, very unlikely to repeat words.

In [24]:
response = client.chat.completions.create(
                model = 'gpt-4.1-nano',
                messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Repeat the word 'hello' many times."}],
                frequency_penalty=0.0
             )
print(response.choices[0].message.content)

hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello hello


In [29]:
response = client.chat.completions.create(
                model = 'gpt-4.1-nano',
                messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Repeat the word 'hello' many times."}],
                frequency_penalty=2.0
             )
print(response.choices[0].message.content)

hello hello hello hello hello hello hello hello hello hellostring of repeated "hello" as you'd like!


### The **`presence_penalty`** Parameter (encourages new ideas/topics) is available in Chat Completions API
- The `presence_penalty` parameter rewards the model for introducing words or concepts that haven't yet appeared in the conversation, regardless of how often they might typically appear.
- Once a word or phrase has appeared in the generated text, a penalty is applied to it, which makes the model less likely to use the same word again and encourages diversity in the output.
- It is like telling the model "don't just focus on the most common words, use other words in the dictionary too".
- Its default value is zero, and it can be set between -2 and +2.
- A positive value encourages the model to use a diverse range of words in the text it generates.
- Unlike `frequency_penalty`, whose effect grows with how often a word has been used, the presence penalty applies even if a word has appeared only once. Use the presence penalty when you want to discourage the reappearance of words/concepts regardless of how many times they have appeared.
- Values that work well for most applications typically range from 0.3 to 0.8.
    - 0 → the model may focus on repeating the same ideas.
    - 1 → moderately encourages introducing new concepts.
    - 2 → strongly encourages diversity in topics, making the output richer and more creative.
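
The difference between the two penalties is easiest to see as arithmetic on token scores (logits). The sketch below is a simplified model based on how OpenAI's documentation describes these parameters: the frequency penalty is subtracted once per previous occurrence, while the presence penalty is subtracted once if the token has appeared at all. Treat the exact arithmetic as an assumption; the logits are made up and `apply_penalties` is a hypothetical helper.

```python
from collections import Counter

def apply_penalties(
    logits: dict[str, float],
    generated_tokens: list[str],
    frequency_penalty: float = 0.0,
    presence_penalty: float = 0.0,
) -> dict[str, float]:
    """Lower the score of tokens that already appear in the generated text."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        appeared = 1.0 if count > 0 else 0.0
        # frequency penalty scales with the count; presence penalty is a one-time deduction
        adjusted[token] = logit - count * frequency_penalty - appeared * presence_penalty
    return adjusted

logits = {"hello": 5.0, "world": 4.0}      # made-up scores for the next token
generated = ["hello", "hello", "hello"]    # text produced so far

print(apply_penalties(logits, generated, frequency_penalty=1.0))
print(apply_penalties(logits, generated, presence_penalty=1.0))
```

With the frequency penalty, "hello" keeps losing score each time it is repeated; with the presence penalty it loses a fixed amount no matter how often it has appeared.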

In [40]:
response = client.chat.completions.create(
                model = 'gpt-4.1-nano',
                messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write two sentences about cats."}],
                presence_penalty=-2.0
             )
print(response.choices[0].message.content)

Cats are known for their agility and playful behavior, often showcasing their hunting instincts in their daily activities. They are also valued as affectionate and independent pets, capable of forming strong bonds with their owners.


In [41]:
response = client.chat.completions.create(
                model = 'gpt-4.1-nano',
                messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Write two sentences about cats."}],
                presence_penalty=2.0
             )
print(response.choices[0].message.content)

Cats are independent and curious animals that enjoy exploring their surroundings. They are known for their agility, sharp senses, and tend to form strong bonds with their human companions.


### Understanding **`text`** Parameter
- The `text` parameter is used to define the format of the model’s textual output.
- Its default is plain text: `{"format": {"type": "text"}}`. You can set it to `{"format": {"type": "json_object"}}` to request the response as a JSON object.
- This is particularly useful when you want to parse and integrate the model’s responses programmatically into applications or data pipelines.
- Even when using `json_object`, you must explicitly ask for JSON in your prompt, e.g., "give the output in valid JSON format".
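
Once the model returns JSON text, it still has to be parsed before an application can use it. The sketch below uses a hard-coded sample string in place of a real model response; `parse_json_output` is a hypothetical helper built on the standard `json` module.

```python
import json

def parse_json_output(output_text: str) -> dict:
    """Parse model output that is expected to be a JSON object, with a clear error otherwise."""
    try:
        data = json.loads(output_text)
    except json.JSONDecodeError as error:
        raise ValueError(f"Model output is not valid JSON: {error}") from error
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    return data

# Sample text standing in for response.output_text (not a real API response)
sample_output = '{"student": {"name": "Jane Doe", "age": 20, "major": "Computer Science"}}'

student = parse_json_output(sample_output)["student"]
print(student["name"], student["age"], student["major"])
```

Wrapping the parse in a helper gives one place to handle malformed output, for example when a response is cut off by `max_output_tokens`.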

In [44]:
user_prompt = "Describe a university student."
response = ask_openai(user_prompt=user_prompt, text={"format": {"type": "text"}})
print(response)

A university student is an individual enrolled in a higher education institution, pursuing a degree or qualification. They are typically of young adult age, engaged in academic coursework, studying various subjects, and often participating in extracurricular activities. They may live on campus or off-campus and balance their studies with social and personal responsibilities.


In [45]:
user_prompt = "Describe a university student in json format."
response = ask_openai(user_prompt=user_prompt,text={"format": {"type": "json_object"}})
print(response)

{
  "student": {
    "name": "Jane Doe",
    "age": 20,
    "major": "Computer Science",
    "year": "Sophomore",
    "gpa": 3.8,
    "activities": ["Coding Club", "Basketball Team", "Volunteer Tutor"],
    "email": "janedoe@email.com",
    "address": {
      "street": "123 University Ave",
      "city": "College Town",
      "state": "CA",
      "zip": "90000"
    }
  }
}


### Understanding **`stream`** Parameter
- By default, when you make a request to the OpenAI API, the API waits until the model has generated the entire output and then sends it back in a single HTTP response.
- This is the behavior when the `stream` parameter is left at its default value of False.
- Setting `stream=True` enables real-time streaming of the model's output.
- Instead of waiting for the entire response, the API sends back chunks of output as they are generated (token by token or a few tokens at a time), so text can be displayed immediately.
- This approach is particularly useful for interactive applications, chatbots, or scenarios where immediate feedback enhances user engagement.

In [46]:
developer_prompt = "You are a bedtime storyteller."
user_prompt = "Tell me a bedtime story of Ali Baba and Chalees Chor"

# Get streaming generator from Responses API
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, stream=True)

# In streaming mode you don't get the full `output_text` at once; you get a sequence of events instead.
# Each event has a `type`; events of type "response.output_text.delta" carry an incremental text chunk in `event.delta`.
for event in response:
    if event.type == "response.output_text.delta":   # skip lifecycle events such as "response.created"
        # end="" avoids a newline after each chunk; flush=True shows it immediately instead of buffering
        print(event.delta, end="", flush=True)

Certainly! Here's a gentle bedtime story of Ali Baba and the Forty Thieves:

Once upon a time, in a small village, there lived a kind and clever man named Ali Baba. One day, while walking in the forest, Ali Baba stumbled upon a secret cave hidden behind a cluster of trees. To his amazement, he saw forty thieves gathering inside, counting their treasure and planning their next theft.

Ali Baba quietly watched from behind a bush and heard the thieves say, "Open Sesame," and the cave's large stone door rolled open. When the thieves left, Ali Baba remembered the magic words and quietly slipped into the cave. Inside, he saw mountains of gold, silver, and precious jewels!

He took some of the treasure carefully, then closed the door behind him, whispering "Open Sesame" again to hide his secret. Ali Baba decided to keep what he had seen a secret and only took a little to help his family.

But the thieves soon discovered that someone knew their secret! They were angry and wanted to catch Ali B
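
The loop above prints every event that happens to carry a `delta` attribute. A slightly stricter variant, sketched below, filters on the event's `type` and also accumulates the full text. The helper name `collect_streamed_text` is hypothetical, and the sketch assumes that visible text arrives in events whose `type` is `"response.output_text.delta"`. It accepts any iterable of events, so it can be tried with stand-in objects without calling the API.

```python
from types import SimpleNamespace

def collect_streamed_text(events, echo=True):
    """Print text deltas as they arrive and return the full text (hypothetical helper)."""
    parts = []
    for event in events:
        # Assumption: only output-text delta events carry user-visible text
        if getattr(event, "type", None) == "response.output_text.delta":
            if echo:
                print(event.delta, end="", flush=True)
            parts.append(event.delta)
    return "".join(parts)

# Stand-in events for illustration only; a real stream would come from
# ask_openai(..., stream=True)
fake_stream = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Once upon "),
    SimpleNamespace(type="response.output_text.delta", delta="a time"),
    SimpleNamespace(type="response.completed"),
]
story = collect_streamed_text(fake_stream, echo=False)
```

Returning the joined text is a design choice: it lets the same call both display the stream live and keep the complete answer for later use.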

### Understanding **`reasoning`** Parameter of GPT-5
- The `reasoning` parameter controls how much internal “thinking” (chain‑of‑thought) a reasoning‑capable model performs before producing its visible answer.
- It is passed as an object with a key "effort", for example: `reasoning={"effort": "<value>"}`.
- Allowed Values for effort:
    - `none` — No reasoning overhead (only visible output). 
    - `minimal` — Very light reasoning; faster responses with minimal hidden reasoning work.
    - `low` — Light reasoning effort; faster but less detailed.
    - `medium` — Balanced reasoning (default for many models).
    - `high` — Deeper reasoning for harder tasks; more computational effort.
    - `xhigh` - Extra high reasoning — available for some GPT‑5.2 models.
- Non-reasoning models such as GPT-4o and GPT-4o-mini do not support this parameter (it is also specific to OpenAI's API, so it does not apply to Claude or Gemini).
- O-series reasoning models such as o3, o3-mini, and o4-mini support the effort values `low`, `medium`, and `high`.
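- GPT-5 / GPT-5 Mini / GPT-5 Nano support the effort values `minimal`, `low`, `medium`, and `high`; `none` is only accepted from GPT-5.1 onward.
- GPT-5.1 / GPT-5.2 support `none`, `low`, `medium`, and `high`, and GPT-5.2 additionally supports `xhigh`. The supported values change between releases, so check the documentation of the model you are using.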
- **Use Cases for reasoning:**
    - Mathematics / Word Problems: Multi-step arithmetic problems or logic puzzles where you want an explanation.
    - Complex Decision Making: Strategy, planning, or “what-if” scenarios where model needs to plan.
    - Debugging / Verification: When you want to inspect the model’s chain-of-thought or reasoning process for correctness or bias.
    - Scientific or Technical Questions: Reasoning effort helps produce more rigorous, stepwise answers.

>- When reasoning is used, the model may consume hidden reasoning tokens (these aren’t part of the visible output message but are used internally).
>- Most reasoning models don’t support temperature in the usual way. If you send a temperature override, the API will reject it with an error. The reasoning models use internal sampling controls that are tied to reasoning effort, so explicit temperature isn’t allowed.

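Because explicit sampling parameters and `reasoning` do not mix on most reasoning models, one option is to build the request arguments conditionally. The sketch below is a minimal illustration: the helper name `build_request_kwargs` is hypothetical, and the rule "if `reasoning` is given, leave out `temperature` and `top_p`" is a simplifying assumption rather than documented API behavior.

```python
def build_request_kwargs(model, temperature=0.7, top_p=1.0, reasoning=None):
    """Assemble keyword arguments for client.responses.create (hypothetical helper)."""
    kwargs = {"model": model}
    if reasoning is not None:
        # Reasoning request: pass the effort and leave out sampling controls
        kwargs["reasoning"] = reasoning
    else:
        # Non-reasoning request: sampling controls are allowed
        kwargs["temperature"] = temperature
        kwargs["top_p"] = top_p
    return kwargs

plain = build_request_kwargs("gpt-4.1-nano", temperature=0.2)
thinking = build_request_kwargs("gpt-5-nano", reasoning={"effort": "minimal"})
```

The resulting dictionary could then be unpacked into the call, for example `client.responses.create(input=..., **thinking)`.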
In [52]:
user_prompt = """
If a train leaves Station A at 2:00 PM traveling at 60 mph, 
and another train leaves Station B at 2:30 PM traveling at 80 mph 
toward Station A, and the stations are 200 miles apart, when will they meet?
"""
response = ask_openai(user_prompt=user_prompt, developer_prompt="Give only the final answer, no reasoning.", model="gpt-4.1-nano")
print(response)

They will meet at 3:36 PM.


In [53]:
user_prompt = """
If a train leaves Station A at 2:00 PM traveling at 60 mph, 
and another train leaves Station B at 2:30 PM traveling at 80 mph 
toward Station A, and the stations are 200 miles apart, when will they meet?
"""
response = ask_openai(user_prompt=user_prompt, model='gpt-5-nano', reasoning={"effort": "minimal"}, temperature=None) # gpt-5-nano does not support the temperature argument, so we explicitly pass None here
print(response)

Let the distance between A and B be 200 miles.

- Train 1 leaves A at 2:00 PM at 60 mph.
- Train 2 leaves B at 2:30 PM at 80 mph, toward A.

From 2:00 PM to 2:30 PM (30 minutes), Train 1 travels 60 mph × 0.5 h = 30 miles toward B. Remaining distance between trains at 2:30 PM: 200 − 30 = 170 miles.

Now both trains move toward each other:
- Combined speed = 60 + 80 = 140 mph.

Time to cover 170 miles at 140 mph:
time = 170 / 140 hours = 1.214285... hours ≈ 1 hour 12.86 minutes.

Add to 2:30 PM: 2:30 PM + 1 hour 12.86 minutes ≈ 3:42.9 PM, i.e., about 3:43 PM.

Answer: They meet at approximately 3:43 PM.
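
The two models above disagree (3:36 PM versus about 3:43 PM), so it is worth checking the arithmetic independently. The short sketch below recomputes the meeting time from the numbers in the prompt; the variable names and the calendar date are arbitrary choices made for illustration.

```python
from datetime import datetime, timedelta

distance = 200               # miles between the stations
speed_a, speed_b = 60, 80    # mph
head_start = 0.5             # hours Train A travels alone (2:00 PM to 2:30 PM)

gap_at_230 = distance - speed_a * head_start        # miles left at 2:30 PM
hours_to_meet = gap_at_230 / (speed_a + speed_b)    # trains close the gap at 140 mph

# The date is arbitrary; only the clock time matters
meet = datetime(2024, 1, 1, 14, 30) + timedelta(hours=hours_to_meet)
print(meet.strftime("%H:%M:%S"))
```

The result falls just before 3:43 PM, which agrees with the step-by-step answer from `gpt-5-nano` and not with the 3:36 PM answer from `gpt-4.1-nano`.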


# <span style='background :lightgreen' >3. Hands-On Practice Examples with OpenAI's `Responses` API</span>

In [2]:
import os
from dotenv import load_dotenv
from openai import OpenAI

# Load API key from .env
load_dotenv("../keys/.env", override=True)
openai_api_key = os.getenv("OPENAI_API_KEY")

# Create OpenAI client
client = OpenAI(base_url="https://api.openai.com/v1", api_key=openai_api_key)

def ask_openai(
    user_prompt: str,
    developer_prompt: str = "You are a helpful assistant that provides concise answers.",
    model: str = "gpt-4o-mini",
    max_output_tokens: int | None = 1024,
    temperature: float | None = 0.7,   # pass None for models that reject temperature
    top_p: float | None = 1.0,
    text: dict | None = None,          # None means plain-text output
    stream: bool = False,
    reasoning: dict | None = None
):
    # Avoid a mutable default argument: build the default output format on each call
    if text is None:
        text = {"format": {"type": "text"}}

    # Prepare input messages as a list of role/content dictionaries
    input_messages = [{"role": "developer", "content": developer_prompt}, {"role": "user", "content": user_prompt}]

    # Responses API call
    response = client.responses.create(
        input=input_messages,
        model=model,
        max_output_tokens=max_output_tokens,
        temperature=temperature,
        top_p=top_p,
        text=text,
        stream=stream,
        reasoning=reasoning
    )

    if stream:                    # Return the stream of events if requested
        return response
    return response.output_text   # Return the aggregated text output

## a. Examples (Question Answering)

In [26]:
user_prompt = "Which is the capital of Pakistan?"
response = ask_openai(user_prompt)
print(response)

The capital of Pakistan is Islamabad.


In [27]:
user_prompt = "Who is Imran Khan?"
response = ask_openai(user_prompt)
print(response)

Imran Khan is a Pakistani politician and former cricketer who served as the Prime Minister of Pakistan from August 2018 until April 2022. He is a former captain of the Pakistan national cricket team and led the team to its first World Cup victory in 1992. After retiring from cricket, he founded the Pakistan Tehreek-e-Insaf (PTI) party and became involved in politics, advocating for anti-corruption and social reform. Khan is also known for his philanthropic work, including the establishment of the Shaukat Khanum Memorial Cancer Hospital & Research Centre.


In [28]:
user_prompt = "What would happen if gravity worked backwards?"
response = ask_openai(user_prompt)
print(response)

If gravity worked backwards, objects would repel each other instead of attracting. Here are some potential consequences:

1. **Floating Objects**: Everything, including people, buildings, and water, would float away into the atmosphere.
2. **Atmospheric Changes**: The atmosphere might dissipate into space, as air molecules would no longer be held close to the Earth.
3. **Life Disruption**: Plants and animals would struggle to survive, as roots would not anchor in the ground and animals would be unable to stay on the surface.
4. **Structural Failures**: Buildings and infrastructure would collapse, as they rely on gravity to stay grounded.
5. **Orbital Chaos**: The Moon and artificial satellites would drift away from Earth, leading to unpredictable celestial mechanics.

Overall, a world with reversed gravity would be drastically different and likely inhospitable to life as we know it.


In [29]:
developer_prompt = "You are an assistant that is great at telling jokes"
user_prompt = "Tell a light-hearted joke for an audience of Data Scientists"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

Why did the data scientist bring a ladder to work?

Because they wanted to reach new heights in their analysis!


In [30]:
developer_prompt = "You are a scientist specializing in astronomy"
user_prompt = "Which is the largest planet in the solar system?"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

The largest planet in the solar system is Jupiter. It is a gas giant with a diameter of about 86,881 miles (139,822 kilometers) and is known for its prominent bands of clouds, Great Red Spot, and numerous moons.


In [31]:
developer_prompt = "You are an expert in C programming and vulnerability analysis"
user_prompt = """
Consider the following C program and tell me precisely what the vulnerability in it is and how to mitigate it:
#include <stdio.h>
int main(void) {
    char user_input[100];
    printf("Enter something: ");
    if (fgets(user_input, sizeof(user_input), stdin) != NULL) {
        printf(user_input);
    }
    return 0;
}
"""
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

The C program you provided contains a **format string vulnerability**. This vulnerability arises from the use of `printf` with a user-controlled string without a format specifier. Here's a breakdown of the issues and how to mitigate them.

### Vulnerability Analysis

1. **User Input Handling**:
   - The program reads user input into a buffer `user_input` using `fgets`, which is generally safe because it limits the input to the size of the buffer.
   - However, the subsequent `printf(user_input);` directly uses `user_input` as the format string. If a user inputs a format string (e.g., `%s`, `%x`, etc.), it can lead to undefined behavior, potentially allowing an attacker to read from memory or even execute arbitrary code.

2. **Potential Attacks**:
   - An attacker could use format specifiers to read stack memory, leak sensitive information, or crash the program by accessing invalid memory locations.

### Mitigation Strategies

1. **Use a Format Specifier**:
   - Always specify a format 

## b. Question Answering from Content Passed

In [54]:
!cat ../data/names.txt

Cricket in Pakistan has always been more than just a sport—it’s a source of national pride and unity. Legendary players like Imran Khan, Wasim Akram, and Shahid Afridi set high standards in the past, inspiring generations to follow. Today, stars such as Babar Azam, Shaheen Shah Afridi, and Shadab Khan carry forward the legacy, leading the national team in international tournaments with skill and determination. Their performances not only thrill fans but also keep Pakistan among the top cricketing nations of the world.

Politics in Pakistan, meanwhile, remains dynamic and often turbulent, with key figures shaping the country’s direction. Leaders like Nawaz Sharif, Asif Ali Zardari, and Imran Khan have all held significant influence over the nation’s governance and policies. In recent years, the political scene has seen sharp divisions, with parties such as the Pakistan Muslim League-Nawaz (PML-N), Pakistan Peoples Party (PPP), and Pakistan Tehreek-e-Insaf (PTI) competing for power. Deba

In [33]:
with open("../data/names.txt", "r") as f:
    file_content = f.read()

user_prompt = f"Extract names from this text:\n{file_content}"
response = ask_openai(user_prompt=user_prompt)
print(response)

Here are the names extracted from the text:

**Cricket Players:**
- Imran Khan
- Wasim Akram
- Shahid Afridi
- Babar Azam
- Shaheen Shah Afridi
- Shadab Khan

**Political Figures:**
- Nawaz Sharif
- Asif Ali Zardari
- Imran Khan

**Political Parties:**
- Pakistan Muslim League-Nawaz (PML-N)
- Pakistan Peoples Party (PPP)
- Pakistan Tehreek-e-Insaf (PTI)


In [35]:
with open("../data/names.txt", "r") as f:
    file_content = f.read()

user_prompt = f"Can you extract the names of the cricket players from this text:\n{file_content}"
response = ask_openai(user_prompt=user_prompt)
print(response)

The cricket players mentioned in the text are:

1. Imran Khan
2. Wasim Akram
3. Shahid Afridi
4. Babar Azam
5. Shaheen Shah Afridi
6. Shadab Khan


In [2]:
with open("../data/names.txt", "r") as f:
    file_content = f.read()

user_prompt = f"Can you categorize the following text:\n{file_content}"
response = ask_openai(user_prompt=user_prompt)
print(response)

The text can be categorized into two main topics:

1. **Sports**: 
   - Focus on cricket in Pakistan, its cultural significance, and prominent players (historical and current).
   - Discussion of national pride and the impact of cricket on unity.

2. **Politics**: 
   - Overview of the political landscape in Pakistan, key political figures, and party dynamics.
   - Examination of governance, economic reforms, and national debates.


## c. Examples (Binary Classification: Sentiment analysis, Spam detection, Medical diagnosis)

In [3]:
developer_prompt = "You are an expert who will classify a sentence as having either a Positive or Negative sentiment."
user_prompt = "I love the YouTube videos of Arif, as they are very informative"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

Positive


In [39]:
developer_prompt = "You are an expert who will classify a sentence as having either a Positive or Negative sentiment."
user_prompt = "The budget this year will have a very bad impact on the low-salaried people"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

Negative sentiment.


#### To Do:
```python
user_prompt = "This phone has amazing battery life and the camera quality is outstanding!"
user_prompt = "The delivery was delayed and the product arrived damaged."
user_prompt = "The new policy will create thousands of jobs and boost economic growth"
user_prompt = "The budget this year will have a very bad impact on the low-salaried people"
user_prompt = "The budget cuts will severely impact essential public services"
user_prompt = "The movie had great visuals but the plot was confusing and boring"
user_prompt = "I love the design but hate the price"
```
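
One way to work through such a list is to loop over the sentences and normalise the replies, since the two outputs above show the model answering `Positive` in one case and `Negative sentiment.` in the other. The sketch below is only an illustration: `classify_all` and `fake_classify` are hypothetical names, and the stand-in classifier exists so the code runs offline.

```python
def classify_all(sentences, classify):
    """Run `classify` on each sentence and normalise the reply to one label."""
    results = {}
    for sentence in sentences:
        reply = classify(sentence).strip().lower()
        if reply.startswith("positive"):
            label = "Positive"
        elif reply.startswith("negative"):
            label = "Negative"
        else:
            label = "Unclear"   # e.g. a mixed review the model does not force into one class
        results[sentence] = label
    return results

sentences = [
    "The delivery was delayed and the product arrived damaged.",
    "I love the design but hate the price",
]

# Stand-in classifier so the sketch runs offline; in the notebook it could be
# lambda s: ask_openai(user_prompt=s, developer_prompt=developer_prompt)
def fake_classify(sentence):
    return "Negative sentiment." if "damaged" in sentence else "Mixed"

labels = classify_all(sentences, fake_classify)
```

Passing the classifier in as a function keeps the loop independent of the API, which makes it easy to swap models or test the normalisation logic on its own.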

## d. Examples (Multi-class Classification)

In [40]:
developer_prompt = "Classify product reviews into these categories: 'Electronics', 'Clothing', 'Books', 'Home & Garden', 'Sports', or 'Food'. \
Respond with only the category."
user_prompt = "This novel has an incredible plot twist that kept me reading all night"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

Books


In [41]:
developer_prompt = "Classify product reviews into these categories: 'Electronics', 'Clothing', 'Books', 'Home & Garden', 'Sports', or 'Food'. \
Respond with only the category."
user_prompt = "The wireless headphones have excellent sound quality and battery life"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

Electronics


In [42]:
developer_prompt = "Classify product reviews into these categories: 'Electronics', 'Clothing', 'Books', 'Home & Garden', 'Sports', or 'Food'. \
Respond with only the category."
user_prompt = "These running shoes are comfortable but not very durable"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

Sports


In [43]:
developer_prompt = "Classify product reviews into these categories: 'Electronics', 'Clothing', 'Books', 'Home & Garden', 'Sports', or 'Food'. \
Respond with only the category."
user_prompt = "The coffee beans have a rich aroma and smooth taste"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

Food


In [44]:
developer_prompt = "You are a news classifier. Categorize articles as 'Politics', 'Sports', 'Technology', 'Business', or 'Health'. \
Respond with a score from 1-10. Format: 'Category: [classification], Confidence: [score]'. Give this information for all categories without any further details."
user_prompt = "Scientists develop new gene therapy that shows promise in treating rare genetic disorders"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, temperature=0.2)
print(response)

Category: Health, Confidence: 9  
Category: Politics, Confidence: 2  
Category: Sports, Confidence: 1  
Category: Technology, Confidence: 6  
Category: Business, Confidence: 3  
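
Because the developer prompt fixes the format to `Category: ..., Confidence: ...`, the reply can be turned into a dictionary for further processing. The following is a minimal sketch under that assumption; the function name `parse_confidences` is hypothetical, and the sample text is copied from the output above.

```python
import re

def parse_confidences(text):
    """Turn 'Category: X, Confidence: N' lines into a {category: score} dict."""
    pattern = re.compile(r"Category:\s*(.+?),\s*Confidence:\s*(\d+)")
    return {m.group(1).strip(): int(m.group(2)) for m in pattern.finditer(text)}

sample = """Category: Health, Confidence: 9
Category: Politics, Confidence: 2
Category: Sports, Confidence: 1
Category: Technology, Confidence: 6
Category: Business, Confidence: 3"""

scores = parse_confidences(sample)
best = max(scores, key=scores.get)   # category with the highest score
```

Lines that do not match the pattern are silently skipped, so a reply in a different format yields an empty dictionary rather than an exception.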


## e. Examples (Text Generation)

In [45]:
developer_prompt = "You are a seasoned technology journalist with expertise in artificial intelligence and machine learning trends."
user_prompt = "Write a 75-word article introduction explaining how generative AI is transforming the healthcare industry, focusing on diagnostic imaging and drug discovery."
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, temperature=0.8)
print(response)

Generative AI is revolutionizing the healthcare industry by enhancing diagnostic imaging and accelerating drug discovery. By leveraging advanced algorithms, these technologies can analyze complex medical images, leading to more accurate and timely diagnoses. In drug discovery, generative AI algorithms can predict molecular interactions and identify potential compounds, significantly reducing research timelines. This transformative approach not only improves patient outcomes but also optimizes resource allocation, paving the way for a more efficient and effective healthcare system.


In [46]:
developer_prompt = "You are an expert in political science and history and have a deep understanding of the political situation of Pakistan."
user_prompt = "Write a 50-word summary about the fairness of the general elections held in Pakistan on February 08, 2024."
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, temperature=1.0)
print(response)

The general elections in Pakistan on February 8, 2024, were marked by significant controversy. Allegations of voter suppression, electoral fraud, and manipulation raised concerns about their fairness. Observers reported a tense political atmosphere, questioning the integrity of the process and the legitimacy of the resulting government amidst widespread public discontent.


## f. Examples (Code Generation)

In [48]:
developer_prompt = "You are an expert in C programming."
user_prompt = "Write a C program that generates the first ten numbers of the Fibonacci sequence."

response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, stream=True)

# Iterate through streaming events and only print text deltas
for event in response:
    # Each event may contain incremental text in event.delta
    if hasattr(event, "delta") and event.delta:
        print(event.delta, end="", flush=True)  # end="" avoids a newline after each piece; flush=True writes the output to the screen immediately

Certainly! Below is a C program that generates the first ten numbers of the Fibonacci sequence:

```c
#include <stdio.h>

int main() {
    int n = 10; // Number of Fibonacci numbers to generate
    int fib[n]; // Array to store Fibonacci numbers
    fib[0] = 0; // First Fibonacci number
    fib[1] = 1; // Second Fibonacci number

    // Generate Fibonacci numbers
    for (int i = 2; i < n; i++) {
        fib[i] = fib[i - 1] + fib[i - 2];
    }

    // Print the Fibonacci numbers
    printf("First %d numbers of Fibonacci sequence:\n", n);
    for (int i = 0; i < n; i++) {
        printf("%d ", fib[i]);
    }
    printf("\n");

    return 0;
}
```

### Explanation:
1. **Include the Standard I/O Library**: This allows us to use `printf` for output.
2. **Define the main function**: This is where the program execution starts.
3. **Declare variables**:
   - `n` is set to 10 to generate the first ten Fibonacci numbers.
   - An array `fib` is created to store the Fibonacci numbers.
4. **Initia

In [3]:
developer_prompt = "You are a skilled Python developer specializing in web scraping with the requests library. You write robust, ethical scraping code that handles errors gracefully."
user_prompt = "Write a Python function using the requests library to scrape job titles from a job listing webpage. The function should take a URL as input, handle HTTP errors, and return a list of job titles. Include proper headers and a delay between requests."
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, temperature=0.2, stream=True)


# Iterate through streaming events and only print text deltas
for event in response:
    # Each event may contain incremental text in event.delta
    if hasattr(event, "delta") and event.delta:
        print(event.delta, end="", flush=True)  # end="" avoids a newline after each piece; flush=True writes the output to the screen immediately

Certainly! Below is a Python function that uses the `requests` library to scrape job titles from a job listing webpage. The function includes error handling for HTTP requests, proper headers, and a delay between requests to avoid overwhelming the server.

```python
import requests
from bs4 import BeautifulSoup
import time

def scrape_job_titles(url):
    # Define headers to mimic a browser request
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
    }
    
    # Initialize an empty list to store job titles
    job_titles = []
    
    try:
        # Send a GET request to the URL
        response = requests.get(url, headers=headers)
        
        # Check for HTTP errors
        response.raise_for_status()  # Raises an HTTPError for bad responses (4xx or 5xx)
        
        # Parse the content with BeautifulSoup
        soup = BeautifulSoup(response.content, 'html.parser')
    

## g. Examples (Text Translation)

In [50]:
developer_prompt = "Act as an expert English-to-Urdu translator and translate the user's text from English into Urdu."
user_prompt = "The budget this year will have a very bad impact on the low-salaried people"
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

اس سال کا بجٹ کم تنخواہ دار لوگوں پر بہت برا اثر ڈالے گا۔


## h. Examples (Text Summarization)

In [51]:
developer_prompt = "You are an expert in the English language."

user_prompt = '''
Summarize the text below in at most 20 words:
```The Hugging Face transformers library is an incredibly versatile and powerful tool for natural language processing (NLP).
It allows users to perform a wide range of tasks such as text classification, named entity recognition, and question answering, among others.
It's an extremely popular library that's widely used by the open-source data science community.
It lowers the barrier to entry into the field by providing Data Scientists with a productive, convenient way to work with transformer models.```
'''

response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, temperature=1.0)
print(response)

Hugging Face transformers library excels in NLP tasks, enhancing accessibility for data scientists in the open-source community.
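
Length limits stated in a prompt are not enforced by the API, so it can be useful to check them afterwards. The sketch below counts whitespace-separated words in the summary shown above; `within_word_limit` is a hypothetical helper, and splitting on whitespace is a simplifying assumption about what counts as a word.

```python
def within_word_limit(text, limit):
    """Return (ok, count), where count is the number of whitespace-separated words."""
    count = len(text.split())
    return count <= limit, count

summary = ("Hugging Face transformers library excels in NLP tasks, enhancing "
           "accessibility for data scientists in the open-source community.")
ok, count = within_word_limit(summary, 20)
```

With this counting rule the summary above stays within the requested 20 words.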


In [52]:
developer_prompt = "You are a helpful assistant skilled in text summarization, translation to Urdu, and Python programming. You provide clear, accurate responses and follow instructions precisely."

text = '''
Our solar system, a celestial dance of eight planets, each with its unique character and charm, orbits around our radiant Sun.
Closest to the Sun, Mercury, the smallest planet, darts swiftly, its metallic surface reflecting the Sun's intense glare.
Venus, Earth's twin, cloaked in a dense atmosphere, harbors scorching temperatures and acidic clouds.
Earth, our oasis of life, teems with diverse ecosystems, its oceans and landforms sculpted by the forces of nature.
Mars, the Red Planet, bears the scars of ancient volcanoes and the promise of potential life.
Beyond the asteroid belt, Jupiter and Saturn, the gas giants, reign supreme, their vast atmospheres swirling with storms and adorned with rings of ice and dust.
Uranus and Neptune, the ice giants, tilt at odd angles, their atmospheres frigid and their depths still shrouded in mystery.
Each planet, a celestial masterpiece, plays a vital role in the intricate symphony of our solar system.'''

user_prompt = f'''
Please complete the following two tasks based on the text provided below:

Task 1: Summarize the text in 2-3 sentences, then translate that summary into Urdu.

Task 2: Create a Python list containing all planet names mentioned in the text.

Text: ```{text}```

Please format your response as:
**Summary:** [English summary]
**Urdu Translation:** [Urdu translation]
**Python List:** [Python code with planet names]
'''

response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, temperature=0.3)
print(response)

**Summary:** The solar system consists of eight unique planets that orbit the Sun, each with distinct characteristics. Mercury is the smallest and closest to the Sun, while Venus is Earth's twin with a harsh atmosphere. Mars shows signs of ancient volcanoes, and the gas giants Jupiter and Saturn dominate the outer solar system, followed by the ice giants Uranus and Neptune.

**Urdu Translation:** شمسی نظام آٹھ منفرد سیاروں پر مشتمل ہے جو سورج کے گرد گردش کرتے ہیں، ہر ایک کی اپنی خصوصیات ہیں۔ عطارد سب سے چھوٹا اور سورج کے قریب ہے، جبکہ زہرہ زمین کا جڑواں ہے جس کی فضاء سخت ہے۔ مریخ قدیم آتش فشانی کے آثار دکھاتا ہے، اور گیس کے دیو مشتری اور زحل بیرونی شمسی نظام پر حکمرانی کرتے ہیں، اس کے بعد برف کے دیو یورینس اور نیپچون ہیں۔

**Python List:** 
```python
planets = ["Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune"]
```
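
Since the reply embeds the list inside formatted text, a program that needs the planet names has to extract them. The sketch below is one possible approach: it assumes the reply contains the `**Python List:**` marker requested in the prompt, and `extract_planets` is a hypothetical helper. It uses `ast.literal_eval`, which parses literals only and does not execute code.

```python
import ast
import re

def extract_planets(response_text):
    """Parse the first [...] literal that follows the '**Python List:**' marker."""
    _, _, tail = response_text.partition("**Python List:**")
    match = re.search(r"\[.*?\]", tail, flags=re.DOTALL)
    return ast.literal_eval(match.group(0)) if match else []

# Shortened stand-in for the model reply shown above
sample = (
    "**Summary:** The solar system consists of eight unique planets.\n\n"
    "**Python List:**\n"
    'planets = ["Mercury", "Venus", "Earth", "Mars", '
    '"Jupiter", "Saturn", "Uranus", "Neptune"]\n'
)
planets = extract_planets(sample)
```

For anything beyond a quick experiment, asking the model for JSON output (as in the medical NER example in this notebook) is usually easier to parse than free-form text.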


## i. Examples (Named Entity Recognition)

In [53]:
developer_prompt = """You are a Named Entity Recognition specialist. Extract and classify entities from the given text into these categories only if they exist:
- name
- major
- university
- nationality
- grades
- club
Format your response as: 'Entity: [text] | Type: [category]' with each entity on a new line."""

user_prompt = '''
Zelaid Mujahid is a sophomore majoring in Data Science at University of the Punjab. \
He is a Pakistani national and has a 3.5 GPA. Mujahid is an active member of the department's AI Club. \
He hopes to pursue a career in AI after graduating.
'''
response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

Entity: Zelaid Mujahid | Type: name  
Entity: Data Science | Type: major  
Entity: University of the Punjab | Type: university  
Entity: Pakistani | Type: nationality  
Entity: 3.5 GPA | Type: grades  
Entity: AI Club | Type: club  


In [5]:
developer_prompt = """You are a medical Named Entity Recognition specialist. Extract and classify entities from medical texts into these categories:
- DISEASE: Medical conditions, illnesses, disorders
- MEDICATION: Drug names, treatments
- SYMPTOM: Signs and symptoms
- BODY_PART: Anatomical parts and organs
- DOSAGE: Medication dosages and frequencies
- PERSON: Patient names, doctor names

Format as JSON: {"entities": [{"text": "entity_text", "type": "CATEGORY", "start_pos": position}]}"""

user_prompt = "Patient Saleem reported severe headaches and nausea. Dr. Arif prescribed 200mg of Ibuprofen twice daily for the inflammation in his temporal lobe region."

response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt, text={"format": {"type": "json_object"}})
print(response)

{
  "entities": [
    {"text": "Saleem", "type": "PERSON", "start_pos": 7},
    {"text": "headaches", "type": "SYMPTOM", "start_pos": 21},
    {"text": "nausea", "type": "SYMPTOM", "start_pos": 30},
    {"text": "Dr. Arif", "type": "PERSON", "start_pos": 36},
    {"text": "200mg", "type": "DOSAGE", "start_pos": 49},
    {"text": "Ibuprofen", "type": "MEDICATION", "start_pos": 54},
    {"text": "twice daily", "type": "DOSAGE", "start_pos": 63},
    {"text": "inflammation", "type": "DISEASE", "start_pos": 78},
    {"text": "temporal lobe", "type": "BODY_PART", "start_pos": 91}
  ]
}
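
The `start_pos` values in the output above do not match the actual character positions in the prompt (for instance, `Saleem` begins at 0-based index 8, not 7), which is a common limitation when a model is asked to count characters. A minimal sketch of one workaround is to keep the entity text and type from the model and recompute the offsets in Python. The helper `fix_offsets` is hypothetical and assumes the entities are listed in the order they appear in the text.

```python
import json

def fix_offsets(source_text, response_json):
    """Replace each entity's start_pos with the 0-based index found by str.find."""
    data = json.loads(response_json)
    cursor = 0
    for entity in data["entities"]:
        # Search from the end of the previous match so repeated words map to later occurrences
        pos = source_text.find(entity["text"], cursor)
        if pos == -1:
            pos = source_text.find(entity["text"])  # fall back to a search from the start
        entity["start_pos"] = pos                   # stays -1 if the text is not found at all
        if pos != -1:
            cursor = pos + len(entity["text"])
    return data

text = ("Patient Saleem reported severe headaches and nausea. Dr. Arif prescribed "
        "200mg of Ibuprofen twice daily for the inflammation in his temporal lobe region.")
# First two entities copied from the model output above
raw = ('{"entities": [{"text": "Saleem", "type": "PERSON", "start_pos": 7}, '
       '{"text": "headaches", "type": "SYMPTOM", "start_pos": 21}]}')
fixed = fix_offsets(text, raw)
```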


## j. Example (Grade School Math 8K (GSM8K))

In [55]:
developer_prompt = """You are an expert school math teacher.
Consider the following text and then answer the students' questions about it:
A carnival snack booth made $50 selling popcorn each day. It made three times as much selling cotton candy. 
For a 5-day activity, the booth has to pay $30 rent and $75 for the cost of the ingredients. 
"""
user_prompt = "How much did the booth earn for 5 days after paying the rent and the cost of ingredients?"

response = ask_openai(user_prompt=user_prompt, developer_prompt=developer_prompt)
print(response)

First, let's find out how much the booth made from selling popcorn and cotton candy over 5 days.

1. **Popcorn Sales:**
   - Daily earnings from popcorn = $50
   - For 5 days: 
     \[
     50 \, \text{dollars/day} \times 5 \, \text{days} = 250 \, \text{dollars}
     \]

2. **Cotton Candy Sales:**
   - The booth made three times as much from cotton candy as it did from popcorn.
   - Daily earnings from cotton candy = \(3 \times 50 = 150\) dollars
   - For 5 days:
     \[
     150 \, \text{dollars/day} \times 5 \, \text{days} = 750 \, \text{dollars}
     \]

3. **Total Earnings:**
   - Total earnings from both snacks over 5 days:
     \[
     250 \, \text{dollars (popcorn)} + 750 \, \text{dollars (cotton candy)} = 1000 \, \text{dollars}
     \]

4. **Expenses:**
   - Rent = $30
   - Cost of ingredients = $75
   - Total expenses:
     \[
     30 \, \text{dollars (rent)} + 75 \, \text{dollars (ingredients)} = 105 \, \text{dollars}
     \]

5. **Net Earnings:**
   - Net earnings after payi
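
The displayed output is cut off before the final figure. As an independent check of the arithmetic (computed from the problem statement, not recovered from the model's reply), the steps can be sketched in a few lines; the variable names are arbitrary.

```python
popcorn_per_day = 50
cotton_candy_per_day = 3 * popcorn_per_day   # three times the popcorn earnings
days = 5
rent, ingredients = 30, 75

gross = (popcorn_per_day + cotton_candy_per_day) * days   # total sales over 5 days
expenses = rent + ingredients                              # rent plus ingredients
net = gross - expenses
print(net)
```

The intermediate values agree with the totals shown in steps 3 and 4 of the output (1000 dollars earned, 105 dollars in expenses).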

# <span style='background :lightgreen' >4. Points to Ponder and Tasks To Do:</span>
- Develop a clear distinction between different OpenAI models (o3, GPT-5, GPT-4o, GPT-4.1, GPT-3.5, etc.) including:
    - Model sizes & variants
    - Context window limits
    - Supported input/output modalities (text, image, audio, video)
    - Typical use cases and trade-offs
- Understand:
    - What the OpenAI SDK (the `openai` Python package) is
    - The differences between the three major OpenAI APIs: `chat.completions`, `assistants`, and `responses`
    - How API keys are generated, rotated, stored, and loaded (environment variables, `.env` file, `dotenv.load_dotenv()`).
- Gain a solid grasp of OpenAI's pricing structure, including:
    - Input/output tokens
    - Cost multipliers for different models
    - What counts as billable usage
    - How to estimate and monitor real project costs.
- Have a crystal-clear understanding of the parameters accepted by the `openai.OpenAI()` client constructor, including:
    - `api_key`, `organization`, `project`, `base_url`, `timeout`, etc.
    - When and why alternative base URLs (e.g., Groq, Azure, local) are used.
- Compare performance of at least two models on the same task. Where does each model perform better? Why?
- Reflect on:
    - When to use small models vs large models.
    - When to use chat.completions vs responses API.
- Write code that:
    - Lists the available models on your account using the SDK.
    - Prints the important attributes of 3 models (context length, type, capabilities).
- Write code that captures and prints the entire Response object, and identify the `output_text`, `output_tokens`, `input_tokens`, finish reasons, and model metadata.
- Write code that implements cost estimation for a simple practical scenario:
    - Estimate cost for 10, 100, and 1,000 API calls.
    - Compare cost difference across two models (e.g., GPT-4o vs GPT-4.1-mini).
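
As a starting point for the cost-estimation task, the sketch below shows the basic arithmetic. Everything in it is an assumption made for illustration: the function name `estimate_cost`, the model labels, the token counts per call, and above all the prices, which are placeholders and not real quotes. Replace them with the current figures from the pricing page before drawing any conclusions.

```python
def estimate_cost(calls, input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Estimated USD cost of `calls` requests, with prices quoted per 1M tokens."""
    per_call = (input_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1_000_000
    return calls * per_call

# Placeholder prices (USD per 1M input tokens, USD per 1M output tokens), NOT real quotes
PRICES = {"model-a": (1.00, 4.00), "model-b": (0.25, 1.00)}

for model, (p_in, p_out) in PRICES.items():
    for calls in (10, 100, 1000):
        cost = estimate_cost(calls, input_tokens=200, output_tokens=300,
                             price_in_per_m=p_in, price_out_per_m=p_out)
        print(f"{model}: {calls:>5} calls -> ${cost:.4f}")
```

In a real project the token counts would come from the `usage` field of each response rather than from fixed guesses.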