# **LLMs for I-Os: A Functionalities and Applications Master Class - APIs**

In this notebook, we will go over how to interact with LLM APIs. Although we focus on OpenAI, code for Anthropic and Gemini is also included.

Before we start, you will need to get an API key from OpenAI. Please follow these steps:


1.   Go to: https://platform.openai.com/docs/overview
2.   Register an account
3.   During the registration process, you will be given an API key. It is important that you save the API key in a secure place.
4.   For Google Colab, you can add the key to the Secrets section in the sidebar for easy importing.
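
As a minimal sketch of step 4, the helper below looks a key up in Colab Secrets and falls back to an environment variable when the notebook runs outside Colab. The function name `get_api_key` and the secret name in the usage comment are hypothetical; use whatever name you gave your secret.

```python
import os


def get_api_key(name: str) -> str:
    """Look up a key in Colab Secrets first, then fall back to an environment variable."""
    value = None
    try:
        # google.colab is only importable inside a Google Colab runtime
        from google.colab import userdata
        value = userdata.get(name)
    except Exception:
        # Not running in Colab, or the secret is missing or not shared with this notebook
        value = None
    if not value:
        value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"API key '{name}' not found in Colab Secrets or the environment")
    return value


# Hypothetical usage, assuming a secret named OPENAI_API_KEY:
# client = OpenAI(api_key=get_api_key("OPENAI_API_KEY"))
```

Keeping the lookup in one place means the key never has to be typed into a cell, which lowers the risk of sharing it by accident along with the notebook.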



# **Interacting with LLM API**

In [None]:
# Installing packages
!pip install openai
!pip install anthropic
!pip install google-generativeai



In [None]:
# Loading general package
from pathlib import Path # Useful if your data lives in a different folder
import pandas as pd

In [None]:
# Defining path
project_path = Path.cwd().parent

## **Interacting with OpenAI API**

In [None]:
# Loading library
from openai import OpenAI
from google.colab import userdata
import re
import os
import json

In [None]:
# Initializing client
os.environ['OPENAI_API_KEY'] = userdata.get('key')
client = OpenAI(
    api_key = os.environ.get('OPENAI_API_KEY'))

In [None]:
# Prompt Creation
prompt = f"""
Create a situational judgement test (SJT) item to assess the following skill:

# Skill
- Active Listening — Giving full attention to what other people are saying, taking time to understand the points being made, asking questions as appropriate, and not interrupting at inappropriate times.

# Output Format
Provide the response in JSON format with the following structure:

```json
{{
  "question": "<Insert SJT question here>",
  "options": {{
    "A": "<Insert Option A>",
    "B": "<Insert Option B>",
    "C": "<Insert Option C>",
    "D": "<Insert Option D>"
  }},
  "correct_answer": "<Insert Correct Answer Letter (A, B, C, or D)>",
  "rationale": "<Insert detailed rationale explaining why the correct answer is the best choice>"
}}
```
"""

In [None]:
# Calling OPENAI API
response = client.chat.completions.create(
            model="chatgpt-4o-latest", # Getting the latest chatgpt model
            messages=[{"role": "system", "content": prompt}],
            response_format={ "type": "json_object" },
        )

In [None]:
# Inspecting output
response_text = response.choices[0].message.content
print(response_text)


{
  "question": "You are in a team meeting where your colleague, Sarah, is explaining a challenge she is facing on a project. While she is speaking, you notice that some of your teammates are starting to discuss their own ideas quietly. You also realize that you have a potential solution in mind, and you are eager to share it. What should you do?",
  "options": {
    "A": "Interrupt Sarah and share your idea immediately, as it might help resolve the issue faster.",
    "B": "Wait for Sarah to finish speaking, summarize her main points to confirm your understanding, and then ask if she would like to hear your potential solution.",
    "C": "Tune out the conversation since you already have a solution in mind and prepare what you will say when it's your turn to speak.",
    "D": "Speak over Sarah to remind the team to pay attention, as their side discussions are disrespectful."
  },
  "correct_answer": "B",
  "rationale": "Option B is the best choice because it demonstrates active listen

In [None]:
# Convert response_text to dictionary
response_dict = json.loads(response_text)

# Create DataFrame
result_df = pd.DataFrame([{
    "Question": response_dict["question"],
    "Option A": response_dict["options"]["A"],
    "Option B": response_dict["options"]["B"],
    "Option C": response_dict["options"]["C"],
    "Option D": response_dict["options"]["D"],
    "Correct Answer": response_dict["correct_answer"],
    "Rationale": response_dict["rationale"]
}])

In [None]:
result_df

| | Question | Option A | Option B | Option C | Option D | Correct Answer | Rationale |
|---|---|---|---|---|---|---|---|
| 0 | You are in a team meeting where your colleague... | Interrupt Sarah and share your idea immediatel... | Wait for Sarah to finish speaking, summarize h... | Tune out the conversation since you already ha... | Speak over Sarah to remind the team to pay att... | B | Option B is the best choice because it demonst... |
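
Because the model is only asked to follow the JSON structure, it can help to check the parsed dictionary before building a DataFrame from it. The sketch below is one possible check, not part of any SDK; the function name `validate_sjt_item` and its messages are hypothetical.

```python
REQUIRED_KEYS = {"question", "options", "correct_answer", "rationale"}
OPTION_LETTERS = ("A", "B", "C", "D")


def validate_sjt_item(item: dict) -> list:
    """Return a list of problems; an empty list means the item has the expected shape."""
    problems = []
    missing = REQUIRED_KEYS - item.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    options = item.get("options")
    if isinstance(options, dict):
        # Every answer letter requested in the prompt should be present
        problems += [f"missing option {letter}" for letter in OPTION_LETTERS
                     if letter not in options]
    elif "options" in item:
        problems.append("options is not a JSON object")
    if item.get("correct_answer") not in OPTION_LETTERS:
        problems.append("correct_answer is not one of A, B, C, D")
    return problems


example = {
    "question": "Q",
    "options": {"A": "a", "B": "b", "C": "c", "D": "d"},
    "correct_answer": "B",
    "rationale": "R",
}
print(validate_sjt_item(example))  # an empty list means no problems were found
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that is wrong with a generated item in a single pass.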


## **Interacting with Gemini API**

In [None]:
# Loading library
import google.generativeai as genai

In [None]:
# Initializing environment
os.environ['Gemini'] = userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=os.environ.get('Gemini'))

In [None]:
# Gemini prompting
prompt = """Create a situational judgement test (SJT) item to assess the following skill.

Skill:
- Active Listening — Giving full attention to what other people are saying, taking time to understand the points being made, asking questions as appropriate, and not interrupting at inappropriate times.

Use this JSON schema:

SJT_Item = {
  "question": str,
  "options": {
    "A": str,
    "B": str,
    "C": str,
    "D": str
  },
  "correct_answer": str,
  "rationale": str
}

Return: SJT_Item
"""

In [None]:
# Calling model
model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(prompt)

In [None]:
# Getting response text
response_text = response.text

# Extract JSON from response_text
cleaned_json_text = re.sub(r"```json|```", "", response_text).strip()

# Parse JSON into a dictionary
response_dict = json.loads(cleaned_json_text)

# Convert to Pandas DataFrame
result_df = pd.DataFrame([{
    "Question": response_dict["question"],
    "Option A": response_dict["options"]["A"],
    "Option B": response_dict["options"]["B"],
    "Option C": response_dict["options"]["C"],
    "Option D": response_dict["options"]["D"],
    "Correct Answer": response_dict["correct_answer"],
    "Rationale": response_dict["rationale"]
}])

In [None]:
result_df

| | Question | Option A | Option B | Option C | Option D | Correct Answer | Rationale |
|---|---|---|---|---|---|---|---|
| 0 | You are in a meeting with your team discussing... | Interrupt Sarah to get to the point quickly, s... | Maintain eye contact with Sarah, nod to show y... | Let Sarah finish her explanation, then address... | Ignore Sarah's explanation and focus on your o... | B | Option B demonstrates active listening by givi... |
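
The regex used above removes Markdown code fences, but a model may also add a sentence before or after the JSON. As a hedged alternative, the sketch below scans the text for the first span that parses as a JSON object, using the standard library's `json.JSONDecoder.raw_decode`. The helper name `extract_json` is hypothetical.

```python
import json


def extract_json(text: str) -> dict:
    """Return the first JSON object found in text, ignoring surrounding prose or fences."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows
            obj, _ = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # this brace did not start valid JSON; try the next one
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in response text")


fence = "`" * 3  # built programmatically so the example holds no literal code fence
raw = f'Sure, here is the item:\n{fence}json\n{{"question": "Q", "correct_answer": "B"}}\n{fence}'
print(extract_json(raw))
```

With this helper, `response_dict = extract_json(response.text)` could stand in for the regex and `json.loads` steps, at the cost of silently ignoring any text outside the JSON object.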


## **Interacting with Anthropic API**

In [None]:
# Loading library
import anthropic

In [None]:
# Initializing client
os.environ['Claude'] = userdata.get('claude_key')

client = anthropic.Anthropic(
    api_key = os.environ['Claude'],
)

In [None]:
# Getting response
message = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": prompt}
        ]
    )

In [None]:
response_text = message.content[0].text

In [None]:
response_text

'{\n  "question": "You are in a team meeting where a colleague is presenting their concerns about a recent project. While they are speaking, you notice they seem hesitant and are frequently pausing. What would be the most appropriate way to demonstrate active listening in this situation?",\n  \n  "options": {\n    "A": "Jump in with solutions whenever they pause to help move the conversation along",\n    "B": "Maintain eye contact, nod occasionally, and wait until they finish before asking clarifying questions",\n    "C": "Take detailed notes and interrupt when you need clarification on specific points",\n    "D": "Summarize their points quickly whenever they pause to show you\'re following along"\n  },\n  \n  "correct_answer": "B",\n  \n  "rationale": "Option B best demonstrates active listening skills because it shows respect for the speaker by allowing them to complete their thoughts without interruption, while still showing engagement through non-verbal cues (eye contact, nodding).

In [None]:
# Parse JSON into a dictionary
response_dict = json.loads(response_text)

# Convert to Pandas DataFrame
result_df = pd.DataFrame([{
    "Question": response_dict["question"],
    "Option A": response_dict["options"]["A"],
    "Option B": response_dict["options"]["B"],
    "Option C": response_dict["options"]["C"],
    "Option D": response_dict["options"]["D"],
    "Correct Answer": response_dict["correct_answer"],
    "Rationale": response_dict["rationale"]
}])

In [None]:
result_df

| | Question | Option A | Option B | Option C | Option D | Correct Answer | Rationale |
|---|---|---|---|---|---|---|---|
| 0 | You are in a team meeting where a colleague is... | Jump in with solutions whenever they pause to ... | Maintain eye contact, nod occasionally, and wa... | Take detailed notes and interrupt when you nee... | Summarize their points quickly whenever they p... | B | Option B best demonstrates active listening sk... |
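
All three providers return the same nested structure, so the step that flattens it into one table row can be shared. The sketch below shows one way to do that; the function name `sjt_item_to_row` is hypothetical, and the column names are the ones used in the cells above.

```python
def sjt_item_to_row(item: dict) -> dict:
    """Flatten a parsed SJT item into a single flat row keyed by column name."""
    row = {"Question": item["question"]}
    for letter in ("A", "B", "C", "D"):
        row[f"Option {letter}"] = item["options"][letter]
    row["Correct Answer"] = item["correct_answer"]
    row["Rationale"] = item["rationale"]
    return row


example = {
    "question": "Q",
    "options": {"A": "a", "B": "b", "C": "c", "D": "d"},
    "correct_answer": "B",
    "rationale": "R",
}
print(sjt_item_to_row(example))
```

A one-row DataFrame can then be built with `pd.DataFrame([sjt_item_to_row(response_dict)])`, regardless of which API produced `response_dict`.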


# **How to run your own local LLM**

Before running a local LLM, make sure that you enable a GPU on Google Colab. Here's how to do it:


1.   In the top-right bar, click the downward-pointing triangle icon
2.   Click *Change runtime type*
3.   Click *L4 GPU*
4.   Click *Save*

In [None]:
# Installing packages
!pip install vllm
!pip install bitsandbytes

In [None]:
import torch
import os
from vllm import LLM, SamplingParams
from google.colab import drive
import pandas as pd

In [None]:
# Defining model parameter

### See this list for all models - we recommend the bnb models since they are quantized to be smaller
### Depending on your VRAM availability, you might have to use a smaller model
### Rule of thumb: VRAM in GB should be larger than the parameter count in billions (e.g., 50 GB of VRAM for a 48B model)
### https://docs.unsloth.ai/get-started/all-our-models
model_id = "unsloth/DeepSeek-R1-Distill-Qwen-14B-unsloth-bnb-4bit"
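
To make the rule of thumb concrete, the sketch below estimates the memory taken by the model weights alone at a given precision. It is a back-of-the-envelope calculation under stated assumptions: it ignores the KV cache, activations, and framework overhead, so real usage is higher. The function name `estimate_weight_gb` is hypothetical.

```python
def estimate_weight_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes), weights only."""
    return n_params * bits_per_param / 8 / 1e9


# A 14B-parameter model at three common precisions
for bits in (4, 8, 16):
    print(f"{bits}-bit: {estimate_weight_gb(14e9, bits)} GB")
```

One byte per parameter (8-bit) corresponds to the rule of thumb in the cell above. A 4-bit quantized 14B model needs roughly 7 GB for its weights, which leaves headroom on a single GPU for the KV cache.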

In [None]:
# Loading model

### While the model loads, the log output shows how many requests can be run concurrently.
llm = LLM(model=model_id,
          dtype=torch.bfloat16,
          quantization="bitsandbytes",
          load_format="bitsandbytes",
          max_model_len=700,
          tensor_parallel_size= torch.cuda.device_count(),
          )

In [None]:
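# Setting sampling parameters
# Note: the prompt plus the generated tokens cannot exceed max_model_len (700 above), so generation may stop before max_tokens is reached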
sampling_params = SamplingParams(temperature=0.5,
                                 max_tokens=1000,
                                 #top_p=1,
                                 #presence_penalty=0,
                                 #frequency_penalty=0,
                                 )
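
For intuition about the `temperature` setting, the sketch below shows how dividing the model's scores (logits) by a temperature changes the resulting probabilities. It is a conceptual illustration with made-up logits, not vLLM's internal code.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling them by 1 / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (1.0, 0.5):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
```

A lower temperature concentrates probability on the highest-scoring token, so outputs become more repeatable; a higher temperature spreads it out and produces more varied items.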

In [None]:
# Running individual call
## Prompt (a plain string: nothing is interpolated, so the braces in the schema need no escaping)
prompt = """Create a situational judgement test (SJT) item to assess the following skill.

Skill:
- Active Listening — Giving full attention to what other people are saying, taking time to understand the points being made, asking questions as appropriate, and not interrupting at inappropriate times.

Use this JSON schema:

SJT_Item = {
  "question": str,
  "options": {
    "A": str,
    "B": str,
    "C": str,
    "D": str
  },
  "correct_answer": str,
  "rationale": str
}

Return: SJT_Item
"""

outputs = llm.generate(prompt, sampling_params)

In [None]:
# Running batch call
from tqdm import tqdm
import json
import re

## `df` is assumed to be a DataFrame loaded beforehand, with one skill description per row in a 'Skill' column
## Defining batch size
BATCH_SIZE = 30

## Collecting one row per successfully parsed item
rows = []

## Looping through batches
for i in tqdm(range(0, len(df), BATCH_SIZE)):
    batch_df = df.iloc[i: i + BATCH_SIZE]

    # Creating batch prompts (literal braces are doubled inside the f-string)
    prompts = []
    for _, row in batch_df.iterrows():
        skill = row['Skill']
        prompt = f"""Create a situational judgement test (SJT) item to assess the following skill.

Skill:
{skill}

Use this JSON schema:

SJT_Item = {{
  "question": str,
  "options": {{
    "A": str,
    "B": str,
    "C": str,
    "D": str
  }},
  "correct_answer": str,
  "rationale": str
}}

Return: SJT_Item
"""

        prompts.append(prompt)

    # Generating responses in batch using vLLM
    outputs = llm.generate(prompts, sampling_params)

    # Parsing each response; outputs come back in the same order as the prompts
    for output in outputs:
        response = output.outputs[0].text.strip()
        # The model may wrap the JSON in extra text, so keep only the outermost {...} span
        match = re.search(r"\{.*\}", response, flags=re.DOTALL)
        try:
            response_dict = json.loads(match.group(0))
            rows.append({
                "Question": response_dict["question"],
                "Option A": response_dict["options"]["A"],
                "Option B": response_dict["options"]["B"],
                "Option C": response_dict["options"]["C"],
                "Option D": response_dict["options"]["D"],
                "Correct Answer": response_dict["correct_answer"],
                "Rationale": response_dict["rationale"]
            })
        except (AttributeError, KeyError, TypeError, json.JSONDecodeError):
            # Skipping responses that contain no valid JSON in the expected shape
            continue

## Building the result DataFrame once, after all batches have finished
result_df = pd.DataFrame(rows)
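
The batching pattern used in the loop above, stepping through the data in fixed-size slices, can be isolated into a small helper. The sketch below uses placeholder data, and the names `iter_batches` and `num_batches` are hypothetical.

```python
import math


def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def num_batches(n_items, batch_size):
    """Number of batches needed to cover n_items; the last batch may be smaller."""
    return math.ceil(n_items / batch_size)


skills = [f"skill_{k}" for k in range(7)]  # placeholder data
for batch in iter_batches(skills, 3):
    print(len(batch), batch)
print("batches:", num_batches(len(skills), 3))
```

Using the ceiling of `n_items / batch_size` gives the right count even when the data divides evenly into batches, which matters if you pass an explicit total to a progress bar.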