# Working with LLM APIs

In this notebook, we will illustrate how to send a prompt to LLM APIs for the following providers:
- OpenAI
- Google
- Anthropic

Google and Anthropic each provide their own SDK, and both also expose endpoints that are compatible with the OpenAI Python SDK. Throughout the course, we will use the OpenAI Python SDK.


General requirements:
- API keys
- Python SDKs for the APIs

Let's start by importing the `os` library:

In [None]:
import os
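Before calling any provider, it can help to confirm that the API keys are actually available. The sketch below assumes the keys live in the environment variables read later in this notebook (`OPENAI_API_KEY`, `GEMINI_API_KEY`, `CLAUDE_API_KEY`); the helper name `find_missing_keys` is hypothetical and not part of any SDK.

```python
import os

# Environment variable names read later in this notebook.
REQUIRED_KEYS = ["OPENAI_API_KEY", "GEMINI_API_KEY", "CLAUDE_API_KEY"]


def find_missing_keys(env, names=REQUIRED_KEYS):
    """Return the names that are unset or empty in the given mapping."""
    return [name for name in names if not env.get(name)]


missing = find_missing_keys(os.environ)
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All API keys are set.")
```

Passing the mapping in as an argument keeps the check easy to try with a plain dictionary instead of the real environment.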

We then define the following prompt:

In [None]:
content_prompt = """
I am working with a dataset that contains information about Chicago crime incidents. 
I want to create a SQL query to pull the total number of crimes that ended in arrest by a year
"""

Next, we define the maximum number of output tokens and the sampling temperature:

In [31]:
temperature = 0
max_tokens = 500
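To build intuition for what the temperature setting does, here is a minimal sketch of the textbook temperature-scaled softmax over a few made-up token scores. It is an illustration only: the scores are hypothetical, the special case for `temperature == 0` is an assumption that treats zero as "always pick the top score", and the providers' actual sampling code is not shown in this notebook and may differ.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature."""
    if temperature == 0:
        # Greedy limit (assumption): all probability mass on the highest score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures spread it out. Even with a temperature of 0, identical outputs across calls may not be guaranteed in practice.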

# The OpenAI API

In [None]:
from openai import OpenAI
import pandas as pd


In [32]:
openai_api_key = os.environ.get("OPENAI_API_KEY")
client = OpenAI(api_key=openai_api_key)


In [33]:
response_openai = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": content_prompt}],
    temperature=temperature,
    max_tokens=max_tokens,
)


In [34]:
response_openai


ChatCompletion(id='chatcmpl-C6zBUKxgFfuzlQaasCgcQPWkkHDNa', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Certainly! Assuming your table is named `chicago_crimes` and has at least the following columns:\n\n- `arrest` (a boolean or string indicating if an arrest was made, e.g., `TRUE`/`FALSE` or `'Y'`/`'N'`)\n- `date` (a date or datetime column indicating when the crime occurred)\n\nHere’s a sample SQL query to get the total number of crimes that ended in arrest, grouped by year:\n\n```sql\nSELECT \n    EXTRACT(YEAR FROM date) AS year,\n    COUNT(*) AS total_arrests\nFROM \n    chicago_crimes\nWHERE \n    arrest = TRUE  -- or 'Y' if it's a string\nGROUP BY \n    year\nORDER BY \n    year;\n```\n\n**Notes:**\n- If your `arrest` column is a string (`'Y'`/`'N'`), change `arrest = TRUE` to `arrest = 'Y'`.\n- If your SQL dialect does not support `EXTRACT(YEAR FROM date)`, you can use `YEAR(date)` (MySQL, SQL Server) or `strftime('%Y', da
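The response object carries more than the text. In particular, each choice has a `finish_reason`; in the output above it is `'stop'`, and the OpenAI API reports `'length'` when generation ends because the `max_tokens` limit was reached. The sketch below shows one way to check for that. The `was_truncated` helper is hypothetical, and the `SimpleNamespace` objects are stand-ins that only mimic the shape of a `ChatCompletion`, so the example runs without an API call.

```python
from types import SimpleNamespace


def was_truncated(response):
    """Return True if the first choice stopped because of the token limit."""
    return response.choices[0].finish_reason == "length"


# Hypothetical stand-in objects that mimic the shape of a ChatCompletion.
complete = SimpleNamespace(choices=[SimpleNamespace(finish_reason="stop")])
cut_off = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])

print(was_truncated(complete))
print(was_truncated(cut_off))
```

With a real response, the same check would be `was_truncated(response_openai)`.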

In [35]:
print(response_openai.choices[0].message.content)


Certainly! Assuming your table is named `chicago_crimes` and has at least the following columns:

- `arrest` (a boolean or string indicating if an arrest was made, e.g., `TRUE`/`FALSE` or `'Y'`/`'N'`)
- `date` (a date or datetime column indicating when the crime occurred)

Here’s a sample SQL query to get the total number of crimes that ended in arrest, grouped by year:

```sql
SELECT 
    EXTRACT(YEAR FROM date) AS year,
    COUNT(*) AS total_arrests
FROM 
    chicago_crimes
WHERE 
    arrest = TRUE  -- or 'Y' if it's a string
GROUP BY 
    year
ORDER BY 
    year;
```

**Notes:**
- If your `arrest` column is a string (`'Y'`/`'N'`), change `arrest = TRUE` to `arrest = 'Y'`.
- If your SQL dialect does not support `EXTRACT(YEAR FROM date)`, you can use `YEAR(date)` (MySQL, SQL Server) or `strftime('%Y', date)` (SQLite).
- If your date column is not named `date`, replace it with the correct column name.

**Example for MySQL:**
```sql
SELECT 
    YEAR(date) AS year,
    COUNT(*) AS total_

# Google Gemini API


In [27]:
from google import genai
from google.genai import types


In [43]:
gemini_api_key = os.environ.get("GEMINI_API_KEY")
gemini_client = genai.Client(api_key=gemini_api_key)

In [None]:
response_gemini = gemini_client.models.generate_content(
    model="gemini-2.0-flash",
    contents=content_prompt,
    config=types.GenerateContentConfig(
        max_output_tokens=max_tokens, temperature=temperature
    ),
)


In [None]:
print(response_gemini.candidates[0].content.parts[0].text)


```sql
SELECT
    STRFTIME('%Y', Date) AS CrimeYear,  -- Extract the year from the 'Date' column
    COUNT(*) AS TotalArrests
FROM
    ChicagoCrimes
WHERE
    Arrest = TRUE  -- Filter for incidents where an arrest was made
GROUP BY
    CrimeYear
ORDER BY
    CrimeYear;
```

**Explanation:**

1. **`SELECT STRFTIME('%Y', Date) AS CrimeYear, COUNT(*) AS TotalArrests`**:
   - `STRFTIME('%Y', Date)`: This extracts the year from the `Date` column.  The `STRFTIME` function is a powerful date and time formatting function in SQLite (and many other SQL dialects).  `'%Y'` specifies that we want to extract the year in a four-digit format (e.g., 2023).  The `Date` column is assumed to be in a format that SQLite can understand (e.g., 'YYYY-MM-DD', 'YYYY-MM-DD HH:MM:SS', or a Unix timestamp).  If your `Date` column is in a different format, you might need to adjust this part of the query.  For example, if your date is stored as text in the format 'MM/DD/YYYY', you might need to use `STRFTIME('%Y', SU

## Using the OpenAI API SDK

Let's now use the OpenAI API SDK to call the Google Gemini API. You can find more details in the [Gemini documentation](https://ai.google.dev/gemini-api/docs/openai).

In [45]:
gemini_client2 = OpenAI(
    api_key=gemini_api_key,
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)


In [50]:
response_gemini2 = gemini_client2.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": content_prompt}],
    temperature=temperature,
    max_tokens=max_tokens,
)


In [52]:
print(response_gemini2.choices[0].message.content)


```sql
SELECT
    STRFTIME('%Y', Date) AS CrimeYear,  -- Extract the year from the 'Date' column
    COUNT(*) AS TotalArrests
FROM
    ChicagoCrimes
WHERE
    Arrest = TRUE  -- Filter for incidents where an arrest was made
GROUP BY
    CrimeYear
ORDER BY
    CrimeYear;
```

**Explanation:**

1. **`SELECT STRFTIME('%Y', Date) AS CrimeYear, COUNT(*) AS TotalArrests`**:
   - `STRFTIME('%Y', Date)`: This extracts the year from the `Date` column.  The `STRFTIME` function is a powerful date and time formatting function in SQLite (and many other SQL dialects).  `'%Y'` specifies that we want to extract the year in a four-digit format (e.g., 2023).  The `Date` column is assumed to be in a format that SQLite can understand (e.g., 'YYYY-MM-DD', 'YYYY-MM-DD HH:MM:SS', or a Unix timestamp).  If your `Date` column is in a different format, you might need to adjust this part of the query.  For example, if your date is stored as text in the format 'MM/DD/YYYY', you might need to use `STRFTIME('%Y', SU
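The replies above mix a fenced SQL block with explanatory text. If only the query is needed, one option is to pull out the fenced block with a regular expression. This is a minimal sketch: the helper name `extract_sql_blocks` and the sample reply are hypothetical, and it assumes the model labels its fence with `sql`, which is not guaranteed.

```python
import re

FENCE = "`" * 3  # three backticks, built here to keep the example readable

SQL_BLOCK = re.compile(FENCE + r"sql[ \t]*\n(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def extract_sql_blocks(text):
    """Return the contents of every closed sql fence found in text."""
    return [match.strip() for match in SQL_BLOCK.findall(text)]


# Hypothetical reply text shaped like the model outputs above.
sample_reply = (
    FENCE + "sql\n"
    "SELECT COUNT(*) FROM ChicagoCrimes;\n"
    + FENCE + "\n\n**Explanation:** ..."
)

print(extract_sql_blocks(sample_reply))
```

A block that was cut off before its closing fence is skipped rather than returned half-finished. With a real response, the input would be `response_gemini2.choices[0].message.content`.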

# Anthropic Claude API

In [42]:
import anthropic


In [54]:
claude_api_key = os.environ.get("CLAUDE_API_KEY")
claude_client = anthropic.Anthropic(api_key=claude_api_key)



In [55]:

response_claude = claude_client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": content_prompt}],
)


In [56]:
print(response_claude)


Message(id='msg_013Eq75NfJqSXCXmWyP5np3j', content=[TextBlock(citations=None, text="Here's a SQL query to get the total number of crimes that ended in arrest by year from a Chicago crime dataset:\n\n```sql\nSELECT \n    EXTRACT(YEAR FROM date_column) AS year,\n    COUNT(*) AS total_arrests\nFROM chicago_crime_table\nWHERE arrest = TRUE\nGROUP BY EXTRACT(YEAR FROM date_column)\nORDER BY year;\n```\n\nHowever, since I don't know your exact table structure, you may need to adjust the following parts:\n\n1. **Table name**: Replace `chicago_crime_table` with your actual table name\n2. **Date column**: Replace `date_column` with your actual date column name (commonly `date`, `incident_date`, `date_occurred`, etc.)\n3. **Arrest column**: Replace `arrest` with your actual arrest column name, and adjust the condition based on how arrests are indicated in your data:\n   - If it's a boolean: `WHERE arrest = TRUE`\n   - If it's text: `WHERE arrest = 'Y'` or `WHERE arrest = 'Yes'`\n   - If it's num
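As the printed `Message` shows, the native Anthropic response stores its output in `content`, a list of blocks, rather than in a single string. The sketch below joins the text of every text block, which is a little more general than reading only `content[0]`. The helper name `join_text_blocks` is hypothetical, and the `SimpleNamespace` objects are stand-ins that only mimic the shape of the response.

```python
from types import SimpleNamespace


def join_text_blocks(message):
    """Concatenate the text of every block whose type is "text"."""
    return "".join(
        block.text
        for block in message.content
        if getattr(block, "type", None) == "text"
    )


# Hypothetical stand-in for an Anthropic Message with two text blocks.
fake_message = SimpleNamespace(
    content=[
        SimpleNamespace(type="text", text="SELECT 1;"),
        SimpleNamespace(type="text", text=" -- done"),
    ]
)

print(join_text_blocks(fake_message))
```

With a real response, the call would be `join_text_blocks(response_claude)`.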

In [57]:
print(response_claude.content[0].text)

Here's a SQL query to get the total number of crimes that ended in arrest by year from a Chicago crime dataset:

```sql
SELECT 
    EXTRACT(YEAR FROM date_column) AS year,
    COUNT(*) AS total_arrests
FROM chicago_crime_table
WHERE arrest = TRUE
GROUP BY EXTRACT(YEAR FROM date_column)
ORDER BY year;
```

However, since I don't know your exact table structure, you may need to adjust the following parts:

1. **Table name**: Replace `chicago_crime_table` with your actual table name
2. **Date column**: Replace `date_column` with your actual date column name (commonly `date`, `incident_date`, `date_occurred`, etc.)
3. **Arrest column**: Replace `arrest` with your actual arrest column name, and adjust the condition based on how arrests are indicated in your data:
   - If it's a boolean: `WHERE arrest = TRUE`
   - If it's text: `WHERE arrest = 'Y'` or `WHERE arrest = 'Yes'`
   - If it's numeric: `WHERE arrest = 1`

Here are a few variations depending on common column naming conventions:

**V

## Using the OpenAI API SDK

Let's now use the OpenAI API SDK to call the Anthropic Claude API. You can find more details in the [Anthropic documentation](https://docs.anthropic.com/en/api/openai-sdk).

In [58]:
claude_client2 = OpenAI(
    api_key=claude_api_key,
    base_url="https://api.anthropic.com/v1/",
)
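The OpenAI-compatible clients in this notebook differ from the plain OpenAI client only in their `api_key` and `base_url`. One way to keep that in a single place is a small lookup table, sketched below. The environment variable names and URLs are the ones used in this notebook; the `PROVIDERS` table and the `client_kwargs` helper are hypothetical names introduced for illustration.

```python
import os

# Settings taken from the cells in this notebook; None means the SDK default URL.
PROVIDERS = {
    "openai": {"env_var": "OPENAI_API_KEY", "base_url": None},
    "gemini": {
        "env_var": "GEMINI_API_KEY",
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
    },
    "claude": {
        "env_var": "CLAUDE_API_KEY",
        "base_url": "https://api.anthropic.com/v1/",
    },
}


def client_kwargs(provider, env=os.environ):
    """Build the keyword arguments for OpenAI(...) for one provider."""
    settings = PROVIDERS[provider]
    kwargs = {"api_key": env.get(settings["env_var"])}
    if settings["base_url"] is not None:
        kwargs["base_url"] = settings["base_url"]
    return kwargs


print(client_kwargs("claude", env={"CLAUDE_API_KEY": "example-key"}))
```

A client could then be created with `OpenAI(**client_kwargs("gemini"))`, leaving the rest of the `chat.completions.create` call unchanged apart from the model name.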


In [59]:
response_claude2 = claude_client2.chat.completions.create(
    model="claude-opus-4-1-20250805",
    messages=[{"role": "user", "content": content_prompt}],
    temperature=temperature,
    max_tokens=max_tokens,
)


In [60]:
print(response_claude2.choices[0].message.content)


Here's a SQL query to get the total number of crimes that ended in arrest by year from a Chicago crime dataset:

```sql
SELECT 
    YEAR(date) AS year,
    COUNT(*) AS total_arrests
FROM 
    crimes
WHERE 
    arrest = TRUE
GROUP BY 
    YEAR(date)
ORDER BY 
    year;
```

**Alternative versions depending on your database system and column names:**

If your date column has a different name or format:
```sql
-- If the date column is named differently (e.g., 'incident_date', 'occurred_date')
SELECT 
    YEAR(incident_date) AS year,
    COUNT(*) AS total_arrests
FROM 
    crimes
WHERE 
    arrest = TRUE
GROUP BY 
    YEAR(incident_date)
ORDER BY 
    year;
```

If you're using PostgreSQL:
```sql
SELECT 
    EXTRACT(YEAR FROM date) AS year,
    COUNT(*) AS total_arrests
FROM 
    crimes
WHERE 
    arrest = TRUE
GROUP BY 
    EXTRACT(YEAR FROM date)
ORDER BY 
    year;
```

If the arrest column uses different values (like 'Y'/'N' or 1/0):
```sql
-- For 'Y'/'N' values
SELECT 
    YEAR(date) 

# Resources

- OpenAI API documentation
- Gemini API documentation
- Claude API documentation