# LLM Utility Package

This package is a thin wrapper around the main LLM providers, with the minimal set of features my workflow needs. It supports:

1. Concurrent text embeddings (except for Claude)
2. Saving to a chromadb vector store
3. Concurrent chat calls
4. Web search
5. Structured outputs

## 1. Environment Setup

Before running the code, ensure you have a `.env` file in your project root with the following keys:

```bash
GEMINI_TYPE=
GEMINI_PROJECT_ID=
GEMINI_PRIVATE_KEY_ID=
GEMINI_PRIVATE_KEY=
GEMINI_CLIENT_EMAIL=
GEMINI_CLIENT_ID=
GEMINI_AUTH_URI=
GEMINI_TOKEN_URI=
GEMINI_AUTH_PROVIDER_CERT=
GEMINI_CLIENT_CERT_URL=
GEMINI_UNIVERSE_DOMAIN=
ANTHROPIC_API_KEY=
OPENAI_KEY=

```
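
Before making any calls, it can help to confirm that the keys are actually set. The following is a minimal sketch, not part of the package: the helper name `missing_env_keys` is hypothetical, and it assumes every key in the template is needed, which may not hold if you only use one provider.

```python
import os

# Key names copied from the .env template above.
REQUIRED_KEYS = [
    "GEMINI_TYPE",
    "GEMINI_PROJECT_ID",
    "GEMINI_PRIVATE_KEY_ID",
    "GEMINI_PRIVATE_KEY",
    "GEMINI_CLIENT_EMAIL",
    "GEMINI_CLIENT_ID",
    "GEMINI_AUTH_URI",
    "GEMINI_TOKEN_URI",
    "GEMINI_AUTH_PROVIDER_CERT",
    "GEMINI_CLIENT_CERT_URL",
    "GEMINI_UNIVERSE_DOMAIN",
    "ANTHROPIC_API_KEY",
    "OPENAI_KEY",
]


def missing_env_keys(required=REQUIRED_KEYS, environ=None):
    """Return the required keys that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [key for key in required if not env.get(key)]


# Example with an explicit mapping instead of the real environment.
print(missing_env_keys(["OPENAI_KEY", "ANTHROPIC_API_KEY"], {"OPENAI_KEY": "sk-example"}))
```

Calling `missing_env_keys()` with no arguments after `load_dotenv()` checks the real environment.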

In [1]:
from dotenv import load_dotenv
import os
import numpy as np
import json

# Load environment variables
load_dotenv()

True

## 2. Initialization

Import and initialize the `LLMWrapper`. This class handles routing requests to the appropriate provider based on the model name.

In [2]:
from llm_utils.wrapper import LLMWrapper

llm = LLMWrapper()
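
To illustrate what routing by model name could look like, here is a rough sketch. It is not the actual `LLMWrapper` implementation: the function `provider_for` and the prefix table are assumptions made for this example, based only on the model names used in this README.

```python
# Hypothetical prefix table; the real package may route differently.
PROVIDER_PREFIXES = {
    "gpt-": "openai",
    "text-embedding-3": "openai",
    "gemini-": "gemini",
    "text-embedding-004": "gemini",
    "claude-": "anthropic",
}


def provider_for(model: str) -> str:
    """Return the provider whose prefix matches the model name."""
    for prefix, provider in PROVIDER_PREFIXES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"No provider known for model {model!r}")


for name in ["gpt-5-mini", "gemini-2.5-flash", "claude-sonnet-4-5"]:
    print(name, "->", provider_for(name))
```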

## 3. Basic Chat Completion

The `ask` method is the main entry point. It takes parallel lists of inputs (ids, messages, formats) for concurrent processing; the examples below send a small batch of two requests per call.

In [3]:
system_message = "You are a helpful assistant."
user_messages = [
    "Briefly explain the theory of relativity like I'm five. Then explain it again like I'm a college student.",
    "What is the current price of Tesla's stock? If you can't find it, return N/A"
]
response_formats = [
    {
        "name": "example_response1",
        "type": "json_schema",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "FIVE_YEAR_OLD_EXPLANATION": {"type": "string"},
                "COLLEGE_STUDENT_EXPLANATION": {"type": "string"}
            },
            "required": ["FIVE_YEAR_OLD_EXPLANATION", "COLLEGE_STUDENT_EXPLANATION"],
            "additionalProperties": False
        }
    },
    {
        "name": "example_response2",
        "type": "json_schema",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "TICKER": {"type": "string"},
                "CURRENT_PRICE": {"type": "string"},
                "EXCHANGE": {"type": "string"}
            },
            "required": ["TICKER", "CURRENT_PRICE", "EXCHANGE"],
            "additionalProperties": False
        }
    }
]
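
The parallel lists above (ids, messages, formats) lend themselves to a thread-pool fan-out. The sketch below shows one way such a call could be dispatched and collected into a dict keyed by id. It is an assumption about the general approach, not the package's code: `ask_many` and `call_one` are hypothetical names, and `fake_call` stands in for a real provider request.

```python
from concurrent.futures import ThreadPoolExecutor


def ask_many(call_one, ids, user_messages, response_formats, n_workers=2):
    """Run call_one(message, response_format) concurrently, keyed by id."""
    if not (len(ids) == len(user_messages) == len(response_formats)):
        raise ValueError("ids, user_messages and response_formats must have the same length")
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map yields results in input order, so zipping with ids is safe.
        results = pool.map(call_one, user_messages, response_formats)
        return dict(zip(ids, results))


def fake_call(message, response_format):
    """Stand-in for a provider request; echoes its inputs."""
    return {"format": response_format["name"], "length": len(message)}


print(ask_many(fake_call, ["0", "1"], ["hello", "hi"], [{"name": "a"}, {"name": "b"}]))
```

Threads suit this workload because each request spends most of its time waiting on the network.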

### OpenAI (GPT-5 mini)

In [4]:
response = llm.ask(
    system_message=system_message,
    ids=np.arange(len(user_messages)).astype(str).tolist(),
    user_messages=user_messages,
    response_formats=response_formats,
    model="gpt-5-mini",
    n_workers=2,
)

display(response)

response = llm.ask(
    system_message=system_message,
    ids=np.arange(len(user_messages)).astype(str).tolist(),
    user_messages=user_messages,
    response_formats=response_formats,
    model="gpt-5-mini",
    web_search=True,
    n_workers=2,
)

display(response)

Chat Completion: 100%|██████████| 2/2 [00:23<00:00, 11.98s/it]


{'0': {'FIVE_YEAR_OLD_EXPLANATION': 'Space is like a big stretchy blanket. Big things like the Sun and Earth make dents in the blanket. When smaller things roll near the dent, they fall toward the big thing — that’s what gravity feels like. Also, clocks can tick slower if you go really, really fast or sit close to a very heavy thing. Light always moves the fastest for everyone, no matter how you move.',
  'COLLEGE_STUDENT_EXPLANATION': 'The theory of relativity has two parts. Special relativity (1905) starts from two postulates: the laws of physics are the same in all inertial frames, and the speed of light in vacuum is constant for all observers. From these follow Lorentz transformations, relativity of simultaneity, time dilation (moving clocks run slow), length contraction (moving rods shorten), and mass–energy equivalence (E = mc^2). General relativity (1915) generalizes this to include gravity: it describes gravity not as a force but as curvature of four-dimensional spacetime produ

Chat Completion: 100%|██████████| 2/2 [00:14<00:00,  7.05s/it]


{'0': {'FIVE_YEAR_OLD_EXPLANATION': 'Imagine space and time are like a big stretchy sheet. When you put a heavy ball (like a bowling ball) on the sheet it makes a dent, and smaller balls roll toward it — that’s how gravity works: big things bend the sheet and make other things move toward them. Also, if you run really, really fast, your clock would tick more slowly than someone standing still, and things in the direction you run look a little shorter. So: big things bend space and time, and going super fast changes how time and space look.',
  'COLLEGE_STUDENT_EXPLANATION': "The theory of relativity is two related theories by Einstein. Special relativity (1905) rests on two postulates: the laws of physics are the same in all inertial frames, and the speed of light c is constant for all observers. From these follow Lorentz transformations and effects such as time dilation (Δt' = γΔt), length contraction (L' = L/γ) and the relativity of simultaneity, where γ = 1/√(1−v²/c²). It also gives
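
Since the response is a dict keyed by request id, a quick check against each schema's `required` list can catch incomplete results. This is a minimal sketch with a hypothetical helper name (`missing_fields`); it assumes the response shape shown in the output above and does not validate value types.

```python
def missing_fields(response, ids, response_formats):
    """Map each id to the required schema fields absent from its result."""
    problems = {}
    for request_id, fmt in zip(ids, response_formats):
        result = response.get(request_id)
        if not isinstance(result, dict):
            result = {}  # treat a missing or non-dict result as empty
        required = fmt["schema"].get("required", [])
        absent = [field for field in required if field not in result]
        if absent:
            problems[request_id] = absent
    return problems


# Toy data in the same shape as the examples in this README.
formats = [
    {"schema": {"required": ["TICKER", "CURRENT_PRICE", "EXCHANGE"]}},
]
sample = {"0": {"TICKER": "TSLA", "CURRENT_PRICE": "N/A"}}
print(missing_fields(sample, ["0"], formats))
```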

### Gemini (Gemini 2.5 Flash)

In [5]:
response = llm.ask(
    system_message=system_message,
    ids=np.arange(len(user_messages)).astype(str).tolist(),
    user_messages=user_messages,
    response_formats=response_formats,
    model="gemini-2.5-flash",
    n_workers=2,
)

display(response)

response = llm.ask(
    system_message=system_message,
    ids=np.arange(len(user_messages)).astype(str).tolist(),
    user_messages=user_messages,
    response_formats=response_formats,
    model="gemini-2.5-flash",
    web_search=True,
    n_workers=2,
)

display(response)

Chat Completion: 100%|██████████| 2/2 [00:03<00:00,  1.97s/it]


{'0': {'FIVE_YEAR_OLD_EXPLANATION': "Imagine you're on a super-fast spaceship! If you throw a ball, it might seem to go super-duper fast to someone standing still. And if you go really, really fast, time can slow down a tiny bit for you compared to your friends on Earth. It's like time is a little stretchy!",
  'COLLEGE_STUDENT_EXPLANATION': 'The theory of relativity, developed by Albert Einstein, comprises two main theories: special relativity and general relativity. Special relativity, introduced in 1905, postulates that the laws of physics are the same for all non-accelerating observers and that the speed of light in a vacuum is the same for all observers, regardless of their motion or the motion of the light source. This leads to profound consequences like time dilation (time passes slower for objects in motion relative to a stationary observer), length contraction (objects appear shorter in their direction of motion), and mass-energy equivalence (E=mc^2). General relativity, intro

Chat Completion: 100%|██████████| 2/2 [00:17<00:00,  8.85s/it]


{'0': {'FIVE_YEAR_OLD_EXPLANATION': 'Imagine you\'re playing with your toy cars. If one of your toy cars could go super-duper fast, almost as fast as a zoomy light beam, the clock on that car would tick a little bit slower than your clock. So, if you went on a very fast trip, your friends might be a little bit older when you come back, but you wouldn\'t feel any different. Also, if that super-fast toy car zoomed past you, it might look a tiny bit squished in the direction it\'s going. Now, imagine a big, heavy bowling ball on a trampoline. It makes a dip, right? That\'s kind of like how really big things in space, like planets and stars, make a dip in "space-time" (which is like the trampoline). This dip is what makes other things, like little marbles, roll towards the bowling ball. That\'s what we call gravity! So, the theory of relativity just means that how we see time, space, and how things move can change depending on how fast we\'re going or how close we are to something super bi

### Claude (Claude Sonnet 4.5)

In [None]:
response = llm.ask(
    system_message=system_message,
    ids=np.arange(len(user_messages)).astype(str).tolist(),
    user_messages=user_messages,
    response_formats=response_formats,
    model="claude-sonnet-4-5",
    n_workers=2,
)

display(response)

response = llm.ask(
    system_message=system_message,
    ids=np.arange(len(user_messages)).astype(str).tolist(),
    user_messages=user_messages,
    response_formats=response_formats,
    model="claude-sonnet-4-5",
    web_search=True,
    n_workers=2,
)

display(response)

Chat Completion: 100%|██████████| 2/2 [00:08<00:00,  4.25s/it]


{'0': {'FIVE_YEAR_OLD_EXPLANATION': "Imagine you're on a really fast train with your toy car. If you roll the car forward, it goes fast! But to your mom standing outside watching the train zoom by, your toy car looks like it's going SUPER fast because the train is already moving fast and the car is moving too! That's kind of like relativity - things look different depending on where you are and how fast you're moving. Also, time is like a stretchy rubber band - it can go slower or faster depending on how speedy you are!",
  'COLLEGE_STUDENT_EXPLANATION': "Einstein's theory of relativity consists of two parts: Special Relativity (1905) and General Relativity (1915). Special Relativity deals with objects moving at constant velocities and establishes that the laws of physics are the same in all inertial reference frames, and that the speed of light in a vacuum is constant for all observers. This leads to counterintuitive consequences like time dilation and length contraction at high veloc

Chat Completion:   0%|          | 0/2 [00:00<?, ?it/s]

## 4. Embedding

In [None]:
texts = ["cat", "kitty", "dog", "potato"]
ids = ["1", "2", "3", "4"]

def cosine_similarity(v1, v2):
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

# OpenAI Embeddings
print("Getting OpenAI Embeddings...")
_, openai_embeddings = llm.embed(
    ids=ids,
    texts=texts,
    size=1536,
    db=None,
    verbose=False,
    model="text-embedding-3-large"
)

cat_emb = np.array(openai_embeddings[0])
kitty_emb = np.array(openai_embeddings[1])
dog_emb = np.array(openai_embeddings[2])
potato_emb = np.array(openai_embeddings[3])

sim_cat_kitty = cosine_similarity(cat_emb, kitty_emb)
sim_dog_potato = cosine_similarity(dog_emb, potato_emb)

print(f"OpenAI Similarity (cat, kitty): {sim_cat_kitty:.4f}")
print(f"OpenAI Similarity (dog, potato): {sim_dog_potato:.4f}")

if sim_cat_kitty > sim_dog_potato:
    print("✅ OpenAI Embedding Test Passed: cat-kitty similarity > dog-potato similarity")
else:
    print("❌ OpenAI Embedding Test Failed")

# Gemini Embeddings
print("Getting Gemini Embeddings...")
_, gemini_embeddings = llm.embed(
    ids=ids,
    texts=texts,
    size=768, 
    db=None,
    name="test_gemini_emb",
    verbose=False,
    model="text-embedding-004"
)

cat_emb = np.array(gemini_embeddings[0])
kitty_emb = np.array(gemini_embeddings[1])
dog_emb = np.array(gemini_embeddings[2])
potato_emb = np.array(gemini_embeddings[3])

sim_cat_kitty = cosine_similarity(cat_emb, kitty_emb)
sim_dog_potato = cosine_similarity(dog_emb, potato_emb)

print(f"Gemini Similarity (cat, kitty): {sim_cat_kitty:.4f}")
print(f"Gemini Similarity (dog, potato): {sim_dog_potato:.4f}")

if sim_cat_kitty > sim_dog_potato:
    print("✅ Gemini Embedding Test Passed: cat-kitty similarity > dog-potato similarity")
else:
    print("❌ Gemini Embedding Test Failed")

Getting OpenAI Embeddings...
OpenAI Similarity (cat, kitty): 0.6139
OpenAI Similarity (dog, potato): 0.3897
✅ OpenAI Embedding Test Passed: cat-kitty similarity > dog-potato similarity
Getting Gemini Embeddings...
Gemini Similarity (cat, kitty): 0.9019
Gemini Similarity (dog, potato): 0.8507
✅ Gemini Embedding Test Passed: cat-kitty similarity > dog-potato similarity
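
The pairwise checks above compare two vectors at a time. When there are more texts, all pairs can be computed at once by normalizing the rows and taking a matrix product. The sketch below uses toy 2-D vectors rather than real embeddings, and `cosine_similarity_matrix` is a hypothetical helper, not part of the package.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Return the matrix of cosine similarities between all rows."""
    vectors = np.asarray(embeddings, dtype=float)
    # Scale each row to unit length so the dot product equals the cosine.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / norms
    return unit @ unit.T


# Toy vectors standing in for embeddings returned by llm.embed.
toy = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
sim = cosine_similarity_matrix(toy)
print(np.round(sim, 4))
```

Entry `sim[i, j]` matches `cosine_similarity(v_i, v_j)` from the cell above, assuming no embedding is the zero vector.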
