
# 🧪 Starter Notebook — `utils.py` Quick Demo

This notebook helps you **verify your setup** and learn the **core patterns** for using `utils.py` in this course.

- Safe to run **as-is** (no external API calls by default).
- Clear, **uncomment-to-try** cells for text, vision, image gen/edit, and audio transcription once you have keys.
- Uses the **artifact helpers** so everything saves in predictable places under `artifacts/`.


## 1) Import `utils.py` and quick environment check

In [1]:

import sys
import os
import importlib

# Add the project's root directory to the Python path so 'utils' can be imported.
# Assumes the notebook lives one level below the root (e.g. in 'Supporting Materials/').
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))

if project_root not in sys.path:
    sys.path.insert(0, project_root)

try:
    import utils
    importlib.reload(utils)
    print("✅ utils.py imported OK")
except Exception as e:
    # Chain the original exception so the full traceback stays visible.
    raise RuntimeError(f"Could not import utils.py. Ensure it's in your project folder. Error: {e}") from e


✅ utils.py imported OK
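
The import cell above assumes the notebook sits exactly one level below the project root. If your layout differs, one option is to walk up the directory tree until a marker is found. The sketch below is only an illustration: `find_project_root` and its `markers` parameter are hypothetical names, not part of `utils`, and the default markers assume the helpers live in either a `utils.py` file or a `utils/` package.

```python
from __future__ import annotations

from pathlib import Path


def find_project_root(start: Path, markers: tuple[str, ...] = ("utils.py", "utils")) -> Path | None:
    """Return the nearest directory at or above `start` that contains any marker.

    Hypothetical helper: the marker names are an assumption about the course layout.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in markers):
            return candidate
    return None  # no marker found anywhere up the tree
```

In a notebook you could call `find_project_root(Path.cwd())` and, if it returns a path, insert `str(path)` at the front of `sys.path` before importing `utils`.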


## 2) Load `.env` and explore recommended models

In [2]:
from utils import load_environment, recommended_models_table
import os

load_environment()  # prints a warning if no .env is found

# Check if GOOGLE_API_KEY is loaded
google_api_key = os.getenv('GOOGLE_API_KEY')
if google_api_key:
    print("✅ GOOGLE_API_KEY is loaded")
    # Print first and last few characters for verification (without revealing the full key)
    print(f"API Key preview: {google_api_key[:6]}...{google_api_key[-4:]}")
else:
    print("❌ GOOGLE_API_KEY is not loaded")

_ = recommended_models_table()  # full list
print("\nFiltered examples:")
_ = recommended_models_table(task="text", min_context=100_000)
_ = recommended_models_table(task="image")
_ = recommended_models_table(task="audio")

✅ GOOGLE_API_KEY is loaded
API Key preview: AIzaSy...7xqk


| Model | Provider | Text | Vision | Image Gen | Image Edit | Audio Transcription | Context Window | Max Output Tokens |
|---|---|---|---|---|---|---|---|---|
| Qwen/Qwen-Image | huggingface | ❌ | ❌ | ✅ | ❌ | ❌ | - | - |
| Qwen/Qwen-Image-Edit | huggingface | ❌ | ❌ | ❌ | ✅ | ❌ | - | - |
| black-forest-labs/FLUX.1-Kontext-dev | huggingface | ❌ | ❌ | ❌ | ✅ | ❌ | - | - |
| claude-opus-4-1-20250805 | anthropic | ✅ | ✅ | ❌ | ❌ | ❌ | 200,000 | 100,000 |
| claude-opus-4-20250514 | anthropic | ✅ | ✅ | ❌ | ❌ | ❌ | 200,000 | 100,000 |
| claude-sonnet-4-20250514 | anthropic | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 100,000 |
| dall-e-3 | openai | ❌ | ❌ | ✅ | ❌ | ❌ | - | - |
| deepseek-ai/DeepSeek-V3.1 | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 128,000 | 100,000 |
| gemini-1.5-flash | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 8,192 |
| gemini-1.5-pro | google | ✅ | ✅ | ❌ | ❌ | ❌ | 2,000,000 | 8,192 |
| gemini-2.0-flash-exp | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,048,576 | 8,192 |
| gemini-2.0-flash-preview-image-generation | google | ❌ | ❌ | ✅ | ❌ | ❌ | 32,000 | 8,192 |
| gemini-2.5-flash | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,048,576 | 65,536 |
| gemini-2.5-flash-image-preview | google | ❌ | ❌ | ✅ | ❌ | ❌ | 32,768 | 32,768 |
| gemini-2.5-flash-lite | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,048,576 | 65,536 |
| gemini-2.5-pro | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,048,576 | 65,536 |
| gemini-live-2.5-flash-preview | google | ❌ | ❌ | ❌ | ❌ | ❌ | 1,048,576 | 8,192 |
| gpt-4.1 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 32,768 |
| gpt-4.1-mini | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 32,000 |
| gpt-4.1-nano | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 32,000 |
| gpt-4o | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 128,000 | 16,384 |
| gpt-4o-mini | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 128,000 | 16,384 |
| gpt-5-2025-08-07 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 400,000 | 128,000 |
| gpt-5-mini-2025-08-07 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 400,000 | 128,000 |
| gpt-5-nano-2025-08-07 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 400,000 | 128,000 |
| meta-llama/Llama-3.3-70B-Instruct | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 8,192 | 4,096 |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 1,000,000 | 100,000 |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 10,000,000 | 100,000 |
| mistralai/Mistral-7B-Instruct-v0.3 | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 32,768 | 8,192 |
| o3 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 200,000 | 100,000 |
| o4-mini | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 200,000 | 100,000 |
| stabilityai/stable-diffusion-3.5-large | huggingface | ❌ | ❌ | ✅ | ❌ | ❌ | - | - |
| tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5 | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 4,096 | 1,024 |
| veo-3.0-fast-generate-preview | google | ❌ | ❌ | ❌ | ❌ | ❌ | 1,024 | - |
| veo-3.0-generate-preview | google | ❌ | ❌ | ❌ | ❌ | ❌ | 1,024 | - |
| whisper-1 | openai | ❌ | ❌ | ❌ | ❌ | ✅ | - | - |


Filtered examples:


| Model | Provider | Text | Vision | Image Gen | Image Edit | Audio Transcription | Context Window | Max Output Tokens |
|---|---|---|---|---|---|---|---|---|
| deepseek-ai/DeepSeek-V3.1 | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 128,000 | 100,000 |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 1,000,000 | 100,000 |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 10,000,000 | 100,000 |

| Model | Provider | Text | Vision | Image Gen | Image Edit | Audio Transcription | Context Window | Max Output Tokens |
|---|---|---|---|---|---|---|---|---|
| Qwen/Qwen-Image | huggingface | ❌ | ❌ | ✅ | ❌ | ❌ | - | - |
| dall-e-3 | openai | ❌ | ❌ | ✅ | ❌ | ❌ | - | - |
| gemini-2.0-flash-preview-image-generation | google | ❌ | ❌ | ✅ | ❌ | ❌ | 32,000 | 8,192 |
| gemini-2.5-flash-image-preview | google | ❌ | ❌ | ✅ | ❌ | ❌ | 32,768 | 32,768 |
| stabilityai/stable-diffusion-3.5-large | huggingface | ❌ | ❌ | ✅ | ❌ | ❌ | - | - |

| Model | Provider | Text | Vision | Image Gen | Image Edit | Audio Transcription | Context Window | Max Output Tokens |
|---|---|---|---|---|---|---|---|---|
| whisper-1 | openai | ❌ | ❌ | ❌ | ❌ | ✅ | - | - |
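
The key preview printed at the top of this cell slices the first six and last four characters of the key. For a very short value those slices overlap and can reveal the whole secret. A more conservative preview might look like the sketch below; `mask_secret` is a hypothetical helper, not something `utils` provides.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Return a preview that reveals at most `visible` trailing characters.

    Hypothetical helper for illustration only.
    """
    if not value:
        return "<not set>"
    if len(value) <= visible * 2:
        # Too short to reveal anything safely: mask every character.
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

Calling `mask_secret(os.getenv("GOOGLE_API_KEY", ""))` confirms that a key is loaded while showing only its last few characters.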

In [3]:
help(recommended_models_table)

Help on function recommended_models_table in module utils.models:

recommended_models_table(task: 'str | None' = None, provider: 'str | None' = None, text_generation: 'bool | None' = None, vision: 'bool | None' = None, image_generation: 'bool | None' = None, audio_transcription: 'bool | None' = None, min_context: 'int | None' = None, min_output_tokens: 'int | None' = None, image_modification: 'bool | None' = None) -> 'str'
    Return a markdown table of recommended models filtered by capabilities.
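
The signature above exposes capability flags (`vision`, `provider`, `min_context`, and so on) that narrow the table. To make the filtering idea concrete without depending on `utils`, here is a self-contained sketch over a few rows copied from the table. `ModelInfo` and `filter_models` are hypothetical names, and the logic is an assumption about how such filters could combine; it does not claim to reproduce the exact semantics of `recommended_models_table` (for example, how `task="text"` is interpreted).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    provider: str
    text: bool
    vision: bool
    context_window: int | None  # None where the table shows "-"


# A few rows taken from the table above.
CATALOG = [
    ModelInfo("gpt-4o", "openai", True, True, 128_000),
    ModelInfo("gemini-2.5-pro", "google", True, True, 1_048_576),
    ModelInfo("mistralai/Mistral-7B-Instruct-v0.3", "huggingface", True, False, 32_768),
    ModelInfo("whisper-1", "openai", False, False, None),
]


def filter_models(catalog, *, provider=None, vision=None, min_context=None):
    """Return names of models matching every filter that is not None."""
    selected = []
    for model in catalog:
        if provider is not None and model.provider != provider:
            continue
        if vision is not None and model.vision != vision:
            continue
        if min_context is not None and (model.context_window or 0) < min_context:
            continue
        selected.append(model.name)
    return selected


print(filter_models(CATALOG, vision=True, min_context=200_000))
```

Each filter left as `None` is ignored, which mirrors the all-optional parameters in the signature.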



## 3) Configure a client (no API call yet)

In [4]:

from utils import setup_llm_client
MODEL = "deepseek-ai/DeepSeek-V3.1"  # change when running locally (e.g., "gemini-2.5-pro", "claude-opus-4-1-20250805")
client, model_name, provider = setup_llm_client(MODEL)
print("Provider:", provider, "| Model:", model_name)

2025-09-22 14:14:42,995 ag_aisoftdev.utils INFO LLM Client configured provider=huggingface model=deepseek-ai/DeepSeek-V3.1 latency_ms=None artifacts_path=None


Provider: huggingface | Model: deepseek-ai/DeepSeek-V3.1


## 4) Artifact helpers (always offline-safe)

In [5]:

from utils import save_artifact, load_artifact
save_artifact("# Demo artifact\nThis was written by the starter notebook.", "artifacts/notes/starter_demo.md", overwrite=True)
print("Preview:", (load_artifact("artifacts/notes/starter_demo.md") or "")[:120])


Preview: # Demo artifact
This was written by the starter notebook.
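
The cell above relies on two behaviours: saving creates the target folder if needed and honours an `overwrite` flag, and loading may return nothing (hence the `or ""`). A minimal sketch of helpers with that shape follows. The names `write_text_artifact` and `read_text_artifact` are hypothetical, and the behaviour shown is an assumption about what such helpers typically do, not the actual implementation of `save_artifact` or `load_artifact`.

```python
from __future__ import annotations

from pathlib import Path


def write_text_artifact(text: str, path: str | Path, *, overwrite: bool = False) -> Path:
    """Write `text` to `path`, creating parent folders as needed.

    Hypothetical helper: refuses to replace an existing file unless overwrite=True.
    """
    target = Path(path)
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists; pass overwrite=True to replace it")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


def read_text_artifact(path: str | Path) -> str | None:
    """Return the file's text, or None when the file does not exist."""
    target = Path(path)
    return target.read_text(encoding="utf-8") if target.exists() else None
```

Returning `None` for a missing file, rather than raising, is what makes the `(... or "")[:120]` preview pattern safe.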


## 5) (Optional) PlantUML — requires internet; leave commented if offline

In [6]:
from utils import render_plantuml_diagram
puml = """
@startuml
actor User
User -> API: POST /employees
API -> DB: insert record
@enduml
"""
render_plantuml_diagram(puml, "artifacts/diagrams/quickcheck.png")


2025-09-22 14:14:47,126 ag_aisoftdev.utils INFO PlantUML diagram rendered. provider=None model=None latency_ms=None artifacts_path=/Users/agaleana/repos/AG-AISOFTDEV/artifacts/diagrams/quickcheck.png


PosixPath('/Users/agaleana/repos/AG-AISOFTDEV/artifacts/diagrams/quickcheck.png')

## 6) Text completion — uncomment when keys are set

In [7]:

from utils import get_completion
print(get_completion("Give me 3 bullet tips for clean FastAPI code.", 
                     client, model_name, provider, temperature=0.3))

Of course. Here are 3 essential bullet tips for writing clean, maintainable FastAPI code:

*   **Structure Your Project with Routers and Separate Modules**
    Don't put all your endpoints in a single `main.py` file. Break your application into logical components using **APIRouters**.
    *   **Example Structure:**
        ```
        app/
        ├── main.py          # Import and include all routers
        ├── api/
        │   ├── __init__.py
        │   ├── routers/
        │   │   ├── __init__.py
        │   │   ├── items.py  # Contains router for /items/*
        │   │   └── users.py  # Contains router for /users/*
        │   └── dependencies.py  # Reusable dependencies (e.g., get_db)
        ├── models/          # SQLAlchemy or Pydantic models
        ├── schemas/         # Pydantic schemas (request/response models)
        └── core/            # Config, security, etc.
        ```
    *   **Why it's clean:** This promotes separation of concerns, makes code easier to navigate, an

## 7) Image generation

In [None]:
# Try to use a Google Image model
# If you get a timeout, it might be because the model is taking too long to respond.
# You can increase the timeout by setting the UTILS_TIMEOUT_READ environment variable
# in your .env file. For example:
# UTILS_TIMEOUT_READ=300

from utils import setup_llm_client, get_image_generation_completion
from IPython.display import Image, display

IMAGE_MODEL = "gemini-2.0-flash-preview-image-generation"

# Get the provider and client for the selected model
image_client, model_name, image_provider = setup_llm_client(IMAGE_MODEL)

# Define the prompt for the image generation
prompt = "A photorealistic image of a cat wearing a superhero cape"

# Generate the image
try:
    print(f"Attempting to generate an image with {image_provider}:{model_name}...")
    img_bytes, mime = get_image_generation_completion(
        image_client, prompt, model_name, image_provider
    )

    # Display the generated image
    if img_bytes:
        print(f"Successfully generated a {mime} image.")
        display(Image(data=img_bytes))
    else:
        print("Image generation failed to return image data.")

except Exception as e:
    print(f"An error occurred during image generation: {e}")
    # If you get a 404 error, it might be because the model is not available
    # in your region or for your API key. Try a different model.


## 8) Vision (image + text)

In [10]:

from utils import setup_llm_client, get_vision_completion_compat
# Path to an existing local image; here, the PlantUML diagram rendered in section 5.
url = "artifacts/diagrams/quickcheck.png"
VISION_MODEL = "gpt-4o"
vision_client, vision_model_name, vision_provider = setup_llm_client(VISION_MODEL)



result, err = get_vision_completion_compat(
    "explain this image", url, vision_client, vision_model_name, vision_provider
)
if err:
    print(
        f"Vision not available for provider/model '{vision_provider}/{vision_model_name}' — {err}\n"
        "Tip: Choose a vision-capable model in cell 3, e.g. 'gpt-4o' (OpenAI) or 'gemini-2.5-pro' (Google)."
    )
else:
    print(result)


2025-09-22 14:16:30,259 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=gpt-4o latency_ms=None artifacts_path=None


Vision not available for provider/model 'openai/gpt-4o' — [openai:] vision error: Not implemented in this environment
Tip: Choose a vision-capable model in cell 3, e.g. 'gpt-4o' (OpenAI) or 'gemini-2.5-pro' (Google).


## 9) Image editing — uncomment for supported models

In [None]:

from utils import setup_llm_client, get_image_edit_completion
from IPython.display import Image, display

IMAGE_EDIT_MODEL = "Qwen/Qwen-Image-Edit" 
image_edit_client, image_edit_model_name, image_edit_provider = setup_llm_client(IMAGE_EDIT_MODEL)
print("Provider:", image_edit_provider, "| Model:", image_edit_model_name)


edited_path, data_url_or_msg = get_image_edit_completion(
    "Add a monkey riding the rocket.",
    "artifacts/screens/image_1758557108.png", # <-- change the path to an existing image like the one generated above
    image_edit_client, image_edit_model_name, image_edit_provider
)

if edited_path and data_url_or_msg.startswith("data:image"):
    # The image is automatically saved by the helper function, so we just display it.
    display(Image(url=data_url_or_msg))
else:
    # If there was an error, print it
    print(f"An error occurred: {data_url_or_msg}")
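
The cell above checks whether the second return value starts with `data:image`, which suggests a data URL. If you want the raw bytes (for example, to save a copy elsewhere), a base64 data URL can be decoded with the standard library. This is a sketch under the assumption that the value is a base64-encoded data URL; `decode_data_url` is a hypothetical helper.

```python
from __future__ import annotations

import base64


def decode_data_url(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (mime_type, raw_bytes).

    Hypothetical helper: only handles the 'data:<mime>;base64,<payload>' form.
    """
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

The decoded bytes can then be passed to `Image(data=...)` or written to a file in binary mode.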


## 10) Audio transcription — uncomment for Whisper/Google STT

In [None]:

from utils import setup_llm_client, transcribe_audio
TRANSCRIPTION_MODEL = "whisper-1" 
transcription_client, transcription_model_name, transcription_provider = setup_llm_client(TRANSCRIPTION_MODEL)
print("Provider:", transcription_provider, "| Model:", transcription_model_name)


transcribe_audio("artifacts/audio/sample.wav", transcription_client, transcription_model_name, transcription_provider, language_code="en-US")

## 11) Clean model output (strip code fences) — always safe

In [None]:

from utils import clean_llm_output
raw = """```python
print("hello")
```"""
print("Cleaned:", clean_llm_output(raw, language="python"))
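
To show what "stripping code fences" means in practice, here is a small standalone sketch using a regular expression. It is not the implementation of `clean_llm_output`; `strip_code_fences` is a hypothetical name, and the sketch only handles a reply that consists of one fenced block.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed

# Opening fence with optional language tag, the body, then the closing fence.
_FENCED = re.compile(r"^" + FENCE + r"[^\n]*\n(.*?)\n?" + FENCE + r"\s*$", re.DOTALL)


def strip_code_fences(text: str) -> str:
    """Return the body of a single fenced block, or the trimmed text if unfenced."""
    stripped = text.strip()
    match = _FENCED.match(stripped)
    return match.group(1) if match else stripped


raw = FENCE + "python\nprint(\"hello\")\n" + FENCE
print(strip_code_fences(raw))
```

Text without fences passes through unchanged apart from trimmed whitespace, so the function is safe to apply to every model reply.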

In [None]:
import os
from google import genai
from google.genai import types

# Read the key from the environment; the literal string "GOOGLE_API_KEY" is not a valid key.
genai_client = genai.Client(api_key=os.getenv("GOOGLE_API_KEY"))
response = genai_client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents="Show me how to bake a macaron with images.",
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"]
    ),
)

In [None]:
import os
from IPython.display import Image, display

# Display inline images from the `response` object (GenerateContentResponse)
parts = response.candidates[0].content.parts
images_shown = 0

for idx, part in enumerate(parts):
    blob = getattr(part, "inline_data", None)
    if not blob or not getattr(blob, "data", None):
        continue

    img_bytes = blob.data  # raw bytes
    mime = getattr(blob, "mime_type", None) or "image/png"
    fmt = mime.split("/")[-1] if "/" in mime else mime

    try:
        # Prefer displaying directly from bytes
        display(Image(data=img_bytes, format=fmt))
        images_shown += 1
    except Exception:
        # Fallback: save to disk and display from file
        out_path = f"artifacts/screens/response_image_{idx}.{fmt}"
        os.makedirs(os.path.dirname(out_path), exist_ok=True)
        with open(out_path, "wb") as f:
            f.write(img_bytes)
        display(Image(filename=out_path))
        images_shown += 1

if images_shown == 0:
    print("No inline images found in the response.")

In [None]:
from google import genai
from google.genai import types
from IPython.display import Image, display
import os
import time


def show_inline_images(response):
    """Display inline images from a GenerateContentResponse and return how many were shown."""
    images_shown = 0
    for idx, part in enumerate(response.candidates[0].content.parts):
        blob = getattr(part, "inline_data", None)
        if not blob or not getattr(blob, "data", None):
            continue

        img_bytes = blob.data  # raw bytes
        mime = getattr(blob, "mime_type", None) or "image/png"
        fmt = mime.split("/")[-1] if "/" in mime else mime

        try:
            # Prefer displaying directly from bytes
            display(Image(data=img_bytes, format=fmt))
        except Exception:
            # Fallback: save to disk and display from file
            out_path = f"artifacts/screens/response_image_{idx}.{fmt}"
            os.makedirs(os.path.dirname(out_path), exist_ok=True)
            with open(out_path, "wb") as f:
                f.write(img_bytes)
            display(Image(filename=out_path))
        images_shown += 1
    return images_shown


def generate_and_show(genai_client, candidate_model, prompt):
    """Call one model and display any images it returns. API errors propagate to the caller."""
    response = genai_client.models.generate_content(
        model=candidate_model,
        contents=prompt,
        config=types.GenerateContentConfig(
            response_modalities=["TEXT", "IMAGE"],
            # HttpOptions.timeout is in milliseconds: 300_000 ms = 5 minutes
            http_options=types.HttpOptions(timeout=300_000),
        ),
    )
    print(f"Success with model {candidate_model}!")
    images_shown = show_inline_images(response)
    if images_shown == 0:
        print("No inline images found in the response.")
    else:
        print(f"Successfully displayed {images_shown} images.")


# Test a direct Google API call: try several models, with one retry and a longer timeout
api_key = os.getenv('GOOGLE_API_KEY')
if not api_key:
    print("API key not found")
else:
    print("API key found, making direct call...")
    genai_client = genai.Client(api_key=api_key)

    # Try multiple models in order; stop at the first one that works
    models_to_try = [
        "gemini-2.5-flash-image-preview",
        "gemini-2.0-flash-preview-image-generation",
        "imagen-3.0-preview",
        "imagegeneration"
    ]

    prompt = "A cute baby sea otter wearing a beret and glasses, reading a book by the seashore, digital art"

    for candidate_model in models_to_try:
        try:
            print(f"Trying model: {candidate_model}")
            generate_and_show(genai_client, candidate_model, prompt)
            break  # If successful, break out of the loop
        except Exception as e:
            print(f"Failed with model {candidate_model}: {e}")
            if "404" in str(e) or "not found" in str(e).lower():
                continue  # Model unavailable: try the next one without retrying

            # For other errors, wait a bit and retry the same model once
            print("Waiting 5 seconds before retry...")
            time.sleep(5)
            try:
                print(f"Retrying model: {candidate_model}")
                generate_and_show(genai_client, candidate_model, prompt)
                break  # If successful, break out of the loop
            except Exception as retry_error:
                print(f"Retry also failed: {retry_error}")
                continue  # Try the next model
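
The cell above combines two ideas: fall back to the next model on a permanent error (such as a 404), and retry the same model after a pause on a transient one. That pattern can be isolated from any particular SDK. The sketch below is a generic illustration: `first_success` and its parameters are hypothetical names, and deciding what counts as "permanent" is left to a caller-supplied function.

```python
import time


def first_success(candidates, call, *, is_permanent, retries=1, delay_s=5.0, sleep=time.sleep):
    """Return (name, result) for the first candidate whose call succeeds.

    Hypothetical helper. Permanent errors skip straight to the next candidate;
    other errors are retried up to `retries` times with `delay_s` between attempts.
    """
    errors = {}
    for name in candidates:
        for attempt in range(retries + 1):
            try:
                return name, call(name)
            except Exception as exc:
                errors[name] = exc
                if is_permanent(exc) or attempt == retries:
                    break  # give up on this candidate
                sleep(delay_s)
    raise RuntimeError(f"all candidates failed: {errors}")
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can substitute a function that records delays instead of waiting.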