# Serverless LLMs and Agentic AI with Modal – Lesson 1

In this notebook, we will:

1. Install and verify **Modal** in Google Colab.
2. Authenticate with Modal using **API tokens**.
3. Run a simple Python function **locally**.
4. Turn that function into a **Modal app** that runs remotely.
5. Run it **once** on Modal's serverless infrastructure.
6. Fan out work in **parallel** using `.map()`.


In [None]:
# Install Modal (serverless compute platform for Python)
# This only needs to be run once per Colab runtime.

!pip install -U modal --quiet

print("✅ Modal installed (or already up-to-date).")

In [None]:
# Check that the `modal` CLI is available in this environment.

!which modal || echo "modal CLI not found in PATH"
!modal --version || echo "Could not get modal version"

print("If you see a version above, the Modal CLI is ready.")
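
The same check can be done from Python instead of the shell. The sketch below is only an illustration: `installed_version` is a hypothetical helper name, and it relies on the standard-library `importlib.metadata` module (Python 3.8+) rather than on anything Modal-specific.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report what is installed in the current runtime.
version = installed_version("modal")
print(f"modal version: {version}" if version else "modal is not installed")
```

If this prints "modal is not installed", re-run the install cell before continuing.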

## Authentication with Modal

To run code on Modal's servers, we need to authenticate using an **API token**.

Steps (do this in your own browser, not in Colab):

1. Go to: https://modal.com  
2. Log in or create an account.
3. Click on your profile → **"Tokens"** (API Tokens).
4. Create a new token and copy:
   - `MODAL_TOKEN_ID`
   - `MODAL_TOKEN_SECRET`

We will paste those values into the next cell, which exports them as environment variables and then runs:

```bash
modal token set --token-id <YOUR_TOKEN_ID> --token-secret <YOUR_TOKEN_SECRET>
```
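If you would rather not leave the token values in a cell at all, one option is to prompt for them at run time. This is a minimal sketch, not part of the lesson script: `read_modal_credentials` and `export_modal_credentials` are hypothetical helper names, and the prompt relies on the standard-library `getpass` module so that the values are not echoed into the notebook output.

```python
import getpass
import os
from typing import Callable, Tuple


def read_modal_credentials(
    prompt: Callable[[str], str] = getpass.getpass,
) -> Tuple[str, str]:
    """Ask for the token ID and secret without echoing them to the screen."""
    token_id = prompt("Modal token ID: ").strip()
    token_secret = prompt("Modal token secret: ").strip()
    if not token_id or not token_secret:
        raise ValueError("Both the token ID and the token secret are required.")
    return token_id, token_secret


def export_modal_credentials(token_id: str, token_secret: str) -> None:
    """Store the values under the environment variable names used in this notebook."""
    os.environ["MODAL_TOKEN_ID"] = token_id
    os.environ["MODAL_TOKEN_SECRET"] = token_secret
```

In a cell, `export_modal_credentials(*read_modal_credentials())` could stand in for the hard-coded assignments, so the secret never appears in the saved notebook.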

> ⚠️ **Security warning**  
> - Never share your token ID or secret.
> - Never commit them to GitHub.
> - If this notebook is shared, **clear or delete** the cell with your tokens.


In [None]:
import os

# ⛔ STUDENT ACTION REQUIRED:
# Replace the placeholder strings below with your actual token values.
#
# Example (Modal token IDs start with "ak-", secrets with "as-"):
# TOKEN_ID = "ak-123..."
# TOKEN_SECRET = "as-456..."
#
# DO NOT share these values or commit them to version control.

TOKEN_ID = "PASTE_YOUR_MODAL_TOKEN_ID_HERE"
TOKEN_SECRET = "PASTE_YOUR_MODAL_TOKEN_SECRET_HERE"


if "PASTE_YOUR_MODAL_TOKEN_ID_HERE" in TOKEN_ID or "PASTE_YOUR_MODAL_TOKEN_SECRET_HERE" in TOKEN_SECRET:
    raise ValueError("❌ Please set TOKEN_ID and TOKEN_SECRET before running this cell.")

# Export as environment variables. The `modal token set` command below
# picks them up through the shell's "$MODAL_TOKEN_ID" / "$MODAL_TOKEN_SECRET" expansion.
os.environ["MODAL_TOKEN_ID"] = TOKEN_ID
os.environ["MODAL_TOKEN_SECRET"] = TOKEN_SECRET



# Use the Modal CLI to store these credentials on this machine.
# In Colab, this means "for this runtime".
!modal token set --token-id "$MODAL_TOKEN_ID" --token-secret "$MODAL_TOKEN_SECRET"

print("✅ Modal token configured for this Colab runtime.")

In [None]:
# You can inspect profiles:

#!modal token -h

!modal profile list


## Creating the Modal App Script

Next, we will create a Python file `lesson1_modal_intro.py`
containing our **full lesson**:

- A local function `heavy_math_local` that:
  - waits for a random delay (simulating heavy work),
  - returns `x`, `x²`, `x³`, and the delay.
- A Modal remote function `heavy_math_remote` that wraps that logic.
- Step-by-step helper functions:
  - `step_2_run_local_function()`
  - `step_3_run_remote_once()`
  - `step_4_run_remote_in_parallel()`
  - `step_5_print_inspection_instructions()`
- A `main()` function marked with `@app.local_entrypoint()` that
  runs all the steps when we do:

```bash
modal run lesson1_modal_intro.py
```

We’ll write this entire script to disk using a `%%writefile` cell,
so it behaves like a normal `.py` file on your machine.


In [None]:
%%writefile lesson1_modal_intro.py
"""
Serverless LLMs with Modal – Lesson 1
----------------------------------------------------------------

This file is designed to be run with:

    modal run lesson1_modal_intro.py

It demonstrates:

1. A local Python function that simulates heavy work.
2. Wrapping that function as a Modal remote function.
3. Calling the function once remotely.
4. Fanning out many calls in parallel with `.map()`.
5. How to inspect runs and containers in the Modal dashboard.
"""

import random
import time
from typing import Dict, List

import modal

# --------------------------------------------------------
#  Modal App Declaration
# --------------------------------------------------------
# Think of `app` as the "project" or "service" in Modal.
# All of the functions / entrypoints in this file belong to
# this `serverless-llm-course-lesson1` app.

app = modal.App(name="serverless-llm-course-lesson1")


# --------------------------------------------------------
#  STEP 2 – Local-only Python function (no Modal yet)
# --------------------------------------------------------
# This is just normal Python. We’ll call it locally and also
# reuse the logic inside the Modal function.

def heavy_math_local(x: int) -> Dict:
    """
    Pretend this is a heavy computation.

    What this function does:
    - Sleeps for a random duration (0.5–1.5 seconds) to simulate work.
    - Computes x^2 and x^3.
    - Returns a small dictionary with metadata.

    Parameters
    ----------
    x : int
        The input integer.

    Returns
    -------
    Dict
        A dictionary with:
        - "input": original x
        - "squared": x^2
        - "cubed": x^3
        - "delay_seconds": how long we slept (rounded)
    """
    # Random delay between 0.5 and 1.5 seconds
    delay = random.uniform(0.5, 1.5)
    time.sleep(delay)

    squared = x * x
    cubed = x * x * x

    return {
        "input": x,
        "squared": squared,
        "cubed": cubed,
        "delay_seconds": round(delay, 2),
    }


def step_2_run_local_function() -> None:
    """
    STEP 2: Show that the plain Python function works locally.

    This is here mostly for education and debugging:
    - If this part does not work, nothing else will.
    - This is also a nice contrast with the remote execution later.
    """
    print("\n==============================")
    print("STEP 2: Local Python execution")
    print("==============================")

    test_values = [2, 3, 5]
    for value in test_values:
        info = heavy_math_local(value)
        print(f"Local result for x={value}: {info}")


# --------------------------------------------------------
#  STEP 3 – Wrap the function as a Modal remote function
# --------------------------------------------------------
# `@app.function()` tells Modal:
#   "This function can run inside a container in the cloud."

@app.function()
def heavy_math_remote(x: int) -> Dict:
    """
    This function will run inside a Modal container
    when you call `heavy_math_remote.remote(...)` or `.map(...)`.

    Note:
    - The logic is the same as heavy_math_local.
    - The difference is WHERE it runs: in Modal's infrastructure.
    """
    return heavy_math_local(x)


def step_3_run_remote_once() -> None:
    """
    STEP 3: Call the Modal function once, remotely.

    We use:
        heavy_math_remote.remote(x)

    This:
    - Triggers Modal to start a container (if needed).
    - Executes the function *in the cloud*.
    - Returns the result back to your local process.
    """
    print("\n======================================")
    print("STEP 3: Single remote execution (Modal)")
    print("======================================")

    x = 4
    result = heavy_math_remote.remote(x)
    print(f"Remote result for x={x}: {result}")


# --------------------------------------------------------
#  STEP 4 – Fan out many remote calls using .map()
# --------------------------------------------------------
# `.map()` is where Modal really starts to feel "serverless":
#  - You give it many inputs.
#  - Modal spreads the work across multiple containers.
#  - You just get back a Python iterator of results.

def step_4_run_remote_in_parallel() -> None:
    """
    STEP 4: Run many calls in parallel using `.map()`.

    Here we:
    - Create a list of inputs (1..10).
    - Call heavy_math_remote.map(inputs)
    - Collect the results into a list.
    """
    print("\n========================================")
    print("STEP 4: Parallel execution with .map()")
    print("========================================")

    inputs = list(range(1, 11))  # numbers 1..10
    print(f"Submitting {len(inputs)} jobs to Modal:", inputs)

    # NOTE:
    #   heavy_math_remote.map(...) returns an iterator.
    #   We wrap it in list(...) so we force its evaluation
    #   and can print everything nicely.
    results: List[Dict] = list(heavy_math_remote.map(inputs))

    print("\nResults from Modal (one per input):")
    for r in results:
        print(r)

    # Bonus: show how easy it is to aggregate the results.
    total_squared = sum(r["squared"] for r in results)
    total_cubed = sum(r["cubed"] for r in results)

    print("\nAggregates:")
    print("  Sum of all squared values:", total_squared)
    print("  Sum of all cubed values  :", total_cubed)


# --------------------------------------------------------
#  STEP 5 – Instructions for Dashboard & Containers
# --------------------------------------------------------
# This step is "documentation" that prints to the console.
# The idea is: while your app is running, you can go to
# the Modal dashboard and also use the CLI to list containers.

def step_5_print_inspection_instructions() -> None:
    """
    STEP 5: Print next steps for inspection & debugging.

    This doesn't talk to Modal directly; it just reminds you:
    - Where to look in the dashboard.
    - Which CLI commands are useful.
    """
    print("\n==============================================")
    print("STEP 5: Inspecting runs, logs and containers")
    print("==============================================")

    print(
        """
Things you can do now:

1) Open the Modal dashboard in your browser:
   - URL: https://modal.com/apps
   - Look for the app named: serverless-llm-course-lesson1
   - Click on it and explore:
       - Runs
       - Logs
       - Containers
       - CPU / memory usage over time

2) From your terminal, try:
   - List apps:
        modal app list

   - List currently running containers:
        modal container list

   - Jump into a specific container (replace <CONTAINER_ID>):
        modal shell <CONTAINER_ID>

   From inside the container shell, you can:
       - Run `ls` to see files.
       - Run `python` if you need to poke around.
       - Exit with Ctrl+D or `exit`.

Remember:
- Modal spins containers up when needed.
- It shuts them down when work is finished.
- You only pay for the compute time you actually use.
        """
    )


# --------------------------------------------------------
#  Entry Point – This is what `modal run` will execute
# --------------------------------------------------------
# The decorator @app.local_entrypoint() tells Modal:
#   "When I run `modal run lesson1_modal_intro.py`,
#    execute this function."

@app.local_entrypoint()
def main():
    """
    Main entrypoint for Lesson 1.

    This is the part that runs when you call:

        modal run lesson1_modal_intro.py

    Step-by-step usage:
    -------------------
    - During teaching/learning, you can selectively enable/disable steps.
    - Just comment or uncomment the calls below.
    - Example: first only run the local function, then add remote, etc.
    """
    print("\n========================")
    print("Serverless LLMs – Lesson 1")
    print("========================")

    # --- STEP-BY-STEP CONTROL ------------------------------------
    # All steps are enabled by default. Comment out the calls you
    # want to skip for the current demo, then run again with
    # `modal run lesson1_modal_intro.py`.

    # 1) Show the local Python behavior (no Modal involved)
    step_2_run_local_function()

    # 2) Show a single remote execution on Modal
    step_3_run_remote_once()

    # 3) Show parallel execution on Modal with .map()
    step_4_run_remote_in_parallel()

    # 4) Print instructions for checking the dashboard & CLI
    step_5_print_inspection_instructions()

    print("\nLesson 1 complete ✅")
    print("You can now modify this file or move on to LLM-powered examples.")

In [None]:
# This will:
# - Upload the code in lesson1_modal_intro.py to Modal.
# - Run the `main()` local entrypoint in this Colab runtime; only the
#   `.remote()` and `.map()` calls inside it execute in Modal containers.
#
# You should see:
# - STEP 2: Local Python execution
# - STEP 3: Single remote execution (Modal)
# - STEP 4: Parallel execution with .map()
# - STEP 5: Inspecting runs, logs and containers

!modal run lesson1_modal_intro.py
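
To confirm that the STEP 4 aggregates are correct, you can compare them with the closed-form sums for the inputs 1..10. The sketch below is a local cross-check that does not involve Modal; `sum_of_squares` and `sum_of_cubes` are hypothetical helper names introduced only for this illustration.

```python
def sum_of_squares(n: int) -> int:
    """1^2 + 2^2 + ... + n^2 = n(n + 1)(2n + 1) / 6."""
    return n * (n + 1) * (2 * n + 1) // 6


def sum_of_cubes(n: int) -> int:
    """1^3 + 2^3 + ... + n^3 = (n(n + 1) / 2)^2."""
    return (n * (n + 1) // 2) ** 2


n = 10  # the lesson script submits the inputs 1..10
expected_squared = sum_of_squares(n)
expected_cubed = sum_of_cubes(n)

# Cross-check the formulas against a brute-force loop.
assert expected_squared == sum(x * x for x in range(1, n + 1))
assert expected_cubed == sum(x ** 3 for x in range(1, n + 1))

print("Expected sum of squared values:", expected_squared)  # 385
print("Expected sum of cubed values  :", expected_cubed)    # 3025
```

If the run above reports different totals, one of the inputs was probably dropped or processed twice.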

In [None]:
!modal app list

In [None]:
!modal container list

## Reflection / Discussion Questions

1. What is the difference between:
   - `heavy_math_local(x)`
   - `heavy_math_remote.remote(x)` ?

2. In your own words, what does `.map()` do in this line?

   ```python
   results = list(heavy_math_remote.map(inputs))
   ```

3. How could you adapt this example to:
   - Call an external API per input?
   - Process a batch of images?
   - Run an LLM or embedding model?

4. Open the Modal dashboard:
   - Which app name do you see?
   - Can you find the logs for the most recent run?
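
## Optional: a local analogy for `.map()`

As a starting point for question 2, the sketch below imitates the fan-out pattern on your own machine. It is only an analogy and says nothing about how Modal implements `.map()`: a standard-library thread pool stands in for Modal's containers, `simulated_work`, `run_sequential`, and `run_fanned_out` are hypothetical names, and the delays are shortened to 0.05–0.15 seconds so that the comparison finishes quickly.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def simulated_work(x: int) -> Dict:
    """Sleep briefly to mimic heavy work, then return x, x^2 and x^3."""
    time.sleep(random.uniform(0.05, 0.15))
    return {"input": x, "squared": x * x, "cubed": x ** 3}


def run_sequential(inputs: List[int]) -> List[Dict]:
    """Process the inputs one after another, like a plain for-loop."""
    return [simulated_work(x) for x in inputs]


def run_fanned_out(inputs: List[int], workers: int) -> List[Dict]:
    """Process the inputs concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulated_work, inputs))


inputs = list(range(1, 11))

start = time.perf_counter()
seq_results = run_sequential(inputs)
seq_seconds = time.perf_counter() - start

start = time.perf_counter()
par_results = run_fanned_out(inputs, workers=len(inputs))
par_seconds = time.perf_counter() - start

print(f"Sequential: {seq_seconds:.2f}s, fanned out: {par_seconds:.2f}s")
print("Same results:", seq_results == par_results)
```

The sequential run takes roughly the sum of all delays, while the fanned-out run takes roughly the longest single delay. That difference is the effect to look for when comparing STEP 3 and STEP 4 in the lesson script.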
