# Lesson: Building Shell-Enabled Agents with `ShellToolMiddleware`

This notebook shows how to configure shell access for LangChain agents in three practical ways: basic host execution, a restricted/secure setup, and a maintenance-focused shell profile.

> **Teaching goal:** understand not just _what_ to type, but _why_ each middleware option exists and when to use it.


## Key Concept: What is Middleware?

Before we dive into the code, it's important to understand the **Middleware pattern**.

In LangChain, Shell Middleware is **not just a tool**. It is a layer that wraps the agent's execution. Instead of simply giving the agent a "shell tool," the middleware handles:

- **Sandbox Setup**: Initializing the environment (directories, env vars).
- **Security**: Redacting sensitive output before it reaches the LLM.
- **Cleanup**: Ensuring processes are closed and temp files are deleted.
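
The three responsibilities above can be pictured with a small standard-library sketch. This is not how `ShellToolMiddleware` is implemented; `shell_session`, the `[REDACTED]` placeholder, and the `sk-` pattern are hypothetical names chosen for illustration. It only shows the shape of the pattern: set up a workspace, filter output before returning it, and clean up when the session ends.

```python
import re
import shutil
import subprocess
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def shell_session(redact_pattern):
    """Hypothetical stand-in for a middleware-managed shell session."""
    # Setup: create an isolated working directory for this session.
    workspace = tempfile.mkdtemp(prefix="agent_ws_")

    def run(argv):
        completed = subprocess.run(
            argv, cwd=workspace, capture_output=True, text=True, check=False
        )
        # Security: mask matching text before the caller (the LLM) sees it.
        return re.sub(redact_pattern, "[REDACTED]", completed.stdout)

    try:
        yield run, workspace
    finally:
        # Cleanup: remove the workspace even if a command raised an error.
        shutil.rmtree(workspace, ignore_errors=True)


with shell_session(r"sk-[\w-]+") as (run, workspace):
    output = run([sys.executable, "-c", "print('token sk-demo-123')"])

print(output.strip())  # token [REDACTED]
```

The caller never handles setup or teardown directly, which is the point of wrapping the tool in a layer rather than exposing a bare "run command" function.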

> **Discussion Point:** Ask your student: "Why might we want to hide certain output from the LLM using redaction, rather than just hiding it from the human user?" (Answer: Anything the LLM reads becomes part of its context. It can repeat a secret in a later response, pass it to another tool call, or send it to a remote model provider, even if the human user never sees the raw output.)


## Learning Goals

By the end of this lesson, you should be able to:

- Explain the role of `ShellToolMiddleware` in an agent workflow.
- Configure a basic shell-enabled agent on the host machine.
- Add security controls like environment variables, startup commands, and redaction.
- Customize shell behavior (shell type, tool description, shutdown cleanup).
- Trace the end-to-end script flow from setup to invocation and output.

## Prerequisite Knowledge

Before running this notebook, you should be comfortable with:

- Basic Python syntax (imports, variables, dictionaries, function calls)
- The idea of an LLM-powered agent
- Running local models with Ollama
- Basic shell commands (`echo`, `touch`, file listing)


## Script Walkthrough (Big Picture)

The script follows this flow:

1. **Setup** imports, environment variables, and the local LLM.
2. **Example 1** creates a basic shell-enabled agent and runs a simple file task.
3. **Example 2** creates a restricted workspace with env vars + redaction rules.
4. **Example 3** creates a maintenance-oriented shell profile with cleanup commands.
5. Each example calls `agent.invoke(...)` and prints the final response.

**Why this matters:** this pattern is reusable for real systems where agents need controlled shell automation.


## Section 1 — Environment and Model Setup

This first block imports dependencies, loads `.env` values, and initializes `ChatOllama`.

### Inline Notes

- `load_dotenv()` loads environment variables from a local `.env` file.
- `os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")` provides a safe default when the variable is missing.
- `model="gpt-oss:20b"` selects the local model used by all examples.
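
To see the fallback behavior of `os.getenv` in isolation, here is a short sketch. `DEMO_BASE_URL` and the `gpu-box` host are hypothetical names used only for this demonstration, so the real `OLLAMA_BASE_URL` is left untouched.

```python
import os

DEFAULT_URL = "http://localhost:11434"

# Case 1: the variable is missing, so the default is returned.
os.environ.pop("DEMO_BASE_URL", None)
when_missing = os.getenv("DEMO_BASE_URL", DEFAULT_URL)

# Case 2: the variable is set, so its value wins over the default.
os.environ["DEMO_BASE_URL"] = "http://gpu-box:11434"
when_set = os.getenv("DEMO_BASE_URL", DEFAULT_URL)

# Case 3: the variable is set but empty. The default is NOT used.
os.environ["DEMO_BASE_URL"] = ""
when_empty = os.getenv("DEMO_BASE_URL", DEFAULT_URL)

os.environ.pop("DEMO_BASE_URL", None)  # leave the environment as we found it

print(repr(when_missing), repr(when_set), repr(when_empty))
```

The third case is worth pointing out to students: an empty entry such as `OLLAMA_BASE_URL=` in `.env` counts as "set", so the default does not apply.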

**Design decision:** the LLM is created once and reused, which keeps the notebook cleaner and avoids repeated setup.


In [10]:
# ----------------- MASTERING SHELL TOOL MIDDLEWARE - TUTORIAL CODE -----------------
# This script demonstrates 3 key configurations for ShellToolMiddleware:
# 1. Basic Host Access
# 2. Secure/Restricted Environment (Redaction & Env Vars)
# 3. Custom Maintenance Shell (Zsh & Cleanup)

import os
from dotenv import load_dotenv
from rich import print
from langchain.agents import create_agent
from langchain.messages import HumanMessage
from langchain_ollama import ChatOllama
from langchain.agents.middleware import (
    ShellToolMiddleware,
    HostExecutionPolicy,
    RedactionRule,
)

# 1. Setup Environment
# os.chdir(os.path.dirname(os.path.abspath(__file__)))
load_dotenv()

# Initialize Local LLM
llm = ChatOllama(
    model="gpt-oss:20b",
    base_url=os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"),
)

### Teacher's Tip: Environment Consistency

When teaching this, emphasize why we use `load_dotenv()`:

- It keeps API keys out of the code and in a separate `.env` file.
- It makes the script portable across machines.

**Ask the Student:** "What happens if `model="gpt-oss:20b"` isn't pulled in Ollama?" (They'll get a 'not found' error until they run `ollama pull gpt-oss:20b`).


### Common Mistakes to Avoid (Setup)

- Forgetting to run Ollama locally before invoking the agent.
- Misspelling environment variable names (for example `OLLAMA_BASE_URL`).
- Assuming shell middleware works without explicitly adding `ShellToolMiddleware`.
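
The first mistake in this list can be caught before any agent call with a small preflight check. The sketch below assumes the Ollama server exposes its list of local models at `/api/tags` and that each entry carries a `name` field; `fetch_local_models` and `preflight_message` are hypothetical helper names, not part of LangChain or Ollama.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional


def fetch_local_models(base_url: str, timeout: float = 2.0) -> Optional[List[str]]:
    """Return model names reported by the server, or None if it is unreachable."""
    url = base_url.rstrip("/") + "/api/tags"  # assumed endpoint
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        return None
    if not isinstance(payload, dict):
        return None
    entries = payload.get("models") or []
    return [str(e.get("name", "")) for e in entries if isinstance(e, dict)]


def preflight_message(models: Optional[List[str]], wanted: str) -> str:
    """Turn the model list into a readable status line."""
    if models is None:
        return "Ollama is not reachable; start it before invoking the agent."
    if wanted not in models:
        return f"Model {wanted!r} is not pulled; run: ollama pull {wanted}"
    return "ok"


print(preflight_message(fetch_local_models("http://localhost:11434"), "gpt-oss:20b"))
```

Running this at the top of a notebook turns a confusing mid-run failure into a one-line message that tells the student what to fix.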

### Quick Checkpoint

1. What is the benefit of giving `os.getenv` a default value?
2. Why is it useful to initialize `llm` once at the top?


## Section 2 — Example 1: Basic Host Shell Agent

This block creates the simplest shell-enabled agent.

### What this code does

- Builds an agent with `create_agent(...)`.
- Attaches `ShellToolMiddleware`.
- Sets `workspace_root="./"` so shell commands start in the current project folder.
- Uses `HostExecutionPolicy()` so commands execute directly on your machine.

**Why this matters:** this is the minimum viable setup for shell automation.


In [11]:
# ---------------- EXAMPLE 1: BASIC HOST SHELL AGENT -----------------
# The simplest configuration. Grants the agent access to the host system to run commands.

print("--- 1. Running Basic Host Agent ---")

basic_agent = create_agent(
    model=llm,
    tools=[],  # We can add other tools here if needed
    middleware=[
        ShellToolMiddleware(
            # workspace_root: Directory where the shell session starts
            workspace_root="./",
            # execution_policy: Controls HOW commands are run (Host vs Docker)
            execution_policy=HostExecutionPolicy(),
        ),
    ],
)

### Teacher's Guide: Understanding Workspace Selection

- **`workspace_root="./"`**: This sets the directory where the agent's shell session starts, so relative paths in its commands resolve from here. Treat it as a starting point, not a sandbox: with `HostExecutionPolicy()` a command can still use `cd ..` or absolute paths to leave it. Confinement has to come from a more restrictive execution policy.
- **`HostExecutionPolicy()`**: This is the "Open" policy. It tells LangChain to run commands directly on the host machine, with the same permissions as the Python process that started the agent.
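
A quick way to build intuition for a starting directory is to try it with plain `subprocess`. The sketch below does not use LangChain at all; it assumes only that a host-executed shell behaves like an ordinary child process, where `cwd` decides where the process starts and nothing more.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as workspace:
    # Child 1: report the directory it starts in.
    start_dir = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        cwd=workspace, capture_output=True, text=True, check=True,
    ).stdout.strip()

    # Child 2: step out of the starting directory, then report where it is.
    after_cd = subprocess.run(
        [sys.executable, "-c", "import os; os.chdir('..'); print(os.getcwd())"],
        cwd=workspace, capture_output=True, text=True, check=True,
    ).stdout.strip()

    # realpath() normalizes symlinked temp paths (common on macOS).
    starts_in_workspace = os.path.realpath(start_dir) == os.path.realpath(workspace)
    left_workspace = os.path.realpath(after_cd) == os.path.realpath(
        os.path.dirname(workspace)
    )

print("starts in workspace:", starts_in_workspace)
print("could leave workspace:", left_workspace)
```

Both checks come out true for an ordinary child process, which is why a starting directory alone should not be relied on for isolation.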

**Student Exercise:** Try changing `./` to a specific folder like `./my_testing_ground/` and see if the agent can still find the local `pyproject.toml` file.


### Section 2B — Running Example 1

The next block sends a human instruction to the agent:

- Create `hello.txt` with content.
- List files.
- Print full result and final assistant message.

### Inline Notes

- `agent.invoke({"messages": [HumanMessage(...)]})` is the standard message-based call format.
- `result["messages"][-1].content` retrieves the last model response for easier reading.
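
Printing the whole result can be noisy, so it helps to summarize the message list before reading the final answer. The sketch below uses `SimpleNamespace` stand-ins instead of real LangChain messages so it runs without a model; it assumes only that each message has `type` and `content` attributes. The sample trace is illustrative, and a real run may contain different steps.

```python
from types import SimpleNamespace

# Stand-in for what agent.invoke(...) returns; contents are made up for the demo.
fake_result = {
    "messages": [
        SimpleNamespace(type="human", content="Create a file 'hello.txt', then list files."),
        SimpleNamespace(type="ai", content=""),  # model decides to call the shell tool
        SimpleNamespace(type="tool", content="hello.txt"),
        SimpleNamespace(type="ai", content="Created hello.txt and listed the files."),
    ]
}


def summarize(result, width=50):
    """One short line per message: who spoke and the start of what they said."""
    return [f"{m.type}: {str(m.content)[:width]}" for m in result["messages"]]


def final_text(result):
    """The last message is the agent's final answer."""
    return result["messages"][-1].content


for line in summarize(fake_result):
    print(line)
print("FINAL:", final_text(fake_result))
```

Reading the trace top to bottom shows students that the final answer is only the last step of a longer exchange between the model and the shell tool.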


In [12]:
# The agent will create a file and list the directory contents
result_1 = basic_agent.invoke(
    {
        "messages": [
            HumanMessage(
                "Create a file 'hello.txt' with text 'Hello World', then list files."
            )
        ]
    }
)

print(result_1)
print(result_1["messages"][-1].content)

### Quick Checkpoint (Example 1)

- Which line grants shell capability to the agent?
- What risk comes with direct host execution in production?


## Section 3 — Example 2: Secure / Restricted Agent

This section introduces safer defaults and output protection.

### What this code does

- Creates a dedicated restricted folder.
- Starts shell sessions inside that folder.
- Injects environment variables (including an API key example).
- Runs startup commands automatically.
- Applies redaction rules to hide sensitive output patterns.

### Syntax + Design Notes

- `env={...}` passes values into the shell process environment.
- `startup_commands=[...]` executes initialization commands once per session.
- `RedactionRule(..., detector=r"...")` uses regex to mask matching sensitive text.

**Why this matters:** secure defaults reduce accidental leakage when agents execute shell commands.
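
The idea behind `env={...}` can be tried with plain `subprocess`: variables handed to a child process are visible inside it without being set in the parent. This is a sketch of the general mechanism, not of the middleware's internals. Whether the middleware merges `env` with the parent environment or replaces it is not covered here, and `DEMO_AGENT_MODE` is a hypothetical variable name.

```python
import os
import subprocess
import sys

# Copy the parent's environment and add one extra variable for the child only.
child_env = {**os.environ, "DEMO_AGENT_MODE": "restricted"}

mode_seen_by_child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('DEMO_AGENT_MODE'))"],
    env=child_env, capture_output=True, text=True, check=True,
).stdout.strip()

# The parent process itself was never modified.
mode_in_parent = os.environ.get("DEMO_AGENT_MODE")

print("child sees:", mode_seen_by_child)
print("parent sees:", mode_in_parent)
```

This is why injected values are available to shell commands without appearing anywhere in the prompt.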


In [13]:
# ---------------- EXAMPLE 2: SECURE / RESTRICTED AGENT -----------------
# Demonstrates how to:
# - Set environment variables
# - Run startup commands
# - Redact sensitive information (PII)

print("\n--- 2. Running Secure Agent ---")

# Create a restricted directory for this example
restricted_dir = "./restricted_data"
# exist_ok=True makes this safe to re-run when the folder already exists
os.makedirs(restricted_dir, exist_ok=True)

# In a secure agent, we can set environment variables, run startup commands, and define redaction rules to protect sensitive information.
secure_agent = create_agent(
    model=llm,
    tools=[],
    middleware=[
        ShellToolMiddleware(
            workspace_root=restricted_dir,
            # env: Inject secrets or config into the shell session
            # (dummy key for this demo; never hard-code a real secret)
            env={"MODE": "restricted", "API_KEY": "sk-secret-12345"},
            # startup_commands: Run these immediately when the session starts
            startup_commands=["echo 'Secure Session Started'", "touch session.lock"],
            # redaction_rules: Regex patterns to hide sensitive output
            redaction_rules=[
                RedactionRule(pii_type="custom_api_key", detector=r"API_KEY=[\w-]+")
            ],
            execution_policy=HostExecutionPolicy(),
        ),
    ],
)

### Teacher's Guide: Secure Middleware Features

This configuration adds the controls you would want in place before moving toward production. Why?

1.  **`startup_commands`**: Useful for initializing Python environments, pulling data from a database, or creating a `session.lock` to prevent parallel runs.
2.  **`env`**: Injects `API_KEY` directly into the shell process environment, so commands can use it without the literal value ever appearing in the prompt.
3.  **`RedactionRule`**: This is critical. If the agent runs `env`, the output will contain the line `API_KEY=sk-secret-12345`. The middleware sees the output, finds our regex pattern, and overwrites the match before the LLM (or the UI) ever sees it.

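**Discussion Question:** "Why do we use regex (`r"API_KEY=[\w-]+"` ) instead of just the literal string?" (Answer: The regex matches any value that follows `API_KEY=`, so the rule keeps working when the key changes and the secret itself never has to appear in the rule. Note that this pattern only matches the `API_KEY=value` form printed by commands like `env`; the bare value printed by `echo $API_KEY` does not contain `API_KEY=` and would not match).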
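
The regex can be tested on its own with Python's `re` module before trusting it in an agent. In this sketch the `[REDACTED]` placeholder is an assumption made for the demo (the middleware's actual replacement text may look different), and the two sample outputs are hand-written to mimic `env` and `echo $API_KEY`.

```python
import re

rule = re.compile(r"API_KEY=[\w-]+")

env_output = "MODE=restricted\nAPI_KEY=sk-secret-12345\n"  # shape of `env` output
echo_output = "sk-secret-12345\n"                          # shape of `echo $API_KEY`

masked_env = rule.sub("[REDACTED]", env_output)
masked_echo = rule.sub("[REDACTED]", echo_output)

# A second, value-based pattern catches the key wherever it appears.
value_rule = re.compile(r"sk-[\w-]+")
masked_echo_by_value = value_rule.sub("[REDACTED]", echo_output)

print(masked_env)
print(masked_echo)           # unchanged: the first rule never matched
print(masked_echo_by_value)
```

The name-based rule protects the `env` listing but lets the bare value through, so a rule keyed on the shape of the secret itself is a useful complement.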


### Section 3B — Running Example 2

This block asks the agent to inspect an env var and list files in the restricted workspace, then prints the response.

### Common Mistakes to Avoid (Security)

- Writing broad regex redaction patterns that hide too much useful output.
- Storing real secrets directly in notebooks.
- Forgetting that startup commands can fail if paths or permissions are wrong.
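
The first mistake in this list is easy to check for: run a candidate pattern against sample output and look at everything it would hide. `hidden_spans` is a hypothetical helper name, and the sample text is made up for the demo.

```python
import re

sample_output = (
    "MODE=restricted\n"
    "API_KEY=sk-secret-12345\n"
    "HOME=/home/demo\n"
)


def hidden_spans(pattern, text):
    """Return every substring that the pattern would mask."""
    return [match.group(0) for match in re.finditer(pattern, text)]


narrow = hidden_spans(r"API_KEY=[\w-]+", sample_output)
broad = hidden_spans(r"\w+=\S+", sample_output)

print("narrow hides:", narrow)
print("broad hides:", broad)
```

The broad pattern hides every `NAME=value` line, including harmless ones the agent needs in order to reason about its environment.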


In [14]:
result_2 = secure_agent.invoke(
    {"messages": [HumanMessage("Check the API_KEY env var and list files.")]}
)
print(result_2)
print(result_2["messages"][-1].content)

### Quick Checkpoint (Example 2)

1. Why isolate shell work inside `./restricted_data`?
2. What is the difference between `env` and `startup_commands`?


## Section 4 — Example 3: Custom Maintenance Shell

This section configures a shell profile tailored for maintenance workflows.

### What this code does

- Provides a custom `tool_description` to guide the model.
- Chooses `shell_command="/bin/zsh"` explicitly.
- Defines `shutdown_commands` for automatic cleanup at session end.

**Why this matters:** explicit shell choice and cleanup behavior make automation more predictable.
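
Because a hard-coded path such as `/bin/zsh` may not exist on every machine, it can help to look up the shell first. The sketch below uses `shutil.which` from the standard library; `pick_shell` and its fallback order are hypothetical choices for illustration, and the result would be passed as `shell_command`.

```python
import shutil


def pick_shell(preferred=("zsh", "bash", "sh")):
    """Return the path of the first shell found on PATH, or None."""
    for name in preferred:
        path = shutil.which(name)
        if path:
            return path
    return None


chosen_shell = pick_shell()
print("shell to use:", chosen_shell)
```

If `pick_shell()` returns `None`, it is better to stop with a clear message than to let the agent fail on its first command.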


In [15]:
# ---------------- EXAMPLE 3: CUSTOM MAINTENANCE SHELL -----------------
# Demonstrates how to:
# - Use a specific shell (zsh vs bash)
# - Provide a custom tool description for the LLM
# - Run shutdown commands for cleanup

print("\n--- 3. Running Maintenance Agent ---")
# This agent is designed for system maintenance tasks. It uses zsh and has specific cleanup commands that run when the session ends.
maintenance_agent = create_agent(
    model=llm,
    tools=[],
    middleware=[
        ShellToolMiddleware(
            # tool_description: Helps the LLM understand when/how to use this shell
            tool_description="System maintenance shell for cleanup and updates.",
            # shell_command: Explicitly use zsh (or any other shell executable)
            shell_command="/bin/zsh",
            # shutdown_commands: Run these when the agent session closes
            shutdown_commands=[
                "rm -rf ./temp_logs",
                "echo 'Cleanup Complete'",
            ],
            execution_policy=HostExecutionPolicy(),
        ),
    ],
)

### Teacher's Guide: Lifecycle Automation

Notice the two customization features in this example:

- **`tool_description`**: This text is passed to the LLM as the description of the shell tool. It helps the LLM decide **when** this specific tool should be its first choice.
- **`shutdown_commands`**: Useful for **clean teardown** (leaving the system as you found it), so repeated runs start from the same state. This is critical for automated testing or cleanup tasks.

**Student Challenge:** Add a shutdown command that logs the time the session ended to a file called `history.log`. Or, if on macOS/Linux, try using `/bin/bash` instead of `/bin/zsh`.


### Section 4B — Running Example 3

This block asks for the shell version and disk usage, then prints the full result and the final assistant message.

### Inline Notes

- Shutdown commands are useful for removing temporary artifacts.
- Tool descriptions can improve how the LLM decides when to use shell operations.


In [16]:
result_3 = maintenance_agent.invoke(
    {"messages": [HumanMessage("Check shell version and disk usage.")]}
)
print(result_3)
print(result_3["messages"][-1].content)

## Recap

In this lesson, you practiced three levels of shell middleware configuration:

- **Basic host access** for simple local automation
- **Restricted + redacted setup** for safer execution
- **Maintenance shell customization** for operational tasks

You also saw the common script pattern: configure agent → invoke with messages → inspect final response.

## Suggested Practice Tasks

1. Change Example 1 to create two files and verify both exist.
2. Add one more `RedactionRule` for email-like text.
3. Update Example 3 shutdown commands to archive logs instead of deleting them.
4. Compare behavior between `/bin/zsh` and `/bin/bash` on your machine.


### Teacher's Final Note: When to use each?

- **Host (Basic)**: Local prototyping and simple file management.
- **Restricted (Secure)**: Production or environments where you need to protect secrets or PII.
- **Maintenance (Lifecycle)**: Ongoing automated workflows that require cleanup and explicit tool guidance.

By mastering these three, you can match an agent's shell access to the level of risk each task carries.
