emarco177/agent-security

AI Agent Security Techniques: Real World Attacks with Production Grade Defenses

Based on LangChain · LangGraph · LangSmith

AI agents are shipping to production faster than teams can secure them. The same vulnerabilities keep showing up: agents that reset passwords for the wrong user, leak customer data across tenants, or follow instructions planted in database records. Most of these aren't sophisticated attacks. They're the predictable result of skipping security fundamentals in the rush to deploy.

This repository is a hands-on breakdown of how AI agents break and how to fix them. It contains six progressive demos, each targeting a real attack vector from production deployments. The fixes build real security architecture for agents with LangChain middleware: composable hooks that intercept every model call and tool execution in the agent loop, enforcing security without modifying core agent logic. The demos are built with LangChain and LangGraph and deployed on GCP Cloud Run with Firebase Auth and Firestore. Every agent run is observable via LangSmith tracing.

Every attack is runnable. Every fix is verifiable. No slides, no theory, just code.

TechCorp IT Helpdesk login screen with five demo users, each with different roles and departments

Warning

For educational and defensive-security purposes only. This repository demonstrates real attacks against AI agents so engineers can learn to defend against them. The vulnerabilities are intentional and the exploits run against a sandboxed demo application (TechCorp) that you deploy to your own isolated cloud project. Do not use these techniques against any system you do not own or are not explicitly authorized to test. You are solely responsible for how you use this material; the author accepts no liability for misuse or for any damage resulting from it. By using this repository you agree to comply with all applicable laws and to follow responsible-disclosure practices.

Who This Is For

  • Engineers building or deploying AI agents who want to understand the real attack surface
  • Platform and security teams evaluating agent risks before shipping to production
  • Anyone who's read about prompt injection but never exploited one hands-on

The 6 Attack Vectors

Each demo follows the same pattern: attack the agent, prove the damage, apply the fix, verify the fix holds.

| # | Attack Vector | What Happens | The Fix | OWASP LLM |
|---|---------------|--------------|---------|-----------|
| 1 | Blast Radius | Prompt injection resets another user's password and exfiltrates GCP secrets | ToolFilterMiddleware + least-privilege service account | LLM01, LLM06 |
| 2 | Tenant Isolation | Employee reads HR salary data with a natural-sounding request (no injection needed) | AuthorizationMiddleware: RBAC from verified Firebase JWT | LLM02, LLM06 |
| 3 | Indirect Injection | Poisoned ticket description hijacks the agent to leak budget data using a manager's privileges | SanitizationMiddleware: tool call validation + data boundaries | LLM01, LLM05 |
| 4 | Memory Poisoning | One chat message becomes a permanent backdoor that fires in every future session | MemoryGuardMiddleware: structured Pydantic schemas kill free-text injection | LLM01, LLM06 |
| 5 | Production Guardrails | Sophisticated jailbreak bypasses keyword validation + PII leaks in agent responses | NeMo Guardrails: ML-based jailbreak detection + PII redaction | LLM01, LLM02 |
| 6 | Least Privilege | Over-privileged GCP service account exposes storage buckets and secrets | Infrastructure-level fix: least-privilege service account | LLM06 |

Each attack is modeled on vulnerabilities found in production agent deployments, not contrived examples.
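The fixes for Demos 1 and 2 come down to the same idea: decide whether a tool call is allowed from the caller's verified identity, not from anything the prompt says. The block below is a minimal, framework-free sketch of that check, not the repository's ToolFilterMiddleware or AuthorizationMiddleware. The role names, tool names, and the `Caller`, `authorize_tool_call`, and `guarded_call` helpers are all hypothetical, and the caller's identity is assumed to come from a token that has already been verified.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical role -> allowed-tool mapping; the repo's real tool names may differ.
ROLE_TOOLS: dict[str, frozenset[str]] = {
    "employee": frozenset({"create_ticket", "reset_own_password"}),
    "hr": frozenset({"create_ticket", "reset_own_password", "read_salary"}),
}


@dataclass(frozen=True)
class Caller:
    """Identity taken from an already-verified token, never from the prompt."""
    uid: str
    role: str


class ToolDenied(Exception):
    """Raised when a tool call is not permitted for the caller."""


def authorize_tool_call(caller: Caller, tool: str, args: dict[str, Any]) -> None:
    # Unknown roles get an empty allow-list, so the default is deny.
    allowed = ROLE_TOOLS.get(caller.role, frozenset())
    if tool not in allowed:
        raise ToolDenied(f"role {caller.role!r} may not call {tool!r}")
    # Self-service tools may only act on the caller's own account.
    if tool == "reset_own_password" and args.get("target_uid") != caller.uid:
        raise ToolDenied("password reset is limited to the caller's own account")


def guarded_call(
    caller: Caller,
    tool: str,
    args: dict[str, Any],
    execute: Callable[[str, dict[str, Any]], Any],
) -> Any:
    """Run the authorization check first; only then execute the tool."""
    authorize_tool_call(caller, tool, args)
    return execute(tool, args)
```

Because the check runs outside the model, a prompt that talks the model into requesting `read_salary` or another user's password reset still fails at the tool boundary.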

OWASP LLM Top 10 Coverage

Mapped to the OWASP Top 10 for LLM Applications. These are the categories this course attacks and defends hands-on:

| OWASP LLM | Category | Demos |
|-----------|----------|-------|
| LLM01 | Prompt Injection | 1, 3, 4, 5 |
| LLM02 | Sensitive Information Disclosure | 2, 5 |
| LLM05 | Improper Output Handling | 3 |
| LLM06 | Excessive Agency | 1, 2, 4, 6 |
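To make the LLM01 row concrete for Demo 4: the memory-poisoning fix replaces free-text memory with structured schemas, so there is no field in which an injected instruction can persist. The sketch below illustrates that idea with standard-library dataclasses and enums rather than the Pydantic schemas the repository uses. The field names, enum values, and the `parse_memory` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Language(str, Enum):
    EN = "en"
    DE = "de"


class Channel(str, Enum):
    EMAIL = "email"
    CHAT = "chat"


@dataclass(frozen=True)
class UserPreferences:
    """Long-term memory limited to closed-set fields; no free-text slot exists."""
    language: Language
    contact_channel: Channel


ALLOWED_FIELDS = {"language", "contact_channel"}


def parse_memory(raw: dict[str, str]) -> UserPreferences:
    """Validate a proposed memory write; raise ValueError on anything unexpected."""
    unknown = set(raw) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unexpected memory fields: {sorted(unknown)}")
    # Enum lookup raises ValueError for missing values or values outside the set.
    return UserPreferences(
        language=Language(raw.get("language")),
        contact_channel=Channel(raw.get("contact_channel")),
    )
```

A write such as `{"language": "ignore previous instructions", "contact_channel": "chat"}` is rejected because the value is not a member of the enum, and an extra `note` field is rejected because it is not part of the schema.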

Quick Start

All commands below run from modules/01-agent-security/. About 10 minutes to a running agent on real GCP services.

cd modules/01-agent-security

# 1. Configure environment
cp .env.example .env              # Edit .env: set all required variables (see .env.example)

# 2. Authenticate with GCP
gcloud auth application-default login
gcloud config set project YOUR_PROJECT_ID

# 3. Set up Firebase (one-time, required for Chat UI authentication):
#    a. Go to https://console.firebase.google.com
#    b. Click "Add project" > select your GCP project
#    c. Go to Build > Authentication > Get Started > enable "Email/Password" provider
#    d. Go to Project Settings > General > copy the "Web API Key"
#    e. Add to .env: FIREBASE_API_KEY=your-api-key-here

# 4. Deploy (builds Docker image, provisions infra, seeds data)
bash deploy_gcloud.sh

# 5. Open the Chat UI at the AGENT_URL printed by the deploy script
#    Sign in as Alice > type the Demo 1 attack prompt > watch the agent reset Bob's password

See the Module 01 README for architecture details, the full demo flow, Chat UI features, and environment reference.

Infrastructure

All demos and lessons run on GCP with real services. No emulators, no mocks.

graph TB
    User([User / Chat UI]) -->|HTTPS| CR[Cloud Run<br/>FastAPI + LangGraph Agent]
    CR -->|verify JWT| FA[Firebase Auth<br/>Identity + Custom Claims]
    CR -->|read/write| FS[(Firestore<br/>Tickets · Wiki · Employees)]
    CR -->|retrieve| SM[Secret Manager<br/>API Keys · Credentials]
    CR -->|list/read| GCS[Cloud Storage<br/>Customer Data Buckets]
    CR -->|traces| LS[LangSmith<br/>Observability]
    AR[Artifact Registry<br/>Docker Images] -.->|deploy| CR

    style CR fill:#4285F4,color:#fff
    style FA fill:#FBBC04,color:#000
    style FS fill:#34A853,color:#fff
    style SM fill:#EA4335,color:#fff
    style GCS fill:#34A853,color:#fff
    style LS fill:#3b82f6,color:#fff
    style AR fill:#9AA0A6,color:#fff

Deployment is a single script (deploy_gcloud.sh). No Terraform required.

Prerequisites

| Tool | Install | Verify |
|------|---------|--------|
| uv (package manager) | docs.astral.sh/uv | `uv --version` |
| Docker | docker.com | `docker --version` |
| Google Cloud account | cloud.google.com | `gcloud auth list` |
| gcloud CLI | cloud.google.com/sdk | `gcloud --version` |
| OpenAI API key | platform.openai.com/api-keys | |

Built on the LangChain stack

Every agent-level security fix in this repo is a single LangChain middleware class. The diagram below is the real agent loop: each demo's defense plugs into one of its hooks (wrap_tool_call, before_model, after_model) without forking the agent or rewriting the loop. create_agent() compiles the whole thing into a LangGraph state graph (the ReAct loop, tool routing, and state are handled for you), and every run is traced end to end in LangSmith. This is what security architecture for agents looks like: defenses live inside the agent's architecture as clean, composable layers instead of a patchwork bolted on top. That is the foundation production agents have been missing.

LangChain middleware hooks firing before and after the model and tool steps of the agent loop

The LangChain agent loop and its middleware hooks. Diagram from the LangChain middleware docs.
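The composition idea behind these hooks can be sketched without the framework. The block below is a simplified illustration, not LangChain's API: only the hook name wrap_tool_call comes from the text above, while the request shape, the `compose` helper, and the two example middlewares (`audit_log`, `block_tools`) are hypothetical, and the real LangChain signatures may differ. Each middleware receives the request and a handler for the next layer, and can call it, modify the result, or short-circuit.

```python
from typing import Callable

# Simplified stand-ins for the framework's types (assumptions for this sketch).
ToolRequest = dict  # e.g. {"name": "lookup_ticket", "args": {...}}
Handler = Callable[[ToolRequest], str]
Middleware = Callable[[ToolRequest, Handler], str]


def compose(middlewares: list[Middleware], tool: Handler) -> Handler:
    """Wrap `tool` so that the first middleware in the list is the outermost layer."""
    handler = tool
    for mw in reversed(middlewares):
        # Bind mw and the current handler as defaults to avoid late-binding bugs.
        handler = lambda req, mw=mw, nxt=handler: mw(req, nxt)
    return handler


def audit_log(log: list[str]) -> Middleware:
    """Record every tool call before and after it passes through the inner layers."""
    def wrap_tool_call(req: ToolRequest, handler: Handler) -> str:
        log.append(f"call {req['name']}")
        result = handler(req)
        log.append(f"done {req['name']}")
        return result
    return wrap_tool_call


def block_tools(blocked: set[str]) -> Middleware:
    """Short-circuit calls to blocked tools; the real tool never runs."""
    def wrap_tool_call(req: ToolRequest, handler: Handler) -> str:
        if req["name"] in blocked:
            return f"blocked: {req['name']}"
        return handler(req)
    return wrap_tool_call
```

With `compose([audit_log(log), block_tools({"read_secret"})], tool)`, a call to `read_secret` is logged by the outer layer and stopped by the inner one, while the tool function itself stays unchanged. Adding or removing a defense is a change to the list, not to the agent.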

Contributing

Contributions are welcome. This is a teaching repository, so the bar for changes is clarity and correctness over feature breadth:

  • Found a bug or a broken demo? Open an issue describing the demo, the expected vs. actual behavior, and your environment.
  • Improving a defense or adding an attack vector? Open an issue first to discuss the approach before sending a PR, so it fits the demo's attack > prove > fix > verify structure.

  • Fixing docs or typos? Send a PR directly.

Please keep examples runnable and aligned with the existing middleware pattern (one security fix = one composable LangChain middleware class). Do not submit exploits targeting real third-party systems.

Security note: if you discover a genuine vulnerability in the demo infrastructure itself (not the intentional teaching vulnerabilities), please report it privately rather than opening a public issue.

Built By

Eden Marco | Focused on AI agent security in cloud-native environments. Built this after seeing the same agent vulnerabilities across too many production deployments, to show how it's done right: every attack, every fix, running on real infrastructure.

LinkedIn X Udemy

About

Hands-on AI agent security resource: OWASP GenAI attack/fix/verify demos on a LangChain IT-helpdesk agent, deployed to the cloud.
