ExAgent

An Elixir library for building multi-agent LLM applications. ExAgent abstracts calls to various LLM providers (OpenAI, Gemini, DeepSeek) via an extensible Protocol and orchestrates them using OTP primitives with four multi-agent design patterns.

Hex Docs

Features

Protocol-based LLM abstraction — Swap providers without changing application code
Built on OTP — Agents backed by GenServers, supervised processes, async Tasks
Automatic tool execution — Define tools once, the agent loops LLM calls until complete
4 multi-agent patterns — Subagents, Skills, Handoffs, Router
HTTP via Req — Clean, composable HTTP with built-in JSON encoding and auth
Multimodal file attachments — Send images, PDFs, and other files alongside chat messages
Extensible — Add any LLM provider by implementing a single protocol

Installation

Add ex_agent to your list of dependencies in mix.exs:

def deps do
  [
    {:ex_agent, "~> 0.1.0"}
  ]
end

Quick Start

# 1. Create a provider
provider = ExAgent.Providers.OpenAI.new(api_key: System.get_env("OPENAI_API_KEY"))

# 2. Start an agent
{:ok, agent} = ExAgent.start_agent(provider: provider)

# 3. Chat
{:ok, response} = ExAgent.chat(agent, "What is Elixir?")
IO.puts(response.content)

Providers

ExAgent ships with three built-in providers. Each is configured via new/1 and automatically initializes a Req HTTP client.

OpenAI

Supports chat and file attachments (images via image_url multipart format).

provider = ExAgent.Providers.OpenAI.new(
  api_key: "sk-...", # required
  model: "gpt-4o", # default: "gpt-4o"
  base_url: "https://api.openai.com/v1",  # default
  system_prompt: "You are a helpful assistant."
)

Gemini

Supports chat and file attachments (images, PDFs, etc. via inline_data format).

provider = ExAgent.Providers.Gemini.new(
  api_key: "AIza...", # required
  model: "gemini-2.0-flash", # default: "gemini-2.0-flash"
  system_prompt: "Be concise."
)

DeepSeek

Supports chat and tool calling. File attachments are silently ignored (DeepSeek API does not support multimodal input).

provider = ExAgent.Providers.DeepSeek.new(
  api_key: "sk-...",  # required
  model: "deepseek-chat", # default: "deepseek-chat"
  system_prompt: "You are a coding expert."
)

Core Concepts

Message

Represents a single message in a conversation.

{:ok, msg} = ExAgent.Message.new(role: :user, content: "Hello!")
# Supported roles: :system, :user, :assistant, :tool

Tool

Defines a function the LLM can invoke, with JSON Schema parameters.

{:ok, tool} = ExAgent.Tool.new(
  name: "get_weather",
  description: "Get current weather for a city",
  parameters: %{
    "type" => "object",
    "properties" => %{
      "city" => %{"type" => "string", "description" => "City name"}
    },
    "required" => ["city"]
  },
  function: fn %{"city" => city} ->
    {:ok, "#{city}: 22C, sunny"}
  end
)

Context

Portable conversation state with message history and metadata.

context = ExAgent.Context.new(metadata: %{session_id: "abc123"})

{:ok, msg} = ExAgent.Message.new(role: :user, content: "Hello")
context = ExAgent.Context.add_message(context, msg)

# Get the last assistant response
last = ExAgent.Context.get_last_assistant_message(context)

Skill

A loadable persona with its own system prompt, tools, and activation function.

{:ok, sql_skill} = ExAgent.Skill.new(
  name: "sql_expert",
  system_prompt: "You are a SQL expert. Help users write queries.",
  tools: [sql_tool],
  activation_fn: fn ctx ->
    Enum.any?(ctx.messages, fn m ->
      String.contains?(m.content, "SQL") or String.contains?(m.content, "SELECT")
    end)
  end
)

Agent Lifecycle

Agents are GenServer processes managed by a DynamicSupervisor.

# Start an agent with tools and skills
{:ok, agent} = ExAgent.start_agent(
  provider: provider,
  id: "my-agent",
  tools: [weather_tool, search_tool],
  skills: [sql_skill]
)

# Synchronous chat
{:ok, response} = ExAgent.chat(agent, "What's the weather in Tokyo?")
IO.puts(response.content)

# Asynchronous chat
task = ExAgent.chat_async(agent, "Tell me a story")
{:ok, response} = Task.await(task)

# Inspect conversation history
context = ExAgent.get_context(agent)
Enum.each(context.messages, fn msg ->
  IO.puts("#{msg.role}: #{msg.content}")
end)

# Reset conversation
ExAgent.reset(agent)

# Stop the agent
ExAgent.stop_agent(agent)

File Attachments

Send images, PDFs, and other files alongside chat messages. Files become part of the conversation context, so the LLM can reference them in follow-up messages. You can either send files inline (base64-encoded) or upload them first for better performance.

# Attach a file by path (inline base64)
{:ok, response} = ExAgent.chat(agent, "Describe this image",
  files: [%{path: "photo.jpg", mime_type: "image/jpeg"}])

# Attach raw binary data (inline base64)
image_data = File.read!("diagram.png")
{:ok, response} = ExAgent.chat(agent, "What's in this diagram?",
  files: [%{data: image_data, mime_type: "image/png"}])

# Multiple files of any type
{:ok, response} = ExAgent.chat(agent, "Summarize these documents",
  files: [
    %{path: "report.pdf", mime_type: "application/pdf"},
    %{path: "data.csv", mime_type: "text/csv"},
    %{path: "notes.md", mime_type: "text/markdown"}
  ])

# Files persist in conversation context — the LLM remembers them
{:ok, _} = ExAgent.chat(agent, "Now focus on the second document")

Supported File Types

Type	MIME Type	OpenAI	Gemini	DeepSeek
JPEG	`image/jpeg`	Yes	Yes	No
PNG	`image/png`	Yes	Yes	No
GIF	`image/gif`	Yes	Yes	No
WebP	`image/webp`	Yes	Yes	No
PDF	`application/pdf`	Yes	Yes	No
TXT	`text/plain`	Yes	Yes	No
Markdown	`text/markdown`	Yes	Yes	No
CSV	`text/csv`	Yes	Yes	No

Note: DeepSeek does not support multimodal input. File attachments on DeepSeek agents are silently ignored.

File Uploads

For large files or when you want to reuse the same file across multiple conversations, upload the file first and reference it later. This avoids sending base64-encoded data with every chat request.

Upload and Reference (OpenAI)

provider = ExAgent.Providers.OpenAI.new(api_key: System.get_env("OPENAI_API_KEY"))

# Upload a file from disk
{:ok, ref} = ExAgent.upload_file(provider, "report.pdf", "application/pdf")

# Use the reference in chat — no base64 encoding, just a lightweight file ID
{:ok, agent} = ExAgent.start_agent(provider: provider)
{:ok, response} = ExAgent.chat(agent, "Summarize this report",
  files: [%{file_ref: ref}])

# Reuse the same reference in another message
{:ok, response} = ExAgent.chat(agent, "What are the key findings?",
  files: [%{file_ref: ref}])

Upload and Reference (Gemini)

provider = ExAgent.Providers.Gemini.new(api_key: System.get_env("GEMINI_API_KEY"))

# Upload a file — Gemini files expire after 48 hours
{:ok, ref} = ExAgent.upload_file(provider, "photo.jpg", "image/jpeg")

# Check if a reference has expired
ExAgent.FileRef.expired?(ref)

# Use in chat
{:ok, agent} = ExAgent.start_agent(provider: provider)
{:ok, response} = ExAgent.chat(agent, "Describe what you see",
  files: [%{file_ref: ref}])

Upload Raw Binary Data

# If you already have the file contents in memory
image_bytes = File.read!("screenshot.png")
{:ok, ref} = ExAgent.upload_data(provider, image_bytes, "image/png",
  filename: "screenshot.png")

Mix Inline and Uploaded Files

# You can combine both approaches in a single message
{:ok, ref} = ExAgent.upload_file(provider, "large_video.mp4", "video/mp4")
{:ok, response} = ExAgent.chat(agent, "Compare these",
  files: [
    %{file_ref: ref},                                          # uploaded reference
    %{path: "small_image.jpg", mime_type: "image/jpeg"}        # inline base64
  ])

Built-in Provider Tools

Each LLM provider offers built-in tools that can be enabled via the built_in_tools option — either at agent creation (applies to all calls) or per-message (overrides agent default).

Gemini

# Google Search grounding — LLM can search the web for up-to-date info
{:ok, agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.Gemini.new(api_key: gemini_key),
  built_in_tools: [:google_search]
)
{:ok, response} = ExAgent.chat(agent, "What happened in tech news today?")

# Code execution — LLM can write and run Python code
{:ok, response} = ExAgent.chat(agent, "Calculate fibonacci(20)",
  built_in_tools: [:code_execution])

# URL context — LLM can fetch and analyze web pages
{:ok, response} = ExAgent.chat(agent, "Summarize this page",
  built_in_tools: [:url_context])

# Combine multiple built-in tools
{:ok, response} = ExAgent.chat(agent, "Research and compute",
  built_in_tools: [:google_search, :code_execution])

Available Gemini built-in tools: :google_search, :code_execution, :url_context

OpenAI

# Web search — LLM can search the web
{:ok, agent} = ExAgent.start_agent(
  provider: provider,
  built_in_tools: [:web_search]
)
{:ok, response} = ExAgent.chat(agent, "What are the latest Elixir releases?")

# Web search with user location for localized results
{:ok, response} = ExAgent.chat(agent, "Best restaurants nearby",
  built_in_tools: [%{web_search: %{"city" => "San Francisco", "country" => "US", "region" => "California"}}])

Available OpenAI built-in tools: :web_search

DeepSeek

# Thinking/reasoning mode — enables chain-of-thought reasoning
{:ok, agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.DeepSeek.new(
    api_key: deepseek_key,
    model: "deepseek-reasoner"
  ),
  built_in_tools: [:thinking]
)
{:ok, response} = ExAgent.chat(agent, "Solve this step by step: if x^2 + 3x - 10 = 0, what is x?")

Available DeepSeek built-in tools: :thinking

Tool Calling

When you provide tools to an agent, the LLM can invoke them automatically. The agent runs a tool execution loop:

Sends messages + tool definitions to the LLM
If the LLM returns a tool_call, the agent executes the matching function
Appends the tool result as a :tool message
Calls the LLM again with the updated context
Repeats until the LLM returns a final text response (max 10 iterations)

{:ok, search_tool} = ExAgent.Tool.new(
  name: "web_search",
  description: "Search the web for information",
  parameters: %{
    "type" => "object",
    "properties" => %{
      "query" => %{"type" => "string", "description" => "Search query"}
    },
    "required" => ["query"]
  },
  function: fn %{"query" => query} ->
    # Your search implementation here
    {:ok, "Results for: #{query}"}
  end
)

{:ok, calc_tool} = ExAgent.Tool.new(
  name: "calculator",
  description: "Evaluate a math expression",
  parameters: %{
    "type" => "object",
    "properties" => %{
      "expression" => %{"type" => "string"}
    },
    "required" => ["expression"]
  },
  function: fn %{"expression" => expr} ->
    {result, _} = Code.eval_string(expr)
    {:ok, to_string(result)}
  end
)

{:ok, agent} = ExAgent.start_agent(
  provider: provider,
  tools: [search_tool, calc_tool]
)

# The LLM can now decide to call these tools during conversation
{:ok, response} = ExAgent.chat(agent, "What is 42 * 37?")

Multi-Agent Patterns

1. Subagents (Centralized Orchestration)

A main orchestrator agent delegates work to specialized subagents. Each subagent runs in isolation with a fresh context — no state leaks between calls.

alias ExAgent.Patterns.Subagents

# Define specialized subagent specs
researcher = %{
  name: "researcher",
  description: "Research a topic and return findings",
  provider: ExAgent.Providers.Gemini.new(api_key: gemini_key),
  system_prompt: "You are a research specialist. Provide detailed findings.",
  tools: []
}

coder = %{
  name: "coder",
  description: "Write code based on specifications",
  provider: ExAgent.Providers.OpenAI.new(api_key: openai_key),
  system_prompt: "You are an expert programmer. Write clean, tested code.",
  tools: []
}

# Convert subagent specs into tools for the orchestrator
orchestrator_tools = Subagents.build_orchestrator_tools([researcher, coder])

# The orchestrator uses these as regular tools — when the LLM calls
# "researcher" or "coder", it spawns an ephemeral subagent call
{:ok, orchestrator} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: openai_key,
    system_prompt: "You orchestrate tasks. Use the researcher for facts and the coder for code."
  ),
  tools: orchestrator_tools
)

{:ok, response} = ExAgent.chat(orchestrator, "Research Elixir GenServers and write an example")

# You can also invoke subagents directly
{:ok, result} = Subagents.invoke_subagent(researcher, "Explain OTP supervision trees")

# Or invoke multiple in parallel
results = Subagents.invoke_subagents_parallel([
  {researcher, "What is GenServer?"},
  {coder, "Write a GenServer example"}
])
# => [{"researcher", {:ok, "GenServer is..."}}, {"coder", {:ok, "defmodule..."}}]

2. Skills (Progressive Disclosure)

A single agent dynamically loads specialized system prompts and tools based on conversation context. Skills are evaluated before each LLM call.

# Define skills with activation functions
{:ok, sql_skill} = ExAgent.Skill.new(
  name: "sql_expert",
  system_prompt: "You are a SQL expert. Help users write and optimize queries.",
  tools: [sql_execute_tool],
  activation_fn: fn ctx ->
    ctx.messages
    |> Enum.any?(fn m ->
      String.match?(m.content, ~r/SQL|SELECT|INSERT|UPDATE|DELETE|database/i)
    end)
  end
)

{:ok, python_skill} = ExAgent.Skill.new(
  name: "python_expert",
  system_prompt: "You are a Python expert. Write idiomatic Python code.",
  tools: [python_run_tool],
  activation_fn: fn ctx ->
    ctx.messages
    |> Enum.any?(fn m -> String.contains?(m.content, "Python") end)
  end
)

# Start agent with skills — it begins as a generalist
{:ok, agent} = ExAgent.start_agent(
  provider: provider,
  skills: [sql_skill, python_skill]
)

# When the user mentions SQL, the sql_expert skill activates automatically
{:ok, response} = ExAgent.chat(agent, "Help me write a SQL query to find active users")
# => Agent now uses the sql_expert system prompt and tools

# Skills can also be loaded dynamically at runtime
{:ok, new_skill} = ExAgent.Skill.new(name: "devops", system_prompt: "You are a DevOps expert.")
ExAgent.Agent.load_skill(agent, new_skill)

3. Handoffs (State-Driven Transitions)

The active agent changes dynamically. When the LLM invokes a handoff tool, control transfers to a different agent. The caller receives a {:handoff, target, context} tuple and decides where to route subsequent messages.

alias ExAgent.Patterns.Handoff

# Start specialized agents
{:ok, sales_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "You are a sales specialist."
  )
)

{:ok, support_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "You are a technical support specialist."
  )
)

# Build handoff tools
handoff_to_support = Handoff.build_handoff_tool(
  "support",
  support_agent,
  "Transfer to technical support when the user has a technical issue"
)

handoff_to_sales = Handoff.build_handoff_tool(
  "sales",
  sales_agent,
  "Transfer to sales when the user wants to buy something"
)

# Start a triage agent with handoff tools
{:ok, triage_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "You are a triage agent. Route users to the right department."
  ),
  tools: [handoff_to_support, handoff_to_sales]
)

# When the LLM decides to hand off, you get a handoff tuple
case ExAgent.chat(triage_agent, "My app keeps crashing") do
  {:ok, response} ->
    # Normal response — agent handled it directly
    IO.puts("Normal response — agent handled it directly")
    IO.puts(response.content)

  {:handoff, target_pid, context} ->
    # Transfer context and continue with the new agent
    ExAgent.handoff(target_pid, context)
    {:ok, response} = ExAgent.chat(target_pid, "My app keeps crashing")
    IO.puts("Transfer context and continue with the new agent")
    IO.puts(response.content)
end

4. Router (Parallel Dispatch & Synthesis)

Classifies input, dispatches to multiple specialized agents in parallel, and synthesizes results into a single response.

alias ExAgent.Patterns.Router

# Start specialized agents
{:ok, code_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "Analyze code quality and suggest improvements."
  )
)

{:ok, security_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "Analyze code for security vulnerabilities."
  )
)

{:ok, perf_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.Gemini.new(
    api_key: gemini_key,
    system_prompt: "Analyze code for performance issues."
  )
)

# Define routes with match functions
routes = [
  %{name: "code_quality", agent: code_agent, match_fn: fn _ -> true end},
  %{name: "security", agent: security_agent, match_fn: &String.contains?(&1, "security")},
  %{name: "performance", agent: perf_agent, match_fn: &String.contains?(&1, "performance")}
]

# Route dispatches to all matching agents in parallel
{:ok, result} = ExAgent.route(
  "Review this code for security and performance issues: def fetch(url), do: HTTPoison.get!(url)",
  routes: routes,
  timeout: 30_000
)

IO.puts(result)
# ## code_quality
# The function lacks error handling...
#
# ## security
# Using get! will raise on HTTP errors...
#
# ## performance
# Consider connection pooling...

# Custom synthesizer
{:ok, result} = ExAgent.route("analyze this code",
  routes: routes,
  synthesizer: fn _input, results ->
    results
    |> Enum.map(fn {name, content} -> "**#{name}**: #{content}" end)
    |> Enum.join("\n\n")
  end
)

Adding a Custom Provider

Any LLM can be integrated by defining a struct and implementing the ExAgent.LlmProvider protocol:

defmodule MyApp.Providers.Anthropic do
  @moduledoc "Custom Anthropic Claude provider."

  defstruct [
    :api_key, :req,
    model: "claude-sonnet-4-20250514",
    base_url: "https://api.anthropic.com/v1",
    system_prompt: nil,
    tools: []
  ]

  def new(opts) do
    provider = struct!(__MODULE__, opts)
    %{provider | req: Req.new(
      base_url: provider.base_url,
      headers: [
        {"x-api-key", provider.api_key},
        {"anthropic-version", "2023-06-01"}
      ]
    )}
  end

  defimpl ExAgent.LlmProvider do
    def chat(provider, messages, _opts) do
      body = %{
        "model" => provider.model,
        "max_tokens" => 1024,
        "messages" => Enum.map(messages, fn msg ->
          %{"role" => to_string(msg.role), "content" => msg.content}
        end)
      }

      case Req.post(provider.req, url: "/messages", json: body) do
        {:ok, %Req.Response{status: 200, body: %{"content" => [%{"text" => text} | _]}}} ->
          {:ok, %ExAgent.Message{role: :assistant, content: text}}

        {:ok, %Req.Response{status: status, body: body}} ->
          {:error, {status, body}}

        {:error, reason} ->
          {:error, reason}
      end
    end
  end
end

# Use it like any other provider
provider = MyApp.Providers.Anthropic.new(api_key: "sk-ant-...")
{:ok, agent} = ExAgent.start_agent(provider: provider)
{:ok, response} = ExAgent.chat(agent, "Hello Claude!")

Architecture

Supervision Tree

Application (ex_agent)
  |
  ExAgent.AgentSupervisor (:one_for_one)
    |
    +-- ExAgent.AgentDynamicSupervisor (:one_for_one)
    |     |
    |     +-- ExAgent.Agent (id: "orchestrator")
    |     +-- ExAgent.Agent (id: "coder")
    |     +-- ExAgent.Agent (id: "reviewer")
    |     +-- ... (any runtime agents)
    |
    +-- ExAgent.TaskSupervisor
          |
          +-- Task (async chat calls)
          +-- Task (parallel subagent invocations)
          +-- Task (router parallel dispatch)

Design Decisions

Protocol dispatch — Provider structs implement ExAgent.LlmProvider, enabling compile-time polymorphism
Thin protocol impls — Protocol implementations delegate to service modules under services/, keeping HTTP logic separate
Tool loop in GenServer — The handle_call({:chat, ...}) contains the tool execution loop, processing one turn at a time to prevent race conditions on context
Subagents bypass GenServer — Ephemeral stateless calls use LlmProvider.chat/3 directly in supervised Tasks
Handoff returns to caller — Keeps agents decoupled; the caller decides routing after a handoff
Router is a plain module — Stateless classify-dispatch-synthesize flow needs no GenServer
All patterns share one Agent GenServer — Patterns augment behavior through state and tools, not separate process types

Project Structure

lib/
  ex_agent.ex                     # Public API facade
  ex_agent/
    llm_provider.ex               # LlmProvider protocol
    file_uploader.ex              # FileUploader protocol
    file_ref.ex                   # %FileRef{} struct (uploaded file reference)
    message.ex                    # %Message{} struct
    tool.ex                       # %Tool{} struct
    context.ex                    # %Context{} struct
    skill.ex                      # %Skill{} struct
    agent.ex                      # Agent GenServer
    supervisor.ex                 # AgentSupervisor
    dynamic_supervisor.ex         # AgentDynamicSupervisor
    providers/
      openai.ex                   # OpenAI provider + LlmProvider + FileUploader
      gemini.ex                   # Gemini provider + LlmProvider + FileUploader
      deep_seek.ex                # DeepSeek provider + LlmProvider
    services/
      openai_service.ex           # OpenAI chat HTTP calls via Req
      openai_upload_service.ex    # OpenAI file upload (POST /v1/files)
      gemini_service.ex           # Gemini chat HTTP calls via Req
      gemini_upload_service.ex    # Gemini file upload (Files API)
      deep_seek_service.ex        # DeepSeek HTTP calls via Req
    patterns/
      subagents.ex                # Centralized orchestration
      skills.ex                   # Progressive disclosure
      handoff.ex                  # State-driven transitions
      router.ex                   # Parallel dispatch & synthesis

License

Apache-2.0

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
lib		lib
test		test
.formatter.exs		.formatter.exs
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
NOTICE		NOTICE
README.md		README.md
mix.exs		mix.exs
mix.lock		mix.lock

Folders and files

Latest commit

History

Repository files navigation

ExAgent

Features

Installation

Quick Start

Providers

OpenAI

Gemini

DeepSeek

Core Concepts

Message

Tool

Context

Skill

Agent Lifecycle

File Attachments

Supported File Types

File Uploads

Upload and Reference (OpenAI)

Upload and Reference (Gemini)

Upload Raw Binary Data

Mix Inline and Uploaded Files

Built-in Provider Tools

Gemini

OpenAI

DeepSeek

Tool Calling

Multi-Agent Patterns

1. Subagents (Centralized Orchestration)

2. Skills (Progressive Disclosure)

3. Handoffs (State-Driven Transitions)

4. Router (Parallel Dispatch & Synthesis)

Adding a Custom Provider

Architecture

Supervision Tree

Design Decisions

Project Structure

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 1

Languages

Packages