ExAgent

An Elixir library for building multi-agent LLM applications. ExAgent abstracts calls to various LLM providers (OpenAI, Gemini, DeepSeek) via an extensible Protocol and orchestrates them using OTP primitives with four multi-agent design patterns.

Hex Docs

Features

  • Protocol-based LLM abstraction — Swap providers without changing application code
  • Built on OTP — Agents backed by GenServers, supervised processes, async Tasks
  • Automatic tool execution — Define tools once, the agent loops LLM calls until complete
  • 4 multi-agent patterns — Subagents, Skills, Handoffs, Router
  • HTTP via Req — Clean, composable HTTP with built-in JSON encoding and auth
  • Multimodal file attachments — Send images, PDFs, and other files alongside chat messages
  • Extensible — Add any LLM provider by implementing a single protocol

Installation

Add ex_agent to your list of dependencies in mix.exs:

def deps do
  [
    {:ex_agent, "~> 0.1.0"}
  ]
end

Quick Start

# 1. Create a provider
provider = ExAgent.Providers.OpenAI.new(api_key: System.get_env("OPENAI_API_KEY"))

# 2. Start an agent
{:ok, agent} = ExAgent.start_agent(provider: provider)

# 3. Chat
{:ok, response} = ExAgent.chat(agent, "What is Elixir?")
IO.puts(response.content)

Providers

ExAgent ships with three built-in providers. Each is configured via new/1 and automatically initializes a Req HTTP client.

OpenAI

Supports chat and file attachments (images are sent as image_url content parts).

provider = ExAgent.Providers.OpenAI.new(
  api_key: "sk-...", # required
  model: "gpt-4o", # default: "gpt-4o"
  base_url: "https://api.openai.com/v1",  # default
  system_prompt: "You are a helpful assistant."
)

Gemini

Supports chat and file attachments (images, PDFs, etc. via inline_data format).

provider = ExAgent.Providers.Gemini.new(
  api_key: "AIza...", # required
  model: "gemini-2.0-flash", # default: "gemini-2.0-flash"
  system_prompt: "Be concise."
)

DeepSeek

Supports chat and tool calling. File attachments are silently ignored (DeepSeek API does not support multimodal input).

provider = ExAgent.Providers.DeepSeek.new(
  api_key: "sk-...",  # required
  model: "deepseek-chat", # default: "deepseek-chat"
  system_prompt: "You are a coding expert."
)

Core Concepts

Message

Represents a single message in a conversation.

{:ok, msg} = ExAgent.Message.new(role: :user, content: "Hello!")
# Supported roles: :system, :user, :assistant, :tool

Tool

Defines a function the LLM can invoke, with JSON Schema parameters.

{:ok, tool} = ExAgent.Tool.new(
  name: "get_weather",
  description: "Get current weather for a city",
  parameters: %{
    "type" => "object",
    "properties" => %{
      "city" => %{"type" => "string", "description" => "City name"}
    },
    "required" => ["city"]
  },
  function: fn %{"city" => city} ->
    {:ok, "#{city}: 22C, sunny"}
  end
)
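Because the tool function above pattern-matches on the "city" key, a call from the LLM that omits it would raise a FunctionClauseError. The sketch below shows one way such a call could be checked against the schema's "required" list first. The module ToolArgsSketch and its function are hypothetical and not part of ExAgent; whether ExAgent performs a similar check internally is not stated here.

```elixir
defmodule ToolArgsSketch do
  @moduledoc "Hypothetical pre-check; not part of ExAgent."

  # Returns :ok, or the list of required keys missing from the arguments.
  def check_required(%{"required" => required}, args) when is_map(args) do
    case Enum.reject(required, &Map.has_key?(args, &1)) do
      [] -> :ok
      missing -> {:error, {:missing_arguments, missing}}
    end
  end

  # Schemas without a "required" list accept any argument map.
  def check_required(_schema, _args), do: :ok
end

schema = %{
  "type" => "object",
  "properties" => %{"city" => %{"type" => "string"}},
  "required" => ["city"]
}

complete = ToolArgsSketch.check_required(schema, %{"city" => "Tokyo"})
incomplete = ToolArgsSketch.check_required(schema, %{})
IO.inspect({complete, incomplete})
```

Returning an error tuple instead of raising lets the failure be passed back to the LLM as a tool result, so the model can retry with corrected arguments.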

Context

Portable conversation state with message history and metadata.

context = ExAgent.Context.new(metadata: %{session_id: "abc123"})

{:ok, msg} = ExAgent.Message.new(role: :user, content: "Hello")
context = ExAgent.Context.add_message(context, msg)

# Get the last assistant response
last = ExAgent.Context.get_last_assistant_message(context)

Skill

A loadable persona with its own system prompt, tools, and activation function.

{:ok, sql_skill} = ExAgent.Skill.new(
  name: "sql_expert",
  system_prompt: "You are a SQL expert. Help users write queries.",
  tools: [sql_tool],
  activation_fn: fn ctx ->
    Enum.any?(ctx.messages, fn m ->
      String.contains?(m.content, "SQL") or String.contains?(m.content, "SELECT")
    end)
  end
)

Agent Lifecycle

Agents are GenServer processes managed by a DynamicSupervisor.

# Start an agent with tools and skills
{:ok, agent} = ExAgent.start_agent(
  provider: provider,
  id: "my-agent",
  tools: [weather_tool, search_tool],
  skills: [sql_skill]
)

# Synchronous chat
{:ok, response} = ExAgent.chat(agent, "What's the weather in Tokyo?")
IO.puts(response.content)

# Asynchronous chat
task = ExAgent.chat_async(agent, "Tell me a story")
{:ok, response} = Task.await(task)

# Inspect conversation history
context = ExAgent.get_context(agent)
Enum.each(context.messages, fn msg ->
  IO.puts("#{msg.role}: #{msg.content}")
end)

# Reset conversation
ExAgent.reset(agent)

# Stop the agent
ExAgent.stop_agent(agent)

File Attachments

Send images, PDFs, and other files alongside chat messages. Files become part of the conversation context, so the LLM can reference them in follow-up messages. You can either send files inline (base64-encoded) or upload them first for better performance.

# Attach a file by path (inline base64)
{:ok, response} = ExAgent.chat(agent, "Describe this image",
  files: [%{path: "photo.jpg", mime_type: "image/jpeg"}])

# Attach raw binary data (inline base64)
image_data = File.read!("diagram.png")
{:ok, response} = ExAgent.chat(agent, "What's in this diagram?",
  files: [%{data: image_data, mime_type: "image/png"}])

# Multiple files of any type
{:ok, response} = ExAgent.chat(agent, "Summarize these documents",
  files: [
    %{path: "report.pdf", mime_type: "application/pdf"},
    %{path: "data.csv", mime_type: "text/csv"},
    %{path: "notes.md", mime_type: "text/markdown"}
  ])

# Files persist in conversation context — the LLM remembers them
{:ok, _} = ExAgent.chat(agent, "Now focus on the second document")

Supported File Types

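| Type     | MIME Type       | OpenAI | Gemini | DeepSeek |
|----------|-----------------|--------|--------|----------|
| JPEG     | image/jpeg      | Yes    | Yes    | No       |
| PNG      | image/png       | Yes    | Yes    | No       |
| GIF      | image/gif       | Yes    | Yes    | No       |
| WebP     | image/webp      | Yes    | Yes    | No       |
| PDF      | application/pdf | Yes    | Yes    | No       |
| TXT      | text/plain      | Yes    | Yes    | No       |
| Markdown | text/markdown   | Yes    | Yes    | No       |
| CSV      | text/csv        | Yes    | Yes    | No       |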

Note: DeepSeek does not support multimodal input. File attachments on DeepSeek agents are silently ignored.
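As a rough illustration of what "inline base64" means, the sketch below encodes raw bytes into the two payload shapes named in the Providers section. The module InlineAttachment and its functions are hypothetical and not part of ExAgent. The map shapes follow the public OpenAI (image_url data URL) and Gemini (inline_data) request formats, which is an assumption about what the providers send on the wire.

```elixir
defmodule InlineAttachment do
  @moduledoc "Hypothetical helper; not part of ExAgent."

  # OpenAI-style content part: the image travels as a base64 data URL.
  def to_openai_part(data, mime_type) when is_binary(data) do
    %{
      "type" => "image_url",
      "image_url" => %{"url" => "data:#{mime_type};base64,#{Base.encode64(data)}"}
    }
  end

  # Gemini-style part: MIME type and base64 payload are separate fields.
  def to_gemini_part(data, mime_type) when is_binary(data) do
    %{"inline_data" => %{"mime_type" => mime_type, "data" => Base.encode64(data)}}
  end
end

# The first four bytes of a PNG header stand in for a real file.
bytes = <<137, 80, 78, 71>>

openai_part = InlineAttachment.to_openai_part(bytes, "image/png")
gemini_part = InlineAttachment.to_gemini_part(bytes, "image/png")

IO.inspect(openai_part)
IO.inspect(gemini_part)
```

Base64 grows the payload by roughly a third, which is one reason uploading first can be the better choice for large files.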

File Uploads

For large files or when you want to reuse the same file across multiple conversations, upload the file first and reference it later. This avoids sending base64-encoded data with every chat request.

Upload and Reference (OpenAI)

provider = ExAgent.Providers.OpenAI.new(api_key: System.get_env("OPENAI_API_KEY"))

# Upload a file from disk
{:ok, ref} = ExAgent.upload_file(provider, "report.pdf", "application/pdf")

# Use the reference in chat — no base64 encoding, just a lightweight file ID
{:ok, agent} = ExAgent.start_agent(provider: provider)
{:ok, response} = ExAgent.chat(agent, "Summarize this report",
  files: [%{file_ref: ref}])

# Reuse the same reference in another message
{:ok, response} = ExAgent.chat(agent, "What are the key findings?",
  files: [%{file_ref: ref}])

Upload and Reference (Gemini)

provider = ExAgent.Providers.Gemini.new(api_key: System.get_env("GEMINI_API_KEY"))

# Upload a file — Gemini files expire after 48 hours
{:ok, ref} = ExAgent.upload_file(provider, "photo.jpg", "image/jpeg")

# Check if a reference has expired
ExAgent.FileRef.expired?(ref)

# Use in chat
{:ok, agent} = ExAgent.start_agent(provider: provider)
{:ok, response} = ExAgent.chat(agent, "Describe what you see",
  files: [%{file_ref: ref}])
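An expiry check like the one above can be pictured as a timestamp comparison. The following is a minimal sketch under stated assumptions: the UploadedRef struct and its expires_at field are hypothetical stand-ins, not the real ExAgent.FileRef layout, and the 48-hour window is taken from the comment in the example above.

```elixir
defmodule UploadedRef do
  @moduledoc "Hypothetical stand-in for a file reference; field names are assumptions."
  defstruct [:id, :expires_at]

  def expired?(ref, now \\ DateTime.utc_now())

  # A reference without an expiry timestamp never expires.
  def expired?(%__MODULE__{expires_at: nil}, _now), do: false

  # Expired once `now` is at or past the expiry timestamp.
  def expired?(%__MODULE__{expires_at: expires_at}, now) do
    DateTime.compare(now, expires_at) != :lt
  end
end

uploaded_at = ~U[2025-01-01 00:00:00Z]
ref = %UploadedRef{id: "files/abc", expires_at: DateTime.add(uploaded_at, 48 * 3600, :second)}

# 36 hours after upload
still_valid = UploadedRef.expired?(ref, ~U[2025-01-02 12:00:00Z])
# exactly 48 hours after upload
at_deadline = UploadedRef.expired?(ref, ~U[2025-01-03 00:00:00Z])

IO.inspect({still_valid, at_deadline})
```

A caller would typically re-upload the file and swap in the new reference when the check returns true.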

Upload Raw Binary Data

# If you already have the file contents in memory
image_bytes = File.read!("screenshot.png")
{:ok, ref} = ExAgent.upload_data(provider, image_bytes, "image/png",
  filename: "screenshot.png")

Mix Inline and Uploaded Files

# You can combine both approaches in a single message
{:ok, ref} = ExAgent.upload_file(provider, "large_video.mp4", "video/mp4")
{:ok, response} = ExAgent.chat(agent, "Compare these",
  files: [
    %{file_ref: ref},                                          # uploaded reference
    %{path: "small_image.jpg", mime_type: "image/jpeg"}        # inline base64
  ])

Built-in Provider Tools

Each LLM provider offers built-in tools that can be enabled via the built_in_tools option — either at agent creation (applies to all calls) or per-message (overrides agent default).

Gemini

# Google Search grounding — LLM can search the web for up-to-date info
{:ok, agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.Gemini.new(api_key: gemini_key),
  built_in_tools: [:google_search]
)
{:ok, response} = ExAgent.chat(agent, "What happened in tech news today?")

# Code execution — LLM can write and run Python code
{:ok, response} = ExAgent.chat(agent, "Calculate fibonacci(20)",
  built_in_tools: [:code_execution])

# URL context — LLM can fetch and analyze web pages
{:ok, response} = ExAgent.chat(agent, "Summarize this page",
  built_in_tools: [:url_context])

# Combine multiple built-in tools
{:ok, response} = ExAgent.chat(agent, "Research and compute",
  built_in_tools: [:google_search, :code_execution])

Available Gemini built-in tools: :google_search, :code_execution, :url_context

OpenAI

# Web search — LLM can search the web
{:ok, agent} = ExAgent.start_agent(
  provider: provider,
  built_in_tools: [:web_search]
)
{:ok, response} = ExAgent.chat(agent, "What are the latest Elixir releases?")

# Web search with user location for localized results
{:ok, response} = ExAgent.chat(agent, "Best restaurants nearby",
  built_in_tools: [%{web_search: %{"city" => "San Francisco", "country" => "US", "region" => "California"}}])

Available OpenAI built-in tools: :web_search

DeepSeek

# Thinking/reasoning mode — enables chain-of-thought reasoning
{:ok, agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.DeepSeek.new(
    api_key: deepseek_key,
    model: "deepseek-reasoner"
  ),
  built_in_tools: [:thinking]
)
{:ok, response} = ExAgent.chat(agent, "Solve this step by step: if x^2 + 3x - 10 = 0, what is x?")

Available DeepSeek built-in tools: :thinking

Tool Calling

When you provide tools to an agent, the LLM can invoke them automatically. The agent runs a tool execution loop:

  1. Sends messages + tool definitions to the LLM
  2. If the LLM returns a tool_call, the agent executes the matching function
  3. Appends the tool result as a :tool message
  4. Calls the LLM again with the updated context
  5. Repeats until the LLM returns a final text response (max 10 iterations)

{:ok, search_tool} = ExAgent.Tool.new(
  name: "web_search",
  description: "Search the web for information",
  parameters: %{
    "type" => "object",
    "properties" => %{
      "query" => %{"type" => "string", "description" => "Search query"}
    },
    "required" => ["query"]
  },
  function: fn %{"query" => query} ->
    # Your search implementation here
    {:ok, "Results for: #{query}"}
  end
)

{:ok, calc_tool} = ExAgent.Tool.new(
  name: "calculator",
  description: "Evaluate a math expression",
  parameters: %{
    "type" => "object",
    "properties" => %{
      "expression" => %{"type" => "string"}
    },
    "required" => ["expression"]
  },
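  function: fn %{"expression" => expr} ->
    # WARNING: Code.eval_string/1 runs arbitrary Elixir code. The expression
    # comes from the LLM, so use a dedicated math parser outside of demos.
    {result, _} = Code.eval_string(expr)
    {:ok, to_string(result)}
  end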
)

{:ok, agent} = ExAgent.start_agent(
  provider: provider,
  tools: [search_tool, calc_tool]
)

# The LLM can now decide to call these tools during conversation
{:ok, response} = ExAgent.chat(agent, "What is 42 * 37?")
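The five-step loop described at the top of this section can be sketched in plain Elixir. This is an illustration of the technique, not ExAgent's implementation: ToolLoopSketch, the {:tool_call, ...} and {:final, ...} tuples, and the scripted fake_llm function are all hypothetical. Only the iteration cap of 10 comes from the text above.

```elixir
defmodule ToolLoopSketch do
  @moduledoc "Hypothetical illustration of a bounded tool loop; not ExAgent's code."
  @max_iterations 10

  # `llm` is any 1-arity function that takes the message list and returns
  # {:tool_call, name, args} or {:final, text}.
  def run(llm, tools, messages, iteration \\ 0)

  def run(_llm, _tools, _messages, iteration) when iteration >= @max_iterations do
    {:error, :max_iterations_reached}
  end

  def run(llm, tools, messages, iteration) do
    case llm.(messages) do
      {:final, text} ->
        {:ok, text, messages}

      {:tool_call, name, args} ->
        # Run the matching tool and append its result as a :tool message.
        result =
          case Map.fetch(tools, name) do
            {:ok, fun} -> fun.(args)
            :error -> {:error, "unknown tool: #{name}"}
          end

        tool_message = %{role: :tool, name: name, content: inspect(result)}
        run(llm, tools, messages ++ [tool_message], iteration + 1)
    end
  end
end

# A scripted stand-in for the LLM: request the tool once, then answer.
fake_llm = fn messages ->
  case Enum.find(messages, &(&1.role == :tool)) do
    nil -> {:tool_call, "multiply", %{"a" => 42, "b" => 37}}
    %{content: content} -> {:final, "The tool returned #{content}"}
  end
end

tools = %{"multiply" => fn %{"a" => a, "b" => b} -> {:ok, a * b} end}
messages = [%{role: :user, content: "What is 42 * 37?"}]

result = ToolLoopSketch.run(fake_llm, tools, messages)
IO.inspect(result)
```

The cap matters because a model that keeps requesting tools would otherwise loop forever; the sketch returns an error tuple in that case.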

Multi-Agent Patterns

1. Subagents (Centralized Orchestration)

A main orchestrator agent delegates work to specialized subagents. Each subagent runs in isolation with a fresh context — no state leaks between calls.

alias ExAgent.Patterns.Subagents

# Define specialized subagent specs
researcher = %{
  name: "researcher",
  description: "Research a topic and return findings",
  provider: ExAgent.Providers.Gemini.new(api_key: gemini_key),
  system_prompt: "You are a research specialist. Provide detailed findings.",
  tools: []
}

coder = %{
  name: "coder",
  description: "Write code based on specifications",
  provider: ExAgent.Providers.OpenAI.new(api_key: openai_key),
  system_prompt: "You are an expert programmer. Write clean, tested code.",
  tools: []
}

# Convert subagent specs into tools for the orchestrator
orchestrator_tools = Subagents.build_orchestrator_tools([researcher, coder])

# The orchestrator uses these as regular tools — when the LLM calls
# "researcher" or "coder", it spawns an ephemeral subagent call
{:ok, orchestrator} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: openai_key,
    system_prompt: "You orchestrate tasks. Use the researcher for facts and the coder for code."
  ),
  tools: orchestrator_tools
)

{:ok, response} = ExAgent.chat(orchestrator, "Research Elixir GenServers and write an example")

# You can also invoke subagents directly
{:ok, result} = Subagents.invoke_subagent(researcher, "Explain OTP supervision trees")

# Or invoke multiple in parallel
results = Subagents.invoke_subagents_parallel([
  {researcher, "What is GenServer?"},
  {coder, "Write a GenServer example"}
])
# => [{"researcher", {:ok, "GenServer is..."}}, {"coder", {:ok, "defmodule..."}}]
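To show what "isolation with a fresh context" could look like, here is a minimal sketch that runs each job in its own Task with its own message list. ParallelSketch and the run_fn field are hypothetical; run_fn stands in for a real LLM call. The sketch uses Task.async for brevity, while the Architecture section describes supervised Tasks.

```elixir
defmodule ParallelSketch do
  @moduledoc "Hypothetical illustration; `run_fn` stands in for a real LLM call."

  # Each job gets its own Task and a freshly built message list, so nothing
  # is shared between invocations. Results come back in input order.
  def invoke_all(jobs, timeout \\ 5_000) do
    jobs
    |> Enum.map(fn {spec, prompt} ->
      Task.async(fn ->
        fresh_messages = [
          %{role: :system, content: spec.system_prompt},
          %{role: :user, content: prompt}
        ]

        {spec.name, spec.run_fn.(fresh_messages)}
      end)
    end)
    |> Task.await_many(timeout)
  end
end

# A stand-in "model" that only reports how many messages it received.
echo = fn messages -> {:ok, "saw #{length(messages)} messages"} end

researcher = %{name: "researcher", system_prompt: "Research.", run_fn: echo}
coder = %{name: "coder", system_prompt: "Code.", run_fn: echo}

results =
  ParallelSketch.invoke_all([
    {researcher, "What is GenServer?"},
    {coder, "Write a GenServer example"}
  ])

IO.inspect(results)
```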

2. Skills (Progressive Disclosure)

A single agent dynamically loads specialized system prompts and tools based on conversation context. Skills are evaluated before each LLM call.

# Define skills with activation functions
{:ok, sql_skill} = ExAgent.Skill.new(
  name: "sql_expert",
  system_prompt: "You are a SQL expert. Help users write and optimize queries.",
  tools: [sql_execute_tool],
  activation_fn: fn ctx ->
    ctx.messages
    |> Enum.any?(fn m ->
      String.match?(m.content, ~r/SQL|SELECT|INSERT|UPDATE|DELETE|database/i)
    end)
  end
)

{:ok, python_skill} = ExAgent.Skill.new(
  name: "python_expert",
  system_prompt: "You are a Python expert. Write idiomatic Python code.",
  tools: [python_run_tool],
  activation_fn: fn ctx ->
    ctx.messages
    |> Enum.any?(fn m -> String.contains?(m.content, "Python") end)
  end
)

# Start agent with skills — it begins as a generalist
{:ok, agent} = ExAgent.start_agent(
  provider: provider,
  skills: [sql_skill, python_skill]
)

# When the user mentions SQL, the sql_expert skill activates automatically
{:ok, response} = ExAgent.chat(agent, "Help me write a SQL query to find active users")
# => Agent now uses the sql_expert system prompt and tools

# Skills can also be loaded dynamically at runtime
{:ok, new_skill} = ExAgent.Skill.new(name: "devops", system_prompt: "You are a DevOps expert.")
ExAgent.Agent.load_skill(agent, new_skill)
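Evaluating skills before each LLM call amounts to filtering them by their activation functions and deriving the prompt and tools for that turn. The sketch below is one possible reading, not ExAgent's code: SkillSelectionSketch is hypothetical, and appending skill prompts to a base prompt is an assumption, since the library may replace the prompt instead.

```elixir
defmodule SkillSelectionSketch do
  @moduledoc "Hypothetical illustration of skill activation; not ExAgent's code."

  # Keeps the skills whose activation function returns true for the context.
  def active_skills(skills, context) do
    Enum.filter(skills, fn skill -> skill.activation_fn.(context) end)
  end

  # Assumption: active skill prompts are appended to the base prompt and
  # their tools are added to the base tools.
  def effective_config(base_prompt, base_tools, skills, context) do
    active = active_skills(skills, context)

    %{
      system_prompt: Enum.join([base_prompt | Enum.map(active, & &1.system_prompt)], "\n\n"),
      tools: base_tools ++ Enum.flat_map(active, & &1.tools),
      active: Enum.map(active, & &1.name)
    }
  end
end

# Builds an activation function that matches message text against a regex.
# The is_binary/1 guard skips messages without text content.
mentions = fn pattern ->
  fn ctx -> Enum.any?(ctx.messages, &(is_binary(&1.content) and &1.content =~ pattern)) end
end

skills = [
  %{
    name: "sql_expert",
    system_prompt: "You are a SQL expert.",
    tools: [:sql_execute],
    activation_fn: mentions.(~r/SQL|SELECT/i)
  },
  %{
    name: "python_expert",
    system_prompt: "You are a Python expert.",
    tools: [:python_run],
    activation_fn: mentions.(~r/Python/)
  }
]

context = %{messages: [%{role: :user, content: "Help me write a SQL query"}]}
config = SkillSelectionSketch.effective_config("You are a helpful assistant.", [], skills, context)
IO.inspect(config)
```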

3. Handoffs (State-Driven Transitions)

The active agent changes dynamically. When the LLM invokes a handoff tool, control transfers to a different agent. The caller receives a {:handoff, target, context} tuple and decides where to route subsequent messages.

alias ExAgent.Patterns.Handoff

# Start specialized agents
{:ok, sales_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "You are a sales specialist."
  )
)

{:ok, support_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "You are a technical support specialist."
  )
)

# Build handoff tools
handoff_to_support = Handoff.build_handoff_tool(
  "support",
  support_agent,
  "Transfer to technical support when the user has a technical issue"
)

handoff_to_sales = Handoff.build_handoff_tool(
  "sales",
  sales_agent,
  "Transfer to sales when the user wants to buy something"
)

# Start a triage agent with handoff tools
{:ok, triage_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "You are a triage agent. Route users to the right department."
  ),
  tools: [handoff_to_support, handoff_to_sales]
)

# When the LLM decides to hand off, you get a handoff tuple
case ExAgent.chat(triage_agent, "My app keeps crashing") do
  {:ok, response} ->
    # Normal response — agent handled it directly
    IO.puts("Normal response — agent handled it directly")
    IO.puts(response.content)

  {:handoff, target_pid, context} ->
    # Transfer context and continue with the new agent
    ExAgent.handoff(target_pid, context)
    {:ok, response} = ExAgent.chat(target_pid, "My app keeps crashing")
    IO.puts("Transfer context and continue with the new agent")
    IO.puts(response.content)
end
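When several agents can hand off to one another, the case expression above generalizes to a small driver loop. The following sketch assumes agents are plain functions that return {:ok, reply} or {:handoff, name, context}; HandoffSketch and the hop limit are hypothetical additions meant to stop two agents from handing off to each other indefinitely.

```elixir
defmodule HandoffSketch do
  @moduledoc "Hypothetical driver loop; agents are plain functions here."
  @max_hops 3

  # `agents` maps a name to a 2-arity function (input, context) that returns
  # {:ok, reply} or {:handoff, target_name, new_context}.
  def converse(agents, current, input, context \\ [], hops \\ 0)

  def converse(_agents, _current, _input, _context, hops) when hops > @max_hops do
    {:error, :too_many_handoffs}
  end

  def converse(agents, current, input, context, hops) do
    case Map.fetch!(agents, current).(input, context) do
      {:ok, reply} -> {:ok, current, reply}
      {:handoff, target, new_context} -> converse(agents, target, input, new_context, hops + 1)
    end
  end
end

agents = %{
  "triage" => fn input, context ->
    if String.contains?(input, "crash") do
      {:handoff, "support", context ++ [{:user, input}]}
    else
      {:ok, "How can I help?"}
    end
  end,
  "support" => fn _input, context ->
    {:ok, "Support here, I see #{length(context)} earlier message(s)."}
  end
}

result = HandoffSketch.converse(agents, "triage", "My app keeps crashing")
IO.inspect(result)
```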

4. Router (Parallel Dispatch & Synthesis)

Classifies input, dispatches to multiple specialized agents in parallel, and synthesizes results into a single response.

alias ExAgent.Patterns.Router

# Start specialized agents
{:ok, code_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "Analyze code quality and suggest improvements."
  )
)

{:ok, security_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.OpenAI.new(
    api_key: key,
    system_prompt: "Analyze code for security vulnerabilities."
  )
)

{:ok, perf_agent} = ExAgent.start_agent(
  provider: ExAgent.Providers.Gemini.new(
    api_key: gemini_key,
    system_prompt: "Analyze code for performance issues."
  )
)

# Define routes with match functions
routes = [
  %{name: "code_quality", agent: code_agent, match_fn: fn _ -> true end},
  %{name: "security", agent: security_agent, match_fn: &String.contains?(&1, "security")},
  %{name: "performance", agent: perf_agent, match_fn: &String.contains?(&1, "performance")}
]

# Route dispatches to all matching agents in parallel
{:ok, result} = ExAgent.route(
  "Review this code for security and performance issues: def fetch(url), do: HTTPoison.get!(url)",
  routes: routes,
  timeout: 30_000
)

IO.puts(result)
# ## code_quality
# The function lacks error handling...
#
# ## security
# Using get! will raise on HTTP errors...
#
# ## performance
# Consider connection pooling...

# Custom synthesizer
{:ok, result} = ExAgent.route("analyze this code",
  routes: routes,
  synthesizer: fn _input, results ->
    results
    |> Enum.map(fn {name, content} -> "**#{name}**: #{content}" end)
    |> Enum.join("\n\n")
  end
)
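The classify, dispatch, and synthesize steps can be sketched with Task.async_stream/3 from the standard library. This is an illustration under stated assumptions rather than ExAgent's implementation: RouterSketch is hypothetical, and each route carries a plain handler function in place of an agent.

```elixir
defmodule RouterSketch do
  @moduledoc "Hypothetical classify-dispatch-synthesize flow; not ExAgent's code."

  def route(input, routes, timeout \\ 5_000) do
    results =
      routes
      # 1. Classify: keep the routes whose match function accepts the input.
      |> Enum.filter(fn route -> route.match_fn.(input) end)
      # 2. Dispatch: run every matching handler concurrently, keeping order.
      |> Task.async_stream(
        fn route -> {route.name, route.handler.(input)} end,
        timeout: timeout
      )
      |> Enum.map(fn {:ok, pair} -> pair end)

    # 3. Synthesize: join the answers under one heading per route.
    Enum.map_join(results, "\n\n", fn {name, content} -> "## #{name}\n#{content}" end)
  end
end

routes = [
  %{name: "code_quality", match_fn: fn _ -> true end, handler: fn _ -> "Add error handling." end},
  %{
    name: "security",
    match_fn: &String.contains?(&1, "security"),
    handler: fn _ -> "Validate the URL." end
  },
  %{
    name: "performance",
    match_fn: &String.contains?(&1, "performance"),
    handler: fn _ -> "Reuse connections." end
  }
]

report = RouterSketch.route("Review this for security issues", routes)
IO.puts(report)
```

With the default options, a handler that exceeds the timeout causes the calling process to exit, so a production router would likely set on_timeout: :kill_task and handle the resulting {:exit, :timeout} entries.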

Adding a Custom Provider

Any LLM can be integrated by defining a struct and implementing the ExAgent.LlmProvider protocol:

defmodule MyApp.Providers.Anthropic do
  @moduledoc "Custom Anthropic Claude provider."

  defstruct [
    :api_key, :req,
    model: "claude-sonnet-4-20250514",
    base_url: "https://api.anthropic.com/v1",
    system_prompt: nil,
    tools: []
  ]

  def new(opts) do
    provider = struct!(__MODULE__, opts)
    %{provider | req: Req.new(
      base_url: provider.base_url,
      headers: [
        {"x-api-key", provider.api_key},
        {"anthropic-version", "2023-06-01"}
      ]
    )}
  end

  defimpl ExAgent.LlmProvider do
    def chat(provider, messages, _opts) do
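      # Note: Anthropic's Messages API accepts only "user" and "assistant"
      # roles in "messages"; a system prompt belongs in the top-level "system"
      # field. This minimal example forwards roles unchanged, so map or filter
      # :system and :tool messages before relying on it.
      body = %{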
        "model" => provider.model,
        "max_tokens" => 1024,
        "messages" => Enum.map(messages, fn msg ->
          %{"role" => to_string(msg.role), "content" => msg.content}
        end)
      }

      case Req.post(provider.req, url: "/messages", json: body) do
        {:ok, %Req.Response{status: 200, body: %{"content" => [%{"text" => text} | _]}}} ->
          {:ok, %ExAgent.Message{role: :assistant, content: text}}

        {:ok, %Req.Response{status: status, body: body}} ->
          {:error, {status, body}}

        {:error, reason} ->
          {:error, reason}
      end
    end
  end
end

# Use it like any other provider
provider = MyApp.Providers.Anthropic.new(api_key: "sk-ant-...")
{:ok, agent} = ExAgent.start_agent(provider: provider)
{:ok, response} = ExAgent.chat(agent, "Hello Claude!")

Architecture

Supervision Tree

Application (ex_agent)
  |
  ExAgent.AgentSupervisor (:one_for_one)
    |
    +-- ExAgent.AgentDynamicSupervisor (:one_for_one)
    |     |
    |     +-- ExAgent.Agent (id: "orchestrator")
    |     +-- ExAgent.Agent (id: "coder")
    |     +-- ExAgent.Agent (id: "reviewer")
    |     +-- ... (any runtime agents)
    |
    +-- ExAgent.TaskSupervisor
          |
          +-- Task (async chat calls)
          +-- Task (parallel subagent invocations)
          +-- Task (router parallel dispatch)
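The shape of this tree can be reproduced with standard OTP building blocks. The sketch below is a minimal stand-alone version: SketchAgent, SketchAgentDynamicSupervisor, and SketchTaskSupervisor are hypothetical names chosen to avoid implying anything about ExAgent's internal modules.

```elixir
defmodule SketchAgent do
  @moduledoc "Hypothetical minimal agent process; not ExAgent.Agent."
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts), do: {:ok, %{id: Keyword.fetch!(opts, :id), messages: []}}

  @impl true
  def handle_call({:chat, text}, _from, state) do
    # Calls are handled one at a time, so history updates never interleave.
    state = %{state | messages: state.messages ++ [text]}
    {:reply, {:ok, length(state.messages)}, state}
  end
end

children = [
  {DynamicSupervisor, name: SketchAgentDynamicSupervisor, strategy: :one_for_one},
  {Task.Supervisor, name: SketchTaskSupervisor}
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

# Agents are added at runtime under the DynamicSupervisor.
{:ok, agent} =
  DynamicSupervisor.start_child(SketchAgentDynamicSupervisor, {SketchAgent, id: "coder"})

# Async work runs as a supervised Task.
task =
  Task.Supervisor.async(SketchTaskSupervisor, fn ->
    GenServer.call(agent, {:chat, "hello"})
  end)

reply = Task.await(task)
IO.inspect(reply)
```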

Design Decisions

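  • Protocol dispatch — Provider structs implement ExAgent.LlmProvider, so calls are dispatched polymorphically on the provider's struct type (dispatch happens at runtime; Elixir consolidates protocols at compile time to keep it fast)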
  • Thin protocol impls — Protocol implementations delegate to service modules under services/, keeping HTTP logic separate
  • Tool loop in GenServer — The handle_call({:chat, ...}) contains the tool execution loop, processing one turn at a time to prevent race conditions on context
  • Subagents bypass GenServer — Ephemeral stateless calls use LlmProvider.chat/3 directly in supervised Tasks
  • Handoff returns to caller — Keeps agents decoupled; the caller decides routing after a handoff
  • Router is a plain module — Stateless classify-dispatch-synthesize flow needs no GenServer
  • All patterns share one Agent GenServer — Patterns augment behavior through state and tools, not separate process types
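Protocol dispatch, the first decision in the list above, can be seen in isolation with a toy protocol. ChatBackend, EchoBackend, and ShoutBackend are hypothetical names used only for this sketch; the real protocol is ExAgent.LlmProvider, whose chat function takes three arguments as shown in the custom provider example.

```elixir
defprotocol ChatBackend do
  @doc "Hypothetical protocol standing in for a provider abstraction."
  def chat(backend, prompt)
end

defmodule EchoBackend do
  defstruct prefix: "echo"
end

defmodule ShoutBackend do
  defstruct []
end

defimpl ChatBackend, for: EchoBackend do
  def chat(backend, prompt), do: {:ok, "#{backend.prefix}: #{prompt}"}
end

defimpl ChatBackend, for: ShoutBackend do
  def chat(_backend, prompt), do: {:ok, String.upcase(prompt)}
end

# The caller never names a concrete backend module; the struct type of the
# first argument selects the implementation.
ask = fn backend -> ChatBackend.chat(backend, "hello") end

echo_reply = ask.(%EchoBackend{})
shout_reply = ask.(%ShoutBackend{})
IO.inspect({echo_reply, shout_reply})
```

Swapping providers then only changes which struct is constructed, which matches the first item in the Features list.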

Project Structure

lib/
  ex_agent.ex                     # Public API facade
  ex_agent/
    llm_provider.ex               # LlmProvider protocol
    file_uploader.ex              # FileUploader protocol
    file_ref.ex                   # %FileRef{} struct (uploaded file reference)
    message.ex                    # %Message{} struct
    tool.ex                       # %Tool{} struct
    context.ex                    # %Context{} struct
    skill.ex                      # %Skill{} struct
    agent.ex                      # Agent GenServer
    supervisor.ex                 # AgentSupervisor
    dynamic_supervisor.ex         # AgentDynamicSupervisor
    providers/
      openai.ex                   # OpenAI provider + LlmProvider + FileUploader
      gemini.ex                   # Gemini provider + LlmProvider + FileUploader
      deep_seek.ex                # DeepSeek provider + LlmProvider
    services/
      openai_service.ex           # OpenAI chat HTTP calls via Req
      openai_upload_service.ex    # OpenAI file upload (POST /v1/files)
      gemini_service.ex           # Gemini chat HTTP calls via Req
      gemini_upload_service.ex    # Gemini file upload (Files API)
      deep_seek_service.ex        # DeepSeek HTTP calls via Req
    patterns/
      subagents.ex                # Centralized orchestration
      skills.ex                   # Progressive disclosure
      handoff.ex                  # State-driven transitions
      router.ex                   # Parallel dispatch & synthesis

License

Apache-2.0
