-
Notifications
You must be signed in to change notification settings - Fork 4
Production Deployment
Move from development to production with confidence. This guide covers queue integration, process management, resource limits, monitoring, scaling, and operational checklists for running the Claude Agent SDK in a production Laravel application.
The SDK spawns a Claude Code CLI subprocess for every query. Each subprocess:
- Takes 5--120 seconds to complete depending on complexity
- Consumes 50--200 MB of memory
- Blocks the calling thread until completion
This means synchronous HTTP requests are not viable for most use cases. Production deployments should offload agent queries to background jobs via Laravel's queue system.
A typical web request has a 30-second timeout. Agent queries routinely exceed that. Running them synchronously will:
- Cause HTTP timeouts and 504 errors
- Block web server workers, reducing throughput
- Leave users staring at a spinner with no feedback
Laravel's queue system solves all three problems: the HTTP request returns immediately, the agent runs in a dedicated worker process, and real-time updates can be broadcast to the client.
use ClaudeAgentSDK\ClaudeAgentManager;
use ClaudeAgentSDK\Messages\AssistantMessage;
use ClaudeAgentSDK\Messages\ResultMessage;
use ClaudeAgentSDK\Options\ClaudeAgentOptions;
use ClaudeAgentSDK\Exceptions\ClaudeAgentException;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
class RunAgentJob implements ShouldQueue
{
use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;
public int $timeout = 180; // Must exceed process_timeout
public int $tries = 2;
public function __construct(
public string $prompt,
public int $userId,
public string $taskId,
) {}
public function handle(ClaudeAgentManager $agent): void
{
$options = ClaudeAgentOptions::make()
->tools(['Read', 'Grep', 'Glob'])
->permission('dontAsk')
->maxBudgetUsd(1.00)
->maxTurns(15)
->user((string) $this->userId);
try {
$result = $agent->streamCollect(
prompt: $this->prompt,
onMessage: function ($message) {
if ($message instanceof AssistantMessage) {
broadcast(new AgentProgressEvent(
$this->userId,
$this->taskId,
$message->text(),
));
}
},
options: $options,
);
broadcast(new AgentCompleteEvent(
$this->userId,
$this->taskId,
$result->text(),
$result->costUsd(),
$result->turns(),
));
Log::info('Agent job completed', [
'task_id' => $this->taskId,
'cost_usd' => $result->costUsd(),
'turns' => $result->turns(),
'duration' => $result->durationMs(),
]);
} catch (ClaudeAgentException $e) {
broadcast(new AgentErrorEvent($this->userId, $this->taskId, $e->getMessage()));
throw $e; // Let the queue retry or fail
}
}
}Warning: The job's
$timeoutproperty must exceed the SDK'sprocess_timeoutconfig value. If the queue kills the job before the CLI finishes, you get an orphaned process and a lost result.
use Illuminate\Http\Request;
class AgentController extends Controller
{
public function run(Request $request)
{
$request->validate(['prompt' => 'required|string|max:5000']);
$taskId = (string) Str::uuid();
RunAgentJob::dispatch($request->prompt, auth()->id(), $taskId);
return response()->json(['task_id' => $taskId], 202);
}
}The simplest pattern. Dispatch a job and let it run. Results are stored (database, cache, etc.) for later retrieval.
RunAgentJob::dispatch($prompt, $userId, $taskId);
// HTTP response returns immediatelyCombine queue jobs with Laravel Broadcasting for live updates via WebSockets:
// Client-side (Laravel Echo)
Echo.private(`agent.${userId}`)
.listen('AgentProgressEvent', (e) => appendText(e.text))
.listen('AgentCompleteEvent', (e) => showResult(e.result))
.listen('AgentErrorEvent', (e) => showError(e.message));See the full job example above for the server-side broadcasting calls.
For simpler frontends that cannot use WebSockets:
// Store result when job completes
Cache::put("agent-result:{$taskId}", [
'status' => 'complete',
'text' => $result->text(),
'cost' => $result->costUsd(),
], now()->addHours(1));
// Polling endpoint
Route::get('/agent/status/{taskId}', function (string $taskId) {
$data = Cache::get("agent-result:{$taskId}", ['status' => 'pending']);
return response()->json($data);
});const poll = setInterval(async () => {
const res = await fetch(`/agent/status/${taskId}`);
const data = await res.json();
if (data.status === 'complete') {
clearInterval(poll);
showResult(data.text);
}
}, 2000);Use Supervisor to keep queue workers running reliably:
[program:agent-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/app/artisan queue:work --queue=agent --timeout=200 --memory=256 --tries=2
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
numprocs=4
redirect_stderr=true
stdout_logfile=/var/log/supervisor/agent-worker.log
stopwaitsecs=210Key settings:
| Setting | Value | Rationale |
|---|---|---|
--timeout |
200 | Must exceed the longest possible agent query |
--memory |
256 | Worker restarts if PHP process exceeds 256 MB |
stopwaitsecs |
210 | Supervisor waits longer than --timeout before force-killing |
numprocs |
4 | Number of parallel agent workers (tune based on server resources) |
Tip: Use a dedicated queue name (e.g.
agent) so agent jobs do not compete with fast jobs like email sending or notifications.
Timeouts must be ordered correctly to avoid orphaned processes:
process_timeout (SDK) < job $timeout < queue --timeout < stopwaitsecs (Supervisor)
120s 180s 200s 210s
use ClaudeAgentSDK\Options\ClaudeAgentOptions;
$options = ClaudeAgentOptions::make()
->maxBudgetUsd(1.00) // Hard spend ceiling per query
->maxTurns(15) // Prevent infinite agent loops
->maxThinkingTokens(16000); // Cap extended-thinking token usage// config/claude-agent.php
'process_timeout' => env('CLAUDE_AGENT_TIMEOUT', 120), // secondsProtect against abuse by rate-limiting agent requests per user:
// In your controller or middleware
RateLimiter::for('agent', function (Request $request) {
return Limit::perMinute(5)->by($request->user()->id);
});
Route::post('/agent/run', [AgentController::class, 'run'])
->middleware('throttle:agent');Track spend across all queries and alert on anomalies:
// After each query
$cost = $result->costUsd();
// Store for aggregation
DB::table('agent_costs')->insert([
'user_id' => $userId,
'task_id' => $taskId,
'cost_usd' => $cost,
'model' => $result->modelUsage() ? array_key_first($result->modelUsage()) : null,
'created_at' => now(),
]);
// Alert on high single-query cost
if ($cost > 5.00) {
Log::warning('High agent cost detected', ['task_id' => $taskId, 'cost' => $cost]);
// Notify::send(new HighCostAlert($taskId, $cost));
}
// Daily aggregation check (scheduled command)
$dailyCost = DB::table('agent_costs')
->whereDate('created_at', today())
->sum('cost_usd');
if ($dailyCost > 100.00) {
Log::critical('Daily agent spend exceeds $100', ['total' => $dailyCost]);
}Log::info('Agent performance', [
'task_id' => $taskId,
'duration_ms' => $result->durationMs(),
'turns' => $result->turns(),
'cost_usd' => $result->costUsd(),
'tools_used' => collect($result->toolUses())->pluck('name')->countBy()->toArray(),
'cache_read' => $result->cacheReadTokens(),
'cache_write' => $result->cacheCreationTokens(),
]);Track queue depth to detect backlogs:
// Scheduled command running every minute
$pending = DB::table('jobs')->where('queue', 'agent')->count();
if ($pending > 50) {
Log::warning('Agent queue backlog', ['pending' => $pending]);
}use ClaudeAgentSDK\Exceptions\CliNotFoundException;
use ClaudeAgentSDK\Exceptions\ProcessException;
use ClaudeAgentSDK\Exceptions\ClaudeAgentException;
try {
$result = $agent->query($prompt, $options);
if ($result->isError()) {
Log::warning('Agent returned error result', [
'task_id' => $taskId,
'text' => mb_substr($result->text(), 0, 500),
]);
}
} catch (CliNotFoundException $e) {
Log::critical('Claude CLI not found -- agent is non-functional', [
'message' => $e->getMessage(),
]);
// Page on-call: this means no agent queries will work
} catch (ProcessException $e) {
Log::error('Agent process failed', [
'task_id' => $taskId,
'exit_code' => $e->exitCode,
'stderr' => mb_substr($e->stderr, 0, 1000),
]);
} catch (ClaudeAgentException $e) {
Log::error('Agent SDK error', ['task_id' => $taskId, 'error' => $e->getMessage()]);
}Tip: Monitor the
isError()rate as a percentage of total queries. A sudden spike may indicate a model outage, rate limit, or configuration issue.
Each agent query spawns a separate CLI process. This has direct implications for scaling:
| Dimension | Consideration |
|---|---|
| CPU | CLI processes are mostly I/O-bound (waiting for API responses), so CPU is rarely the bottleneck |
| Memory | Each process uses 50--200 MB. Four concurrent workers need 200--800 MB of headroom |
| Concurrency | Scale horizontally by adding more queue workers across multiple servers |
| Network | Each process makes outbound API calls. Ensure your network allows sufficient concurrent connections |
Warning: Each agent process consumes 50--200 MB of memory. Running too many concurrent workers on a single server will cause OOM kills. Monitor memory usage and set the
--memoryflag on queue workers.
Server A: 4 agent workers (1 GB reserved for agents)
Server B: 4 agent workers (1 GB reserved for agents)
Server C: 4 agent workers (1 GB reserved for agents)
= 12 concurrent agent queries
Use a shared queue backend (Redis, SQS, database) so all servers pull from the same pool.
Before going live, verify every item:
- Claude Code CLI is installed and accessible to the worker user (
which claude) -
ANTHROPIC_API_KEY(or provider credentials) is set via secrets manager -
max_budget_usdis configured to prevent runaway spend -
permission_modeis set todontAskwith a minimalallowed_toolslist -
process_timeoutis set and the timeout chain is correct - Queue workers are running with Supervisor (or equivalent)
- Monitoring is in place for cost, performance, and errors
- Error handling catches all
ClaudeAgentExceptionsubtypes - Audit logging records every query with user, prompt, cost, and result
- Input validation sanitizes user-supplied prompts
- Working directory is scoped (not
/or a sensitive path) - Rate limiting is applied to agent endpoints
npm install -g @anthropic-ai/claude-code
which claude # Verify: /usr/local/bin/claude or similarFROM php:8.2-cli
# Install Node.js (required for Claude CLI)
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
&& apt-get install -y nodejs
# Install Claude CLI
RUN npm install -g @anthropic-ai/claude-code
# Verify installation
RUN claude --version
# Copy application
COPY . /var/www/app
WORKDIR /var/www/appIf the CLI is installed in a non-standard location:
CLAUDE_AGENT_CLI_PATH=/opt/claude/bin/claude// config/claude-agent.php
'cli_path' => env('CLAUDE_AGENT_CLI_PATH', null), // null = auto-detect from $PATH// In your job class
public function handle(ClaudeAgentManager $agent): void
{
// Register a shutdown handler to stop the CLI process cleanly
pcntl_signal(SIGTERM, function () use ($agent) {
Log::info('Received SIGTERM, stopping agent gracefully');
$agent->stop();
});
$result = $agent->query($this->prompt, $options);
// ...
}The stop() method sends SIGINT to the CLI subprocess, allowing it to clean up before exiting.
class RunAgentJob implements ShouldQueue
{
public int $tries = 3;
public array $backoff = [10, 30, 60]; // seconds between retries
public function retryUntil(): DateTime
{
return now()->addMinutes(10);
}
public function failed(\Throwable $e): void
{
Log::critical('Agent job permanently failed', [
'task_id' => $this->taskId,
'error' => $e->getMessage(),
]);
broadcast(new AgentErrorEvent(
$this->userId,
$this->taskId,
'The agent was unable to complete your request. Please try again later.',
));
}
}Note: Not all failures are retryable. A
CliNotFoundExceptionwill fail on every retry -- use thefailed()method to notify users and operators immediately.
use ClaudeAgentSDK\Facades\ClaudeAgent;
use ClaudeAgentSDK\Options\ClaudeAgentOptions;
Route::get('/health/agent', function () {
try {
$result = ClaudeAgent::query(
'Reply with OK',
ClaudeAgentOptions::make()
->maxTurns(1)
->maxBudgetUsd(0.01)
->tools([]),
);
return response()->json([
'status' => $result->isSuccess() ? 'healthy' : 'degraded',
'latency' => $result->durationMs(),
]);
} catch (\Throwable $e) {
return response()->json([
'status' => 'unhealthy',
'error' => $e->getMessage(),
], 503);
}
});- Security Guide -- Permission modes, sandboxing, and input validation
- Configuration -- Full config reference and environment-specific settings
- Streaming -- Real-time output patterns for SSE and WebSocket integrations
- Error Handling -- Comprehensive exception handling strategies
- Custom API Providers -- Configure Bedrock, Vertex AI, or custom endpoints
- Hooks -- Pre/post tool-use hooks for validation and auditing