feat: implement WhatsApp Business Cloud API integration for Eliza App#320

Merged
lalalune merged 17 commits into dev from feat/eliza-app-whatsapp-support on Mar 8, 2026

@hanzlamateen
Collaborator

Summary

Full-stack WhatsApp integration for the Eliza App public bot. Users can message the WhatsApp number and are auto-provisioned on first contact; replies are generated via the default Eliza agent in ASSISTANT mode. Cross-platform account linking is automatic since WhatsApp IDs are phone numbers — existing Telegram/iMessage users with the same number are linked seamlessly.

Changes

API Routes

  • POST /api/eliza-app/auth/whatsapp — Issues JWT session tokens for WhatsApp-identified users. Supports standard lookup (user must first message the bot) and session-based linking (existing user attaches WhatsApp to their account via Bearer token).
  • GET /api/eliza-app/webhook/whatsapp — Meta webhook verification handshake (hub.mode, hub.verify_token, hub.challenge).
  • POST /api/eliza-app/webhook/whatsapp — Receives incoming WhatsApp messages, verifies HMAC-SHA256 signature, auto-provisions users, acquires distributed room lock, processes messages through the Eliza runtime in ASSISTANT mode, and sends responses back via the Cloud API.
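The GET handshake described above can be sketched as follows. This is a minimal illustration, not the PR's route handler: the function name, the result shape, and the choice of 403 for rejection are assumptions made for the example.

```typescript
// Hypothetical helper; names and return shape are illustrative only.
interface HandshakeResult {
  status: number;
  body: string;
}

function handleVerificationHandshake(
  query: URLSearchParams,
  expectedVerifyToken: string,
): HandshakeResult {
  const mode = query.get("hub.mode");
  const token = query.get("hub.verify_token");
  const challenge = query.get("hub.challenge");

  // An empty configured token must never match, otherwise a blank env var
  // would let any caller complete the handshake.
  const tokenMatches = expectedVerifyToken.length > 0 && token === expectedVerifyToken;

  // On success the challenge is echoed back verbatim as the response body.
  if (mode === "subscribe" && tokenMatches && challenge !== null) {
    return { status: 200, body: challenge };
  }
  return { status: 403, body: "Forbidden" };
}
```

A route handler would pass the request's query parameters and the configured verifyToken, then return the status and body unchanged.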

Database

  • Migration 0029_add_whatsapp_identity_columns.sql — Adds whatsapp_id (unique, indexed) and whatsapp_name columns to users table. Extends phone_provider and phone_type enums with 'whatsapp' value. Fully idempotent with IF NOT EXISTS guards.
  • Schema updates: db/schemas/users.ts and db/schemas/agent-phone-numbers.ts updated with WhatsApp columns and enum values.

Services

  • ElizaAppUserService — Three new methods:
    • findOrCreateByWhatsAppId() — Auto-provisions users on first WhatsApp message with 3-way cross-platform linking (by whatsapp_id, by phone_number, or new user). Includes race-condition handling with unique constraint recovery.
    • getByWhatsAppId() — Lookup by WhatsApp ID with organization data.
    • linkWhatsAppToUser() — Session-based linking with idempotency and conflict detection.
  • WhatsAppAuthService (whatsapp-auth.ts) — Verifies webhook POST signatures (X-Hub-Signature-256) and GET subscription handshakes.
  • MessageRouterService — Extended with whatsapp provider and sendViaWhatsApp() method using the Cloud API.

Utilities

  • whatsapp-api.ts (new, 304 lines) — Complete WhatsApp Cloud API utility layer:
    • Zod-validated webhook payload parsing (parseWhatsAppWebhookPayload, extractWhatsAppMessages)
    • Message sending (sendWhatsAppMessage) with retry and error handling
    • Read receipts (markWhatsAppMessageAsRead)
    • HMAC-SHA256 signature verification (verifyWhatsAppSignature)
    • TypeScript types for all WhatsApp API structures
  • deterministic-uuid.ts — Extended channel union type with 'whatsapp' for room/entity ID generation.

Repository

  • UsersRepository — Added findByWhatsAppId() and findByWhatsAppIdWithOrganization() query methods.

Configuration

  • elizaAppConfig — New whatsapp section with accessToken, phoneNumberId, appSecret, verifyToken, and phoneNumber.
  • .env.local.example — Documented all 5 WhatsApp environment variables with setup instructions.
  • README.md — Updated Eliza App variables section to include WhatsApp.

Tests (6 new test files)

  • tests/unit/eliza-app/whatsapp-auth.test.ts — WhatsApp auth service unit tests
  • tests/unit/eliza-app/whatsapp-webhook.test.ts — Webhook handler unit tests
  • tests/unit/eliza-app/cross-platform-linking.test.ts — Cross-platform account linking scenarios
  • tests/unit/whatsapp-api-util.test.ts — WhatsApp API utility tests (390 lines)
  • tests/unit/message-router-service.test.ts — Message router WhatsApp provider tests
  • tests/integration/whatsapp-webhook-e2e.test.ts — End-to-end webhook integration tests

Test Plan

  • Run npx jest --testPathPattern whatsapp — all 6 test files pass
  • Deploy to staging and configure Meta webhook callback URL
  • Verify GET webhook verification handshake succeeds in Meta App Dashboard
  • Send a WhatsApp message to the bot number → confirm user auto-provisioned in DB with whatsapp_id and whatsapp_name
  • Verify Eliza responds via WhatsApp with agent-generated content
  • Send a second message → confirm same user is reused (not duplicated)
  • Test cross-platform linking: create Telegram user with phone, then message from WhatsApp with same number → confirm accounts merge
  • Test session-based linking via POST /api/eliza-app/auth/whatsapp with Bearer token
  • Verify webhook signature rejection with tampered payload (should return 401)
  • Verify rate limiting on auth endpoint
  • Confirm blue checkmarks (read receipts) appear in WhatsApp after message processing

Related

  • Frontend PR: feat/whatsapp-support on eliza-app (onboarding flow, connected page, auth context)

- Added WhatsApp authentication endpoint to handle user sessions via WhatsApp ID, enabling auto-provisioning of users on first message.
- Implemented a public webhook to receive and route messages from the WhatsApp Cloud API, supporting cross-platform user linking with Telegram and iMessage.
- Enhanced user service methods to manage WhatsApp identity, including database migrations for WhatsApp-specific fields and constraints.
- Updated environment configuration and README to include WhatsApp settings, ensuring proper setup for deployment.

This integration expands the Eliza App's capabilities, allowing seamless communication through WhatsApp and improving user experience across platforms.
@vercel

vercel bot commented Feb 9, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
eliza-cloud-v2 Ready Ready Preview, Comment Feb 25, 2026 4:19pm

@coderabbitai

coderabbitai bot commented Feb 9, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@claude

claude bot commented Feb 9, 2026

PR Review: WhatsApp Integration for Eliza App

Overall Assessment: ✅ Excellent

This is a well-architected, production-ready WhatsApp integration. The code demonstrates strong engineering practices with comprehensive testing, proper security measures, and thoughtful cross-platform account linking.


🎯 Strengths

Architecture & Design

  • Clean separation of concerns: API routes, services, utilities, and repositories are properly layered
  • Idempotent operations: All database operations handle race conditions gracefully with retry logic
  • Cross-platform linking: Automatic account linking via phone number is elegant and well-implemented
  • Distributed locking: Proper use of room locks prevents concurrent message processing (120s TTL matches maxDuration)

Security ✓

  • HMAC-SHA256 signature verification using crypto.timingSafeEqual() to prevent timing attacks (lib/utils/whatsapp-api.ts:145)
  • Zod schema validation for all webhook payloads with proper error handling
  • Rate limiting on auth and webhook endpoints using existing middleware
  • Dev mode escape hatch properly guarded with NODE_ENV !== "production" check (app/api/eliza-app/webhook/whatsapp/route.ts:196)
  • Secure logging with PII masking (e.g., ***${msg.from.slice(-4)})
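For readers unfamiliar with the signature check praised above, here is a minimal sketch of HMAC-SHA256 verification with a constant-time comparison. It is not the code from lib/utils/whatsapp-api.ts; the function name and parameter order are assumptions, and it relies only on Node's built-in crypto module.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Illustrative sketch; the PR's verifyWhatsAppSignature may differ in signature and details.
function verifySignatureSketch(
  rawBody: string,
  signatureHeader: string,
  appSecret: string,
): boolean {
  const prefix = "sha256=";
  // Reject a missing secret or a header without the expected prefix.
  if (appSecret.length === 0 || !signatureHeader.startsWith(prefix)) return false;

  // Compute the HMAC over the raw, unparsed request body.
  const expected = createHmac("sha256", appSecret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHeader.slice(prefix.length), "hex");

  // timingSafeEqual throws when lengths differ, so check lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}
```

The important design point is that the comparison runs over the raw body bytes before any JSON parsing, since re-serialized JSON generally will not reproduce the signed bytes.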

Database Migration ✓

  • Fully idempotent with IF NOT EXISTS guards throughout (db/migrations/0029_add_whatsapp_identity_columns.sql)
  • Proper enum handling with existence checks before ALTER TYPE ... ADD VALUE
  • Partial indexes for efficient non-NULL lookups (users_whatsapp_id_idx)
  • No CREATE INDEX CONCURRENTLY (correct - runs in transaction per docs)

Test Coverage ✓

  • 6 comprehensive test files covering unit, integration, and cross-platform scenarios
  • 390 lines in whatsapp-api-util.test.ts alone with edge cases
  • Signature verification tests include tamper detection, wrong secrets, missing prefixes
  • Cross-platform linking tests cover Telegram→WhatsApp, iMessage→WhatsApp scenarios

Code Quality

  • TypeScript types are comprehensive and well-documented
  • Error handling is robust with proper logging at all levels
  • No console.log statements - all logging uses structured logger
  • No TODO/FIXME comments in new code
  • Consistent style following existing patterns

🔍 Observations & Recommendations

1. Webhook Retry Logic (route.ts:263)

Returns 503 on lock acquisition failure to trigger Meta's retry mechanism. This is correct, but consider:

  • Monitoring: Add metrics for 503 responses to detect lock contention issues
  • Alerting: If lock failures are frequent, may indicate need for horizontal scaling
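One way to combine the 503-on-contention behaviour with the suggested metric is sketched below. The lock callback, the counter object, and the response shape are all hypothetical stand-ins, since the PR's lock service and metrics stack are not shown in this thread.

```typescript
// Hypothetical shapes for illustration only.
interface WebhookResult {
  status: number;
  body: string;
}

interface ContentionMetrics {
  lockContention503: number;
}

async function handleWithRoomLock(
  tryAcquire: () => Promise<boolean>,
  handleMessage: () => Promise<void>,
  metrics: ContentionMetrics,
): Promise<WebhookResult> {
  const acquired = await tryAcquire();
  if (!acquired) {
    // Count the contention so dashboards and alerts can track its frequency.
    metrics.lockContention503 += 1;
    // A 503 tells the sender the delivery was not processed and should be retried.
    return { status: 503, body: "Room busy, retry later" };
  }
  await handleMessage();
  return { status: 200, body: "OK" };
}
```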

2. Phone Number Derivation (user-service.ts:955)

Auto-deriving E.164 from WhatsApp ID with +${whatsappId} is clever for cross-platform linking. However:

  • Edge case: What if WhatsApp ID doesn't start with country code? (e.g., some international numbers)
  • Validation: Consider adding country code validation or at minimum logging unexpected formats
  • Impact: Low - WhatsApp IDs are always phone numbers in international format, but worth documenting
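A small validation wrapper around the derivation could look like the sketch below. The helper name is hypothetical (the PR derives the number inline as +${whatsappId}), and rejecting a leading zero is an added assumption based on E.164 country codes never starting with 0.

```typescript
// Hypothetical helper; returns null for unexpected formats so the caller can log and skip linking.
function deriveE164FromWhatsAppId(whatsappId: string): string | null {
  // 7 to 15 digits, digits only, no leading zero.
  if (!/^[1-9]\d{6,14}$/.test(whatsappId)) return null;
  return `+${whatsappId}`;
}
```

For example, deriveE164FromWhatsAppId("14155550123") returns "+14155550123", while an ID with a leading "+" or fewer than 7 digits returns null.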

3. Message Type Handling (route.ts:76)

Currently only processes text messages:

if (!text) return true; // Only handle text messages for now

Recommendation: Add TODO comment or tracking issue for image/audio/document support

4. Session-Based Linking (auth/whatsapp/route.ts:94)

The dual-mode auth (standard lookup vs session-based linking) is well-implemented. Consider:

  • Documentation: Add a sequence diagram to docs showing both flows
  • Frontend: Ensure the frontend PR handles the "USER_NOT_FOUND" error gracefully with messaging instructions

5. Race Condition Handling (user-service.ts:1011-1018)

The TOCTOU check at line 994 and unique constraint recovery is excellent. Consider:

  • Metrics: Track how often race conditions occur to gauge load patterns
  • Backoff: If races are frequent under load, consider adding jitter to retry delays
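The find-or-create pattern with unique constraint recovery can be illustrated with an in-memory stand-in for the users table. Everything here is hypothetical (table class, row shape, function names); the only borrowed fact is that PostgreSQL reports unique violations with SQLSTATE 23505.

```typescript
// In-memory stand-in for the users table; hypothetical, for illustration only.
interface UserRow {
  id: string;
  whatsappId: string;
}

const UNIQUE_VIOLATION = "23505"; // PostgreSQL SQLSTATE for unique_violation

class FakeUsersTable {
  private rows = new Map<string, UserRow>();
  private nextId = 1;

  async findByWhatsAppId(whatsappId: string): Promise<UserRow | undefined> {
    return this.rows.get(whatsappId);
  }

  async insert(whatsappId: string): Promise<UserRow> {
    // Yield once so concurrent callers can interleave between check and write.
    await Promise.resolve();
    if (this.rows.has(whatsappId)) {
      throw Object.assign(new Error("duplicate key"), { code: UNIQUE_VIOLATION });
    }
    const row: UserRow = { id: `user-${this.nextId++}`, whatsappId };
    this.rows.set(whatsappId, row);
    return row;
  }
}

async function findOrCreateSketch(table: FakeUsersTable, whatsappId: string): Promise<UserRow> {
  const existing = await table.findByWhatsAppId(whatsappId);
  if (existing) return existing;
  try {
    return await table.insert(whatsappId);
  } catch (err) {
    if ((err as { code?: string }).code !== UNIQUE_VIOLATION) throw err;
    // Another request won the race: re-read the winner instead of failing.
    const winner = await table.findByWhatsAppId(whatsappId);
    if (!winner) throw err;
    return winner;
  }
}
```

Two concurrent calls for the same WhatsApp ID both miss the initial lookup, one insert succeeds, and the other recovers by re-reading, so both callers end up with the same row.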

🐛 Potential Issues (Minor)

1. Undefined Organization Access (route.ts:148)

const userContext = await userContextService.buildContext({
  user: { ...userWithOrg, organization } as never,
  //                                        ^^^^^^ type assertion

Issue: Using as never to bypass TypeScript checking
Recommendation: Update buildContext type signature or create a proper adapter function
Severity: Low (works at runtime, but masks type safety)

2. Non-Critical Update Failures (user-service.ts:963-974)

WhatsApp name updates fail silently with warnings. This is acceptable but consider:

  • Eventual consistency: Could this lead to stale profile names long-term?
  • Retry queue: For non-critical updates, consider a background job queue

3. Lock Release on Error (route.ts:188)

Lock is released in finally block - correct. But if lock.release() throws:

  • Impact: Lock expires after 120s anyway (TTL), but logs would be unclear
  • Recommendation: Wrap lock.release() in try-catch for cleaner error reporting
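The recommended wrapping might look like this sketch. The RoomLock interface and the error callback are assumptions; the point is only that a failing release is reported separately and does not replace the result or error of the main work.

```typescript
// Hypothetical lock shape; the PR's room lock API may differ.
interface RoomLock {
  release(): Promise<void>;
}

async function withRoomLock<T>(
  lock: RoomLock,
  work: () => Promise<T>,
  onReleaseError: (err: unknown) => void,
): Promise<T> {
  try {
    return await work();
  } finally {
    try {
      await lock.release();
    } catch (err) {
      // The TTL still expires the lock eventually; report the failure
      // without masking whatever work() returned or threw.
      onReleaseError(err);
    }
  }
}
```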

📋 Checklist Items

Before merging, verify:

  • Env vars documented in deployment guide (Vercel secrets, etc.)
  • Meta App Webhook URL configured to point to production endpoint
  • Signature verification never disabled in production (double-check CI/CD configs)
  • Database migration tested on staging DB with realistic data volume
  • Rate limits tuned appropriately (current: AGGRESSIVE preset - is this correct for webhook?)
  • Monitoring dashboards updated with WhatsApp metrics (message volume, lock failures, signature rejections)
  • Runbook created for common issues (webhook verification fails, lock contention, user linking conflicts)

🎓 Learning Points

This PR demonstrates several best practices worth noting:

  1. Deterministic UUID generation for cross-platform room IDs (deterministic-uuid.ts)
  2. Webhook signature verification implementation is textbook-perfect
  3. Idempotent migrations with DO blocks for enum additions
  4. Race condition recovery pattern in user service (find-or-create with retry)
  5. Structured logging with PII masking throughout

✅ Approval Recommendation

LGTM with minor observations. The issues noted above are refinements, not blockers. The code is:

  • Production-ready
  • Well-tested
  • Secure
  • Maintainable

Suggest addressing observations in follow-up PRs rather than blocking this one.


📝 Suggested Follow-Ups

  1. Add support for WhatsApp media messages (images, audio, documents)
  2. Add monitoring dashboards for WhatsApp metrics
  3. Document phone number derivation edge cases in CLAUDE.md
  4. Create runbook for WhatsApp webhook troubleshooting
  5. Consider session-based linking UX improvements in frontend

Great work! 🚀


Files Reviewed:

  • app/api/eliza-app/webhook/whatsapp/route.ts (303 lines)
  • app/api/eliza-app/auth/whatsapp/route.ts (219 lines)
  • lib/utils/whatsapp-api.ts (305 lines)
  • lib/services/eliza-app/user-service.ts (WhatsApp methods)
  • lib/services/eliza-app/whatsapp-auth.ts (80 lines)
  • db/migrations/0029_add_whatsapp_identity_columns.sql (41 lines)
  • ✅ All test files (6 files, comprehensive coverage)
  • ✅ Schema updates and configuration

- Introduced new columns for WhatsApp ID in the users table to support WhatsApp authentication.
- Ensured idempotency in the migration with the use of IF NOT EXISTS.
- Generated migration script using drizzle-kit for consistency with existing database migration practices.
@claude

claude bot commented Feb 9, 2026

Code Review: WhatsApp Integration (PR #320)

Summary

This WhatsApp Business Cloud API integration is well-architected with strong security practices, comprehensive error handling, and excellent race condition mitigation. The code follows existing patterns from Discord/Telegram integrations and includes robust test coverage.

Overall Assessment: Approve with minor fixes recommended


Critical Issues (Must Fix Before Merge)

🔴 1. Environment Variable Validation Bypass in Production

File: lib/services/eliza-app/config.ts:47-54
Severity: HIGH

The WhatsApp config uses empty string fallbacks which won't throw errors in production if variables are missing:

whatsapp: {
  accessToken: requireEnv("ELIZA_APP_WHATSAPP_ACCESS_TOKEN", ""),  // ❌ Empty fallback
  phoneNumberId: requireEnv("ELIZA_APP_WHATSAPP_PHONE_NUMBER_ID", ""),
  appSecret: requireEnv("ELIZA_APP_WHATSAPP_APP_SECRET", ""),
  // ...
}

The requireEnv function logic allows empty strings to pass through in production, which could lead to runtime failures.

Fix:

whatsapp: {
  accessToken: requireEnv("ELIZA_APP_WHATSAPP_ACCESS_TOKEN"),  // ✅ No fallback = required
  phoneNumberId: requireEnv("ELIZA_APP_WHATSAPP_PHONE_NUMBER_ID"),
  appSecret: requireEnv("ELIZA_APP_WHATSAPP_APP_SECRET"),
  verifyToken: requireEnv("ELIZA_APP_WHATSAPP_VERIFY_TOKEN"),
  phoneNumber: requireEnv("ELIZA_APP_WHATSAPP_PHONE_NUMBER"),
}

This matches the pattern used for jwt.secret and will properly throw at startup if variables are missing.
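To make the pitfall concrete, here is a hedged reconstruction. The real requireEnv in config.ts is not shown in this thread, so both functions below are hypothetical; the environment is passed in as a plain object to keep the sketch self-contained.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical lenient variant: any provided fallback, even "", suppresses the error.
function requireEnvLenient(env: Env, name: string, fallback?: string): string {
  const value = env[name] ?? fallback;
  if (value === undefined) throw new Error(`Missing required env var: ${name}`);
  return value;
}

// Hypothetical strict variant: no fallback, and blank values count as missing.
function requireEnvStrict(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required env var: ${name}`);
  }
  return value;
}
```

With an empty environment, requireEnvLenient(env, "ELIZA_APP_WHATSAPP_APP_SECRET", "") returns an empty string, while requireEnvStrict throws at startup.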


Important Improvements (Recommended)

🟡 2. Missing whatsappId in Session Payload Type

File: lib/services/eliza-app/session-service.ts:12-31
Severity: MEDIUM

The session payload type includes telegramId and discordId but not whatsappId, creating inconsistency:

export interface ElizaAppSessionPayload extends JWTPayload {
  userId: string;
  organizationId: string;
  telegramId?: string;
  discordId?: string;
  phoneNumber?: string;
  // ❌ Missing: whatsappId?: string;
}

Recommendation: Add whatsappId?: string for consistency, even if not immediately used in JWT claims.


Minor Enhancements (Nice to Have)

🟢 3. Document Security Implications of Signature Bypass

File: app/api/eliza-app/webhook/whatsapp/route.ts:194-210

The development mode signature verification bypass is properly guarded but could use better documentation:

const skipVerification =
  process.env.SKIP_WEBHOOK_VERIFICATION === "true" &&
  process.env.NODE_ENV !== "production";

Recommendation: Add inline comment:

// SECURITY: Only skip signature verification in non-production environments
// This bypasses HMAC-SHA256 validation and should NEVER be enabled in production
// as it allows any source to send webhook payloads

Excellent Implementations ⭐

Security

HMAC-SHA256 signature verification uses crypto.timingSafeEqual() for constant-time comparison (prevents timing attacks)
Input validation via Zod schemas with proper constraints (7-15 digit WhatsApp IDs, digits-only regex)
SQL injection prevention via Drizzle ORM parameterized queries throughout
Sensitive data masking in logs (only last 4 digits of WhatsApp IDs shown)
Rate limiting configured appropriately (60/min for auth, 100/min for webhooks)

Concurrency & Race Conditions

TOCTOU protection in findOrCreateByWhatsAppId with re-checks after concurrent updates (lib/services/eliza-app/user-service.ts:950-1077)
Unique constraint error recovery handles race conditions during user creation
Distributed lock usage prevents concurrent message processing conflicts (120s timeout matching endpoint config)
Idempotency using WhatsApp message IDs prevents duplicate processing

Database

Migration idempotency uses IF NOT EXISTS guards throughout (db/migrations/0029_add_whatsapp_identity_columns.sql)
Partial indexes for performance (WHERE "whatsapp_id" IS NOT NULL)
Proper transaction handling with graceful degradation for non-critical updates

Code Quality

Comprehensive Zod validation for webhook payloads with nested structure validation
Consistent architecture mirrors existing Discord/Telegram patterns
Cross-platform linking via phone number derivation enables seamless account unification
Semantic HTTP status codes (409 for conflicts, 404 for not found, 503 for service unavailable)
Structured logging with context for debugging

Test Coverage

6 test files covering unit, integration, and E2E scenarios:

  • tests/unit/whatsapp-api-util.test.ts (390 lines) - Signature verification, payload parsing, phone formatting
  • tests/unit/eliza-app/cross-platform-linking.test.ts - All linking scenarios
  • tests/unit/eliza-app/whatsapp-webhook.test.ts - Webhook handler logic
  • tests/unit/eliza-app/whatsapp-auth.test.ts - Auth endpoint
  • tests/unit/message-router-service.test.ts - Message routing
  • tests/integration/whatsapp-webhook-e2e.test.ts - End-to-end flows

Security Checklist ✓

  • ✅ HMAC-SHA256 webhook signature verification implemented correctly
  • ✅ Constant-time signature comparison prevents timing attacks
  • ✅ SQL injection prevented via parameterized queries
  • ✅ Input validation via Zod schemas with proper constraints
  • ✅ Rate limiting configured on all endpoints
  • ✅ PII masking in logs
  • ✅ No XSS vectors (API-only endpoints)
  • ⚠️ Environment variable validation needs fix (Issue #1 above)

Performance Considerations ✓

  • ✅ Partial indexes on nullable columns for efficiency
  • ✅ Distributed locking prevents concurrent processing overhead
  • ✅ Idempotency checks prevent duplicate work
  • ✅ Non-critical updates fail gracefully without blocking main flow
  • ✅ Proper use of database indexes (whatsapp_id unique + indexed)

Test Plan Validation

The provided test plan in the PR description is comprehensive. Suggested additions:

  • Load test webhook endpoint with concurrent messages (verify distributed lock behavior)
  • Test webhook retry behavior when returning 503 (verify Meta retry logic)
  • Verify production environment variable validation throws at startup (after fixing Issue #1)
  • Test with malformed WhatsApp payloads from Meta (verify Zod error handling)

Migration Safety ✓

Migration 0029_add_whatsapp_identity_columns.sql follows best practices from docs/database-migrations.md:

  • ✅ Uses IF NOT EXISTS for all DDL operations (fully idempotent)
  • ✅ Uses DO $$ ... END $$ blocks for constraint checks
  • ✅ No CREATE INDEX CONCURRENTLY (runs in transaction as required)
  • ✅ Partial index for performance optimization
  • ✅ Enum additions properly guarded with existence checks

Final Recommendation

APPROVE pending fix for Issue #1 (environment variable validation).

Issues #2 and #3 are nice-to-have improvements but not blockers. The code is production-ready with that single fix.

The implementation demonstrates excellent engineering practices:

  • Robust error handling and race condition mitigation
  • Comprehensive test coverage
  • Security-first approach with proper HMAC verification
  • Consistent with existing codebase patterns

Great work! 🚀


Review completed using CLAUDE.md guidelines

  • Stack: Bun, Next.js 15, PostgreSQL + Drizzle ORM ✓
  • Migration workflow: db:generate → review SQL → db:migrate
  • No db:push usage ✓
  • Proper idempotency in migrations ✓

- Added functions to clean up user data based on Discord and WhatsApp IDs, expanding the script's capabilities for managing user accounts across multiple platforms.
- Updated the cleanup process to include new identifiers and improved logging for better visibility during cleanup operations.
- Modified usage instructions to reflect the new options for Discord and WhatsApp, ensuring users can easily utilize the updated script.
@claude

claude bot commented Feb 9, 2026

Code Review: WhatsApp Business Cloud API Integration

Summary

This is a comprehensive, well-architected WhatsApp integration that follows established patterns in the codebase. The implementation demonstrates strong attention to security, error handling, and cross-platform account linking. Overall quality is excellent.


Strengths

1. Security Best Practices

  • HMAC-SHA256 webhook signature verification with constant-time comparison
  • Proper rate limiting on endpoints (AGGRESSIVE for webhooks, STANDARD for auth)
  • Environment-based signature skipping only in dev mode
  • Input validation using Zod schemas
  • Phone number masking in logs

2. Excellent Error Handling

  • Race condition handling throughout user service (unique constraint violations)
  • Idempotent operations using message IDs
  • Graceful degradation for non-critical failures
  • Proper 503 response on lock failure to trigger Meta webhook retry
  • TOCTOU prevention in cross-platform linking

3. Database Migration Quality

  • Fully idempotent with IF NOT EXISTS guards
  • Proper enum extension without CREATE INDEX CONCURRENTLY
  • Partial index for efficient lookups
  • Follows project migration guidelines

4. Cross-Platform Architecture

  • Automatic account linking via phone number
  • WhatsApp ID to E.164 conversion for seamless integration
  • Consistent patterns with existing Telegram/Discord implementations
  • Unified memory via entityId = userId

5. Code Quality

  • Comprehensive TypeScript types
  • Clear separation of concerns
  • Excellent inline documentation
  • Consistent error codes
  • Extended timeout (120s) for ASSISTANT mode

6. Test Coverage

  • 6 test files with 390+ lines of tests
  • Unit tests for signature verification, payload parsing, cross-platform linking
  • Integration tests for E2E webhook flow

Issues Found

Critical Issues

None! No security vulnerabilities or critical bugs detected.

High Priority

1. Missing Production Validation in Config (lib/services/eliza-app/config.ts:70-76)

The config validates Telegram/Blooio in production but not WhatsApp. This could lead to silent failures in production if WhatsApp env vars are missing.

Recommendation: Add WhatsApp validation in the production block to validate accessToken, phoneNumberId, appSecret, and verifyToken.


Medium Priority

2. Type Assertion Safety (app/api/eliza-app/webhook/whatsapp/route.ts:148)

Using "as never" type assertion is unsafe. Consider fixing upstream type signature in userContextService.buildContext to accept UserWithOrganization.

3. Potential Memory Issue in Read Receipts (route.ts:79-83)

markWhatsAppMessageAsRead fires asynchronously without awaiting. While non-blocking is good for UX, consider using Promise.allSettled if scaling to long-running processes.


Low Priority

4. Duplicated Phone Masking Logic

Phone masking appears in multiple places. Consider extracting to a utility: maskPhoneForLogs(phone: string)
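A possible shape for that utility, following the ***${msg.from.slice(-4)} pattern already used in the logs. The guard for very short inputs is an added assumption so that a short value is never echoed in full.

```typescript
// Sketch of the suggested utility; the short-input guard is an assumption.
function maskPhoneForLogs(phone: string): string {
  // Never reveal the whole value when it has four characters or fewer.
  if (phone.length <= 4) return "***";
  // Keep only the last four characters for log correlation.
  return `***${phone.slice(-4)}`;
}
```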

5. Hardcoded Model IDs (config.ts:29-32)

Both smallModel and largeModel use the same model. If intentional, add a comment.


Security Assessment

Passed

  • HMAC signature verification follows OWASP guidelines
  • Constant-time comparison prevents timing attacks
  • No SQL injection vectors (Drizzle ORM)
  • No XSS risks
  • Proper rate limiting
  • Secrets not logged

Recommendations

  1. Webhook Replay Attacks: Consider timestamp validation to reject webhooks older than 5 minutes
  2. Environment Variable Exposure: Add validation to ensure APP_SECRET is never committed
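The timestamp check from the first recommendation could be sketched as below. It assumes the message timestamp arrives as a string of Unix seconds and uses a hypothetical 60-second allowance for clock skew; neither detail is confirmed by the PR.

```typescript
const MAX_WEBHOOK_AGE_SECONDS = 5 * 60;
const MAX_FUTURE_SKEW_SECONDS = 60; // assumption: tolerate small clock differences

// `timestampSeconds` is assumed to be a Unix timestamp in seconds, delivered as a string.
function isFreshWebhookMessage(timestampSeconds: string, nowMs: number): boolean {
  if (timestampSeconds.trim() === "") return false;
  const sentAt = Number(timestampSeconds);
  if (!Number.isFinite(sentAt)) return false;
  const ageSeconds = nowMs / 1000 - sentAt;
  return ageSeconds <= MAX_WEBHOOK_AGE_SECONDS && ageSeconds >= -MAX_FUTURE_SKEW_SECONDS;
}
```

One trade-off to weigh before adopting this: redelivered webhooks, including those triggered by the 503 retry path, may arrive well after the original send time, so a strict window could drop legitimate messages.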

Performance Considerations

Good

  • Distributed lock with exponential backoff
  • Partial indexes reduce storage overhead
  • Database queries use read replicas

Optimization Opportunities

  1. Batch Message Processing: Consider Promise.allSettled for parallel processing
  2. Cache User Lookups: Consider Redis cache with 5-minute TTL for returning users
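The batch-processing idea can be sketched with Promise.allSettled as follows. The function name and the result summary are illustrative; in the PR's design, messages for the same room would still be serialized by the room lock, so parallelism would mainly help across different senders.

```typescript
// Illustrative sketch: process items in parallel and summarize outcomes.
async function processBatch<T>(
  items: T[],
  handler: (item: T) => Promise<void>,
): Promise<{ succeeded: number; failed: number }> {
  // The async wrapper turns synchronous throws into rejections as well.
  const results = await Promise.allSettled(items.map(async (item) => handler(item)));
  const failed = results.filter((r) => r.status === "rejected").length;
  return { succeeded: results.length - failed, failed };
}
```

Unlike Promise.all, one failing message does not abort the others, which matches the graceful-degradation behaviour described elsewhere in this review.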

Recommendations

Must Fix Before Merge

  1. Add production validation for WhatsApp config (5 min fix)

Should Fix

  1. Fix "as never" type assertion

Nice to Have

  1. Extract phone masking utility
  2. Add timestamp validation to prevent replay attacks
  3. Cache user lookups for performance

Final Verdict

APPROVE with minor requested changes

This PR demonstrates exceptional code quality and follows all project conventions. The only blocking issue is the missing production validation for WhatsApp config, which is a 2-line fix.

The implementation is:

  • Secure (HMAC verification, rate limiting, input validation)
  • Robust (race condition handling, idempotency, graceful degradation)
  • Well-tested (6 test files, 390+ lines)
  • Well-documented (clear comments, PR description, .env examples)
  • Performant (distributed locks, partial indexes, read replicas)

Estimated Merge Readiness: 95% (pending config validation fix)

Great work! 🎉

- Introduced new fields for WhatsApp ID and name in the UserInfoResponse interface to enhance user data retrieval.
- Updated the handleGetUser function to include WhatsApp information in the user response, improving cross-platform user experience.
@claude

claude bot commented Feb 9, 2026

PR Review: WhatsApp Business Cloud API Integration

Overall Assessment

This is a high-quality, production-ready implementation of WhatsApp support for the Eliza App. The code demonstrates excellent security practices, comprehensive error handling, and thorough test coverage. The architecture follows established patterns from the existing Telegram/Discord integrations.

✅ Strengths

Security

  • HMAC-SHA256 signature verification using crypto.timingSafeEqual() for constant-time comparison to prevent timing attacks (whatsapp-api.ts:145)
  • Proper rate limiting on both auth and webhook endpoints using RateLimitPresets
  • Input validation with Zod schemas for all webhook payloads and API requests
  • Signature verification bypass only in dev mode with explicit environment check (route.ts:194-196)
  • Idempotency handling to prevent duplicate message processing (route.ts:244-251)

Error Handling

  • Distributed locking with retry logic to prevent concurrent message processing (route.ts:135-144)
  • Race condition handling in user creation with unique constraint recovery (user-service.ts:1053-1076)
  • Graceful degradation for non-critical failures (profile updates, read receipts)
  • Proper error boundaries with differentiated error messages based on failure type

Database Design

  • Fully idempotent migration using IF NOT EXISTS guards and DO blocks (0029_add_whatsapp_identity_columns.sql)
  • Partial index for efficient WhatsApp ID lookups on non-null values only (line 20)
  • Enum extension safely adds 'whatsapp' to existing phone_provider and phone_type enums
  • Cross-platform linking via auto-derived phone_number from WhatsApp ID (user-service.ts:955)

Code Quality

  • Excellent documentation with detailed JSDoc comments explaining auth flows and cross-platform scenarios
  • Type safety with comprehensive TypeScript types and Zod validation
  • DRY principles with reusable utilities (whatsapp-api.ts, deterministic-uuid.ts)
  • Consistent patterns matching existing Telegram/Discord implementations

Testing

  • 390 lines of WhatsApp API utility tests covering signature verification, payload parsing, phone formatting
  • Cross-platform linking tests validating account merging scenarios
  • Integration tests for end-to-end webhook flows
  • Test coverage for auth service, webhook handler, and message router

🔍 Observations & Recommendations

1. Lock TTL vs Function Timeout

File: app/api/eliza-app/webhook/whatsapp/route.ts:135-139

The lock TTL is set to 120s to match maxDuration, but the comment states this is to "prevent lock expiry during processing." If processing takes the full 120s, the lock will expire at the same time the function times out, potentially leaving the lock held indefinitely if the function crashes before releasing it.

Recommendation: Consider setting lock TTL slightly higher (e.g., 130s) to ensure cleanup, or add a background job to clean up stale locks.

2. WhatsApp ID Validation Consistency

Files:

  • app/api/eliza-app/auth/whatsapp/route.ts:31-34 (7-15 digits)
  • lib/utils/whatsapp-api.ts:302-304 (7-15 digits via regex)

Both enforce 7-15 digits, which is correct for E.164. Good consistency.

3. Error Handling in markWhatsAppMessageAsRead

File: app/api/eliza-app/webhook/whatsapp/route.ts:79-83

The read receipt is fire-and-forget with .catch(). This is appropriate for a non-critical feature, but consider adding metrics/monitoring to track failure rates.

Recommendation: Add a metric counter for failed read receipts to detect API issues.

4. Agent Response Fallback

File: app/api/eliza-app/webhook/whatsapp/route.ts:181-186

If agent processing fails, the message is still marked as processed to avoid infinite retries. This is correct, but users won't receive a fallback error message.

Current behavior: Silent failure from user perspective
Recommendation: Consider sending a user-facing error message: "Sorry, I encountered an error processing your message. Please try again."

5. Cross-Platform Linking Edge Case

File: lib/services/eliza-app/user-service.ts:989-1001

When linking WhatsApp to an existing phone user, there's a TOCTOU check for existingPhoneUser.whatsapp_id. This is good, but the error message "WHATSAPP_ALREADY_LINKED" could be confusing if the race condition is hit.

Recommendation: Differentiate between "already linked to same user" (idempotent success) vs "already linked to different user" (error).
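The distinction recommended above can be expressed as a small classifier. The outcome type and function name are hypothetical; only the WHATSAPP_ALREADY_LINKED code comes from the review.

```typescript
type LinkOutcome =
  | { kind: "linked" }
  | { kind: "already_linked_same_user" }
  | { kind: "conflict"; code: "WHATSAPP_ALREADY_LINKED" };

// `currentOwnerId` is the user currently holding the WhatsApp ID, or null if nobody does.
function classifyLinkAttempt(
  requestingUserId: string,
  currentOwnerId: string | null,
): LinkOutcome {
  if (currentOwnerId === null) return { kind: "linked" };
  // Same owner: treat as idempotent success rather than an error.
  if (currentOwnerId === requestingUserId) return { kind: "already_linked_same_user" };
  return { kind: "conflict", code: "WHATSAPP_ALREADY_LINKED" };
}
```

A caller could map the first two outcomes to a 200 response and the third to a 409, consistent with the semantic status codes noted earlier in the thread.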

6. Metadata Validation in Message Router

File: lib/services/message-router/index.ts:66

Great to see the provider type includes "whatsapp" - this is properly integrated.

🎯 Best Practices Followed

  • ✅ No CREATE INDEX CONCURRENTLY (follows CLAUDE.md migration rules)
  • ✅ All schema changes through migrations (no db:push)
  • ✅ Proper E.164 phone number normalization throughout
  • ✅ Comprehensive logging with masked sensitive data (whatsappId: ***${msg.from.slice(-4)})
  • ✅ Environment variable documentation with setup instructions
  • ✅ TypeScript strict mode compliance
  • ✅ ASSISTANT mode for full multi-step action execution

📝 Minor Suggestions

  1. Configuration: Consider validating required WhatsApp env vars at startup rather than runtime (fail-fast principle)
  2. Monitoring: Add metrics for webhook processing time, success/failure rates, and user auto-provisioning
  3. Documentation: The test plan is excellent - ensure it's executed before merge
  4. Type Safety: userContext cast as never (line 148) could be typed more precisely
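
For the fail-fast suggestion in item 1, a startup check could look like the sketch below. The variable names in `REQUIRED_WHATSAPP_ENV` are placeholders, not the names this PR uses; the environment is passed in as a plain object so the function is easy to test:

```typescript
// Placeholder names; substitute the variables the app actually reads.
const REQUIRED_WHATSAPP_ENV = [
  "WHATSAPP_ACCESS_TOKEN",
  "WHATSAPP_PHONE_NUMBER_ID",
  "WHATSAPP_APP_SECRET",
] as const;

// Returns the names of required variables that are unset or blank.
function missingEnv(
  env: Record<string, string | undefined>,
  required: readonly string[],
): string[] {
  return required.filter((key) => !env[key]?.trim());
}

// Throws once at startup, listing every missing variable in a single message.
function assertEnv(
  env: Record<string, string | undefined>,
  required: readonly string[],
): void {
  const missing = missingEnv(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}
```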

🚀 Security Checklist

  • ✅ HMAC signature verification with constant-time comparison
  • ✅ Rate limiting on auth and webhook endpoints
  • ✅ Input validation with Zod schemas
  • ✅ SQL injection prevention via parameterized queries
  • ✅ No credential logging (access tokens, app secrets)
  • ✅ Idempotency keys to prevent replay attacks
  • ✅ Distributed locking to prevent race conditions
  • ✅ Phone number normalization before DB storage

Verdict

Approve with minor suggestions. This PR is ready for merge pending successful staging tests outlined in the test plan. The implementation is secure, well-tested, and follows all project conventions. The cross-platform account linking is particularly well-designed.

Estimated Risk: Low
Recommendation: Merge after test plan execution ✅

…n management

- Updated Discord and Telegram authentication handlers to include WhatsApp ID in user data.
- Enhanced session management to accommodate WhatsApp ID in session payloads.
- Modified configuration to include WhatsApp-related settings for improved integration.

claude bot commented Feb 9, 2026

PR Review: WhatsApp Business Cloud API Integration

Overall Assessment

This is a well-architected and comprehensive implementation of WhatsApp integration for the Eliza App. The code follows established patterns, includes excellent test coverage, and demonstrates strong security practices. The PR is production-ready with only minor recommendations for improvement.


✅ Strengths

1. Security Best Practices

  • ✅ HMAC-SHA256 webhook signature verification using crypto.timingSafeEqual() for constant-time comparison (prevents timing attacks)
  • ✅ Rate limiting on both auth and webhook endpoints
  • ✅ Idempotency handling to prevent duplicate message processing
  • ✅ Environment variable validation with requireEnv() pattern
  • ✅ Input validation using Zod schemas with proper length/format constraints
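
For readers unfamiliar with the pattern, the signature check in the first bullet can be sketched as follows. This is not the PR's code: `verifySignature` is a hypothetical name, and the `sha256=<hex>` header format is assumed from Meta's webhook convention:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies a header of the form "sha256=<hex digest>" against the raw request body.
function verifySignature(
  rawBody: string,
  signatureHeader: string | null,
  appSecret: string,
): boolean {
  if (!signatureHeader?.startsWith("sha256=")) return false;
  const expected = createHmac("sha256", appSecret).update(rawBody, "utf8").digest();
  const provided = Buffer.from(signatureHeader.slice("sha256=".length), "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return provided.length === expected.length && timingSafeEqual(provided, expected);
}
```

The digest must be computed over the raw body bytes, before any JSON parsing, or the comparison will fail on valid requests.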

2. Database Design

  • ✅ Migration is fully idempotent with IF NOT EXISTS guards
  • ✅ Partial index on whatsapp_id for efficient lookups (only indexes non-null values)
  • ✅ Unique constraint on whatsapp_id prevents duplicates
  • ✅ Proper enum extension for phone_provider and phone_type

3. Concurrency & Race Condition Handling

  • ✅ Distributed lock acquisition with retry logic (prevents concurrent message processing)
  • ✅ Race condition recovery in findOrCreateByWhatsAppId() with unique constraint error handling
  • ✅ Lock TTL (120s) matches maxDuration to prevent lock expiry during processing
  • ✅ TOCTOU (time-of-check-time-of-use) protection when linking WhatsApp to existing users

4. Cross-Platform Account Linking

  • ✅ Automatic 3-way linking by whatsapp_id, phone_number, or new user creation
  • ✅ Seamless integration with Telegram/iMessage users via phone number matching
  • ✅ Session-based linking flow for existing users

5. Test Coverage

  • ✅ 6 comprehensive test files covering unit, integration, and E2E scenarios
  • ✅ Tests for signature verification, payload parsing, cross-platform linking, and webhook flow
  • ✅ 390 lines of utility tests alone

🔍 Code Quality Issues

Critical

None identified.

High Priority

1. Missing Error Handling for WhatsApp API Failures (app/api/eliza-app/webhook/whatsapp/route.ts:178-179)

The code sends responses but doesn't handle partial failures well:

if (responseText) {
  await sendWhatsAppResponse(msg.from, responseText);
}
return true;  // Always returns true, even if send fails

Impact: If the WhatsApp send API fails, the message is marked as processed and won't retry.

Recommendation: Check the return value of sendWhatsAppResponse() and return false on send failures to trigger webhook retry:

if (responseText) {
  const sent = await sendWhatsAppResponse(msg.from, responseText);
  if (!sent) return false;  // Don't mark as processed, allow retry
}

Medium Priority

2. Incomplete Comment Syntax (lib/services/eliza-app/user-service.ts:957, 989)

Lines 957 and 989 have malformed comments:

/ Scenario 1: Check if user exists by whatsapp_id

Should be:

// Scenario 1: Check if user exists by whatsapp_id

3. Hardcoded WhatsApp API Version (lib/utils/whatsapp-api.ts:11)

export const WHATSAPP_API_BASE = "https://graph.facebook.com/v21.0";

Recommendation: Make the API version configurable via environment variable with a sensible default:

const WHATSAPP_API_VERSION = process.env.WHATSAPP_API_VERSION || "v21.0";
export const WHATSAPP_API_BASE = `https://graph.facebook.com/${WHATSAPP_API_VERSION}`;

4. Potential Memory Leak in Background Task (app/api/eliza-app/webhook/whatsapp/route.ts:79-83)

The markWhatsAppMessageAsRead() call runs without awaiting, but errors are caught and logged. This is fine, but consider if you want to track these promises:

markWhatsAppMessageAsRead(WA_ACCESS_TOKEN, WA_PHONE_NUMBER_ID, msg.messageId)
  .catch((err) => logger.warn(...));  // Fire-and-forget

Recommendation: This is acceptable as-is since read receipts are non-critical, but document this as intentional fire-and-forget.

Low Priority

5. Type Assertion Safety (app/api/eliza-app/webhook/whatsapp/route.ts:148)

user: { ...userWithOrg, organization } as never,

The as never cast is overly broad. Consider defining a proper type or using a more specific assertion.

6. Missing Retry Logic for WhatsApp API Calls (lib/utils/whatsapp-api.ts:208-244)

While the webhook retry handles failures, the WhatsApp send functions don't have built-in retry logic. Consider adding exponential backoff for transient failures (5xx errors, rate limits).


🛡️ Security Considerations

✅ Secure

  • Webhook signature verification is properly implemented
  • Sensitive data (phone numbers, WhatsApp IDs) is masked in logs (e.g., ***${msg.from.slice(-4)})
  • No SQL injection risks (using Drizzle ORM)
  • Rate limiting prevents abuse

⚠️ Recommendations

1. Environment Variable Secrets Management

The .env.local.example exposes the structure of sensitive configs. Ensure production deployments use proper secret management (Vercel Secrets, AWS Secrets Manager, etc.).

2. Webhook Replay Attack Protection

The code has idempotency checks, but consider adding timestamp validation to reject old webhook payloads (Meta includes timestamp field in messages).
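
A freshness check along these lines might look like the sketch below. It assumes the message timestamp is a string of Unix seconds; `isFreshTimestamp`, the 300s window, and the 60s skew tolerance are illustrative choices:

```typescript
// messageTimestamp: Unix seconds as a string (assumed format).
// nowMs: current time in milliseconds, injected to keep the function testable.
function isFreshTimestamp(
  messageTimestamp: string,
  nowMs: number,
  maxAgeSeconds = 300,
): boolean {
  if (messageTimestamp.trim() === "") return false;
  const seconds = Number(messageTimestamp);
  if (!Number.isFinite(seconds)) return false;
  const ageSeconds = nowMs / 1000 - seconds;
  // Reject stale payloads, and payloads from more than 60s in the future (clock skew).
  return ageSeconds <= maxAgeSeconds && ageSeconds >= -60;
}
```

A tight window would also drop legitimately delayed redeliveries, so the threshold is a trade-off between replay protection and delivery reliability.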

3. SKIP_WEBHOOK_VERIFICATION Flag (app/api/eliza-app/webhook/whatsapp/route.ts:194-196)

const skipVerification =
  process.env.SKIP_WEBHOOK_VERIFICATION === "true" &&
  process.env.NODE_ENV !== "production";

This is safe (requires both flags), but consider logging a loud warning when verification is skipped to prevent accidental misuse.
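
A sketch of the warning, keeping the same two-flag condition. `shouldSkipVerification` and the injected `warn` callback are hypothetical; in the route this would presumably be the existing logger:

```typescript
// Returns whether verification may be skipped, and warns loudly whenever it is.
function shouldSkipVerification(
  env: Record<string, string | undefined>,
  warn: (message: string) => void,
): boolean {
  const skip =
    env.SKIP_WEBHOOK_VERIFICATION === "true" && env.NODE_ENV !== "production";
  if (skip) {
    warn(
      "[SECURITY] Webhook signature verification is DISABLED " +
        "(SKIP_WEBHOOK_VERIFICATION=true). Never use this outside local development.",
    );
  }
  return skip;
}
```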


📊 Performance Considerations

✅ Optimized

  • Partial index on whatsapp_id reduces index size
  • Distributed locks prevent thundering herd on same room
  • Lock retry with exponential backoff

⚠️ Potential Bottlenecks

1. Sequential Message Processing (app/api/eliza-app/webhook/whatsapp/route.ts:242-260)

Messages are processed sequentially in a for loop. If a webhook contains multiple messages, this could cause timeouts.

Recommendation: Consider processing messages in parallel (with proper lock acquisition per room) or using a background job queue for high-volume scenarios.
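
One possible shape for this, assuming that messages from the same sender map to the same room: keep per-sender ordering but run different senders concurrently. `processBySender` is a hypothetical helper, and the per-room lock would still be acquired inside `handle`:

```typescript
// Processes each sender's messages in arrival order, and different senders in parallel.
// Resolves to true only if every message was handled successfully.
async function processBySender<M extends { from: string }>(
  messages: M[],
  handle: (msg: M) => Promise<boolean>,
): Promise<boolean> {
  const bySender = new Map<string, M[]>();
  for (const msg of messages) {
    const group = bySender.get(msg.from) ?? [];
    group.push(msg);
    bySender.set(msg.from, group);
  }
  const results = await Promise.all(
    Array.from(bySender.values()).map(async (group) => {
      let ok = true;
      for (const msg of group) {
        ok = (await handle(msg)) && ok; // sequential within one sender
      }
      return ok;
    }),
  );
  return results.every(Boolean);
}
```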

2. Lock Contention (app/api/eliza-app/webhook/whatsapp/route.ts:135-139)

Max 10 retries with delays capped at 2s means up to 20s spent waiting for the lock. If WhatsApp retries the webhook during this time, the retries could cascade.

Recommendation: Monitor lock acquisition metrics in production. Current settings are reasonable for MVP.
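
The 20s figure is an upper bound. With the settings quoted elsewhere in this thread (initialDelayMs: 100, maxDelayMs: 2000), and assuming the delay doubles after each failed attempt (the actual backoff schedule is not shown here), the worst case can be computed directly:

```typescript
// Sums the delays of a capped exponential backoff.
// Assumption: the delay doubles after every failed attempt.
function worstCaseWaitMs(
  maxRetries: number,
  initialDelayMs: number,
  maxDelayMs: number,
): number {
  let total = 0;
  let delay = initialDelayMs;
  for (let i = 0; i < maxRetries; i++) {
    total += Math.min(delay, maxDelayMs);
    delay *= 2;
  }
  return total;
}

// 100 + 200 + 400 + 800 + 1600 + 5 * 2000 = 13100 ms
const worstCase = worstCaseWaitMs(10, 100, 2000);
```

Under that assumption the wait is about 13s rather than 20s, excluding the time each acquisition attempt itself takes.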


🧪 Test Coverage Assessment

Excellent coverage across:

  • ✅ Signature verification (timing attacks, tampered payloads)
  • ✅ Payload parsing (invalid schemas, edge cases)
  • ✅ Cross-platform linking (3-way scenarios)
  • ✅ Webhook E2E flow
  • ✅ WhatsApp API utilities (phone formatting, validation)

Missing tests:

  • ⚠️ Lock acquisition failure scenarios (503 response path)
  • ⚠️ Concurrent webhook delivery (race conditions)
  • ⚠️ WhatsApp API send failures and retry logic

📝 Documentation

✅ Well Documented

  • Clear inline comments explaining business logic
  • .env.local.example has helpful setup instructions
  • TSDoc comments on functions
  • Migration has descriptive SQL comments

Suggestions

  • Add a docs/whatsapp-integration.md guide covering:
    • Meta App Dashboard setup walkthrough
    • Webhook URL configuration
    • Testing with WhatsApp Business API sandbox
    • Troubleshooting common issues (signature failures, lock timeouts)

🎯 Recommendations Summary

Must Fix Before Merge

  1. ✅ Fix malformed comments in user-service.ts:957, 989
  2. ✅ Handle WhatsApp send failures to prevent message loss

Should Fix (High Value)

  1. Make WhatsApp API version configurable
  2. Add timestamp validation for webhook replay protection
  3. Document the fire-and-forget read receipt pattern

Nice to Have

  1. Add lock failure and concurrent webhook tests
  2. Consider parallel message processing for scalability
  3. Create docs/whatsapp-integration.md setup guide

✅ Approval Status

Approved with minor fixes. This is production-quality code with strong fundamentals. The two must-fix items are trivial and can be addressed quickly.

Great work on the comprehensive testing, security practices, and cross-platform linking architecture! 🎉


📋 Checklist for Author

  • Fix comment syntax errors in user-service.ts
  • Add error handling for WhatsApp send failures
  • Consider making API version configurable
  • Add loud warning when webhook verification is skipped
  • Document fire-and-forget read receipt behavior
  • Add integration guide to docs/ (optional but recommended)

- Updated the message handling function to include a retry mechanism for sending responses via WhatsApp.
- Added logging for failed send attempts, allowing the webhook to retry if the message fails to send, improving reliability in message delivery.

claude bot commented Feb 9, 2026

Comprehensive Code Review - PR #320

Executive Summary

This PR implements WhatsApp Business Cloud API integration with auto-provisioning, cross-platform account linking, and webhook signature verification. The implementation is generally solid with good security practices, but has 1 Critical and 4 High severity issues that must be addressed before merging.


❌ Critical Issues (Must Fix)

1. Race Condition in Distributed Lock Acquisition

File: app/api/eliza-app/webhook/whatsapp/route.ts:135-144

Problem: When lock acquisition fails, the function returns false and webhook returns 503, triggering Meta to retry. However, the message is not marked as processed in the idempotency table. Between retries, if the first request completes and releases the lock, the retry will acquire the lock and process the same message again, causing duplicate responses.

Fix: Implement a "processing" state in the idempotency table:

// Before attempting lock acquisition
await markMessageAsProcessing(messageId, 30_000); // TTL in ms: 30s processing window

const lock = await distributedLocks.acquireRoomLockWithRetry(roomId, 120000, {
  maxRetries: 10,
  initialDelayMs: 100,
  maxDelayMs: 2000,
});

if (!lock) {
  // Message is marked as processing, retry will be rejected by idempotency check
  logger.error("[ElizaApp WhatsAppWebhook] Failed to acquire room lock", { roomId });
  return false;
}

⚠️ High Severity Issues

2. Unsafe Type Coercion Suppresses Type Safety

File: app/api/eliza-app/webhook/whatsapp/route.ts:148

const userContext = await userContextService.buildContext({
  user: { ...userWithOrg, organization } as never, // ❌ Suppresses ALL type checking
  isAnonymous: false,
  agentMode: AgentMode.ASSISTANT,
});

Fix: Update buildContext type signature or create proper type adapter:

type UserWithOrganization = typeof userWithOrg & { organization: typeof organization };
const user: UserWithOrganization = { ...userWithOrg, organization };
const userContext = await userContextService.buildContext({
  user,
  isAnonymous: false,
  agentMode: AgentMode.ASSISTANT,
});

3. Missing Rate Limiting on GET Webhook Verification

File: app/api/eliza-app/webhook/whatsapp/route.ts:290

Problem: GET endpoint has no rate limiting, making it vulnerable to DoS attacks.

Fix:

export const GET = withRateLimit(handleWhatsAppWebhookGet, RateLimitPresets.AGGRESSIVE);

4. Insufficient WhatsApp ID Validation Before Phone Derivation

File: lib/services/eliza-app/user-service.ts:954-955

Problem: Code strips non-digits from WhatsApp ID, but WhatsApp IDs should already be digits-only (validated in auth endpoint). If a WhatsApp ID contains non-digits, this indicates data corruption or validation bypass.

Fix:

async findOrCreateByWhatsAppId(whatsappId: string, profileName?: string) {
  // Validate WhatsApp ID format before proceeding
  if (!/^\d{7,15}$/.test(whatsappId)) {
    logger.error("[ElizaAppUserService] Invalid WhatsApp ID format", { whatsappId });
    throw new Error("INVALID_WHATSAPP_ID");
  }
  const derivedPhone = `+${whatsappId}`;
  // ... rest of function
}

5. No Runtime Validation of WhatsApp API Responses

File: lib/utils/whatsapp-api.ts:224-243

Problem: Response is cast to WhatsAppSendMessageResponse without validation. If Meta changes their API structure, this could cause crashes.

Fix: Add Zod schema validation:

const WhatsAppSendMessageResponseSchema = z.object({
  messaging_product: z.string(),
  contacts: z.array(z.object({ 
    input: z.string(), 
    wa_id: z.string() 
  })),
  messages: z.array(z.object({ 
    id: z.string(), 
    message_status: z.string().optional() 
  })),
});

try {
  const parsed = WhatsAppSendMessageResponseSchema.parse(JSON.parse(responseText));
  return parsed;
} catch (error) {
  logger.error("[WhatsAppAPI] Invalid API response schema", { error, responseText });
  throw new Error(`Invalid WhatsApp API response structure`);
}

📋 Medium Severity Issues (Should Fix)

6. No Retry Logic for Transient Failures

File: lib/utils/whatsapp-api.ts:208-244

The PR description mentions "retry and error handling" but only HTTP error handling is implemented. Add exponential backoff for 5xx errors and network timeouts:

async function sendWhatsAppMessageWithRetry(
  accessToken: string,
  phoneNumberId: string,
  to: string,
  text: string,
  maxRetries = 3
): Promise<WhatsAppSendMessageResponse> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(url, { 
        ...options, 
        signal: AbortSignal.timeout(15000) // Issue #7 fix
      });
      
      if (response.ok) {
        const result = await response.json();
        return result;
      }
      
      // Retry on 5xx, don't retry on 4xx
      if (response.status >= 500 && attempt < maxRetries) {
        await delay(Math.min(1000 * Math.pow(2, attempt), 10000));
        continue;
      }
      
      throw new Error(`WhatsApp API error (${response.status})`);
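    } catch (error) {
      if (attempt === maxRetries) throw error;
      // AbortSignal.timeout() rejects with a "TimeoutError"; fetch network failures surface as TypeError
      const name = error instanceof Error ? error.name : "";
      if (name === "TimeoutError" || name === "AbortError" || error instanceof TypeError) {
        await delay(Math.min(1000 * Math.pow(2, attempt), 10000));
      } else {
        throw error; // Don't retry on other errors
      }
    }
  }
  // Reached only if maxRetries < 1; keeps the declared return type sound
  throw new Error("WhatsApp API send failed: no attempts were made");
}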

7. Missing Timeout on External API Calls

File: lib/utils/whatsapp-api.ts:224-231

Problem: No timeout set. If Meta's API hangs, this blocks the webhook handler for up to 120 seconds, preventing other messages from being processed in that room.

Fix: Add timeout (shown in Issue #6 above):

signal: AbortSignal.timeout(15000) // 15 second timeout

8. Insufficient Error Context in Message Router

File: lib/services/message-router/index.ts:461-480

Problem: Returning boolean false provides no context about failure type (missing credentials vs network timeout vs rate limit).

Fix: Return structured errors:

type SendResult = 
  | { success: true }
  | { success: false; error: string; retryable: boolean };

private async sendViaWhatsApp(params: SendMessageParams): Promise<SendResult> {
  try {
    // ... 
    if (!accessToken || !phoneNumberId) {
      logger.error("[MessageRouter] Missing WhatsApp credentials");
      return { success: false, error: "MISSING_CREDENTIALS", retryable: false };
    }
    
    await sendWhatsAppMessage(accessToken, phoneNumberId, params.to, params.body);
    return { success: true };
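  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    // Treat timeouts and 5xx statuses as retryable; a bare includes("5") would also match e.g. "415"
    const isRetryable = /timeout/i.test(message) || /\b5\d\d\b/.test(message);
    logger.error("[MessageRouter] WhatsApp send error", { error, retryable: isRetryable });
    return { success: false, error: message, retryable: isRetryable };
  }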
}

9. Enhanced Security Logging Needed

File: app/api/eliza-app/webhook/whatsapp/route.ts:212

For incident response, security events should include more context:

if (!isValid) {
  logger.warn("[ElizaApp WhatsAppWebhook] Invalid signature", {
    ip: request.headers.get("x-forwarded-for") || "unknown",
    userAgent: request.headers.get("user-agent"),
    payloadSize: rawBody.length,
    signatureProvided: !!signatureHeader,
    timestamp: new Date().toISOString(),
  });
  return NextResponse.json({ error: "Invalid signature" }, { status: 401 });
}

ℹ️ Low Severity Issues (Nice to Have)

10. Inconsistent Error Response Format

  • Webhook: { error: "Invalid signature" }
  • Auth: { success: false, error: "...", code: "..." }

Recommendation: Standardize on the auth format for consistency.


11. Magic Numbers in Lock Configuration

File: app/api/eliza-app/webhook/whatsapp/route.ts:135-139

Extract to named constants:

const LOCK_TTL_MS = 120000; // Must match maxDuration
const LOCK_MAX_RETRIES = 10;
const LOCK_INITIAL_DELAY_MS = 100;
const LOCK_MAX_DELAY_MS = 2000;

const lock = await distributedLocks.acquireRoomLockWithRetry(
  roomId, 
  LOCK_TTL_MS, 
  {
    maxRetries: LOCK_MAX_RETRIES,
    initialDelayMs: LOCK_INITIAL_DELAY_MS,
    maxDelayMs: LOCK_MAX_DELAY_MS,
  }
);

12. No User Feedback for Unsupported Message Types

File: app/api/eliza-app/webhook/whatsapp/route.ts:74-76

Non-text messages are silently ignored. Provide feedback:

if (!text) {
  if (msg.type !== "text") {
    await sendWhatsAppResponse(
      msg.from, 
      "Sorry, I can only handle text messages at the moment."
    );
  }
  return true;
}

13. Missing Message Length Validation

File: lib/utils/whatsapp-api.ts:221

WhatsApp has a 4096 character limit:

export async function sendWhatsAppMessage(
  accessToken: string,
  phoneNumberId: string,
  to: string,
  text: string,
): Promise<WhatsAppSendMessageResponse> {
  if (text.length > 4096) {
    throw new Error(`Message too long: ${text.length} chars (max 4096)`);
  }
  // ... rest of function
}
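
Throwing keeps the send function strict, but it means a long agent reply is never delivered. An alternative, not proposed in the review itself, is to split the text into chunks before sending. A sketch, with `splitMessage` as a hypothetical helper and the 4096 limit taken from the comment above:

```typescript
const WHATSAPP_MAX_TEXT_LENGTH = 4096;

// Splits text into chunks no longer than maxLength, preferring to break at the
// last space or newline inside each window and falling back to a hard cut.
function splitMessage(text: string, maxLength = WHATSAPP_MAX_TEXT_LENGTH): string[] {
  if (maxLength <= 0) throw new RangeError("maxLength must be positive");
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > maxLength) {
    const head = rest.slice(0, maxLength);
    const breakAt = Math.max(head.lastIndexOf(" "), head.lastIndexOf("\n"));
    const cut = breakAt > 0 ? breakAt : maxLength;
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\s+/, ""); // drop the whitespace we broke on
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Lengths here are UTF-16 code units, so a hard cut can in principle land inside a surrogate pair; text with many emoji may need a grapheme-aware split.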

✅ Excellent Practices Found

  1. HMAC-SHA256 Signature Verification - Correctly implemented with crypto.timingSafeEqual
  2. Database Migration Quality - Fully idempotent with proper guards, follows CLAUDE.md guidelines
  3. Unique Constraints - Database-level protection against duplicate accounts
  4. Idempotency Implementation - Prevents replay attacks within TTL window
  5. Zod Input Validation - Strong schema validation on webhook payloads
  6. Rate Limiting - POST endpoint has AGGRESSIVE rate limiting (100/min per IP)
  7. Distributed Locking - Prevents concurrent processing for same room
  8. Comprehensive Testing - 6 test files covering unit, integration, and cross-platform scenarios
  9. Partial Indexing - users_whatsapp_id_idx only indexes non-null values (efficient)
  10. Read Replica Usage - Proper separation of dbRead/dbWrite

📊 Overall Assessment

Status: ⚠️ Conditionally Approve - Address Critical & High Issues First

This is a well-engineered implementation with solid security practices, proper database design, and comprehensive error handling. However, the Critical distributed lock race condition and High severity issues must be addressed before production deployment.

Recommended Next Steps:

  1. Fix Critical Issue #1 (lock race condition)
  2. Fix High Issues #2-5 (type safety, rate limiting, validation)
  3. Consider Medium Issues #6-9 (retries, timeouts, error context, logging)
  4. Add tests for edge cases identified above
  5. Deploy to staging for thorough testing
  6. Re-review after changes

Confidence Level: High - This review examined all critical paths including authentication, webhook handling, user provisioning, database operations, and cross-platform linking logic.

Great work overall! The architecture is sound and follows best practices. With these fixes, this will be production-ready. 🚀


claude bot commented Feb 9, 2026

Code Review - PR #320: WhatsApp Integration

Status: ⚠️ Conditionally Approve - Fix Critical & High Issues First


❌ CRITICAL - Must Fix

1. Race Condition in Lock Acquisition (route.ts:135-144)

  • Problem: Failed lock returns 503 → Meta retries → message not in idempotency table → duplicate processing
  • Fix: Mark message as "processing" in idempotency table BEFORE lock attempt

⚠️ HIGH SEVERITY - Must Fix

2. Unsafe Type Coercion (route.ts:148)

  • as never suppresses type checking
  • Fix type definition in buildContext

3. Missing Rate Limiting (route.ts:290)

  • GET webhook endpoint has no rate limit
  • Apply withRateLimit(..., AGGRESSIVE)

4. Insufficient WhatsApp ID Validation (user-service.ts:954)

  • Add validation: if (!/^\d{7,15}$/.test(whatsappId)) throw ...

5. No API Response Validation (whatsapp-api.ts:224)

  • Add Zod schema validation for WhatsApp API responses

📋 MEDIUM - Should Fix

  1. No retry logic for 5xx/timeout errors
  2. Missing timeout on fetch calls (add 15s timeout)
  3. Boolean returns lose error context (use structured errors)
  4. Security logs need IP/headers/timestamp

✅ Excellent Practices

  • HMAC-SHA256 signature verification ✓
  • Idempotent migration ✓
  • Unique constraints ✓
  • Comprehensive tests (6 files) ✓
  • Rate limiting on POST ✓
  • Distributed locking ✓

Overall: Solid implementation! Fix the 5 critical/high issues and this is production-ready. 🚀


claude bot commented Feb 9, 2026

Detailed Code Fixes

Issue #1: Lock Race Condition (CRITICAL)

Current Problem:

  • Lock fails → return false → 503 response → Meta retries
  • Message NOT in idempotency table → retry can process duplicate

Fix: Mark as processing before lock attempt

Issue #2-5: High Priority Fixes

Type Safety: Remove as never, use proper type adapter
Rate Limiting: Apply to GET endpoint
WhatsApp ID Validation: Validate digits-only format before use
API Response: Add Zod schema validation

Medium Priority

  • Add retry logic with exponential backoff for 5xx errors
  • Add 15s timeout to fetch calls: signal: AbortSignal.timeout(15000)
  • Return structured errors instead of boolean
  • Enhanced security logging with IP/headers/timestamp

Database Migration Review ✅

Migration 0029_add_whatsapp_identity_columns.sql is excellent:

  • ✅ Fully idempotent with IF NOT EXISTS guards
  • ✅ Proper unique constraints
  • ✅ Efficient partial index on non-null values only
  • ✅ Follows CLAUDE.md guidelines (no CONCURRENTLY)
  • ✅ Backward compatible (all columns nullable)

No issues found in migration.


Test Coverage Assessment

6 test files added covering:

  • Unit: auth, webhook, API utils, message router
  • Integration: E2E webhook flow
  • Scenarios: Cross-platform linking

Recommendation: Add tests for:

  • Lock acquisition failures
  • Signature verification edge cases
  • API timeout handling
  • Concurrent user linking attempts

- Added validation for WhatsApp ID format to enhance message handling and prevent processing of invalid IDs.
- Introduced a new function to remove processed marks, allowing for retries on failed message processing.
- Updated the user context building to use a properly typed UserWithOrganization object for better type safety.
- Refactored the webhook handling logic to ensure idempotency and prevent duplicate processing of messages.

These changes enhance the reliability and robustness of the WhatsApp integration in the Eliza App.

claude bot commented Feb 10, 2026

Code Review: WhatsApp Integration for Eliza App

I've conducted a comprehensive review of this PR across code quality, security, database migrations, performance, and integration points. Overall, this is a well-structured implementation with strong security practices, but there are several issues that should be addressed.


🎯 Overall Assessment

Strengths:

  • ✓ Excellent Zod validation throughout
  • ✓ Proper HMAC-SHA256 signature verification with timing-safe comparison
  • ✓ Clean separation of concerns (auth, API utilities, business logic)
  • ✓ Comprehensive test coverage (6 test files)
  • ✓ Database migration follows CLAUDE.md rules (idempotent, no CONCURRENTLY)
  • ✓ Distributed locking prevents thundering herd

Key Concerns:

  • ⚠️ Race condition in webhook idempotency check
  • ⚠️ Security bypass flag could be dangerous
  • ⚠️ Phone number derivation and cross-platform linking edge cases

🔴 High Priority Issues

Issue #1: Race Condition in Webhook Idempotency (TOCTOU)

File: app/api/eliza-app/webhook/whatsapp/route.ts:265-288

Problem: Non-atomic check-then-mark pattern:

if (await isAlreadyProcessed(idempotencyKey)) {
  continue; // ← Check
}
await markAsProcessed(idempotencyKey, "whatsapp-eliza-app"); // ← Mark (gap here!)
const processed = await handleIncomingMessage(msg); // ← Process

Between the isAlreadyProcessed check and markAsProcessed call, another webhook delivery could arrive and also pass the check, leading to duplicate processing.

Recommendation:

const claimed = await tryClaimForProcessing(idempotencyKey, "whatsapp-eliza-app");
if (!claimed) {
  continue; // Already claimed by another process
}
const processed = await handleIncomingMessage(msg);

This uses the atomic INSERT ... ON CONFLICT DO NOTHING operation from lib/utils/idempotency.ts:52.
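
To illustrate the semantics only, here is an in-memory model of an atomic claim. It is not the implementation in lib/utils/idempotency.ts; `ClaimStore` is a hypothetical stand-in for a table with a unique key:

```typescript
// Models INSERT ... ON CONFLICT DO NOTHING: the first caller to claim a key wins,
// and every later caller is told the key was already taken.
class ClaimStore {
  private readonly claimed = new Map<string, string>();

  // Check and insert happen in one synchronous step, so no other caller can
  // interleave between them (unlike a separate "check" query and "mark" query).
  tryClaim(key: string, source: string): boolean {
    if (this.claimed.has(key)) return false;
    this.claimed.set(key, source);
    return true;
  }
}
```

In Postgres the same effect comes from checking whether the insert affected a row: zero rows means another delivery already claimed the key.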


Issue #2: Signature Verification Bypass in Development

File: app/api/eliza-app/webhook/whatsapp/route.ts:216-232

Problem:

const skipVerification = 
  process.env.SKIP_WEBHOOK_VERIFICATION === "true" &&
  process.env.NODE_ENV !== "production";

This allows signature verification to be skipped if SKIP_WEBHOOK_VERIFICATION=true is set in non-production environments. The risk is accidental deployment to staging with verification disabled.

Recommendation:

  • Remove the bypass entirely, OR
  • Use explicit per-environment feature flags (e.g., STAGING_SKIP_WEBHOOK_VERIFICATION)
  • The NODE_ENV !== "production" guard doesn't protect staging environments

🟡 Medium Priority Issues

Issue #3: Auto-Derived Phone Number Could Conflict

File: lib/services/eliza-app/user-service.ts:955-1046

Problem: WhatsApp ID auto-derives to E.164 phone:

const derivedPhone = `+${whatsappId.replace(/\D/g, "")}`;

If a WhatsApp ID somehow contains formatting (which isValidWhatsAppId() should prevent), or if cross-platform linking matches on phone numbers that were manually entered in different formats, you could get ambiguous matches.

Recommendation:

  • Validate that WhatsApp ID strictly matches E.164 format (no dashes, spaces, etc.)
  • Add explicit validation that derivedPhone is a valid E.164 number before using it for linking
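
Both checks can live in one derivation helper. A sketch under assumptions: `derivePhoneFromWhatsAppId` is a hypothetical name, the 7-15 digit rule mirrors the validation described elsewhere in this thread, and the pattern additionally requires a non-zero leading digit:

```typescript
// "+" followed by 7 to 15 digits, first digit non-zero.
const E164_PATTERN = /^\+[1-9]\d{6,14}$/;

// Derives an E.164 phone number from a WhatsApp ID, rejecting anything that
// is not already digits-only instead of silently stripping characters.
function derivePhoneFromWhatsAppId(whatsappId: string): string {
  if (!/^\d{7,15}$/.test(whatsappId)) {
    throw new Error("INVALID_WHATSAPP_ID");
  }
  const phone = `+${whatsappId}`;
  if (!E164_PATTERN.test(phone)) {
    throw new Error("INVALID_E164_PHONE");
  }
  return phone;
}
```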

Issue #4: Multiple Database Queries for Cross-Platform Linking

File: lib/services/eliza-app/user-service.ts:950-1077

Problem: Sequential lookups without transaction:

  1. Line 958: Query by WhatsApp ID
  2. Line 990: Query by phone number (if #1 returns null)
  3. Line 1005-1009: Update existing user
  4. Line 1028: Refetch updated user
  5. Lines 1057, 1066: Race condition recovery

Impact: Race window between lines 990-1009 where another request could claim the phone number. The code does check for race conditions after the fact (lines 993-1001), but a database transaction with SERIALIZABLE isolation would be safer.

Recommendation: Wrap findOrCreate operations in a transaction:

await db.transaction(async (tx) => {
  // All lookups and updates here
}, { isolationLevel: 'serializable' });

🟢 Low Priority Issues

Issue #5: Loose Phone Number Derivation Validation

File: lib/services/eliza-app/user-service.ts:955

The derived phone isn't explicitly validated as E.164. normalizePhoneNumber() will validate later, but validation should be explicit here.

Recommendation: Add E.164 format check immediately after derivation.


Issue #6: Redundant Index Definition

File: db/schemas/users.ts:103 and db/migrations/0029_add_whatsapp_identity_columns.sql:20

The index on whatsapp_id is defined both in the migration SQL and in the schema's table definition. This could cause confusion or duplicate index creation on db:generate.

Recommendation: Choose one source of truth (typically migration for manual control, or schema for auto-generation).


Issue #7: Double Validation is Redundant

File: app/api/eliza-app/webhook/whatsapp/route.ts:81 and app/api/eliza-app/auth/whatsapp/route.ts:34

WhatsApp ID format is validated twice:

  • Zod schema: regex(/^\d+$/), .min(7), .max(15)
  • isValidWhatsAppId() function: same logic

Recommendation: Choose one validation method to avoid maintenance drift.


Issue #8: Inconsistent PII Masking in Logs

File: lib/services/eliza-app/user-service.ts:99,110

WhatsApp ID masked as ***${msg.from.slice(-4)} in some places but fully logged in error scenarios.

Recommendation: Use consistent masking throughout for PII.
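
One way to keep masking consistent is to route every log call through a single helper. The sketch below assumes the same last-four-digits convention quoted above; the name maskIdentifier is hypothetical.

```typescript
// Hypothetical shared helper: every log site calls this instead of
// formatting the WhatsApp ID by hand.
function maskIdentifier(value: string, visible = 4): string {
  // Values no longer than the visible window are fully masked, so the
  // mask can never reveal a complete identifier.
  if (value.length <= visible) return "***";
  return `***${value.slice(-visible)}`;
}

console.log(maskIdentifier("14155550123")); // "***0123"
console.log(maskIdentifier("123")); // "***"
```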


✅ Security Review Summary

Good:

  • ✓ Proper HMAC-SHA256 verification with crypto.timingSafeEqual()
  • ✓ Constant-time signature comparison prevents timing attacks
  • ✓ Rate limiting on both GET and POST endpoints
  • ✓ Environment variable validation in production config

Concerns:


✅ Database Migration Review

Excellent adherence to CLAUDE.md rules:

  • ✓ Uses IF NOT EXISTS clauses properly
  • ✓ Idempotent unique constraint via PL/pgSQL block
  • ✓ Partial index for efficient lookups (only indexes non-null values)
  • ✓ No CREATE INDEX CONCURRENTLY (compliant)
  • ✓ Proper enum value additions with existence checks

Minor: Issue #6 about index duplication.


📊 Summary of Findings

Severity Count Issues
High 2 #1 (TOCTOU race), #2 (Signature bypass)
Medium 2 #3 (Phone conflict), #4 (Multiple queries)
Low 4 #5, #6, #7, #8

🎬 Recommended Actions (Priority Order)

  1. Fix Issue #1 - Replace idempotency check with tryClaimForProcessing()
  2. Fix Issue #2 - Remove or secure the SKIP_WEBHOOK_VERIFICATION flag
  3. Fix Issue #3 - Add strict WhatsApp ID → E.164 validation
  4. Consider Issue #4 - Wrap findOrCreate in transaction with SERIALIZABLE isolation
  5. Document - Clarify the difference between ElizaApp WhatsApp webhook (auto-provisioning) vs message-router WhatsApp (agent-based routing)

🏆 Overall Recommendation

Conditional Approval - This is a solid implementation with excellent test coverage and security practices. The issues identified are addressable and mostly involve race condition hardening and configuration safety. I recommend:

  1. Address Issues #1 and #2 before merging
  2. Consider Issues #3-4 for post-merge improvement
  3. Low priority issues can be tracked separately

Great work on the comprehensive integration! The cross-platform linking logic is particularly well thought out.

- Added routes for connecting, disconnecting, and checking the status of WhatsApp Business accounts.
- Introduced a WhatsApp Automation Service for credential validation, storage, and message management.
- Enhanced webhook handling for incoming messages, including signature verification and message processing with retry logic.
- Updated environment configuration and documentation to support new WhatsApp features.
- Added unit tests for the WhatsApp Automation Service to ensure functionality and reliability.

claude bot commented Feb 13, 2026

Code Review: WhatsApp Business Cloud API Integration

I've completed a comprehensive review of PR #320. This is a well-architected implementation with strong security practices and excellent test coverage. Below are my findings:


✅ Strengths

1. Security Implementation

  • ✅ Proper HMAC-SHA256 webhook signature verification with constant-time comparison (crypto.timingSafeEqual)
  • ✅ Rate limiting applied to all endpoints (STANDARD for auth, AGGRESSIVE for webhooks)
  • ✅ Idempotency handling to prevent duplicate message processing
  • ✅ Zod validation for all incoming payloads
  • ✅ Proper masking of sensitive data in logs (***${msg.from.slice(-4)})
  • ✅ Input validation on WhatsApp IDs (7-15 digits regex)

2. Database Design

  • ✅ Migration follows best practices from docs/database-migrations.md
  • ✅ Fully idempotent with IF NOT EXISTS guards
  • ✅ Partial index on whatsapp_id for efficient lookups
  • ✅ Unique constraint on whatsapp_id to prevent duplicates
  • ✅ Enum extensions properly guarded against re-execution
  • ✅ No CREATE INDEX CONCURRENTLY (correctly runs in transaction)

3. Architecture & Code Quality

  • ✅ Clean separation of concerns (auth service, user service, utilities)
  • ✅ Distributed locking prevents concurrent message processing races
  • ✅ Proper error handling with graceful degradation (mark-as-read retries)
  • ✅ Cross-platform account linking by phone number
  • ✅ Race condition handling in user creation (unique constraint recovery)
  • ✅ Extended maxDuration = 120s for ASSISTANT mode processing
  • ✅ Returns 503 on lock failure to trigger Meta webhook retry

4. Test Coverage

  • ✅ 772 lines of test code across 5+ test files
  • ✅ Unit tests for signature verification, payload parsing, auth flow
  • ✅ Integration tests for end-to-end webhook processing
  • ✅ Cross-platform linking scenarios tested
  • ✅ Edge cases covered (race conditions, duplicates, invalid signatures)

🔍 Issues Found

CRITICAL: Missing Comment Syntax in Migration

File: db/migrations/0030_add_whatsapp_identity_columns.sql:7

The migration has a typo that will cause a syntax error:

ALTER TABLE "users" ADD COLUMN IF NOT EXISTS "whatsapp_id" text;--> statement-breakpoint

The --> statement-breakpoint should be on its own line or use SQL comment syntax -- statement-breakpoint. This appears on lines 7-8.

Impact: Migration will fail during deployment.

Fix: Change to:

ALTER TABLE "users" ADD COLUMN IF NOT EXISTS "whatsapp_id" text; --> statement-breakpoint
ALTER TABLE "users" ADD COLUMN IF NOT EXISTS "whatsapp_name" text; --> statement-breakpoint

Or remove the comment entirely (it's a Drizzle artifact).


💡 Recommendations (Non-Blocking)

1. Webhook Retry Logic (app/api/eliza-app/webhook/whatsapp/route.ts:217-222)

The error handling marks messages as processed even on agent failure:

} catch (error) {
  logger.error("[ElizaApp WhatsAppWebhook] Agent failed", {
    error: error instanceof Error ? error.message : String(error),
    roomId,
  });
  return true; // Processing attempted, mark as processed to avoid infinite retry
}

Concern: Transient failures (DB timeout, temporary API errors) will silently drop messages.

Suggestion: Consider differentiating between retryable errors (503, network issues) and permanent errors (validation, user not found). For retryable errors, return false to allow Meta's webhook retry.
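
A rough sketch of that split, following the handler's convention that true means "mark as processed" and false means "let Meta retry". The UpstreamHttpError class and the list of network error codes are assumptions for illustration; the PR's actual error types are not shown in this thread.

```typescript
// Hypothetical error type; stands in for whatever the runtime or the
// WhatsApp client actually throws.
class UpstreamHttpError extends Error {
  status: number;
  constructor(message: string, status: number) {
    super(message);
    this.name = "UpstreamHttpError";
    this.status = status;
  }
}

// Assumed set of transient Node.js network error codes.
const TRANSIENT_CODES = ["ECONNRESET", "ETIMEDOUT", "ECONNREFUSED"];

function isRetryable(error: unknown): boolean {
  if (error instanceof UpstreamHttpError) {
    // Rate limiting and server-side failures are worth another attempt.
    return error.status === 429 || error.status >= 500;
  }
  if (error instanceof Error) {
    const code = (error as Error & { code?: unknown }).code;
    return typeof code === "string" && TRANSIENT_CODES.includes(code);
  }
  return false;
}

// Value the catch block would return: true = processed, false = retry.
function processedFlagFor(error: unknown): boolean {
  return !isRetryable(error);
}

console.log(processedFlagFor(new UpstreamHttpError("unavailable", 503))); // false
console.log(processedFlagFor(new Error("user not found"))); // true
```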

2. Lock TTL vs maxDuration Mismatch (route.ts:160)

const lock = await distributedLocks.acquireRoomLockWithRetry(roomId, 120000, {
  maxRetries: 10,
  initialDelayMs: 100,
  maxDelayMs: 2000,
});

With maxRetries: 10 and exponential backoff up to 2000ms, the total retry time could exceed 20 seconds. If the first request holds the lock for 120s, the second request could retry for 20s, then acquire the lock just as the first one times out.

Suggestion: Document this retry strategy or reduce maxRetries to 5 for faster failure detection.
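
To make the retry budget concrete, the sketch below sums the waits for the quoted options. It assumes the delay doubles after each attempt and is capped at maxDelayMs with no jitter; the actual schedule inside acquireRoomLockWithRetry is not shown in this thread, so treat the result as an estimate.

```typescript
// Sum of sleep times under an assumed doubling backoff capped at maxDelayMs.
function totalBackoffMs(maxRetries: number, initialDelayMs: number, maxDelayMs: number): number {
  let total = 0;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    total += Math.min(initialDelayMs * 2 ** attempt, maxDelayMs);
  }
  return total;
}

// Waits: 100, 200, 400, 800, 1600, then 2000 five times.
console.log(totalBackoffMs(10, 100, 2000)); // 13100
// Upper bound if every wait were already at the cap.
console.log(10 * 2000); // 20000
```

Under these assumptions the sleeps alone add up to about 13 seconds, and 20 seconds is the ceiling when every wait sits at the cap; time spent on the lock attempts themselves comes on top of either figure.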

3. Phone Number Normalization (user-service.ts:955)

The auto-derivation assumes WhatsApp IDs are always valid E.164 numbers:

const derivedPhone = `+${whatsappId.replace(/\D/g, "")}`;

Minor Risk: Some international numbers require country-code-specific formatting. This works for most cases, but edge cases (short codes, special regions) might not normalize correctly.

Suggestion: Add a comment documenting this assumption, or use a phone validation library like libphonenumber-js for robust normalization.

4. Permissive Rate Limit on GET Webhook (route.ts:344)

The GET webhook verification uses RateLimitPresets.STANDARD:

export const GET = withRateLimit(handleWhatsAppWebhookGet, RateLimitPresets.STANDARD);

Context: Meta only calls this endpoint once during webhook setup, so rate limiting is fine. However, STANDARD might be too permissive if an attacker tries to brute-force the verify token.

Suggestion: Consider AGGRESSIVE rate limit for the GET endpoint, or document why STANDARD is sufficient.

5. Test Coverage Gap: No Tests for Linking Conflicts

The user service handles WhatsApp linking conflicts (WHATSAPP_ALREADY_LINKED), but I didn't see explicit tests for:

  • Attempting to link a WhatsApp ID that's already linked to another user
  • Session-based linking idempotency

Suggestion: Add a test case in tests/unit/eliza-app/cross-platform-linking.test.ts to verify the conflict detection logic.


📋 Summary

Category Status
Security ✅ Excellent
Database ⚠️ Critical syntax error in migration
Architecture ✅ Well-designed
Tests ✅ Comprehensive coverage
Documentation ✅ Clear comments and env var docs

✅ Approval Status

Conditionally Approved pending migration syntax fix.

The migration syntax error is a blocking issue that will cause deployment failure. Once fixed, this PR is production-ready. All other recommendations are minor improvements that can be addressed in follow-up PRs.

Required Before Merge:

  1. Fix migration file syntax (lines 7-8)
  2. Run bun run db:migrate locally to verify migration applies cleanly
  3. Re-run tests to ensure no regressions

Optional Follow-Ups:

  • Improve webhook error retry logic (differentiate transient vs permanent errors)
  • Add test coverage for linking conflict scenarios
  • Document lock retry strategy

Great work on this integration! The code quality, security practices, and test coverage are all excellent. The WhatsApp integration follows the same patterns as the existing Telegram/iMessage implementations, which makes the codebase more maintainable.

Let me know if you'd like me to elaborate on any of the recommendations.

Deleted the migration file for adding WhatsApp identity columns to the users table and updated the migration index in the journal. This change ensures the migration history remains consistent and sequential.

claude bot commented Feb 13, 2026

PR Review: WhatsApp Business Cloud API Integration

I've completed a comprehensive review of PR #320. Overall, this is a well-architected and production-ready implementation with excellent attention to security, error handling, and testing. Below are my findings organized by category.


✅ Strengths

1. Excellent Security Implementation

  • HMAC-SHA256 signature verification using crypto.timingSafeEqual() to prevent timing attacks (lib/utils/whatsapp-api.ts:148)
  • Environment-based signature skipping only in dev mode with explicit guards (app/api/eliza-app/webhook/whatsapp/route.ts:230-246)
  • Input validation using Zod schemas for webhook payloads with proper error handling
  • Rate limiting applied to both GET and POST endpoints with appropriate presets (STANDARD for GET, AGGRESSIVE for POST)
  • WhatsApp ID format validation before processing to prevent injection (app/api/eliza-app/webhook/whatsapp/route.ts:81-87)
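
For reference, a minimal sketch of the signature check described in the first bullet, using Node's crypto module. Meta sends the signature in the X-Hub-Signature-256 header as "sha256=" followed by a hex digest of the raw request body. The function name verifyMetaSignature is hypothetical and is not the helper in lib/utils/whatsapp-api.ts.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: verifies an X-Hub-Signature-256 header value
// against the raw (unparsed) request body.
function verifyMetaSignature(
  rawBody: string,
  signatureHeader: string | null,
  appSecret: string,
): boolean {
  if (!signatureHeader || !signatureHeader.startsWith("sha256=")) return false;
  const received = Buffer.from(signatureHeader.slice("sha256=".length), "hex");
  const expected = createHmac("sha256", appSecret).update(rawBody, "utf8").digest();
  // timingSafeEqual throws when lengths differ, so check lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}
```

The digest has to be computed over the exact bytes received, so the body should be read as text before any JSON parsing.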

2. Robust Race Condition Handling

  • Idempotency keys for duplicate webhook delivery prevention (app/api/eliza-app/webhook/whatsapp/route.ts:280-302)
  • Distributed locks with configurable TTL and retry backoff to prevent concurrent message processing (app/api/eliza-app/webhook/whatsapp/route.ts:160-169)
  • Unique constraint recovery in user service with proper TOCTOU protection (lib/services/eliza-app/user-service.ts:993-1001)
  • Database constraint handling throughout user provisioning flows

3. Smart Cross-Platform Linking

  • Automatic phone number derivation from WhatsApp ID (WhatsApp ID = E.164 without +)
  • 3-way linking strategy: by whatsapp_id → by phone_number → create new user
  • Seamless integration with existing Telegram/iMessage/Discord users through phone number matching
  • Session-based linking support for authenticated users adding WhatsApp to their account

4. Production-Grade Error Handling

  • 503 responses on lock failures to trigger webhook retry from Meta
  • Graceful degradation for non-critical operations (read receipts, profile updates)
  • Structured logging with sanitized PII (masked phone numbers, WhatsApp IDs)
  • Retry logic with exponential backoff for mark-as-read API calls

5. Comprehensive Test Coverage

  • 390 lines of unit tests for WhatsApp API utilities
  • Integration tests for end-to-end webhook flows
  • Cross-platform linking tests covering edge cases
  • Security tests for signature verification and tampered payloads

6. Clean Migration

  • Fully idempotent SQL with IF NOT EXISTS guards
  • Proper enum extension with duplicate-check logic
  • Partial index for efficient WhatsApp ID lookups on non-null values
  • Follows CLAUDE.md migration guidelines (no CONCURRENTLY, uses transactions)

🔍 Issues Found

Critical Issues

None - No critical security vulnerabilities or data integrity issues found.

High Priority Issues

None - No blocking bugs identified.

Medium Priority Issues

1. Unhandled Rejection Risk in Fire-and-Forget markRead()

Location: app/api/eliza-app/webhook/whatsapp/route.ts:108

markRead(); // Fire-and-forget - no await

Issue: The markRead() async function is called without await, which means:

  • Unhandled rejections could crash the process in strict mode
  • No guarantee it completes before the response is sent
  • Silent failures won't be caught by the outer try-catch

Recommendation: Either:

  • Add .catch() handler: markRead().catch(err => logger.warn(...))
  • Or document the fire-and-forget behavior more explicitly with a comment about intentional non-blocking

2. Missing Timeout on WhatsApp API Fetch Calls

Location: lib/utils/whatsapp-api.ts:227-234 and lib/utils/whatsapp-api.ts:270-278

const response = await fetch(url, {
  method: "POST",
  headers: { ... },
  body: JSON.stringify(body),
  // Missing: signal: AbortSignal.timeout(10000)
});

Issue: No timeout configured for external API calls to Meta's servers. This could cause:

  • Request hanging indefinitely if Meta's API is slow
  • Serverless function timeout (120s max) being consumed
  • Potential resource exhaustion under load

Recommendation: Add timeout to all fetch calls:

signal: AbortSignal.timeout(10000), // 10s timeout

3. Hardcoded Retry Count in markRead()

Location: app/api/eliza-app/webhook/whatsapp/route.ts:91

const markRead = async (retries = 2) => {

Issue: Retry logic is embedded in the webhook handler rather than in the utility function. This makes it:

  • Harder to test the retry behavior
  • Inconsistent with other API calls (sendWhatsAppMessage doesn't have retry)
  • Duplicated if needed elsewhere

Recommendation: Move retry logic to markWhatsAppMessageAsRead() in whatsapp-api.ts or create a reusable withRetry() helper.
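
A reusable helper along those lines might look like the sketch below. The name withRetry comes from the recommendation above, but the option names and the doubling backoff are assumptions rather than anything defined in the PR.

```typescript
interface RetryOptions {
  retries: number; // additional attempts after the first one
  initialDelayMs: number;
  maxDelayMs: number;
}

// Runs `operation`, retrying on rejection with a doubling delay that is
// capped at maxDelayMs. The last error is rethrown once retries run out.
async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  let delay = options.initialDelayMs;
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= options.retries) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
      delay = Math.min(delay * 2, options.maxDelayMs);
    }
  }
}

// Usage (hypothetical call):
// await withRetry(() => markWhatsAppMessageAsRead(messageId), {
//   retries: 2, initialDelayMs: 200, maxDelayMs: 1000,
// });
```

Keeping the policy in one place lets the retry behavior be unit-tested with a fake operation, independent of the webhook route.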

Low Priority Issues

4. Response Text Extraction Could Be More Defensive

Location: app/api/eliza-app/webhook/whatsapp/route.ts:200-204

const responseText =
  typeof responseContent === "string"
    ? responseContent
    : responseContent?.text || "";

Issue: If responseContent is an object without a text property, it sends an empty message to WhatsApp. This might be confusing for users.

Recommendation: Log a warning when the agent returns unexpected content format.
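
A more defensive extraction could return null for anything that is not usable text, leaving the caller to log a warning and skip the send. The type and function names below are hypothetical; the shape of the agent's response content is assumed from the snippet above.

```typescript
// Assumed shape of what the agent may return.
type AgentContent = string | { text?: unknown } | null | undefined;

// Returns trimmed text, or null when there is nothing sensible to send.
function extractResponseText(content: AgentContent): string | null {
  const raw = typeof content === "string" ? content : content?.text;
  if (typeof raw !== "string") return null;
  const trimmed = raw.trim();
  return trimmed.length > 0 ? trimmed : null;
}

console.log(extractResponseText({ text: " hello " })); // "hello"
console.log(extractResponseText({})); // null
```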

5. Magic Number for Lock TTL

Location: app/api/eliza-app/webhook/whatsapp/route.ts:160

const lock = await distributedLocks.acquireRoomLockWithRetry(roomId, 120000, {

Issue: The 120000 (120s) is tied to maxDuration but not explicitly linked in code. If maxDuration changes, the lock TTL should change too.

Recommendation:

const LOCK_TTL_MS = maxDuration * 1000;
const lock = await distributedLocks.acquireRoomLockWithRetry(roomId, LOCK_TTL_MS, {

6. Phone Number Derivation Assumes E.164 Format

Location: lib/services/eliza-app/user-service.ts:955

const derivedPhone = `+${whatsappId.replace(/\D/g, "")}`;

Issue: While WhatsApp IDs are typically digits only, the comment assumes E.164 format. However:

  • Not all country codes work with just "+" prefix (some need 00)
  • No validation that the derived phone is actually valid E.164

Impact: Low - WhatsApp IDs are always in international format, but worth documenting.

Recommendation: Add a comment explaining the assumption or use whatsappIdToE164() from whatsapp-api.ts for consistency.


🎯 Performance Considerations

Good Practices

  • Partial indexes on nullable WhatsApp columns reduce index size
  • Single query for user lookup with organization join (no N+1)
  • Efficient room lookup before creation to prevent duplicates
  • Fire-and-forget read receipts don't block response

Potential Optimizations

  1. Database connection pooling: Verify Drizzle pool size can handle concurrent webhooks
  2. Caching: Consider caching agent config or phone number mappings (low-value optimization, current approach is fine)

📋 Test Coverage Assessment

Excellent Coverage

  • ✅ Signature verification (valid, invalid, tampered, wrong secret)
  • ✅ Webhook payload parsing and validation
  • ✅ Message extraction from nested payloads
  • ✅ Phone number format conversion (E.164 ↔ WhatsApp ID)
  • ✅ Cross-platform linking scenarios
  • ✅ E2E webhook verification handshake

Suggested Additional Tests

  1. Idempotency: Test that duplicate webhook deliveries don't create duplicate users or messages
  2. Lock contention: Test behavior when multiple webhooks arrive for the same room simultaneously
  3. Agent failure: Test webhook behavior when Eliza agent throws an error (currently marks as processed - verify this is correct)
  4. Empty/null responses: Test when agent returns empty response

🔒 Security Assessment

Threats Mitigated ✅

  • Replay attacks: Signature verification + idempotency keys
  • Timing attacks: crypto.timingSafeEqual() for signature comparison
  • Injection attacks: WhatsApp ID validation + Zod schema validation
  • DoS attacks: Rate limiting + metadata size limits
  • Race conditions: Distributed locks + unique constraints

Security Best Practices

  • ✅ Secrets loaded from environment variables
  • ✅ PII sanitization in logs (masked phone numbers)
  • ✅ HTTPS-only for WhatsApp API calls
  • ✅ Webhook signature verification required in production
  • ✅ Rate limiting on all public endpoints

Recommendations

  1. Consider webhook IP allowlisting: Meta publishes their webhook IP ranges - could add an extra security layer
  2. Monitor failed signature attempts: Track and alert on repeated signature failures (potential attack)

📝 Code Quality

Strengths

  • Clear, descriptive variable names and function signatures
  • Comprehensive JSDoc comments explaining complex flows
  • Consistent error handling patterns
  • Follows CLAUDE.md guidelines (Bun, Next.js 15, Drizzle migrations)
  • TypeScript types properly defined for all WhatsApp API structures

Minor Suggestions

  1. Extract magic strings: "whatsapp:eliza-app:" prefix could be a constant
  2. DRY principle: The generateSlugFrom* functions could potentially share a helper
  3. Comment clarity: The 3-way linking logic in findOrCreateByWhatsAppId is well-commented, but a diagram in the PR description would help

✅ Final Verdict

Approve with minor suggestions

This PR demonstrates:

  • Strong security fundamentals
  • Production-ready error handling
  • Excellent test coverage
  • Thoughtful cross-platform design
  • Clean, maintainable code

The medium-priority issues are non-blocking and can be addressed in a follow-up PR or before merge. The implementation is solid and ready for staging deployment.

Recommended Actions Before Merge

  1. ✅ Add timeout to WhatsApp API fetch calls (5 min fix)
  2. ✅ Add .catch() handler to fire-and-forget markRead() (2 min fix)
  3. Optional: Address low-priority issues in follow-up PR

Post-Merge Recommendations

  1. Monitor webhook signature failure rates in production
  2. Set up alerts for lock acquisition failures (503 responses)
  3. Track WhatsApp API error rates and latency
  4. Consider IP allowlisting for Meta's webhook servers

Great work on this integration! The attention to detail in race condition handling and cross-platform linking is particularly impressive. 🚀

… webhook handlers

- Added `startWhatsAppTypingIndicator` and `sendWhatsAppTypingIndicator` functions to manage typing indicators in WhatsApp messages.
- Integrated performance tracing in the WhatsApp webhook handlers to monitor key processing stages.
- Enhanced user provisioning and message processing logic with improved error handling and logging.
- Updated related services to support new typing indicator functionality.

claude bot commented Feb 25, 2026

PR Review: WhatsApp Business Cloud API Integration

Overall this is a well-structured integration with good documentation, Zod validation, HMAC-SHA256 signature verification, and solid idempotency patterns. A few issues need attention before merge, ranging from a security concern to bugs and architecture notes.


Security: Auth endpoint authenticates by phone number alone

File: app/api/eliza-app/auth/whatsapp/route.ts

The POST /api/eliza-app/auth/whatsapp endpoint issues a full JWT session to any caller who provides a whatsapp_id with no proof that the caller actually controls that WhatsApp account. WhatsApp IDs are just phone number digits (7-15 chars), making them enumerable from known or leaked phone number lists.

// Anyone who knows a valid whatsapp_id gets a session token
const userWithOrg = await elizaAppUserService.getByWhatsAppId(whatsappId);
const session = await elizaAppSessionService.createSession(...);

The rate limiting (RateLimitPresets.STANDARD) is not sufficient for a credential-issuance endpoint. Consider:

  • A short-lived one-time token (nonce) sent in the WhatsApp conversation by the bot, then submitted and verified here
  • Or a signed claim from the webhook handler passed to the frontend via a secure redirect
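
The first option could be sketched as follows: the bot issues a single-use code inside the WhatsApp conversation, and the auth endpoint only creates a session when that code is redeemed for the same whatsapp_id. All names here are hypothetical, and the in-memory Map is for illustration only; a serverless deployment would need shared storage such as Redis or a database table.

```typescript
import { randomBytes } from "node:crypto";

interface PendingNonce {
  whatsappId: string;
  expiresAt: number; // epoch milliseconds
}

// Illustration only: process-local storage does not survive across
// serverless invocations.
const pendingNonces = new Map<string, PendingNonce>();

// Called by the webhook handler; the returned code is sent to the user
// in the WhatsApp chat.
function issueNonce(whatsappId: string, ttlMs = 300000, now = Date.now()): string {
  const nonce = randomBytes(16).toString("hex");
  pendingNonces.set(nonce, { whatsappId, expiresAt: now + ttlMs });
  return nonce;
}

// Called by the auth endpoint before creating a session.
function redeemNonce(nonce: string, whatsappId: string, now = Date.now()): boolean {
  const entry = pendingNonces.get(nonce);
  if (!entry) return false;
  pendingNonces.delete(nonce); // single use, even if the checks below fail
  return entry.whatsappId === whatsappId && now <= entry.expiresAt;
}
```

With this in place, knowing a phone number is no longer enough: the caller must also have read a message delivered to that WhatsApp account.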

Bug: Double stopTyping() call in Eliza App webhook

File: app/api/eliza-app/webhook/whatsapp/route.ts

stopTyping() (which calls clearInterval) is called in both the inner lock finally and the outer finally, so clearInterval runs twice on the success path:

} finally {
  stopTyping();       // 1st call (inside lock block)
  await lock.release();
}
// ...
} finally {
  stopTyping();       // 2nd call (outer block) — redundant
  perfTrace.end();
}

clearInterval with an already-cleared ID is a no-op, but this is dead code. Move stopTyping() exclusively to the outer finally.


Bug: Non-atomic idempotency in Eliza App webhook

File: app/api/eliza-app/webhook/whatsapp/route.ts

The Eliza App webhook uses the older non-atomic TOCTOU pattern:

if (await isAlreadyProcessed(idempotencyKey)) { continue; }  // check
await markAsProcessed(idempotencyKey, ...);                   // mark (gap here!)

Two concurrent Meta deliveries can both pass the isAlreadyProcessed check before either inserts the key. The org-level webhook at app/api/webhooks/whatsapp/[orgId]/route.ts already correctly uses the atomic tryClaimForProcessing. The Eliza App webhook should match:

const claimed = await tryClaimForProcessing(idempotencyKey, "whatsapp-eliza-app");
if (!claimed) { continue; }
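
The contract that matters is that check and mark happen as one step, so exactly one caller wins per key. The in-memory class below only illustrates that contract; it is not how tryClaimForProcessing is implemented, which is not shown in this thread and would need a shared store to work across instances.

```typescript
// Illustration of claim semantics only. A process-local Set cannot
// coordinate between serverless instances.
class ClaimStore {
  private readonly claimed = new Set<string>();

  // Check and insert in a single synchronous step: there is no await
  // between them, so no other caller can interleave.
  tryClaim(key: string): boolean {
    if (this.claimed.has(key)) return false;
    this.claimed.add(key);
    return true;
  }

  // Lets a failed handler give the key back so a retry can claim it.
  release(key: string): void {
    this.claimed.delete(key);
  }
}

const store = new ClaimStore();
// Two deliveries of the same message: only the first one proceeds.
console.log(store.tryClaim("wamid.example")); // true
console.log(store.tryClaim("wamid.example")); // false
```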

Architecture: In-memory status cache is ineffective in serverless

File: lib/services/whatsapp-automation/index.ts

private statusCache = new Map<string, CachedStatus>();

On Vercel serverless each request can land on a different container, so this Map is reset on every invocation. Every GET /api/v1/whatsapp/status call will make a live Meta Graph API call, which could hit Meta rate limits if the settings page polls frequently. Consider a Redis-backed cache (already used elsewhere in the project) or add a comment noting the cache only benefits local dev.


Partial credential storage on failure

File: lib/services/whatsapp-automation/index.ts (storeCredentials)

The five sequential createOrUpdateSecret calls have no rollback. If the third call fails, WHATSAPP_ACCESS_TOKEN and WHATSAPP_PHONE_NUMBER_ID are stored but WHATSAPP_APP_SECRET is not, leaving the org in a broken state where signature verification will silently fail. Consider a try/catch that deletes already-stored keys on failure, or store all credentials together as a single JSON secret.
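
The try/catch option could be sketched as below. The SecretStore interface is a hypothetical stand-in for the project's secrets service, whose real API (beyond the createOrUpdateSecret name) is not shown in this thread.

```typescript
// Hypothetical minimal interface over the secrets service.
interface SecretStore {
  set(name: string, value: string): Promise<void>;
  delete(name: string): Promise<void>;
}

// Writes every secret, or removes the ones already written if any
// write fails, then rethrows the original error.
async function storeAllOrNothing(
  store: SecretStore,
  secrets: Record<string, string>,
): Promise<void> {
  const written: string[] = [];
  try {
    for (const [name, value] of Object.entries(secrets)) {
      await store.set(name, value);
      written.push(name);
    }
  } catch (error) {
    // Best-effort rollback: a failed delete must not mask the cause.
    await Promise.allSettled(written.map((name) => store.delete(name)));
    throw error;
  }
}
```

One caveat with this approach: if a key already existed and was being updated, deleting it on rollback loses the previous value. Storing all credentials as a single JSON secret, the other option mentioned above, avoids that case.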


Redundant functions for the same operation

File: lib/utils/idempotency.ts

removeProcessedMark (added in this PR) and releaseProcessingClaim (also added in this PR) do exactly the same thing — delete a key from the idempotency table. Consolidate to one export and update the Eliza App webhook to use it.


Minor: Dynamic import in hot path

File: app/api/webhooks/whatsapp/[orgId]/route.ts

const { messageRouterService } = await import("@/lib/services/message-router");

Dynamic imports add cold-start latency on first invocation per container. Prefer a top-level import unless a circular dependency forces the dynamic approach.


Minor: Auth health check GET has no rate limiting

File: app/api/eliza-app/auth/whatsapp/route.ts

The GET export only returns {status: "ok"} so the risk is low, but for consistency with the other auth routes it should either use withRateLimit or be removed.


What's well done

  • HMAC-SHA256 signature verification uses crypto.timingSafeEqual correctly — no timing oracle
  • SKIP_WEBHOOK_VERIFICATION is correctly ignored in production in both webhook handlers
  • Zod schemas for webhook payload parsing with strict z.literal("whatsapp_business_account")
  • Race-condition handling in findOrCreateByWhatsAppId with unique-constraint recovery
  • Partial index on whatsapp_id (WHERE whatsapp_id IS NOT NULL) is good practice for sparse columns
  • Cross-platform phone-number linking logic is sound and well-commented
  • Org-level webhook correctly uses atomic tryClaimForProcessing
  • Test coverage is thorough: unit tests, integration tests, and race-condition scenarios
  • markRead fire-and-forget is appropriate for non-critical UX operations
  • Phone number masking in logs (last 4 digits only)

Summary: The auth endpoint security issue is the highest priority — issuing JWT sessions based solely on knowing a phone number is too weak. The non-atomic idempotency race condition and double stopTyping are bugs to fix before merge. The remaining items can be addressed as follow-ups.

@lalalune lalalune merged commit 9ed94af into dev Mar 8, 2026
4 checks passed
@hanzlamateen hanzlamateen deleted the feat/eliza-app-whatsapp-support branch March 9, 2026 10:39