-
Notifications
You must be signed in to change notification settings - Fork 0
User Flows
End-to-end sequence diagrams for every use case (UC-01..UC-15) plus cross-cutting flows (idempotency, outbox dispatch, event dedup, auth). All diagrams in Mermaid - render natively in GitHub Wiki. Each flow links to its Use Case and shows the happy path plus the most critical alternative paths.
- Solid arrow = synchronous call (HTTP/gRPC)
- Dashed arrow = asynchronous (Kafka event, scheduled job, callback)
-
:separator denotes a returned value - DB writes inside a transaction are grouped under a
rectwith note - "Outbox" = transactional outbox row in the same DB transaction
- "Dispatcher" = background worker reading outbox and publishing to Kafka
Every API call passes through this flow before reaching the use-case logic.
sequenceDiagram
participant Client
participant Gateway
participant Keycloak
participant Service
Client->>Keycloak: 1. POST /realms/fincore/protocol/openid-connect/token<br/>(client_credentials grant)
Keycloak-->>Client: 2. JWT access_token (TTL 5m, RS256)
Client->>Gateway: 3. Request with Authorization: Bearer <jwt>
Gateway->>Gateway: 4. Verify JWT signature (cached JWKS)
Gateway->>Gateway: 5. Check scope, roles, audience
alt JWT invalid
Gateway-->>Client: 401 Unauthorized
else JWT valid
Gateway->>Service: 6. Forward request<br/>X-User-Id, X-Roles propagated
Service->>Service: 7. @PreAuthorize role check
Service-->>Gateway: 8. Response
Gateway-->>Client: 9. Response
end
Key points:
- JWKS cached in Gateway for 10 minutes (rotation tolerated)
- Scope-based authorization:
ledger:write,payments:initiate,compliance:resolve - All calls get
correlationId(UUID) injected if not present, propagated asX-Correlation-Id
Every POST/PUT/DELETE that mutates state respects Idempotency-Key header.
sequenceDiagram
participant Client
participant API
participant DB
participant Service
Client->>API: POST /v1/<resource><br/>Idempotency-Key: abc-123
API->>DB: SELECT * FROM idempotency_keys<br/>WHERE key='abc-123' FOR UPDATE
alt Key exists, hash matches
DB-->>API: cached response_status, response_body
API-->>Client: Same status, same body (replay)
else Key exists, hash mismatch
API-->>Client: 409 Conflict<br/>RFC 7807 Problem Details
else Key missing
API->>Service: Process request
Service->>Service: Compute response
Service->>DB: BEGIN TRANSACTION
Service->>DB: Business writes
Service->>DB: INSERT INTO idempotency_keys<br/>(key, request_hash, response_status, response_body, expires_at = now+24h)
Service->>DB: COMMIT
Service-->>API: response
API-->>Client: response (cached for 24h replay)
end
Key points:
- TTL = 24h (configurable). Cleanup job runs nightly.
-
request_hash= SHA-256 of normalized request body. Mismatch = 409. - The idempotency row is written in the same transaction as business state. Atomic.
sequenceDiagram
autonumber
participant Client
participant Gateway
participant LedgerService as Ledger Service
participant DB
participant OutboxDispatcher as Outbox Dispatcher
participant Kafka
Client->>Gateway: POST /v1/accounts<br/>{ name, type, currency }<br/>Idempotency-Key: k-001
Gateway->>Gateway: AuthN/Z (LEDGER_WRITER)
Gateway->>LedgerService: createAccount(cmd)
LedgerService->>DB: SELECT * FROM idempotency_keys WHERE key=k-001
alt Key not found
rect rgb(225, 245, 254)
note right of DB: Single DB transaction (REPEATABLE_READ)
LedgerService->>DB: INSERT INTO accounts (id=UUID, name, type, currency, status=ACTIVE, version=0)
LedgerService->>DB: INSERT INTO outbox_events (event_type=account.created, aggregate_id=accountId, payload)
LedgerService->>DB: INSERT INTO idempotency_keys (key, hash, response, expires_at)
LedgerService->>DB: COMMIT
end
LedgerService-->>Gateway: 201 Created { account }
Gateway-->>Client: 201 Created
end
note over OutboxDispatcher: After commit, async
OutboxDispatcher->>DB: SELECT * FROM outbox_events WHERE status='PENDING'
OutboxDispatcher->>Kafka: PUBLISH topic=ledger.events<br/>key=accountId, payload, headers={correlationId}
Kafka-->>OutboxDispatcher: ack
OutboxDispatcher->>DB: UPDATE outbox_events SET status='PUBLISHED', published_at=now()
Failure modes:
- DB commit fails → 500, no event published, client retries with same idempotency key
- Kafka publish fails → outbox stays
PENDING, dispatcher retries with exponential backoff - After N publish failures → row moves to
FAILED, alerts fire
sequenceDiagram
autonumber
participant Client
participant Gateway
participant LedgerService as Ledger Service
participant DB
participant Trigger as DB Trigger<br/>(invariant check)
participant OutboxDispatcher as Outbox Dispatcher
participant Kafka
Client->>Gateway: POST /v1/transactions<br/>{ reference, entries: [{accountId, amount, direction}] }<br/>Idempotency-Key: tx-007
Gateway->>Gateway: AuthN/Z (LEDGER_POSTER)
Gateway->>LedgerService: postTransaction(cmd)
LedgerService->>LedgerService: Validate (>=2 entries, sum=0, single currency)
LedgerService->>DB: SELECT * FROM idempotency_keys WHERE key=tx-007
alt Key not found
rect rgb(255, 243, 224)
note right of DB: REPEATABLE_READ transaction with deferred trigger
LedgerService->>DB: SELECT * FROM accounts<br/>WHERE id IN (...) FOR UPDATE
DB-->>LedgerService: Locked accounts
alt Any account inactive
LedgerService-->>Gateway: 409 Conflict
Gateway-->>Client: 409 Conflict
else All active
LedgerService->>DB: INSERT INTO transactions (id, reference, status='POSTED')
LedgerService->>DB: INSERT INTO entries (...) [N rows]
note over Trigger: Trigger fires AFTER INSERT,<br/>deferred to COMMIT time
LedgerService->>DB: COMMIT
Trigger->>Trigger: Verify SUM(entries.amount) = 0<br/>per (transaction_id, currency)
alt Invariant violated
Trigger-->>DB: ROLLBACK
DB-->>LedgerService: SQLException 22000
LedgerService-->>Gateway: 422 Unprocessable Entity
Gateway-->>Client: 422 (invariant violation)
else Invariant OK
Trigger-->>DB: ALLOW
DB-->>LedgerService: COMMIT OK
end
LedgerService->>DB: INSERT INTO outbox_events (event_type=transaction.posted)
LedgerService->>DB: INSERT INTO idempotency_keys
LedgerService->>DB: REFRESH MATERIALIZED VIEW CONCURRENTLY account_balances
end
end
LedgerService-->>Gateway: 201 Created { transaction }
Gateway-->>Client: 201 Created
end
note over OutboxDispatcher: After commit, async
OutboxDispatcher->>Kafka: PUBLISH topic=ledger.events<br/>event=transaction.posted
note over Kafka: Downstream consumers triggered:<br/>AML check, notifications, analytics
Why deferred trigger:
- Invariant
SUM(entries) = 0only verifiable after all entries are inserted - Postgres
CONSTRAINT TRIGGER ... DEFERRABLE INITIALLY DEFERREDruns at COMMIT time - If violated, the entire transaction is rolled back atomically - no partial state
Concurrency safety:
-
SELECT ... FOR UPDATEon accounts prevents lost updates -
versioncolumn on accounts catches concurrent balance changes via optimistic locking on the materialized view refresh - If two postings to the same account race, one retries
sequenceDiagram
autonumber
participant Operator
participant Gateway
participant LedgerService
participant DB
participant Outbox
Operator->>Gateway: POST /v1/transactions/{id}/reverse<br/>{ reason }<br/>Idempotency-Key: r-042
Gateway->>Gateway: AuthN/Z (LEDGER_REVERSER)
Gateway->>LedgerService: reverseTransaction(id, reason)
LedgerService->>DB: SELECT * FROM transactions WHERE id=:id
alt Not found
LedgerService-->>Gateway: 404
else Already REVERSED
LedgerService-->>Gateway: 409 Conflict
else POSTED
rect rgb(255, 243, 224)
note right of DB: Single transaction, same invariants apply
LedgerService->>DB: INSERT INTO transactions (id', reference=orig.reference+"-rev", status='POSTED', reverses_id=orig.id)
LedgerService->>DB: INSERT INTO entries (id', transaction_id=id', accountId, amount=-orig.amount, direction=opposite)
LedgerService->>DB: UPDATE transactions SET status='REVERSED' WHERE id=:id
LedgerService->>DB: INSERT INTO outbox_events (event_type=transaction.reversed)
LedgerService->>DB: COMMIT
end
LedgerService-->>Gateway: 201 Created { reversingTransaction }
end
Outbox-->>Outbox: Publishes transaction.reversed event
Audit: original transaction is immutable - never deleted. The journal grows monotonically.
sequenceDiagram
autonumber
participant Client
participant Gateway
participant PaymentService as Payment Service
participant DecisionEngine as Decision Engine
participant LedgerService as Ledger Service
participant DB
participant Outbox
participant Kafka
Client->>Gateway: POST /v1/payments<br/>{ from, to, amount, currency }<br/>Idempotency-Key: p-100
Gateway->>Gateway: AuthN/Z (PAYMENTS_INITIATOR)
Gateway->>PaymentService: initiatePayment(cmd)
PaymentService->>PaymentService: Validate
PaymentService->>DB: BEGIN<br/>Check idempotency_keys
alt Key not found
PaymentService->>LedgerService: getBalance(fromAccountId)
LedgerService-->>PaymentService: balance
alt Insufficient balance
PaymentService->>DB: INSERT INTO payments (state='FAILED', reason='insufficient_balance')
PaymentService-->>Gateway: 422
else Sufficient
PaymentService->>DecisionEngine: evaluate({ amount, currency, fromAccountId, toAccountId, ... })
DecisionEngine->>DB: SELECT active rules WHERE rule_set_id='payment-screening'
DecisionEngine->>DecisionEngine: Apply rules deterministically
DecisionEngine->>DB: INSERT INTO decision_logs (input, matched_rules, decision)
DecisionEngine-->>PaymentService: { decision: APPROVE, matchedRules, explanation }
alt decision = REJECT
PaymentService->>DB: INSERT INTO payments (state='REJECTED')
PaymentService->>DB: INSERT INTO outbox_events (event=payment.rejected)
PaymentService->>DB: COMMIT
PaymentService-->>Gateway: 202 Accepted (rejected)
else decision = REVIEW
PaymentService->>DB: INSERT INTO payments (state='PENDING_REVIEW')
PaymentService->>DB: INSERT INTO compliance_cases (...)
PaymentService->>DB: INSERT INTO outbox_events (event=payment.pending_review)
PaymentService->>DB: COMMIT
PaymentService-->>Gateway: 202 (held)
else decision = APPROVE
rect rgb(255, 243, 224)
note right of DB: Single transaction across services<br/>(co-located DB schemas)
PaymentService->>DB: INSERT INTO payments (state='PROCESSING')
PaymentService->>LedgerService: postTransaction(...)
LedgerService->>DB: INSERT transactions, entries (UC-04 inner flow)
PaymentService->>DB: INSERT INTO outbox_events (event=payment.created)
PaymentService->>DB: INSERT INTO idempotency_keys
PaymentService->>DB: COMMIT
end
PaymentService-->>Gateway: 202 Accepted { paymentId, state='PROCESSING' }
end
end
end
note over Outbox,Kafka: Async after commit
Outbox->>Kafka: payment.created event
Kafka-->>Kafka: Triggers Bank Adapter (UC-08 callback later)
Why payments and ledger share a DB transaction:
- Modular monolith - single deployment, single DB instance, separate schemas per bounded context
- Payment row + ledger transaction row + outbox row all written atomically
- After extraction to microservices: replaced by Saga pattern (orchestrated, with compensating transactions)
sequenceDiagram
autonumber
participant Bank as External Bank
participant Gateway
participant WebhookService as Webhook Service
participant DB
participant PaymentService
participant Outbox
Bank->>Gateway: POST /v1/webhooks/payments/{providerId}<br/>X-Provider-Signature: <hmac><br/>Body: { providerEventId, paymentRef, status, ... }
Gateway->>WebhookService: ingest(rawBody, signature, providerId)
WebhookService->>DB: SELECT secret FROM provider_secrets WHERE provider_id=...
WebhookService->>WebhookService: Verify HMAC-SHA256(rawBody, secret) == signature
alt Invalid signature
WebhookService-->>Gateway: 401 Unauthorized<br/>Log forensic event
Gateway-->>Bank: 401
else Valid signature
WebhookService->>DB: SELECT * FROM processed_webhooks<br/>WHERE provider_event_id=...
alt Already processed (replay)
WebhookService-->>Gateway: 200 OK (idempotent ack)
else New event
rect rgb(255, 243, 224)
note right of DB: Atomic
WebhookService->>DB: INSERT INTO processed_webhooks (provider_event_id, processed_at)
WebhookService->>PaymentService: updatePaymentStatus(paymentRef, status)
PaymentService->>DB: SELECT * FROM payments WHERE provider_ref=...
alt Payment found
PaymentService->>DB: UPDATE payments SET state=newState, updated_at=now()<br/>WHERE id=:id AND version=:v
PaymentService->>DB: INSERT INTO outbox_events (event=payment.<status>)
alt Failure status & ledger transaction exists
note right of DB: Reverse the ledger transaction (UC-05)
PaymentService->>DB: INSERT reversing transaction + entries
end
end
PaymentService->>DB: COMMIT
end
WebhookService-->>Gateway: 200 OK
Gateway-->>Bank: 200
end
end
Outbox->>Outbox: Publishes payment.<status> event
Replay attacks:
- HMAC alone insufficient - replay of valid signed payload would re-credit
-
provider_event_iddeduplication blocks replays - For extra safety: also include
timestampin signed payload, reject events older than 5 min
sequenceDiagram
autonumber
participant Cron as Scheduled Job<br/>(every 1 min)
participant PaymentService as Payment Service
participant DB
participant BankAdapter as Bank Adapter
Cron->>PaymentService: scanPendingRetries()
PaymentService->>DB: SELECT * FROM payments<br/>WHERE state='FAILED' AND retryable=true<br/>AND next_retry_at <= now()<br/>AND attempts < max_attempts<br/>FOR UPDATE SKIP LOCKED LIMIT 100
loop for each payment
PaymentService->>BankAdapter: send(payment, idempotencyKey=payment.idempotencyKey)
BankAdapter-->>PaymentService: response
alt Success
PaymentService->>DB: UPDATE payments SET state='PROCESSING', attempts+=1
PaymentService->>DB: INSERT INTO outbox_events (event=payment.retried_success)
else Transient failure
PaymentService->>DB: UPDATE payments SET attempts+=1,<br/>next_retry_at = now() + backoff(attempts)
else Permanent failure
PaymentService->>DB: UPDATE payments SET state='PERMANENTLY_FAILED'
PaymentService->>DB: INSERT INTO outbox_events (event=payment.permanently_failed)
end
end
Backoff schedule (exponential): 1m → 5m → 15m → 1h → 6h → 24h. After 6 attempts → PERMANENTLY_FAILED.
SKIP LOCKED - multiple worker instances can run simultaneously without picking the same payment.
sequenceDiagram
autonumber
participant Client
participant Gateway
participant ComplianceService as Compliance Service
participant DB
participant KycProvider as Sandbox KYC Provider<br/>(plug-in)
participant Outbox
Client->>Gateway: POST /v1/kyc/sessions<br/>{ userId, providerId? }<br/>Idempotency-Key: kyc-1
Gateway->>Gateway: AuthN/Z
Gateway->>ComplianceService: createKycSession(cmd)
ComplianceService->>DB: INSERT INTO kyc_sessions (id, user_id, status='PENDING', provider='sandbox')
ComplianceService->>KycProvider: startSession(userId, returnUrl)
KycProvider-->>ComplianceService: { providerSessionId, hostedUrl, status='AWAITING_DOCUMENTS' }
ComplianceService->>DB: UPDATE kyc_sessions SET provider_session_id=...
ComplianceService-->>Gateway: 201 { sessionId, hostedUrl, status }
Gateway-->>Client: 201
note over Client,KycProvider: Out-of-band: user uploads docs to KycProvider URL
KycProvider->>Gateway: POST /v1/webhooks/kyc/{providerId}<br/>{ sessionId, status='APPROVED', evidence }
Gateway->>ComplianceService: handleKycCallback(...)
ComplianceService->>DB: UPDATE kyc_sessions SET status='APPROVED', evidence_ref=...
ComplianceService->>DB: INSERT INTO outbox_events (event=kyc.approved, user_id, session_id)
ComplianceService-->>Gateway: 200
Outbox->>Outbox: kyc.approved event
PII protection:
- Document content NEVER stored in our DB - only
evidence_refpointing to KYC provider - Session metadata encrypted at rest (Postgres-level transparent encryption)
- Logs scrub PII (only IDs, hashes)
sequenceDiagram
autonumber
participant Kafka
participant ComplianceConsumer as Compliance Consumer
participant DB
participant DecisionEngine
participant Outbox
Kafka->>ComplianceConsumer: transaction.posted event<br/>{ transactionId, accountIds, amount, currency, postedAt }
ComplianceConsumer->>DB: SELECT * FROM processed_events<br/>WHERE event_id=event.id
alt Already processed (Kafka redelivery)
ComplianceConsumer->>Kafka: ack offset (no-op)
else New event
ComplianceConsumer->>DecisionEngine: evaluate({ context: txn data, ruleSet: 'aml-screening' })
DecisionEngine->>DB: SELECT * FROM aml_rules WHERE active=true ORDER BY priority
DecisionEngine->>DecisionEngine: Apply rules
DecisionEngine->>DB: INSERT INTO decision_logs (input, matched_rules, decision)
DecisionEngine-->>ComplianceConsumer: { matchedRules, decision }
alt No rules matched
ComplianceConsumer->>DB: INSERT INTO processed_events (event_id, processed_at)
ComplianceConsumer->>Kafka: ack offset
else Rules matched
rect rgb(255, 243, 224)
ComplianceConsumer->>DB: INSERT INTO aml_alerts (id, transaction_id, risk_score, reasons)
ComplianceConsumer->>DB: INSERT INTO compliance_cases (id, alert_id, status='OPEN')
ComplianceConsumer->>DB: INSERT INTO outbox_events (event=aml.flagged)
ComplianceConsumer->>DB: INSERT INTO processed_events (event_id, processed_at)
ComplianceConsumer->>DB: COMMIT
end
ComplianceConsumer->>Kafka: ack offset
Outbox->>Outbox: aml.flagged event
end
end
Idempotent consumer:
-
processed_eventstable tracks every event_id we've handled - Kafka may redeliver (at-least-once) - dedup makes it effectively-once
- Insert + business action + ack offset must be atomic - pattern: process → DB commit → ack
sequenceDiagram
autonumber
participant Officer as Compliance Officer
participant Gateway
participant ComplianceService
participant DB
participant LedgerService
participant LlmCopilot as AML Copilot<br/>(optional plug-in)
participant Outbox
Officer->>Gateway: GET /v1/compliance/cases/{id}
Gateway->>ComplianceService: getCaseDetails(id)
ComplianceService->>DB: SELECT case + alert + transaction + linked entries
ComplianceService->>LlmCopilot: explainAlert(alert)<br/>(if configured)
LlmCopilot-->>ComplianceService: explanation, suggested_steps
ComplianceService-->>Gateway: { case, transaction, evidence, ai_explanation }
Officer->>Officer: Reviews evidence
Officer->>Gateway: POST /v1/compliance/cases/{id}/resolve<br/>{ decision: REJECT, reason: "Sanctions match: SDN" }<br/>Idempotency-Key: res-1
Gateway->>ComplianceService: resolveCase(id, decision, reason)
rect rgb(255, 243, 224)
ComplianceService->>DB: UPDATE compliance_cases SET status='RESOLVED', decision, reason, resolved_by, resolved_at
alt decision = REJECT
ComplianceService->>LedgerService: reverseTransaction(originalTxnId, reason)
LedgerService->>DB: insert reversing transaction (UC-05)
end
ComplianceService->>DB: INSERT INTO outbox_events (event=compliance.case.resolved)
ComplianceService->>DB: COMMIT
end
ComplianceService-->>Gateway: 200 OK { caseStatus }
Gateway-->>Officer: 200
Outbox->>Outbox: compliance.case.resolved event
Audit trail: every resolution requires reason (free text + structured code). Resolved cases are immutable - re-opening creates a new case.
sequenceDiagram
autonumber
participant Caller as Internal Service<br/>(Payments, Compliance, Onboarding)
participant DecisionEngine
participant DB
Caller->>DecisionEngine: evaluate({ context, ruleSetId })
DecisionEngine->>DB: SELECT * FROM decision_rules<br/>WHERE rule_set_id=:rsi AND active=true<br/>ORDER BY priority DESC
DB-->>DecisionEngine: ordered rule list
loop for each rule
DecisionEngine->>DecisionEngine: Apply rule.conditions to context
alt Match
DecisionEngine->>DecisionEngine: Append to matchedRules
alt rule.terminate=true
Note over DecisionEngine: Stop iteration
end
end
end
DecisionEngine->>DecisionEngine: Aggregate decision (APPROVE/REJECT/REVIEW per ruleset config)
DecisionEngine->>DB: INSERT INTO decision_logs<br/>(input_payload, matched_rules, decision, latency_ms, rule_version)
DecisionEngine-->>Caller: { decision, matchedRules, explanation }
Determinism guarantees:
- Same input + same active rules + same
decision_logs.created_at→ same output -
rule_versionsnapshot - even if rule changes later, log can be replayed against old version - p99 < 10ms for ≤100 rules
sequenceDiagram
autonumber
participant Admin as Risk Admin
participant Gateway
participant DecisionEngine
participant DB
participant Outbox
Admin->>Gateway: POST /v1/decision/rules<br/>{ name, conditions, action, priority, ruleSetId }
Gateway->>Gateway: AuthN/Z (DECISION_ADMIN)
Gateway->>DecisionEngine: createRule(cmd)
DecisionEngine->>DecisionEngine: Validate JSON DSL syntax
DecisionEngine->>DecisionEngine: Validate referenced fields against schema
alt Invalid DSL
DecisionEngine-->>Gateway: 400 Bad Request<br/>RFC 7807 with offending JSON path
else Valid
rect rgb(255, 243, 224)
DecisionEngine->>DB: INSERT INTO decision_rules (id, ruleSetId, name, definition_json, status='DRAFT', version=1)
DecisionEngine->>DB: INSERT INTO rule_versions (rule_id, version, definition_json, created_by)
DecisionEngine->>DB: COMMIT
end
DecisionEngine-->>Gateway: 201 { rule (DRAFT) }
end
Note over Admin: Rule must be activated separately
Admin->>Gateway: POST /v1/decision/rules/{id}/activate
Gateway->>DecisionEngine: activateRule(id)
rect rgb(255, 243, 224)
DecisionEngine->>DB: UPDATE decision_rules SET status='ACTIVE'
DecisionEngine->>DB: INSERT INTO outbox_events (event=decision.rule.activated)
DecisionEngine->>DB: COMMIT
end
DecisionEngine-->>Gateway: 200 { active rule }
Outbox->>Outbox: decision.rule.activated event
Versioning:
- Every change to a rule creates a new
rule_versionsrow - Old versions queryable via
GET /v1/decision/rules/{id}/versions -
decision_logsreferencesrule_versionat evaluation time - replay-friendly - Rules are never deleted - only deactivated (
status='DEPRECATED')
sequenceDiagram
autonumber
participant Client
participant Gateway
participant WebhookService
participant DB
participant Subscriber as Subscriber's Endpoint
participant DispatcherJob as Webhook Dispatcher<br/>(periodic)
Client->>Gateway: POST /v1/webhooks/subscriptions<br/>{ url, events:[...], secret }
Gateway->>WebhookService: createSubscription(cmd)
WebhookService->>DB: INSERT INTO webhook_subscriptions
WebhookService-->>Gateway: 201 { subscriptionId }
Note over DispatcherJob: Background, every 5 sec
DispatcherJob->>DB: SELECT pending events × matching subscriptions<br/>FROM outbox_events JOIN webhook_subscriptions
loop for each (event, subscription)
DispatcherJob->>DispatcherJob: Sign payload (HMAC-SHA256 with subscription.secret)
DispatcherJob->>Subscriber: POST {subscription.url}<br/>X-Signature: <hmac><br/>X-Event-Id, X-Event-Type, X-Delivery-Id
alt 2xx
DispatcherJob->>DB: INSERT INTO webhook_deliveries (status='DELIVERED', http_status, latency_ms)
else non-2xx or timeout
DispatcherJob->>DB: INSERT INTO webhook_deliveries (status='FAILED', http_status, error)
DispatcherJob->>DB: UPDATE delivery_attempts<br/>SET attempts+=1, next_retry_at=now+backoff
alt attempts >= 7
DispatcherJob->>DB: UPDATE status='PERMANENTLY_FAILED'
DispatcherJob->>DB: INSERT INTO alerts (subscription_dead, ...)
end
end
end
Backoff schedule: 1m → 5m → 30m → 6h → 24h → 3d → 7d (7 attempts).
Subscriber verification: subscriber re-computes HMAC-SHA256(rawBody, theirSecret) and compares with X-Signature header.
sequenceDiagram
autonumber
participant Worker as Outbox Worker<br/>(per-pod, lease-based)
participant DB
participant Kafka
loop every 100ms
Worker->>DB: SELECT * FROM outbox_events<br/>WHERE status='PENDING'<br/>ORDER BY created_at<br/>FOR UPDATE SKIP LOCKED LIMIT 100
alt batch is empty
Note over Worker: Sleep, retry
else batch found
loop for each event
Worker->>Kafka: PUBLISH(topic, key=aggregate_id, payload, headers)
alt success
Worker->>DB: UPDATE outbox_events SET status='PUBLISHED', published_at=now() WHERE id=:id
else failure
Worker->>DB: UPDATE outbox_events SET attempts+=1, last_error=:err WHERE id=:id
alt attempts >= 5
Worker->>DB: UPDATE status='FAILED' (alerts fire)
end
end
end
end
end
Guarantees:
- At-least-once delivery to Kafka (Worker crash between Kafka ack and DB UPDATE → republish next iteration; idempotent consumers dedup)
-
Ordering per aggregate preserved (Kafka partition by
aggregate_id) - No event loss - DB transaction ensures outbox row is committed atomically with business state
-
Multi-worker safe -
SKIP LOCKEDmeans leader election is not needed
sequenceDiagram
autonumber
participant Kafka
participant Consumer
participant DB
Kafka->>Consumer: poll() returns batch
loop for each record
Consumer->>DB: SELECT 1 FROM processed_events<br/>WHERE event_id=:id
alt Already processed
Consumer->>Consumer: skip
else New
rect rgb(255, 243, 224)
Consumer->>Consumer: handle(event) [pure business logic]
Consumer->>DB: INSERT INTO processed_events (event_id, processed_at)
Consumer->>DB: business writes (in same transaction)
Consumer->>DB: COMMIT
end
end
end
Consumer->>Kafka: commitSync(offsets)
The contract:
- Business writes +
processed_eventsinsert happen in one transaction - Kafka offset commit happens after DB commit
- Crash between DB commit and Kafka commit → reprocessing finds
processed_eventsrow, skips, ok - Crash between handle() and DB commit → DB rolls back, message redelivered, processed cleanly
- No "exactly-once" promised - at-least-once delivery + idempotent processing = effectively-once
| Failure Class | Detection | Action |
|---|---|---|
| Validation error | Service input check | Return 400/422 with RFC 7807 |
| Idempotency conflict | Same key, different body hash | Return 409 |
| Authorization failure | JWT invalid / role missing | Return 401/403 |
| Resource not found | DB SELECT empty | Return 404 |
| Business invariant violation | DB trigger / service check | Return 422, transaction rolled back |
| Optimistic lock failure | Hibernate OptimisticLockException
|
Auto-retry up to 3 times, then 503 |
| External service timeout | Bank/KYC adapter exceeds budget | Mark FAILED with retryable=true, scheduled retry |
| Kafka publish failure | Outbox dispatcher non-success | Increment attempts, retry, alert after 5 |
| Webhook delivery failure | Subscriber non-2xx | Backoff retry, dead after 7 attempts |
| DB unavailable | Connection pool exhausted | Health probe fails, K8s restarts pod |
| Flow | p50 | p99 | SLO source |
|---|---|---|---|
| Create account | 80ms | 200ms | UC-01 AC-01.4 |
| Get balance | 15ms | 50ms | UC-02 AC-02.3 (cached) |
| Post transaction | 100ms | 300ms | UC-04 AC-04.5 |
| Initiate payment | 150ms | 500ms | UC-06 (decision + ledger) |
| Decision evaluation | 3ms | 10ms | UC-13 AC-13.2 |
| Webhook ingest | 100ms | 500ms | UC-08 AC-08.3 |
These are commitments to ourselves, baked into integration tests. Failure to meet them in CI = release blocker.
- Overview
- Services
- Data Model
- Domain Model
- Event Flow
- Security
- Observability
- Resilience
- SLA / SLI / SLO