Skip to content

Architecture Services

Tiana_ edited this page May 30, 2026 · 1 revision

Architecture - Services

Detailed responsibilities, contracts and boundaries of each service in FinCore Engine. Companion to Architecture-Overview, Architecture-Event-Flow, Architecture-Security.


Service catalog (v0.1 → v1.0)

Service First version OSS Maven artifact (post-extraction) Deployment
API Gateway v0.2 - (Spring Cloud Gateway) 3 replicas
Ledger Service v0.1 (MVP) com.fincore:ledger-service 3 replicas, HPA 3-10
Payment Service v0.2 com.fincore:payment-service 3 replicas, HPA 3-10
Compliance Service v0.3 com.fincore:compliance-service 2 replicas
Decision Engine v0.2 com.fincore:decision-engine (also library JAR) 2 replicas, can be embedded
Webhook Service v0.2 com.fincore:webhook-service 2 replicas
Outbox Dispatcher v0.1 com.fincore:outbox-dispatcher 2 replicas (lease-based)
fincore-ml (library) v0.3 com.fincore:fincore-ml embedded in services
fincore-llm (library) v0.3 com.fincore:fincore-llm embedded in services
Sandbox Bank Adapter v0.1 com.fincore:sandbox-bank-adapter embedded in payment service
Sandbox KYC Adapter v0.3 com.fincore:sandbox-kyc-adapter embedded in compliance service
Reconciliation Service v0.4 com.fincore:reconciliation-service 1 replica, scheduled
Reporting Service Y1 H2 com.fincore:reporting-service 1 replica, scheduled

Service: Ledger

Repo (post-extraction): tiana-code/ledger-service Bounded context: Ledger Database schema: ledger Domain: see Domain-Model - Account, Transaction, Entry aggregates

Responsibilities

  • Manage Account lifecycle (create, freeze, close)
  • Post double-entry transactions with SUM=0 invariant enforcement
  • Reverse transactions (compensating, never delete)
  • Compute and serve Balances (via materialized view)
  • Emit ledger.events (account.created, transaction.posted, transaction.reversed, account.frozen, account.closed)

NOT responsibilities

  • Money movement orchestration (that's Payment Service)
  • Currency conversion (delegated to bank provider)
  • Compliance decisions (delegated to Decision Engine + Compliance Service)
  • Persistence of operational metadata (customer info, domain-specific data - never in ledger)

Public API surface

Method Path Description
POST /v1/accounts Create account
GET /v1/accounts/{id} Get account
GET /v1/accounts/{id}/balance Get balance (cached MV)
GET /v1/accounts/{id}/balance?asOf=<ISO-8601> Time-travel balance (Killer Feature #5)
GET /v1/accounts/{id}/entries Cursor-paginated entries
PATCH /v1/accounts/{id} Update name, metadata, status
POST /v1/transactions Post double-entry transaction
GET /v1/transactions/{id} Get transaction with entries
POST /v1/transactions/{id}/reverse Create compensating transaction
GET /v1/health Spring Actuator health

Internal contract surface (called by other services)

Method Description
LedgerService.postTransaction(cmd) Idempotent post
LedgerService.reverseTransaction(id, reason) Reverse
LedgerService.getBalance(accountId, currency, asOf?) Balance query

In modular monolith: in-process Kotlin call. After extraction: gRPC or REST.

Events emitted (Kafka topic: ledger.events)

account.created            { accountId, type, currency, status, createdAt }
account.frozen             { accountId, reason, frozenAt, frozenBy }
account.unfrozen           { accountId, unfrozenAt, unfrozenBy }
account.closed             { accountId, closedAt, closedBy }
transaction.posted         { transactionId, reference, entries[], postedAt }
transaction.reversed       { transactionId, reverseTransactionId, reason, reversedAt }
balance.refreshed          { accountId, currency, oldBalance, newBalance } (high-volume, optional)

Invariants enforced (database level)

  • SUM(entries.amount) = 0 per (transaction_id, currency) - deferred trigger
  • entries.account_id REFERENCES accounts(id) - FK
  • entries.currency matches accounts.currency at posting time (service-level check)
  • transactions.reference is unique
  • accounts.version optimistic locking
  • All FK with ON DELETE RESTRICT (no cascade deletes - journal is immutable)

Performance targets

Metric Target
Post transaction p99 < 300ms
Get balance p99 (cached MV) < 50ms
List entries p99 (50 items) < 100ms
Sustained writes/sec (single instance) 1000
Burst writes/sec (single instance) 3500

Storage

  • PostgreSQL 17 schema ledger
  • Tables: accounts, transactions, entries, account_balances (MV)
  • Partitioning kicks in at 10M entries (by created_at quarter)
  • Indexes: see Data-Model for full DDL

Configuration

fincore:
  ledger:
    storage:
      provider: postgres   # or tigerbeetle (Y1 H2)
    invariants:
      enforce-sum-zero: true       # always true in production
      enforce-currency-match: true
      enforce-account-active: true
    balance-mv:
      refresh-mode: after-each-post   # or scheduled-30s
    partitioning:
      enabled: true
      partition-by: created_at_quarter

Service: Payment

Repo (post-extraction): tiana-code/payment-service Bounded context: Payments Database schema: payments Domain: Payment aggregate with state machine (see Domain-Model)

Responsibilities

  • Manage Payment lifecycle (CREATED → PROCESSING → COMPLETED/FAILED/...)
  • Coordinate with Decision Engine for pre-payment evaluation
  • Coordinate with Ledger for transaction posting
  • Coordinate with Bank Adapter for external payment send
  • Handle bank webhooks (success/failure callbacks)
  • Schedule retries with exponential backoff
  • Idempotency on all writes
  • Emit payment.events (payment.created, payment.processing, payment.completed, payment.failed, payment.permanently_failed, payment.rejected, payment.pending_review, payment.reversed)

NOT responsibilities

  • Business logic of accepting payments (the application layer above us decides "should I accept?")
  • KYC verification (delegated to Compliance Service)
  • Direct ledger entry construction (delegated to Ledger Service)
  • Money storage (we never hold customer funds)

Public API surface

Method Path Description
POST /v1/payments Initiate payment
GET /v1/payments/{id} Get payment status
GET /v1/payments/{id}?include=events Get with full event history
POST /v1/payments/{id}/cancel Cancel (if in CREATED/PENDING_REVIEW)
POST /v1/payments/{id}/retry Manual retry (if FAILED)
GET /v1/payments List with filters (state, accountId, dateRange, cursor)
POST /v1/webhooks/payments/{providerId} Inbound bank provider webhook

Plug-in interfaces (implementations swappable)

interface BankProvider {
    suspend fun send(payment: Payment, idempotencyKey: String): BankSendResponse
    suspend fun queryStatus(providerRef: ProviderReference): BankStatus
    suspend fun cancel(providerRef: ProviderReference): BankCancelResponse
    val capabilities: Set<BankCapability>   // e.g., SEPA_INSTANT, ACH, SWIFT
}

interface BankProviderRegistry {
    fun get(providerId: String): BankProvider
    fun listActive(): List<BankProvider>
}

Sandbox provider (com.fincore:sandbox-bank-adapter) is bundled and active by default. Real provider adapters are out of OSS scope.

Events emitted (Kafka topic: payment.events)

payment.created             { paymentId, fromAccountId, toAccountId|counterparty, amount, currency, idempotencyKey }
payment.decision_evaluated  { paymentId, decision, matchedRules, decisionLogId }
payment.processing          { paymentId, ledgerTransactionId, providerRef? }
payment.completed           { paymentId, completedAt }
payment.failed              { paymentId, error, retryable, attempts }
payment.permanently_failed  { paymentId, finalError, attempts }
payment.rejected            { paymentId, reason, reasonCode }
payment.pending_review      { paymentId, complianceCaseId }
payment.reversed            { paymentId, reverseTransactionId, reason }
payment.cancelled           { paymentId, cancelledBy, cancelledAt }

State machine guard (in PaymentServiceImpl)

Every state transition validated against the explicit transition matrix. Illegal transitions throw IllegalStateException. See Domain-Model for the full state machine.

Performance targets

Metric Target
Initiate payment p99 (full sync flow) < 500ms
Get payment p99 < 100ms
Webhook ingest p99 < 500ms
Sustained payments/sec (single instance) 500

Storage

  • Schema payments
  • Tables: payments, payment_events, processed_webhooks, payment_retries
  • Indexes on idempotency_key, provider_ref, state + next_retry_at

Configuration

fincore:
  payments:
    bank-providers:
      default: sandbox
      enabled: [sandbox]
    retry:
      max-attempts: 6
      backoff: [1m, 5m, 15m, 1h, 6h, 24h]
    webhook:
      signature-algorithm: HMAC-SHA256
      replay-window: 5m
      timestamp-skew-tolerance: 30s
    decision:
      ruleset-id: "payment-screening"
      timeout: 200ms
      fail-safe: REVIEW   # if Decision Engine unavailable, default to REVIEW

Service: Compliance

Repo (post-extraction): tiana-code/compliance-service Bounded context: Compliance Database schema: compliance

Responsibilities

  • KYC orchestration (provider-agnostic)
  • AML transaction screening (consumer of ledger.events.transaction.posted)
  • Compliance Case management (open, assign, resolve)
  • LLM-powered AML Copilot (optional plug-in)
  • SAR/STR draft generation (Y1 H2 - Killer Feature #15)
  • Sanctions screening orchestration (uses external SanctionsProvider)
  • Emit compliance.events

Public API surface

KYC

Method Path Description
POST /v1/kyc/sessions Start KYC session
GET /v1/kyc/sessions/{id} Get session status
POST /v1/kyc/sessions/{id}/documents Upload document reference
POST /v1/webhooks/kyc/{providerId} Inbound provider webhook

AML

Method Path Description
GET /v1/aml/alerts List alerts
GET /v1/aml/alerts/{id} Get alert details
POST /v1/aml/rules Create AML rule (admin)
GET /v1/aml/rules List rules

Cases

Method Path Description
GET /v1/compliance/cases List cases (operator UI)
GET /v1/compliance/cases/{id} Get case details + AI explanation
POST /v1/compliance/cases/{id}/claim Claim for review
POST /v1/compliance/cases/{id}/resolve Resolve with decision + reason
POST /v1/compliance/cases/{id}/escalate Escalate to senior officer
POST /v1/compliance/cases/{id}/notes Add note (audit)

Plug-in interfaces

interface KycProvider {
    suspend fun createSession(userId: UserId, returnUrl: HttpUrl): KycSessionInit
    suspend fun queryStatus(providerSessionId: String): KycSessionStatus
}

interface SanctionsProvider {
    suspend fun screen(party: Party): SanctionsScreeningResult
    val listVersion: String   // for audit
}

interface AmlCopilot {                              // optional, LLM-powered
    suspend fun explainAlert(alert: AmlAlert): String
    suspend fun draftReport(case: ComplianceCase): DraftReport
    suspend fun suggestNextSteps(case: ComplianceCase): List<NextStep>
}

interface RuleSynthesizer {                         // optional, LLM-powered
    suspend fun synthesizeRule(naturalLanguage: String, schema: ContextSchema): DraftRule
    suspend fun explainRule(rule: DecisionRule): NaturalLanguageExplanation
}

Events emitted (Kafka topic: compliance.events)

kyc.session.created      { sessionId, userId, provider, expiresAt }
kyc.approved             { sessionId, userId, evidence }
kyc.rejected             { sessionId, userId, reason }
kyc.expired              { sessionId, userId }
aml.flagged              { alertId, transactionId, riskScore, matchedRules }
aml.case.opened          { caseId, alertId|paymentId|kycSessionId, priority }
aml.case.assigned        { caseId, assignedTo }
aml.case.resolved        { caseId, decision, reason, resolvedBy }
aml.case.escalated       { caseId, escalatedBy, escalatedTo }
sanctions.list.updated   { provider, version, addedCount, removedCount }

Events consumed

Topic Reason
ledger.events.transaction.posted Run AML screening on every transaction
payment.events.payment.created Run pre-decision sanctions check (alongside Decision Engine)
decision.events.rule.activated Invalidate cached rules

Storage

  • Schema compliance
  • Tables: kyc_sessions, kyc_documents (refs only, no PII), aml_alerts, aml_rules, compliance_cases, case_notes, case_evidence_refs, processed_events (consumer dedup)

Configuration

fincore:
  compliance:
    kyc:
      providers:
        default: sandbox
        enabled: [sandbox]
      session-ttl: 30d
    aml:
      ruleset-id: "aml-screening"
      auto-create-cases-above-risk-score: 70
    sanctions:
      provider: sandbox
      refresh-interval: 1h
    copilot:
      provider: noop                    # or openai, anthropic, ollama
      model: claude-haiku-4-5
      timeout: 10s
      fail-safe: noop                   # if LLM down, no AI explanation, case still resolvable
    pii-encryption:
      key-management: vault

Service: Decision Engine

Repo (post-extraction): tiana-code/decision-engine ← may stand on its own as a hit project Bounded context: Decision Database schema: decision

Responsibilities

  • Manage decision rules (CRUD + versioning)
  • Evaluate decisions deterministically (synchronous, low-latency)
  • Maintain immutable decision_logs for audit/replay
  • Validate JSON DSL syntax
  • LLM-powered rule synthesis (optional plug-in - Killer Feature #3)
  • Replay mode (Killer Feature #1)

NOT responsibilities

  • Side-effects (the engine is pure - no external calls, no DB writes outside decision_logs and rule storage)
  • Workflow orchestration (single-shot evaluations)
  • ML scoring (advisory only, separate RiskScorer interface)

Public API surface

Method Path Description
POST /v1/decision/evaluate Evaluate decision (synchronous)
GET /v1/decision/logs/{id} Get decision log entry
GET /v1/decision/logs?correlationId=... Find logs by correlation
POST /v1/decision/replay Replay range (Killer Feature)
POST /v1/decision/rules Create rule (DRAFT)
POST /v1/decision/rules/{id}/activate Activate (DRAFT → ACTIVE)
POST /v1/decision/rules/{id}/deprecate Deprecate
GET /v1/decision/rules List rules with filters
GET /v1/decision/rules/{id}/versions Version history
POST /v1/decision/rules/synthesize LLM rule synthesis (optional)
POST /v1/decision/rules/{id}/explain LLM explanation of rule

Library mode

The Decision Engine is also publishable as a Maven artifact com.fincore:decision-engine. Adopters can embed:

dependencies {
    implementation("com.fincore:decision-engine:0.2.0")
}

@Configuration
class MyDecisionConfig {
    @Bean
    fun decisionEngine(): DecisionEngine = DecisionEngine.embedded(
        ruleStore = JdbcRuleStore(dataSource),
        auditLog = JdbcAuditLog(dataSource),
        clock = Clock.systemUTC(),
    )
}

This positions Decision Engine as a Drools-replacement for modern Kotlin/Spring teams.

DSL examples

{
  "id": "high-amount-foreign-country",
  "ruleSetId": "payment-screening",
  "priority": 100,
  "terminate": false,
  "definition": {
    "conditions": {
      "all": [
        { "field": "amount.amount", "op": ">", "value": 10000 },
        { "field": "amount.currency", "op": "=", "value": "EUR" },
        { "field": "destination.country", "op": "in", "value": ["NG", "PK", "RU"] }
      ]
    },
    "action": {
      "decision": "REVIEW",
      "reason": "high_amount_high_risk_country"
    }
  }
}

Supported operators: =, !=, <, <=, >, >=, in, not_in, matches (regex), contains, is_null, is_not_null. Logical: all, any, none. Nested.

Performance targets

Metric Target
Evaluate p99 (≤100 active rules) < 10ms
Evaluate p99 (≤1000 active rules) < 50ms
Sustained evaluations/sec 5000

Storage

  • Schema decision
  • Tables: decision_rules, rule_versions, decision_logs, replay_runs
  • Logs retained 7+ years (regulatory)

Events emitted (Kafka topic: decision.events)

decision.rule.created       { ruleId, ruleSetId, version=1, status=DRAFT }
decision.rule.activated     { ruleId, ruleSetId, version, activatedAt, activatedBy }
decision.rule.deprecated    { ruleId, ruleSetId, deprecatedAt, deprecatedBy, reason }
decision.replay.completed   { replayId, range, summary, divergenceCount }

Service: Webhook

Repo (post-extraction): tiana-code/webhook-service Database schema: platform

Responsibilities

Outbound (subscriptions)

  • Manage WebhookSubscription records
  • Sign and deliver events to subscriber URLs (HMAC-SHA256)
  • Retry with exponential backoff (1m, 5m, 30m, 6h, 24h, 3d, 7d)
  • Mark PERMANENTLY_FAILED after 7 attempts
  • Provide delivery inspection API

Inbound (provider webhooks)

  • Routes inbound webhooks to appropriate service (Payment, Compliance)
  • Verifies provider signatures
  • Performs replay-protection deduplication
  • Exposes /v1/webhooks/<provider> endpoints

Public API

Method Path Description
POST /v1/webhooks/subscriptions Create subscription
GET /v1/webhooks/subscriptions List own subscriptions
GET /v1/webhooks/subscriptions/{id} Get subscription
PATCH /v1/webhooks/subscriptions/{id} Update (events, secret rotation)
DELETE /v1/webhooks/subscriptions/{id} Delete
GET /v1/webhooks/subscriptions/{id}/deliveries Delivery history
POST /v1/webhooks/subscriptions/{id}/deliveries/{dId}/retry Manual retry
POST /v1/webhooks/payments/{providerId} Inbound from bank
POST /v1/webhooks/kyc/{providerId} Inbound from KYC

Outbound delivery flow

See User-Flows#uc-15-subscribe-to-webhooks.

Storage

  • Schema platform
  • Tables: webhook_subscriptions, webhook_deliveries, processed_webhooks (inbound dedup)

Service: Outbox Dispatcher

Repo (post-extraction): tiana-code/outbox-dispatcher Library also publishable as com.fincore:outbox-dispatcher

Responsibilities

  • Poll all outbox_events tables across schemas
  • Publish to Kafka (preserves per-aggregate ordering via partition key)
  • Mark dispatched (idempotent on consumer side via processed_events)
  • Track failures, alert on backlog

Operational characteristics

  • Lease-based work: multiple replicas use SELECT FOR UPDATE SKIP LOCKED to share work safely
  • Polling interval: 100ms (configurable, can drop to 50ms for low-latency scenarios)
  • Batch size: 100 events per poll
  • Per-aggregate ordering: events with same aggregate_id go to same Kafka partition (key = aggregate_id)
  • Failure handling: 5 failed attempts → row marked FAILED, alert; manual intervention required

Configuration

fincore:
  outbox:
    poll-interval: 100ms
    batch-size: 100
    max-attempts: 5
    failure-alert-threshold: 1   # any FAILED row triggers alert
    schemas: [ledger, payments, compliance, decision, platform]

Metrics

  • outbox.events.pending - gauge per schema
  • outbox.events.published.total - counter
  • outbox.events.failed.total - counter
  • outbox.dispatcher.lag.seconds - time from row creation to publish

Service: API Gateway

Repo (post-extraction): tiana-code/api-gateway Tech: Spring Cloud Gateway (reactive, on Netty)

Responsibilities

  • TLS termination (or delegated to ingress)
  • JWT validation (cached JWKS)
  • Rate limiting (Redis token bucket, per-route + per-user)
  • Request routing to backend services
  • Correlation ID injection
  • Request/response logging
  • CORS handling
  • Compression
  • Path rewriting (/v1/... → service-internal /...)

Configuration (excerpt)

spring:
  cloud:
    gateway:
      routes:
        - id: ledger
          uri: http://ledger-service
          predicates:
            - Path=/v1/accounts/**, /v1/transactions/**
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 100
                redis-rate-limiter.burstCapacity: 200
                key-resolver: "#{@principalKeyResolver}"
        - id: payments
          uri: http://payment-service
          predicates:
            - Path=/v1/payments/**, /v1/webhooks/payments/**

resilience4j:
  ratelimiter:
    instances:
      api-default:
        limit-for-period: 1000
        limit-refresh-period: 1m
        timeout-duration: 0s

Performance targets

Metric Target
Gateway overhead p99 < 10ms
Concurrent connections per replica 10000
TLS handshake p99 < 100ms

Service-to-service communication

v0.1 Modular Monolith (in-process)

Services interact via Kotlin function calls - LedgerService, DecisionEngineService etc. injected via Spring DI. Type-safe, zero overhead, but boundaries enforced via:

  • Module ownership in settings.gradle.kts (Gradle modules)
  • ArchUnit tests forbidding cross-context entity imports
  • Module-private packages (internal Kotlin visibility)

Post-extraction (microservices, Y1 H2+)

Same Kotlin interfaces become REST clients generated from OpenAPI:

// Auto-generated from openapi.yaml
class LedgerServiceRestClient(
    private val webClient: WebClient,
) : LedgerService {
    override suspend fun postTransaction(cmd: PostTransactionCommand): Transaction =
        webClient.post().uri("/v1/transactions")
            .bodyValue(cmd)
            .retrieve()
            .awaitBody()
}

Drop-in replacement - calling code doesn't change. Resilience patterns (circuit breaker, retry) added at the client layer.

For latency-critical synchronous calls (Decision Engine evaluations during payment), gRPC is an alternative - saves serialization cost at p99. Decision after measuring v0.1 performance.


Inter-service contracts (v0.1 - modular monolith)

Even in the same JVM, we treat inter-service calls as black-box contracts documented as if remote:

// In ledger-api (published interface, no implementation visible)
interface LedgerService {
    /**
     * Post a double-entry transaction. Idempotent via cmd.idempotencyKey.
     *
     * @throws InvalidTransactionException if invariants violated
     * @throws AccountNotFoundException if any account missing
     * @throws AccountInactiveException if any account not ACTIVE
     * @throws ConcurrencyException if optimistic lock fails after retries
     *
     * @return PostedTransaction with txn ID
     */
    suspend fun postTransaction(cmd: PostTransactionCommand): PostedTransaction

    suspend fun reverseTransaction(id: TransactionId, reason: String): PostedTransaction

    suspend fun getBalance(accountId: AccountId, currency: Currency, asOf: Instant?): Balance
}

Implementations are not exposed across module boundaries.


Service deployment matrix

Service Stateless DB Cache Kafka role Scaling strategy
Gateway yes no yes (rate limit) no HPA on CPU + RPS
Ledger yes yes (writes) no producer (via outbox) HPA on CPU + DB pool usage
Payment yes yes (writes) yes (idempotency) producer + consumer HPA on CPU + queue depth
Compliance yes yes (writes) yes (rules) consumer (heavy) + producer HPA on consumer lag
Decision Engine yes yes (rules R/W, logs W) yes (rules) producer HPA on CPU
Webhook yes yes (subscriptions, deliveries) no consumer (all topics) HPA on outbox queue depth
Outbox Dispatcher yes (lease-based) yes (reads, marks) no producer Vertical (1-3 replicas, lease)

Cross-cutting decisions documented as ADRs

  • ADR-0001 - modular monolith vs microservices for v0.1
  • ADR-0002 - license choice
  • ADR-0003 - outbox pattern
  • ADR-0004 - Hibernate over R2DBC
  • ADR-0005 - Keycloak adoption
  • ADR-0006 - Redpanda default
  • ADR-0007 - invariant enforcement strategy
  • ADR-0008 - decision engine DSL choice
  • ADR-0009 - license alternative analysis
  • ADR-0010 (planned) - saga pattern adoption when extracting services
  • ADR-0011 (planned) - TigerBeetle adapter integration

Clone this wiki locally