Architecture Overview

Tiana_ edited this page May 30, 2026 · 1 revision

High-level view of FinCore Engine's structure, using the C4 model. Four levels of zoom: System Context → Containers → Components → Code (the last one is the source itself).

Companion pages: Architecture-Services, Architecture-Event-Flow, Architecture-Security, Architecture-Observability, Architecture-SLA-SLI-SLO.


Architectural style

  • Modular monolith (today) - single deployable artifact, internally split into bounded contexts (Ledger, Payments, Compliance, Decision, Platform). See ADR-0001.
  • Event-driven core - outbox pattern + Kafka-API broker for cross-context communication. See ADR-0003.
  • Hexagonal layering per service - domain, application (ports), infrastructure (adapters). See Domain-Model for domain layer details.
  • Pluggable adapters for external systems - bank, KYC, AML data, sanctions, LLM, ML scorers. Sandbox implementations in OSS; real ones swapped per deployment.
  • Polyrepo with umbrella: tiana-code/fincore-engine is the umbrella; sibling repos (tiana-code/decision-engine, etc.) extracted as standalone OSS subprojects when they earn it.
  • Postgres-default storage with optional TigerBeetle adapter (Y1 H2). See ADR-0004.
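
The pluggable-adapter idea above can be sketched as a port with a sandbox implementation. This is a minimal illustration, not the project's actual interface: `BankProvider` is a name taken from the source layout on this page, while `TransferRequest`, `TransferResult`, `SandboxBankProvider` and `PaymentSubmitter` are hypothetical.

```kotlin
import java.math.BigDecimal

// Port: the application layer depends only on this interface.
// Only the name BankProvider comes from this page; the method and types are assumptions.
interface BankProvider {
    fun submitTransfer(request: TransferRequest): TransferResult
}

data class TransferRequest(val idempotencyKey: String, val amount: BigDecimal, val currency: String)

sealed interface TransferResult {
    data class Accepted(val providerReference: String) : TransferResult
    data class Rejected(val reason: String) : TransferResult
}

// Sandbox adapter: deterministic and network-free, suitable for an OSS default.
class SandboxBankProvider : BankProvider {
    override fun submitTransfer(request: TransferRequest): TransferResult =
        if (request.amount.signum() > 0) TransferResult.Accepted("sandbox-${request.idempotencyKey}")
        else TransferResult.Rejected("amount must be positive")
}

// Application code receives the port, so a deployment can swap in a real adapter.
class PaymentSubmitter(private val bank: BankProvider) {
    fun submit(request: TransferRequest): String =
        when (val result = bank.submitTransfer(request)) {
            is TransferResult.Accepted -> "ACCEPTED:${result.providerReference}"
            is TransferResult.Rejected -> "REJECTED:${result.reason}"
        }
}

fun main() {
    val submitter = PaymentSubmitter(SandboxBankProvider())
    println(submitter.submit(TransferRequest("key-1", BigDecimal("100.00"), "EUR")))
}
```

Because `PaymentSubmitter` never names a concrete adapter, replacing the sandbox with a real provider is a wiring change rather than a code change.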

C4 - Level 1: System Context

C4Context
    title FinCore Engine - System Context
    Person(developer, "Developer / Adopter", "Builds a fintech app on FinCore Engine")
    Person(complianceOfficer, "Compliance Officer", "Resolves AML cases, manages rules")
    Person(operator, "Operator", "Monitors, runs reconciliations, manual overrides")
    Person(endUser, "End user", "Customer of the fintech (uses the adopter's app)")

    System(fincore, "FinCore Engine", "Open-source fintech core: ledger, payments, compliance, decision engine")

    System_Ext(adopterApp, "Adopter's Application", "Web/mobile UI built by the adopter; talks to FinCore via API")
    System_Ext(bank, "Banking Partner", "pluggable provider; real adapters outside OSS")
    System_Ext(kyc, "KYC Provider", "pluggable provider; real adapters outside OSS")
    System_Ext(amlData, "AML Data Vendor", "pluggable provider; real adapters outside OSS")
    System_Ext(llm, "LLM Provider", "OpenAI / Anthropic / Ollama (pluggable)")
    System_Ext(idp, "Identity Provider", "Keycloak (bundled OSS) or external OIDC")

    Rel(endUser, adopterApp, "Uses", "HTTPS")
    Rel(adopterApp, fincore, "Reads/writes ledger, payments, compliance", "REST/JSON over HTTPS, OIDC JWT")
    Rel(developer, fincore, "Operates, deploys, configures", "Helm, kubectl, Docker Compose")
    Rel(complianceOfficer, fincore, "Resolves cases, manages rules", "Web UI / REST")
    Rel(operator, fincore, "Monitoring, reconciliation, manual ops", "Web UI / CLI / Grafana")

    Rel(fincore, bank, "Sends/receives payments", "REST/SOAP/SEPA/SWIFT")
    Rel(fincore, kyc, "Verifies identities", "REST")
    Rel(fincore, amlData, "Screens parties", "REST")
    Rel(fincore, llm, "Drafts reports, explains alerts", "REST (OpenAI-compat)")
    Rel(fincore, idp, "Authenticates users/clients", "OIDC")

Trust boundaries:

  • Public internet ↔ Gateway: TLS termination, rate limiting, JWT verification
  • Gateway ↔ Services: mTLS recommended in production (cluster-internal)
  • Services ↔ External providers: HTTPS + per-provider auth (HMAC, API key, mTLS)

C4 - Level 2: Container Diagram

C4Container
    title FinCore Engine - Containers

    Person(client, "Client App / Operator")

    System_Boundary(fincore, "FinCore Engine deployment") {
        Container(gateway, "API Gateway", "Spring Cloud Gateway", "TLS, JWT verify, rate limit, routing")

        Container(ledger, "Ledger Service", "Kotlin / Spring Boot 3.5", "Accounts, transactions, balances, double-entry invariants")
        Container(payments, "Payment Service", "Kotlin / Spring Boot 3.5", "Payment lifecycle, idempotency, state machine")
        Container(compliance, "Compliance Service", "Kotlin / Spring Boot 3.5", "KYC orchestration, AML rules, case management")
        Container(decision, "Decision Engine", "Kotlin / Spring Boot 3.5", "JSON-DSL rules, deterministic evaluation, audit log")
        Container(webhook, "Webhook Service", "Kotlin / Spring Boot 3.5", "Inbound provider webhooks, outbound subscription delivery")

        Container(outboxDisp, "Outbox Dispatcher", "Kotlin / Spring Boot 3.5", "Polls outbox tables, publishes to Kafka, marks dispatched")

        ContainerDb(postgres, "PostgreSQL 17", "RDBMS", "Per-service schemas, materialized views, deferred triggers")
        Container(redpanda, "Redpanda", "Kafka API", "Event bus, retry topics, DLQ")
        ContainerDb(redis, "Redis 7", "Cache", "Idempotency keys cache, JWKS cache, decision rule cache")
        Container(keycloak, "Keycloak 26.6", "Identity Provider", "OIDC, JWT issuance, RBAC")
    }

    System_Ext(bank, "Banking Partner")
    System_Ext(kyc, "KYC Provider")
    System_Ext(llm, "LLM Provider")

    Rel(client, gateway, "REST/JSON", "HTTPS, JWT")
    Rel(gateway, ledger, "REST", "HTTP, propagated JWT")
    Rel(gateway, payments, "REST", "HTTP")
    Rel(gateway, compliance, "REST", "HTTP")
    Rel(gateway, decision, "REST", "HTTP")
    Rel(gateway, webhook, "REST", "HTTP")

    Rel(payments, decision, "Sync evaluate", "REST")
    Rel(compliance, decision, "Sync evaluate", "REST")
    Rel(payments, ledger, "Sync post transaction", "REST")
    Rel(compliance, ledger, "Sync reverse on reject", "REST")

    Rel(ledger, postgres, "JPA/JDBC", "TLS")
    Rel(payments, postgres, "JPA/JDBC", "TLS")
    Rel(compliance, postgres, "JPA/JDBC", "TLS")
    Rel(decision, postgres, "JPA/JDBC", "TLS")
    Rel(webhook, postgres, "JPA/JDBC", "TLS")

    Rel(outboxDisp, postgres, "Polls outbox tables", "TLS")
    Rel(outboxDisp, redpanda, "Publishes events", "Kafka API")

    Rel(compliance, redpanda, "Consumes ledger.events for AML", "Kafka API")
    Rel(webhook, redpanda, "Consumes all topics for outbound delivery", "Kafka API")

    Rel(payments, redis, "Idempotency cache", "RESP")
    Rel(decision, redis, "Rule cache", "RESP")
    Rel(gateway, redis, "Rate limit counters, JWKS", "RESP")

    Rel(gateway, keycloak, "JWKS fetch, token introspection", "HTTPS")

    Rel(payments, bank, "Send/receive via Bank Adapter", "Provider-specific")
    Rel(compliance, kyc, "Verify identity via KYC Adapter", "Provider-specific")
    Rel(compliance, llm, "Draft reports via AmlCopilot adapter", "REST")

Why each container:

| Container | Why it exists |
| --- | --- |
| Gateway | One TLS endpoint, one rate-limiter, one JWT verifier, one place for cross-cutting concerns |
| Ledger Service | Owns financial truth. Most performance-sensitive. Strictest invariants. |
| Payment Service | Coordinates lifecycle. Translates between business concepts (Payment) and ledger primitives (Transaction). |
| Compliance Service | Implements AML/KYC orchestration. Heavy event consumer. Heavy operator UI. |
| Decision Engine | Standalone library + service mode. Pure function. Could be extracted as a separate OSS project. |
| Webhook Service | Two-direction async traffic - separating it isolates retry storms from the main API |
| Outbox Dispatcher | Background lease-based worker. Could be a separate process or a thread within services. We keep it separate for clear ownership. |
| PostgreSQL | Source of truth. Per-service schemas (logical isolation) but one physical instance for v0.1 simplicity. |
| Redpanda | Kafka-API event bus. Single binary, fast startup. Production deployments swap in native Kafka. |
| Redis | Hot path caches. Optional - services degrade gracefully without it. |
| Keycloak | OIDC provider. Bundled in OSS so that docker compose up brings up a complete stack. Production deployments use external OIDC. |

C4 - Level 2.5: Modular Monolith view (alternative deployment)

For local dev and small-scale production, all the application services run in a single JVM - modular monolith mode:

flowchart TB
    Client[Client] --> Gateway[Gateway Module]

    Gateway --> Ledger[Ledger Module]
    Gateway --> Payment[Payment Module]
    Gateway --> Compliance[Compliance Module]
    Gateway --> Decision[Decision Module]
    Gateway --> Webhook[Webhook Module]

    Ledger --> DB[(PostgreSQL<br/>schemas: ledger, payments, compliance, decision, platform)]
    Payment --> DB
    Compliance --> DB
    Decision --> DB
    Webhook --> DB

    OutboxDispatcher[Outbox Dispatcher Module] --> DB
    OutboxDispatcher --> Kafka[Redpanda]

    Compliance --> Kafka
    Webhook --> Kafka

    Payment -.uses.-> Ledger
    Payment -.uses.-> Decision
    Compliance -.uses.-> Decision
    Compliance -.uses.-> Ledger

Rules of the modular monolith (enforced via Gradle + ArchUnit):

  1. Modules talk to each other only through published interfaces (no direct entity imports)
  2. Each module has its own DB schema - no cross-schema joins outside read-only views
  3. Inter-module calls are synchronous in-JVM - but the contracts are designed to also work over HTTP, so extraction is mechanical
  4. No shared mutable state between modules
  5. Events flow through outbox → Kafka, even within the monolith - same transport, no shortcuts
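
Rule 5 can be illustrated with an in-memory sketch of the outbox pattern: the business row and the outbox row are written in one transaction, and a separate dispatcher pass publishes PENDING rows and marks them only after the broker accepts them. Everything here (`InMemorySchema`, `OutboxRow`, `Broker`, `dispatchPending`) is a hypothetical stand-in for real tables and a real Kafka client, and the lease handling of the real dispatcher is left out.

```kotlin
enum class OutboxStatus { PENDING, DISPATCHED }

data class OutboxRow(
    val id: Long,
    val topic: String,
    val payload: String,
    var status: OutboxStatus = OutboxStatus.PENDING,
)

// Stand-in for one module's schema: business rows and outbox rows share a "transaction".
class InMemorySchema {
    val ledgerEntries = mutableListOf<String>()
    val outbox = mutableListOf<OutboxRow>()

    // Runs the block atomically: on any exception both lists are restored.
    fun <T> transaction(block: InMemorySchema.() -> T): T {
        val entriesBefore = ledgerEntries.toList()
        val outboxBefore = outbox.map { it.copy() }
        try {
            return this.block()
        } catch (e: Exception) {
            ledgerEntries.clear(); ledgerEntries.addAll(entriesBefore)
            outbox.clear(); outbox.addAll(outboxBefore)
            throw e
        }
    }
}

fun interface Broker {
    fun publish(topic: String, payload: String)
}

// One dispatcher pass: publish PENDING rows in order, mark each only after a successful publish.
fun dispatchPending(schema: InMemorySchema, broker: Broker): Int {
    var dispatched = 0
    for (row in schema.outbox.filter { it.status == OutboxStatus.PENDING }) {
        try {
            broker.publish(row.topic, row.payload)
            row.status = OutboxStatus.DISPATCHED
            dispatched++
        } catch (e: Exception) {
            break // broker unavailable: remaining rows stay PENDING for the next pass
        }
    }
    return dispatched
}

fun main() {
    val schema = InMemorySchema()
    schema.transaction {
        ledgerEntries.add("tx-1")
        outbox.add(OutboxRow(1, "ledger.events", """{"transactionId":"tx-1"}"""))
    }
    val unavailable = Broker { _, _ -> throw IllegalStateException("broker unavailable") }
    println(dispatchPending(schema, unavailable)) // prints 0, the row stays PENDING

    val delivered = mutableListOf<String>()
    val healthy = Broker { _, payload -> delivered += payload }
    println(dispatchPending(schema, healthy)) // prints 1
}
```

A crash between `publish` and the status update would republish the row on the next pass, which is why consumers are expected to be idempotent.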

C4 - Level 3: Component Diagram (per service)

The per-service component view is identical across services. Each follows hexagonal layering:

C4Component
    title Per-Service Component Layout (Ledger Service shown - others identical)

    Container_Boundary(svc, "Ledger Service") {
        Component(restCtrl, "REST Controllers", "Spring MVC", "Thin: validate input, delegate to use-case")
        Component(useCases, "Use Case Services", "Kotlin interfaces + impls", "Application orchestration, @Transactional boundary")
        Component(domain, "Domain Layer", "Pure Kotlin", "Aggregates, value objects, invariants")
        Component(jpaRepos, "JPA Repositories", "Spring Data JPA", "DB access; never imported by domain")
        Component(mappers, "MapStruct Mappers", "MapStruct + KSP", "Entity ↔ DTO conversions")
        Component(eventPub, "Event Publisher", "Outbox writer", "Writes outbox rows in same DB tx as business")

        Component(secCfg, "Security Config", "Spring Security", "JWT validation, role checks")
        Component(openApiCfg, "OpenAPI Config", "Springdoc", "Auto-generates OpenAPI spec from code")
        Component(obsCfg, "Observability Config", "Micrometer + OTel", "Metrics, traces, structured logs")
        Component(excHandler, "Exception Handler", "@RestControllerAdvice", "RFC 7807 problem details responses")
    }

    Rel(restCtrl, useCases, "delegates")
    Rel(useCases, domain, "uses")
    Rel(useCases, jpaRepos, "reads/writes via")
    Rel(useCases, eventPub, "publishes via")
    Rel(restCtrl, mappers, "Entity → DTO")
    Rel(restCtrl, excHandler, "exceptions handled by")

Source layout in code:

src/main/kotlin/com/fincore/<service>/
├── api/                      # @RestController, @RequestMapping
│   ├── dto/
│   │   ├── request/          # *CreateRequest, *UpdateRequest
│   │   └── response/         # *Response, *Summary
│   └── *Controller.kt
├── application/              # use-case services
│   ├── *Service.kt           # interface
│   └── *ServiceImpl.kt       # @Service, @Transactional
├── domain/                   # pure Kotlin, no Spring imports
│   ├── *.kt                  # aggregates, value objects
│   └── enum/
│       └── *.kt
├── infrastructure/           # adapters
│   ├── persistence/
│   │   ├── *Entity.kt        # @Entity (class, not data class)
│   │   ├── *Repository.kt    # interface : JpaRepository
│   │   └── *Mapper.kt        # @Mapper(componentModel = "spring")
│   ├── messaging/
│   │   ├── *EventPublisher.kt   # writes outbox rows
│   │   └── *EventConsumer.kt    # @KafkaListener (idempotent via processed_events)
│   └── external/
│       └── *Adapter.kt       # impls of plug-in interfaces (BankProvider, KycProvider, etc.)
├── config/
│   ├── SecurityConfig.kt
│   ├── OpenApiConfig.kt
│   ├── ObservabilityConfig.kt
│   └── ApplicationConfig.kt
└── exception/
    ├── DomainException.kt    # base
    ├── *Exception.kt         # specific
    └── GlobalExceptionHandler.kt   # @RestControllerAdvice
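
As an illustration of what "pure Kotlin, no Spring imports" means for the domain/ package, here is a minimal sketch of an aggregate that enforces the double-entry invariant at construction time. The class and field names (`LedgerTransaction`, `Entry`, `EntrySide`) are hypothetical, and the single-currency restriction is a simplification made for this example.

```kotlin
import java.math.BigDecimal

enum class EntrySide { DEBIT, CREDIT }

data class Entry(
    val accountId: String,
    val side: EntrySide,
    val amount: BigDecimal,
    val currency: String,
)

// Aggregate: an instance cannot exist unless debits equal credits.
class LedgerTransaction(val entries: List<Entry>) {
    init {
        require(entries.size >= 2) { "a transaction needs at least two entries" }
        require(entries.all { it.amount.signum() > 0 }) { "entry amounts must be positive" }
        require(entries.map { it.currency }.distinct().size == 1) { "this sketch handles one currency" }
        val debits = total(EntrySide.DEBIT)
        val credits = total(EntrySide.CREDIT)
        // compareTo ignores scale, so 100.0 and 100.00 are treated as equal.
        require(debits.compareTo(credits) == 0) { "debits $debits do not equal credits $credits" }
    }

    private fun total(side: EntrySide): BigDecimal =
        entries.filter { it.side == side }.fold(BigDecimal.ZERO) { acc, e -> acc + e.amount }
}

fun main() {
    val balanced = LedgerTransaction(
        listOf(
            Entry("cash", EntrySide.DEBIT, BigDecimal("100.00"), "EUR"),
            Entry("revenue", EntrySide.CREDIT, BigDecimal("100.00"), "EUR"),
        )
    )
    println(balanced.entries.size)

    val unbalanced = runCatching {
        LedgerTransaction(
            listOf(
                Entry("cash", EntrySide.DEBIT, BigDecimal("100.00"), "EUR"),
                Entry("revenue", EntrySide.CREDIT, BigDecimal("99.00"), "EUR"),
            )
        )
    }
    println(unbalanced.isFailure)
}
```

Nothing in this file depends on a framework, so the invariant can be unit-tested without a Spring context or a database.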

Cross-cutting concerns

Configuration management

  • 12-factor: configuration via environment variables, application.yml only for defaults.
  • Profiles: default, dev, test, prod. Profile-specific overrides in application-<profile>.yml.
  • Sensitive values: ${SECRET_NAME} placeholders, sourced from Vault / AWS Secrets Manager / K8s Secrets.
  • No secrets in env vars in production - use secret-store integration.
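
The "environment over defaults" rule can be sketched as follows. Spring Boot already performs this mapping through relaxed binding (dots become underscores, dashes are removed, the name is upper-cased), so the code only illustrates the naming rule and the precedence. The property name `fincore.ledger.pool-size` is made up for the example.

```kotlin
// Canonical property name -> environment variable name, following Spring Boot's
// documented relaxed-binding rule for environment variables.
fun toEnvVarName(property: String): String =
    property.replace(".", "_").replace("-", "").uppercase()

// Environment wins; the defaults map plays the role of application.yml.
fun resolve(
    property: String,
    defaults: Map<String, String>,
    env: Map<String, String> = System.getenv(),
): String? = env[toEnvVarName(property)] ?: defaults[property]

fun main() {
    val defaults = mapOf("fincore.ledger.pool-size" to "50") // hypothetical property
    println(toEnvVarName("fincore.ledger.pool-size"))
    println(resolve("fincore.ledger.pool-size", defaults, mapOf("FINCORE_LEDGER_POOLSIZE" to "80")))
    println(resolve("fincore.ledger.pool-size", defaults, emptyMap()))
}
```

With the override present the lookup yields 80, and without it the default of 50 applies.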

Tenancy strategy

  • v0.1: single-tenant per deployment. Adopters spin up one deployment per tenant.
  • Why: shipping multi-tenant in financial software invites cross-tenant data leaks. Operators take responsibility for tenancy at the deployment level.
  • A future "shared infrastructure, isolated DB schema" mode is on the v1.x+ roadmap with explicit threat-model review.

Time and clock

  • All timestamps are Instant (UTC) in domain/application layers.
  • DB: TIMESTAMPTZ (Postgres normalizes to UTC internally).
  • API: ISO-8601 with explicit Z suffix (RFC 3339).
  • Tests: Clock injection via Spring (@Bean fun clock() = Clock.systemUTC()); test config overrides with Clock.fixed().
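
A minimal sketch of why the injected Clock matters: code that reads time only through the Clock becomes deterministic under `Clock.fixed()`. `PaymentExpiryPolicy` and its 15-minute TTL are hypothetical and exist only to show the pattern.

```kotlin
import java.time.Clock
import java.time.Duration
import java.time.Instant
import java.time.ZoneOffset

// Hypothetical application-layer class: it never calls Instant.now() directly.
class PaymentExpiryPolicy(private val clock: Clock, private val ttl: Duration) {
    fun expiresAt(): Instant = clock.instant().plus(ttl)

    fun isExpired(createdAt: Instant): Boolean =
        clock.instant().isAfter(createdAt.plus(ttl))
}

fun main() {
    // In production the bean would be Clock.systemUTC(); a test pins the time instead.
    val fixed = Clock.fixed(Instant.parse("2026-01-01T00:00:00Z"), ZoneOffset.UTC)
    val policy = PaymentExpiryPolicy(fixed, Duration.ofMinutes(15))

    println(policy.expiresAt())                                    // 2026-01-01T00:15:00Z
    println(policy.isExpired(Instant.parse("2025-12-31T23:00:00Z"))) // true
}
```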

Internationalization (i18n)

  • API: English-only error messages. Localization is the adopter's concern.
  • Logs: English-only. Operator-facing.
  • Money formatting: never done by the API. Returned as { amount: "100.00", currency: "EUR" }. Adopters format for users.
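
One way to produce the string-amount shape shown above is to scale a BigDecimal to the currency's minor-unit digits and emit it with `toPlainString()`. This is a sketch under assumptions: `MoneyResponse` and `toMoneyResponse` are hypothetical names, and refusing to round (rather than rounding silently) is a choice made for this example.

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode
import java.util.Currency

data class MoneyResponse(val amount: String, val currency: String)

fun toMoneyResponse(amount: BigDecimal, currencyCode: String): MoneyResponse {
    // Throws IllegalArgumentException for codes that are not valid ISO 4217.
    val currency = Currency.getInstance(currencyCode)
    // UNNECESSARY makes setScale throw ArithmeticException instead of losing precision.
    // Note: pseudo-currencies such as XXX report -1 fraction digits and are not handled here.
    val scaled = amount.setScale(currency.defaultFractionDigits, RoundingMode.UNNECESSARY)
    return MoneyResponse(scaled.toPlainString(), currency.currencyCode)
}

fun main() {
    println(toMoneyResponse(BigDecimal("100"), "EUR")) // MoneyResponse(amount=100.00, currency=EUR)
    println(toMoneyResponse(BigDecimal("500"), "JPY")) // MoneyResponse(amount=500, currency=JPY)
}
```

Keeping the amount as a string avoids floating-point parsing on the client, and leaves locale-specific formatting to the adopter.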

Pagination

  • Cursor-based, opaque base64-encoded cursors with HMAC integrity tag.
  • Default limit 50, max 200. The limit parameter is honored within that range.
  • Response: { items, nextCursor, hasMore }.
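
A possible shape for an opaque cursor with an HMAC integrity tag is sketched below. The wire layout (base64url payload, a dot, base64url HMAC-SHA256 tag), the `CursorCodec` name and the choice to clamp out-of-range limits instead of rejecting them are all assumptions for illustration; the page does not specify them.

```kotlin
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

class CursorCodec(secret: ByteArray) {
    private val key = SecretKeySpec(secret, "HmacSHA256")
    private val encoder = Base64.getUrlEncoder().withoutPadding()
    private val decoder = Base64.getUrlDecoder()

    private fun tag(payload: ByteArray): ByteArray =
        Mac.getInstance("HmacSHA256").apply { init(key) }.doFinal(payload)

    // Assumed layout: base64url(position) + "." + base64url(hmac(position))
    fun encode(position: String): String {
        val payload = position.toByteArray(Charsets.UTF_8)
        return encoder.encodeToString(payload) + "." + encoder.encodeToString(tag(payload))
    }

    // Returns the position, or null when the cursor is malformed or has been tampered with.
    fun decode(cursor: String): String? {
        val parts = cursor.split(".")
        if (parts.size != 2) return null
        return try {
            val payload = decoder.decode(parts[0])
            val presented = decoder.decode(parts[1])
            // Constant-time comparison of the expected and presented tags.
            if (MessageDigest.isEqual(tag(payload), presented)) payload.toString(Charsets.UTF_8) else null
        } catch (e: IllegalArgumentException) {
            null // not valid base64url
        }
    }
}

// Default 50, max 200. Clamping is an assumption; an API could also reject such values.
fun clampLimit(requested: Int?): Int = (requested ?: 50).coerceIn(1, 200)

fun main() {
    val codec = CursorCodec("demo-secret-do-not-use".toByteArray())
    val cursor = codec.encode("id:100")
    println(codec.decode(cursor))            // id:100
    println(codec.decode(cursor + "x"))      // null
    println(clampLimit(500))                 // 200
}
```

The tag stops clients from crafting cursors that point at arbitrary positions; it does not hide the payload, so anything confidential should not be placed in it.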

API versioning

  • URL-path versioning: /v1/..., /v2/....
  • Each major version is a separate set of routes; old version supported for at least 12 months after new release.
  • Breaking changes within a version are forbidden (additive only). Minor changes communicated via changelog and OpenAPI diff.

Error responses (RFC 7807)

Every 4xx/5xx response is a Problem Details JSON:

{
  "type": "https://docs.fincore.dev/errors/insufficient-balance",
  "title": "Insufficient balance",
  "status": 422,
  "detail": "Account abc-123 has 50.00 EUR, transfer requires 100.00 EUR",
  "instance": "/v1/payments",
  "correlationId": "01HXYZ..."
}
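
The mapping from a domain exception to this body can be sketched in plain Kotlin. The field names follow the JSON above and `DomainException` is named in the source layout; `InsufficientBalanceException`, `ProblemDetail` as written here and `toProblem` are hypothetical, and in the real service this logic would live in the @RestControllerAdvice handler.

```kotlin
data class ProblemDetail(
    val type: String,
    val title: String,
    val status: Int,
    val detail: String,
    val instance: String,
    val correlationId: String,
)

open class DomainException(message: String) : RuntimeException(message)

// Hypothetical specific exception, matching the example response above.
class InsufficientBalanceException(message: String) : DomainException(message)

fun toProblem(e: DomainException, requestPath: String, correlationId: String): ProblemDetail =
    when (e) {
        is InsufficientBalanceException -> ProblemDetail(
            type = "https://docs.fincore.dev/errors/insufficient-balance",
            title = "Insufficient balance",
            status = 422,
            detail = e.message ?: "",
            instance = requestPath,
            correlationId = correlationId,
        )
        // Fallback for unmapped exceptions: generic body, no internal detail leaked.
        else -> ProblemDetail(
            type = "about:blank",
            title = "Internal Server Error",
            status = 500,
            detail = "Unexpected error",
            instance = requestPath,
            correlationId = correlationId,
        )
    }

fun main() {
    val problem = toProblem(
        InsufficientBalanceException("Account abc-123 has 50.00 EUR, transfer requires 100.00 EUR"),
        requestPath = "/v1/payments",
        correlationId = "01HXYZ",
    )
    println(problem.status)
    println(problem.type)
}
```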

Failure model and resilience

| Concern | Mechanism |
| --- | --- |
| Transient external failures (bank timeout) | Retry topic + exponential backoff + DLQ |
| DB unavailability | Connection pool with timeout; failed health probe → K8s restart |
| Kafka unavailability | Outbox accumulates PENDING, dispatcher retries on recovery, no event loss |
| Optimistic lock failure | Auto-retry up to 3 times, then 503 with Retry-After |
| Consumer crash mid-processing | DB tx rolls back, message redelivered, idempotent handler dedup |
| Outbox dispatcher crash | Lease-based work distribution; another worker picks up, idempotent on Kafka side |
| Webhook subscriber down | Backoff retry: 1m, 5m, 30m, 6h, 24h, 3d, 7d. Marked PERMANENTLY_FAILED after 7. |
| Network partition | Each service degrades to read-only if it can't reach DB |
| Slow LLM provider | Configurable timeout, falls back to "no AI explanation available" |
No retries from API layer - retries are the client's responsibility (or an explicit retry topic). API responses are deterministic per request.
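
Pushing retries to clients and retry topics only works if handlers tolerate redelivery. A minimal in-memory sketch of the "idempotent handler dedup" idea follows: `ProcessedEvents` stands in for the processed_events table mentioned in the source layout, and un-marking on failure imitates the DB transaction rollback. All names are hypothetical.

```kotlin
// In-memory stand-in for the processed_events table.
class ProcessedEvents {
    private val seen = mutableSetOf<String>()

    // Returns true only the first time an event id is recorded.
    fun markIfNew(eventId: String): Boolean = seen.add(eventId)

    fun forget(eventId: String) {
        seen.remove(eventId)
    }
}

class IdempotentHandler(
    private val processed: ProcessedEvents,
    private val handle: (String) -> Unit,
) {
    // Returns true when the payload was handled, false when the event was a duplicate.
    fun onMessage(eventId: String, payload: String): Boolean {
        if (!processed.markIfNew(eventId)) return false
        try {
            handle(payload)
        } catch (e: Exception) {
            // Imitates a rolled-back transaction: the marker disappears with the failed work,
            // so a redelivered message is processed again.
            processed.forget(eventId)
            throw e
        }
        return true
    }
}

fun main() {
    val handled = mutableListOf<String>()
    val handler = IdempotentHandler(ProcessedEvents()) { handled += it }

    println(handler.onMessage("evt-1", "ledger entry posted")) // true
    println(handler.onMessage("evt-1", "ledger entry posted")) // false, duplicate delivery
    println(handled.size)                                      // 1
}
```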


Deployment topology

Development (single host)

docker compose up
├── postgres            (port 5432)
├── redpanda            (port 9092)
├── redis               (port 6379)
├── keycloak            (port 8081)
├── ledger              (port 8080)
├── (other services in v0.2+)
└── grafana / prom / loki / tempo (observability profile, port 3000)

Production (Kubernetes)

fincore-engine namespace
├── ledger Deployment (3 replicas, HPA 3-10)
├── payments Deployment (3 replicas, HPA 3-10)
├── compliance Deployment (2 replicas)
├── decision Deployment (2 replicas)
├── gateway Deployment (3 replicas)
├── webhook Deployment (2 replicas)
├── outbox-dispatcher Deployment (2 replicas, lease-based)

External (or in-cluster):
├── PostgreSQL (managed: RDS / Cloud SQL - recommended HA primary + 2 replicas)
├── Kafka (managed: MSK / Confluent / Strimzi operator)
├── Redis (managed: ElastiCache / Cloud Memorystore)
├── Keycloak (in-cluster or external SaaS)
└── Vault / AWS Secrets Manager (secrets)

Cross-cutting:
├── Ingress (nginx or AWS ALB) with TLS termination
├── Service mesh optional (Istio for mTLS) - recommended in regulated deployments
├── ServiceMonitor (Prometheus Operator scrape config)
├── Pod Security Admission, restricted profile (PodSecurityPolicy only on clusters older than Kubernetes 1.25, where it was removed)
└── NetworkPolicy (default deny + explicit allows)

Helm chart structure

  • Umbrella chart fincore-engine in tiana-code/fincore-engine/deploy/helm/fincore-engine
  • Sub-charts per service in tiana-code/fincore-helm-charts/charts/<service> (Phase 1 of polyrepo extraction)
  • Default values target a small production cluster (3 nodes, 16 GB total)
  • Hardening values in values-prod.yaml (resource limits, security contexts, NetworkPolicy enabled)

Capacity & scale (initial sizing)

Targets for v0.1 single-instance deployment:

| Metric | Target | Bottleneck if exceeded |
| --- | --- | --- |
| Sustained transactions/sec | 1000 | Postgres single-instance, mostly disk |
| Burst transactions/sec | 3500 | Connection pool size (default 50 per service) |
| Active accounts | 1M | Postgres index size (partitioning kicks in at 10M) |
| Active payments concurrent | 100k | Mostly DB row contention |
| Decision Engine evaluations/sec | 5000 | CPU bound, scales horizontally |
| Webhook deliveries/sec outbound | 500 | Worker pool size |
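
As a rough sanity check on the burst row, Little's law relates the pool size to how long each transaction may hold a connection. This is back-of-the-envelope arithmetic that assumes a single service instance, one connection held per transaction, and a pool of N = 50; it is not a measured result.

```latex
% Little's law: average connections in use L = \lambda W,
% with arrival rate \lambda and average connection hold time W.
% The pool caps L at N = 50, so W \le N / \lambda.
\begin{aligned}
W_{\text{burst}}     &\le \frac{N}{\lambda_{\text{burst}}}     = \frac{50}{3500\ \text{s}^{-1}} \approx 14.3\ \text{ms} \\
W_{\text{sustained}} &\le \frac{N}{\lambda_{\text{sustained}}} = \frac{50}{1000\ \text{s}^{-1}} = 50\ \text{ms}
\end{aligned}
```

Under these assumptions, a transaction that holds its connection for more than about 14 ms on average would exhaust the pool during a 3500 tx/sec burst, which is consistent with the table naming the connection pool as the burst bottleneck.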

Beyond these targets:

  • Postgres → managed RDS with read replicas, partitioning enabled
  • Kafka → multi-broker cluster, partition count tuning
  • Service replicas → HPA based on CPU + custom metric (queue depth)
  • TigerBeetle adapter (Y1 H2) for ledger workloads beyond 10k tx/sec sustained

Why this architecture (the architect's perspective)

Three principles drive every decision:

  1. Correctness first, performance second. Money-handling code that's "fast enough" but loses 1 in 10⁶ transactions is unfit for purpose. We pay a measurable latency cost (DB triggers, transaction boundaries, optimistic locking retries) for invariants that hold mathematically.

  2. Boring tech in the hot path. PostgreSQL, Spring Boot, Kotlin, Kafka API. None of these are "innovative" - and that's the point. Innovation is in what we build on top (decision engine DSL, outbox correctness, plug-in architecture), not in the foundation.

  3. Extraction-readiness over premature distribution. A modular monolith with clean boundaries can be extracted into microservices in 1-2 weeks per service. Starting with microservices imposes a 10× operational cost for benefits we don't yet need.

The combined cost: every commit is reviewed for invariant preservation, transaction boundaries, idempotency, audit trail. The combined benefit: a system that can be operated by a small team and trusted by regulators.


Where to read next
