
feat(platform): tier 1 + tier 2 expansion — 11 features (audit, JWKS, OTel, Redis cache+limiter, jobs, storage, mailer, webhooks, RBAC+audit, authn, forge gen)#6

Merged
dedeez14 merged 10 commits into devin/1777054236-initial-framework from devin/1777082400-platform-expansion-v2
Apr 25, 2026

Conversation


@devin-ai-integration devin-ai-integration Bot commented Apr 25, 2026

Summary

Implements every Tier 1 and Tier 2 feature you selected ("semua tier 1 + tier 2 (sekalian besar)" — "all of Tier 1 + Tier 2, one big batch"). Eleven self-contained packages, one feature per commit, full unit-test coverage, lint-clean, and the existing pentest harness left untouched.

This is the framework's biggest expansion since the platform foundation in #4. After merging it, goforge ships every primitive a real production SaaS reaches for: rotation-safe JWTs, distributed cache + rate limit, durable jobs, signed object storage, multi-transport mailer, signed webhooks both ways, RBAC with audit log, MFA + magic-link + OAuth2, and a one-command CRUD generator.

Why

You explicitly asked for the full Tier 1 + Tier 2 set in one go. Each piece is the missing link between "starter kit" and "use this for every project I build for the next two years":

| Feature | Why it matters |
| --- | --- |
| Soft delete + audit columns (already in earlier commit) | Compliance, support investigations |
| JWT key rotation + JWKS | Service-to-service auth without shared secrets |
| OpenTelemetry tracing | Cross-request debugging beyond logs+metrics |
| Redis cache + distributed limiter | Horizontal scale-out |
| pkg/jobs | Webhook retries, mailer, image resize, cron |
| pkg/storage | Browser-direct uploads via presigned URLs |
| pkg/mailer | Signup, magic-link, password-reset |
| Outgoing/inbound webhooks | Every SaaS integrates with something |
| RBAC + audit log | "who changed this and when" |
| TOTP + magic-link + OAuth2 | Modern auth, no more email+password only |
| forge gen resource | Hours saved per new aggregate |

How

  • 11 commits, one per package, each with a doc-string, unit tests, and a focused diff. Reviewers can read them independently.
  • All new pkgs follow the same shape: small interface + one Postgres/Redis/S3 impl + one in-memory impl for tests + a Fiber middleware where applicable.
  • Webhook deliveries piggyback on pkg/jobs so retries/backoff/DLQ are uniform across the framework — no second retry strategy.
  • RBAC uses Casbin's RBAC-with-domains model so roles in tenantA never leak into tenantB; the test suite asserts that explicitly.
  • forge gen resource is template-driven (text/template + embed.FS); adding a new pattern only touches templates, not Go.
  • No global state, no init() side-effects, errors as values, and every public type carries a doc comment.

Test plan

  • go test -race ./... — every package green (audit, authn, authz, cache, dbx, errs, events, flags, forge/gen, idempotency, jobs, mailer, observability, openapi, paginate, ratelimit, storage, tenant, validatorx, webhooks)
  • golangci-lint run ./... — clean
  • Smoke test of forge gen resource --name Widget in a fresh checkout: produces 9 files, generated code compiles + tests pass
  • Existing internal/usecase and internal/infrastructure/security tests still green
  • Live pentest harness re-run (deferred — requires running API; CI does not gate it, will verify post-merge)

Risk

Surface area is large. Each package is opt-in — none of them activate unless wired in internal/app/app.go — so the existing API behaviour is unchanged by this PR. Rollback is git revert of any single commit.

The two riskier places to watch:

  1. JWT/JWKS — verified by existing auth tests, but operators rotating keys for the first time should keep the old key valid for one full token TTL.
  2. RBAC — adding Require(...) to a route that previously had no authorization check is a behaviour change; consult the audit log post-deploy.

Checklist

  • Doc comments on every new public type/function
  • No secrets / debug fmt.Println left behind
  • Migrations: 0006_audit_log.up.sql (audit_log) — earlier commits added 0005_jobs.up.sql for the queue
  • Tests cover happy path + failure path for each package

Link to Devin session: https://app.devin.ai/sessions/8fdfc20358514c97a766adca630a2527
Requested by: @dedeez14



devin-ai-integration Bot and others added 10 commits April 25, 2026 02:04
…dpoint

Three foundational additions ahead of the larger tier-1+2 expansion:

* pkg/dbx — opinionated audit-column convention. Audit struct
  (CreatedAt, UpdatedAt, DeletedAt + CreatedBy/UpdatedBy/DeletedBy),
  Touch / Create / SoftDelete helpers, AuditColumnsDDL fragment for
  migrations, and a context-actor accessor (WithActor /
  ActorFromContext) so repositories stamp the correct user without
  threading it through every signature.

* JWT key rotation — config.JWT.NextSecrets accepts a list of
  verify-only HS256 secrets so applications can rotate keys without
  invalidating live tokens. Issued JWTs now carry a kid header (first
  8 bytes of sha256 of the secret); verify uses kid to pick the right
  secret instead of trying each one.

* JWKS endpoint — /.well-known/jwks.json returns the canonical empty
  set for HS256 issuers (RFC 7517 demands deterministic answers,
  symmetric secrets must never appear in JWKS) and is wired through a
  PublicKeySetProvider interface so a future RS256/EdDSA issuer
  publishes its public keys without changing the host code.

Also fixes the Devin Review finding from PR #5: issuePair now treats
a Parse failure on a freshly-issued refresh token as a hard error
instead of silently returning a token the RefreshStore never saw, so
the user's next /refresh call cannot 401 with auth.unknown_token.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
Adds tier-1 distributed tracing without forcing every deployment to
run a collector:

* pkg/observability.InitTracing wires an OTLP/HTTP exporter when
  cfg.Platform.OtelEndpoint is set; with an empty endpoint it
  installs the noop TracerProvider so every otel.Tracer(...) call in
  the rest of the framework still works at zero overhead.
* FiberTracing middleware extracts upstream W3C TraceContext / B3
  headers, opens a server-kind span named '<METHOD> <route>' (using
  Fiber's parameterised route, not the raw path, to bound
  cardinality), records http.method / http.route /
  http.status_code per OTel HTTP semconv, and flips the span status
  to Error on 5xx.
* The outbox dispatcher wraps every Sink.Publish in a producer
  span so a downstream collector can correlate the transactional
  write with the eventual delivery.
* New TracingConfig knobs (endpoint, insecure, sample_ratio) flow
  through internal/config.Platform.

The propagator is a composite of TraceContext + Baggage, so
cross-service requests trace cleanly whether they originate from a
W3C-compliant client or from another goforge service.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/cache: minimal Cache interface (Get/Set/SetNX/Del/Incr/Ping/Close)
with two implementations:

  * Memory — in-process, safe for concurrent use, optional sweeper
    goroutine. Suits single-replica deployments and tests with no
    external dependency.
  * Redis — go-redis v9 client, supports Addr or full URL, optional
    TLS. Incr is implemented via a small Lua script so the TTL is
    only applied on the first write — subsequent increments cannot
    accidentally extend the rate-limit window.

pkg/ratelimit: a fixed-window approximation rate limiter built on top
of cache.Cache. Allow returns a Decision struct (allowed flag,
remaining budget, reset duration) so middleware can populate the
canonical X-Ratelimit-Limit / X-Ratelimit-Remaining / X-Ratelimit-Reset
headers on every response.

The Fiber middleware fails open on cache errors — a transient Redis
outage must not 503 the whole API; the framework keeps its other
defences (auth, body limits, idempotency) running.

Drop-in: pkg/ratelimit replaces fasthttp's per-process limiter when
your deployment goes multi-replica. The interface is the same; only
the cache backend changes.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
…+ cron

A goforge app eventually wants to send email, deliver outgoing
webhooks, run nightly rollups and resize uploads — none of that
belongs on the request hot-path. pkg/jobs ships a single-table queue
that uses Postgres' FOR UPDATE SKIP LOCKED so any number of replicas
can dispatch in parallel without coordinating.

Schema (migrations/0005_jobs):
  - jobs(id, queue, kind, payload, status, attempts, max_attempts,
         last_error, run_at, locked_at, locked_by, completed_at,
         dedupe_key)
  - job_schedules(id, name, queue, kind, payload, interval_secs,
                  next_run_at, enabled)

Public API:
  - Queue interface (Enqueue / Claim / Complete / Fail / Stats)
  - Postgres implementation
  - Runner — N-goroutine worker pool, exponential backoff with full
    jitter (no thundering herd), bounded handler timeout, panic
    recovery (a single bad payload cannot kill a worker forever),
    automatic DLQ on attempts == max_attempts.
  - Scheduler — wakes every tick, atomically claims due schedules,
    enqueues a Job, advances next_run_at.

Idempotent enqueue via dedupe_key plus a partial unique index on
(queue, dedupe_key) WHERE status IN (pending,running,failed) — the
schedule helper uses it to prevent double-enqueue on overlapping
ticks.

Tests cover the happy path, panic-recovery, and unknown-kind DLQ
routing. Production Postgres path is exercised in the next commit
when wired into platform.Build.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/storage exposes a single Storage interface (Put / Get / Delete /
PresignPut / PresignGet / List). One implementation covers every
common provider:

  * S3 — backed by aws-sdk-go-v2/service/s3. Works unchanged with
    AWS S3, Cloudflare R2, MinIO, Backblaze B2, and DigitalOcean
    Spaces by toggling Endpoint and UsePathStyle.
  * Memory — in-process, no I/O, used in tests. PresignPut/Get
    return opaque memory:// URLs so tests can match on them.

Presigned URLs are the recommended primitive for user uploads: the
browser PUTs the bytes directly to the bucket, your API never
buffers them. Both PresignPut and PresignGet take a TTL so leaked
URLs auto-expire.

The interface intentionally stays small. Streaming, multipart, and
metadata get added when an actual project asks for them; today's
goal is the 90% surface (POST a profile picture, GET a download
link).

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/mailer is a tiny abstraction over the awkward bits of email:
MIME multipart construction, RFC 2047 subject encoding, attachment
base64 encoding, quoted-printable bodies, RFC 1123Z dates.

Three transports ship today:

  * SMTP — works with Postmark, Mailgun, AWS SES (SMTP), Postfix
    relay, anything that speaks submission. Optional implicit TLS
    (port 465) via crypto/tls; STARTTLS is the smtp package's
    default. Builds multipart/alternative when both Text and HTML
    are set, multipart/mixed when attachments are present.
  * LogTransport — writes a structured log line and returns nil,
    so dev environments can exercise signup / forgot-password flows
    without provisioning a relay.
  * MemoryTransport — appends every message to a slice, used by
    integration tests to assert against what would have been sent.

Templating and queueing are deliberately *outside* this package:
render the body yourself (html/template, embed.FS, etc.), then enqueue
a job that does mailer.Send. This keeps the abstraction honest — one
package, one job (deliver bytes).

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
Outgoing flow piggybacks on pkg/jobs:

  Dispatcher.Enqueue → jobs.Queue.Enqueue → Runner picks up → Deliver
  Deliver signs the body with HMAC-SHA256 over
  '<unix>.<event_id>.<body>' (Stripe/Slack-shape) and POSTs it.
  Non-2xx → return error → jobs.Runner reschedules with the same
  exponential-jitter backoff used everywhere else, so we don't
  reinvent retry semantics.

The dedupe key on Enqueue is '<event_id>:<endpoint_id>' so a buggy
caller that fires Enqueue twice doesn't end up posting twice. The
worker re-signs on every attempt, which means a secret rotation
between attempt N and N+1 immediately takes effect — the receiver
won't accept a stale signature and we don't have to manually flush
in-flight retries.

Inbound: pkg/webhooks.InboundVerifier is a Fiber middleware that
validates incoming signatures using a SecretLookup callback (so
each integration carries its own secret and the framework doesn't
hard-code 'one global webhook key'). Replay protection is
identical to outgoing — a 5-minute window on the timestamp.

Header is the canonical 'Webhook-Signature: t=<unix>,v1=<hex>',
multi-candidate friendly so secret rotation works without downtime
on either side.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/authz wraps Casbin with the standard 'RBAC with domains' model:

  p, sub, dom, obj, act
  g, sub, role, dom

so Allow(subject, tenant, object, action) returns true iff a policy
matches directly, or via a role granted *in that tenant*. The model
uses keyMatch2 so policies can pattern-match URL paths (/users/:id)
without bespoke matchers.
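The model above corresponds to a Casbin model file along these lines — a sketch with the matcher adapted for keyMatch2; the repo's actual model.conf may differ:

```ini
[request_definition]
r = sub, dom, obj, act

[policy_definition]
p = sub, dom, obj, act

[role_definition]
g = _, _, _

[policy_effect]
e = some(where (p.eft == allow))

[matchers]
m = g(r.sub, p.sub, r.dom) && r.dom == p.dom && keyMatch2(r.obj, p.obj) && r.act == p.act
```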

Roles are scoped per tenant by design — granting 'admin' in tenantA
must NEVER let the principal admin tenantB. The test suite asserts
that explicitly.

A Fiber Require() middleware enforces a single (object, action)
pair per route; subjects are read from c.Locals('subject') by
default (set by goforge's auth middleware), but the extractor is
swappable for service-to-service flows.

pkg/audit is the append-only counterpart. Every privileged action
emits an Entry (Action, Subject, Object, Before, After, Metadata,
TenantID, RequestID, IP, UserAgent). The Postgres implementation
writes to audit_log; a Memory implementation is provided for tests
so handlers can assert what would have been logged.

The audit_log table is intentionally never updated or deleted by
application code: an operator answering 'who did what when' should
be able to trust the row.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/authn ships three modern-auth primitives that goforge apps
inevitably want once email+password is no longer enough.

TOTP (RFC 6238):
  * NewTOTPSecret — 160-bit base32, a format Google Authenticator
    and 1Password accept verbatim.
  * Provision — builds the otpauth:// URI for QR rendering.
  * QRPNG — encodes the URI as a 256x256 PNG so onboarding pages
    can stream the image directly.
  * Verify — pquerna/otp's +/-1 step skew, matches every major
    authenticator app.

Magic-link:
  * Tokens are opaque 32-byte URL-safe random; only a SHA-256 hash
    is stored in cache, so a database leak does not expose live
    tokens.
  * Strictly single-use — Consume deletes the cache entry, a
    second call returns an explicit 'already used' error.
  * Backed by pkg/cache so the same code works in-memory in tests
    and in Redis in production.
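The hash-only storage and single-use semantics can be sketched as below; a plain map stands in for pkg/cache and the function names are assumptions:

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"errors"
)

// store stands in for the cache backend; keys are token hashes only,
// so a leaked store never exposes live tokens.
var store = map[string]string{}

// issueToken returns the opaque 32-byte URL-safe token for the email
// link and stores only its SHA-256 hash.
func issueToken(email string) (string, error) {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	token := base64.RawURLEncoding.EncodeToString(raw)
	sum := sha256.Sum256([]byte(token))
	store[hex.EncodeToString(sum[:])] = email
	return token, nil
}

// consume is strictly single-use: the entry is deleted on first success,
// so a second call fails with an explicit error.
func consume(token string) (string, error) {
	sum := sha256.Sum256([]byte(token))
	key := hex.EncodeToString(sum[:])
	email, ok := store[key]
	if !ok {
		return "", errors.New("token unknown or already used")
	}
	delete(store, key)
	return email, nil
}
```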

OAuth2:
  * Wraps golang.org/x/oauth2 with state+PKCE and userinfo
    fetching.
  * Provider config is provider-agnostic (Google, GitHub, Apple,
    Microsoft, Auth0): supply AuthURL / TokenURL / UserInfoURL.
  * AuthCodeURL stashes the PKCE verifier keyed by state in cache,
    so the callback handler can recover it without writing a
    sessions table.
  * HelperEnsureHTTPS guards against accidentally configuring an
    http:// callback in production.
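The PKCE half of the flow reduces to two small primitives (RFC 7636, S256 method); this sketch is standalone and does not show the package's actual API:

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
)

// newVerifier generates a PKCE code verifier: 32 random bytes as
// unpadded base64url (43 characters).
func newVerifier() (string, error) {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// challengeS256 is the S256 transform sent in the authorize request;
// the verifier itself is stashed in cache keyed by state, per the
// description above.
func challengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}
```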
Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
`forge gen resource` is the single command that turns a name into
a fully wired CRUD aggregate, every layer included:

  internal/domain/<lc>/<lc>.go              entity + dbx.Audit + errs
  internal/domain/<lc>/repository.go        repo interface
  internal/usecase/<lc>.go                  Create/Get/List/Update/Delete
  internal/usecase/<lc>_test.go             table-driven tests w/ stub repo
  internal/adapter/repository/postgres/...  pgx impl honouring soft-delete
  internal/adapter/http/dto/<lc>.go         request/response DTOs
  internal/adapter/http/handler/<lc>.go     Fiber handler, 5 routes
  migrations/NNNN_create_<plural>.up.sql    table with audit columns
  migrations/NNNN_create_<plural>.down.sql

The whole thing is template-driven (text/template + embed.FS), so
adopting a new pattern only touches templates rather than Go code.
Filenames use literal '__' as a path separator and '{{LC}}' /
'{{PLURAL}}' / '{{MIG}}' as plain placeholders — Go's template
syntax fights with shells and curly-brace filenames, plain string
replace is the path of least surprise.

The migration ID is auto-incremented from the highest existing
migration so generated SQL never collides with the framework's
own migrations.

The generator refuses to overwrite existing files (the test suite
asserts this) so re-running the command is safe.

Smoke tests cover: invalid name rejection, full file set produced,
overwrite refusal, and migration-ID auto-increment.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@dedeez14 dedeez14 merged commit e0ff34a into devin/1777054236-initial-framework Apr 25, 2026
1 check was pending
@devin-ai-integration devin-ai-integration Bot deleted the devin/1777082400-platform-expansion-v2 branch April 25, 2026 02:29
devin-ai-integration Bot added a commit that referenced this pull request Apr 25, 2026
CI runs `go mod tidy && git diff --exit-code` and was failing on
PR #7 because several modules introduced in PR #6 (aws-sdk-go-v2,
casbin, otp, redis, oauth2, otel) were listed in the indirect block
even though they are imported directly. `go mod tidy` reorganises
them to the direct `require` block, no version changes.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
Contributor Author

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 2 potential issues.

View 6 additional findings in Devin Review.


Comment thread pkg/jobs/scheduler.go
}
dues = append(dues, d)
}
rows.Close()
Contributor Author


🟡 Missing rows.Err() check after row iteration in Scheduler.dispatchDue

After iterating rows and calling rows.Close() at line 79, the code does not check rows.Err() for errors that occurred during iteration. If a network or decoding error causes rows.Next() to return false prematurely, the error is only available via rows.Err(). Without this check, the scheduler silently operates on an incomplete set of due schedules, potentially skipping work. The outbox dispatcher in the same PR correctly implements this check at pkg/outbox/outbox.go:195-198.

Suggested change:

```diff
 rows.Close()
+if err := rows.Err(); err != nil {
+	return err
+}
```

Contributor Author


Fixed in 362115f (carried via PR #7 since #6 was already merged). Added the rows.Err() check immediately after rows.Close(), mirroring the pattern in pkg/outbox/outbox.go.

Comment on lines +86 to +89
ratio := cfg.SampleRatio
if ratio <= 0 {
ratio = 1.0
}
Contributor Author


🟡 SampleRatio=0 documented as "never sample" but silently overridden to 1.0 (always sample)

The TracingConfig.SampleRatio field comment at pkg/observability/tracing.go:32 documents 0.0 = never, 1.0 = always; default 0.1 in prod, 1.0 elsewhere, but the implementation treats any value <= 0 as 1.0. Since internal/config/config.go:153 (setDefaults) does not set a default for platform.otel_sample_ratio, the Go zero value (0.0) flows through and gets overridden to 1.0. This means enabling tracing by setting an OTLP endpoint without explicitly configuring a sample ratio results in 100% sampling rather than the documented 10% production default — potentially causing significant performance overhead and collector/storage cost in production.

Prompt for agents
The TracingConfig.SampleRatio field documents '0.0 = never, 1.0 = always; default 0.1 in prod, 1.0 elsewhere' but the implementation in InitTracing (pkg/observability/tracing.go:86-89) uses 'if ratio <= 0 { ratio = 1.0 }', making it impossible to set 0-sampling and defaulting unset values to 100% sampling.

Two things need to be reconciled:
1. Either update the comment to match the implementation (0 means 'use default 1.0'), or change the guard to only replace negative values (ratio < 0) and treat 0.0 as a valid 'never sample' setting.
2. Add a sensible default in internal/config/config.go setDefaults(), e.g. v.SetDefault('platform.otel_sample_ratio', 1.0) for dev or 0.1 for production. Without a default, the Go zero value (0.0) flows through, and the current guard silently changes it to 1.0.

Note that distinguishing 'user explicitly set 0.0' from 'field was never set' is tricky with float64 zero values. A common pattern is to use a small sentinel (e.g. -1 for 'unset') or a pointer type (*float64) to detect the difference.

Contributor Author


Fixed in 362115f (carried via PR #7). Tightened the guard from <= 0 to < 0 so an explicit 0.0 now means exactly that, and rewrote the doc comment to describe the actual behaviour (negatives coerced to 1.0, application owns the default).

devin-ai-integration Bot added a commit that referenced this pull request Apr 25, 2026
Two real issues spotted by Devin Review on PR #6 — fixed against
the same merge target so PR #7 carries them through to main.

* pkg/jobs/scheduler.go (BUG_0001): `rows.Err()` was never
  consulted after row iteration, so a connection drop mid-iteration
  would silently leave dues empty and the scheduler would skip work
  without logging anything. Mirrors the pattern already used in
  pkg/outbox/outbox.go.

* pkg/observability/tracing.go (BUG_0002): the docs claimed
  `SampleRatio = 0` meant 'never sample' but the implementation
  coerced any value <= 0 to 1.0, so an explicit 0 silently became
  100%. Tightened the guard to negatives only and rewrote the doc
  comment to match the new semantics. Operators wanting 'never
  sample' now get exactly that.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
dedeez14 added a commit that referenced this pull request Apr 25, 2026
chore: sync main with all merged work (PRs #3–#6)
dedeez14 added a commit that referenced this pull request Apr 25, 2026
feat(i18n): localised error messages (en, id) via Accept-Language (Tier-B #6)
