feat(platform): tier 1 + tier 2 expansion — 11 features (audit, JWKS, OTel, Redis cache+limiter, jobs, storage, mailer, webhooks, RBAC+audit, authn, forge gen) #6
Conversation
…dpoint

Three foundational additions ahead of the larger tier-1+2 expansion:

* pkg/dbx — an opinionated audit-column convention. Audit struct (CreatedAt, UpdatedAt, DeletedAt + CreatedBy/UpdatedBy/DeletedBy), Touch / Create / SoftDelete helpers, an AuditColumnsDDL fragment for migrations, and a context-actor accessor (WithActor / ActorFromContext) so repositories stamp the correct user without threading it through every signature.
* JWT key rotation — config.JWT.NextSecrets accepts a list of verify-only HS256 secrets so applications can rotate keys without invalidating live tokens. Issued JWTs now carry a kid header (the first 8 bytes of the SHA-256 of the secret); verification uses kid to pick the right secret instead of trying each one.
* JWKS endpoint — /.well-known/jwks.json returns the canonical empty set for HS256 issuers (RFC 7517 demands deterministic answers, and symmetric secrets must never appear in a JWKS). It is wired through a PublicKeySetProvider interface so a future RS256/EdDSA issuer can publish its public keys without changing the host code.

Also fixes the Devin Review finding from PR #5: issuePair now treats a Parse failure on a freshly issued refresh token as a hard error instead of silently returning a token the RefreshStore never saw, so the user's next /refresh call cannot 401 with auth.unknown_token.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
Adds tier-1 distributed tracing without forcing every deployment to run a collector:

* pkg/observability.InitTracing wires an OTLP/HTTP exporter when cfg.Platform.OtelEndpoint is set; with an empty endpoint it installs the noop TracerProvider, so every otel.Tracer(...) call in the rest of the framework still works at zero overhead.
* The FiberTracing middleware extracts upstream W3C TraceContext / B3 headers, opens a server-kind span named '<METHOD> <route>' (using Fiber's parameterised route, not the raw path, to bound cardinality), records http.method / http.route / http.status_code per the OTel HTTP semantic conventions, and flips the span status to Error on 5xx.
* The outbox dispatcher wraps every Sink.Publish in a producer span so a downstream collector can correlate the transactional write with the eventual delivery.
* New TracingConfig knobs (endpoint, insecure, sample_ratio) flow through internal/config.Platform.

The propagator is a composite of TraceContext + Baggage, so cross-service requests trace cleanly whether they originate from a W3C-compliant client or from another goforge service.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/cache: minimal Cache interface (Get/Set/SetNX/Del/Incr/Ping/Close)
with two implementations:
* Memory — in-process, safe for concurrent use, optional sweeper
goroutine. Suits single-replica deployments and tests with no
external dependency.
* Redis — go-redis v9 client, supports Addr or full URL, optional
TLS. Incr is implemented via a small Lua script so the TTL is
only applied on the first write — subsequent increments cannot
accidentally extend the rate-limit window.
pkg/ratelimit: a fixed-window approximation rate limiter built on top
of cache.Cache. Allow returns a Decision struct (allowed flag,
remaining budget, reset duration) so middleware can populate the
canonical X-RateLimit-Limit / X-RateLimit-Remaining / X-RateLimit-Reset
headers on every response.
The Fiber middleware fails open on cache errors — a transient Redis
outage must not 503 the whole API; the framework keeps its other
defences (auth, body limits, idempotency) running.
Drop-in: pkg/ratelimit replaces fasthttp's per-process limiter when
your deployment goes multi-replica. The interface is the same; only
the cache backend changes.
Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
…+ cron
A goforge app eventually wants to send email, deliver outgoing
webhooks, run nightly rollups and resize uploads — none of that
belongs on the request hot-path. pkg/jobs ships a single-table queue
that uses Postgres' FOR UPDATE SKIP LOCKED so any number of replicas
can dispatch in parallel without coordinating.
Schema (migrations/0005_jobs):
- jobs(id, queue, kind, payload, status, attempts, max_attempts,
last_error, run_at, locked_at, locked_by, completed_at,
dedupe_key)
- job_schedules(id, name, queue, kind, payload, interval_secs,
next_run_at, enabled)
Public API:
- Queue interface (Enqueue / Claim / Complete / Fail / Stats)
- Postgres implementation
- Runner — N-goroutine worker pool, exponential backoff with full
jitter (no thundering herd), bounded handler timeout, panic
recovery (a single bad payload cannot kill a worker forever),
automatic DLQ on attempts == max_attempts.
- Scheduler — wakes every tick, atomically claims due schedules,
enqueues a Job, advances next_run_at.
Idempotent enqueue via dedupe_key plus a partial unique index on
(queue, dedupe_key) WHERE status IN (pending,running,failed) — the
schedule helper uses it to prevent double-enqueue on overlapping
ticks.
Tests cover the happy path, panic-recovery, and unknown-kind DLQ
routing. Production Postgres path is exercised in the next commit
when wired into platform.Build.
Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/storage exposes a single Storage interface (Put / Get / Delete /
PresignPut / PresignGet / List). One implementation covers every
common provider:
* S3 — backed by aws-sdk-go-v2/service/s3. Works unchanged with
AWS S3, Cloudflare R2, MinIO, Backblaze B2, and DigitalOcean
Spaces by toggling Endpoint and UsePathStyle.
* Memory — in-process, no I/O, used in tests. PresignPut/Get
return opaque memory:// URLs so tests can match on them.
Presigned URLs are the recommended primitive for user uploads: the
browser PUTs the bytes directly to the bucket, your API never
buffers them. Both PresignPut and PresignGet take a TTL so leaked
URLs auto-expire.
The interface intentionally stays small. Streaming, multipart, and
metadata get added when an actual project asks for them; today's
goal is the 90% surface (POST a profile picture, GET a download
link).
Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/mailer is a tiny abstraction over the awkward bits of email:
MIME multipart construction, RFC 2047 subject encoding, attachment
base64 encoding, quoted-printable bodies, RFC 1123Z dates.
Three transports ship today:
* SMTP — works with Postmark, Mailgun, AWS SES (SMTP), Postfix
relay, anything that speaks submission. Optional implicit TLS
(port 465) via crypto/tls; STARTTLS is the smtp package's
default. Builds multipart/alternative when both Text and HTML
are set, multipart/mixed when attachments are present.
* LogTransport — writes a structured log line and returns nil,
so dev environments can exercise signup / forgot-password flows
without provisioning a relay.
* MemoryTransport — appends every message to a slice, used by
integration tests to assert against what would have been sent.
Templating and queueing are deliberately *outside* this package:
render the body yourself (html/template, embed.FS, etc.), then enqueue
a job that does mailer.Send. This keeps the abstraction honest — one
package, one job (deliver bytes).
Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
Outgoing flow piggybacks on pkg/jobs: Dispatcher.Enqueue → jobs.Queue.Enqueue → Runner picks up → Deliver.

Deliver signs the body with HMAC-SHA256 over '<unix>.<event_id>.<body>' (the Stripe/Slack shape) and POSTs it. A non-2xx response returns an error, and jobs.Runner reschedules with the same exponential-jitter backoff used everywhere else, so we don't reinvent retry semantics. The dedupe key on Enqueue is '<event_id>:<endpoint_id>', so a buggy caller that fires Enqueue twice doesn't end up posting twice.

The worker re-signs on every attempt, which means a secret rotation between attempt N and N+1 takes effect immediately — the receiver won't accept a stale signature and we don't have to manually flush in-flight retries.

Inbound: pkg/webhooks.InboundVerifier is a Fiber middleware that validates incoming signatures using a SecretLookup callback (so each integration carries its own secret and the framework doesn't hard-code 'one global webhook key'). Replay protection is identical to outgoing — a 5-minute window on the timestamp. The header is the canonical 'Webhook-Signature: t=<unix>,v1=<hex>', multi-candidate friendly so secret rotation works without downtime on either side.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/authz wraps Casbin with the standard 'RBAC with domains' model:
p, sub, dom, obj, act
g, sub, role, dom
so Allow(subject, tenant, object, action) returns true iff a policy
matches directly, or via a role granted *in that tenant*. The model
uses keyMatch2 so policies can pattern-match URL paths (/users/:id)
without bespoke matchers.
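For readers unfamiliar with Casbin model files, the policy and grouping lines above likely correspond to a model along these lines — a sketch based on Casbin's standard RBAC-with-domains template plus the keyMatch2 matcher mentioned above, not the repository's actual model file:

```ini
[request_definition]
r = sub, dom, obj, act

[policy_definition]
p = sub, dom, obj, act

[role_definition]
g = _, _, _

[policy_effect]
e = some(where (p.eft == allow))

[matchers]
m = g(r.sub, p.sub, r.dom) && r.dom == p.dom && keyMatch2(r.obj, p.obj) && r.act == p.act
```

The third argument to g is the domain, which is what confines a role grant to a single tenant: g(alice, admin, tenantA) never satisfies a request for tenantB.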
Roles are scoped per tenant by design — granting 'admin' in tenantA
must NEVER let the principal admin tenantB. The test suite asserts
that explicitly.
A Fiber Require() middleware enforces a single (object, action)
pair per route; subjects are read from c.Locals('subject') by
default (set by goforge's auth middleware), but the extractor is
swappable for service-to-service flows.
pkg/audit is the append-only counterpart. Every privileged action
emits an Entry (Action, Subject, Object, Before, After, Metadata,
TenantID, RequestID, IP, UserAgent). The Postgres implementation
writes to audit_log; a Memory implementation is provided for tests
so handlers can assert what would have been logged.
The audit_log table is intentionally never updated or deleted by
application code: an operator answering 'who did what when' should
be able to trust the row.
Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
pkg/authn ships three modern-auth primitives that goforge apps
inevitably want once email+password is no longer enough.
TOTP (RFC 6238):
* NewTOTPSecret — 160-bit base32, format Google Authenticator
and 1Password accept verbatim.
* Provision — builds the otpauth:// URI for QR rendering.
* QRPNG — encodes the URI as a 256x256 PNG so onboarding pages
can stream the image directly.
* Verify — pquerna/otp's +/-1 step skew, matches every major
authenticator app.
Magic-link:
* Tokens are opaque 32-byte URL-safe random; only a SHA-256 hash
is stored in cache, so a database leak does not expose live
tokens.
* Strictly single-use — Consume deletes the cache entry, a
second call returns an explicit 'already used' error.
* Backed by pkg/cache so the same code works in-memory in tests
and in Redis in production.
OAuth2:
* Wraps golang.org/x/oauth2 with state+PKCE and userinfo
fetching.
* Provider config is provider-agnostic (Google, GitHub, Apple,
Microsoft, Auth0): supply AuthURL / TokenURL / UserInfoURL.
* AuthCodeURL stashes the PKCE verifier keyed by state in cache,
so the callback handler can recover it without writing a
sessions table.
* A helper, EnsureHTTPS, guards against accidentally configuring an
http:// callback in production.
Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
`forge gen resource` is the single command that turns a name into
a fully wired CRUD aggregate, every layer included:
internal/domain/<lc>/<lc>.go entity + dbx.Audit + errs
internal/domain/<lc>/repository.go repo interface
internal/usecase/<lc>.go Create/Get/List/Update/Delete
internal/usecase/<lc>_test.go table-driven tests w/ stub repo
internal/adapter/repository/postgres/... pgx impl honouring soft-delete
internal/adapter/http/dto/<lc>.go request/response DTOs
internal/adapter/http/handler/<lc>.go Fiber handler, 5 routes
migrations/NNNN_create_<plural>.up.sql table with audit columns
migrations/NNNN_create_<plural>.down.sql
The whole thing is template-driven (text/template + embed.FS), so
adopting a new pattern only touches templates rather than Go code.
Filenames use literal '__' as a path separator and '{{LC}}' /
'{{PLURAL}}' / '{{MIG}}' as plain placeholders — Go's template
syntax fights with shells and curly-brace filenames, plain string
replace is the path of least surprise.
The migration ID is auto-incremented from the highest existing
migration so generated SQL never collides with the framework's
own migrations.
The generator refuses to overwrite existing files (the test suite
asserts this) so re-running the command is safe.
Smoke tests cover: invalid name rejection, full file set produced,
overwrite refusal, and migration-ID auto-increment.
Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
Merged commit e0ff34a into devin/1777054236-initial-framework
CI runs `go mod tidy && git diff --exit-code` and was failing on PR #7 because several modules introduced in PR #6 (aws-sdk-go-v2, casbin, otp, redis, oauth2, otel) were listed in the indirect block even though they are imported directly. `go mod tidy` reorganises them to the direct `require` block, no version changes. Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
```go
		}
		dues = append(dues, d)
	}
	rows.Close()
```
🟡 Missing rows.Err() check after row iteration in Scheduler.dispatchDue
After iterating rows and calling rows.Close() at line 79, the code does not check rows.Err() for errors that occurred during iteration. If a network or decoding error causes rows.Next() to return false prematurely, the error is only available via rows.Err(). Without this check, the scheduler silently operates on an incomplete set of due schedules, potentially skipping work. The outbox dispatcher in the same PR correctly implements this check at pkg/outbox/outbox.go:195-198.
Suggested change:

```go
	rows.Close()
	if err := rows.Err(); err != nil {
		return err
	}
```
```go
	ratio := cfg.SampleRatio
	if ratio <= 0 {
		ratio = 1.0
	}
```
🟡 SampleRatio=0 documented as "never sample" but silently overridden to 1.0 (always sample)
The TracingConfig.SampleRatio field comment at pkg/observability/tracing.go:32 documents 0.0 = never, 1.0 = always; default 0.1 in prod, 1.0 elsewhere, but the implementation treats any value <= 0 as 1.0. Since internal/config/config.go:153 (setDefaults) does not set a default for platform.otel_sample_ratio, the Go zero value (0.0) flows through and gets overridden to 1.0. This means enabling tracing by setting an OTLP endpoint without explicitly configuring a sample ratio results in 100% sampling rather than the documented 10% production default — potentially causing significant performance overhead and collector/storage cost in production.
Prompt for agents
The TracingConfig.SampleRatio field documents '0.0 = never, 1.0 = always; default 0.1 in prod, 1.0 elsewhere' but the implementation in InitTracing (pkg/observability/tracing.go:86-89) uses 'if ratio <= 0 { ratio = 1.0 }', making it impossible to set 0-sampling and defaulting unset values to 100% sampling.
Two things need to be reconciled:
1. Either update the comment to match the implementation (0 means 'use default 1.0'), or change the guard to only replace negative values (ratio < 0) and treat 0.0 as a valid 'never sample' setting.
2. Add a sensible default in internal/config/config.go setDefaults(), e.g. v.SetDefault('platform.otel_sample_ratio', 1.0) for dev or 0.1 for production. Without a default, the Go zero value (0.0) flows through, and the current guard silently changes it to 1.0.
Note that distinguishing 'user explicitly set 0.0' from 'field was never set' is tricky with float64 zero values. A common pattern is to use a small sentinel (e.g. -1 for 'unset') or a pointer type (*float64) to detect the difference.
Two real issues spotted by Devin Review on PR #6 — fixed against the same merge target so PR #7 carries them through to main.

* pkg/jobs/scheduler.go (BUG_0001): `rows.Err()` was never consulted after row iteration, so a connection drop mid-iteration would silently leave dues empty and the scheduler would skip work without logging anything. Mirrors the pattern already used in pkg/outbox/outbox.go.
* pkg/observability/tracing.go (BUG_0002): the docs claimed `SampleRatio = 0` meant 'never sample' but the implementation coerced any value <= 0 to 1.0, so an explicit 0 silently became 100%. Tightened the guard to negatives only and rewrote the doc comment to match the new semantics. Operators wanting 'never sample' now get exactly that.

Co-Authored-By: dede febriansyah <febriansyahd65@gmail.com>
feat(i18n): localised error messages (en, id) via Accept-Language (Tier-B #6)
Summary
Implements every Tier 1 and Tier 2 feature you selected ("semua tier 1 + tier 2 (sekalian besar)" — "all of tier 1 + tier 2, in one big batch"). Eleven self-contained packages, one feature per commit, full unit-test coverage, lint-clean, and the existing pentest harness untouched.
This is the framework's biggest expansion since the platform foundation in #4. After merging it, goforge ships every primitive a real production SaaS reaches for: rotation-safe JWTs, distributed cache + rate limit, durable jobs, signed object storage, multi-transport mailer, signed webhooks both ways, RBAC with audit log, MFA + magic-link + OAuth2, and a one-command CRUD generator.
Why
You explicitly asked for the full Tier 1 + Tier 2 set in one go. Each piece is the missing link between "starter kit" and "use this for every project I build for the next two years":
* forge gen resource

How

* pkg/jobs — so retries/backoff/DLQ are uniform across the framework; no second retry strategy.
* forge gen resource is template-driven (text/template + embed.FS); adding a new pattern only touches templates, not Go.

Test plan

* go test -race ./... — every package green (audit, authn, authz, cache, dbx, errs, events, flags, forge/gen, idempotency, jobs, mailer, observability, openapi, paginate, ratelimit, storage, tenant, validatorx, webhooks)
* golangci-lint run ./... — clean
* forge gen resource --name Widget in a fresh checkout: produces 9 files, generated code compiles + tests pass
* internal/usecase and internal/infrastructure/security tests still green

Risk
Surface area is large. Each package is opt-in — none of them activate unless wired in
internal/app/app.go — so the existing API behaviour is unchanged by this PR. Rollback is git revert of any single commit.

The two riskier places to watch:

* Adding Require(...) to a route that previously had no authorization check is a behaviour change; consult the audit log post-deploy.

Checklist

* No stray fmt.Println left behind
* 0006_audit_log.up.sql (audit_log) — earlier commits added 0005_jobs.up.sql for the queue

Link to Devin session: https://app.devin.ai/sessions/8fdfc20358514c97a766adca630a2527
Requested by: @dedeez14