Release erli18n-v0.6.0
Phase 5: per-request localization middleware for Cowboy and Elli, plus the
pure negotiation core, structural performance/correctness fixes on the new
per-request path, and two general latent-bug fixes surfaced by a test-adequacy
audit (UTF-8 truncation in the interpolator, non-UTF-8 byte escaping in the PO
serializer). Additive under the 0.x SemVer policy — new optional adapter
modules, a new public core module, and one new facade function; the default
kernel + stdlib build is unchanged.
Added
- Per-request localization middleware for Cowboy and Elli (roadmap Phase 5).
  Two new optional adapter modules make per-request locale negotiation
  turnkey:
  - erli18n_cowboy: a cowboy_middleware that negotiates the request locale
    and calls erli18n:setlocale/1 before the handler runs.
  - erli18n_elli: the Elli elli_middleware counterpart (preprocess/2).

  Both delegate to the existing erli18n_negotiate engine via a new pure,
  framework-agnostic core, erli18n_http, which resolves the locale from an
  ordered set of sources (default precedence query > cookie > Accept-Language
  header > default, configurable), with cookie/query overrides canonicalized
  and the header parsed by the fail-soft RFC 9110 parser. The chosen locale is
  also placed in the Cowboy Env (erli18n_locale) and, by default, in logger
  process metadata.

  Per-request resolution is lazy and short-circuiting: each source is
  extracted only when it is reached, and negotiation stops at the first source
  that yields a supported locale, so a request answered by an earlier-precedence
  source never pays for the cookie split or header parse of the later ones. The
  adapters resolve available/default lazily too: erli18n:loaded_locales/0
  is forced only once a source actually yields a value, erli18n:default_locale/0
  only on a total miss, and an explicitly-supplied available/default is
  zero-cost. Both the Cowboy and Elli query seams are total and fail-soft:
  each adapter feeds the raw query binary (from the framework's own total
  accessor: cowboy_req:qs/1, elli_request:query_str/1) to a single pure-core
  parser, erli18n_http:query_value/2, instead of the framework's raising query
  decoder. A value-less ?locale and a malformed percent-escape (?locale=%ZZ,
  a bare ?%, a truncated ?locale=%E0%) are skipped rather than crashing the
  request. Per-request option values are validated at the run/2 boundary:
  a malformed default (non-binary) or available (not a list, or a list with
  bad elements) is dropped so the documented default applies
  (erli18n:default_locale/0 / erli18n:loaded_locales/0), emitting a one-time
  logger:warning. Operator misconfiguration is fail-soft-and-observable,
  never request-fatal.

  cowboy and elli are optional in the same way as telemetry: they are
  declared in optional_applications and are not runtime dependencies of the
  published package, which still builds and runs on kernel + stdlib alone.
  The module docs document the per-process / not-inherited-across-spawn locale
  model and the broader cross-process handoff hazard (pooled workers, shared
  gen_servers, Task-style spawns, Cowboy stream handlers that offload), the
  mitigations, and a Phoenix interop note (no Elixir dependency).
- erli18n:loaded_locales/0: returns the distinct, sorted locales that have
  at least one catalog loaded: the authoritative available set for negotiation
  (the default available set the new adapters use). It is backed by a dedicated
  loaded-locale index kept as its own keyed persistent_term and maintained on
  every catalog add/remove path (load/reload/put/merge-that-creates/unload/
  erase_all), so the read is a single copy-free keyed lookup plus a usort rather
  than a scan of every term on the node. Index writes are compare-before-put:
  reloading an already-indexed catalog (or unloading an absent pair) leaves the
  index term untouched and skips the node-wide literal-area GC that a
  persistent_term:put would otherwise trigger.
- erli18n_http: public framework-agnostic negotiation core. Exposes
  negotiate_locale/3 (resolve the request locale from an ordered candidate
  list against an available set, with a default fall-through),
  negotiate_locale_lazy/4 (the lazy, short-circuiting engine the adapters drive,
  taking an on-demand extraction callback and available/default thunks),
  cookie_value/2 (total, fail-soft single-cookie extraction from a raw
  Cookie header, bounded against abuse), and query_value/2 (total, fail-soft
  single-parameter extraction from a raw query binary, percent-decoding the
  matched value fail-soft and bounded against abuse). It is pure (no setlocale,
  no logger, no I/O) and is the supported entry point for wiring frameworks the
  bundled adapters do not cover. The canonical available-locale index is built
  once per negotiation call and reused across every candidate; the cookie
  parser bounds the split itself (peeling at most MAX_COOKIE_PAIRS
  ;-segments and dropping the tail unscanned, O(cap) rather than O(header
  length)); and a locale="pt_BR" cookie in RFC 6265 quoted-string form is
  unquoted byte-level and total.
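
The bounded cookie split described for erli18n_http can be pictured with a small standalone sketch. It is not the library's implementation: the module name cookie_sketch, the cap value of 8, and the {ok, Value} | error return shape are all assumptions made for illustration. It peels one ;-segment at a time, stops once the budget is spent without scanning the rest of the header, and strips RFC 6265 style quotes byte-wise.

```erlang
-module(cookie_sketch).
-export([cookie_value/2]).

%% Hypothetical cap: the notes name MAX_COOKIE_PAIRS but do not state its value.
-define(MAX_PAIRS, 8).

%% Assumed return shape ({ok, Value} | error); the real function may differ.
-spec cookie_value(term(), term()) -> {ok, binary()} | error.
cookie_value(Header, Name) when is_binary(Header), is_binary(Name) ->
    scan(Header, Name, ?MAX_PAIRS);
cookie_value(_Header, _Name) ->
    error.

%% Budget spent: the remaining tail is dropped without being scanned.
scan(_Rest, _Name, 0) ->
    error;
scan(Rest, Name, Budget) ->
    %% binary:split/2 without [global] splits at the first ";" only,
    %% so each step costs one segment rather than the whole header.
    case binary:split(Rest, <<";">>) of
        [Pair, Tail] ->
            case match_pair(Pair, Name) of
                {ok, _} = Hit -> Hit;
                error -> scan(Tail, Name, Budget - 1)
            end;
        [Pair] ->
            match_pair(Pair, Name)
    end.

%% Name is already bound, so the pattern only matches the wanted cookie.
match_pair(Pair, Name) ->
    case binary:split(trim_left(Pair), <<"=">>) of
        [Name, Value] -> {ok, unquote(Value)};
        _ -> error
    end.

%% Drop the optional spaces that follow a ";" separator.
trim_left(<<$\s, Rest/binary>>) -> trim_left(Rest);
trim_left(Bin) -> Bin.

%% Strip one pair of surrounding double quotes, byte-wise, if both are present.
unquote(<<$", Rest/binary>>) when byte_size(Rest) > 0 ->
    case binary:last(Rest) of
        $" -> binary:part(Rest, 0, byte_size(Rest) - 1);
        _ -> <<$", Rest/binary>>
    end;
unquote(Value) ->
    Value.
```

With this sketch, a cookie that sits behind eight or more other pairs is simply not found, which is the trade the bound makes: the work per request is proportional to the cap, not to the header length.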
Added (negotiation core)
- erli18n_negotiate:available_index/1 + negotiate_with_index/2: the
  canonical available-locale index (#{canonicalize(Original) => Original}) is
  now a public, reusable value: build it once with available_index/1 and
  negotiate many preference lists against it with negotiate_with_index/2,
  instead of rebuilding the index per call. negotiate/2 is exactly
  negotiate_with_index(Preferred, available_index(Available)); its semantics are
  unchanged.
- The
  ?MAX_RANGES anti-DoS cap on to_locale_list/2 is now honest on every
  consumed cell. The budget is a per-consumed-cell cap (at most 32 input
  cells inspected) rather than a per-accepted-entry one: the wildcard-skip and
  oversized-tag-skip branches now also decrement the budget, so a skip-heavy
  adversarial preference list stops at 32 cells instead of walking the whole list.
  Now reachable through the newly-public negotiate/2 / negotiate_with_index/2.
  Output is byte-identical for any input whose first 32 consumed cells are all
  acceptable; it differs only when acceptable entries appear after 32 consumed
  (including skipped) cells, which is exactly the documented anti-DoS contract.
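
The difference between a per-accepted-entry cap and a per-consumed-cell cap is easy to see in a minimal standalone sketch. It does not reproduce to_locale_list/2: the module name budget_sketch, the function take_acceptable/2, and the 35-byte oversize limit are hypothetical. The only point it illustrates is that every inspected cell, kept or skipped, pays one unit of budget.

```erlang
-module(budget_sketch).
-export([take_acceptable/2]).

%% Hypothetical oversize limit, chosen only for illustration.
-define(MAX_TAG_BYTES, 35).

%% Walk at most Budget cells of the preference list and return the kept tags
%% in their original order.
-spec take_acceptable([term()], non_neg_integer()) -> [binary()].
take_acceptable(Cells, Budget) when is_list(Cells), is_integer(Budget), Budget >= 0 ->
    take(Cells, Budget, []).

take([], _Budget, Acc) ->
    lists:reverse(Acc);
take(_Cells, 0, Acc) ->
    %% Budget exhausted: the rest of the list is never inspected.
    lists:reverse(Acc);
take([<<"*">> | Rest], Budget, Acc) ->
    %% A skipped wildcard still consumes budget.
    take(Rest, Budget - 1, Acc);
take([Tag | Rest], Budget, Acc) when is_binary(Tag), byte_size(Tag) =< ?MAX_TAG_BYTES ->
    take(Rest, Budget - 1, [Tag | Acc]);
take([_Skipped | Rest], Budget, Acc) ->
    %% Oversized or malformed cells are skipped, and also consume budget.
    take(Rest, Budget - 1, Acc).
```

Under a per-accepted-entry cap, a list of thousands of wildcards followed by one real tag would be walked to the end. Here the walk stops after the budget is spent, so a tag placed behind that many skipped cells is not returned.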
Fixed
- erli18n_interp truncation now cuts on a UTF-8 codepoint boundary. Both the
  per-value clamp (clamp_value/1, at ?MAX_VALUE_BYTES) and the output cap
  (append_and_check/2, at ?MAX_OUTPUT_BYTES) previously truncated with a
  fixed-offset binary:part/3; because neither cap is codepoint-aligned, a cut
  could split a multi-byte codepoint and leave a dangling partial sequence,
  which is invalid UTF-8. A new total truncate_utf8/2 (with codepoint_start/2 /
  is_utf8_continuation/1) backs off to the codepoint's lead byte when the cut
  lands inside a multi-byte sequence, so a value that was valid UTF-8 stays valid
  after clamping or truncation. Output for any value within the cap is unchanged.
- erli18n_po:escape_string/1 is now total over any binary(). A byte that is
  not part of a valid UTF-8 sequence (e.g. a lone 0xFF) matched no clause and
  raised function_clause, crashing dump/1 on a catalog value carrying arbitrary
  bytes. A final byte-wise clause now passes such a byte through verbatim, the same
  way the PO reader tolerates raw bytes on parse, honoring the
  -spec binary() -> binary() totality contract. The five GNU gettext escapes and
  all valid-UTF-8 output are byte-for-byte unchanged.
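
The codepoint-boundary back-off behind the truncation fix can be sketched in a few lines. This is an illustration of the technique, not the code of truncate_utf8/2: the module name utf8_trunc_sketch and the function truncate/2 are hypothetical. It relies on one fact about UTF-8: continuation bytes have the bit pattern 10xxxxxx, and a valid sequence has at most three of them after its lead byte.

```erlang
-module(utf8_trunc_sketch).
-export([truncate/2]).

%% Keep at most Max bytes of Bin without splitting a multi-byte codepoint.
-spec truncate(binary(), non_neg_integer()) -> binary().
truncate(Bin, Max) when is_binary(Bin), is_integer(Max), Max >= 0 ->
    case byte_size(Bin) =< Max of
        true -> Bin;
        false -> binary:part(Bin, 0, boundary(Bin, Max, 3))
    end.

%% Pos is the offset of the first byte that would be dropped. If that byte is
%% a continuation byte, the cut lands inside a sequence, so step back towards
%% the lead byte. The step count is bounded, which keeps the function total
%% even on input that is not valid UTF-8.
boundary(_Bin, 0, _StepsLeft) ->
    0;
boundary(_Bin, Pos, 0) ->
    Pos;
boundary(Bin, Pos, StepsLeft) ->
    case binary:at(Bin, Pos) of
        Byte when Byte band 16#C0 =:= 16#80 ->
            boundary(Bin, Pos - 1, StepsLeft - 1);
        _LeadOrAscii ->
            Pos
    end.
```

For example, the binary `<<"a", 16#C3, 16#A9>>` is "a" followed by a two-byte codepoint. A fixed-offset cut at 2 bytes would keep the lead byte 16#C3 alone; the sketch backs off and keeps only `<<"a">>`.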