new: first ARK version #7
Merged
isTravis
approved these changes
May 4, 2026
ARK Server
Context
Underlay needs persistent, citeable identifiers for collections and records, which is standard academic/archival infrastructure. We're implementing ARK (Archival Resource Key) identifiers: free, open persistent identifiers that support inflection (metadata queries via `?info`, `?json`) and resolve via standard web redirects.

The implementation follows ARK conventions and best practices by giving each org a unique "shoulder" (`ul{counter}{digit}`), minting opaque betanumeric IDs for collections, and resolving ARK URLs to collection overviews, specific versions, or individual records. If enabled, record-level ARKs redirect to a URL field within the record's data.

In this way, Underlay can be used not just for data storage, but also as a persistent identifier and metadata retrieval service for both collections and the individual records they reference on remote servers.
Architecture Overview
URL format: `https://underlay.org/ark:NAAN/ul{counter}{digit}{collectionHash}`

Examples:
- `ark:12345/ulb0xkm3nq8p` (collection)
- `ark:12345/ulb0xkm3nq8p.v2` (specific version)
- `ark:12345/ulb0xkm3nq8p/Person/author-001` (record)
- `ark:12345/ulb0xkm3nq8p.v2/Person/author-001` (record in a specific version)

Parsing shoulder from name: Shoulders use the ARK "primordinal" convention: they end at the first digit. Since shoulder = `ul{consonants}{digit}`, parsing scans for the first digit after `ul`. E.g. `ulb0xkm3nq8p` → shoulder=`ulb0`, collectionArkId=`xkm3nq8p`. To ensure unambiguous parsing, the collection hash must start with a consonant (guaranteed by the betanumeric encoding: if the first character is a digit, prepend a consonant pad character).

Betanumeric alphabet: `bcdfghjkmnpqrstvwxz0123456789` (consonants without 'l' or 'y', plus digits; 29 chars)

Routing: ARK URLs are user-facing (`/ark:NAAN/...`) and handled by Astro middleware, which intercepts them before page routing, calls the Fastify API internally to resolve them, and then returns a redirect or metadata response.

Storage: Minimize writes. We store:
- `ark_shoulders` - one row per org (created on first ARK use)
- `ark_collections` - one row per collection (ARK ID, enabled flag, custom redirect URL)
- `ark_record_types` - one row per (collection, record type) where ARKs are enabled (stores which URL field to redirect to)

Step 1: Database Schema
File: `src/db/schema.ts`

Add to `accounts` table:
- `arkNaan: text("ark_naan")` (nullable; if set, overrides default NAAN for all ARKs)

Add three new tables:
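As a rough sketch, the row shapes implied by the Storage notes in the Architecture Overview could look like the following plain TypeScript types. These are not the actual table definitions: every property name other than `enabled` and `customUrl` is an assumption made for illustration, and the UUID is a placeholder.

```typescript
// Hypothetical row shapes for the three ARK tables (illustration only).
// Property names such as accountId and redirectUrlField are assumptions.

interface ArkShoulderRow {
  accountId: string; // one row per org
  shoulder: string; // full shoulder string, e.g. "ulb0"
}

interface ArkCollectionRow {
  collectionId: string; // one row per collection
  arkId: string; // betanumeric collection ID
  enabled: boolean;
  customUrl: string | null; // optional custom redirect URL
}

interface ArkRecordTypeRow {
  collectionId: string; // one row per (collection, record type)
  recordType: string;
  redirectUrlField: string; // which URL field of the record to redirect to
}

// Example rows using the identifiers from the Architecture Overview.
const exampleRows: {
  shoulder: ArkShoulderRow;
  collection: ArkCollectionRow;
  recordType: ArkRecordTypeRow;
} = {
  shoulder: { accountId: "org-1", shoulder: "ulb0" },
  collection: {
    collectionId: "00000000-0000-0000-0000-000000000000", // placeholder UUID
    arkId: "xkm3nq8p",
    enabled: true,
    customUrl: null,
  },
  recordType: {
    collectionId: "00000000-0000-0000-0000-000000000000",
    recordType: "Person",
    redirectUrlField: "homepage", // hypothetical field name
  },
};
```

Storing only these three row types keeps the write volume independent of how many records a collection holds.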
Migration file: `src/db/migrations/0004_{name}.sql`

Step 2: ARK Utility Library
New file: `src/lib/ark.ts`

Functions:
- `BETANUMERIC = "bcdfghjkmnpqrstvwxz0123456789"` (29 chars)
- `BETANUMERIC_CONSONANTS = "bcdfghjkmnpqrstvwxz"` (19 chars; used for the shoulder counter)
- `collectionToArkId(collectionId: string): string` - SHA-256 of UUID → base-29 encode first 48 bits → 8-char betanumeric string (deterministic, no DB write needed for the hash)
- `nextShoulderCounter(existingCount: number): string` - converts integer index to consonant-alphabet string (0→"b", 1→"c", ..., 18→"z", 19→"bb", ...)
- `mintShoulder(accountId: string): Promise<string>` - atomic: count existing shoulders, compute next counter, append random digit 0-9, insert into `ark_shoulders`, return full shoulder string like `"ulb0"`
- `getOrMintShoulder(accountId: string): Promise<string>` - get existing or mint new
- `parseArkPath(pathAfterNaan: string): { shoulder, collectionArkId, version?, recordType?, recordId? }` - parses `ulb0xkm3nq8p.v2/Person/author-001`
- `buildArkUrl(naan, shoulder, collectionArkId, version?, recordType?, recordId?): string`
- `getCollectionArk(collectionId: string, naan: string): Promise<string | null>` - returns full ARK URL for a collection if enabled

Step 3: ARK API Routes
New file: `src/api/routes/ark.ts`, registered in `src/api/server.ts` with `prefix: "/api"`

Routes:
- `GET /api/ark/resolve` - main resolution endpoint called by Astro middleware
  - Query parameter: `path` (the full path after `ark:`)
  - Look up `ark_shoulders` by shoulder value → get accountId
  - Look up `ark_collections` by arkId → get collectionId, enabled flag, customUrl
  - Build `metadata` object (see Step 5 for shape by object type)
  - Custom URL: `customUrl` → redirect URL
  - Version ARK: `/{ownerSlug}/{collectionSlug}/v/{number}`, metadata includes version details (semver, message, pushedBy, createdAt)
  - Collection ARK: `/{ownerSlug}/{collectionSlug}`, metadata reflects latest version
  - Record ARK: look up `ark_record_types` → fetch record from correct version → extract URL field → redirect URL; metadata includes `recordType`, `recordId`, `data` (public fields only), `schema` (the schema for that type)
  - Returns `{ type: 'redirect'|'not_found', url?, metadata: { type, ... } }`
- `GET /api/collections/:owner/:slug/ark` - get ARK settings for collection (owner auth required)
  - Returns `{ enabled, customUrl, arkUrl, shoulder, arkId }`
- `PATCH /api/collections/:owner/:slug/ark` - update ARK settings (owner/admin auth)
  - Body: `{ enabled?: boolean, customUrl?: string | null }`
- `GET /api/collections/:owner/:slug/ark/record-types` - get schema ARK settings
  - Returns `{ recordType, redirectUrlField }` for enabled record types
- `PATCH /api/collections/:owner/:slug/ark/record-types` - enable/disable ARK for a record type
  - Body: `{ recordType, redirectUrlField: string | null }` (null to disable)
- `PATCH /api/accounts/:slug/ark` - update org NAAN (admin auth)
  - Body: `{ naan: string | null }`

Hook collection creation (`src/api/routes/collections.ts`):

After `db.insert(schema.collections)`, auto-mint ARK:
- `getOrMintShoulder(account.id)`
- `collectionToArkId(id)`
- Insert into `ark_collections` with `enabled: true`

Update collection/version API responses:
- `GET /collections/:owner/:slug` → add `ark?: string` field (the ARK URL, if enabled)
- `GET /collections/:owner/:slug/versions` → add `ark?: string` per version (with `.vN` suffix)
- `GET /collections/:owner/:slug/versions/:n` → add `ark?: string` for the version
- `GET /collections/:owner/:slug/versions/:n/records` → add `ark?: string` per record (if record type ARK enabled)

Step 4: ARK Root Handler
In the Astro middleware (or as part of the ARK resolver logic), handle requests to `/ark:NAAN/` (path ends immediately after the NAAN with a trailing slash and nothing more) as a special case:

Return `text/plain` content describing the naming authority:

(Full policy text TBD during implementation; should state that Underlay maintains persistent redirects for collections and records, that ARKs are not reassigned, and reference the erc-support info.)
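A minimal sketch of how the root special case could be detected, assuming a NAAN is a plain run of decimal digits. The helper name `matchArkRoot` is hypothetical and not part of this PR.

```typescript
// Assumption: a NAAN consists only of decimal digits (e.g. "12345").
const ARK_ROOT_PATTERN = /^\/ark:(\d+)\/$/;

// Returns the NAAN when the path is exactly "/ark:NAAN/", otherwise null.
// Hypothetical helper for illustration.
function matchArkRoot(pathname: string): string | null {
  const match = ARK_ROOT_PATTERN.exec(pathname);
  return match ? (match[1] ?? null) : null;
}
```

With this pattern, `/ark:12345/` yields `"12345"`, while `/ark:12345` (no trailing slash) and `/ark:12345/ulb0xkm3nq8p` both yield `null` and fall through to normal ARK resolution.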
The `erc-support.where` field in all ERC responses points to this URL: `https://underlay.org/ark:{NAAN}/`

Step 5: Astro Middleware - ARK Resolver
File: `src/middleware.ts`

Intercept requests where `pathname.startsWith("/ark:")`:
- `/api/ark/resolve` returns a `metadata` object whose shape depends on the resolved object type:
  - `{ type: 'collection', collectionName, ownerName, createdAt, arkUrl }`
  - `{ type: 'version', collectionName, ownerName, versionNumber, semver, createdAt, message, arkUrl }`
  - `{ type: 'record', collectionName, ownerName, versionNumber, recordType, recordId, schema, data (filtered), createdAt, arkUrl }`
- `buildERC(metadata)` - builds ANVL-format ERC response tailored to object type:

For a collection:
For a version:
For a record:
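As a rough sketch of how one builder could cover all three object types, the function below maps the metadata shapes onto the ERC who/what/when/where elements. The mapping itself (owner as `who`, a composed label as `what`, `createdAt` as `when`, the ARK URL as `where`) is an assumption, as are the second `supportUrl` parameter, the string type of `createdAt`, and the omission of the record's `schema` and `data` fields.

```typescript
// Metadata shapes follow the list above; schema/data are left out for brevity.
type ArkMetadata =
  | { type: "collection"; collectionName: string; ownerName: string; createdAt: string; arkUrl: string }
  | { type: "version"; collectionName: string; ownerName: string; versionNumber: number; semver: string; createdAt: string; message: string; arkUrl: string }
  | { type: "record"; collectionName: string; ownerName: string; versionNumber: number; recordType: string; recordId: string; createdAt: string; arkUrl: string };

// ANVL is line-oriented, so collapse whitespace runs (including newlines).
function anvlValue(value: string): string {
  return value.replace(/\s+/g, " ").trim();
}

// Assumed wording for the "what" element of each object type.
function describeWhat(m: ArkMetadata): string {
  switch (m.type) {
    case "collection":
      return m.collectionName;
    case "version":
      return `${m.collectionName} v${m.versionNumber} (${m.semver})`;
    case "record":
      return `${m.recordType}/${m.recordId} in ${m.collectionName} v${m.versionNumber}`;
  }
}

// supportUrl is the ARK root URL, e.g. "https://underlay.org/ark:12345/".
function buildERC(metadata: ArkMetadata, supportUrl: string): string {
  const lines = [
    "erc:",
    `who: ${anvlValue(metadata.ownerName)}`,
    `what: ${anvlValue(describeWhat(metadata))}`,
    `when: ${anvlValue(metadata.createdAt)}`,
    `where: ${anvlValue(metadata.arkUrl)}`,
    "erc-support:",
    `where: ${anvlValue(supportUrl)}`,
  ];
  return lines.join("\n") + "\n";
}
```

Using a discriminated union on `type` lets the compiler check that every object type has a `what` label, so adding a fourth type later would surface as a type error rather than a malformed response.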
For `?json`, return the full `metadata` object as JSON (including `schema` for records, version provenance fields like `semver`, `message`, `pushedBy`).

Step 6: Organization Settings - ARK NAAN
File: `src/pages/[owner]/settings.astro`

Add new section below API Keys (before Danger Zone):
- Field showing `orgData.arkNaan ?? 'Default (12345)'` with save button
- `action = "update-ark"` form handler that calls `PATCH /api/accounts/${owner}/ark`

Add `PATCH /accounts/:slug/ark` handler in `src/api/routes/accounts.ts`:
- Updates `accounts.arkNaan`

Step 7: Collection Settings - ARK Section
File: `src/pages/[owner]/[collection]/settings.astro`

Add section below Export:
- Loads current settings from `GET /api/collections/${owner}/${collection}/ark`
- `action = "update-ark"` → calls `PATCH /api/collections/${owner}/${collection}/ark`

Step 8: Collection Schema Settings - Per-Type ARK Minting
File: `src/pages/[owner]/[collection]/schemas.astro`

For each schema type, if there's at least one `type: "string"`, `format: "uri"` or `"url"` field:
- `PATCH /api/collections/${owner}/${collection}/ark/record-types` with `{ recordType, redirectUrlField: field | null }`

Note: "mint ARKs" in the spec means enabling the ARK infrastructure for that type. ARKs are resolved dynamically at request time, not pre-stored per record.
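The field filter described above can be sketched as a small pure function. This assumes each record type's schema exposes a JSON-Schema-style `properties` map; the `RecordTypeSchema` shape and the name `findUrlFields` are hypothetical.

```typescript
// Assumed schema shape: a JSON-Schema-like map of field name to definition.
interface FieldSchema {
  type?: string;
  format?: string;
}

interface RecordTypeSchema {
  properties: Record<string, FieldSchema>;
}

// Returns the names of fields eligible as ARK redirect targets:
// string-typed fields whose format is "uri" or "url".
function findUrlFields(schema: RecordTypeSchema): string[] {
  return Object.entries(schema.properties)
    .filter(([, field]) => field.type === "string" && (field.format === "uri" || field.format === "url"))
    .map(([name]) => name);
}

// Example: only "homepage" and "avatar" qualify.
const personSchema: RecordTypeSchema = {
  properties: {
    name: { type: "string" },
    homepage: { type: "string", format: "uri" },
    avatar: { type: "string", format: "url" },
    age: { type: "number" },
  },
};
const eligibleFields = findUrlFields(personSchema);
```

An empty result would mean the type has no eligible field, in which case the per-type ARK control is not offered at all.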
Step 9: UI — Add ARK Display
Collection overview (`src/pages/[owner]/[collection]/index.astro`):
- Copy-to-clipboard via `navigator.clipboard.writeText()`

Versions page (`src/pages/[owner]/[collection]/versions.astro`):
- Version ARK (with `.vN` suffix) as a small copyable `<code>` element per row

Version detail page (`src/pages/[owner]/[collection]/v/[n].astro`):

Record table in version detail:
Step 10: Environment Variable
File: `src/lib/ark.ts` (or `src/lib/page-utils.ts`)

`export const DEFAULT_NAAN = process.env.ARK_DEFAULT_NAAN ?? "12345"`

Add `ARK_DEFAULT_NAAN=12345` to `.env.test`

Critical Files
- `src/db/schema.ts` - add `arkNaan` to accounts, add 3 new tables
- `src/db/migrations/0004_*.sql`
- `src/lib/ark.ts`
- `src/api/server.ts` - register `arkRoutes`
- `src/api/routes/ark.ts`
- `src/api/routes/collections.ts` - add `ark` in responses
- `src/api/routes/versions.ts` - add `ark` with `.vN` suffix in version responses
- `src/middleware.ts` - intercept `/ark:*`, call resolve, return redirect/ERC/JSON
- `src/pages/[owner]/settings.astro`
- `src/pages/[owner]/[collection]/settings.astro`
- `src/pages/[owner]/[collection]/schemas.astro`
- `src/pages/[owner]/[collection]/index.astro`
- `src/pages/[owner]/[collection]/versions.astro`
- `src/pages/[owner]/[collection]/v/[n].astro`

Design Decisions
No per-record storage: Record ARKs are resolved dynamically (look up collection→type→field→fetch record). Only 3 new tables regardless of collection size.
Deterministic collection ARK IDs: `SHA-256(collectionUUID)` → base-29 encode → 8 betanumeric chars. No lookup table needed for collection IDs; just store once in `ark_collections`.

Shoulder uniqueness: `mintShoulder` counts existing rows in `ark_shoulders` atomically and passes the count to `nextShoulderCounter` to assign the next sequential counter. Random digit (0-9) appended for character diversity per ARK best practices.

Enabled by default: New collections get an `ark_collections` row with `enabled: true`. Users can disable in settings.

Record IDs in ARK URLs are literal: Using the actual `recordId` text (not hashed) avoids a massive reverse-lookup table. Record IDs in Underlay are already not sequential integers.

URL field detection for schema ARKs: At save time, the UI only offers fields where `type === "string"` and (`format === "uri"` or `format === "url"`). No runtime schema validation in the resolver; we trust the stored field name.

Verification
- Run `npm run dev`
- Create a collection and confirm `ark_shoulders` and `ark_collections` rows are created
- Visit `/ark:12345/ulb0xkm3nq8p` - confirm redirect to collection overview
- Visit `/ark:12345/ulb0xkm3nq8p.v1` - confirm redirect to version 1 page
- Test `?info` inflection: `curl "http://localhost:4321/ark:12345/ulb0xkm3nq8p?info"` - confirm ERC text/plain response with who/what/when/where + erc-support block
- Test `?json` inflection - confirm JSON response with collection metadata
- Confirm `/ark:12345/ulb0xkm3nq8p/Person/author-001` redirects to the URL in that record's data
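Alongside the manual checks, the pure helpers from Step 2 can be exercised in isolation. The sketch below is one possible reading of `nextShoulderCounter` and `parseArkPath` based on the behaviour described in this PR, not the merged implementation. Returning `null` for unparseable input, typing `version` as a number, and joining any extra path segments into `recordId` are assumptions.

```typescript
const CONSONANTS = "bcdfghjkmnpqrstvwxz"; // 19 chars, as in BETANUMERIC_CONSONANTS

// Bijective base-19 numbering: 0 -> "b", 18 -> "z", 19 -> "bb".
function nextShoulderCounter(existingCount: number): string {
  let n = existingCount + 1;
  let out = "";
  while (n > 0) {
    n -= 1;
    out = CONSONANTS.charAt(n % CONSONANTS.length) + out;
    n = Math.floor(n / CONSONANTS.length);
  }
  return out;
}

interface ParsedArk {
  shoulder: string;
  collectionArkId: string;
  version?: number;
  recordType?: string;
  recordId?: string;
}

// Shoulder = "ul" + consonants + first digit; the rest of the name is the
// collection ID, optionally followed by ".vN".
const ARK_NAME = /^(ul[bcdfghjkmnpqrstvwxz]+\d)([bcdfghjkmnpqrstvwxz0-9]+)(?:\.v(\d+))?$/;

function parseArkPath(pathAfterNaan: string): ParsedArk | null {
  const [name = "", recordType, ...rest] = pathAfterNaan.split("/");
  const match = ARK_NAME.exec(name);
  if (!match) return null; // assumption: signal bad input with null
  const [, shoulder = "", collectionArkId = "", versionText] = match;
  const parsed: ParsedArk = { shoulder, collectionArkId };
  if (versionText !== undefined) parsed.version = Number(versionText);
  if (recordType) {
    parsed.recordType = recordType;
    // Record IDs are literal; keep any further segments as part of the ID.
    if (rest.length > 0) parsed.recordId = rest.join("/");
  }
  return parsed;
}

const parsedExample = parseArkPath("ulb0xkm3nq8p.v2/Person/author-001");
```

Here `parsedExample` has shoulder `ulb0`, collectionArkId `xkm3nq8p`, version `2`, recordType `Person`, and recordId `author-001`, matching the example in the Architecture Overview.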