feat(cli): pre-process and mount bundled API specs for CLI generator#15929

Merged
Swimburger merged 22 commits into main from devin/1778875366-raw-spec-mounting
May 20, 2026

Conversation

@Swimburger
Member

@Swimburger Swimburger commented May 15, 2026

Description

Refs FER-9852

Adds plumbing to pre-process API specs (OpenAPI, AsyncAPI, protobuf, OpenRPC, GraphQL) during fern generate and mount them into generator Docker containers as compact, self-contained JSON files. This enables the fernapi/fern-cli generator to embed resolved specs and construct the CLI dynamically at runtime.

Instead of shipping raw files and requiring generators to resolve $refs, the CLI pre-processes each spec before mounting: bundles external $refs (via Redocly), merges overrides, applies overlays, filters out x-fern-ignore operations and non-matching x-fern-audiences, and outputs a single compact JSON file per spec. Schemas referenced from multiple external files are deduplicated into #/components/schemas/ entries by Redocly's bundler. Protobuf and GraphQL specs are copied as-is since they cannot be meaningfully bundled.

Follows the existing protobuf source mounting precedent (sourceMounts in runGenerator.ts). No GeneratorConfig schema changes required.

Also adds a hidden resolve-specs CLI command (fern resolve-specs <path-to-output>) and spec analysis output in the CLI generator that prints basic info (title, version, endpoint count, schema count) about mounted specs.

Updates since last revision

  • Renamed raw-specs → specs throughout: container mount path is now /fern/specs/ (was /fern/raw-specs/), constants renamed (SPECS_DIRECTORY_NAME, SPECS_MANIFEST_FILENAME, CONTAINER_SPECS_DIRECTORY, GENERATORS_WANTING_SPECS), helper renamed to generatorWantsSpecs(), generator-side function renamed to copySpecs(), all test assertions updated. Internal variable/type names (rawApiSpecs, collectRawSpecs, RawSpecsManifest) are unchanged.

Changes Made

  • New rawSpecs.ts module: collectRawSpecs() dispatches each spec to a type-specific handler:
    • OpenAPI/AsyncAPI: calls loadOpenAPI()/loadAsyncAPI() (bundles $refs, merges overrides, applies overlays), then filters via filterSpec(), then writes compact JSON
    • OpenRPC: reads, parses (JSON or YAML), merges overrides via coreMergeWithOverrides, writes compact JSON
    • Protobuf: copies the .proto root directory + any override files as-is
    • GraphQL: copies the .graphql schema file + any override files as-is
    • Uses assertNever default case for exhaustive type checking on Spec variants
    • Produces a specs-manifest.json with container paths for each spec
  • New filterSpec() function in rawSpecs.ts — filters resolved OpenAPI/AsyncAPI specs:
    • Removes operations with x-fern-ignore: true
    • When audiences is SelectAudiences, removes operations whose x-fern-audiences don't overlap with configured audiences
    • Operations without x-fern-audiences are kept (they are not restricted to any audience)
    • Paths with no remaining operations are removed entirely
    • Preserves non-operation path-level properties (e.g. parameters)
  • constants.ts — added SPECS_DIRECTORY_NAME, SPECS_MANIFEST_FILENAME, CONTAINER_SPECS_DIRECTORY, and generatorWantsSpecs() helper (backed by GENERATORS_WANTING_SPECS accept list). The helper is shared between local generation and seed test paths.
  • runGenerator.ts — accepts optional rawApiSpecs: Spec[], invokes collectRawSpecs with audiences, writes manifest, and pushes a Docker source mount for /fern/specs/
  • runLocalGenerationForWorkspace.ts — uses generatorWantsSpecs() to gate spec mounting to opted-in generators only (currently fernapi/fern-cli)
  • Seed test infrastructure — threads rawApiSpecs through GenerationRunner.RunArgs → executeGenerator() → writeFilesToDiskAndRunGenerator(), and extracts allSpecs from OSSWorkspace in TestRunner.run() (both cached and uncached workspace paths) so seed tests also receive mounted specs
    • TestRunner.run() uses generatorWantsSpecs() to check once, then extracts specs in both paths:
      • Cached path: calls workspaceCache.getOrLoadApiWorkspace() to get the AbstractAPIWorkspace and checks instanceof OSSWorkspace
      • Uncached path: checks instanceof OSSWorkspace on the freshly-loaded apiWorkspace before conversion to FernWorkspace
    • Updated ContainerTestRunner and LocalTestRunner to destructure and pass rawApiSpecs through to their respective generation functions
  • New generators/cli/src/copySpecs.ts — generator-side module that reads specs-manifest.json from the mounted /fern/specs/ directory, copies spec files, and writes a new manifest with output-relative paths. Throws clear errors on ENOENT and re-throws all other lstat errors.
  • New generators/cli/src/analyzeSpecs.ts — parses mounted specs and extracts metadata (title, version, endpoint count, schema count). Called from cli.ts to print spec info during generation.
  • generators/cli/src/cli.ts — calls analyzeSpecs to print spec info, then copySpecs during generation
  • New resolve-specs hidden CLI command: resolveSpecsForWorkspaces.ts + registration in cli.ts
  • Exported collectRawSpecs, filterSpec, generatorWantsSpecs, RawSpecsManifest, RawSpecsManifestEntry, SPECS_MANIFEST_FILENAME from local-workspace-runner/src/index.ts
  • Added js-yaml dependency to local-workspace-runner for YAML parsing in OpenRPC resolution
  • Changelog entries added for both CLI package and CLI generator
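The filterSpec() rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code from rawSpecs.ts: the Audiences and SpecDocument shapes, the HTTP_METHODS set, and the name filterSpecSketch are assumptions made for the example, and it only handles OpenAPI-style paths (AsyncAPI channels are left out).

```typescript
// Hypothetical, simplified shapes; the real types in rawSpecs.ts may differ.
type Audiences = { type: "all" } | { type: "select"; audiences: string[] };
type PathItem = Record<string, unknown>;
interface SpecDocument {
    paths?: Record<string, PathItem>;
    [key: string]: unknown;
}

// Keys of a path item that hold operations; every other key is path-level.
const HTTP_METHODS = new Set(["get", "put", "post", "delete", "options", "head", "patch", "trace"]);

function isRecord(value: unknown): value is Record<string, unknown> {
    return typeof value === "object" && value !== null && !Array.isArray(value);
}

function shouldKeepOperation(operation: unknown, audiences: Audiences): boolean {
    if (!isRecord(operation)) {
        return true;
    }
    if (operation["x-fern-ignore"] === true) {
        return false;
    }
    const tagged = operation["x-fern-audiences"];
    if (audiences.type === "select" && Array.isArray(tagged)) {
        const selected = audiences.audiences;
        // Keep only when the operation's audiences overlap the configured ones.
        return tagged.some((audience) => selected.includes(audience));
    }
    // Untagged operations are not restricted to any audience.
    return true;
}

function filterSpecSketch(spec: SpecDocument, audiences: Audiences): SpecDocument {
    if (spec.paths == null) {
        return spec;
    }
    const filteredPaths: Record<string, PathItem> = {};
    for (const [path, pathItem] of Object.entries(spec.paths)) {
        const kept: PathItem = {};
        let operationCount = 0;
        for (const [key, value] of Object.entries(pathItem)) {
            if (!HTTP_METHODS.has(key.toLowerCase())) {
                kept[key] = value; // path-level property such as `parameters`
            } else if (shouldKeepOperation(value, audiences)) {
                kept[key] = value;
                operationCount++;
            }
        }
        // Paths with no remaining operations are dropped entirely.
        if (operationCount > 0) {
            filteredPaths[path] = kept;
        }
    }
    return { ...spec, paths: filteredPaths };
}
```

Note that, as the review checklist points out, a filter of this shape leaves components/schemas untouched, so schemas used only by removed operations stay in the output.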

Human Review Checklist

  • All generators get spec mounts — Fixed: gated via generatorWantsSpecs() accept list helper
  • lstat failure silently falls back — Fixed: throws on ENOENT, re-throws all other errors
  • No exhaustive check in spec type switch — Fixed: assertNever(spec) default case added
  • Seed tests don't receive spec files — Fixed: rawApiSpecs threaded through TestRunner → ContainerTestRunner/LocalTestRunner → GenerationRunner
  • Internal names still use raw prefix: rawApiSpecs param, collectRawSpecs() function, RawSpecsManifest type, rawSpecs.ts filename all still have raw in the name. User-facing paths and constants are consistently specs. Confirm this naming split is acceptable.
  • Cached workspace path for seed tests — When WorkspaceCache is used, raw specs are extracted via getOrLoadApiWorkspace() (returns the AbstractAPIWorkspace before conversion). The instanceof OSSWorkspace runtime check determines whether allSpecs is available. Verify this works correctly for all fixture types (e.g. Fern Definition fixtures that aren't OSSWorkspace).
  • getParsedDockerImageName().name used for generator gating in seed tests — The seed TestRunner uses this.getParsedDockerImageName().name (parsed from seed.yml's test.docker.image) to check generatorWantsSpecs(). Verify this resolves to fernapi/fern-cli correctly.
  • Schemas from filtered-out operations remain in components/schemas: filterSpec only removes operations from paths, not their referenced schemas. Unreferenced schemas are harmless but add size.
  • Operations without x-fern-audiences are kept when audience filtering is active — matches IR generation behavior (untagged operations are "public").
  • Manifest type duplication: RawSpecsManifestEntry/RawSpecsManifest are defined independently in both rawSpecs.ts and copySpecs.ts. If these drift, the contract breaks silently.
  • AsyncAPI overlays not passed: loadAsyncAPI does not accept an overlays parameter (pre-existing limitation).
  • OpenRPC resolution is manual — external $refs won't be resolved. No existing bundler available.
  • No integration test — requires a published Docker image to exercise end-to-end.

Testing

  • 25 unit tests for rawSpecs.ts — 14 original + 3 integration (ignore, audiences, all-audiences) + 8 filterSpec unit tests
  • 9 unit tests for copySpecs.ts
  • 11 unit tests for analyzeSpecs.ts
  • All tests pass locally
  • Biome lint passes (pnpm run check — 0 errors)
  • No integration test — requires a published Docker image
  • resolve-specs command not unit-tested (thin orchestration layer)

Link to Devin session: https://app.devin.ai/sessions/a0c0b7f6b17d4c849623df9f75344477
Requested by: @Swimburger

@devin-ai-integration
Contributor

Devin is archived and cannot be woken up. Please unarchive Devin if you want to continue using it.

const rawSpecsDir = join(workspaceTempDir.path, RAW_SPECS_DIRECTORY_NAME);
await mkdir(rawSpecsDir, { recursive: true });

const containerSpecsDir = environment.usesContainerPaths ? CONTAINER_RAW_SPECS_DIRECTORY : rawSpecsDir;
Contributor

Critical bug: containerSpecsDir should always be CONTAINER_RAW_SPECS_DIRECTORY regardless of execution environment. When environment.usesContainerPaths is false (local execution), this sets containerSpecsDir to the host path, which causes the manifest to contain host paths instead of container paths. Since the manifest is mounted into the container and read by the generator, it must contain container paths (e.g., /fern/raw-specs/...), not host paths.

// Bug - uses host path when usesContainerPaths is false:
const containerSpecsDir = environment.usesContainerPaths ? CONTAINER_RAW_SPECS_DIRECTORY : rawSpecsDir;

// Fix - always use container path for manifest entries:
const containerSpecsDir = CONTAINER_RAW_SPECS_DIRECTORY;

The rawSpecsDir (host path) is correctly used for file I/O operations, but containerSpecsDir should only be used for generating the manifest paths that the container will read.
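The host/container split described in this comment can be sketched as follows. The entry shape, the planSpecOutput helper, and the CONTAINER_DIR value are assumptions for illustration (the value is taken from the example path in the comment); the real manifest fields in the PR may differ.

```typescript
// Hypothetical manifest entry shape, for illustration only.
interface ManifestEntrySketch {
    type: string;
    path: string; // path as seen from inside the generator container
}

// Assumed value, based on the example path mentioned in the review comment.
const CONTAINER_DIR = "/fern/raw-specs";

// Returns where the CLI should write the file on the host, and the manifest
// entry that the generator will read inside the container.
function planSpecOutput(
    hostDir: string,
    fileName: string,
    type: string
): { hostPath: string; entry: ManifestEntrySketch } {
    return {
        hostPath: `${hostDir}/${fileName}`, // used only for file I/O on the host
        entry: { type, path: `${CONTAINER_DIR}/${fileName}` } // never a host path
    };
}
```

Keeping the two paths in one return value makes it harder to write a host path into the manifest by accident.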

Suggested change
- const containerSpecsDir = environment.usesContainerPaths ? CONTAINER_RAW_SPECS_DIRECTORY : rawSpecsDir;
+ const containerSpecsDir = CONTAINER_RAW_SPECS_DIRECTORY;

Contributor

Good catch — fixed in the latest commit. containerSpecsDir now always uses CONTAINER_RAW_SPECS_DIRECTORY since the manifest is read inside the container.

@github-actions
Contributor

github-actions Bot commented May 15, 2026

Docs Generation Benchmark Results

Comparing PR branch against median of 5 nightly run(s) on main (latest: 2026-05-19T05:19:53Z).

| Fixture | main | PR | Delta |
| --- | --- | --- | --- |
| docs | 219.6s (n=5) | 218.5s (35 versions) | -1.1s (-0.5%) |

Docs generation runs fern generate --docs --preview end-to-end against the benchmark fixture with 35 API versions (each version: markdown processing + OpenAPI-to-IR + FDR upload).
Delta is computed against the nightly baseline on main.
Baseline from nightly run(s) on main (latest: 2026-05-19T05:19:53Z). Trigger benchmark-baseline to refresh.
Last updated: 2026-05-20 02:53 UTC

@github-actions
Contributor

github-actions Bot commented May 15, 2026

SDK Generation Benchmark Results

Comparing PR branch against median of 5 nightly run(s) on main (latest: 2026-05-19T05:19:53Z).

Full benchmark table

| Generator | Spec | main (generator) | main (E2E) | PR (generator) | Delta |
| --- | --- | --- | --- | --- | --- |
| csharp-sdk | square | 69s (n=5) | 105s (n=5) | 67s | -2s (-2.9%) |
| go-sdk | square | 133s (n=5) | 278s (n=5) | 129s | -4s (-3.0%) |
| java-sdk | square | 206s (n=5) | 252s (n=5) | 222s | +16s (+7.8%) |
| php-sdk | square | 54s (n=5) | 81s (n=5) | 50s | -4s (-7.4%) |
| python-sdk | square | 133s (n=5) | 232s (n=5) | 127s | -6s (-4.5%) |
| ruby-sdk-v2 | square | 90s (n=5) | 118s (n=5) | 81s | -9s (-10.0%) |
| rust-sdk | square | 167s (n=5) | 163s (n=5) | 151s | -16s (-9.6%) |
| swift-sdk | square | 52s (n=5) | 739s (n=5) | 46s | -6s (-11.5%) |
| ts-sdk | square | 230s (n=5) | 235s (n=5) | 222s | -8s (-3.5%) |

main (generator): generator-only time via --skip-scripts (includes Docker image build, container startup, IR parsing, and code generation — this is the same Docker-based flow customers use via fern generate). main (E2E): full customer-observable time including build/test scripts (nightly baseline, informational). Delta is computed against generator-only baseline.
⚠️ = generation exited with a non-zero exit code (timing may not reflect a successful run).
Baseline from nightly runs on main (latest: 2026-05-19T05:19:53Z). Trigger benchmark-baseline to refresh.
Last updated: 2026-05-20 02:54 UTC

Comment thread generators/cli/src/cli.ts Outdated
import path from "path";

interface RawSpecsManifestEntry {
type: "openapi" | "protobuf" | "openrpc" | "graphql";
Member Author

What about asyncapi?

Contributor

Good catch. AsyncAPI specs currently flow through the OpenAPISpec type (the OpenAPILoader detects them by checking for "asyncapi" in the file contents), so they show up as type: "openapi" in the Spec union. But the manifest should support an explicit asyncapi type for when the generator needs to distinguish them. Adding it now.

Swimburger and others added 10 commits May 18, 2026 17:24
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…t copySpecs module

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…ef paths

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Only copy files explicitly declared in generators.yml (spec, overrides,
overlays) plus any external $ref targets discovered by parsing the specs.
Protobuf roots are still copied as full directories.

- Add discoverExternalRefs() that recursively scans YAML/JSON for $ref values
- Add collectExternalRefPaths() to walk parsed documents for external $refs
- Replace cp(commonRoot) tree copy with per-file copyPathPreservingStructure()
- Handle transitive $ref chains, circular refs, missing files gracefully
- Add js-yaml dependency for YAML parsing during $ref discovery
- Update tests: 38 tests covering targeted copy, $ref discovery, transitive
  chains, circular refs, URL filtering, missing file handling

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
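The $ref discovery described in this commit message can be sketched as below. This is an illustrative reconstruction, not the code from the commit: the function names carry a Sketch suffix, the load callback is a hypothetical stand-in for reading and parsing a file, and $ref file parts are treated as already-normalized keys rather than being resolved relative to the referencing file.

```typescript
// Collect the file part of every external $ref found in a parsed document.
// Internal refs ("#/...") and URL refs are skipped.
function collectExternalRefPathsSketch(node: unknown, found: Set<string> = new Set<string>()): Set<string> {
    if (Array.isArray(node)) {
        for (const item of node) {
            collectExternalRefPathsSketch(item, found);
        }
    } else if (typeof node === "object" && node !== null) {
        for (const [key, value] of Object.entries(node)) {
            if (key === "$ref" && typeof value === "string") {
                const filePart = value.split("#")[0] ?? "";
                const isUrl = /^[a-z][a-z0-9+.-]*:\/\//i.test(filePart);
                if (filePart !== "" && !isUrl) {
                    found.add(filePart);
                }
            } else {
                collectExternalRefPathsSketch(value, found);
            }
        }
    }
    return found;
}

// Follow $ref chains across files. `load` is a hypothetical reader that
// returns the parsed document, or undefined when the file is missing.
function discoverExternalRefsSketch(entry: string, load: (file: string) => unknown): string[] {
    const visited = new Set<string>([entry]); // guards against circular refs
    const queue: string[] = [entry];
    const discovered: string[] = [];
    while (queue.length > 0) {
        const current = queue.shift();
        if (current === undefined) {
            break;
        }
        const doc = load(current);
        if (doc === undefined) {
            continue; // missing file: skip it instead of failing
        }
        if (current !== entry) {
            discovered.push(current);
        }
        collectExternalRefPathsSketch(doc).forEach((ref) => {
            if (!visited.has(ref)) {
                visited.add(ref);
                queue.push(ref);
            }
        });
    }
    return discovered;
}
```

The visited set is what keeps a.yml → b.yml → a.yml from looping, and a file that cannot be loaded simply contributes no further refs.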
…utting compact JSON

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@devin-ai-integration devin-ai-integration Bot force-pushed the devin/1778875366-raw-spec-mounting branch from 8f3ec68 to cf24aca Compare May 18, 2026 17:24
@devin-ai-integration devin-ai-integration Bot changed the title feat(cli): add raw API spec file mounting and embedding for CLI generator feat(cli): pre-process and mount bundled API specs for CLI generator May 18, 2026
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Swimburger and others added 2 commits May 18, 2026 18:46
…mkdir

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Comment on lines +71 to +75
} catch (err: unknown) {
    if (err != null && typeof err === "object" && "code" in err && err.code === "ENOENT") {
        throw new Error(`Spec file not found at mount path: ${containerPath}`);
    }
}
Contributor

Non-ENOENT errors are silently swallowed instead of being re-thrown. If lstat() fails with a permission error (EACCES) or any other error, the catch block will not throw, leaving isDir = false. The subsequent cp() call on line 80 will then fail with a confusing error message.

Fix: Re-throw non-ENOENT errors:

} catch (err: unknown) {
    if (err != null && typeof err === "object" && "code" in err && err.code === "ENOENT") {
        throw new Error(`Spec file not found at mount path: ${containerPath}`);
    }
    throw err;
}
Suggested change
- } catch (err: unknown) {
-     if (err != null && typeof err === "object" && "code" in err && err.code === "ENOENT") {
-         throw new Error(`Spec file not found at mount path: ${containerPath}`);
-     }
- }
+ } catch (err: unknown) {
+     if (err != null && typeof err === "object" && "code" in err && err.code === "ENOENT") {
+         throw new Error(`Spec file not found at mount path: ${containerPath}`);
+     }
+     throw err;
+ }

Contributor

Fixed — non-ENOENT errors are now re-thrown instead of being silently swallowed. See ec98fdf.

  disableTelemetry,
- rawApiSpecs: workspace instanceof OSSWorkspace ? workspace.allSpecs : undefined
+ rawApiSpecs:
+     workspace instanceof OSSWorkspace && generatorInvocation.name === "fernapi/fern-cli"
Member Author

we should make this a helper function with an accept list of generators that want raw API specs

Member Author

please...

Contributor

Done — extracted a generatorWantsRawSpecs() helper backed by a GENERATORS_WANTING_RAW_SPECS accept list (ReadonlySet<string>). New generators can opt in by adding their name to the set. See ec98fdf.
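A minimal sketch of such an accept-list helper, using the names from the reply above. The set contents follow the PR description (only fernapi/fern-cli is opted in); selectRawApiSpecs is a hypothetical wrapper added here to show how a call site might use the helper.

```typescript
// Generators that opt in to receiving mounted specs.
const GENERATORS_WANTING_RAW_SPECS: ReadonlySet<string> = new Set(["fernapi/fern-cli"]);

function generatorWantsRawSpecs(generatorName: string): boolean {
    return GENERATORS_WANTING_RAW_SPECS.has(generatorName);
}

// Hypothetical call-site helper: pass specs through only for opted-in generators.
function selectRawApiSpecs<T>(generatorName: string, allSpecs: T[] | undefined): T[] | undefined {
    return generatorWantsRawSpecs(generatorName) ? allSpecs : undefined;
}
```

Typing the set as ReadonlySet<string> keeps callers from mutating the list at runtime, so opting in a new generator stays a one-line source change.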

Parses mounted pre-processed specs and prints summary info:
- Title, version, endpoint count, schema count per spec
- Supports OpenAPI, AsyncAPI (channels), OpenRPC (methods), protobuf, GraphQL
- Gracefully handles malformed or missing spec files
- 11 unit tests for analyzeSpecs, 2 for formatSpecAnalysis

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
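The kind of summary this commit describes could be computed along these lines. The PR does not show how analyzeSpecs counts endpoints, so the counting rules here are assumptions: operations under paths for OpenAPI, entries under channels for AsyncAPI, and entries in methods for OpenRPC. Protobuf and GraphQL are left out because they are not JSON documents, and all names are illustrative.

```typescript
interface SpecAnalysisSketch {
    title: string | undefined;
    version: string | undefined;
    endpointCount: number;
    schemaCount: number;
}

const OPERATION_KEYS = new Set(["get", "put", "post", "delete", "options", "head", "patch", "trace"]);

// Treat anything that is not a plain object as an empty object.
function asRecord(value: unknown): Record<string, unknown> {
    return typeof value === "object" && value !== null && !Array.isArray(value)
        ? (value as Record<string, unknown>)
        : {};
}

function analyzeSpecSketch(
    type: "openapi" | "asyncapi" | "openrpc",
    rawJson: string
): SpecAnalysisSketch | undefined {
    let doc: Record<string, unknown>;
    try {
        doc = asRecord(JSON.parse(rawJson));
    } catch {
        return undefined; // malformed spec: report nothing instead of throwing
    }
    const info = asRecord(doc["info"]);
    let endpointCount = 0;
    if (type === "openapi") {
        for (const pathItem of Object.values(asRecord(doc["paths"]))) {
            endpointCount += Object.keys(asRecord(pathItem)).filter((key) => OPERATION_KEYS.has(key)).length;
        }
    } else if (type === "asyncapi") {
        endpointCount = Object.keys(asRecord(doc["channels"])).length;
    } else {
        const methods = doc["methods"];
        endpointCount = Array.isArray(methods) ? methods.length : 0;
    }
    const title = info["title"];
    const version = info["version"];
    return {
        title: typeof title === "string" ? title : undefined,
        version: typeof version === "string" ? version : undefined,
        endpointCount,
        schemaCount: Object.keys(asRecord(asRecord(doc["components"])["schemas"])).length
    };
}
```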
Swimburger and others added 3 commits May 18, 2026 19:47
…ssing

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
audiences?: Audiences;
}): Promise<RawSpecsManifestEntry> {
switch (spec.type) {
case "openapi":
Member Author

no asyncapi?

Contributor

The Spec union type is OpenAPISpec | ProtobufSpec | OpenRPCSpec | GraphQLSpec — there is no separate asyncapi variant. AsyncAPI specs flow through OpenAPISpec (the OpenAPILoader detects them by checking for "asyncapi" in the file content). The resolveOpenAPIOrAsyncAPI handler already detects and handles both via isAsyncAPISpec(), and writes type: "asyncapi" in the manifest when detected (line 144).

The assertNever added in the same commit (ec98fdf) enforces exhaustiveness at compile time — if a new spec variant were added to the Spec union, this switch would fail to compile.

case "protobuf":
return copyProtobuf({ spec, hostOutputDir, containerBaseDir, index });
case "graphql":
return copyGraphQL({ spec, hostOutputDir, containerBaseDir, index });
Member Author

you should always assertNever to enforce exhaustiveness.

Contributor

Added assertNever(spec) as the default case. This enforces exhaustiveness at compile time — if a new variant is added to the Spec union, this switch will fail to compile. See ec98fdf.
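For readers unfamiliar with the pattern, here is a small self-contained sketch of an assertNever default case. SpecSketch is a simplified stand-in for the real Spec union (its fields are assumptions), and assertNever is defined locally rather than imported from the repository's utilities.

```typescript
// Simplified stand-in for the Spec union; field names are illustrative.
type SpecSketch =
    | { type: "openapi"; path: string }
    | { type: "protobuf"; root: string }
    | { type: "openrpc"; path: string }
    | { type: "graphql"; path: string };

// Accepts only `never`, so it type-checks only when every variant is handled.
function assertNever(value: never): never {
    throw new Error(`Unexpected value: ${JSON.stringify(value)}`);
}

function describeSpec(spec: SpecSketch): string {
    switch (spec.type) {
        case "openapi":
            return `bundle ${spec.path}`;
        case "protobuf":
            return `copy ${spec.root}`;
        case "openrpc":
            return `merge ${spec.path}`;
        case "graphql":
            return `copy ${spec.path}`;
        default:
            // If a new variant is added to SpecSketch without a case above,
            // `spec` is no longer `never` here and compilation fails.
            return assertNever(spec);
    }
}
```

The runtime throw is a second line of defense for values that bypass the type system, for example data parsed from untyped JSON.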

Swimburger and others added 5 commits May 18, 2026 21:55
…cept list helper, add assertNever

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…unt specs for generators

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@Swimburger Swimburger merged commit fc79a22 into main May 20, 2026
215 checks passed
@Swimburger Swimburger deleted the devin/1778875366-raw-spec-mounting branch May 20, 2026 15:01