feat: Remote mounts, serving tiers, and peer-mode CLI #1428

Open
bplatz wants to merge 7 commits into feature/rdfs-enforcement-entailment from feature/remote-mounts

bplatz commented Jul 3, 2026

Serve ledgers to downstream consumers in two tiers and let a consumer mount a remote Fluree's ledgers as read-only, locally-queryable sources. Design doc: docs/design/remote-mounts.md; user guide: docs/guides/sharing-data.md.

The model

  • Query serving — the origin executes queries (its compute, row-level policy per identity). The only tier with fine-grained permissioning.
  • Block serving — the origin serves canonical, CID-verified index content and the consumer executes queries locally (their compute, Iceberg-style). Strictly all-or-nothing per (token, ledger); fine-grained access stays on the query tier.

What's included

Raw block serving. ProxyStorage gains explicit read modes: Raw fetches canonical CAS bytes via GET /storage/objects/{cid} with client-side CID verification; Filtered keeps the FLKB negotiation. Peer proxy mode now uses Raw — which is what makes binary-indexed ledgers actually readable over the proxy (the FLKB tier has no leaf decoder on the read path).

Per-ledger serving posture. New f:servingDefaults setting group (f:serveQuery / f:serveBlocks / f:publicVisibility) in the ledger config graph, enforced on transaction-role servers only (query gate 403; blocks gate 404 on /storage/block, /storage/objects, /commits, /pack) and advertised per-caller on NS record responses plus a coarse block in /.well-known/fluree.json.
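
The gate behavior described above can be sketched as a small decision function. This is an illustration only: `Posture`, `Surface`, and `gate` are hypothetical names, not types from this PR, and the sketch assumes the posture has already been resolved from f:servingDefaults.

```rust
/// Hypothetical stand-in for a resolved f:servingDefaults posture.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Posture {
    pub serve_query: bool,
    pub serve_blocks: bool,
}

/// The two serving surfaces the description distinguishes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Surface {
    /// Query execution on the origin.
    Query,
    /// Raw content: /storage/block, /storage/objects, /commits, /pack.
    Blocks,
}

/// Returns `None` when the request may proceed, or the HTTP status to
/// reject it with. Only transaction-role servers enforce the posture.
pub fn gate(posture: Posture, surface: Surface, is_transaction_role: bool) -> Option<u16> {
    if !is_transaction_role {
        return None; // peers skip the gate entirely
    }
    match surface {
        Surface::Query if !posture.serve_query => Some(403),
        Surface::Blocks if !posture.serve_blocks => Some(404),
        _ => None,
    }
}
```

With `serve_query: false` and `serve_blocks: true`, a query to a transaction-role server would be rejected with 403 while block reads pass through.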

Remote mounts. FlureeBuilder::with_remote_mount composes a CompositeNameService (prefix-routed reads with record localization — remote inventory:main appears as acme/inventory:main; writes to mounts rejected) with a new StorageBackend::Routed variant (namespace-prefix store selection at the single content_store seam). Mounted ledgers keep full native semantics including mixed local+mounted dataset queries. ProxyStorage/ProxyNameService moved to fluree-db-nameservice-sync (server re-exports preserve the old paths).
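
A minimal sketch of the prefix routing and record localization described above, assuming a mount prefix is joined to the remote alias with `/` (as in the acme/inventory:main example). The function names are hypothetical and do not reflect the actual CompositeNameService API.

```rust
/// Localize a remote alias under a mount prefix:
/// ("acme", "inventory:main") becomes "acme/inventory:main".
pub fn localize(mount: &str, remote_alias: &str) -> String {
    format!("{mount}/{remote_alias}")
}

/// Route a local alias: if it falls under the mount prefix, return the
/// alias as the remote knows it; otherwise the alias is not mounted.
pub fn route<'a>(mount: &str, local_alias: &'a str) -> Option<&'a str> {
    // Require the '/' separator so "acme" does not match "acmecorp/...".
    local_alias.strip_prefix(mount)?.strip_prefix('/')
}

/// Reject writes that target any mounted alias; mounts are read-only.
pub fn check_write(mounts: &[&str], local_alias: &str) -> Result<(), String> {
    match mounts.iter().find(|m| route(m, local_alias).is_some()) {
        Some(m) => Err(format!("{local_alias}: mount '{m}' is read-only")),
        None => Ok(()),
    }
}
```

Reads for `acme/inventory:main` would be forwarded to the remote as `inventory:main`, while a write to the same alias fails before reaching any store.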

CLI peer mode. fluree track add --mode peer runs queries locally over blocks fetched on demand, CID-verified and cached in a persistent per-remote disk cache; writes still forward over HTTP. Plus fluree remote ledgers (the token's auth-filtered catalog with serving tiers) and fluree cache status|clear.
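
The verify-then-cache flow can be sketched as follows. This is not the PR's cache implementation: `VerifiedCache` is a hypothetical in-memory stand-in for the persistent disk cache, and the digest function is a parameter because the sketch makes no assumption about how a CID is actually derived from the bytes.

```rust
use std::collections::HashMap;

/// In-memory stand-in for a content-addressed cache. `digest` maps bytes
/// to the identifier they should have been requested under.
pub struct VerifiedCache<D: Fn(&[u8]) -> String> {
    digest: D,
    entries: HashMap<String, Vec<u8>>,
}

impl<D: Fn(&[u8]) -> String> VerifiedCache<D> {
    pub fn new(digest: D) -> Self {
        Self { digest, entries: HashMap::new() }
    }

    /// Return cached bytes, or fetch, verify, and cache them. Bytes whose
    /// digest does not match the requested id are never stored.
    pub fn get_or_fetch(
        &mut self,
        cid: &str,
        fetch: impl FnOnce(&str) -> Option<Vec<u8>>,
    ) -> Result<&[u8], String> {
        if !self.entries.contains_key(cid) {
            let bytes = fetch(cid).ok_or_else(|| format!("{cid}: not found on remote"))?;
            if (self.digest)(&bytes) != cid {
                return Err(format!("{cid}: digest mismatch, refusing to cache"));
            }
            self.entries.insert(cid.to_string(), bytes);
        }
        Ok(self.entries[cid].as_slice())
    }
}
```

Because entries are keyed by content identity and verified on insert, a hit never needs re-validation, which is why clearing such a cache is always safe.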

HTTP Range. Single-range requests on /storage/objects (206 + Content-Range, full-object CID verification before slicing) and native ranged reads in ProxyStorage.
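
As an illustration of "verify the full object, then slice", here is a sketch of single-range handling. The function names are hypothetical, the verification step is passed in as a closure rather than modeled, and the parsing follows the standard `bytes=` forms, not necessarily what the server accepts.

```rust
/// Parse a single-range header value ("bytes=a-b", "bytes=a-", "bytes=-n")
/// against a known object length. Returns an inclusive (start, end) pair.
pub fn parse_single_range(header: &str, len: u64) -> Option<(u64, u64)> {
    let spec = header.strip_prefix("bytes=")?;
    if spec.contains(',') || len == 0 {
        return None; // multi-range requests are not handled
    }
    let (first, last) = spec.split_once('-')?;
    let (start, end) = match (first.trim(), last.trim()) {
        ("", "") => return None,
        ("", n) => {
            // Suffix form: the last `n` bytes of the object.
            let n: u64 = n.parse().ok()?;
            if n == 0 {
                return None;
            }
            (len.saturating_sub(n), len - 1)
        }
        (s, "") => (s.parse().ok()?, len - 1),
        (s, e) => (s.parse().ok()?, e.parse::<u64>().ok()?.min(len - 1)),
    };
    (start <= end && start < len).then_some((start, end))
}

/// Verify the whole object first, then slice. Returns the partial body
/// and the Content-Range value for a 206 response.
pub fn slice_verified<'a>(
    body: &'a [u8],
    verify: impl Fn(&[u8]) -> bool,
    range: (u64, u64),
) -> Option<(&'a [u8], String)> {
    if !verify(body) {
        return None; // never serve a slice of unverified bytes
    }
    let (start, end) = range;
    let part = body.get(start as usize..=end as usize)?;
    Some((part, format!("bytes {start}-{end}/{}", body.len())))
}
```

Verifying before slicing means a ranged response can only ever expose bytes from an object whose full content matched its CID.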

Vended S3 credentials. For S3-backed origins, GET /storage/credentials?ledger= mints STS AssumeRole grants narrowed by session policy to the ledger's name-level prefix (covers all branches + the @shared dict namespace). Consumers auto-refresh grants inside an expiry margin; the CLI probes and prefers S3-direct automatically, falling back to proxied reads on 404. Single-bucket S3 only in this iteration; short TTLs since grants outlive revocation until expiry.
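
The consumer-side refresh and fallback rules can be sketched as two small functions. `Grant`, `needs_refresh`, `ReadPath`, and `read_path_for_probe` are hypothetical names; the 60-second margin is taken from the commit notes, and treating statuses other than 200 and 404 as errors is an assumption.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical view of a vended grant; only the expiry matters here.
pub struct Grant {
    pub expires_at: SystemTime,
}

/// Refresh once the remaining lifetime drops to this margin.
pub const REFRESH_MARGIN: Duration = Duration::from_secs(60);

/// True when the grant is expired or inside the refresh margin.
pub fn needs_refresh(grant: &Grant, now: SystemTime) -> bool {
    match grant.expires_at.duration_since(now) {
        Ok(remaining) => remaining <= REFRESH_MARGIN,
        Err(_) => true, // expiry is already in the past
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum ReadPath {
    S3Direct,
    Proxied,
}

/// Map the credentials probe's HTTP status to a read path. A 404 means
/// the origin does not vend credentials, so reads stay proxied.
pub fn read_path_for_probe(status: u16) -> Result<ReadPath, u16> {
    match status {
        200 => Ok(ReadPath::S3Direct),
        404 => Ok(ReadPath::Proxied),
        other => Err(other),
    }
}
```

With the default 900-second TTL, a freshly minted grant would be used for roughly 840 seconds before the next read triggers a refresh.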

Fix along the way: @shared dict-blob addresses carry only the ledger name, so proxy clients derive a default-branch alias — the block/object endpoints now branch-resolve dict-blob requests (previously peers tracking non-main branches 404'd on dictionary fetches). The legacy per-branch dict layout also parses now.

Docs

New: design/remote-mounts.md, guides/sharing-data.md, cli/cache.md. Updated: operations/query-peers.md (two read tiers; corrected /storage/block leaf semantics), ledger-config/setting-groups.md, design/auth-contract.md, cli/track.md, cli/remote.md, api/endpoints.md, and cli/server-integration.md (the contract an embedding server must expose for peer-mode consumers, including the dict-blob branch-resolution and 404-fallback requirements).

Testing

  • proxy_integration.rs: 28 tests including end-to-end peer query over HTTP against an indexed ledger, remote-mount mixed-dataset query + write rejection, serving-posture gate/advertisement flips, raw-mode byte identity + ranged reads, vended-credentials 404 gate.
  • LocalStack round trip (it_vended_credentials_testcontainers): mint against real STS → build reader from the grant → read a CAS object through the fluree address layer.
  • Wiremock coverage for grant fetch 404-fallback and refresh-on-expiry; unit tests for the session-policy shape, mount routing/localization, and CLI target resolution (peer → local queries, downgrade to HTTP for writes).
  • CI parity: clippy (all features/targets, -D warnings) and cargo check --workspace --all-features --all-targets clean.

Notes for review

  • Base is feature/rdfs-enforcement-entailment (this branch was cut from its tip).
  • Serving posture binds only the origin's serving surface by design — a consumer holding the blocks always queries its own copy; rationale in the design doc.
  • LocalStack community doesn't enforce IAM session policies, so prefix-scoping enforcement rests on the policy-shape unit tests + AWS semantics; worth one manual verification against real AWS before production reliance.
  • Follow-ups deliberately out of scope: server-level mount config flags, --mode auto negotiation, the f:publicVisibility anonymous tier, split-bucket vend grants.

bplatz added 7 commits July 4, 2026 10:07
Serve ledgers to other Fluree instances in two tiers and let a consumer
mount a remote's ledgers as read-only, locally-queryable sources.

- ProxyStorage read modes: Raw fetches canonical CAS bytes via
  GET /storage/objects/{cid} with client-side CID verification (what
  makes indexed ledgers readable over the proxy — the FLKB tier has no
  leaf decoder on the read path); Filtered keeps the FLKB negotiation.
  Peer proxy mode now uses Raw.
- Per-ledger serving posture: new f:servingDefaults setting group
  (f:serveQuery / f:serveBlocks / f:publicVisibility) in the ledger
  config graph, enforced on transaction-role servers only (query gate
  403, blocks gate 404 on /storage/block, /storage/objects, /commits,
  /pack) and advertised per-caller as serving: ["query","blocks"] on
  NS record responses plus a coarse block in /.well-known/fluree.json.
- Remote mounts: FlureeBuilder::with_remote_mount composes a
  CompositeNameService (prefix-routed reads with record localization,
  writes to mounted aliases rejected) with StorageBackend::Routed
  (namespace-prefix store selection at the content_store seam), so
  mounted ledgers get full native semantics including mixed datasets.
- ProxyStorage/ProxyNameService moved to fluree-db-nameservice-sync
  (server re-exports keep the peer paths); from_api_base constructors
  for non-default API mounts; mount-prefix stripping on derived aliases.
- HTTP Range on /storage/objects (206 + Content-Range, full-object CID
  verification before slicing) and native ranged reads in ProxyStorage.
- Fix: dict-blob requests are branch-resolved server-side — @shared
  addresses carry only the ledger name, so non-main-branch peers
  previously 404'd on dict fetches; the legacy per-branch dict layout
  now parses too.
- fluree track add --mode peer: queries execute locally against index
  blocks fetched on demand from the remote's raw storage tier
  (CID-verified), while writes and admin commands keep forwarding over
  HTTP (resolve_ledger_mode downgrades the peer target to Tracked).
- Per-remote persistent artifact cache under the OS cache dir; entries
  are content-addressed and immutable so clearing is always safe.
  verify_freshness_on_cache_hit keeps heads current against the remote.
- fluree remote ledgers <name>: the remote's auth-filtered catalog with
  the serving tiers each ledger offers (query / blocks).
- fluree cache status|clear for the peer cache.
- track list shows the mode column; peer entries persist as
  mode = "peer" in [[tracked_ledgers]].
- New docs/design/remote-mounts.md: the serving-tier model (query /
  blocks / reserved filtered tier), per-caller resolution, mount
  architecture (CompositeNameService + StorageBackend::Routed +
  ProxyStorage modes), and the CID-verified cache-forever integrity
  semantics, with the fine-grained and vended-origin extension points.
- setting-groups.md: f:servingDefaults as a ledger-scoped group.
- query-peers.md: the two read tiers, corrected /storage/block leaf
  semantics, Range behavior, raw-tier access model.
- auth-contract.md: discovery serving capability block.
- CLI docs: track --mode peer, remote ledgers, new cache page.

For S3-backed origins, hand authorized peers short-lived STS credentials
scoped to a ledger's prefix so they read index content directly from S3
(native ranged reads, no origin bandwidth) instead of proxying every
object through the origin's HTTP server.

- fluree-db-api::vended_credentials: STS AssumeRole minting with a
  session policy narrowed to the ledger's name-level prefix (covers all
  branches + the @shared dict namespace, matching the all-or-nothing
  raw tier); s3:ListBucket is prefix-conditioned so missing keys stay
  404 (the reader's legacy dict fallback depends on 404-vs-403);
  S3VendScope extraction from the connection config (single-bucket S3
  only — split commit/index layouts are refused).
- Server: GET /storage/credentials?ledger= behind the same guards as
  raw object serving (bearer scope, namespace guard, f:serveBlocks
  posture; 404 anti-leak). Config: --storage-vend-enabled,
  --storage-vend-role-arn, --storage-vend-ttl-secs (default 900, the
  STS minimum — grants outlive revocations until expiry, so short TTLs).
- fluree-db-nameservice-sync (feature aws): grant fetch client, a
  ProvideCredentials impl that refreshes grants inside a 60s expiry
  margin (single-flight), and build_vended_s3_storage composing an
  S3Storage whose credentials auto-refresh; 404 means fall back to
  proxied reads.
- CLI peer mode probes the endpoint and prefers direct S3 automatically,
  falling back to ProxyStorage.
- Tests: LocalStack round trip (mint -> grant -> S3 reader -> CAS object
  read through the fluree address layer), wiremock refresh/404 paths,
  policy-shape unit tests, server 404 gate test.

End-to-end guide covering the sharing patterns (query serving vs
peer/block serving vs replication) with a decision table, provider setup
(trusted issuers, token minting per tier, per-ledger f:servingDefaults
participation, identity-bound row-level permissioning, vended S3
credentials), the consumer-side CLI workflow (remote add / auth login /
remote ledgers / track modes / clone / cache), programmatic mounts, and
the revocation/integrity/freshness semantics. Indexed in SUMMARY and the
guides README.

Document the endpoints an embedding server must expose for CLI peer mode
(fluree track --mode peer) and fluree remote ledgers: the NsRecord lookup
and CAS object endpoints with their required semantics (all-or-nothing
authorization with 404 anti-leak, dict-blob branch resolution for
name-scoped @shared artifacts, exact CID-verifiable bytes), recommended
Range support, and the optional vended-credentials endpoint with its
404-fallback contract.
bplatz force-pushed the feature/remote-mounts branch from 078af0b to 929b024 on July 4, 2026 14:07

@aaj3f aaj3f left a comment

This is a nice feature addition -- glad to have it and eager to use it

  if state.config.server_role != ServerRole::Peer {
-     return Ok(handle.snapshot().await.to_ledger_state());
+     let ledger_state = handle.snapshot().await.to_ledger_state();
+     let serving = crate::routes::serving::effective_serving_from_state(&ledger_state).await?;

The query serving gate calls effective_serving_from_state → config_resolver::resolve_ledger_config, which resolves the entire ledger config graph (policy, shacl, reasoning, datalog, transact, full_text, serving, graph_overrides: 8 group reads plus a find_instances_of_type scan) on every query to a transaction-role server, purely to read f:serveQuery. It duplicates the config resolution the db-view build already performs per query (fluree-db-api/src/view/fluree_ext.rs:120). Bounding factors (why this is minor, not major): resolve_ledger_config early-returns cheaply when the config graph is empty (guard at config_resolver.rs:67-84), so unconfigured ledgers pay almost nothing; the gate runs only on transaction/origin servers (peers skip it); and the in-memory config graph is dwarfed by the query it precedes. Still worth fixing: resolve only the serving group (see the serving.rs suggestion), thread the already-resolved config into the gate, or memoize the posture per (ledger_id, t).

// Prefer a targeted resolver (see serving.rs suggestion) so the gate reads
// only f:servingDefaults instead of the full config graph, or reuse the
// ResolvedConfig the view build already computes for this same snapshot/t.
let serving = crate::routes::serving::effective_serving_from_state(&ledger_state).await?;

Comment on lines +717 to +720
// 3c. Serving gate: the ledger's f:serveBlocks posture must allow raw
// content serving.
let serving =
    crate::routes::serving::effective_serving(&state.fluree, &effective_ledger).await?;

get_object_by_cid is the peer-mode raw-read hot path: a cold peer sync fetches every index leaf, branch, root, dict blob, and commit through this endpoint. This PR adds, per object, both a nameservice lookup (via resolve_block_ledger, line 708) and a full resolve_ledger_config (via effective_serving, line 720). Previously this handler only checked token scope and read bytes — zero ledger loads, zero config resolution. Bounding factors (why this is minor, not major): the ledger handle is cached (no per-object reload — effective_serving reuses ledger_cached); resolve_ledger_config early-returns for unconfigured ledgers; and per-request cost is dominated by JWT verification + storage IO + full-object SHA-256, so the aggregate config-resolution overhead on a cold sync is negligible next to the byte transfer. The ns.lookup namespace guard is a legitimate correctness addition (namespace guard + dict-blob branch resolution), not waste — only the repeated full-config resolution is avoidable. Memoize EffectiveServing per (effective_ledger, t) on AppState and/or resolve only the serving group.

// e.g. a DashMap<(String, i64), EffectiveServing> on AppState keyed by
// (effective_ledger, snapshot.t), populated on miss — a config change bumps t.
let serving = state.effective_serving_cached(&effective_ledger).await?;
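
One way the memoization suggested here could look, as a standalone sketch. It uses a std `Mutex<HashMap>` in place of the DashMap mentioned above so the example has no dependencies; `ServingCache` and its `EffectiveServing` stand-in are hypothetical and synchronous, whereas the real resolver is async.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for the resolved serving posture.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EffectiveServing {
    pub query: bool,
    pub blocks: bool,
}

/// Posture cache keyed by (ledger, t). A config change commits a new t,
/// so a stale posture is never returned for the current snapshot.
#[derive(Default)]
pub struct ServingCache {
    inner: Mutex<HashMap<(String, i64), EffectiveServing>>,
}

impl ServingCache {
    /// Return the cached posture, or run `resolve` once and store it.
    pub fn get_or_resolve(
        &self,
        ledger: &str,
        t: i64,
        resolve: impl FnOnce() -> EffectiveServing,
    ) -> EffectiveServing {
        let key = (ledger.to_string(), t);
        let cached = self.inner.lock().unwrap().get(&key).copied();
        if let Some(hit) = cached {
            return hit;
        }
        // Resolve without holding the lock; concurrent misses may both
        // resolve, but the first stored value wins.
        let resolved = resolve();
        let mut map = self.inner.lock().unwrap();
        let value = *map.entry(key).or_insert(resolved);
        value
    }
}
```

This sketch never evicts, so entries for superseded `t` values accumulate; a real version would likely drop older entries for a ledger when a newer `t` is inserted.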

Comment on lines +55 to +69
/// Resolve the serving posture from an already-loaded ledger state.
///
/// Reads the config graph as-of `state.t()` (novelty-inclusive), so a
/// committed-but-unindexed config change takes effect immediately.
pub(crate) async fn effective_serving_from_state(
    state: &LedgerState,
) -> Result<EffectiveServing, ServerError> {
    let overlay: &dyn OverlayProvider = &*state.novelty;
    let config = config_resolver::resolve_ledger_config(&state.snapshot, overlay, state.t())
        .await
        .map_err(|e| ServerError::internal(format!("Serving config resolution failed: {e}")))?;
    Ok(EffectiveServing::from_config(
        config.as_ref().and_then(|c| c.serving.as_ref()),
    ))
}

effective_serving_from_state resolves the whole LedgerConfig (all 8 setting groups) but uses only config.serving. Every gate check (query, object, block, commits, pack, credentials, NS advertisement) pays for 7 unused group reads. Add a targeted resolver that reads only f:servingDefaults.

// In config_resolver: expose a resolve_serving_only(snapshot, overlay, to_t)
// that runs find_instances_of_type + read_serving_defaults and nothing else.
let serving = config_resolver::resolve_serving_only(&state.snapshot, overlay, state.t()).await?;
Ok(EffectiveServing::from_config(serving.as_ref()))
