
feat: extract openshell-policy-schema as a thin crate with no proto dependency #1608

@feloy

Problem Statement

openshell-policy is consumed by external projects (e.g. openshell-image-builder) that only
need YAML parsing and serialization:

openshell_policy::parse_sandbox_policy(yaml_str)   // &str → SandboxPolicy (proto)
openshell_policy::serialize_sandbox_policy(&policy) // SandboxPolicy (proto) → String

Both operations are pure YAML — no gRPC, no networking, no runtime server. Yet openshell-policy
pulls in openshell-core, which unconditionally depends on tonic, tonic-build, and
protobuf-src. The last of these compiles protobuf from C++ source using autotools, which is
incompatible with MSVC on Windows and forces downstream projects to set up MSYS2 just to parse
YAML.

The root mismatch is that the public API returns a proto type (SandboxPolicy) that
consumers do not actually need — they need the YAML representation.

Proposed Design

Observation: the serde types are already the canonical source of truth

crates/openshell-policy/src/lib.rs documents this explicitly:

"The serde types here are the single canonical representation of the YAML policy schema.
Both parsing (YAML→proto) and serialization (proto→YAML) use these types, ensuring round-trip
fidelity."

The private structs PolicyFile, FilesystemDef, NetworkPolicyRuleDef, NetworkEndpointDef,
etc. define the YAML schema. The proto types are a secondary wire format layered on top of them.
Consumers who only need YAML should depend on the serde types directly, with no proto in the
picture.

New crate: openshell-policy-schema

Extract the serde types and the YAML-only logic from openshell-policy into a new crate with
minimal dependencies:

crates/openshell-policy-schema/

Cargo.toml — only pure-Rust dependencies:

[dependencies]
serde     = { workspace = true }
serde_yml = { workspace = true }
serde_json = { workspace = true }
miette    = { workspace = true }

No openshell-core, no tonic, no protobuf-src, no prost.

Public API of openshell-policy-schema:

/// Top-level YAML policy document.
pub struct SandboxPolicy { ... }       // renamed from PolicyFile
pub struct FilesystemPolicy { ... }    // renamed from FilesystemDef
pub struct LandlockPolicy { ... }      // renamed from LandlockDef
pub struct ProcessPolicy { ... }       // renamed from ProcessDef
pub struct NetworkPolicyRule { ... }   // renamed from NetworkPolicyRuleDef
pub struct NetworkEndpoint { ... }     // renamed from NetworkEndpointDef
// ... all supporting types

/// Parse a policy.yaml string.
pub fn parse_policy(yaml: &str) -> Result<SandboxPolicy>;

/// Serialize a policy to a YAML string.
pub fn serialize_policy(policy: &SandboxPolicy) -> Result<String>;

/// Serialize a policy to JSON.
pub fn serialize_policy_json(policy: &SandboxPolicy) -> Result<String>;

/// Load a policy from a file or the OPENSHELL_SANDBOX_POLICY env var.
pub fn load_policy(cli_path: Option<&str>) -> Result<Option<SandboxPolicy>>;

/// Validate a policy for safety violations (path traversal, overly broad
/// paths, TLD wildcards, process identity, etc.).
pub fn validate_policy(policy: &SandboxPolicy) -> std::result::Result<(), Vec<PolicyViolation>>;

/// Return the restrictive default policy.
pub fn restrictive_default() -> SandboxPolicy;

/// Ensure run_as_user/run_as_group are set to "sandbox".
pub fn ensure_sandbox_process_identity(policy: &mut SandboxPolicy);

pub const CONTAINER_POLICY_PATH: &str = "/etc/openshell/policy.yaml";
pub const LEGACY_CONTAINER_POLICY_PATH: &str = "/etc/navigator/policy.yaml";

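All of this logic already exists in openshell-policy/src/lib.rs. The move is a refactor, not a
rewrite. The serde structs and the load / default logic contain no proto references and can be
lifted verbatim. Validation is the one exception: it takes proto::SandboxPolicy today but reads
only plain string and bool fields, so it has to be retargeted at the serde types (see Agent
Investigation).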
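The shape of validate_policy, which returns every violation at once instead of stopping at the first, can be sketched as follows. This is a minimal illustration, not the existing implementation: the struct fields (read_write, allowed_hosts), the PolicyViolation variants, and the exact rules (only "/" counts as overly broad, only single-label wildcards count as TLD wildcards) are assumptions made for the example.

```rust
/// Hypothetical, trimmed-down stand-ins for the schema types.
#[derive(Debug, Default)]
pub struct ProcessPolicy {
    pub run_as_user: String,
    pub run_as_group: String,
}

#[derive(Debug, Default)]
pub struct SandboxPolicy {
    pub read_write: Vec<String>,    // filesystem paths (field name assumed)
    pub allowed_hosts: Vec<String>, // network host patterns (field name assumed)
    pub process: ProcessPolicy,
}

/// Variant names are assumptions; the real PolicyViolation may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PolicyViolation {
    PathTraversal(String),
    OverlyBroadPath(String),
    TldWildcard(String),
    ProcessIdentity(String),
}

/// Collect all violations so the caller can report them together.
pub fn validate_policy(policy: &SandboxPolicy) -> Result<(), Vec<PolicyViolation>> {
    let mut violations = Vec::new();

    for path in &policy.read_write {
        // Any `..` component is treated as path traversal.
        if path.split('/').any(|component| component == "..") {
            violations.push(PolicyViolation::PathTraversal(path.clone()));
        }
        // Assumed rule: granting the filesystem root is overly broad.
        if path.as_str() == "/" {
            violations.push(PolicyViolation::OverlyBroadPath(path.clone()));
        }
    }

    for host in &policy.allowed_hosts {
        // `*.com` style: a wildcard followed by a single label.
        if let Some(rest) = host.strip_prefix("*.") {
            if !rest.contains('.') {
                violations.push(PolicyViolation::TldWildcard(host.clone()));
            }
        }
    }

    let identity = [
        ("run_as_user", &policy.process.run_as_user),
        ("run_as_group", &policy.process.run_as_group),
    ];
    for (field, value) in identity {
        if value.as_str() != "sandbox" {
            violations.push(PolicyViolation::ProcessIdentity(field.to_string()));
        }
    }

    if violations.is_empty() {
        Ok(())
    } else {
        Err(violations)
    }
}

fn main() {
    let policy = SandboxPolicy {
        read_write: vec!["/srv/../etc".to_string()],
        allowed_hosts: vec!["*.example.com".to_string()],
        process: ProcessPolicy {
            run_as_user: "sandbox".to_string(),
            run_as_group: "sandbox".to_string(),
        },
    };
    match validate_policy(&policy) {
        Ok(()) => println!("policy ok"),
        Err(violations) => {
            for v in &violations {
                println!("violation: {v:?}");
            }
        }
    }
}
```

Nothing here touches proto or gRPC types, which is the property that lets this logic live in the thin crate.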

openshell-policy becomes an adapter

openshell-policy retains its existing public API but delegates to openshell-policy-schema for
all YAML work, adding only the proto conversions:

# openshell-policy/Cargo.toml
[dependencies]
openshell-policy-schema = { path = "../openshell-policy-schema" }
openshell-core          = { path = "../openshell-core" }

// openshell-policy/src/lib.rs
use openshell_policy_schema as schema;
use openshell_core::proto;

pub fn parse_sandbox_policy(yaml: &str) -> Result<proto::SandboxPolicy> {
    let policy = schema::parse_policy(yaml)?;
    Ok(to_proto(policy))   // existing conversion logic
}

pub fn serialize_sandbox_policy(policy: &proto::SandboxPolicy) -> Result<String> {
    let schema_policy = from_proto(policy);   // existing conversion logic
    schema::serialize_policy(&schema_policy)
}

compose.rs and merge.rs (which manipulate proto types) stay in openshell-policy unchanged.
The existing public surface of openshell-policy is fully preserved.
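To make the adapter layering concrete, here is a self-contained sketch of the two conversions. The modules schema and proto are in-file stand-ins for openshell_policy_schema and openshell_core::proto, and the fields (version, process) are hypothetical. The one detail borrowed from real generated code is that prost represents nested messages as Option, so from_proto has to pick a default when the field is absent.

```rust
/// Stand-in for `openshell_policy_schema`: plain YAML-model types.
mod schema {
    #[derive(Debug, Clone, PartialEq, Default)]
    pub struct ProcessPolicy {
        pub run_as_user: String,
        pub run_as_group: String,
    }

    #[derive(Debug, Clone, PartialEq, Default)]
    pub struct SandboxPolicy {
        pub version: u32,
        pub process: ProcessPolicy,
    }
}

/// Stand-in for `openshell_core::proto`; the real types are generated.
mod proto {
    #[derive(Debug, Clone, PartialEq, Default)]
    pub struct ProcessPolicy {
        pub run_as_user: String,
        pub run_as_group: String,
    }

    #[derive(Debug, Clone, PartialEq, Default)]
    pub struct SandboxPolicy {
        pub version: u32,
        // Nested messages are optional in prost-generated structs.
        pub process: Option<ProcessPolicy>,
    }
}

/// Schema model -> wire model: a field-by-field copy.
fn to_proto(policy: schema::SandboxPolicy) -> proto::SandboxPolicy {
    proto::SandboxPolicy {
        version: policy.version,
        process: Some(proto::ProcessPolicy {
            run_as_user: policy.process.run_as_user,
            run_as_group: policy.process.run_as_group,
        }),
    }
}

/// Wire model -> schema model; a missing nested message becomes the default.
fn from_proto(policy: &proto::SandboxPolicy) -> schema::SandboxPolicy {
    let process = policy.process.clone().unwrap_or_default();
    schema::SandboxPolicy {
        version: policy.version,
        process: schema::ProcessPolicy {
            run_as_user: process.run_as_user,
            run_as_group: process.run_as_group,
        },
    }
}

fn main() {
    let original = schema::SandboxPolicy {
        version: 1,
        process: schema::ProcessPolicy {
            run_as_user: "sandbox".to_string(),
            run_as_group: "sandbox".to_string(),
        },
    };
    let wire = to_proto(original.clone());
    // The round trip through the wire model should be lossless.
    assert_eq!(from_proto(&wire), original);
    println!("round trip ok: {:?}", wire);
}
```

A round-trip check like the one in main would be a natural regression test for the adapter once the types live in separate crates.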

Consumer usage

openshell-image-builder switches from openshell-policy to openshell-policy-schema:

openshell-policy-schema = { git = "...", tag = "..." }

// Before (pulls in tonic + protobuf-src):
use openshell_policy::{parse_sandbox_policy, serialize_sandbox_policy};
let policy: proto::SandboxPolicy = parse_sandbox_policy(yaml)?;

// After (zero heavy deps):
use openshell_policy_schema::{parse_policy, serialize_policy};
let policy: SandboxPolicy = parse_policy(yaml)?;

The only code changes are the import path, the function names, and the type name. The fields
and behavior are identical.

Alternatives Considered

Add a grpc feature flag to openshell-core (see issue.md): gate tonic, tonic-build, and
protobuf-src behind an optional feature and check in pre-generated Rust code for
sandbox.proto. This keeps the existing SandboxPolicy proto type in the public API and avoids
changing any consumer code, but it requires keeping the pre-generated file in sync with
sandbox.proto and adds a feature flag to reason about across the workspace.

The thin-crate approach is simpler to reason about, has no generated file to maintain, and
gives consumers a type that is precisely what they need (the YAML model, not the wire model).
The tradeoff is a small consumer-side import path change.

Agent Investigation

Explored codebase on 2026-05-28:

  • crates/openshell-policy/src/lib.rs (lines 37–229): the private serde structs (PolicyFile,
    FilesystemDef, NetworkPolicyRuleDef, NetworkEndpointDef, GraphqlOperationDef,
    L7RuleDef, L7AllowDef, L7DenyRuleDef, L7QueryMatcher, NetworkBinaryDef, LandlockDef,
    ProcessDef) define the complete YAML schema. The doc comment on the module explicitly names
    them "the single canonical representation of the YAML policy schema".

  • docs/reference/policy-schema.mdx: the published human-readable schema reference. Every
    field, type, and constraint documented there maps 1-to-1 to those serde structs.

  • There is no JSON Schema or OpenAPI spec for the policy YAML. The serde types and the MDX
    docs are the only authoritative definitions.

  • parse_sandbox_policy / serialize_sandbox_policy in lib.rs are thin wrappers: they call
    serde_yml::from_str into the serde types, then call to_proto / from_proto. The
    YAML-handling code is completely separate from the proto-handling code.

  • crates/openshell-policy/src/compose.rs and merge.rs: use openshell_core::proto types
    (SandboxPolicy, NetworkPolicyRule, etc.). These stay in openshell-policy unchanged.

  • crates/openshell-policy/src/lib.rs validation (validate_sandbox_policy, normalize_path,
    PolicyViolation): operates on proto::SandboxPolicy today, but only accesses plain string
    and bool fields — no proto encoding or gRPC involved. It can be rewritten against the serde
    types with minimal change, or kept in openshell-policy and duplicated as
    validate_policy in openshell-policy-schema.

  • proto/sandbox.proto: the proto schema from which SandboxPolicy is generated. All field
    names and types have a 1-to-1 correspondence with the serde structs — the conversion functions
    to_proto / from_proto are straightforward field copies with no logic.
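The validation bullet above notes that helpers such as normalize_path only touch plain strings. As an illustration of why such a helper ports cleanly, here is a sketch of a purely lexical normalizer. Its behavior (absolute paths only, None when ".." climbs above the root) is an assumption for the example and may not match the existing function.

```rust
/// Lexically normalize an absolute path: collapse `//` and `.`, resolve `..`.
/// Returns None for relative paths or when `..` would climb above the root.
/// Hypothetical behavior; the real normalize_path may differ.
pub fn normalize_path(path: &str) -> Option<String> {
    if !path.starts_with('/') {
        return None;
    }
    let mut parts: Vec<&str> = Vec::new();
    for component in path.split('/') {
        match component {
            // Empty components come from `//` or a trailing slash.
            "" | "." => {}
            ".." => {
                // Popping from an empty stack means escaping the root.
                parts.pop()?;
            }
            other => parts.push(other),
        }
    }
    Some(format!("/{}", parts.join("/")))
}

fn main() {
    for input in ["/srv//data/./x", "/srv/../etc", "/../etc", "relative"] {
        println!("{input:?} -> {:?}", normalize_path(input));
    }
}
```

Because the function depends only on the standard library, it can move to openshell-policy-schema without pulling in any of the proto toolchain.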

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request
