bug: import_check accepts bundles whose manifest lists files missing from the tarball (no issue raised) #80

Description

@galuis116

What happened

bundle.import_check runs two loops in sequence: one over manifest
entries (to compute new_files / conflicts / identical), and one
over tar members (to schema-validate and, since #74 / #75, to
verify the per-member sha256). Neither loop flags the case where
manifest.json lists a path that has no corresponding tar member.

The symptom is the inverse of hash mismatch and is just as silent:

  • import_check returns ok=True for a bundle where, say,
    manifest.json claims claims/c1.yaml exists with a particular
    sha256, but the tarball contains no claims/c1.yaml member.
  • import_apply then iterates tar members, never reaches the
    manifest-listed-but-missing path, and writes nothing for it. The
    bundle.import audit event records a clean success with the
    bundle id, claiming "imported the bundle" — but the resulting KB
    is missing artifacts the manifest promised.
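
A toy illustration of why a member-driven loop cannot notice an omission. The `apply_members` function is a hypothetical stand-in, not vouch's actual import_apply; it only mimics the "iterate tar members and handle each one" shape described above, and `MANIFEST_NAME = "manifest.json"` is an assumed value.

```python
import io
import json
import tarfile

MANIFEST_NAME = "manifest.json"  # assumed; the real constant lives in vouch's bundle module


def make_bundle(manifest: dict, members: dict) -> bytes:
    """Return gzipped tar bytes holding the manifest plus the given members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        payloads = {MANIFEST_NAME: json.dumps(manifest).encode(), **members}
        for name, data in payloads.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def apply_members(bundle_bytes: bytes) -> list:
    """Hypothetical member-driven import: visit each regular tar member."""
    written = []
    with tarfile.open(fileobj=io.BytesIO(bundle_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name != MANIFEST_NAME:
                written.append(member.name)  # a real importer would write the file here
    return written


# The manifest promises claims/c1.yaml, but the tarball carries no such member.
manifest = {"files": [{"path": "claims/c1.yaml"}]}
written = apply_members(make_bundle(manifest, members={}))
promised = [entry["path"] for entry in manifest["files"]]
print(written, promised)  # [] ['claims/c1.yaml']
```

The loop finishes without error because it never consults the manifest, so nothing tells it that a promised path was skipped.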

export_check already catches this (it walks recorded paths and
checks tar.getmember(path)). The same companion check is missing
on the import side — exactly the asymmetry #74 was about, just for a
different attack/corruption shape (omission instead of substitution).

What you expected

import_check should append a "manifest lists missing file: <path>"
issue for every manifest entry without a matching tar member,
mirroring export_check. import_apply then refuses to import
because check.issues is non-empty.

Reproduction sketch

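import hashlib
import io
import json
import tarfile

from vouch import bundle  # module path assumed from src/vouch/bundle.py

# Build a bundle whose manifest references claims/c1.yaml but whose
# tarball contains only manifest.json. `store` is assumed to be an
# already-initialised KB store.
manifest = {
    "spec": bundle.SPEC_VERSION,
    "bundle_id": "deadbeef",
    "files": [{
        "path": "claims/c1.yaml",
        "size": 16,
        "sha256": hashlib.sha256(b"text: any\n").hexdigest(),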
    }],
    "counts": {},
    "safety": {"has_proposed": False, "has_state_db": False, "has_audit_log": False},
}
with tarfile.open("missing.tar.gz", "w:gz") as tar:
    mf = json.dumps(manifest).encode()
    info = tarfile.TarInfo(bundle.MANIFEST_NAME)
    info.size = len(mf)
    tar.addfile(info, io.BytesIO(mf))

diff = bundle.import_check(store.kb_dir, "missing.tar.gz")
# Today: diff.ok is True. Expected: diff.ok is False with
# a "manifest lists missing file" issue.

Suggested fix

In src/vouch/bundle.py import_check, after collecting
manifest_paths and walking the manifest entries for the dest
diff, add the missing-member pass that export_check already has:

member_names = {
    m.name for m in tar.getmembers()
    if m.isfile() and m.name != MANIFEST_NAME
}
for path in manifest_paths - member_names:
    issues.append(f"manifest lists missing file: {path}")

(or equivalently try: tar.getmember(path) except KeyError: ...
to mirror export_check's shape exactly).
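
A self-contained sketch of the set-difference check, run against an in-memory tarball so it can be tried without vouch installed. The helper name `missing_member_issues`, the `_add` helper, and the value of `MANIFEST_NAME` are assumptions made for illustration; only the issue string and the `isfile()` filter come from the snippet above.

```python
import io
import json
import tarfile

MANIFEST_NAME = "manifest.json"  # assumed value of bundle.MANIFEST_NAME


def missing_member_issues(tar: tarfile.TarFile) -> list:
    """Return one issue string per manifest path with no regular-file member."""
    manifest = json.load(tar.extractfile(MANIFEST_NAME))
    manifest_paths = {entry["path"] for entry in manifest["files"]}
    member_names = {
        m.name for m in tar.getmembers()
        if m.isfile() and m.name != MANIFEST_NAME
    }
    # sorted() keeps the issue order deterministic across runs.
    return [
        f"manifest lists missing file: {path}"
        for path in sorted(manifest_paths - member_names)
    ]


def _add(tar: tarfile.TarFile, name: str, data: bytes) -> None:
    """Add an in-memory regular file to an open tar archive."""
    info = tarfile.TarInfo(name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))


# Demo: the manifest promises two files, the tarball carries only one.
manifest = {"files": [{"path": "claims/c1.yaml"}, {"path": "claims/c2.yaml"}]}
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    _add(tar, MANIFEST_NAME, json.dumps(manifest).encode())
    _add(tar, "claims/c1.yaml", b"text: any\n")

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    issues = missing_member_issues(tar)

print(issues)  # ['manifest lists missing file: claims/c2.yaml']
```

One difference between the two shapes may be worth a look: tar.getmember(path) succeeds for any member with that name, including a directory or link entry, whereas the set-difference form only counts members for which isfile() is true.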

Add a regression test in tests/test_bundle.py that builds a
bundle with a manifest entry and no matching tar member, asserts
that import_check returns ok=False, and asserts import_apply raises
RuntimeError("refusing to import: manifest lists missing file: ...").

Why it's worth fixing

Flagged in the review of #75 by the maintainer: "same integrity
surface" as the sha256-verification work that just shipped. Closes
the last omission-shaped hole in the import-side integrity story
so import_check and export_check enforce the same invariants.
