lc verify reports broken_chain for analysis-level Inputs missing from manifest input_versions #90

@EiffL

What happened

Any output whose Output.inputs references an analysis-level Input (declared at the top-level inputs: block with a source: field) fails lc verify with broken_chain. The generated Snakefile only plumbs sibling-output inputs into the rule's inputs={...} dict, so analysis-level Inputs are used for recipe-template substitution ({inputs.<id>}) but never fingerprinted into the manifest's input_versions. lc verify then walks Output.inputs from the spec, sees the analysis-level input id is absent from input_versions, and reports broken_chain even when nothing is actually wrong.
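To make the failure mode concrete, here is a minimal sketch of the kind of check described above. It is not the actual lc verify implementation: `Failure` and `check_chain` are hypothetical names, and the manifest is reduced to the two fields shown in this issue.

```python
from dataclasses import dataclass


@dataclass
class Failure:
    kind: str
    output_id: str
    message: str


def check_chain(output_id, declared_inputs, manifest):
    """Return one broken_chain failure per declared input absent from the manifest.

    Hypothetical helper: it only mirrors the behaviour described in this issue.
    """
    recorded = manifest.get("input_versions", {})
    return [
        Failure("broken_chain", output_id, f"input '{input_id}' missing from manifest")
        for input_id in declared_inputs
        if input_id not in recorded
    ]


# The manifest written today has an empty input_versions, so the
# analysis-level input is flagged even though the run itself succeeded.
manifest = {"output_id": "map_fit", "input_versions": {}}
failures = check_chain("map_fit", ["union21_table"], manifest)
for f in failures:
    print(f"{f.kind}  {f.output_id}  {f.message}")
```

The point of the sketch is that the check has no way to tell "input was never fingerprinted" apart from "input is genuinely missing", so both end up as broken_chain.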

Where in the code

src/lightcone/engine/snakefile.py ~lines 340–352: rule_inputs is populated only from find_upstream_output(...); analysis-level Inputs go into external_inputs and feed only the recipe-template renderer. Result: inputs={...} passed to run_rule (and therefore to write_manifest) omits external Inputs entirely.

src/lightcone/engine/manifest.py write_manifest correctly fingerprints whatever it's given via inputs.items() — but the snakefile generator doesn't pass the external Inputs in.
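One possible direction for a fix, sketched under assumptions: resolve every id in Output.inputs, whether it names a sibling output or an analysis-level Input, into the single dict that is handed to run_rule. The function `build_rule_inputs` and the plain id-to-path dicts below are hypothetical; the real data structures in snakefile.py may differ.

```python
def build_rule_inputs(output_inputs, upstream_outputs, external_inputs):
    """Resolve each declared input id to a path, whichever kind of input it is.

    Hypothetical helper. Assumes both lookups are plain dicts of id -> path.
    """
    rule_inputs = {}
    for input_id in output_inputs:
        if input_id in upstream_outputs:
            # Sibling output produced by another rule (already handled today).
            rule_inputs[input_id] = upstream_outputs[input_id]
        elif input_id in external_inputs:
            # Analysis-level Input with a source: field (currently dropped).
            rule_inputs[input_id] = external_inputs[input_id]
        else:
            raise KeyError(f"unknown input id: {input_id!r}")
    return rule_inputs


resolved = build_rule_inputs(
    output_inputs=["union21_table"],
    upstream_outputs={},
    external_inputs={"union21_table": "data/SCPUnion2.1_mu_vs_z.txt"},
)
print(resolved)
```

With something along these lines, write_manifest would receive the external Input through inputs.items() and could fingerprint it like any other entry, without changes to manifest.py.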

Error

$ lc verify
Universe baseline
  ✗ broken_chain  map_fit         input 'union21_table' missing from manifest
  ✗ broken_chain  hubble_diagram  input 'union21_table' missing from manifest
  ✗ broken_chain  chi2_per_dof    input 'union21_table' missing from manifest

3 integrity failure(s).

Manifest sample (note empty input_versions):

{
  "output_id": "map_fit",
  "input_versions": {},
  "decisions": {"priors": "flat_uninformative", ...},
  ...
}

Generated Snakefile rule for map_fit (note inputs={}):

rule map_fit:
    output:
        data=directory("results/{universe}/map_fit"),
        manifest="results/{universe}/map_fit/.lightcone-manifest.json",
    params:
        cfg=lambda wc: CFG["map_fit"][wc.universe],
    run:
        run_rule(
            rule_key="map_fit",
            universe=wildcards.universe,
            output_dir=Path(output.data),
            inputs={},
            cfg=dict(params.cfg),
        )

The corresponding Output declares inputs: [union21_table] and the Input declares source: "data/SCPUnion2.1_mu_vs_z.txt".

Reproduction

  1. Scaffold a project with one analysis-level Input pointing at a local file:

    inputs:
      - id: union21_table
        type: data
        source: "data/SCPUnion2.1_mu_vs_z.txt"
    outputs:
      - id: map_fit
        type: data
        inputs: [union21_table]
        decisions: [...]
        recipe:
          command: python src/fit_map.py --data {inputs.union21_table} --out {output}/map_fit.h5
  2. Materialize: lc run → succeeds, output is written.

  3. Verify: lc verify → fails with broken_chain map_fit input 'union21_table' missing from manifest.

lc status is unaffected (reports ok); only lc verify surfaces the problem.

Environment

  • ASTRA: 0.2.6
  • lightcone-cli: 0.2.1.dev4+g6b45e42bf
  • Python: 3.13.11
  • OS: Linux 6.4.0-150600 (Cray Shasta / NERSC Perlmutter)
