Support GCS runtime dataset URIs in policyengine.py #379

@anth-volk

Description

The simulation API needs to pass GCS dataset URIs for runtime dataset loading while keeping policyengine.py release manifests centered on canonical Hugging Face artifact references.

Today, policyengine.py resolves bundled datasets to HF URIs and passes remote strings directly into country package microsimulation constructors. For the US path, policyengine-us intercepts the dataset string before policyengine-core can handle the gs:// scheme, so a URI such as gs://policyengine-us-data/... fails instead of being materialized to a local file.

Expected behavior:

  • policyengine.py can materialize gs:// dataset URIs to local files before country package construction.
  • policyengine.py can still materialize hf:// dataset URIs.
  • US and UK dataset creation and managed microsimulation helpers pass local runtime sources to country packages while preserving canonical bundle provenance.
  • The simulation API can normalize manifest-derived HF URIs to GCS runtime URIs once this support is released.
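
One possible shape for the materialization step described above is sketched here. It is only an illustration under stated assumptions, not the policyengine.py implementation: `MaterializedDataset`, `materialize_dataset`, the `fetchers` mapping, and the cache layout are all hypothetical names introduced for this example. The actual download logic is injected per scheme, so a real `gs` or `hf` fetcher (for instance one built on the Google Cloud Storage or Hugging Face Hub client libraries) would be supplied by the caller. Revision suffixes on HF URIs are not handled.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, Optional
from urllib.parse import urlparse

# Hypothetical fetcher signature: download `uri` and write it to `dest`.
Fetcher = Callable[[str, Path], None]


@dataclass(frozen=True)
class MaterializedDataset:
    """Hypothetical record pairing provenance with the local runtime file."""

    canonical_uri: str  # reference kept for bundle provenance (e.g. hf://...)
    runtime_uri: str    # URI actually fetched at runtime (e.g. gs://...)
    local_path: Path    # local file that would be passed to the country package


def materialize_dataset(
    runtime_uri: str,
    cache_dir: Path,
    fetchers: Dict[str, Fetcher],
    canonical_uri: Optional[str] = None,
) -> MaterializedDataset:
    """Resolve a dataset URI to a local file before model construction."""
    canonical = canonical_uri or runtime_uri
    parsed = urlparse(runtime_uri)

    # Plain paths have no scheme and are treated as already local.
    if parsed.scheme == "":
        return MaterializedDataset(canonical, runtime_uri, Path(runtime_uri))

    if parsed.scheme not in fetchers:
        raise ValueError(f"Unsupported dataset URI scheme: {runtime_uri!r}")

    # Assumed cache layout: <cache_dir>/<scheme>/<bucket or org>/<object path>.
    local_path = (
        Path(cache_dir) / parsed.scheme / parsed.netloc / parsed.path.lstrip("/")
    )

    # Download only when the file is not already cached.
    if not local_path.exists():
        local_path.parent.mkdir(parents=True, exist_ok=True)
        fetchers[parsed.scheme](runtime_uri, local_path)

    return MaterializedDataset(canonical, runtime_uri, local_path)
```

With this shape, the country package would only ever receive `local_path`, so it never has to interpret a `gs://` or `hf://` string itself, while `canonical_uri` stays available for recording bundle provenance.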
