GCS uploads bypass staging — production bucket overwritten before promotion #559

@baogorek

Description

The two-phase publish pipeline (stage → promote) has an asymmetry: GCS uploads go directly to production paths during the staging phase, while HuggingFace correctly uploads to staging/ paths. This means GCS production data is overwritten immediately, before the promote step is ever run.

Root cause

In modal_app/local_area.py, upload_to_staging() (lines 308-320) calls upload_local_area_file(..., skip_hf=True), which writes to GCS at production paths such as states/AL.h5. The HF upload, by contrast, correctly goes to staging/states/AL.h5.

The promote_publish() function only operates on HuggingFace (copy staging/ → production, delete staging/). GCS is never part of the promotion step.
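To make the asymmetry concrete, here is a toy model of the behaviour described above. It is not the project's code: the two dicts stand in for the GCS bucket and the HuggingFace repo, and the function names upload_to_staging_model and promote_publish_model are hypothetical stand-ins whose signatures are invented for illustration.

```python
# Toy model only: plain dicts stand in for the GCS bucket and the HF repo.
gcs = {"states/AL.h5": "old"}
hf = {"states/AL.h5": "old"}

STAGING_PREFIX = "staging/"


def upload_to_staging_model(rel_path, data):
    """Mimic the reported staging behaviour (hypothetical signature)."""
    # GCS: written straight to the production path (the reported bug).
    gcs[rel_path] = data
    # HF: written under the staging/ prefix, as intended.
    hf[STAGING_PREFIX + rel_path] = data


def promote_publish_model():
    """Mimic promotion as described: only HF is touched."""
    staged_keys = [k for k in hf if k.startswith(STAGING_PREFIX)]
    for key in staged_keys:
        # Copy staging/ -> production, then delete the staging entry.
        hf[key[len(STAGING_PREFIX):]] = hf.pop(key)


upload_to_staging_model("states/AL.h5", "new")
print(gcs["states/AL.h5"])  # prints "new": GCS production already changed
print(hf["states/AL.h5"])   # prints "old": HF production waits for promote
```

If promotion never runs in this model, the two stores stay out of sync, which is the situation described under Impact.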

Impact

  • GCS production data is overwritten as soon as staging runs, with no rollback
  • If promote is never run or fails, GCS has new data while HF production has stale data
  • The staging/promotion safety guarantee is completely bypassed for GCS

Proposed fix

Move GCS uploads from upload_to_staging() to promote_publish(), so both GCS and HF are updated together during promotion. The files are already available on the Modal staging volume.
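A minimal sketch of the proposed ordering, under the same toy assumptions: dicts stand in for GCS, HF, and the Modal staging volume, and stage_model / promote_model are hypothetical names, not the real functions. The actual change would need to use whatever upload helpers local_area.py already has.

```python
# Toy model only: dicts stand in for the Modal staging volume, GCS, and HF.
volume = {}
gcs = {"states/AL.h5": "old"}
hf = {"states/AL.h5": "old"}

STAGING_PREFIX = "staging/"


def stage_model(rel_path, data):
    """Staging touches only the volume and the HF staging/ prefix."""
    volume[rel_path] = data
    hf[STAGING_PREFIX + rel_path] = data
    # Deliberately no GCS write here.


def promote_model():
    """Promotion updates GCS and HF production from the staged files."""
    for rel_path, data in volume.items():
        gcs[rel_path] = data                     # GCS now updated at promotion
        hf[rel_path] = data                      # HF production
        hf.pop(STAGING_PREFIX + rel_path, None)  # remove staging copy if present


stage_model("states/AL.h5", "new")
print(gcs["states/AL.h5"])  # prints "old": production untouched by staging
promote_model()
print(gcs["states/AL.h5"])  # prints "new"
print(hf["states/AL.h5"])   # prints "new"
```

This sketch does not make the two uploads atomic. A failure between the GCS and HF writes would still leave them out of sync, so the real promote step may want to retry or report partial promotion.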
