bug: syncRegeneratedSession uploads content to LFS but discards fileRefs, creating orphan blobs #533

@galexy

Symptom

Running ox session regenerate <name> in its default mode (artifact regeneration) uploads the regenerated content files to LFS but discards the returned file references. meta.json is never updated to point at the newly uploaded blobs, so the blobs are orphaned in LFS storage (eligible for server-side GC) and subsequent uploads still reference whatever OIDs were in the old meta.json.

Evidence

Reproduced live during the #519 investigation. After regenerating session artifacts from a rebuilt raw.jsonl:

  • uploadSessionLFS pushed the 235-entry raw.jsonl content to LFS and returned a fresh OID.
  • meta.json was never updated to reference the fresh OID.
  • The next ox session upload used the old OID from the preserved meta.json and published the stale (50-entry) version.
  • The 235-entry blob remains in LFS storage with no reference anywhere. It's an orphan.

Root cause

cmd/ox/session_regenerate.go:207-223:

// syncRegeneratedSession re-uploads to LFS and pushes to the ledger for a single session.
func syncRegeneratedSession(projectRoot, sessionPath, sessionName string) error {
    if _, err := uploadSessionLFS(projectRoot, sessionPath); err != nil {
        //         ^^^
        //         fileRefs returned by uploadSessionLFS are DISCARDED
        return fmt.Errorf("LFS upload: %w", err)
    }

    ledgerPath, err := resolveLedgerPath()
    if err != nil {
        return fmt.Errorf("resolve ledger: %w", err)
    }

    if err := commitAndPushLedger(ledgerPath, sessionName); err != nil {
        return fmt.Errorf("commit and push: %w", err)
    }

    return nil
}

The fileRefs returned by uploadSessionLFS are assigned to _, so meta.json is never updated and WritePointerFiles is never called. commitAndPushLedger then commits whatever *.jsonl / *.md files its glob matches. Those are still the un-pointerized content files, so full content is pushed to the ledger repo. That is also wrong, since LFS pointers are what should be committed.
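
For readers unfamiliar with the distinction between a content file and a pointer stub, here is a minimal sketch of what a pointer looks like. It assumes ox writes standard Git LFS v1 pointers, which this issue does not confirm; pointerFor is a hypothetical helper, not a function from the ox codebase.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// pointerFor builds a Git LFS v1 pointer for the given content.
// Assumption: ox uses the standard Git LFS pointer layout; the issue
// itself only says "LFS pointers" and "OIDs".
func pointerFor(content []byte) string {
	sum := sha256.Sum256(content)
	return fmt.Sprintf(
		"version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %d\n",
		hex.EncodeToString(sum[:]), len(content),
	)
}

func main() {
	// Placeholder content standing in for the stale and regenerated raw.jsonl.
	stale := []byte("entry 1\n")
	regenerated := []byte("entry 1\nentry 2\n")

	// Different content yields a different OID, so a meta.json that keeps
	// the stale OID can never resolve to the regenerated blob.
	fmt.Print(pointerFor(stale))
	fmt.Print(pointerFor(regenerated))
}
```

The pointer is a few lines of text regardless of content size, which is why committing full content instead of pointers makes the ledger repo grow.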

Impact

  • Orphan LFS blobs. Every ox session regenerate call leaves orphaned blobs behind. They are eventually GC'd, but waste storage and network bandwidth in the meantime.
  • meta.json drift. After regenerate, meta.json on disk still references old OIDs while new blobs exist in LFS. Any subsequent read that follows the pointer gets stale data.
  • Bad git commits. Because pointer-replacement never runs, commitAndPushLedger commits the full content files instead of pointer stubs. The ledger git repo grows with every regenerate.
  • Subsequent uploads publish wrong content. See sibling issue (ox session upload replaces files with pointers from preserved meta.json before reading them).

Fix direction

Mirror the pattern used correctly in cmd/ox/agent_session.go:1218-1252:

fileRefs, err := uploadSessionLFS(projectRoot, sessionPath)
if err != nil { ... }

// load + update meta.json
meta, err := lfs.ReadSessionMeta(sessionPath)
if err != nil { ... }
meta.Files = fileRefs
if err := lfs.WriteSessionMetaOnly(sessionPath, meta); err != nil { ... }

// commit with pointer stubs
if err := commitAndPushLedger(ledgerPath, sessionName); err != nil { ... }

// replace content with pointers AFTER successful push
if _, err := lfs.WritePointerFiles(sessionPath, meta.Files); err != nil { ... }

Alternatively, extract a shared helper that encapsulates the upload → meta.Files update → commit → post-push pointer-replacement sequence and use it from every code path that uploads (session_upload_cmd.go, session_regenerate.go, agent_session.go, daemon agentwork).

Acceptance

  • After ox session regenerate <name>, meta.json references the fresh LFS blob OIDs for any content that was regenerated.
  • ox session upload <name> immediately after regenerate uploads correctly (no stale content).
  • No orphan blobs produced by the regenerate → upload sequence.
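
The "no orphan blobs" criterion could be checked in a test along the following lines. This is a sketch under assumptions: orphanOIDs is a hypothetical helper, and it presumes the test can collect both the OIDs returned by the upload step and the OIDs listed in meta.json as plain strings.

```go
package main

import (
	"fmt"
	"sort"
)

// orphanOIDs returns, sorted and de-duplicated, every uploaded OID that
// no meta.json entry references. Hypothetical helper for illustration.
func orphanOIDs(uploaded, referenced []string) []string {
	ref := make(map[string]struct{}, len(referenced))
	for _, oid := range referenced {
		ref[oid] = struct{}{}
	}
	seen := make(map[string]struct{})
	var orphans []string
	for _, oid := range uploaded {
		if _, ok := ref[oid]; ok {
			continue // referenced by meta.json, not an orphan
		}
		if _, dup := seen[oid]; dup {
			continue // already reported
		}
		seen[oid] = struct{}{}
		orphans = append(orphans, oid)
	}
	sort.Strings(orphans)
	return orphans
}

func main() {
	// Placeholder OIDs modelling the reproduced case: the fresh blob was
	// uploaded, but meta.json still lists only the old one.
	uploaded := []string{"oid-235-entries"}
	referenced := []string{"oid-50-entries"}
	fmt.Println(orphanOIDs(uploaded, referenced))
}
```

After the fix, the same check run against a regenerate followed by an upload should return an empty slice.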
