Backfill can miss solved_by_pr links when PR metadata runs before issues exist #28

@it-education-md

Description

During backfill, the code queues PR metadata jobs before issue backfill finishes.
If a PR metadata job runs first, it tries to write issues.solved_by_pr before the issue row exists.
That write matches zero rows, so it is a no-op, and the code does not repair it later. As a result, the mirror can keep missing issue-to-PR resolution data after the initial backfill.

Root cause

The backfill flow splits related work into separate async steps without enforcing the required order. handlePrMetadata() assumes the issue row already exists, but handleBackfill() queues PR metadata jobs before backfillIssues() inserts issue rows.
Because the code uses issueRepo.update(...), which only touches rows that already exist, and there is no repair/reconciliation path, the missed write is silently lost.

Scenario

A merged PR closes issue #N. Backfill stores the PR, queues the PR metadata job, and that job runs before issue #N has been inserted. The metadata job correctly fetches the closing issue number, but its update to issues.solved_by_pr affects zero rows. Later, the issue row is inserted, but solved_by_pr remains NULL. The API can then show the issue without its solving PR, even though the PR already records the closing link.
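The scenario can be reproduced in miniature. The sketch below is not the project's code: the in-memory maps, the `updateSolvedByPr` helper, and the PR/issue numbers 42 and 7 are all hypothetical stand-ins, assuming only that the real update behaves like a SQL `UPDATE` that reports zero affected rows when the issue is missing.

```typescript
// Hypothetical in-memory stand-ins for the mirror's tables (assumption, not the real schema).
interface Issue {
  number: number;
  solvedByPr: number | null;
}

interface PullRequest {
  number: number;
  merged: boolean;
  closesIssues: number[];
}

const issues = new Map<number, Issue>();
const pullRequests = new Map<number, PullRequest>();

// Update-only write: returns the affected row count, like a SQL UPDATE.
function updateSolvedByPr(issueNumber: number, prNumber: number): number {
  const issue = issues.get(issueNumber);
  if (!issue) {
    return 0; // no row yet, so the write is silently a no-op
  }
  issue.solvedByPr = prNumber;
  return 1;
}

// Step 1: backfill stores the merged PR.
pullRequests.set(42, { number: 42, merged: true, closesIssues: [7] });

// Step 2: the PR metadata job runs before issue 7 has been inserted.
let affectedRows = 0;
for (const issueNumber of pullRequests.get(42)!.closesIssues) {
  affectedRows += updateSolvedByPr(issueNumber, 42);
}

// Step 3: issue backfill inserts the row afterwards.
issues.set(7, { number: 7, solvedByPr: null });

// The PR still records the closing link, but the issue row never received it.
const finalLink = issues.get(7)!.solvedByPr;
```

Here `affectedRows` stays at 0 and `finalLink` stays `null`, which matches the state described above: the PR knows it closes the issue, the issue does not know it was solved.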

Fix direction

  • backfill issues before queueing PR metadata jobs
  • add a reconciliation pass after backfill that rebuilds solved_by_pr from merged PR closing links
  • keep the reconciliation even if ordering is improved, so correctness does not depend on queue timing
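A reconciliation pass along the lines of the second bullet could look roughly like this. It is a minimal sketch under stated assumptions: `reconcileSolvedByPr`, the `Issue` and `PullRequest` shapes, and the in-memory collections are hypothetical, and the real mirror would run the equivalent against its database. It also assumes that when several merged PRs close the same issue, the last one seen wins, and that a link with no supporting merged PR is cleared.

```typescript
// Hypothetical shapes (assumption, not the project's real types).
interface Issue {
  number: number;
  solvedByPr: number | null;
}

interface PullRequest {
  number: number;
  merged: boolean;
  closesIssues: number[];
}

// Rebuilds solvedByPr for every issue from merged PR closing links.
// Returns how many issue rows were changed, so a second run is a cheap check.
function reconcileSolvedByPr(issues: Map<number, Issue>, prs: PullRequest[]): number {
  // Expected link per issue number, derived only from merged PRs.
  const expected = new Map<number, number>();
  for (const pr of prs) {
    if (!pr.merged) {
      continue;
    }
    for (const issueNumber of pr.closesIssues) {
      expected.set(issueNumber, pr.number); // assumption: last merged PR wins
    }
  }

  let changed = 0;
  issues.forEach((issue) => {
    const want = expected.has(issue.number) ? expected.get(issue.number)! : null;
    if (issue.solvedByPr !== want) {
      issue.solvedByPr = want;
      changed += 1;
    }
  });
  return changed;
}

// Usage: issue 7 missed its link during backfill; PR 43 is unmerged and must not count.
const issueTable = new Map<number, Issue>([
  [7, { number: 7, solvedByPr: null }],
  [8, { number: 8, solvedByPr: null }],
]);
const prTable: PullRequest[] = [
  { number: 42, merged: true, closesIssues: [7] },
  { number: 43, merged: false, closesIssues: [8] },
];

const repairedFirstRun = reconcileSolvedByPr(issueTable, prTable);
const repairedSecondRun = reconcileSolvedByPr(issueTable, prTable);
```

Because the pass derives the links from the PR side and only writes rows that differ, it is idempotent and does not care in which order the backfill jobs ran, which is the point of keeping it even after the ordering is fixed.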

Labels: bug
