This repository was archived by the owner on Sep 30, 2024. It is now read-only.

v5.1.5 Migrator: v5.0.3 and greater to v5.1.5 upgrades fail #55414

@DaedalusG

Description

  • Sourcegraph version: v5.1.5
  • Platform information: All deployment types utilizing migrator will be affected by this bug

Internal debugging thread: https://sourcegraph.slack.com/archives/C032Z79NZQC/p1690655844061789

Steps to reproduce:

  1. Attempt to upgrade a Sourcegraph instance from any version to v5.1.5
  2. Observe migrator crash-looping on a SQL error

Reproduction

A Sourcegraph instance was initialized at v5.0.3 and then upgraded using both the standard upgrade (up) command and the multiversion (upgrade) command.

Migrator crash-loops on up, and the upgrade command exits with an error.

Migrator standard upgrade up

λ ~/deploy-sourcegraph-docker/docker-compose/ v5.0.3 docker ps
CONTAINER ID   IMAGE                                         COMMAND                  CREATED          STATUS                             PORTS                                                         NAMES
20d28200cb4b   sourcegraph/frontend:5.0.3                    "/sbin/tini -- /usr/…"   9 seconds ago    Up 8 seconds (health: starting)                                                                  sourcegraph-frontend-0
1dd8f4855233   sourcegraph/frontend:5.0.3                    "/sbin/tini -- /usr/…"   15 seconds ago   Up 14 seconds (healthy)                                                                          sourcegraph-frontend-internal
a1fed0eb27e4   sourcegraph/redis-store:5.0.3                 "/sbin/tini -- redis…"   30 seconds ago   Up 26 seconds                      6379/tcp                                                      redis-store
b5482347503e   sourcegraph/gitserver:5.0.3                   "/sbin/tini -- /usr/…"   30 seconds ago   Up 26 seconds                                                                                    gitserver-0
8139a5566639   sourcegraph/github-proxy:5.0.3                "/sbin/tini -- /usr/…"   30 seconds ago   Up 27 seconds                                                                                    github-proxy
9d798eb0d309   sourcegraph/codeinsights-db:5.0.3             "/postgres.sh"           30 seconds ago   Up 26 seconds (healthy)            5432/tcp                                                      codeinsights-db
c5fb78c13663   sourcegraph/repo-updater:5.0.3                "/sbin/tini -- /usr/…"   30 seconds ago   Up 26 seconds                                                                                    repo-updater
0ca89e93d099   sourcegraph/search-indexer:5.0.3              "/sbin/tini -- zoekt…"   30 seconds ago   Up 27 seconds                                                                                    zoekt-indexserver-0
89fa6b53c0a2   sourcegraph/indexed-searcher:5.0.3            "/sbin/tini -- /bin/…"   30 seconds ago   Up 27 seconds (healthy)                                                                          zoekt-webserver-0
8397b3c2e1e2   sourcegraph/redis-cache:5.0.3                 "/sbin/tini -- redis…"   30 seconds ago   Up 26 seconds                      6379/tcp                                                      redis-cache
ab77fef558a5   sourcegraph/codeintel-db:5.0.3                "/postgres.sh"           30 seconds ago   Up 26 seconds (healthy)            5432/tcp                                                      codeintel-db
274dc1bfa353   sourcegraph/postgres_exporter:5.0.3           "/usr/local/bin/post…"   30 seconds ago   Up 27 seconds                      9187/tcp                                                      codeinsights-db-exporter
736262a3c903   sourcegraph/cadvisor:5.0.3                    "/usr/bin/cadvisor -…"   30 seconds ago   Up 26 seconds (health: starting)   8080/tcp                                                      cadvisor
fe94549a3b4a   sourcegraph/syntax-highlighter:5.0.3          "sh -c '/http-server…"   30 seconds ago   Up 26 seconds (healthy)            9238/tcp                                                      syntect-server
785337d43111   sourcegraph/worker:5.0.3                      "/sbin/tini -- /usr/…"   30 seconds ago   Up 26 seconds                      3189/tcp                                                      worker
04ba6a67957e   sourcegraph/postgres-12-alpine:5.0.3          "/postgres.sh"           30 seconds ago   Up 26 seconds (healthy)            5432/tcp                                                      pgsql
2b84e6a881c8   sourcegraph/prometheus:5.0.3                  "/bin/prom-wrapper"      30 seconds ago   Up 26 seconds                      0.0.0.0:9090->9090/tcp                                        prometheus
a34f0dbba700   sourcegraph/precise-code-intel-worker:5.0.3   "/sbin/tini -- /usr/…"   30 seconds ago   Up 27 seconds (healthy)            3188/tcp                                                      precise-code-intel-worker
48badbf888e9   sourcegraph/node-exporter:5.0.3               "/bin/node_exporter …"   30 seconds ago   Up 27 seconds                      9100/tcp                                                      node-exporter
3ab568c192fe   sourcegraph/postgres_exporter:5.0.3           "/usr/local/bin/post…"   30 seconds ago   Up 27 seconds                      9187/tcp                                                      codeintel-db-exporter
937b415051a8   caddy:2.5.2-alpine                            "caddy run --config …"   30 seconds ago   Up 26 seconds                      0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp, 443/udp, 2019/tcp   caddy
cf661fdeebb6   sourcegraph/blobstore:5.0.3                   "/sbin/tini -- /opt/…"   30 seconds ago   Up 27 seconds (healthy)            9000/tcp                                                      blobstore
70e1b393a342   sourcegraph/postgres_exporter:5.0.3           "/usr/local/bin/post…"   30 seconds ago   Up 27 seconds                      9187/tcp                                                      pgsql-exporter
0f201630afec   sourcegraph/opentelemetry-collector:5.0.3     "/bin/otelcol-source…"   30 seconds ago   Up 27 seconds                                                                                    otel-collector
dd308fdaeba5   sourcegraph/searcher:5.0.3                    "/sbin/tini -- /usr/…"   30 seconds ago   Up 28 seconds (healthy)                                                                          searcher-0
cbf039437210   sourcegraph/grafana:5.0.3                     "/opt/grafana/entry.…"   30 seconds ago   Up 27 seconds                      0.0.0.0:3370->3370/tcp                                        grafana
b7957dc4dc10   sourcegraph/symbols:5.0.3                     "/sbin/tini -- /usr/…"   30 seconds ago   Up 28 seconds (healthy)            3184/tcp                                                      symbols-0

After checking out the v5.1.5 tag and running docker-compose up -d, the terminal session hangs.

In another terminal session, we check the status of the containers and find migrator and its dependent services restarting:

λ ~/ docker ps
CONTAINER ID   IMAGE                                         COMMAND                  CREATED          STATUS                         PORTS                                                         NAMES
cf82e9a4e9a7   sourcegraph/migrator:5.1.5                    "/sbin/tini -- /migr…"   40 seconds ago   Restarting (1) 1 second ago

Checking the migrator container logs, we find that migrator restarts continuously:

λ ~/ docker logs migrator
✱ Sourcegraph migrator 5.1.5
ℹ️ Connection DSNs used: frontend => postgres://sg:sg@pgsql:5432/sg?sslmode=disable, codeintel => postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable, codeinsights => postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable
Attempting connection to postgres://sg:sg@pgsql:5432/sg?sslmode=disable...
✅ Connection to "postgres://sg:sg@pgsql:5432/sg?sslmode=disable" succeeded
Attempting connection to postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable...
✅ Connection to "postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable" succeeded
Attempting connection to postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable...
✅ Connection to "postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable" succeeded
{"SeverityText":"ERROR","Timestamp":1690695371906853900,"InstrumentationScope":"migrator.migrations.Up","Caller":"store/store.go:387","Function":"github.com/sourcegraph/sourcegraph/internal/database/migration/store.(*Store).Up","Body":"operation.error","Resource":{"service.name":"migrator","service.version":"5.1.5","service.instance.id":"bbc6d73c-3ed9-4241-b5d0-707e0b19b2e0"},"Attributes":{"count":1,"elapsed":0.0564196,"error":"ERROR: relation \"repo_paths\" does not exist (SQLSTATE 42P01)"}}
{"SeverityText":"ERROR","Timestamp":1690695371910218000,"InstrumentationScope":"migrator.migrations.WithMigrationLog","Caller":"store/store.go:461","Function":"github.com/sourcegraph/sourcegraph/internal/database/migration/store.(*Store).WithMigrationLog","Body":"operation.error","Resource":{"service.name":"migrator","service.version":"5.1.5","service.instance.id":"bbc6d73c-3ed9-4241-b5d0-707e0b19b2e0"},"Attributes":{"count":1,"elapsed":0.0639039,"error":"failed to apply migration 1682683129:\n```\nCREATE TABLE IF NOT EXISTS own_aggregate_recent_view\n(\n    id                  SERIAL PRIMARY KEY,\n    viewer_id           INTEGER NOT NULL REFERENCES users (id) ON DELETE CASCADE DEFERRABLE,\n    viewed_file_path_id INTEGER NOT NULL REFERENCES repo_paths (id),\n    views_count         INTEGER DEFAULT 0\n);\nCREATE UNIQUE INDEX IF NOT EXISTS own_aggregate_recent_view_viewer\n    ON own_aggregate_recent_view\n        USING btree (viewed_file_path_id, viewer_id);\nCOMMENT ON TABLE own_aggregate_recent_view\n    IS 'One entry contains a number of views of a single file by a given viewer.';\nCREATE TABLE IF NOT EXISTS event_logs_scrape_state_own\n(\n    id          SERIAL\n        CONSTRAINT event_logs_scrape_state_own_pk\n            PRIMARY KEY,\n    bookmark_id INT NOT NULL,\n    job_type    INT NOT NULL\n);\nCOMMENT ON TABLE event_logs_scrape_state_own IS 'Contains state for own jobs that scrape events if enabled.';\nCOMMENT ON COLUMN event_logs_scrape_state_own.bookmark_id IS 'Bookmarks the maximum most recent successful event_logs.id that was scraped';\n```: ERROR: relation \"repo_paths\" does not exist (SQLSTATE 42P01)"}}
failed to run migration for schema "frontend": failed to apply migration 1682683129:
CREATE TABLE IF NOT EXISTS own_aggregate_recent_view
(
    id                  SERIAL PRIMARY KEY,
    viewer_id           INTEGER NOT NULL REFERENCES users (id) ON DELETE CASCADE DEFERRABLE,
    viewed_file_path_id INTEGER NOT NULL REFERENCES repo_paths (id),
    views_count         INTEGER DEFAULT 0
);
CREATE UNIQUE INDEX IF NOT EXISTS own_aggregate_recent_view_viewer
    ON own_aggregate_recent_view
        USING btree (viewed_file_path_id, viewer_id);
COMMENT ON TABLE own_aggregate_recent_view
    IS 'One entry contains a number of views of a single file by a given viewer.';
CREATE TABLE IF NOT EXISTS event_logs_scrape_state_own
(
    id          SERIAL
        CONSTRAINT event_logs_scrape_state_own_pk
            PRIMARY KEY,
    bookmark_id INT NOT NULL,
    job_type    INT NOT NULL
);
COMMENT ON TABLE event_logs_scrape_state_own IS 'Contains state for own jobs that scrape events if enabled.';
COMMENT ON COLUMN event_logs_scrape_state_own.bookmark_id IS 'Bookmarks the maximum most recent successful event_logs.id that was scraped';
: ERROR: relation "repo_paths" does not exist (SQLSTATE 42P01)
✱ Sourcegraph migrator 5.1.5
ℹ️ Connection DSNs used: frontend => postgres://sg:sg@pgsql:5432/sg?sslmode=disable, codeintel => postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable, codeinsights => postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable
Attempting connection to postgres://sg:sg@pgsql:5432/sg?sslmode=disable...
✅ Connection to "postgres://sg:sg@pgsql:5432/sg?sslmode=disable" succeeded
Attempting connection to postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable...
✅ Connection to "postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable" succeeded
Attempting connection to postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable...
✅ Connection to "postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable" succeeded
failed to run migration for schema "frontend": dirty database: schema "frontend" marked the following migrations as failed: 1682683129
The target schema is marked as dirty and no other migration operation is seen running on this schema. The last migration operation over this schema has failed (or, at least, the migrator instance issuing that migration has died). Please contact support@sourcegraph.com for further assistance.

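Migrator errors while attempting to run schema migration 1682683129, which leaves the frontend schema marked as dirty. On the next restart it refuses to proceed because of the dirty state, so the container keeps crash-looping.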
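For triage, the structured error lines in the migrator output can be reduced to the failing migration version and the Postgres SQLSTATE. The following is a minimal sketch, assuming the JSON field names seen in the logs above (SeverityText, InstrumentationScope, Attributes.error); the function name summarize_migrator_errors is hypothetical and not part of migrator.

```python
import json
import re

# Matches the migration version in messages such as
# "failed to apply migration 1682683129:" (format taken from the logs above).
_MIGRATION_RE = re.compile(r"failed to apply migration (\d+)")
# Matches a Postgres SQLSTATE code such as "(SQLSTATE 42P01)".
_SQLSTATE_RE = re.compile(r"\(SQLSTATE ([0-9A-Z]{5})\)")


def summarize_migrator_errors(log_text):
    """Return one summary dict per JSON error line that names a failed migration.

    Hypothetical helper: it only understands the log shape shown in this issue.
    """
    summaries = []
    for line in log_text.splitlines():
        # docker-compose prefixes lines with "migrator | "; keep only the JSON part.
        start = line.find("{")
        if start == -1:
            continue
        try:
            record = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # plain-text line, not a structured log record
        if not isinstance(record, dict) or record.get("SeverityText") != "ERROR":
            continue
        attributes = record.get("Attributes") or {}
        message = attributes.get("error", "")
        migration = _MIGRATION_RE.search(message)
        if migration is None:
            continue  # an error line that does not name a migration
        sqlstate = _SQLSTATE_RE.search(message)
        summaries.append({
            "migration": int(migration.group(1)),
            "sqlstate": sqlstate.group(1) if sqlstate else None,
            "scope": record.get("InstrumentationScope"),
        })
    return summaries
```

Fed the output of docker logs migrator, this should report migration 1682683129 with SQLSTATE 42P01 (undefined table) for each failed attempt. Lines that are not JSON, such as the dirty-database message, are skipped.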

In another session, we shell into the database and check the migration_logs table to see where the migration runner failed:

sg=# SELECT * FROM migration_logs WHERE success IS false;
-[ RECORD 1 ]-----------------+--------------------------------------------------------------------------------------------------------------------------------------------
id                            | 566
migration_logs_schema_version | 2
schema                        | schema_migrations
version                       | 1682683129
up                            | t
started_at                    | 2023-07-30 05:36:11.846863+00
finished_at                   | 2023-07-30 05:36:11.908437+00
success                       | f
error_message                 | failed to apply migration 1682683129:                                                                                                      +
                              | ```                                                                                                                                        +
                              | CREATE TABLE IF NOT EXISTS own_aggregate_recent_view                                                                                       +
                              | (                                                                                                                                          +
                              |     id                  SERIAL PRIMARY KEY,                                                                                                +
                              |     viewer_id           INTEGER NOT NULL REFERENCES users (id) ON DELETE CASCADE DEFERRABLE,                                               +
                              |     viewed_file_path_id INTEGER NOT NULL REFERENCES repo_paths (id),                                                                       +
                              |     views_count         INTEGER DEFAULT 0                                                                                                  +
                              | );                                                                                                                                         +
                              | CREATE UNIQUE INDEX IF NOT EXISTS own_aggregate_recent_view_viewer                                                                         +
                              |     ON own_aggregate_recent_view                                                                                                           +
                              |         USING btree (viewed_file_path_id, viewer_id);                                                                                      +
                              | COMMENT ON TABLE own_aggregate_recent_view                                                                                                 +
                              |     IS 'One entry contains a number of views of a single file by a given viewer.';                                                         +
                              | CREATE TABLE IF NOT EXISTS event_logs_scrape_state_own                                                                                     +
                              | (                                                                                                                                          +
                              |     id          SERIAL                                                                                                                     +
                              |         CONSTRAINT event_logs_scrape_state_own_pk                                                                                          +
                              |             PRIMARY KEY,                                                                                                                   +
                              |     bookmark_id INT NOT NULL,                                                                                                              +
                              |     job_type    INT NOT NULL                                                                                                               +
                              | );                                                                                                                                         +
                              | COMMENT ON TABLE event_logs_scrape_state_own IS 'Contains state for own jobs that scrape events if enabled.';                              +
                              | COMMENT ON COLUMN event_logs_scrape_state_own.bookmark_id IS 'Bookmarks the maximum most recent successful event_logs.id that was scraped';+
                              | ```: ERROR: relation "repo_paths" does not exist (SQLSTATE 42P01)
backfilled                    | f

This is causing the runner to fail here: https://sourcegraph.com/github.com/sourcegraph/sourcegraph@3ab5bb324b2f8404feffb5f4f6e7bc876eefac4d/-/blob/internal/database/migration/runner/runner.go?L393-396

Multiversion upgrade upgrade

A multi-version upgrade was also tested by initializing an instance at v5.0.3 and running the migrator upgrade command:

  migrator:
    container_name: migrator
    image: 'index.docker.io/sourcegraph/migrator:5.1.5'
    cpus: 0.5
    mem_limit: '500m'
    command: ['upgrade', '--from=v5.0.3', '--to=v5.1.5']
...
λ ~/deploy-sourcegraph-docker/docker-compose/ v5.0.3* docker-compose up migrator
Creating pgsql           ... done
Creating codeintel-db    ... done
Creating codeinsights-db ... done
Creating migrator        ... done
Attaching to migrator
migrator                         | ✱ Sourcegraph migrator 5.1.5
migrator                         | ℹ️ Connection DSNs used: frontend => postgres://sg:sg@pgsql:5432/sg?sslmode=disable, codeintel => postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable, codeinsights => postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable
migrator                         | Attempting connection to postgres://sg:sg@pgsql:5432/sg?sslmode=disable...
migrator                         | ✅ Connection to "postgres://sg:sg@pgsql:5432/sg?sslmode=disable" succeeded
migrator                         | ℹ️ Connection DSNs used: frontend => postgres://sg:sg@pgsql:5432/sg?sslmode=disable&timezone=UTC, codeintel => postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable, codeinsights => postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable
migrator                         | Attempting connection to postgres://sg:sg@pgsql:5432/sg?sslmode=disable&timezone=UTC...
migrator                         | ✅ Connection to "postgres://sg:sg@pgsql:5432/sg?sslmode=disable&timezone=UTC" succeeded
migrator                         | Attempting connection to postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable...
migrator                         | ✅ Connection to "postgres://sg:sg@codeintel-db:5432/sg?sslmode=disable" succeeded
migrator                         | Attempting connection to postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable...
migrator                         | ✅ Connection to "postgres://postgres:password@codeinsights-db:5432/postgres?sslmode=disable" succeeded
migrator                         | 👉 Migrating to v5.1 (step 1 of 1)
migrator                         | 👉 Running schema migrations
migrator                         | {"SeverityText":"ERROR","Timestamp":1690697789737890800,"InstrumentationScope":"migrator.migrations.Up","Caller":"store/store.go:387","Function":"github.com/sourcegraph/sourcegraph/internal/database/migration/store.(*Store).Up","Body":"operation.error","Resource":{"service.name":"migrator","service.version":"5.1.5","service.instance.id":"8193870e-86c8-4ee2-8ef3-1a6ade9401a1"},"Attributes":{"count":1,"elapsed":0.0084011,"error":"ERROR: relation \"repo_paths\" does not exist (SQLSTATE 42P01)"}}
migrator                         | {"SeverityText":"ERROR","Timestamp":1690697789740881800,"InstrumentationScope":"migrator.migrations.WithMigrationLog","Caller":"store/store.go:461","Function":"github.com/sourcegraph/sourcegraph/internal/database/migration/store.(*Store).WithMigrationLog","Body":"operation.error","Resource":{"service.name":"migrator","service.version":"5.1.5","service.instance.id":"8193870e-86c8-4ee2-8ef3-1a6ade9401a1"},"Attributes":{"count":1,"elapsed":0.0140002,"error":"failed to apply migration 1682683129:\n```\nCREATE TABLE IF NOT EXISTS own_aggregate_recent_view\n(\n    id                  SERIAL PRIMARY KEY,\n    viewer_id           INTEGER NOT NULL REFERENCES users (id) ON DELETE CASCADE DEFERRABLE,\n    viewed_file_path_id INTEGER NOT NULL REFERENCES repo_paths (id),\n    views_count         INTEGER DEFAULT 0\n);\nCREATE UNIQUE INDEX IF NOT EXISTS own_aggregate_recent_view_viewer\n    ON own_aggregate_recent_view\n        USING btree (viewed_file_path_id, viewer_id);\nCOMMENT ON TABLE own_aggregate_recent_view\n    IS 'One entry contains a number of views of a single file by a given viewer.';\nCREATE TABLE IF NOT EXISTS event_logs_scrape_state_own\n(\n    id          SERIAL\n        CONSTRAINT event_logs_scrape_state_own_pk\n            PRIMARY KEY,\n    bookmark_id INT NOT NULL,\n    job_type    INT NOT NULL\n);\nCOMMENT ON TABLE event_logs_scrape_state_own IS 'Contains state for own jobs that scrape events if enabled.';\nCOMMENT ON COLUMN event_logs_scrape_state_own.bookmark_id IS 'Bookmarks the maximum most recent successful event_logs.id that was scraped';\n```: ERROR: relation \"repo_paths\" does not exist (SQLSTATE 42P01)"}}
migrator                         | failed to run migration for schema "frontend": failed to apply migration 1682683129:
migrator                         | ```
migrator                         | CREATE TABLE IF NOT EXISTS own_aggregate_recent_view
migrator                         | (
migrator                         |     id                  SERIAL PRIMARY KEY,
migrator                         |     viewer_id           INTEGER NOT NULL REFERENCES users (id) ON DELETE CASCADE DEFERRABLE,
migrator                         |     viewed_file_path_id INTEGER NOT NULL REFERENCES repo_paths (id),
migrator                         |     views_count         INTEGER DEFAULT 0
migrator                         | );
migrator                         | CREATE UNIQUE INDEX IF NOT EXISTS own_aggregate_recent_view_viewer
migrator                         |     ON own_aggregate_recent_view
migrator                         |         USING btree (viewed_file_path_id, viewer_id);
migrator                         | COMMENT ON TABLE own_aggregate_recent_view
migrator                         |     IS 'One entry contains a number of views of a single file by a given viewer.';
migrator                         | CREATE TABLE IF NOT EXISTS event_logs_scrape_state_own
migrator                         | (
migrator                         |     id          SERIAL
migrator                         |         CONSTRAINT event_logs_scrape_state_own_pk
migrator                         |             PRIMARY KEY,
migrator                         |     bookmark_id INT NOT NULL,
migrator                         |     job_type    INT NOT NULL
migrator                         | );
migrator                         | COMMENT ON TABLE event_logs_scrape_state_own IS 'Contains state for own jobs that scrape events if enabled.';
migrator                         | COMMENT ON COLUMN event_logs_scrape_state_own.bookmark_id IS 'Bookmarks the maximum most recent successful event_logs.id that was scraped';
migrator                         | ```: ERROR: relation "repo_paths" does not exist (SQLSTATE 42P01)

The same error pattern holds here, with migration 1682683129 failing.
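Both failures come down to the migration's REFERENCES repo_paths (id) clause pointing at a table that does not exist when the statement runs. As a rough illustration, the sketch below lists the tables a migration's SQL references and reports those missing from a known set. It is a regex heuristic rather than a SQL parser, the function names are hypothetical, and the existing_tables argument is an assumption: in practice it would have to come from the target database's catalog.

```python
import re

# Matches "REFERENCES <table>" in DDL; the name may be quoted or schema-qualified.
_REFERENCES_RE = re.compile(r'\bREFERENCES\s+"?([A-Za-z_][\w.]*)"?', re.IGNORECASE)


def referenced_tables(migration_sql):
    """Return the tables named in REFERENCES clauses, in first-seen order."""
    seen = []
    for name in _REFERENCES_RE.findall(migration_sql):
        if name not in seen:
            seen.append(name)
    return seen


def missing_references(migration_sql, existing_tables):
    """Return referenced tables that are absent from existing_tables."""
    existing = set(existing_tables)
    return [table for table in referenced_tables(migration_sql) if table not in existing]


if __name__ == "__main__":
    # Excerpt of migration 1682683129 as printed in the logs above.
    sql = """
    CREATE TABLE IF NOT EXISTS own_aggregate_recent_view
    (
        id                  SERIAL PRIMARY KEY,
        viewer_id           INTEGER NOT NULL REFERENCES users (id) ON DELETE CASCADE DEFERRABLE,
        viewed_file_path_id INTEGER NOT NULL REFERENCES repo_paths (id),
        views_count         INTEGER DEFAULT 0
    );
    """
    # Assumed table set for illustration: users exists, repo_paths does not.
    print(missing_references(sql, ["users"]))  # prints ['repo_paths']
```

A check like this does not explain why repo_paths is absent. It only shows that migration 1682683129 depends on a table that has to be created before it runs.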

If you would like immediate help on this, please email support@sourcegraph.com (you can still create the issue, but there are no SLAs on issues like there are for support requests).
