Decouple migrations: Update on-premise deployment instructions #25256

Closed
efritz opened this issue Sep 22, 2021 · 13 comments
Assignees
Labels
team/delivery Delivery team

Comments

@efritz
Contributor

efritz commented Sep 22, 2021

Partially implements Step 4 in RFC 469.

Update the instructions in deploy-sourcegraph and deploy-sourcegraph-docker repositories to mention the explicit upgrade step that needs to be performed. This also needs to be reflected in the changelog and the update instructions in the user-facing docs.

Note: This is a docs-only change that is the minimum necessary change to move forward. We'll try to improve ergonomics for site-admins during upgrades in future issues.

@github-actions
Contributor

github-actions bot commented Dec 1, 2021

Heads up @dan-mckean @caugustus-sourcegraph @kevinwojo - the "team/delivery" label was applied to this issue.

@github-actions
Contributor

github-actions bot commented Dec 8, 2021

Heads up @daxmc99 @JenRed777 @danieldides - the "team/devops" label was applied to this issue.

@danieldides danieldides self-assigned this Dec 13, 2021
@danieldides
Contributor

Making steady progress on this. I've put together a pretty docker-compose.yaml file with just the frontend service and the pgsql container pinned at a version after Eric's changes:

version: '2.4'
services:
  # This container terminates so we can be pretty generous with resources
  migrator:
    container_name: sourcegraph-migrator
    image: 'index.docker.io/sourcegraph/migrator:122149_2021-12-17_1f7179c'
    cpus: 4
    mem_limit: '8g'
    command:
      ["up", "-db", "frontend"]
    environment:
      - PGHOST=pgsql
      - PGPORT=5432
      - PGUSER=sg
      - PGPASSWORD=sg
      - PGDATABASE=sg
      - PGSSLMODE=disable
    restart: "on-failure"
    networks: 
      - sourcegraph
    depends_on:
      pgsql:
        condition: service_healthy

  sourcegraph-frontend-0:
    container_name: sourcegraph-frontend-0
    image: 'index.docker.io/sourcegraph/frontend:122149_2021-12-17_1f7179c'
    cpus: 4
    mem_limit: '8g'
    environment:
      - DEPLOY_TYPE=docker-compose
      - GOMAXPROCS=12
      - JAEGER_AGENT_HOST=jaeger
      - PGHOST=pgsql
      - CODEINTEL_PGHOST=codeintel-db
      - CODEINSIGHTS_PGDATASOURCE=postgres://postgres:password@codeinsights-db:5432/postgres
      - 'SRC_GIT_SERVERS=gitserver-0:3178'
      - 'SRC_SYNTECT_SERVER=http://syntect-server:9238'
      - 'SEARCHER_URL=http://searcher-0:3181'
      - 'SYMBOLS_URL=http://symbols-0:3184'
      - 'INDEXED_SEARCH_SERVERS=zoekt-webserver-0:6070'
      - 'SRC_FRONTEND_INTERNAL=sourcegraph-frontend-internal:3090'
      - 'REPO_UPDATER_URL=http://repo-updater:3182'
      - 'GRAFANA_SERVER_URL=http://grafana:3370'
      - 'JAEGER_SERVER_URL=http://jaeger:16686'
      - 'GITHUB_BASE_URL=http://github-proxy:3180'
      - 'PROMETHEUS_URL=http://prometheus:9090'
      - 'NEW_MIGRATIONS=true'
    healthcheck:
      test: "wget -q 'http://127.0.0.1:3080/healthz' -O /dev/null || exit 1"
      interval: 5s
      timeout: 10s
      retries: 5
      start_period: 300s
    volumes:
      - 'sourcegraph-frontend-0:/mnt/cache'
    networks:
      - sourcegraph
    restart: always
    depends_on:
      sourcegraph-frontend-internal:
        condition: service_healthy

  sourcegraph-frontend-internal:
    container_name: sourcegraph-frontend-internal
    image: 'index.docker.io/sourcegraph/frontend:122149_2021-12-17_1f7179c'
    cpus: 4
    mem_limit: '8g'
    environment:
      - DEPLOY_TYPE=docker-compose
      - GOMAXPROCS=4
      - PGHOST=pgsql
      - CODEINTEL_PGHOST=codeintel-db
      - CODEINSIGHTS_PGDATASOURCE=postgres://postgres:password@codeinsights-db:5432/postgres
      - 'SRC_GIT_SERVERS=gitserver-0:3178'
      - 'SRC_SYNTECT_SERVER=http://syntect-server:9238'
      - 'SEARCHER_URL=http://searcher-0:3181'
      - 'SYMBOLS_URL=http://symbols-0:3184'
      - 'INDEXED_SEARCH_SERVERS=zoekt-webserver-0:6070'
      - 'SRC_FRONTEND_INTERNAL=sourcegraph-frontend-internal:3090'
      - 'REPO_UPDATER_URL=http://repo-updater:3182'
      - 'GRAFANA_SERVER_URL=http://grafana:3370'
      - 'JAEGER_SERVER_URL=http://jaeger:16686'
      - 'GITHUB_BASE_URL=http://github-proxy:3180'
      - 'PROMETHEUS_URL=http://prometheus:9090'
      - 'NEW_MIGRATIONS=true'
    volumes:
      - 'sourcegraph-frontend-internal-0:/mnt/cache'
    networks:
      - sourcegraph
    restart: always
    healthcheck:
      test: "wget -q 'http://127.0.0.1:3080/healthz' -O /dev/null || exit 1"
      interval: 5s
      timeout: 10s
      retries: 5
      start_period: 300s
    depends_on:
      migrator:
        condition: service_completed_successfully

  pgsql:
    container_name: pgsql
    image: 'index.docker.io/sourcegraph/postgres-12.6-alpine:3.34.2@sha256:231c6d176f81928dcd2fd81cb7976c5a61dc59e4c4c4bad3825f9c29e88a5fb8'
    cpus: 4
    mem_limit: '2g'
    healthcheck:
      test: '/liveness.sh'
      interval: 10s
      timeout: 1s
      retries: 10
      start_period: 15s
    volumes:
      - 'pgsql:/data/'
    networks:
      - sourcegraph
    restart: always
    stop_grace_period: 120s

  redis-cache:
    container_name: redis-cache
    image: 'index.docker.io/sourcegraph/redis-cache:3.34.2@sha256:44632658f0eb74a1dd89333221404adcd20088753e8b936d92d234a26536ea0f'
    cpus: 1
    mem_limit: '7g'
    volumes:
      - 'redis-cache:/redis-data'
    networks:
      - sourcegraph
    restart: always

  redis-store:
    container_name: redis-store
    image: 'index.docker.io/sourcegraph/redis-store:3.34.2@sha256:873c5c9c1eec0d7bf589e26b7f29b882067dd5ae6af2bb75018dae0cc2fa59f6'
    cpus: 1
    mem_limit: '7g'
    volumes:
      - 'redis-store:/redis-data'
    networks:
      - sourcegraph
    restart: always

volumes:
  pgsql:
  redis-cache:
  redis-store:
  sourcegraph-frontend-0:
  sourcegraph-frontend-internal-0:
networks:
  sourcegraph:

If you run this without the migrator container, the frontend will endlessly log:

sourcegraph-frontend-internal | ERROR: failed to connect to frontend database: database schema out of date

Once you enable the migrator container, however, the startup logs show a successful startup sequence: pgsql comes online, the migrator container waits for it to be ready and executes the migrations, and then the frontend is allowed online:

pgsql | 2021-12-18 03:09:44.719 UTC [1] LOG: database system is ready to accept connections
sourcegraph-migrator | t=2021-12-18T03:09:55+0000 lvl=info msg="Checked current version" schema=frontend version=0 dirty=false
sourcegraph-migrator | t=2021-12-18T03:09:55+0000 lvl=info msg="Upgrading schema" schema=frontend
sourcegraph-migrator | t=2021-12-18T03:09:55+0000 lvl=info msg="Running up migration" schema=frontend migrationID=1528395834
...
sourcegraph-migrator | t=2021-12-18T03:09:58+0000 lvl=info msg="Running up migration" schema=frontend migrationID=1528395959
sourcegraph-migrator exited with code 0
sourcegraph-frontend-internal | t=2021-12-18T03:09:59+0000 lvl=info msg="endpoints: using rendezvous hashing"
sourcegraph-frontend-internal | t=2021-12-18T03:09:59+0000 lvl=info msg="Checked current version" schema=frontend version=1528395959 dirty=false

I think this will work for restructuring the compose file. The migrator depends_on the database, the frontend depends on the migrator exiting cleanly (it exits with a 0 exit code on success and a 1 on failure, which will trigger a restart), and then all other apps can use depends_on to wait for the frontend to become live.
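
To make the last link in that chain concrete, here is a minimal sketch of how one downstream service might wait for the frontend. The service name gitserver-0 is taken from SRC_GIT_SERVERS in the file above, but the service definition itself is hypothetical: the image tag is a placeholder and the remaining settings are assumptions, not the real definition from deploy-sourcegraph-docker.

```yaml
services:
  # Hypothetical downstream service. Only the depends_on block is the point here;
  # the image tag is a placeholder and must be replaced with a real pinned version.
  gitserver-0:
    container_name: gitserver-0
    image: 'index.docker.io/sourcegraph/gitserver:<tag>'
    networks:
      - sourcegraph
    restart: always
    depends_on:
      # Start only after the frontend's healthcheck (the /healthz probe) passes.
      # The frontend in turn waits on the migrator via service_completed_successfully.
      sourcegraph-frontend-0:
        condition: service_healthy
```

Both conditions rely on the long form of depends_on, and service_completed_successfully in particular needs a reasonably recent Docker Compose release, so it is worth checking the Compose version on the host before relying on this ordering.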

@danieldides
Contributor

Finally got around to migrating some of these changes into actual PRs:

  • deploy-sourcegraph-docker - These changes will implement the migrator containers for docker compose deployments
  • sourcegraph/sourcegraph - This change adds docs for admins that may wish to migrate their DB before upgrading the application containers

I think these changes may round out what's left on this ticket. We still need to figure out an upgrade strategy for k8s deployments.

@JenRed777

Waiting for @efritz's review.

@efritz
Contributor Author

efritz commented Jan 3, 2022

@JenRed777 Back from vacation! 👀 , left approvals and comments.

@danieldides Seems like https://github.com/sourcegraph/sourcegraph/pull/29329/files is specific to docker-compose, although the general structure of such a doc would be the same for pure-docker as well as k8s.

@danieldides
Contributor

Thanks for taking a look @efritz! Right now these changes are just for the docker-compose deployments. As far as I could tell there's no way to run pure-docker (which I assume is the single-container deployment type?) with an external database so I didn't include instructions for that use case.

We're still figuring out Kubernetes (it's blocked by some other, large changes we're making to our deployments) and that will be tracked here: #25257. I'll include the documentation changes when that ticket is completed.

@JenRed777

@efritz @danieldides is this one complete?

@efritz
Contributor Author

efritz commented Jan 10, 2022

I believe that @danieldides has updated the instructions for compose, but we're still waiting for the migrator instance to run as a Kubernetes job (cc @daxmc99).
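
For readers wondering what "the migrator as a Kubernetes job" could look like, here is a rough sketch that translates the compose migrator service from earlier in this thread into a batch/v1 Job. This is an assumption-laden illustration, not the manifest that landed in deploy-sourcegraph: the Job name, backoffLimit, and the inline credentials are all hypothetical, and the image tag and arguments are simply copied from the compose example.

```yaml
# Hypothetical sketch only; not the manifest from the deploy-sourcegraph repository.
apiVersion: batch/v1
kind: Job
metadata:
  name: migrator            # assumed name
spec:
  backoffLimit: 3           # assumed retry budget
  template:
    spec:
      # Mirrors restart: "on-failure" from the compose file.
      restartPolicy: OnFailure
      containers:
        - name: migrator
          image: index.docker.io/sourcegraph/migrator:122149_2021-12-17_1f7179c
          # Same arguments as the compose "command" entry.
          args: ["up", "-db", "frontend"]
          env:
            - name: PGHOST
              value: pgsql
            - name: PGPORT
              value: "5432"
            - name: PGUSER
              value: sg
            # Inline for illustration; a real deployment would likely read this from a Secret.
            - name: PGPASSWORD
              value: sg
            - name: PGDATABASE
              value: sg
            - name: PGSSLMODE
              value: disable
```

Unlike compose, a Job has no built-in equivalent of depends_on, so gating the frontend rollout on the Job finishing would need to be handled by the upgrade procedure or by tooling outside this manifest.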

@efritz
Contributor Author

efritz commented Jan 10, 2022

I see that sourcegraph/deploy-sourcegraph-docker#684 is still pending as well.

@danieldides
Contributor

Those PRs are probably safe to be converted from drafts into actual PRs now and merged. I was keeping them unmerged so they couldn't possibly mess up the 3.35 release (since the documentation changes will go live "immediately" on sourcegraph.com) but we're long past that now. I'll go through today and clean everything up.

Kubernetes work is moving forward steadily. We've got an agreed-upon plan and I'm just putting the finishing touches on getting the work merged in. This ticket didn't explicitly call out the Kubernetes requirement, but there are "on-premise" instances running Kubernetes. I'm not sure if we need to keep this ticket open, however.

@danieldides
Contributor

Kubernetes update: sourcegraph/deploy-sourcegraph#4058

@JenRed777

Documentation is complete.

4 participants