Upgrade to PostgreSQL 11.1 #1404

Closed
nicksnyder opened this Issue Dec 13, 2018 · 20 comments

Member

nicksnyder commented Dec 13, 2018

We are currently running a very old version of PostgreSQL (9.4). This is an opportunity to upgrade since 3.0 is a major release. Ideally we would have auto-upgrading.

Other context: sourcegraph/enterprise#11192

Status:

  • Switch deploy-sourcegraph to 11
  • Switch server to 11
  • Production upgraded
  • deploy-sourcegraph auto-upgrades
  • server auto-upgrades
  • test plans

@nicksnyder nicksnyder added the roadmap label Dec 13, 2018

@nicksnyder nicksnyder added this to the 3.0 milestone Dec 13, 2018

Member

keegancsmith commented Dec 14, 2018

I actually mentioned this yesterday (and said pg 11.1): https://sourcegraph.slack.com/archives/CCLF4R6EM/p1544695576192700 So yes, it's something I can look into when I get back from vacation.

@nicksnyder nicksnyder changed the title from "Upgrade to PostgreSQL 10" to "Upgrade to PostgreSQL 11.1" Dec 14, 2018

Member

beyang commented Jan 8, 2019

I looked into this over the holidays and concluded it would be more than a day's worth of work to get it done for 3.0 (as it would include writing up migration docs and some manual steps for existing customers). Discussed with @nicksnyder and decided to backlog.

@beyang beyang changed the milestone from 3.0 to Backlog Jan 8, 2019

Member

keegancsmith commented Jan 9, 2019

However, 3.0 is targeting new users, and some large customers won't upgrade straight away. So shipping a new version (and including instructions on what to do to upgrade) makes a lot of sense since this will be an annoying change. To me it makes sense to get this out since it's one of our oldest tech debt items.

Member Author

nicksnyder commented Jan 9, 2019

Good point, I forgot about the details of what we discussed on Monday (I should have written it down here!)

@beyang @dadler @sqs is it ok if we make 3.0 a hard “current users don’t upgrade” release so we can get stuff like this in and then make 3.1 the fast follow of “current users can upgrade”?

Member Author

nicksnyder commented Jan 9, 2019

@keegancsmith You are right and the conversation we had on Monday (which I forgot about) still stands. I am going to put this back in the milestone.

@nicksnyder nicksnyder changed the milestone from Backlog to 3.0 Jan 9, 2019

Contributor

tsenart commented Jan 12, 2019

> So shipping a new version (and including instructions on what to do to upgrade) makes a lot of sense since this will be an annoying change.

To pick this up, I need more complete context of what makes this annoying and what to look out for. Would you be able to brain-dump the details here @keegancsmith?

Contributor

tsenart commented Jan 12, 2019

One thing I noticed we're missing is a Grafana dashboard for Postgres. Have we attempted to deploy a Postgres exporter and set up a dashboard before? I'd like to get a sense of our workloads and data volume without needing to SSH into any servers.

Contributor

tsenart commented Jan 12, 2019

I can identify three separate high-level streams of work in this upgrade process:

  1. Ship Postgres 11.1 with Sourcegraph 3.0-beta and make sure everything works with it for new customers.
  2. Write documentation AND/OR automation to upgrade existing customer deployments:
  3. Migrate sourcegraph.com with minimal downtime from 9.4 to 11.1.

I think the first item is the most pressing and should be done for 3.0-beta, the second and third come next and ought to be ready in time for 3.0.

Upgrading our own sourcegraph.com deployment should drive the documentation and automation we build for upgrading customers' deployments. With that in mind, whatever automation we use or build should be deployment-agnostic (e.g. https://github.com/rtshome/pgrepup), so that it can be used in K8S, the single Docker image, and other setups.

For all of that work to be reasonably safe, we should invest in better monitoring beforehand by deploying https://github.com/wrouesnel/postgres_exporter or something like it and setting up a good Grafana dashboard.

Contributor

tsenart commented Jan 12, 2019

For the love of God, I can't find sourcegraph/server Dockerfile. Any pointers?

dadler commented Jan 12, 2019

FYI @nicksnyder, you may have "at-ted" the wrong username; I haven't been involved with this repo.

Member Author

nicksnyder commented Jan 13, 2019

I agree that (1) is the most important for 3.0-beta.

> For the love of God, I can't find sourcegraph/server Dockerfile. Any pointers?

#1129

> FYI @nicksnyder, you may have "at-ted" the wrong username; I haven't been involved with this repo.

Sorry! (I meant dadlerj but didn’t have autocomplete on mobile)

Member Author

nicksnyder commented Jan 15, 2019

This is merged into master and works for new deployments (tested on dogfood). The remaining work is documentation and migrating sourcegraph.com, which is for 3.0.

@nicksnyder nicksnyder changed the milestone from 3.0-beta to 3.0 Jan 15, 2019

@tsenart tsenart referenced this issue Jan 22, 2019: Monitor and fix slow Postgres queries #1983 (Open, 0 of 2 tasks complete)

Member

keegancsmith commented Jan 30, 2019

Updated the issue description. What is missing is auto-upgrades in server and test plans. I expect this to be ready by Thursday morning Pacific time so we have time to test.

The current idea is to reuse the PostgreSQL auto-upgrade we use in deploy-sourcegraph. However, it requires us to switch from Alpine to Debian/Ubuntu. This should not be a big deal after some refactoring I did to our Docker build process.

Note: The process of upgrading production made our auto-upgrade pretty battle-hardened, so this should work well.
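The thread does not show how the deploy-sourcegraph auto-upgrade works internally. Purely as an illustration, and assuming it wraps PostgreSQL's `pg_upgrade` tool on a Debian-style layout where each major version's binaries live under `/usr/lib/postgresql/<version>/bin`, the command such a wrapper builds might look like the sketch below. The function name, paths, and defaults are hypothetical and not taken from the Sourcegraph code.

```python
from typing import List


def pg_upgrade_command(
    old_version: str,
    new_version: str,
    old_datadir: str,
    new_datadir: str,
    check_only: bool = True,
) -> List[str]:
    """Build a pg_upgrade invocation (hypothetical helper, not Sourcegraph's code).

    Assumes Debian-style packaging, where binaries for each major version
    are installed side by side under /usr/lib/postgresql/<version>/bin.
    """
    cmd = [
        f"/usr/lib/postgresql/{new_version}/bin/pg_upgrade",
        "--old-bindir", f"/usr/lib/postgresql/{old_version}/bin",
        "--new-bindir", f"/usr/lib/postgresql/{new_version}/bin",
        "--old-datadir", old_datadir,
        "--new-datadir", new_datadir,
    ]
    if check_only:
        # --check only verifies that the clusters are compatible;
        # it does not modify any data.
        cmd.append("--check")
    return cmd
```

Defaulting to `--check` is a deliberate choice in this sketch: a dry run first, and the real migration only after it passes.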

Member

keegancsmith commented Jan 31, 2019

Current status: I was blocked by our CI and Docker versions. I spent most of the day fixing this (and went down the wrong path multiple times). Our CI finally supports multi-stage builds, but this leaves little time for me to do the automatic upgrades and for us to test them. I am working on it now, but it will be a bit delayed. See #2083

To mitigate risk, we could detect outdated PGDATA, and tell users to run a specific command to upgrade for server? This would be much easier to implement quickly and we can test that (while still hoping to get into automatic upgrades). cc @nicksnyder.

Member Author

nicksnyder commented Jan 31, 2019

Any step that mitigates risk and allows us to release 3.0 on time is worth doing.

> tell users to run a specific command to upgrade for server

How would we communicate this to users? At the least we would document this in the upgrade docs, but what happens if a user doesn't read that?

Member

keegancsmith commented Jan 31, 2019

> but what happens if a user doesn't read that?

We fail startup with a helpful diagnostic message which is essentially what would be in the upgrade docs.
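A minimal sketch of that startup check, assuming it relies on the `PG_VERSION` file that PostgreSQL writes into every data directory (it holds the major version, e.g. `9.4` or `11`). The function names, the expected version constant, and the message wording are hypothetical; the thread does not name the actual upgrade command.

```python
from pathlib import Path
from typing import Optional

# Assumption: the major version shipped by the new image.
EXPECTED_MAJOR = "11"


def pgdata_major_version(pgdata: Path) -> Optional[str]:
    """Return the major version recorded in PGDATA/PG_VERSION.

    Returns None when the file is absent, i.e. the directory has not
    been initialized yet (a fresh install).
    """
    version_file = pgdata / "PG_VERSION"
    if not version_file.is_file():
        return None
    return version_file.read_text().strip()


def startup_check(pgdata: Path, expected: str = EXPECTED_MAJOR) -> Optional[str]:
    """Return a diagnostic message if startup should be refused, else None."""
    found = pgdata_major_version(pgdata)
    if found is None or found == expected:
        return None
    return (
        f"PGDATA at {pgdata} was created by PostgreSQL {found}, "
        f"but this release expects PostgreSQL {expected}. Refusing to start. "
        "See the upgrade docs for the manual migration command."
    )
```

A caller would print the returned message and exit with a non-zero status when it is not `None`, so that the container fails fast instead of starting against an incompatible data directory.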

Member Author

nicksnyder commented Feb 1, 2019

@tsenart There are two paths forward, and you can choose whichever path you think is least risky for us cutting a tested 3.0 release:

Path 1:

Path 2:

  • Detect if there is old PGDATA. If so, don't start up; instead, print a message with the manual command to run to perform the migration.
  • Write a test plan.
  • Update the migration guide.

Keegan recommended path 2.

Contributor

tsenart commented Feb 1, 2019

There were a few things we couldn't finish in time for 3.0. Additionally, we figured out that upgrading to 11.1 was a blocker for certain customers, and hence learned at the last minute that we'd need to upgrade to 10.6 instead.

Here's a TODO list for us to refer to afterwards.


  • Tag 3.0.1 release when all of the above is done. Kick-off announcement.

@nicksnyder nicksnyder changed the title from "Upgrade to PostgreSQL 11.1" to "Upgrade to PostgreSQL 10.6" Feb 1, 2019

@nicksnyder nicksnyder changed the milestone from 3.0 to 3.1 Feb 1, 2019

@tsenart tsenart changed the title from "Upgrade to PostgreSQL 10.6" to "Upgrade to PostgreSQL 11.1" Feb 6, 2019

Member Author

nicksnyder commented Feb 8, 2019

This is done

@nicksnyder nicksnyder closed this Feb 8, 2019
