We do not mention postgres admin is required, leading to confusing migration issues #5087

Closed
dadlerj opened this issue Aug 5, 2019 · 5 comments · Fixed by #18413
Labels
deploy-sourcegraph Issues that affect sourcegraph/deploy-sourcegraph docs important planned/3.16 Issues that were planned for the given milestone. Used by cmd/tracking-issue. quick

Comments

@dadlerj
Member

dadlerj commented Aug 5, 2019

  • Sourcegraph version: Sourcegraph 3.4.2
  • Platform information: K8s cluster

Reported by a customer:

Steps to reproduce:

I am trying to install Sourcegraph on Kubernetes using an external RDS DB and keep getting the error message: "Fatal error connecting to Postgres DB: Failed to migrate the DB. Please contact support@sourcegraph.com for further assistance: Dirty database version 1503574972. Fix and force version."

I am using the cloned repo from here: https://github.com/sourcegraph/deploy-sourcegraph.

And repro steps:

I had originally deployed Sourcegraph entirely on Kubernetes and everything worked well. Then, I wanted to move the DB to RDS and so I created an RDS instance with a sourcegraph DB and user and reconfigured all the yaml files. I then deleted all the deployed resources, including the pgsql pod and redeployed with the new configuration.

The only other thing I did was remove the pgsql directory from base/ as I did not want the script (kubectl-apply-all.sh) to redeploy the pgsql pod and service.

@dadlerj dadlerj added deploy-sourcegraph Issues that affect sourcegraph/deploy-sourcegraph team/distribution 🚢📦💨 labels Aug 5, 2019
@dadlerj dadlerj added this to the SWAT milestone Aug 5, 2019
@slimsag
Member

slimsag commented Aug 12, 2019

Looking at the error, it appears we partially initialized the DB schema and failed on the first migration we run. I don't know how that could possibly occur, so it is quite bizarre.

One possibility I can think of, if you copied an existing DB into RDS, is that the copy was incomplete in some way (e.g. perhaps the schema_migrations table was omitted in the copy by accident?)

If it was a fresh DB creation (which it sounds like it was?), then it would seem Sourcegraph began initializing the DB and was interrupted and got stuck here for some reason. That is odd, but we can correct it. This seems likely because I note the error message indicates it is failing on the very first schema migration we would normally apply -- indicating the DB schema isn't initialized at all yet.

To correct this, please try:

  1. Confirm the logs still show 1503574972 as the problematic DB version. If not, don't continue with these steps.
  2. Get a psql prompt connected to your RDS instance, with the Sourcegraph DB name and user.
  3. Confirm that running \d+ shows only a few tables. If it shows a much fuller list, don't continue with these steps.
  4. Assuming the above output showed only a few tables, run the problematic migration manually by entering the following:

     CREATE EXTENSION IF NOT EXISTS citext;
     CREATE EXTENSION IF NOT EXISTS hstore;
     CREATE EXTENSION IF NOT EXISTS pg_trgm;

  5. Force the DB schema version by running the following:

     UPDATE schema_migrations SET version=1503574972, dirty=false;

  6. Restart the frontend, management-console, and repo-updater pods (kubectl delete pod $POD_ID) so they retry more quickly.

After this, my expectation is that the error will go away after about 5 minutes (once the first restarted pod has had a chance to initialize the DB schema).
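
Step 1 above amounts to reading the version number out of the error message before touching the DB. Here is a minimal Python sketch of that guard. The helper names `parse_dirty_version` and `manual_recovery_sql` are hypothetical and not part of Sourcegraph, and the SQL strings are simply the statements quoted in steps 4 and 5:

```python
import re
from typing import List, Optional

# Version of the first migration, as quoted in the error message in this thread.
FIRST_MIGRATION = 1503574972


def parse_dirty_version(message: str) -> Optional[int]:
    """Return N from a 'Dirty database version N' error, or None if absent."""
    match = re.search(r"Dirty database version (\d+)", message)
    return int(match.group(1)) if match else None


def manual_recovery_sql(version: int) -> List[str]:
    """Return the statements from steps 4 and 5, only for the first migration."""
    if version != FIRST_MIGRATION:
        # Step 1: any other version means these steps do not apply.
        return []
    return [
        "CREATE EXTENSION IF NOT EXISTS citext;",
        "CREATE EXTENSION IF NOT EXISTS hstore;",
        "CREATE EXTENSION IF NOT EXISTS pg_trgm;",
        f"UPDATE schema_migrations SET version={version}, dirty=false;",
    ]


if __name__ == "__main__":
    log_line = (
        "Fatal error connecting to Postgres DB: Failed to migrate the DB. "
        "Dirty database version 1503574972. Fix and force version."
    )
    version = parse_dirty_version(log_line)
    if version is None:
        print("No dirty version found in the log line.")
    else:
        for statement in manual_recovery_sql(version):
            print(statement)
```

The script only prints the statements. Running them still means pasting them into the psql session from step 2, after the \d+ check in step 3.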

@slimsag
Member

slimsag commented Aug 13, 2019

Update: The user discovered that the problem was that the sourcegraph user role in postgres had been created without admin privileges (for security reasons) and that prevented the first migration's CREATE EXTENSIONs from working.

We should update our external DB docs to mention this requirement.
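
One way to catch this before deploying, sketched in Python under stated assumptions: compare the extensions already installed in the database (for example, the `extname` values in PostgreSQL's `pg_extension` catalog) against the three that the first migration creates, and hand any missing ones to a privileged user to create. `missing_extensions` and `admin_statements` are hypothetical names used for illustration, not Sourcegraph tooling:

```python
from typing import Iterable, List

# Extensions created by the first migration, per the steps earlier in this thread.
REQUIRED_EXTENSIONS = ("citext", "hstore", "pg_trgm")


def missing_extensions(installed: Iterable[str]) -> List[str]:
    """Return the required extensions not present in `installed`."""
    have = {name.strip().lower() for name in installed}
    return [ext for ext in REQUIRED_EXTENSIONS if ext not in have]


def admin_statements(installed: Iterable[str]) -> List[str]:
    """Return the CREATE EXTENSION statements a privileged user would need to run."""
    return [
        f"CREATE EXTENSION IF NOT EXISTS {ext};"
        for ext in missing_extensions(installed)
    ]


if __name__ == "__main__":
    # Assumed example input: names as they might come back from
    # `SELECT extname FROM pg_extension;` on a fresh database.
    installed = ["plpgsql", "hstore"]
    for statement in admin_statements(installed):
        print(statement)
```

If the list comes back empty, the restricted sourcegraph role should no longer hit the CREATE EXTENSION failure described above, since `IF NOT EXISTS` skips extensions that are already installed.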

@slimsag slimsag changed the title Can't connect to remote DB: "Fatal error connecting to Postgres DB: Failed to migrate the DB" docs: Clarify that Sourcegraph schema migrations require postgres user to be admin, include advice on how to workaround Aug 13, 2019
@slimsag slimsag assigned slimsag and unassigned ggilmore Aug 13, 2019
@beyang beyang modified the milestones: SWAT, 3.8 Aug 20, 2019
@beyang
Member

beyang commented Aug 20, 2019

Taking this out of SWAT, as updating the docs is not a P0.

@slimsag slimsag added the docs label Sep 16, 2019
@slimsag slimsag modified the milestones: 3.8, Backlog Sep 16, 2019
@slimsag
Member

slimsag commented Sep 16, 2019

This didn't make 3.8.

@uwedeportivo
Contributor

Dear all,

This is your release captain speaking. 🚂🚂🚂

Branch cut for the 3.16 release is scheduled for tomorrow.

Is this issue / PR going to make it in time? Please change the milestone accordingly.
When in doubt, reach out!

Thank you

@beyang beyang modified the milestones: 3.16, Backlog May 18, 2020
@beyang beyang added the planned/3.16 Issues that were planned for the given milestone. Used by cmd/tracking-issue. label May 18, 2020
@slimsag slimsag moved this from Needs triage to To do in OLD - Distribution - use "Distribution 🚢" Jun 3, 2020
@slimsag slimsag changed the title docs: Clarify that Sourcegraph schema migrations require postgres user to be admin, include advice on how to workaround We do not mention postgres admin is required, leading to confusing migration issues Jun 23, 2020
@slimsag slimsag moved this from To do to Medium priority (ordered) in OLD - Distribution - use "Distribution 🚢" Jun 30, 2020
@kghopson kghopson added this to Backlog in Documentation Aug 7, 2020
@kghopson kghopson moved this from Backlog to Next in Documentation Sep 11, 2020
@christinaforney christinaforney moved this from Next to Backlog in Documentation Oct 12, 2020
@pecigonzalo pecigonzalo assigned daxmc99 and unassigned beyang Feb 5, 2021
@pecigonzalo pecigonzalo removed this from the Backlog milestone Feb 5, 2021
@daxmc99 daxmc99 moved this from To do to In progress in Distribution: 2021.02.08 - Priceless Rhinoceros Feb 10, 2021
Documentation automation moved this from Backlog to Done Feb 19, 2021
Distribution: 2021.02.08 - Priceless Rhinoceros automation moved this from In progress to Done Feb 19, 2021
7 participants