
Best practices for database migrations #423

Open
theoephraim opened this issue May 3, 2019 · 18 comments
Comments

@theoephraim

After looking through the docs and issues, I'm still a bit confused about best practices for running database migrations as part of a deployment. This seems like a pretty standard requirement for this type of system. Please forgive me if I'm missing something obvious.

It looks like you do have a pre-deploy hook, but it doesn't have access to the actual code; it's more for dealing with container configuration.

Basically I'm looking for the equivalent of Heroku's release phase

  • I want to run these migrations on the actual cluster during runtime, not while building the image
  • the migration script needs access to the codebase and all dependencies
  • if the deploy script fails, the deployment should be aborted
  • if deploying several instances, ideally this script should be run by only 1 instance

Any pointers would be much appreciated.

Thanks!

@githubsaturn
Collaborator

This is not something that's currently available as a built-in solution. But it's definitely something that we need to add in the near future. Just to be clear, the first two items:

  • I want to run these migrations on the actual cluster during runtime, not while building the image
  • the migration script needs access to the codebase and all dependencies

are already possible by virtue of modifying the CMD line in the Dockerfile, or by manually using docker exec to run a bash command inside the container.

The second two items:

  • if the deploy script fails, the deployment should be aborted
  • if deploying several instances, ideally this script should be run by only 1 instance

require major refactoring in the building pipeline. Can you please provide some use cases for this feature so we can make sure that these are addressed when we perform the refactoring?

@jpolvora
Contributor

My suggestion is to run migrations during the startup of your web application. You can just set a flag in your DB and check whether some other migration is running (which would be another instance).
In case another migration is running, you can wait for it to finish before starting your instance (make a loop with some delay and check again whether the flag was cleared).
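A minimal sketch of this flag idea, with an in-memory object standing in for the real database flag (all names here are made up; a production version needs an atomic test-and-set, e.g. a Postgres advisory lock or `SELECT ... FOR UPDATE`, to close the race between checking and setting the flag):

```javascript
// In-memory stand-in for the DB flag; in a real setup this would be a row
// in your database, updated atomically to avoid races between instances.
const db = { migrationRunning: false, migrated: false };

async function runMigrationsOnce(migrate, { delayMs = 50, maxTries = 100 } = {}) {
    // Another instance holds the flag: loop with a delay and re-check.
    for (let tries = 0; db.migrationRunning; tries++) {
        if (tries >= maxTries) throw new Error("timed out waiting for migration flag");
        await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
    if (db.migrated) return false;   // another instance already migrated
    db.migrationRunning = true;      // set the flag
    try {
        await migrate();
        db.migrated = true;
        return true;
    } finally {
        db.migrationRunning = false; // clear the flag
    }
}
```

Each instance calls `runMigrationsOnce(...)` before binding its web server; only the first one actually runs the migration.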

@cazgp

cazgp commented Nov 30, 2019

I didn't want to bake the DATABASE_URL into the Dockerfile. My migrations user also has higher permissions than the web app user, so running at the start of the web app is also not an option. So I've amended the pre-deploy script.

In the deployed Dockerfile is a migrate.sh script that looks for a DATABASE_URL.

DATABASE_URL is defined in the caprover environment variables.

The pre-deploy script is run in the main captain-captain docker service. It doesn't have a docker command (so child_process.exec doesn't work); however, the external /var/run/docker.sock is mounted. DockerApi wraps dockerode and allows us to use the socket to call the parent docker to run containers.

To debug amending this script, run docker service logs captain-captain --since 60m --follow.

const api = require('./built/docker/DockerApi')
const docker = new api.default()

var preDeployFunction = function (captainAppObj, dockerUpdateObject) {
    const imageName = dockerUpdateObject.TaskTemplate.ContainerSpec.Image
    const databaseUrl = captainAppObj.envVars.find(x => x.key === 'DATABASE_URL').value

    return Promise.resolve()
        .then(() => docker.dockerode.runPromise(imageName, ['./migrate.sh'], process.stdout, { Env: ['DATABASE_URL=' + databaseUrl] }))
        .then(() => dockerUpdateObject)
};

@SteffenL

SteffenL commented Mar 23, 2020

Here's my version of the pre-deploy script above:

var preDeployFunction = async function (captainAppObj, dockerUpdateObject) {
    const DockerApi = require("./built/docker/DockerApi");
    const api = new DockerApi.default();

    const setServiceInstances = async (service, count) => {
        const inspection = await service.inspect();
        const updateObject = { ...inspection.Spec, Mode: { Replicated: { Replicas: count } }, version: inspection.Version.Index };
        await service.update(updateObject);
    };

    const run = async args => {
        const imageName = dockerUpdateObject.TaskTemplate.ContainerSpec.Image;
        const env = captainAppObj.envVars.map(kv => kv.key + "=" + kv.value);
        const config = { Env: env, HostConfig: { AutoRemove: true, NetworkMode: captainAppObj.networks[0] } };

        const {output} = await api.dockerode.run(imageName, args, process.stdout, config);

        if (output.StatusCode !== 0) {
            throw new Error(`Failed to run image ${imageName} with args ${args} (status code ${output.StatusCode}).`);
        }
    };

    const service = api.dockerode.getService(dockerUpdateObject.Name);
    await setServiceInstances(service, 0);
    await run(["--migrate-database"]);
    dockerUpdateObject.version = (await service.inspect()).Version.Index;

    return dockerUpdateObject;
};

Wrote a blog post about this with some more information.

In short:

  1. Shuts down app instances (scales down to 0 instances).
  2. Runs the database migration feature built into my app. The Docker container for this is also cleaned up.
  3. Passes in environment variables specified for the app and uses the same network so that the database server can be reached at srv-captain--*.

@derek-adair

@SteffenL pretty slick! Thanks!

@githubsaturn If I wanted to run arbitrary migration commands in this script could i just...

await run("<arbitrary command>")

@githubsaturn
Collaborator

I am not too sure about the inner mechanics of the script that @SteffenL posted, but it indeed appears that await run("<arbitrary command>") would work. Just keep in mind that this starts a new container, runs the command, then exits the container.

@derek-adair

derek-adair commented Mar 25, 2020

...starts a new container, runs the command, then exits the container.

This is exactly what I am looking for!

For me it's not too appealing to write pre-deploy scripts in JavaScript. I dunno about others, but I would much prefer to have these in a shell script... I'm assuming I can do something like...

await run("./migrate.sh")

@berksonj

Using @SteffenL's pre-deploy script above, I receive an error.

I have replaced await run(["--migrate-database"]) with await run(["python manage.py migrate --no-input"]) (I tried just running ls as well; same error).
My Dockerfile has a CMD and no ENTRYPOINT.

captain-captain logs below:

captain-captain.1.abc123@Captain    | September 24th 2020, 5:27:09.899 pm    Updating app started: staging-django
captain-captain.1.abc123@Captain    | September 24th 2020, 5:27:09.931 pm    Ensure service inited and Updated for: staging-django
captain-captain.1.abc123@Captain    | September 24th 2020, 5:27:09.942 pm    Check if service is running: srv-captain--staging-django
captain-captain.1.abc123@Captain    | September 24th 2020, 5:27:09.949 pm    Service is already running: srv-captain--staging-django
captain-captain.1.abc123@Captain    | September 24th 2020, 5:27:09.980 pm    Updating service srv-captain--staging-django with image index.docker.io/myusername/myproject-staging:123abc
captain-captain.1.abc123@Captain    | September 24th 2020, 5:27:09.996 pm    Running preDeployFunction
captain-captain.1.abc123@Captain    | GET /api/v2/user/apps/appData/staging-django/logs?encoding=hex 304 32.442 ms - -
captain-captain.1.abc123@Captain    | GET /api/v2/user/apps/appData/staging-django 304 1.666 ms - -
captain-captain.1.abc123@Captain    | September 24th 2020, 5:27:12.665 pm    Error: (HTTP code 400) unexpected - OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"python manage.py migrate --no-input\": executable file not found in $PATH": unknown
captain-captain.1.abc123@Captain    | Error: (HTTP code 400) unexpected - OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"python manage.py migrate --no-input\": executable file not found in $PATH": unknown
captain-captain.1.abc123@Captain    |     at /usr/src/app/node_modules/docker-modem/lib/modem.js:301:17
captain-captain.1.abc123@Captain    |     at getCause (/usr/src/app/node_modules/docker-modem/lib/modem.js:331:7)
captain-captain.1.abc123@Captain    |     at Modem.buildPayload (/usr/src/app/node_modules/docker-modem/lib/modem.js:300:5)
captain-captain.1.abc123@Captain    |     at IncomingMessage.<anonymous> (/usr/src/app/node_modules/docker-modem/lib/modem.js:275:14)
captain-captain.1.abc123@Captain    |     at IncomingMessage.emit (events.js:326:22)
captain-captain.1.abc123@Captain    |     at endReadableNT (_stream_readable.js:1244:12)
captain-captain.1.abc123@Captain    |     at processTicksAndRejections (internal/process/task_queues.js:80:21)
captain-captain.1.abc123@Captain    | POST /api/v2/user/apps/appDefinitions/update 500 2776.637 ms - 21
captain-captain.1.abc123@Captain    | GET /api/v2/user/apps/appData/staging-django 304 1.324 ms - -
captain-captain.1.abc123@Captain    | GET /api/v2/user/apps/appData/staging-django/logs?encoding=hex 304 52.843 ms - -
captain-captain.1.abc123@Captain    | GET /api/v2/user/apps/appData/staging-django 304 1.413 ms - -
captain-captain.1.abc123@Captain    | GET /api/v2/user/apps/appData/staging-django/logs?encoding=hex 200 10.183 ms - 80

@cazgp

cazgp commented Sep 24, 2020

@berksonj It looks like from the error "exec: \"python manage.py migrate --no-input\": executable file not found in $PATH": unknown that your container is attempting to launch an executable named "python manage.py migrate --no-input".

Instead, try running await run(["python", "manage.py", "migrate", "--no-input"]) (note the separate strings) to ensure that the arguments are passed as individual arguments to the entrypoint, rather than all as one command.

@berksonj

Hi @cazgp, thanks for the tip, that worked!

I ran into a problem after, where the output variable was undefined.

Changing

const {output} = await api.dockerode.run(imageName, args, process.stdout, config);

to

const [output] = await api.dockerode.run(imageName, args, process.stdout, config);

solved the problem.

@matbgn

matbgn commented Jul 6, 2022

@githubsaturn any news on this?

This is not something that's currently available as a built-in solution. But it's definitely something that we need to add in near future.

@kasvith

kasvith commented Nov 1, 2022

This is kind of frustrating, as running DB migrations is a common task expected from something like CapRover

@zenflow

zenflow commented Nov 1, 2022

This is kind of frustrating, as running DB migrations is a common task expected from something like CapRover

The simplest approach, whether using CapRover or Heroku or whatever, is to include the migrations as a startup task in your Docker image. This is a portable solution so it "just works" (without adaptation) regardless of where you're running.

Maybe Heroku has a feature for migrations, but their platform has been around since before containerization was common/standard, and I'm willing to bet that feature was developed at that time.

@kasvith

kasvith commented Nov 1, 2022

The simplest approach, whether using CapRover or Heroku or whatever, is to include the migrations as a startup task in your Docker image. This is a portable solution so it "just works" (without adaptation) regardless of where you're running.

This works well if you have only a single instance running; imagine running migrations on multiple instances at the same time. The best approach that comes to my mind is a post-deployment hook which allows end users to configure the migration strategy themselves. It could be a one-off container which does the migration and dies.

Maybe Heroku has a feature for migrations, but their platform has been around since before containerization was common/standard, and I'm willing to bet that feature was developed at that time.

I often used the release phase in Heroku Build Pack, which runs migrations after deploying the code.

@derek-adair

derek-adair commented Nov 1, 2022

This works well if you have only a single instance running; imagine running migrations on multiple instances at the same time. The best approach that comes to my mind is a post-deployment hook which allows end users to configure the migration strategy themselves. It could be a one-off container which does the migration and dies.

I have mixed feelings about this. It seems like it would be a very nice convenience feature: a simple post-deploy hook for running arbitrary bash commands would be powerful. BUT there are probably design complications. For example, how would this work with the "revert build" feature? Having an undo button for a build is more compelling to me than a migration hook; I'm not sure how this would be kept intact with a migration hook outside of the build context... Although, now that I think of it, this may also break when running migrations in a pre-deploy script.

Regardless, I think it's pretty clear there is some type of improvement worth making here.

PERSONALLY, I've been migrating manually... but I am running quite insignificant infrastructure; when/if any of my projects grow, this will certainly be something I need to address. Feedback from anyone using CapRover at a larger scale would be beneficial to this conversation.


At a minimum it would be very nice to provide a recipe for this kind of thing in the wiki.

@matbgn

matbgn commented Nov 5, 2022

2 cents opinion:

[CONTEXT ON] We are increasingly dependent on CapRover for services that in the coming weeks will more and more be deployed via CI/CD. Those services are mainly test environments provided to our customers; for better or worse, that is our situation. [CONTEXT OFF]

But considering this, and the fact that the pre-deploy script (which we already use) and a post-deploy bash script have/would have similar behavior regarding rollback capabilities, we would much prefer the bash option and then take responsibility ourselves in case rollbacks are needed.

The current pre-deploy sequence is unusable with a proper CD pipeline, in my opinion. Therefore I would propose documenting that users of pre/post-deploy scripts will encounter some drawbacks, like the revert-build downgrade. But this is totally acceptable, I would say, since pre/post-deploy scripts are already a more advanced topic.

Does it make sense for you @derek-adair?

@kasvith

kasvith commented Nov 5, 2022

Having an undo button for a build is more compelling to me than a migration hook; I'm not sure how this would be kept intact with a migration hook outside of the build context... Although, now that I think of it, this may break when running migrations in a pre-deploy script.

Well, this is a very rare use case. You can always do a reverse migration back to the point you were at. This highly depends on your stack. Some tools like Knex allow you to write backward migration strategies, but you have to specify the point to revert to.
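For illustration, here is what such a backward strategy looks like as a Knex migration file (table and file names are invented): each migration exports an `up` that applies the change and a `down` that reverses it.

```javascript
// migrations/20221105_add_users.js (hypothetical file name).
const up = (knex) =>
    knex.schema.createTable("users", (table) => {
        table.increments("id");              // auto-incrementing primary key
        table.string("email").notNullable();
    });

// The backward strategy: drop what `up` created.
const down = (knex) => knex.schema.dropTable("users");

module.exports = { up, down };
```

`knex migrate:rollback` then walks back through the `down` functions of the latest batch, which is the "specify the point for revert" part.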

Most importantly, automating the release would be ideal.

I was really frustrated when I saw CapRover doesn't have a post-deployment script; this is a much-needed feature for a PaaS (well, since it's like Heroku anyway).

@krystofbe

krystofbe commented Jun 7, 2023

In case you don't want to search the whole thread for the right script, this one works:

var preDeployFunction = async function (captainAppObj, dockerUpdateObject) {
    const DockerApi = require("./built/docker/DockerApi");
    const api = new DockerApi.default();

    const run = async args => {
        const imageName = dockerUpdateObject.TaskTemplate.ContainerSpec.Image;
        const env = captainAppObj.envVars.map(kv => kv.key + "=" + kv.value);
        const config = { Env: env, HostConfig: { AutoRemove: true, NetworkMode: captainAppObj.networks[0] } };

        const [output] = await api.dockerode.run(imageName, args, process.stdout, config);
        console.log(output);

        if (output.StatusCode !== 0) {
            throw new Error(`Failed to run image ${imageName} with args ${args} (status code ${output.StatusCode}).`);
        }
    };

    const service = api.dockerode.getService(dockerUpdateObject.Name);
    await run(["PATH_TO_YOUR_SCRIPT", "ADDITIONAL_PARAMETER"]);
    dockerUpdateObject.version = (await service.inspect()).Version.Index;

    return dockerUpdateObject;
};
