This article will explain in depth how to tackle issues related to database compatibility and the deployment process. We will present what can happen with your production applications if you try to perform such a deployment unprepared. We will then walk through the steps in the lifecycle of an application that are necessary to have zero downtime. The result of our operations will be applying a backward incompatible database change in a backward compatible way.
If you want to work through the code samples below, you will find everything you need in GitHub.
What is this mythical zero downtime deployment? You can say that your application is deployed that way if you can successfully introduce a new version of your application to production without making the user see that the application went down in the meantime. From the user’s and the company’s point of view it’s the best possible scenario of deployment since new features can be introduced and bugs can be eliminated without any outage.
How can you achieve that? There are number of ways but one of them is just to:
-
deploy version 1 of your service
-
migrate your database to a new version
-
deploy version 2 of your service in parallel to the version 1
-
once you see that version 2 works like a charm just bring down version 1
-
you’re done!
Easy, isn’t it? Unfortunately, it’s not that easy and we’ll focus on that later on. Right now let’s check another common deployment process which is the blue green deployment.
Have you ever heard of blue green deployment? With Cloud Foundry it’s extremely easy to do. Just check out this article where we describe it in more depth. To quickly recap, doing blue green deployment is as simple as:
maintain two copies of your production environment (“blue” and “green”);
route all traffic to the the blue environment by mapping production URLs to it;
deploy and test any changes to the application in the green environment;
“flip the switch” by mapping URLs onto green and unmapping them from blue.
Blue green deployment is an approach that gives you ease of introducing new features without the stress that something will completely blow up on production. That’s due to the fact that even if that would be the case, you can easily rollback your router to point to a previous environment just by "flipping the switch".
After reading all of the above you could ask yourself a question: What does zero downtime deployment have to do with Blue green deployment?
Well, they have quite a lot in common since maintaining two copies of the same environment leads to doubling the effort required to support it. That’s why some teams, as Martin Fowler states it, tend to perform a variation of that approach:
Another variation would be to use the same database, making the blue-green switches for web and domain layers.
Databases can often be a challenge with this technique, particularly when you need to change the schema to support a new version of the software.
And here we arrive at the main problem that we will touch in this article. The database. Let’s have another glimpse on this phrase:
migrate your database to a new version
Now you should ask yourself a question - what if the database change is backward incompatible? Won’t my version 1 of the application just blow up? Actually, it will…
So even though the benefits of zero downtime / blue green deployment are gigantic, companies tend to follow such a safer process of deploying their apps:
-
prepare a package with the new version of the application
-
shut down the running application
-
run the database migration scripts
-
deploy and run the new version of the application
In this article we’ll describe in more depth how you can work with your database and your code so that you can profit from the benefits of the zero downtime deployment.
If you have a stateless application that doesn’t store any data in the database then you can start doing zero downtime deployment right now. Unfortunately, most software has to store the data somewhere. That’s why you have to think twice before doing any sort of schema changes. Before we go into the details of how to change the schema in such a way that zero downtime deployment is possible let’s focus on schema versioning first.
In this article we will use Flyway as a schema versioning tool. Naturally we’re also writing a Spring Boot application
that has native support for Flyway and will execute the schema migration upon application context setup. When using Flyway
you can store the migration scripts inside your projects folder (by default under classpath:db/migration
). Here you can see an example
of such migration files
└── db
└── migration
├── V1__init.sql
├── V2__Add_surname.sql
├── V3__Final_migration.sql
└── V4__Remove_lastname.sql
In this example we can see 4 migration scripts that, if not executed previously, will be executed one after another when the application
starts. Let’s take a look at one of the files (V1__init.sql
) as an example.
CREATE TABLE PERSON (
id BIGINT GENERATED BY DEFAULT AS IDENTITY,
first_name varchar(255) not null,
last_name varchar(255) not null
);
insert into PERSON (first_name, last_name) values ('Dave', 'Syer');
It’s pretty self-explanatory: you can use SQL to define how your database should be changed. For more information about Spring Boot and Flyway check the Spring Boot Docs.
Using a schema versioning tool with Spring Boot, you receive 2 great benefits.
-
you decouple database changes from the code changes
-
database migration happens together with your application deployment - your deployment process gets simplified
In the following section of the article we will focus on presenting two approaches to database changes.
-
backward incompatible
-
backward compatible
The first one will be shown as a warning to not to try to do zero downtime deployment without some preparations. The second one will present a suggested solution of how one can perform zero downtime deployment and maintain backward compatibility at the same time.
Our project that we will work on will be a simple Spring Boot Flyway application in which we have a Person
that has a first_name
and a last_name
in the database. We want to rename the last_name
column into surname
.
Before we go into details we need to define a couple of assumptions towards our applications. The key result that we would like to obtain is to have a fairly simple process.
Tip
|
Business PRO-TIP. Simplifying processes can save you a lot of money on support (the more people work in your company the more money you can save)! |
We don’t want to do database rollbacks
Not doing them simplifies the deployment process (some database rollbacks are close to impossible like rolling back a delete). We prefer to rollback only the applications. That way even if you have different databases (e.g. SQL and NoSQL) then your deployment pipeline will look the same.
We want ALWAYS to be able to rollback the application one version back (not more)
We want to rollback only as a necessity. If there is a bug in the current version that can’t be solved easily we want to be able to bring back the last working version. We assume that this last working version is the previous one. Maintaining code and database compatibility for more than a single deployment would be extremely difficult and costly.
Tip
|
For readability purposes we will be versioning the applications in this article with major increments. |
Version of the app: 1.0.0
Version of the DB: v1
The db contains a column called last_name
.
CREATE TABLE PERSON (
id BIGINT GENERATED BY DEFAULT AS IDENTITY,
first_name varchar(255) not null,
last_name varchar(255) not null
);
insert into PERSON (first_name, last_name) values ('Dave', 'Syer');
The app stores the Person data into a column called last_name
:
/*
* Copyright 2012-2016 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package sample.flyway;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
@Entity
public class Person {
@Id
@GeneratedValue
private Long id;
private String firstName;
private String lastName;
public String getFirstName() {
return this.firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
public String getLastName() {
return this.lastName;
}
public void setLastName(String lastname) {
this.lastName = lastname;
}
@Override
public String toString() {
return "Person [firstName=" + this.firstName + ", lastName=" + this.lastName
+ "]";
}
}
Let’s take a look at the following example if you want to change the column name:
Warning
|
The following example is deliberately done in such a way that it will break. We’re showing it to depict the problem of database compatibility. |
Version of the app: 2.0.0.BAD
Version of the DB: v2bad
Current changes DO NOT allow us to run two instances (old and new) at the same time. Thus zero down time deployment will be difficult to achieve (if we take into consideration our assumptions it’s actually impossible).
The current situation is that we have an app deployed to production in version 1.0.0
and db in v1
. We want to deploy the second
instance of the app that will be in version 2.0.0.BAD
and update the db to v2bad
.
Steps:
-
a new instance is deployed in version
2.0.0.BAD
that updates the db tov2bad
-
in
v2bad
of the database the columnlast_name
is no longer existing - it got changed tosurname
-
the db and app upgrade is successful and you have some instances working in
1.0.0
, others in2.0.0.BAD
. All are talking to db inv2bad
-
all instances of version
1.0.0
will start producing exceptions cause they will try to insert data tolast_name
column which is no longer there -
all instances of version
2.0.0.BAD
will work without any issues
As you can see if we do backward incompatible changes of the DB and the application, A/B testing is impossible.
Let’s assume that after trying to do A/B deployment we’ve decided that we need to rollback the app back to version 1.0.0
. We assumed
that we don’t want to roll back the database.
Steps:
-
we shut down the instance that was running with version
2.0.0.BAD
-
the database is still in
v2bad
-
since version
1.0.0
doesn’t understand whatsurname
column is it will produce exceptions -
hell broke loose and we can’t go back
As you can see if we do backward incompatible changes of the DB and the application, we can’t roll back to a previous version.
Backward incompatible scenario:
01) Run 1.0.0
02) Wait for the app (1.0.0) to boot
03) Generate a person by calling POST localhost:9991/person to version 1.0.0
04) Run 2.0.0.BAD
05) Wait for the app (2.0.0.BAD) to boot
06) Generate a person by calling POST localhost:9991/person to version 1.0.0 <-- this should fail
07) Generate a person by calling POST localhost:9992/person to version 2.0.0.BAD <-- this should pass
Starting app in version 1.0.0
Generate a person in version 1.0.0
Sending a post to 127.0.0.1:9991/person. This is the response:
{"firstName":"b73f639f-e176-4463-bf26-1135aace2f57","lastName":"b73f639f-e176-4463-bf26-1135aace2f57"}
Starting app in version 2.0.0.BAD
Generate a person in version 1.0.0
Sending a post to 127.0.0.1:9991/person. This is the response:
curl: (22) The requested URL returned error: 500 Internal Server Error
Generate a person in version 2.0.0.BAD
Sending a post to 127.0.0.1:9995/person. This is the response:
{"firstName":"e156be2e-06b6-4730-9c43-6e14cfcda125","surname":"e156be2e-06b6-4730-9c43-6e14cfcda125"}
The migration script renames the column from last_name
to surname
Initial Flyway script:
CREATE TABLE PERSON (
id BIGINT GENERATED BY DEFAULT AS IDENTITY,
first_name varchar(255) not null,
last_name varchar(255) not null
);
insert into PERSON (first_name, last_name) values ('Dave', 'Syer');
Script renaming last_name
.
-- This change is backward incompatible - you can't do A/B testing
ALTER TABLE PERSON CHANGE last_name surname VARCHAR;
This is the most frequent situation that we can encounter. We need to perform backward incompatible changes. We have already proven that to do zero downtime deployment we must not simply apply the database migration without extra work. In this section of the article we will go through 3 deployments of the application together with the database migrations to achieve the desired effect and at the same time be backward compatible.
Tip
|
As a reminder - Let’s assume that we have the DB in version v1 . It contains the columns first_name and last_name .
We want to change the last_name into surname . We also have the app in version 1.0.0 which doesn’t use the surname column just yet.
|
Version of the app: 2.0.0
Version of the DB: v2
By adding a new column and copying its contents we have created backward compatible changes of the db. ATM if we rollback the JAR / have an old JAR working at the same time it won’t break at runtime.
Steps:
-
migrate your db to create the new column called
surname
. Now your db is inv2
-
copy the data from the
last_name
column tosurname
. NOTE that if you have a lot of this data then you should consider batch migration! -
write the code to use BOTH the new and the old column. Now your app is in version
2.0.0
-
read the surname value from
surname
column if it’s not null and fromlast_name
ifsurname
wasn’t set. You can remove thegetLastName()
from the code since it will produce nulls when your app is rolled back from3.0.0
to2.0.0
.
If you’re using Spring Boot Flyway first two steps will be performed upon booting the version 2.0.0
of the app. If you’re running
database versioning tool manually then you’d have to do it in separate processes (first manually upgrade the db version and then deploy
the new app).
Important
|
Remember that the newly created column MUST NOT be NOT NULL. If you rollback, the old app has no knowledge of the new
column and won’t set it upon Insert . But if you add that constraint and your db is in v2 it would require the value of the new
column to be set. That would result in constraint violations.
|
Important
|
You should remove the getLastName() method because in version 3.0.0 there is no notion of last_name column in the code.
That means that nulls will be set there. You can leave the method and add null-checks but a much better solution would be to ensure
that in the logic of getSurname() you pick the proper, non-null value.
|
The current situation is that we have an app deployed to production in version 1.0.0
and db in v1
. We want to deploy the second
instance of the app that will be in version 2.0.0
and update the db to v2
.
Steps:
-
a new instance is deployed in version
2.0.0
that updates the db tov2
-
in the meantime some requests got processed by instances being in version
1.0.0
-
the upgrade is successful and you have some instances working in
1.0.0
, others in2.0.0
. All are talking to db inv2
-
version
1.0.0
is not using the database’s columnsurname
and version2.0.0
is. They don’t interfere with each other, no exceptions should be thrown. -
version
2.0.0
is saving data to both old and new column thus it’s backward compatible
Important
|
If you have any queries that count items basing on values from old / new column you have to remember that now you have
duplicate values (most likely still being migrated). E.g. if you want to count the number of users whose last name (however you call it)
starts with a letter A then until the data migration (old → new column) is done you might have inconsistent data if you
perform the query against the new column.
|
The db contains a column called last_name
.
Initial Flyway script:
CREATE TABLE PERSON (
id BIGINT GENERATED BY DEFAULT AS IDENTITY,
first_name varchar(255) not null,
last_name varchar(255) not null
);
insert into PERSON (first_name, last_name) values ('Dave', 'Syer');
Script adding surname
column.
Warning
|
Remember NOT TO ADD any NOT NULL constraints to the added column. Cause if you rollback the JAR the old version doesn’t have the notion of the added column and automatically a NULL value will be set. In case of having a constraint the old application will blow up. |
-- NOTE: This field can't have the NOT NULL constraint cause if you rollback, the old version won't know about this field
-- and will always set it to NULL
ALTER TABLE PERSON ADD surname varchar(255);
-- WE'RE ASSUMING THAT IT'S A FAST MIGRATION - OTHERWISE WE WOULD HAVE TO MIGRATE IN BATCHES
UPDATE PERSON SET PERSON.surname = PERSON.last_name
We are storing data in both last_name
and surname
. Also, we are reading from the last_name
column cause
it is most up to date. During the deployment process some requests might have been processed by the instance that
hasn’t yet been upgraded.
/*
* Copyright 2012-2016 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package sample.flyway;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
@Entity
public class Person {
@Id
@GeneratedValue
private Long id;
private String firstName;
private String lastName;
private String surname;
public String getFirstName() {
return this.firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
/**
* Reading from the new column if it's set. If not the from the old one.
*
* When migrating from version 1.0.0 -> 2.0.0 this can lead to a possibility that some data in
* the surname column is not up to date (during the migration process lastName could have been updated).
* In this case one can run yet another migration script after all applications have been deployed in the
* new version to ensure that the surname field is updated.
*
* However it makes sense since when looking at the migration from 2.0.0 -> 3.0.0. In 3.0.0 we no longer
* have a notion of lastName at all - so we don't update that column. If we rollback from 3.0.0 -> 2.0.0 if we
* would be reading from lastName, then we would have very old data (since not a single datum was inserted
* to lastName in version 3.0.0).
*/
public String getSurname() {
return this.surname != null ? this.surname : this.lastName;
}
/**
* Storing both FIRST_NAME and SURNAME entries
*/
public void setSurname(String surname) {
this.lastName = surname;
this.surname = surname;
}
@Override
public String toString() {
return "Person [firstName=" + this.firstName + ", lastName=" + this.lastName + ", surname=" + this.surname
+ "]";
}
}
Version of the app: 3.0.0
Version of the DB: v3
By adding a new column and copying its contents we have created backward compatible changes of the db. ATM if we rollback the JAR / have an old JAR working at the same time it won’t break at runtime.
The current situation is that we have app in version 3.0.0
and db in v3
. Version 3.0.0
is not storing data
into the last_name
column. That means that most up to date information is stored in the surname
column.
Steps:
-
roll back your app to version
2.0.0
. -
version
2.0.0
is using bothlast_name
andsurname
column. -
version
2.0.0
will pick firstsurname
column if it’s not null and if that’s not the case then it will picklast_name
There are no structure changes in the DB. The following script is executed that performs the final migration of old data:
-- WE'RE ASSUMING THAT IT'S A FAST MIGRATION - OTHERWISE WE WOULD HAVE TO MIGRATE IN BATCHES
-- ALSO WE'RE NOT CHECKING IF WE'RE NOT OVERRIDING EXISTING ENTRIES. WE WOULD HAVE TO COMPARE
-- ENTRY VERSIONS TO ENSURE THAT IF THERE IS ALREADY AN ENTRY WITH A HIGHER VERSION NUMBER
-- WE WILL NOT OVERRIDE IT.
UPDATE PERSON SET PERSON.surname = PERSON.last_name;
-- DROPPING THE NOT NULL CONSTRAINT; OTHERWISE YOU WILL TRY TO INSERT NULL VALUE OF THE LAST_NAME
-- WITH A NOT_NULL CONSTRAINT.
ALTER TABLE PERSON MODIFY COLUMN last_name varchar(255) NULL DEFAULT NULL;
We are storing data in both last_name
and surname
. Also, we are reading from the last_name
column cause
it is most up to date. During the deployment process some requests might have been processed by the instance that
hasn’t yet been upgraded.
/*
* Copyright 2012-2016 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package sample.flyway;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
@Entity
public class Person {
@Id
@GeneratedValue
private Long id;
private String firstName;
private String surname;
public String getFirstName() {
return this.firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
public String getSurname() {
return this.surname;
}
public void setSurname(String lastname) {
this.surname = lastname;
}
@Override
public String toString() {
return "Person [firstName=" + this.firstName + ", surname=" + this.surname
+ "]";
}
}
Version of the app: 4.0.0
Version of the DB: v4
Since the code of version 3.0.0
wasn’t using last_name
column, if we roll back to 3.0.0
after removing the
column from the database then nothing bad will happen at runtime.
We will do it in the following way:
01) Run 1.0.0
02) Wait for the app (1.0.0) to boot
03) Generate a person by calling POST localhost:9991/person to version 1.0.0
04) Run 2.0.0
05) Wait for the app (2.0.0) to boot
06) Generate a person by calling POST localhost:9991/person to version 1.0.0
07) Generate a person by calling POST localhost:9992/person to version 2.0.0
08) Kill app (1.0.0)
09) Run 3.0.0
10) Wait for the app (3.0.0) to boot
11) Generate a person by calling POST localhost:9992/person to version 2.0.0
12) Generate a person by calling POST localhost:9993/person to version 3.0.0
13) Kill app (3.0.0)
14) Run 4.0.0
15) Wait for the app (4.0.0) to boot
16) Generate a person by calling POST localhost:9993/person to version 3.0.0
17) Generate a person by calling POST localhost:9994/person to version 4.0.0
Starting app in version 1.0.0
Generate a person in version 1.0.0
Sending a post to 127.0.0.1:9991/person. This is the response:
{"firstName":"52b6e125-4a5c-429b-a47a-ef18bbc639d2","lastName":"52b6e125-4a5c-429b-a47a-ef18bbc639d2"}
Starting app in version 2.0.0
Generate a person in version 1.0.0
Sending a post to 127.0.0.1:9991/person. This is the response:
{"firstName":"e41ee756-4fa7-4737-b832-e28827a00deb","lastName":"e41ee756-4fa7-4737-b832-e28827a00deb"}
Generate a person in version 2.0.0
Sending a post to 127.0.0.1:9992/person. This is the response:
{"firstName":"0c1240f5-649a-4bc5-8aa9-cff855f3927f","lastName":"0c1240f5-649a-4bc5-8aa9-cff855f3927f","surname":"0c1240f5-649a-4bc5-8aa9-cff855f3927f"}
Killing app 1.0.0
Starting app in version 3.0.0
Generate a person in version 2.0.0
Sending a post to 127.0.0.1:9992/person. This is the response:
{"firstName":"74d84a9e-5f44-43b8-907c-148c6d26a71b","lastName":"74d84a9e-5f44-43b8-907c-148c6d26a71b","surname":"74d84a9e-5f44-43b8-907c-148c6d26a71b"}
Generate a person in version 3.0.0
Sending a post to 127.0.0.1:9993/person. This is the response:
{"firstName":"c6564dbe-9ab5-40ae-9077-8ae6668d5862","surname":"c6564dbe-9ab5-40ae-9077-8ae6668d5862"}
Killing app 2.0.0
Starting app in version 4.0.0
Generate a person in version 3.0.0
Sending a post to 127.0.0.1:9993/person. This is the response:
{"firstName":"cbe942fc-832e-45e9-a838-0fae25c10a51","surname":"cbe942fc-832e-45e9-a838-0fae25c10a51"}
Generate a person in version 4.0.0
Sending a post to 127.0.0.1:9994/person. This is the response:
{"firstName":"ff6857ce-9c41-413a-863e-358e2719bf88","surname":"ff6857ce-9c41-413a-863e-358e2719bf88"}
In comparison to v3
we’re just removing the last_name
column and add missing constraints.
-- REMOVE THE COLUMN
ALTER TABLE PERSON DROP last_name;
-- ADD CONSTRAINTS
UPDATE PERSON SET surname='' WHERE surname IS NULL;
ALTER TABLE PERSON ALTER COLUMN surname VARCHAR NOT NULL;
We have successfully applied the backward incompatible change of renaming the column by doing a couple of backward compatible deploys. Here you can find the summary of the performed actions:
-
deploy version
1.0.0
of the application withv1
of db schema (column name =last_name
) -
deploy version
2.0.0
of the application that saves data tolast_name
andsurname
columns. The app reads fromlast_name
column. Db is in versionv2
containing bothlast_name
andsurname
columns. Thesurname
column is a copy of thelast_name
column. (NOTE: this column must not have the not null constraint) -
deploy version
3.0.0
of the application that saves data only tosurname
and reads fromsurname
. As for the db the final migration oflast_name
tosurname
takes place. Also the NOT NULL constraint is dropped fromlast_name
. Db is now in versionv3
-
deploy version
4.0.0
of the application - there are no changes in the code. Deploy db inv4
that first preforms a final migration oflast_name
tosurname
and removes thelast_name
column. Here you can add any missing constraints
By following this approach you can always rollback one version back without breaking the database / application compatibility.
All the code used in this article is available at Github. Below you can find some additional description.
Once you clone the repo you’ll see the following folder structure.
├── boot-flyway-v1 - 1.0.0 version of the app with v1 of the schema
├── boot-flyway-v2 - 2.0.0 version of the app with v2 of the schema (backward-compatible - app can be rolled back)
├── boot-flyway-v2-bad - 2.0.0.BAD version of the app with v2bad of the schema (backward-incompatible - app cannot be rolled back)
├── boot-flyway-v3 - 3.0.0 version of the app with v3 of the schema (app can be rolled back)
└── boot-flyway-v4 - 4.0.0 version of the app with v4 of the schema (app can be rolled back)
You can run the scripts to execute the scenario that shows the backward compatible and incompatible changes applied to the db.
To check the backward compatible case just run:
./scripts/scenario_backward_compatible.sh
To check the backward incompatible case just run:
./scripts/scenario_backward_incompatible.sh
All samples are clones of the Spring Boot Sample Flyway
project.
You can look at http://localhost:8080/flyway
to review the list of scripts.
The sample also enables the H2 console (at http://localhost:8080/h2-console
)
so that you can review the state of the database (the default jdbc url is
jdbc:h2:mem:testdb
).