Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate from SQLite to PostgreSQL #716

Closed
pbanaszkiewicz opened this issue Mar 1, 2016 · 15 comments · Fixed by #1750
Closed

Migrate from SQLite to PostgreSQL #716

pbanaszkiewicz opened this issue Mar 1, 2016 · 15 comments · Fixed by #1750

Comments

@pbanaszkiewicz
Copy link
Contributor

Why? Because AMY is going to serve a higher load and SQLite isn't meant for that.

Why PostgreSQL? Because it has a really nice support in Django.

Challenges:

  1. Reworking development process. SQLite is excellent for newcomers, while PostgreSQL requires at least something installed: Docker, Vagrant or "pure" PostgreSQL server.
  2. Running continuous integration on PostgreSQL.
  3. Installation and secure configuration of PostgreSQL on the server.
  4. Database backup.
  5. settings.py that works with both servers (use https://github.com/kennethreitz/dj-database-url ?).
@wking
Copy link
Contributor

wking commented Mar 1, 2016

On Tue, Mar 01, 2016 at 02:20:06PM -0800, Piotr Banaszkiewicz wrote:

  1. Reworking development process. SQLite is excellent for newcomers,
    while PostgreSQL requires at least something installed: Docker,
    Vagrant or "pure" PostgreSQL server.

Do we need any of PostgreSQL-specific features? I expect we can
develop on SQLite and deploy to PostgreSQL. Folks who are concerned
about portability can always test on PostgreSQL too, or adjust the CI
testing to do that.

  1. Database backup.

This is going to be pretty similar to SQLite, you just have to adjust
the backup job to dump the database to a file before it's existing
“copy off to somewhere safe”.

@pbanaszkiewicz
Copy link
Contributor Author

Hi @wking,

Do we need any of PostgreSQL-specific features? I expect we can develop on SQLite and deploy to PostgreSQL.

If we were using PostgreSQL for development too, then it would be safer to use PostgreSQL-specific features – and I know a few places we could use them (e.g. ArrayField for Tags)

Anyway, this is just something I was thinking for a while. It may take some time before we actually go this way.

@wking
Copy link
Contributor

wking commented Mar 2, 2016

On Tue, Mar 01, 2016 at 10:36:04PM -0800, Piotr Banaszkiewicz wrote:

Do we need any of PostgreSQL-specific features? I expect we can
develop on SQLite and deploy to PostgreSQL.

If we were using PostgreSQL for development too, then it would be
safer to use PostgreSQL-specific features – and I know a few places
we could use them (e.g. ArrayField for Tags)

I'm not sure that benefit is worth breaking SQLite for dev/testing.
You can get to PostgreSQL for production scaling 1 without breaking
SQLite for dev/testing, and I think everyone would agree that is a
win. Whether using PostgreSQL-specific features and breaking SQLite
for dev/testing is a net win is less clear to me. So I'm in favor of
just switching to PostgreSQL in production (no new PostgreSQL-specific
code), and seeing how that works before committing all the eggs to the
PostgreSQL basket ;). I'm personally a big fan of PostgreSQL, but it
does make the development setup a bit more complicated.

@sburns
Copy link
Contributor

sburns commented Mar 2, 2016

We can agree that if production runs PostgreSQL, then testing must be done with it. Travis provides docs about integrating with the db server and it looks relatively painless.

The next question is whether local development must use postgres or keep using sqlite. I think that depends on if you're testing locally. Obviously if a PR comes in that was developed against sqlite and CI tests don't pass, it won't be merged until they do. But it could leave a sour taste in the mouth of the submitter; if the docs say it doesn't matter if you use postgres or sqlite in development but a PR isn't accepted even though tests passed locally, that seems like a contradiction to me.

FWIW I've found PostgreSQL to be much easier to install than python. On OS X at least, there's Postgres.app and the homebrew formula (which if you go down that route, brew services postgresql {start,stop} can effectively keep postgres on when you need it and off when you don't). I assume linux installation is straight foward as well but have no idea about windows.

@wking
Copy link
Contributor

wking commented Mar 2, 2016

On Wed, Mar 02, 2016 at 05:43:48AM -0800, Scott Burns wrote:

We can agree that if production runs PostgreSQL, then testing must
be done with it…

I think that would be nice to have, but it's not a hard requirement
for me personally. The odds of accidentally doing something
SQLite-specific seem small, and in the event that something like that
slips through, I'm having trouble imagining it having serious
consequences. But that's just my 2¢, I'm neither the project
maintainer nor the amy.software-carpentry.org sysadmin ;).

@sburns
Copy link
Contributor

sburns commented Mar 2, 2016

The odds of accidentally doing something SQLite-specific seem small,

It's less about doing SQLite-specific stuff and more about thinking operations exercised in the test suite are valid when in production PostgreSQL doesn't (e.g. SQLite doesn't enforce referential integrity by default).

Alas, I too am not the maintainer or sys-admin so take this soapbox with a grain of salt :)

@wking
Copy link
Contributor

wking commented Mar 2, 2016

On Wed, Mar 02, 2016 at 01:55:14PM -0800, Scott Burns wrote:

e.g. SQLite doesn't enforce referential integrity by default

I'm all for turning that on, whether in SQLite 1 or by switching to
PostgreSQL for testing. And I'm not against using PostgreSQL for
testing, I'm just against making AMY require a particular flavor of
SQL. Then folks can test and/or deploy using whichever database
engine they like.

@pbanaszkiewicz
Copy link
Contributor Author

Hi both,

I'm really glad for your insights – that's a discussion I wanted to have when I created this issue.

I think we'll start with "SQLite for dev, PostgreSQL for prod" approach first since that seems to be most popular option.

Thanks,
Piotr

@sburns
Copy link
Contributor

sburns commented Mar 3, 2016

It appears like AMY_DEBUG environment variable is already used for switching between debug/production. So near settings.py#L166 we might branch and separately define the DATABASES based on DEBUG.

Fortunately amy doesn't have a complicated settings.py file :)

@pbanaszkiewicz
Copy link
Contributor Author

@sburns I was thinking about https://github.com/kennethreitz/dj-database-url too, but these are "implementation details"…

@pbanaszkiewicz pbanaszkiewicz modified the milestone: v1.5.3 Apr 7, 2016
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v1.5.3, v1.5.4 Apr 20, 2016
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v1.6, v1.5.4 May 20, 2016
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v1.6, v1.7 May 29, 2016
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v1.7, v1.8 Aug 1, 2016
@pbanaszkiewicz pbanaszkiewicz removed this from the v1.8 milestone Sep 4, 2016
@gvwilson gvwilson removed this from the v1.9 milestone Sep 28, 2016
@chrismedrela
Copy link
Contributor

Before migrating we need to fix #1044.

@pbanaszkiewicz pbanaszkiewicz added this to the v2.0 milestone Jun 24, 2018
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v2.0, v2.1 Jul 30, 2018
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v2.1, v2.2 Sep 20, 2018
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v2.2, v2.3 Oct 8, 2018
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v2.3, v2.4 Dec 1, 2018
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v2.4, v2.5 Dec 29, 2018
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v2.5, v2.6 Jan 30, 2019
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v2.6, v2.7 Feb 28, 2019
@pbanaszkiewicz pbanaszkiewicz modified the milestones: v2.7, v2.8 Mar 29, 2019
@pbanaszkiewicz pbanaszkiewicz removed this from the v2.8 milestone Jun 19, 2019
@maneesha
Copy link
Contributor

@pbanaszkiewicz In an upcoming cycle, could we outline:

  • what it would take to migrate from SQLite to Postgres
  • why we may want to consider it
  • how it will impact our other systems (like Redash)

Leaving this as a reminder for now to revisit soon.

@pbanaszkiewicz
Copy link
Contributor Author

@fmichonneau @maneesha Took us almost 5 years :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment