Improve database reloader #2

steve-chavez · 2017-04-12T04:21:07Z

Right now the sub0_devtools tries to run the user sql scripts and recreates the db in case of failure, this cycle is slower on later stages of development and in projects with a considerable size of master data.

I have a modest dataset related to some local GIS app that takes 12 seconds to recreate.

For now I propose using apgdiff, with this migrations are easier, say we have:

-- v1.sql
CREATE TABLE user (
    email TEXT,
    pass TEXT NOT NULL CHECK (length(pass) < 512),
    first_name TEXT
);
-- v2.sql
CREATE TABLE user (
    email TEXT,
    pass TEXT NOT NULL CHECK (length(pass) < 512),
    first_name TEXT,
    last_name TEXT
);

To migrate from v1.sql to v2.sql we can do this:

java -jar apgdiff-2.4.jar v1.sql v2.sql

This outputs:

ALTER TABLE "user" ADD COLUMN last_name TEXT;

To reverse from v2.sql to v1.sql:

java -jar apgdiff-2.4.jar v2.sql v1.sql

This outputs:

ALTER TABLE "user" DROP COLUMN last_name;

I think this tool approach for migrations is better than the ones that rely on convention and discipline like sqitch or pgrebase.

The thing is apgdiff only operates on sql files, so I could do a pg_dump -s on the user database and just run apgdiff with this and the user current db defined in his folder structure then apply the diff.

The text was updated successfully, but these errors were encountered:

ruslantalpa · 2017-04-12T04:28:56Z

yes, i was looking into using it but for migration from dev to production, but the usecase here is also good.

If you can get this approach working then perfect. What i am not sure will work is the fact that in the project structure, in sql files, i use \ir, can this tool handle that or does it need a flat sql file with no psql instructions?

Anyway, you know the thing we are trying to accomplish, "any change in the sql files is immediately reflected in the db, then the containers are refreshed to reload the new schema".
Feel free to try other approaches to make the process faster. It would be a gain if the approach you suggested works in 90% of the time and when it does not, we fall back to resetting the db.

steve-chavez · 2017-04-13T03:02:59Z

I've been trying working the sub0_kickstart example with apgdiff, and It's giving lots of errors:

The psql slash commands like \{set,echo} only get ignored if they end with ; if they do not end with this seems the next statement that ends with a ; is ignored.

  -- v1.sql
  \set QUIET on
  create schema data;
  -- v2.sql
  \set QUIET on
  create schema data;
  create table data.items(id int);
  -- diff.sql
  Exception in thread "main" java.lang.RuntimeException: Cannot find schema 'data' for statement 'create table data.items(id int);'. 
  Missing CREATE SCHEMA statement?

When adding a new check constraint apgdiff thinks is a new column and it outputs an invalid syntax in the form of ALTER COLUMN "check".

-- v1.sql
create schema data;
create table data.users (  
  firstname            text not null,
  check (length(firstname)>2)
);
-- v2.sql
create schema data;
create table data.users (
  firstname            text not null,
  lastname             text not null,
  check (length(firstname)>2),
  check (length(lastname)>2)
);
-- diff.sql
SET search_path = data, pg_catalog;

ALTER TABLE users
  ADD COLUMN lastname text NOT NULL,
  ALTER COLUMN "check" TYPE (length(lastname)>2) /* TYPE change - table: users original: (length(firstname)>2) new: (length(lastname)>2) */;

Outputs invalid syntax instead of adding a CONSTRAINT when adding REFERENCES to a column.

-- v1.sql
create schema data;
create table data.tasks (
  id           serial primary key,
  project_id   int not null
);
-- v2.sql
create schema data;
create table data.projects( id serial primary key );
create table data.tasks (
  id           serial primary key,
  project_id   int not null references data.projects(id)
);
-- diff.sql
SET search_path = data, pg_catalog;
CREATE TABLE projects (
        id serial primary key
);
ALTER TABLE tasks
        ALTER COLUMN project_id TYPE int not null references data.projects(id) /* TYPE change - table: tasks original: int new: int not null references data.projects(id) */,
        ALTER COLUMN project_id DROP NOT NULL;
-- ERROR:  42601: syntax error at or near "not"
-- ALTER COLUMN project_id TYPE int not null referenc...

Cannot deduce a SERIAL implicit SEQUENCE, failing to do a GRANT.

-- v1.sql
create schema data;
create table data.tasks (
  id           serial primary key,
);
-- v2.sql
create schema data;
create table data.tasks (
  id           serial primary key,
);
create role webuser;
grant usage on sequence data.tasks_id_seq to webuser;
-- diff.sql
Exception in thread "main" java.lang.RuntimeException: Cannot find sequence 'data.tasks_id_seq' for statement 'grant usage on sequence data.tasks_id_seq to webuser;'. 
Missing CREATE SEQUENCE?

The list is not comprehensive I bet there'll be more errors. I think apgdiff has been thought just for handling simple ddl's. This is not counting the \ir that is maybe a worst issue, at first I thought just doing a cat but I forgot the order matters.

steve-chavez · 2017-04-13T03:05:39Z

Patching apgdiff java parser may be too much of an effort, I think there's a better solution for this.

Instead of a custom parser for pg sql I thought of reusing pg parser, there's is an official guide for this
and it seems is already done for select statements, using this would guarantee syntax correctness and sync with newer releases.

For our use case I'd have to include ddl and dcl statements, these are also included in pg parser.c, maybe I could patch libpg_query or see if I can reuse pg code in another way.

The other issue about psql slash commands, I've seen pg code and it handles in a more simpler way separate from the parser.c, I think just the \{i,ir} could be supported.

This pgdiffer could be another project under the subZero organization, separate from the devtools, if it's done maybe later can be open sourced for having more eyes for the bugs.

It's more of a long term project though but I think it's feasible and it would not only benefit the devtools but solve more complex migration issues, if you think is worth it I could research a bit more and then do a small proof of concept and see where it goes.

ruslantalpa · 2017-04-13T03:48:00Z

a good pgdiff that understands all the commands is good and probably will be easyer to write in haskell where you have custom types and easey parsing but it's a complex project so let's put tht on hold, let's get the devtools working with the curent approach of resetting (which could be made faster probably with some tricks)

steve-chavez · 2017-04-13T14:06:32Z

Seeing the amount of errors apgdiff throws with the kickstart I think that it'll be better to not include it in the devtools console, what other tricks could make the resetting faster?

ruslantalpa · 2017-04-13T14:38:55Z

I don't know but reloading a schema with a smal dataset is not a huge deal, it should not take more then 10s (my old laptop, the kickstart is only 2 s) But let's leave this for later. Console now :)

…

On 13 Apr 2017, at 17:06, Steve Chávez ***@***.***> wrote: Seeing the amount of errors apgdiff throws with the kickstart I think that it'll be better to not include it in the devtools console, what other tricks could make the resetting faster? — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.

ruslantalpa closed this as completed May 11, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve database reloader #2

Improve database reloader #2

steve-chavez commented Apr 12, 2017

ruslantalpa commented Apr 12, 2017

steve-chavez commented Apr 13, 2017

steve-chavez commented Apr 13, 2017

ruslantalpa commented Apr 13, 2017

steve-chavez commented Apr 13, 2017

ruslantalpa commented Apr 13, 2017 via email

Improve database reloader #2

Improve database reloader #2

Comments

steve-chavez commented Apr 12, 2017

ruslantalpa commented Apr 12, 2017

steve-chavez commented Apr 13, 2017

steve-chavez commented Apr 13, 2017

ruslantalpa commented Apr 13, 2017

steve-chavez commented Apr 13, 2017

ruslantalpa commented Apr 13, 2017 via email