Generating build scripts from a common declarative spec #145

Open
tompollard opened this Issue Nov 3, 2016 · 3 comments

Owner

tompollard commented Nov 3, 2016

There are now several build scripts for MIMIC (https://github.com/MIT-LCP/mimic-code/tree/master/buildmimic). Only the Postgres scripts are kept updated by MIT-LCP, so scripts for other database systems fall out of sync without community development.

@pszolovits has suggested:

  • (a) generating the scripts automagically from a common declarative spec (perhaps the “canonical” db)
  • (b) running some automated tests to verify that the scripts run correctly.

This is a good idea! @parisni, any suggestions for setting this up?

Contributor

parisni commented Nov 3, 2016

Hi Tom,

Good question. Here is a kind of answer.

For (a): basically, buildmimic has 4 scripts:

  1. create table
  2. load data
  3. create indexes
  4. expert user tricks

Some of these points could be automated from a canonical model. BTW, tools already exist for that purpose (e.g. http://www.liquibase.org/databases.html). With its "standard" model (an xml/yaml file), it is able to address points 1, 2, and 3 by auto-generating generic scripts. Moreover, such tools come with some interesting features such as versioning. Most importantly, it is open-source and maintained.
For sure, the auto-generated scripts would not work 100% perfectly, and little scripts (sed or similar tools) would be needed to automatically modify the generic scripts and make MIMIC work.

Some points cannot be automated (4. expert user tricks). The expert tricks could be pieces of expert advice describing how to increase performance on their favorite database, starting from the auto-generated generic scripts.

This kind of 3-step pipeline (e.g. i = liquibase; ii = sed script; iii = expert tricks recipe) would be able to produce an up-to-date buildmimic.
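
To make the 3-step idea concrete, here is a minimal sketch in plain Python. It does not use liquibase itself: a small dictionary stands in for the canonical model, and the table, column, and function names are all hypothetical placeholders, not the real MIMIC schema. Step i generates generic DDL per dialect, step ii applies sed-like text replacements, and step iii appends hand-written expert statements. The load-data step is left out for brevity.

```python
# Hypothetical declarative spec; names are placeholders, not the MIMIC schema.
SPEC = {
    "tables": [
        {
            "name": "example_patients",
            "columns": [
                {"name": "subject_id", "type": "int", "nullable": False},
                {"name": "admit_time", "type": "timestamp", "nullable": True},
                {"name": "note", "type": "text", "nullable": True},
            ],
            "indexes": [["subject_id"]],
        }
    ]
}

# Assumed mapping from generic types to dialect-specific types.
TYPE_MAP = {
    "postgres": {"int": "INTEGER", "timestamp": "TIMESTAMP", "text": "TEXT"},
    "mysql": {"int": "INT", "timestamp": "DATETIME", "text": "TEXT"},
}


def create_table_sql(table, dialect):
    """Step i: render one CREATE TABLE statement for the given dialect."""
    types = TYPE_MAP[dialect]
    cols = []
    for col in table["columns"]:
        null = "" if col["nullable"] else " NOT NULL"
        cols.append(f"  {col['name']} {types[col['type']]}{null}")
    return f"CREATE TABLE {table['name']} (\n" + ",\n".join(cols) + "\n);"


def create_index_sql(table):
    """Step i: render one CREATE INDEX statement per declared index."""
    stmts = []
    for cols in table.get("indexes", []):
        name = f"{table['name']}_{'_'.join(cols)}_idx"
        stmts.append(f"CREATE INDEX {name} ON {table['name']} ({', '.join(cols)});")
    return stmts


def apply_patches(sql, patches):
    """Step ii: sed-like literal replacements on the generic script."""
    for old, new in patches:
        sql = sql.replace(old, new)
    return sql


def build_script(spec, dialect, patches=(), expert_tricks=()):
    """Run steps i to iii and return the final script as one string."""
    parts = []
    for table in spec["tables"]:
        parts.append(create_table_sql(table, dialect))
        parts.extend(create_index_sql(table))
    patched = apply_patches("\n".join(parts), patches)
    # Step iii: append hand-written, database-specific statements as-is.
    return "\n".join([patched, *expert_tricks])


if __name__ == "__main__":
    print(build_script(SPEC, "postgres"))
```

The patches and expert tricks are kept as data outside the generator on purpose, so that someone who knows one database well could contribute them without touching the generic part.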

About part (b): it is possible to create a kind of database test that would be run by MIT-LCP before releasing a new MIMIC version:

  1. install database with version XX (eg: docker container, or cloud based solution)
  2. run the buildmimic scripts
  3. run some sql tests and compare results
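
The three steps above can be sketched as follows. This is only an illustration under strong assumptions: an in-memory SQLite database from the Python standard library stands in for the real database started in a container, and the build statements, table name, and expected results are hypothetical placeholders.

```python
import sqlite3

# Hypothetical build scripts; a real run would read the buildmimic SQL files.
BUILD_SCRIPTS = [
    "CREATE TABLE example_patients (subject_id INTEGER NOT NULL, gender TEXT);",
    "INSERT INTO example_patients VALUES (1, 'F'), (2, 'M'), (3, 'F');",
    "CREATE INDEX example_patients_subject_id_idx ON example_patients (subject_id);",
]

# Each check pairs a query with the rows it is expected to return.
CHECKS = [
    ("SELECT COUNT(*) FROM example_patients", [(3,)]),
    (
        "SELECT gender, COUNT(*) FROM example_patients "
        "GROUP BY gender ORDER BY gender",
        [("F", 2), ("M", 1)],
    ),
]


def run_build(conn, scripts):
    """Step 2: run every build script in order."""
    for script in scripts:
        conn.executescript(script)


def run_checks(conn, checks):
    """Step 3: run each test query and collect mismatches."""
    failures = []
    for query, expected in checks:
        actual = conn.execute(query).fetchall()
        if actual != expected:
            failures.append((query, expected, actual))
    return failures


def verify_build(scripts, checks):
    """Step 1 stand-in: open a fresh database, build it, then test it."""
    conn = sqlite3.connect(":memory:")
    try:
        run_build(conn, scripts)
        return run_checks(conn, checks)
    finally:
        conn.close()


if __name__ == "__main__":
    problems = verify_build(BUILD_SCRIPTS, CHECKS)
    print("all checks passed" if not problems else problems)
```

For another database system, only the connection step would need to change, provided its driver follows the same Python DB-API style of `execute` and `fetchall`.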

But I am not sure that everything I describe is worthwhile. Maintaining such pipelines needs experts in all kinds of databases to modify them at each release. My point of view is that the MIMIC community is, for now, able to maintain manual scripts, and the focus should be on stating precisely which modifications affect a release so that experts can change the scripts. And if scripts are not up to date, this simply means nobody uses that database. This way of working is also more flexible for adding new kinds of databases, for example, I guess.
The only thing I find really useful is translating buildmimic into a liquibase script, in order to be able to auto-generate generic skeleton scripts for new databases or to maintain existing ones.

Owner

alistairewj commented Dec 21, 2016

Just had a go at this @parisni - see https://github.com/MIT-LCP/mimic-code/blob/211a04e2f629dbaafe6979cd2047c95b4c54bf5d/buildmimic/mimic-iii.yaml - It doesn't include chartevents at the moment but roughly follows the liquibase spec. Have you ever used the software before?

Contributor

parisni commented Dec 21, 2016

Hi Alistair,

Yes, I introduced this tool a few years ago in my team, and we use it to version the data warehouse structure (DDL). Since we are focused on Postgres, we do not use the yaml syntax but the sql one ( http://www.liquibase.org/documentation/sql_format.html ), which allows writing our own rollbacks. Each modification is logged in specific tables. Good practice for team work.

But in your use case, the yaml syntax is the best choice.
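
For readers who have not seen the sql syntax mentioned above, here is a small Python sketch that writes out a changelog in that style: a `--liquibase formatted sql` header, then one `--changeset author:id` comment per change, followed by the SQL and a `--rollback` comment. The helper name, the author, and the table are hypothetical, and the linked documentation remains the reference for the exact format.

```python
def formatted_sql_changelog(changesets):
    """Render changesets as a liquibase-style formatted SQL changelog."""
    lines = ["--liquibase formatted sql"]
    for cs in changesets:
        lines.append("")
        lines.append(f"--changeset {cs['author']}:{cs['id']}")
        lines.append(cs["sql"])
        # The rollback is written by hand, next to the change it undoes.
        lines.append(f"--rollback {cs['rollback']}")
    return "\n".join(lines) + "\n"


# Hypothetical changeset; author, id, and table are placeholders.
EXAMPLE = [
    {
        "author": "example_author",
        "id": "1",
        "sql": "CREATE TABLE example_patients (subject_id INTEGER NOT NULL);",
        "rollback": "DROP TABLE example_patients;",
    }
]

if __name__ == "__main__":
    print(formatted_sql_changelog(EXAMPLE))
```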
