
Script naming

SqlHawk will accept any script file matching the pattern UpgradeScripts/*.sql, including scripts in subfolders (see Branch handling below).

It is recommended that scripts are named as follows:

n_description.sql

where n is an integer without leading zeros. I recommend against leading zeros because Windows Explorer and similar tools are capable of recognising a leading number and sorting correctly these days, and using, say, "01" would cause nasty jumps in the ordering if you started with "01" and later end up with 100 scripts.

Numbers in folder and file names are used only to set the order in which scripts run. The tool records each script's path as the key in the metadata table of scripts that have already been run, checks for any scripts that have not yet been run, and runs them in numbered path and filename order. If a folder or script name does not start with a number, the scripts are simply run in alphabetical order.
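For example (hypothetical names), a flat set of upgrade scripts might look like this and would run in the order shown:

  UpgradeScripts/1_create_customer_table.sql
  UpgradeScripts/2_add_customer_email.sql
  UpgradeScripts/3_create_order_table.sql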

SQL batches (GO)

GO batches are supported to simplify development: script files are split on the GO keyword and each batch is run separately (although all within the same transaction, where the DBMS in use supports it).
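As an illustrative sketch (the file and table names here are hypothetical), a script such as the following would be split into two batches at each GO, with both batches running in one transaction where the DBMS allows it:

  -- UpgradeScripts/4_add_customer_email.sql (hypothetical)
  ALTER TABLE Customer ADD Email nvarchar(200) NULL;
  GO
  -- second batch: can safely reference the column added by the first batch
  UPDATE Customer SET Email = '' WHERE Email IS NULL;
  GO

Splitting on GO matters because statements like the UPDATE above, which reference a freshly added column, often need the preceding DDL to have been run as a separate batch.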

Tracking table

A table named SqlHawk_UpgradeLog is created and used. It records an upgrade id, the date each script was applied, the path of the script that was run, and a Batch description field for the output of "git describe" or other comments.

You need to run --initialize-tracking to create the table if it's missing; SqlHawk will not run upgrade scripts if the tracking table is not there (to help avoid accidentally running against the wrong database).

The table definition is:

  • UpgradeId int not null auto_increment: a simple primary key that makes ordering easy to see; it can also be used to indicate the version of a database for display in a UI
  • UpgradeDate TIMESTAMP DEFAULT CURRENT_TIMESTAMP: when the individual script was applied; this will differ slightly across the scripts of a batch
  • Batch nvarchar(200): a user-generated string that groups the scripts of one upgrade into a single batch, e.g. the output of git describe, or an svn revision number
  • ScriptPath nvarchar(400): which script was run; used to prevent scripts being run twice
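Put together, the tracking table is roughly equivalent to the sketch below (SQL Server flavoured; the list above mixes conventions such as auto_increment and nvarchar, and the exact DDL created by --initialize-tracking depends on the target DBMS and may differ):

  CREATE TABLE SqlHawk_UpgradeLog (
      UpgradeId   int IDENTITY(1,1) NOT NULL PRIMARY KEY,      -- ordering / simple database "version"
      UpgradeDate datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, -- when this particular script was applied
      Batch       nvarchar(200) NULL,                          -- groups the scripts of one upgrade, e.g. "git describe" output
      ScriptPath  nvarchar(400) NULL                           -- which script was run; prevents running it twice
  );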

Branch handling

Handling multiple feature branches of development:

SqlHawk supports the following folder structure, allowing you to keep upgrade scripts for different branches in different subfolders and thus avoid direct conflicts and confusion when branches are merged.

  • UpgradeScripts
    • 01_global.sql
    • 01_feature_A
      • 01_new_table.sql
      • 02_new_data.sql
    • 02_feature_B
      • 01_some_other_table.sql
      • 02_some_other_data.sql
      • 03_modify_something_else.sql
      • micro_change
        • 01_mini.sql

It is still possible for two branches to be incompatible; for example, if both modify the same table using the temp-table/copy/drop method, whichever runs second will fail due to the altered schema, or worse, silently drop data. Having said that, I don't want this tool to add unnecessary overhead for the common case where two branches make independent changes, hence the approach taken.

The scripts are run in folder order, outermost folder first, then in filename order. This way, even if feature A is developed in a different branch to feature B, there is no need to renumber the scripts when the branches are merged.
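As a sketch of how that plays out for the example tree above (leaving aside exactly where 01_global.sql and the micro_change folder are interleaved, which depends on how the tool compares folders against files), the feature scripts run in this order:

  UpgradeScripts/01_feature_A/01_new_table.sql
  UpgradeScripts/01_feature_A/02_new_data.sql
  UpgradeScripts/02_feature_B/01_some_other_table.sql
  UpgradeScripts/02_feature_B/02_some_other_data.sql
  UpgradeScripts/02_feature_B/03_modify_something_else.sql

New scripts added on the feature A branch therefore never collide with the numbers used on the feature B branch.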

The risk of clashing changes needs more thought, but my current thinking is that before/after schema and data validation is the ideal long-term solution (in conjunction with a C.I. environment, which in my opinion is a given).

Notes

Developers need to be aware that scripts will never be run twice against the same database, so if a script has already been applied (to the production database, for example), modifying that script will have no effect; a new correction script will have to be written instead.

If you follow the recommended build process for test databases (see Using sqlHawk for database versioning), repeatedly dropping the database, restoring it from a backup and applying any outstanding scripts, then modified scripts will be re-run on your test environments where they are a higher version than the backup copy of the database.