
Options to use local Docker for temporary schema (#45)

BACKGROUND:

Skeema's behavior does not rely on parsing SQL DDL, as this can be too brittle across various MySQL versions and vendors, which have subtle differences in features and functionality. Instead, Skeema uses metadata reported directly from the database to introspect schemas, using information_schema as well as various SHOW commands.

In order to accurately introspect the schemas represented in your filesystem's *.sql files, Skeema actually runs the files' CREATE TABLE statements in a temporary location, now called a "workspace." Previously (and still by default), Skeema creates, uses, and then drops a temporary schema on each database it interacts with.
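
As a rough illustration of the temporary-schema lifecycle described above, the sketch below lists the SQL a workspace of this kind would run, in order. The helper `workspaceStatements` and the schema name `_skeema_tmp` are assumptions made for this example, not Skeema's actual implementation; in a real run, introspection via information_schema and SHOW commands would happen just before the final DROP.

```go
package main

import (
	"fmt"
	"strings"
)

// workspaceStatements returns the ordered SQL for a temporary-schema
// workspace: create the schema, switch to it, run each CREATE TABLE,
// then drop the schema. Hypothetical helper, for illustration only.
func workspaceStatements(tempSchema string, createTables []string) []string {
	// Quote the identifier, doubling any embedded backticks.
	quoted := "`" + strings.ReplaceAll(tempSchema, "`", "``") + "`"
	stmts := []string{"CREATE DATABASE " + quoted, "USE " + quoted}
	stmts = append(stmts, createTables...)
	// Introspection of the resulting tables would occur here, before cleanup.
	stmts = append(stmts, "DROP DATABASE "+quoted)
	return stmts
}

func main() {
	creates := []string{"CREATE TABLE users (id int unsigned NOT NULL PRIMARY KEY)"}
	for _, stmt := range workspaceStatements("_skeema_tmp", creates) {
		fmt.Println(stmt)
	}
}
```

Each statement is a separate round trip to the database, which is why a remote temporary schema becomes slow when there are many tables.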

WHAT'S NEW:

This PR adds the ability to instead use a local Docker container for workspace operations. Two new options control this behavior:

* `workspace=docker` tells Skeema to dynamically manage local Docker container(s) for workspace operations, instead of using a temporary schema on each live DB.

* `docker-cleanup` controls how to manage the container lifecycle as Skeema is exiting. The default, `docker-cleanup=none`, leaves containers running so that subsequent invocations of Skeema are faster. Setting `docker-cleanup=stop` stops containers but does not remove them, and `docker-cleanup=destroy` deletes them entirely.

This functionality is especially useful when running Skeema from a different region/datacenter than your database -- for example, running Skeema on your laptop, when your databases are in AWS. Using `workspace=docker` greatly reduces painful network latency in this scenario, especially if you have a large number of tables. See discussion in #25 for background.
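
A minimal sketch of how the two option values might be validated, assuming a hypothetical helper named `checkWorkspaceOptions`; this is not taken from Skeema's code. The value `temp-schema` for the default workspace type is also an assumption here, while `docker`, `none`, `stop`, and `destroy` come from the option descriptions above.

```go
package main

import "fmt"

// checkWorkspaceOptions reports whether the given workspace and
// docker-cleanup values are recognized. Hypothetical helper, for
// illustration only.
func checkWorkspaceOptions(workspace, dockerCleanup string) error {
	switch workspace {
	case "temp-schema": // assumed name of the default workspace type
		return nil // docker-cleanup does not apply without Docker
	case "docker":
		// fall out of the switch and validate docker-cleanup below
	default:
		return fmt.Errorf("unsupported workspace value %q", workspace)
	}
	switch dockerCleanup {
	case "none", "stop", "destroy":
		return nil
	default:
		return fmt.Errorf("unsupported docker-cleanup value %q", dockerCleanup)
	}
}

func main() {
	fmt.Println(checkWorkspaceOptions("docker", "stop"))
	fmt.Println(checkWorkspaceOptions("invalid-option", "none"))
}
```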
evanelias committed Nov 9, 2018
1 parent 2e3a860 commit b81b89ccdcdced5ad1ffc6f2e0c29438b80a8e65

@@ -4,13 +4,13 @@
 [![code coverage](https://img.shields.io/coveralls/skeema/skeema.svg)](https://coveralls.io/r/skeema/skeema)
 [![latest release](https://img.shields.io/github/release/skeema/skeema.svg)](https://github.com/skeema/skeema/releases)

-Skeema is a tool for managing MySQL tables and schema changes. It provides a CLI tool allowing you to:
+Skeema is a tool for managing MySQL tables and schema changes in a declarative fashion using pure SQL. It provides a CLI tool allowing you to:

 * Export CREATE TABLE statements to the filesystem, for tracking in a repo (git, hg, svn, etc)
 * Diff changes in the schema repo against live DBs to automatically generate DDL
-* Manage multiple environments (dev, staging, prod) and keep them in sync with ease
+* Manage multiple environments (e.g. dev, staging, prod) and keep them in sync with ease
 * Configure use of online schema change tools, such as pt-online-schema-change, for performing ALTERs
-* Convert non-online migrations from Rails, Django, etc into online schema changes in production
+* Convert non-online migrations from frameworks like Rails or Django into online schema changes in production

 Skeema supports a pull-request-based workflow for schema change submission, review, and execution. This permits your team to manage schema changes in exactly the same way as you manage code changes.

@@ -37,7 +37,7 @@ To download, build from master, and install (or upgrade) Skeema, run:

 ## Status

-Skeema is generally available, having reached v1 release milestone in July 2018. Prior to that, it was in public beta since October 2016.
+Skeema is generally available, having reached the v1 release milestone in July 2018. Prior to that, it was in public beta since October 2016.

 The `skeema` binary is supported on macOS and Linux. For now, it cannot be compiled on Windows.

@@ -1,8 +1,6 @@
 package applier

 import (
-	"time"
-
 	log "github.com/sirupsen/logrus"
 	"github.com/skeema/skeema/fs"
 	"github.com/skeema/skeema/workspace"
@@ -37,10 +35,10 @@ func TargetsForDir(dir *fs.Dir, maxDepth int) (targets []*Target, skipCount int)
 	var instances []*tengo.Instance
 	instances, skipCount = instancesForDir(dir)

-	// For each IdealSchema, obtain a *tengo.Schema representation and then create
-	// a Target for each instance x schema combination
-	for _, idealSchema := range dir.IdealSchemas {
-		thisTargets, thisSkipCount := targetsForIdealSchema(idealSchema, dir, instances)
+	// For each LogicalSchema, obtain a *tengo.Schema representation and then
+	// create a Target for each instance x schema combination
+	for _, logicalSchema := range dir.LogicalSchemas {
+		thisTargets, thisSkipCount := targetsForLogicalSchema(logicalSchema, dir, instances)
 		targets = append(targets, thisTargets...)
 		skipCount += thisSkipCount
 	}
@@ -107,41 +105,40 @@ func instancesForDir(dir *fs.Dir) (instances []*tengo.Instance, skipCount int) {
 	return
 }

-func targetsForIdealSchema(idealSchema *fs.IdealSchema, dir *fs.Dir, instances []*tengo.Instance) (targets []*Target, skipCount int) {
+func targetsForLogicalSchema(logicalSchema *fs.LogicalSchema, dir *fs.Dir, instances []*tengo.Instance) (targets []*Target, skipCount int) {
 	// If dir mapped to no instances, it generates no targets
 	if len(instances) == 0 {
 		return
 	}

 	// Obtain a *tengo.Schema representation of the dir's *.sql files from a
 	// workspace
-	opts := workspace.Options{
-		Type:                workspace.TypeTempSchema,
-		Instance:            instances[0],
-		SchemaName:          dir.Config.Get("temp-schema"),
-		KeepSchema:          dir.Config.GetBool("reuse-temp-schema"),
-		DefaultCharacterSet: dir.Config.Get("default-character-set"),
-		DefaultCollation:    dir.Config.Get("default-collation"),
-		LockWaitTimeout:     30 * time.Second,
+	opts, err := workspace.OptionsForDir(dir, instances[0])
+	if err != nil {
+		log.Warnf("Skipping %s: %s\n", dir, err)
+		return nil, len(instances)
 	}
-	fsSchema, tableErrors, err := workspace.MaterializeIdealSchema(idealSchema, opts)
+	fsSchema, statementErrors, err := workspace.ExecLogicalSchema(logicalSchema, opts)
 	if err != nil {
 		log.Warnf("Skipping %s: %s\n", dir, err)
 		return nil, len(instances)
 	}
-	for _, tableError := range tableErrors {
-		stmt := idealSchema.CreateTables[tableError.TableName]
-		log.Errorf("%s: %s", stmt.Location(), tableError.Err)
+	for _, stmtErr := range statementErrors {
+		log.Error(stmtErr.Error())
 	}
-	if len(tableErrors) > 0 {
-		log.Warnf("Skipping %s due to %d SQL errors", dir, len(tableErrors))
+	if len(statementErrors) > 0 {
+		noun := "errors"
+		if len(statementErrors) == 1 {
+			noun = "error"
+		}
+		log.Warnf("Skipping %s due to %d SQL %s", dir, len(statementErrors), noun)
 		return nil, len(instances)
 	}

 	// Create a Target for each instance x schema combination
 	for _, inst := range instances {
 		var schemaNames []string
-		if idealSchema.Name == "" { // blank means use the schema option from dir config
+		if logicalSchema.Name == "" { // blank means use the schema option from dir config
 			schemaNames, err = dir.SchemaNames(inst)
 			if err != nil {
 				log.Warnf("Skipping %s for %s: %s", inst, dir, err)
@@ -152,7 +149,7 @@ func targetsForIdealSchema(idealSchema *fs.IdealSchema, dir *fs.Dir, instances [
 				schemaNames = schemaNames[0:1]
 			}
 		} else {
-			schemaNames = []string{idealSchema.Name}
+			schemaNames = []string{logicalSchema.Name}
 		}
 		schemasByName, err := inst.SchemasByName(schemaNames...)
 		if err != nil {
@@ -63,6 +63,13 @@ func (s ApplierIntegrationSuite) TestTargetsForDirSimple(t *testing.T) {
 		t.Fatalf("Unexpected result from TargetsForDir: %+v, %d", targets, skipCount)
 	}

+	// Test with invalid workspace option: should return 0 targets, 2 skipped
+	dir = getDir(t, "../testdata/applier/simple", "--workspace=invalid-option")
+	targets, skipCount = TargetsForDir(dir, 1)
+	if len(targets) != 0 || skipCount != 2 {
+		t.Fatalf("Unexpected result from TargetsForDir: %+v, %d", targets, skipCount)
+	}
+
 	// Test with sufficient maxDepth, but empty instance list: expect 0 targets, 0 skipped
 	setupHostList(t)
 	targets, skipCount = TargetsForDir(dir, 1)
@@ -2,8 +2,8 @@ package applier

 import (
 	"fmt"
-	"time"

+	"github.com/skeema/skeema/fs"
 	"github.com/skeema/skeema/workspace"
 	"github.com/skeema/tengo"
 )
@@ -38,37 +38,45 @@ func VerifyDiff(diff *tengo.SchemaDiff, t *Target) error {
 		mods.AlgorithmClause = "COPY"
 	}

-	// Gather CREATE and ALTER for modified tables
-	statements := make([]string, 0)
+	// Gather CREATE and ALTER for modified tables, and put into a LogicalSchema,
+	// which we then materialize into a real schema using a workspace
+	logicalSchema := &fs.LogicalSchema{
+		CharSet:      t.Dir.Config.Get("default-character-set"),
+		Collation:    t.Dir.Config.Get("default-collation"),
+		CreateTables: make(map[string]*fs.Statement),
+		AlterTables:  make([]*fs.Statement, 0),
+	}
 	expected := make(map[string]*tengo.Table)
 	for _, td := range diff.FilteredTableDiffs(tengo.TableDiffAlter) {
 		stmt, err := td.Statement(mods)
 		if stmt != "" && err == nil {
-			// Some tables may have multiple ALTERs in the same diff
-			if _, already := expected[td.From.Name]; already {
-				statements = append(statements, stmt)
-			} else {
-				expected[td.From.Name] = td.To
-				statements = append(statements, td.From.CreateStatement, stmt)
+			expected[td.From.Name] = td.To
+			logicalSchema.CreateTables[td.From.Name] = &fs.Statement{
+				Type:      fs.StatementTypeCreateTable,
+				Text:      td.From.CreateStatement,
+				TableName: td.From.Name,
 			}
+			logicalSchema.AlterTables = append(logicalSchema.AlterTables, &fs.Statement{
+				Type:      fs.StatementTypeAlterTable,
+				Text:      stmt,
+				TableName: td.From.Name,
+			})
 		}
 	}

-	opts := workspace.Options{
-		Type:                workspace.TypeTempSchema,
-		Instance:            t.Instance,
-		SchemaName:          t.Dir.Config.Get("temp-schema"),
-		KeepSchema:          t.Dir.Config.GetBool("reuse-temp-schema"),
-		DefaultCharacterSet: t.Dir.Config.Get("default-character-set"),
-		DefaultCollation:    t.Dir.Config.Get("default-collation"),
-		LockWaitTimeout:     30 * time.Second,
-	}
-	wsSchema, err := workspace.StatementsToSchema(statements, opts)
+	opts, err := workspace.OptionsForDir(t.Dir, t.Instance)
 	if err != nil {
 		return err
 	}
-	actualTables := wsSchema.TablesByName()
+	wsSchema, statementErrors, err := workspace.ExecLogicalSchema(logicalSchema, opts)
+	if err == nil && len(statementErrors) > 0 {
+		err = statementErrors[0]
+	}
+	if err != nil {
+		return fmt.Errorf("Diff verification failure: %s", err.Error())
+	}

+	actualTables := wsSchema.TablesByName()
 	for name, toTable := range expected {
 		expectCreate, _ := tengo.ParseCreateAutoInc(toTable.CreateStatement)
 		actualCreate, _ := tengo.ParseCreateAutoInc(actualTables[name].CreateStatement)
@@ -3,7 +3,6 @@ package main
 import (
 	"fmt"
 	"os"
-	"time"

 	log "github.com/sirupsen/logrus"
 	"github.com/skeema/mybase"
@@ -87,44 +86,39 @@ func lintWalker(dir *fs.Dir, lc *lintCounters, maxDepth int) error {
 	if err != nil {
 		return err
 	}
-	opts := workspace.Options{
-		Type:                workspace.TypeTempSchema,
-		Instance:            inst,
-		SchemaName:          dir.Config.Get("temp-schema"),
-		KeepSchema:          dir.Config.GetBool("reuse-temp-schema"),
-		DefaultCharacterSet: dir.Config.Get("default-character-set"),
-		DefaultCollation:    dir.Config.Get("default-collation"),
-		LockWaitTimeout:     30 * time.Second,
+	opts, err := workspace.OptionsForDir(dir, inst)
+	if err != nil {
+		return NewExitValue(CodeBadConfig, err.Error())
 	}

-	for _, idealSchema := range dir.IdealSchemas {
-		schema, tableErrors, err := workspace.MaterializeIdealSchema(idealSchema, opts)
+	for _, logicalSchema := range dir.LogicalSchemas {
+		schema, statementErrors, err := workspace.ExecLogicalSchema(logicalSchema, opts)
 		if err != nil {
-			log.Warnf("Skipping schema %s in %s due to error: %s", idealSchema.Name, dir.Path, err)
+			log.Warnf("Skipping schema %s in %s due to error: %s", logicalSchema.Name, dir.Path, err)
 			lc.errCount++
 			continue
 		}
-		for _, tableErr := range tableErrors {
-			if ignoreTable != nil && ignoreTable.MatchString(tableErr.TableName) {
-				log.Debugf("Skipping table %s because ignore-table='%s'", tableErr.TableName, ignoreTable)
+		for _, stmtErr := range statementErrors {
+			if ignoreTable != nil && ignoreTable.MatchString(stmtErr.TableName) {
+				log.Debugf("Skipping table %s because ignore-table='%s'", stmtErr.TableName, ignoreTable)
 				continue
 			}
-			log.Errorf("%s: %s", idealSchema.CreateTables[tableErr.TableName].Location(), tableErr.Err)
+			log.Error(stmtErr.Error())
 			lc.sqlErrCount++
 		}
 		for _, table := range schema.Tables {
 			if ignoreTable != nil && ignoreTable.MatchString(table.Name) {
 				log.Debugf("Skipping table %s because ignore-table='%s'", table.Name, ignoreTable)
 				continue
 			}
-			body, suffix := idealSchema.CreateTables[table.Name].SplitTextBody()
+			body, suffix := logicalSchema.CreateTables[table.Name].SplitTextBody()
 			if table.CreateStatement != body {
-				idealSchema.CreateTables[table.Name].Text = fmt.Sprintf("%s%s", table.CreateStatement, suffix)
-				length, err := idealSchema.CreateTables[table.Name].FromFile.Rewrite()
+				logicalSchema.CreateTables[table.Name].Text = fmt.Sprintf("%s%s", table.CreateStatement, suffix)
+				length, err := logicalSchema.CreateTables[table.Name].FromFile.Rewrite()
 				if err != nil {
-					return fmt.Errorf("Unable to write to %s: %s", idealSchema.CreateTables[table.Name].File, err)
+					return fmt.Errorf("Unable to write to %s: %s", logicalSchema.CreateTables[table.Name].File, err)
 				}
-				log.Infof("Wrote %s (%d bytes) -- updated file to normalize format", idealSchema.CreateTables[table.Name].File, length)
+				log.Infof("Wrote %s (%d bytes) -- updated file to normalize format", logicalSchema.CreateTables[table.Name].File, length)
 				lc.reformatCount++
 			}
 		}