New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make version table name configurable #34

Closed
sqlalchemy-bot opened this Issue Feb 24, 2012 · 24 comments

Comments

Projects
None yet
1 participant
@sqlalchemy-bot

sqlalchemy-bot commented Feb 24, 2012

Migrated issue, originally created by Karol Kuczmarski (@Xion)

It's currently alembic_version, but it would be nice if this was configurable - either in alembic.ini, or (more likely) in env.py.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Feb 24, 2012

Michael Bayer (@zzzeek) wrote:

yeah this would be an env.py option passed to EnvironmentContext.configure. if you have the inclination to give me a pull request that would expedite.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 2, 2012

Jeff Dairiki (@dairiki) wrote:

I have a case where I have two separately managed schema living in a single database. (I.e. I have two disjoint subsets of tables in a single database. I would like each subset to be managed by its own alembic migration environment.)

If the version table name were configurable, that would work.

Even better, might be to have a single version table which is shared by both migration environments. Another column ('environment_id') would have to be added to the version table. (Alternatively, each environment could just be made to ignore any revisions in the version table which it doesn't recognize as one of its own — this has the advantage of not having to add a column to the version table, but seems frighteningly fragile.)

Then again, adding the logic to deal with two-column version tables alongside legacy one-column version tables (as well as migrating between them) might be a pain...

Anyhow, if you tell me whether or not you think this (shared version table) is a stupid idea, I'll see if I can work up a pull request.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 2, 2012

Felix Schwarz (@felixschwarz) wrote:

just like to mention that I'd like to see an environment id in the same version table even a bitter better than two different version tables.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 2, 2012

Jeff Dairiki (@dairiki) wrote:

Having thought about this a bit more, here's a more detailed proposal.

Add two new options (for EnvironmentContext.configure()):

version_table

The version table name. Default 'alembic_version'.

environment_id

Environment identifier, a string which must be unique among alembic environments sharing the same version table.

version_table_schema

One of:

1 - Use the one-column schema for the version table. It is an error to specify
version_table_schema=1 along with a non-//None// environment_id.

2 - Use the two-column schema, allows for sharing the version table between
alembic environments.

None (the default) - Use schema version //1// if no environment_id is set, else
use schema version //2//.

"auto" - Like //None// except when running in on-line mode, in which case

  • If the existing version table is of the one-column sort, and a non-//None// environment_id
    is specified, the version_table will be automatically migrated to the two-column schema
  • If the existing version table is of the two-column sort, and environment_id is //None//,
    the two-column schema will be used.

In any case, when running in on-line mode, the schema of any existing version table
is determined via reflection. If version_table_schema is not "auto" and error is raised
unless the existing version table conforms to the configured schema.
(When version_table_schema is "auto", Alembic will cope automatically with any schema
mismatch, as described above.)

=== Changes ===

2012-04-04T01:00Z: Added version_table_schema configure parameter to minimize the amount
of magic happens without it being explicitly enabled.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 4, 2012

Jeff Dairiki (@dairiki) wrote:

Okay, here's a first crack: [[ https://bitbucket.org/dairiki/alembic/changeset/1dbb953d33e4 | changeset 1dbb953d33e4 ]].

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 6, 2012

Jeff Dairiki (@dairiki) wrote:

Update: I've made a few changes, and merged my patches into the latest tip (0.3).

Here's the latest: [[https://bitbucket.org/dairiki/alembic/changeset/97b8c0d13b6a |97b8c0d13b6a]]

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 20, 2012

Michael Bayer (@zzzeek) wrote:

I think I like this. Why not have VersionTableV1Helper and VersionTableV2Helper as subclasses of VersionTableHelper (and maybe lose the name "Helper", let this just be, "the version table"), so that we don't need the conditionals ?

One thing I'm possibly worried about with the single table with multiple versions is the ramifications of generating SQL scripts - when I experimented with SQL script generation I found it to be non-trivial whether or not the scripts should include the generation of the version table itself in the script. That is, when we're doing SQL script we don't really have the luxury of being able to reflect and check for it being there. Having distinct version tables per environment at least means you can gen a SQL script from zero and definitely include the generation of the version table.

But you've got the name configurable so now it does both ways.

Do we want two ways to do it ?

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 20, 2012

Changes by Michael Bayer (@zzzeek):

  • removed labels: low priority
@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 21, 2012

Jeff Dairiki (@dairiki) wrote:

I did start out with Table{V1,V2} subclasses. The problem with that was (in the case of version_table_schema='auto' and environment_id unset) the table version is not always known when the helper is constructed. This is because table reflection is deferred until needed (e.g. .get_current_revision() is called.) (I can't remember now whether there was really a good reason to defer the version table reflection. If not, well then...)

(The helper could use another class, which would be instantiated when the table was reflected, to implement
the version table operations. This would eliminate the conditionals, but at the time I didn't think it was worth it.)


You're right. There is no way to tell when a shared version table needs to be created in off-line mode. Maybe "CREATE TABLE IF NOT EXISTS" could help a bit? (How widespread is support for IF NOT EXISTS?)


Making the table name configurable is trivial compared to rest of this. I figure you might as well include it. It doesn't have to be explicitly mentioned (in the tutorials) as an alternate way. (It's a simple enough concept that basic API docs should be enough.)

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 21, 2012

Michael Bayer (@zzzeek) wrote:

yeah I'd push for Version.from_db()->VerisionTableV2 or something like that, those conditionals bug me :)

but yeah what are we gaining here vs. multiple version tables with distinct names, vs. one table with "environment ids".

I dont think the "IF NOT EXISTS" thing is very standard.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 23, 2012

Jeff Dairiki (@dairiki) wrote:

I think the two-column table vs multiple table question is mainly one of aesthetics. A motivating use case is: suppose you want to change the scaffoldings so that alembic environments are “conflict safe” by default. (I think this is a good idea, given that it is tricky to change the "environment id" once any migrations have been run.) To minimize the amount of manual configuration involved, that would probably entail generating a random guid to use as the environment id. In the multiple version table scenario, that means the version table will be named something like 'alembic_version_38f9a2c' — kinda weird and ugly.

OTOH, given the amount of code and verbiage needed to describe the configuration of the shared two-column version table whizziness, that approach is not the ultimate in elegance either.

I'd be happy going either way — make the call. I'll be happy either to clean up the current patch, or come up with a new one supporting just the configurable version table name.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 23, 2012

Michael Bayer (@zzzeek) wrote:

im fine going with the general idea of two approaches as an option, since most of the work/tests are done anyway.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 25, 2012

Jeff Dairiki (@dairiki) wrote:

Here's another refactor: [[https://bitbucket.org/dairiki/alembic/changeset/9993af6d7bcb | 9993af6d7bcb]]. (Perhaps easier on the eyes: [[https://bitbucket.org/dairiki/alembic/compare/..zzzeek/alembic | diffs from your tip]].)

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 25, 2012

Michael Bayer (@zzzeek) wrote:

I like this a lot. though I wonder if we might take out the defensive things I see - that is, where we have the user configure things, then "test" those configs - why not trust them ? that is, why have version_table_schema, isnt environment_id None or not enough for that ? why autoload the existing version table, then check the columns, then "upgrade" the table, what's all that for ?

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 26, 2012

Jeff Dairiki (@dairiki) wrote:

Part of the motivation for version_table_schema is to provide a migration path in the case:

  • User (me) has an alembic environment set up which manages some tables in a database. This environment
    is already in use under the current version of alembic which does not support the multi-column version table.
    (So, effectively, this environment has environment_id = None.)
  • User (again, me) wants to create a second alembic environment to manage some other tables in the same database.

The upgrade path in this case (if one is running on-line migrations) is to

  • In the old environment leave environment_id at the default None. Set version_table_schema to 'auto'.
    This will allow the old environment to work with either one- or two-column version tables.
  • In the new environment, set environment_id to some new unique id; set version_table_schema to 'auto'.
    This will allow the new environment to auto-upgrade the single-column version table to the two-column table it needs.

(If one is restricted to off-line migrations, upgrading the version table will take manual SQL-foo, at present.)


Another motivation for version_table_schema is to make it possible to disable any magic upgrading/adapting
of the version table. (More specifically, to disable the magic unless explicitly enabled.) As long as version_table_schema is not set to 'auto', off-line and on-line migrations will (should, at least) behave identically (modulo the issue you pointed out having to do with knowing when the version table needs to be created in off-line mode.)


You’re right that when version_table_schema is not 'auto', reflection of the version table is not really required. In that case it is used only for my paranoid sanity checks.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented Apr 26, 2012

Michael Bayer (@zzzeek) wrote:

if I were upgrading from a one-table to a two-table system, I'd run a special command: "alembic multienv_init". this command generates a new multi-env table, selects the single identifier from the current single table, and moves it into the multi-env table.

Implications:

  1. a particular database uses either the 1-env or 2-env table for versioning. not both.

  2. no reflection/guessing/possibly conflicting flags

  3. the user knows which scheme he or she wants to use, and when, runs the helper command to set things up explicitly.

  4. the two modes otherwise have no connection to each other.

  5. the story for the "online" mode and "offline" mode now becomes similar (run the same command in "offline" mode, will produce "INSERT ... SELECT" to transfer the version number, perhaps)

should be a lot less complexity, and the "we're switching styles" use case is stowed away in a single explicitly called command.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented May 2, 2012

Jeff Dairiki (@dairiki) wrote:

(Sorry for the delay in response.)

I basically agree with the motives of simplicity and explicitness. But the version table schema upgrade command doesn't feel quite right to me.

If an alembic environment can only deal with either a 1-col or a 2-col version table at a given time, there is a synchronization issue. Does one upgrade env.py then run multienv_init or vice versa? If there are only one or two deployments to worry about, it's not a big deal. If there are more (and if one wants to defer upgrade/migrations on some until later) it becomes a pita to keep track of. If env.py and the version table schema are out of sync, alembic commands which touch the version table will fail, I guess. Or, more worrisome, perhaps the the database will appear to be unversioned?

It seems that you want to disallow None as a valid value for environment_id (so that environment_id = None implies 1-col version table.) If so, then one must specify a env_id when multienv_init is run, I guess as a command-line argument (then make sure env.py is altered to match.) Ugh.

I would think some sanity checking against the possibility of running multienv_init twice would be called for.

I guess to me it seems like with the manual upgrade command, most of the complexity (both in terms of lines-of-code, and configuration/operation confusion) which is saved in migration.py will just be moved into the multienv_init. Either way some sanity checks will be called for.


Given that we still haven't come up with a particularly simple (and safe) way to support both 1- and 2- col version tables, I can still be easily convinced that the best plan is to scrap all the two-column stuff and just go with multiple single-column version tables. Though it feels somehow less elegant, it really is simple and straightforward.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented May 3, 2012

Michael Bayer (@zzzeek) wrote:

i think keeping it simple would be the best compromise for now.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented May 3, 2012

Jeff Dairiki (@dairiki) wrote:

Okay. How's this? [[https://bitbucket.org/dairiki/alembic/changeset/6ed983fc075450d874557e81feb0237d7a28a222 | 6ed983f]].

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented May 4, 2012

Michael Bayer (@zzzeek) wrote:

it looks great. but I'm bummed you went through all this trouble !

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented May 7, 2012

Jeff Dairiki (@dairiki) wrote:

No problem. Sometimes one just has to jump in and see how it works out. I always learn something (though, unfortunately, I don't always remember the lesson the next time a similar idea rolls through.)

(Let me know if you want me to create a bona fide pull request for the commit.)

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented May 7, 2012

Michael Bayer (@zzzeek) wrote:

sure just make one and ill pull it right in.

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented May 16, 2012

Michael Bayer (@zzzeek) wrote:

just back from a trip, guess I pulled this and forgot about it. closing this issue for now, the pull is in [[https://bitbucket.org/zzzeek/alembic/changeset/6ed983fc075450d874557e81feb0237d7a28a222 |6ed983fc075450d874557e81feb0237d7a28a222 ]], thanks for all the effort here !

@sqlalchemy-bot

This comment has been minimized.

sqlalchemy-bot commented May 16, 2012

Changes by Michael Bayer (@zzzeek):

  • changed status to closed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment