Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Fixed #22022 -- Add pre_create_sql support to db_parameters. #2266

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
4 participants
Member

mjtamlyn commented Feb 12, 2014

Allow fields to require some SQL to be run in advance of creating the
field. This is ideal for postgres extensions, but could also be used to
define custom functions in a third party app without requiring
migrations.

@mjtamlyn mjtamlyn Fixed #22022 -- Add pre_create_sql support to db_parameters.
Allow fields to require some SQL to be run in advance of creating the
field. This is ideal for postgres extensions, but could also be used to
define custom functions in a third party app without requiring
migrations.
87cb264
Member

mjtamlyn commented Feb 12, 2014

@andrewgodwin first draft of what we discussed yesterday

Member

shaib commented Feb 12, 2014

Two points feel a little odd in this approach:

  1. The pre-create sql ends up mixed into the create sql, and is run again for every field. It would be much nicer (especially in the use case of "use sqlall to hand script to DBA") to collect all pre-create commands, and try to run each only once. In fact, I'd go as far as trying to put all such commands in a separate migration.

Coming to think about it, yes: This should be a migration provided by the app defining the field types, not hidden inside a create-model action for an app that uses the field. Instead of making use of the field run the commands in create-model, just make the model-creating migration depend on the field-type-defining-app's migration.

  1. The approach looks like it marries the custom fields and the database backends too tightly. I'm thinking here of 3rd-party backends and 3rd-party fields -- with this implementation, as far as I understand, either the backend must support the field, or the other way around. I would like to see some mechanism -- a registry, perhaps -- which would allow defining the relevant pieces at the project level.

I am well aware that a project can partially solve the problem by subclassing the field and checking the connection vendor -- but that feels quite ugly; and that this point is not just about pre-create sql, but all sql snippets -- but we should start somewhere 😄 .

Member

mjtamlyn commented Feb 12, 2014

  1. Making it a migration provided by the app means every field requiring an extension to the database needs to be in its own installable app. This is a significant issue with d.c.postgres as some fields in the app should work on postgres 8, but others require extensions not existing until postgres 9. It is reasonable to assume a similar issue for any other third party app. Also how do I go about expressing that any model using a field in my app has my initial migration as a dependency?

I do however like the idea of splitting the pre-create SQL into a separate operation. I suggested this in initial discussions with Andrew but he was concerned about complexity. We could make the autodetector spawn RunSQL operations whenever it sees an addition for a field or model with pre_create_sql, and then collate (and compare) those RunSQL commands. The commands should ideally still be run in every migration which needs them rather than trying to track them as dependencies, so they still always need to be defensively coded (e.g. CREATE MIGRATION IF NOT EXISTS).

  1. I'd actually prefer to go the other way - fields should be able to tell the ORM everything they need. If there is a third party backend then they would need to inspect connection.vendor. I feel like the best way to make the ORM extensible is to remove "global state" in its implementation - see the (somewhat ongoing) work to make Lookups and Transforms own their own implementation, instead of it living in different database backends. They admittedly provide a more straightforwards hook in the form on as_<vendor>_sql(). What they also do which is harder to do here, is allow you to override a lookup across the project straight away. It is simple for a project to change the implementation of say, __contains on every field of a certain type in the project.

In reality, when supporting a third party field on a third party backend, I'd suggest there is a strong probability that they would need to override db_parameters anyway - this is now the canonical place to change the db_type. In that case, overriding the pre_create_sql (or removing it) seems quite reasonable.

Owner

andrewgodwin commented Feb 12, 2014

I agree with @mjtamlyn on point 2 - Django fields unfortunately do have to know about third-party backends, as the current implementation of field type abstraction in core is quite woeful and not something we could easily extend (and to be honest, I don't mind fields being responsible, they're usually the thing under more active and rapid development).

For the "pre-create" stuff, I'm torn. There are some things where you do want to run them multiple times but with different arguments (say you're using a predefined function to set up field stuff), and some times you want to merge them (CREATE EXTENSION).

My particular worry is that having them merged breaks the abstraction that fields are individual units that SchemaEditor can just deal with one at a time and requires both SchemaEditor and the Migrations framework to be aware of what's going on and merge them, which isn't going to happen.

You could say that you make the SQL into separate RunSQL operations, but then you break the promise that you can use SchemaEditor as its own API and just pass fields into it (which is the whole point).

I can't see a good solution - the most we can do is merge them if there's multiple ones during the same create_model() call, but other than that there's going to have to be duplicates. It'll only matter when you're running sqlmigrate to make a SQL script to hand to a DBA, and I'm sure someone can manually trim them at that point if they wish.

Member

mjtamlyn commented Feb 12, 2014

I think if we're going to go down the route of merging them then we need to build the feature properly. That is, we need to treat database extensions as a proper new feature, tracked, created and removed like models are. It should be reasonable for a field to require a certain extension, extension types could have a registry so that third party db backends/projects would be able to amend them, and they should have forwards and backwards methods to create/remove themselves from the database. Conceivably, the detection of extensions required by fields within the model could be delegated through the AppConfig, thus giving an extension point for a third party app to install them without needing manual migrations etc.

(Note here I'm using the term "database extension" to mean something which augments the power of the database, which can be created and removed. This is basically all a PG extension is, some create datatypes but others simply create new functions. The concept however is db independent - both Oracle and MySQL support stored procedures.)

This becomes a much bigger (and more powerful) feature, thus would definitely need to be after feature freeze. Perhaps this merits discussion on django-dev? Personally I'm not completely convinced that the amount of effort involved with building this feature is worth the cleanliness it promises, but there is a conceivable path there.

Owner

andrewgodwin commented Feb 13, 2014

Sure, all the databases have stored procedures, but only PostgreSQL really has this idea of "extensions" as a loadable object referenced only by name. In other DBs you'd need to manually declare everything and inject it with code, so I'm not sure how it makes sense as a core schema feature...

Member

shaib commented Feb 13, 2014

@mjtamlyn:

Making it a migration provided by the app means every field requiring an extension to the database needs to be in its own installable app.

No, it only means every extension to be loaded needs to be in its own migration. An app can have many initial migrations now; although that was not the intended use, I see no problem with some app defining a whole set of initial migrations -- not for trimming, but for separation of concerns; not intended to be trimmed later, but to stay there.

Then we'll only need to take care of the dependencies properly -- whether it be by makemigration adding an explicit dependency, or the field type somehow generating an implicit one. The concept of migration is now in core -- so, API-wise, just put a 'needs_migration' property on the field type, with its value a (label, migration) pair.

What am I missing?

Member

mjtamlyn commented Feb 13, 2014

That sounds like a significant change to the migration graph to support some migrations as 'optional unless it's a dependency'. I'll let Andrew work out whether it's possible - I can imagine it might result in some tricky interactions with the squashing code.

As for extensions being pg specific, I think all these extensions would naturally be very database specific and I don't see an enormous difference between defining and removing a stored procedure (or collection of stored procedures) and adding or removing an extension. By putting the code in an object (subclass of Extension) in a third party app it becomes much easier to extend to other databases etc than if it's in a migration, and it does lend some cleanliness to the database as the extension would only be added if needed.

To be clear, I'm somewhat undecided here, the current code works but is in my opinion ugly and potentially fragile. The proposed alternative is large and complex for relatively little gain. Maybe I need to write option 2 and see what it looks like.

Owner

timgraham commented Sep 15, 2014

Marc, should this PR remain open?

Member

mjtamlyn commented Sep 16, 2014

For now please, I'm starting to work on another approach for hstore and postgres extensions in general which might work, but you'll have to manually add the Operation yourself, either before creating the models or in a separate migration. I may also provide a management command to write out that migration if that is easy enough to do. That's still a prototype so let's leave this be for now as an alternative approach.

Member

mjtamlyn commented Nov 4, 2014

I'm going to give up on this concept in favour of manual Operations.

@mjtamlyn mjtamlyn closed this Nov 4, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment