InheritanceManager before SearchableManager #460

Closed
wants to merge 1 commit into
from

Projects

None yet

5 participants

@electroniceagle

Stephen,

This lets you add select_subclasses() to a filter() on Page or Product (anything that uses the DisplayManager) so you can automatically downcast to the appropriate subclass. Only works with a single level of hierarchy. I pulled the code directly from django-model-utils. Discussion of the overhead (essentially select-related turned on) in the second link. Requires Django 1.2+.

https://github.com/carljm/django-model-utils#inheritancemanager
http://jeffelmore.org/2010/11/11/automatic-downcasting-of-inherited-models-in-django/

>>> from mezzanine.pages.models import Page
>>> Page.objects.filter().select_subclasses()
[<RichTextPage: About>, <RichTextPage: About / Benefits>, <RichTextPage: About / Management>, <RichTextPage: About / Market Studies>, <RichTextPage: About / Motivation>, <RichTextPage: About / What We Do>, <RichTextPage: Account>, <RichTextPage: Account / Dashboard>, <RichTextPage: Account / Orders>, <RichTextPage: Account / Profile>, <Category: Catalog>, <Category: Catalog / Applications>, <Category: Catalog / Expertise>, <Category: Catalog / Platforms>, <Category: Catalog / Services>, <Form: Contact>, <Form: EC2 login credentials>, <RichTextPage: Help>, <RichTextPage: Help / Abaqus Remote Desktop>, <RichTextPage: Help / About the Manifold Flow Predictor>, '...(remaining elements truncated)...']
>>> Page.objects.filter()
[<Page: About>, <Page: About / Benefits>, <Page: About / Management>, <Page: About / Market Studies>, <Page: About / Motivation>, <Page: About / What We Do>, <Page: Account>, <Page: Account / Dashboard>, <Page: Account / Orders>, <Page: Account / Profile>, <Page: Catalog>, <Page: Catalog / Applications>, <Page: Catalog / Expertise>, <Page: Catalog / Platforms>, <Page: Catalog / Services>, <Page: Contact>, <Page: EC2 login credentials>, <Page: Help>, <Page: Help / Abaqus Remote Desktop>, <Page: Help / About the Manifold Flow Predictor>, '...(remaining elements truncated)...']
>>> ^D

@stephenmcd
Owner

Thanks for the explanation Brian - it's very clear and a great idea.

I'll go over this soon and from the looks of it, most likely merge it straight in.

Thanks again!

@lorin

@stephenmcd If this gets in, how would you feel about using this approach instead of Page.content_model for retrieving subclasses in the rest of the mezzanine code?

@stephenmcd
Owner

Off the top of my head it'd make get_content_model entirely redundant so we could probably get rid of any use of it.

Might be best to leave the method in (it can just return self perhaps) so that templates in existing projects don't break.

@stephenmcd
Owner

Sorry, to actually answer your question though... I'm all for it.

@electroniceagle

We want to make the querysets for Page and Product views (possibly BlogPost?) call select_subclasses(). Will try to push that next week.

@stephenmcd
Owner

Yeah I was thinking it would become the default behaviour.

I'm also waiting until I merge this in before I look at the product subclassing branch, as I think (and I think you've eluded to the same thing) that it may end up leveraging this branch itself.

@stephenmcd
Owner

Hi guys,

I've finally gotten around to looking at this and I've been playing around with the implementation and thinking about the feature itself for the last few days, and wanted to talk about it a bit more. I guess overall I want to play devil's advocate and talk about reasons for not including the inheritance manager in this pull request.

Firstly I think it's useful to define what it achieves, which I think are two separate things:

1) DB performance optimisation (no chance of hitting the DB again for each record when touching subclasses).
2) A different API for accessing pages. So no more need for get_content_model and such, since we always have the actual content subclasses directly when accessing pages.

Now I think the first item is simple to talk about, and there isn't really much question about the benefit of it - it's a good thing and we should definitely try and address it. It just so happens that there's already an implementation of this for the pages menu when in the admin interface - a special case was added for it a while back, since in the admin menu, we do need to deal with the content subclasses directly, and so a query was being performed for every page in the tree:

https://github.com/stephenmcd/mezzanine/blob/master/mezzanine/pages/templatetags/pages_tags.py#L53-L57

You can see in that code that the DB performance optimisation boils down to an extra line of code there, thanks to the class method Page.get_content_models. Now let's suppose for devil's advocate sake, we weren't to include the inheritance manager in this pull request. We could certainly make the above code apply to all instances of the page menu (not just in the admin). This could also be moved out of the page menu template tag and into PageManager.published (or somewhere else more general) so that it would always be used.

Now for the second item which is the API change, which I think is the real thing the inheritance manager brings to the table, and is a much more interesting point of discussion in that the benefits could be argued either for or against.

As I've mentioned I've been playing around with this, going through the entire code base and looking for where things can be changed and cleaned up to make use of it. It all works really well, so there aren't any technical issues there. But I think the API change is controversial. On one hand I think this is really "nice", and that was my thinking in my previous comments in this thread before actually playing with it. On the other hand, after playing with it, it comes across as somewhat magic and implicit - all of a sudden we're no longer dealing with real page instances when we query the page model. This all boils down to what I think is a matter of taste, and doing things in a way that most people would "expect" it to work.

Like I said, technically it works fine - and at the template level, if we were to leave the page's get_content_model method in tact for backward compatibility (with deprecation warnings), nothing really breaks save for a few things internally which can be easily addressed and wouldn't affect users. I am very interested though in what people are doing out there - is there a ton of project-level Mezzanine code in the wild that would break if this became the default behaviour? Probably not, but it's certainly something to think about.

The idea then comes up as to including this, but not making it the default behaviour. If we were to entertain that idea, it would steer me more toward not including it, since I'm reluctant to include things that don't actually get used by default. I've been bitten by this before, with things that currently exist in Mezzanine but aren't used, and therefore fall by the wayside as time goes on, with reports of them breaking because they don't get the attention they need over time due to not being used in the actual code base (the SingletonAdmin is one example of this).

It could also be argued that the subclass selection is an "end of the chain" type thing - it can occur outside of querysets, and performed once objects have actually been pulled out of the database, eg:

pages = [page.get_content_model() for page in pages]

As to how that might work at the project level for a project developer is up to them - but given how simple it would be, if the subclass selection wasn't used by default in Mezzanine, then the layers of the inheritance manager don't really warrant inclusion.

I'm going to post this thread to the maling list to try and get some more feedback. Not sure how much we'll get considering the effort involved in getting a proper understanding of everything in play, but I'm really on the fence about it given all the above.

TLDR: It is a nice API, and it technically works well, I'm just not sure it's the right API. And the performance considerations on their own can be better addressed without it.

@darylantony

I've forwarded this on to my knee-deep-in-mezzanine team... we'll see if they use it in our client sites.

@stephenmcd
Owner

Thanks Daryl.

I've also announced a timeline for 1.3 in this thread and requested feedback on this PR:

https://groups.google.com/group/mezzanine-users/browse_thread/thread/78aebc024034ef7a

@lorin

Stephen:

That was a very thoughtful reply. I'm writing up a long-ish email for the mailing list to try to explain some of the context, but I wanted to briefly point out that the InheritanceManager only does subclassing when you explicitly ask it by invoking select_subclasses on a returned queryset or get_subclass on the manager. So, for example, if you do:

pages = Page.objects.all()

In this case, pages will be a queryset of Page objects, it will not be a queryset of subclasses of Page. If you wanted to get a queryset of subclasses, you would do:

pages = Page.objects.all().select_subclasses()

In fact, you could keep the current API the same and redefine Page.get_content_model like this:

def get_content_model(self):
    return Page.objects.get_subclass(pk=self.pk)

The nice thing about being able to do select_subclasses on a whole queryset is that it's only one trip to the database to do so because it's implemented using select_related. If you were to do:

pages = [page.get_content_model() for page in pages]

Then this is going to hit the database N times if there are N pages (this would be the case whether you used the existing implementation of get_content_model or the proposed substitution above).

All that being said, in our internal code at Nimbis, we have the convention that our querysets always return the subclasses. For example, if we have a list view, we always redefine the get_queryset method so it looks like:

def get_queryset(self):
    return MyClass.objects.filter(...).select_subclasses()

What's nice about this is that the programmer never has to ask "Do I need to do a downcast?", because the convention is that the downcast has always already been done. Also, we've never encountered a problem where that was caused by having a reference to the derived class instead of a reference to the base class, since the derived class has access to the base class's fields.

@AlexHill
Collaborator

I think I'm +1 on this as an optional tool.

My only reservations about this come from a DBA point of view - the idea of a query joining on every content type's table and seeing what sticks gives me the heebie jeebies. As a developer, I like it.

All that being said, in our internal code at Nimbis, we have the convention that our querysets always return the subclasses.

I think having this behaviour by default would be great for an internal team, but the general public should be able to expect standard Django behaviour from Mezzanine's core models in this case.

I've been bitten by this before, with things that currently exist in Mezzanine but aren't used

One way to avoid this might be to adopt a policy of leaving the code untouched except by upstream updates, thus removing the problem of maintaining more forked code. You get the benefit of more people using and reading the code, and it's very general functionality which I believe Mezzanine should be able to use it as-is. Are there any changes that might be required for Mezzanine that couldn't be contributed back to django-model-utils?

@stephenmcd stephenmcd closed this Mar 4, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment