Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizing table joins #103

Open
skirsdeda opened this issue Aug 31, 2015 · 9 comments
Open

Optimizing table joins #103

skirsdeda opened this issue Aug 31, 2015 · 9 comments

Comments

@skirsdeda
Copy link
Contributor

I have been using django-parler for a while now and it seems to be fine with one level query sets of translated objects (no related objects which are also translated). But when you have a model with foreign keys or many to many relations to other translated models prefetch_related doesn't help anymore and SQL query count becomes unwieldy.

What I propose is making django-parler backendable, leaving current multi table implementation as a default backend and implementing additional backend which would not use additional tables for translations. A good way to do this would be using hstore field or fields (PostgreSQL only) in the main table where keys would be language codes, values would be translations and each translated field would have a corresponding hstore field in the table. While this would only be usable on PostgreSQL, it would solve not only this table joining disaster but would also enable simpler ordering by translated fields.

This is obviously quite a lot of work but I could invest some time into implementing the proposed backend if we could agree on this django-parler development direction (making it backendable).

@vdboor Let me know what you think about this.

@chogarcia
Copy link

I honestly think that for that approach you should use modeltranslation.

We gave a try to django-parler and we had to migrate all back to modeltranslation in the last minute because it was making 3000 queries on the backend taking 10 second to load the change list, where you actually can't prefetch because is filtered.

@skirsdeda
Copy link
Contributor Author

@cho-is I think it's too late for me because a lot of my code relies on django-parler API now. And then parler is used in some packages which are my dependencies. After all, django-parler provides a nice abstraction layer (it's API basically) while all the problems lie in what could be a backend code (if it was structured that way). And if the current backend code could be easily swapped for either suggested storage alternative (HStore) or even modeltranslation equivalent then it would be a big win for everyone.

Another consideration is that HStore would be superior in every way (for PostgreSQL users) and it is impossible to implement using modeltranslation API!

@chogarcia
Copy link

It was too late also for us but we had no choice. Good luck!

@mdentremont
Copy link

@cho-is Would you be able to prefetch the filtered list using the new Prefetch objects (ex: qs.prefetch_related('lookup__field', queryset=filtered_qs)?

@chogarcia
Copy link

No, it didn't work for me. I think was because the prefetch does not work if the queryset is already filtered.

@vdboor
Copy link
Contributor

vdboor commented Sep 30, 2015

I'm sorry to hear you're facing these issues.

The prefetch should work, but when you filter a queryset again after the prefetch code ran, the cache would be invalidated. After all - it can't produce correct results if the queryset changes.
You can filter querysets and do a prefetch however. The prefetch query that Django uses is in fact SELECT * from ..translations where id IN (...)

Doing .prefetch_related('translations') would help you in most cases. Parler will detect there is a prefetch cache, and read from there.

Regarding the backend idea. I'm not entirely sure how that will work out. If you can find a clean implementation, I'd be interested in merging that.

@vstoykov
Copy link
Contributor

vstoykov commented May 9, 2017

One thing to consider for optimizing queries can be using Django's ForeignObject with the idea from django-composite-foreignkey (or to use django-composite-foreignkey direclty if possible) to create relations between master object and translations.

And then for example the query:

MyTranslatedModel.objects.language()

Can join both tables and attach translation only for current language. This can be achieved by attaching ForeignObject for every configured language to the MyTranslatedModel.

For example if we have the following languages ['en', 'fr', 'de'] then in instances of MyTranslatedModel we will have not only transaltions related manager but also translation_en, translation_fr and translation_de which will directly return MyTranslatedModelTranslation instance for the given language. This virtual "foreign key" can be used in select_related and django will make a join. This select_related can be added by django-parler automatically when objects.language() method was called.

Even fallback languages (if any) can be joined by that way in order to save extra queries when given object is not translated in current (given) langauge.

@vdboor
Copy link
Contributor

vdboor commented May 22, 2017

Thanks for mentioning django-composite-foreignkey! That really helps finding the ORM intergration that is needed here.

However, this is something I love to accept a pull request from, but won't be implementing myself. Any takers?

@vstoykov
Copy link
Contributor

There are two merge requests for optimizing the queries #216 and #219

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants