Optimization of "not translated into" filter #690

Merged (3 commits) on Jun 27, 2015

trang (Member) commented Jun 20, 2015

The query that is executed when users display sentences not translated into XXX contains several joins.

SELECT DISTINCT s.id FROM sentences s
  JOIN sentences_translations st ON ( s.id = st.sentence_id )
  JOIN sentences t on ( st.translation_id = t.id )
WHERE s.lang = '$source' AND t.lang = '$target'

We could do without these JOINs because the required fields are already available in the sentences_translations table, in the sentence_lang and translation_lang fields.

SELECT sentence_id FROM sentences_translations
WHERE sentence_lang = '$source'
AND translation_lang = '$target'
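To make the equivalence concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The schema is a stripped-down assumption (only the columns the two queries touch), the sample rows are made up, and SQLite merely stands in for the project's real database so that the example is runnable. It suggests that both queries select the same sentence ids as long as sentence_lang and translation_lang match sentences.lang.

```python
import sqlite3

# Simplified, assumed schema: only the columns the two queries touch.
SCHEMA = """
CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT);
CREATE TABLE sentences_translations (
    sentence_id INTEGER, translation_id INTEGER,
    sentence_lang TEXT, translation_lang TEXT
);
"""

# Made-up sample data; in this sample each link is stored in both directions.
SAMPLE = """
INSERT INTO sentences VALUES (1,'eng'), (2,'fra'), (3,'eng'), (4,'deu'), (5,'fra');
INSERT INTO sentences_translations VALUES
    (1,2,'eng','fra'), (2,1,'fra','eng'),
    (1,5,'eng','fra'), (5,1,'fra','eng'),
    (3,4,'eng','deu'), (4,3,'deu','eng');
"""

# Original query, with bound parameters instead of interpolated strings.
JOIN_QUERY = """
SELECT DISTINCT s.id FROM sentences s
  JOIN sentences_translations st ON (s.id = st.sentence_id)
  JOIN sentences t ON (st.translation_id = t.id)
WHERE s.lang = ? AND t.lang = ?
"""

# Proposed query: reads only the denormalized language columns.
FLAT_QUERY = """
SELECT sentence_id FROM sentences_translations
WHERE sentence_lang = ? AND translation_lang = ?
"""

def translated_ids(conn, query, source, target):
    """Return the set of source-language sentence ids translated into target."""
    return {row[0] for row in conn.execute(query, (source, target))}

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA + SAMPLE)
with_joins = translated_ids(conn, JOIN_QUERY, "eng", "fra")
without_joins = translated_ids(conn, FLAT_QUERY, "eng", "fra")
print(with_joins == without_joins)
```

Note that the query without JOINs can return the same sentence_id more than once (sentence 1 has two French translations in the sample); collecting the ids into a set hides that here.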

For a comparison, on my machine, searching for

The catch with selecting from the sentences_translations table is that the sentence_lang and translation_lang fields must stay in sync with the language stored in the sentences table.

Currently, when a user changes the language of a sentence, the sentences_translations table is not updated accordingly. I haven't taken care of this yet.
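One possible shape for that missing update, sketched with Python's sqlite3 module: the function name change_sentence_lang, the simplified schema and the sample rows are all hypothetical, and the real application code may be organized quite differently. The idea is simply to change sentences.lang and both denormalized columns within one transaction.

```python
import sqlite3

def change_sentence_lang(conn, sentence_id, new_lang):
    """Change a sentence's language and mirror it in sentences_translations."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "UPDATE sentences SET lang = ? WHERE id = ?",
            (new_lang, sentence_id),
        )
        # Links where the sentence is on the "sentence" side.
        conn.execute(
            "UPDATE sentences_translations SET sentence_lang = ? "
            "WHERE sentence_id = ?",
            (new_lang, sentence_id),
        )
        # Links where the sentence is on the "translation" side.
        conn.execute(
            "UPDATE sentences_translations SET translation_lang = ? "
            "WHERE translation_id = ?",
            (new_lang, sentence_id),
        )

def demo_lang_change():
    """Build a tiny made-up database, relabel sentence 2, return the links."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT);
        CREATE TABLE sentences_translations (
            sentence_id INTEGER, translation_id INTEGER,
            sentence_lang TEXT, translation_lang TEXT
        );
        INSERT INTO sentences VALUES (1,'eng'), (2,'fra');
        INSERT INTO sentences_translations VALUES
            (1,2,'eng','fra'), (2,1,'fra','eng');
    """)
    change_sentence_lang(conn, 2, "spa")
    return conn.execute(
        "SELECT sentence_id, translation_id, sentence_lang, translation_lang "
        "FROM sentences_translations ORDER BY sentence_id"
    ).fetchall()
```

Keeping the three statements in a single transaction means a failure halfway through cannot leave the two tables disagreeing.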

So far I have only added a file containing the queries that fix the sentences_translations table, but the last queries are really, really slow to execute. I don't know if there's a way to make them faster, or if it's better to just retrieve the ids to delete and generate a script with a bunch of DELETE FROM sentences_translations WHERE sentence_id = XXX.
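The second option (retrieve the ids first, then delete by id) could look roughly like the sketch below, again using Python's sqlite3 module. Which rows actually need deleting is not spelled out in this description; the sketch assumes they are links that point at sentences which no longer exist. The helper names ids_to_delete and delete_by_id, the schema and the data are hypothetical.

```python
import sqlite3

LINK_COLUMNS = ("sentence_id", "translation_id")

def ids_to_delete(conn, column):
    """Ids in the given link column that have no matching row in sentences.

    Assumption: "rows to delete" means links to sentences that are gone.
    """
    if column not in LINK_COLUMNS:  # whitelist before formatting into SQL
        raise ValueError(column)
    query = f"""
        SELECT DISTINCT st.{column}
        FROM sentences_translations st
        LEFT JOIN sentences s ON st.{column} = s.id
        WHERE s.id IS NULL
    """
    return [row[0] for row in conn.execute(query)]

def delete_by_id(conn, column, ids):
    """Run one simple DELETE per id instead of a single DELETE with a join."""
    if column not in LINK_COLUMNS:
        raise ValueError(column)
    with conn:
        conn.executemany(
            f"DELETE FROM sentences_translations WHERE {column} = ?",
            [(i,) for i in ids],
        )

def demo_cleanup():
    """Made-up data: sentence 99 was deleted but its links were left behind."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT);
        CREATE TABLE sentences_translations (
            sentence_id INTEGER, translation_id INTEGER,
            sentence_lang TEXT, translation_lang TEXT
        );
        INSERT INTO sentences VALUES (1,'eng'), (2,'fra');
        INSERT INTO sentences_translations VALUES
            (1,2,'eng','fra'), (2,1,'fra','eng'),
            (1,99,'eng','fra'), (99,1,'fra','eng');
    """)
    found = {col: ids_to_delete(conn, col) for col in LINK_COLUMNS}
    for col, ids in found.items():
        delete_by_id(conn, col, ids)
    remaining = conn.execute(
        "SELECT COUNT(*) FROM sentences_translations"
    ).fetchone()[0]
    return found, remaining
```

Whether this is faster than a single DELETE with a join on a large table would have to be measured; the sketch only shows the mechanics.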

So there are at least 2 things to do before this pull request can be merged:

  1. Update the sentences_translations table when the language of a sentence changes.
  2. Find a faster way to sync the languages in sentences_translations with the languages in sentences.
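For the second item, a one-off resync could be sketched as follows. This is an illustration under assumptions, not the script from this pull request: the schema is simplified, the data is made up, and it relies on SQLite's NULL-safe `IS NOT` comparison (on MySQL, the NULL-safe operator `<=>` would play that role). Each statement only rewrites rows whose stored language differs from sentences.lang, so rows that are already correct are left untouched.

```python
import sqlite3

# Rewrites one denormalized language column from sentences.lang, touching
# only rows whose sentence still exists and whose stored language differs.
RESYNC_SQL = """
UPDATE sentences_translations
SET {col_lang} = (
    SELECT s.lang FROM sentences s
    WHERE s.id = sentences_translations.{col_id}
)
WHERE EXISTS (
    SELECT 1 FROM sentences s
    WHERE s.id = sentences_translations.{col_id}
      AND s.lang IS NOT sentences_translations.{col_lang}  -- NULL-safe in SQLite
)
"""

def resync_langs(conn):
    """Fix stale sentence_lang / translation_lang values; return rows changed."""
    changed = 0
    with conn:
        for col_id, col_lang in (
            ("sentence_id", "sentence_lang"),
            ("translation_id", "translation_lang"),
        ):
            cur = conn.execute(RESYNC_SQL.format(col_id=col_id, col_lang=col_lang))
            changed += cur.rowcount
    return changed

def demo_resync():
    """Made-up data: sentence 2 is 'fra', but its links still say 'spa'."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT);
        CREATE TABLE sentences_translations (
            sentence_id INTEGER, translation_id INTEGER,
            sentence_lang TEXT, translation_lang TEXT
        );
        INSERT INTO sentences VALUES (1,'eng'), (2,'fra');
        INSERT INTO sentences_translations VALUES
            (1,2,'eng','spa'), (2,1,'spa','eng');
    """)
    changed = resync_langs(conn)
    rows = conn.execute(
        "SELECT sentence_id, translation_id, sentence_lang, translation_lang "
        "FROM sentences_translations ORDER BY sentence_id"
    ).fetchall()
    return changed, rows
```

Restricting the UPDATE to mismatched rows is one way to keep the amount of rewritten data small; indexes on sentence_id and translation_id would likely matter just as much on a large table.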

This script syncs the languages (sentence_lang and translation_lang)
between the sentences_translations table and the sentences table.
trang added this to the 2015-06-29 milestone Jun 20, 2015
jiru (Member) commented Jun 20, 2015

> I have only added a file containing the queries to fix the sentences_translations table but the last queries are really, really slow to execute. I don't know if there's a solution to make them faster, or if it's better to just retrieve the id's to delete and generate a script with a bunch of DELETE FROM sentences_translations WHERE sentence_id = XXX.

I don’t think so. I tried to run these two queries on the dev database (and rolled back after). Each removed 2331 rows. The first one (join on st.sentence_id = s.id) took about 25 seconds to execute, but the second one (join on st.translation_id = s.id) took about 2 minutes to complete. It’s still reasonable for a one-time use, but that difference is rather strange. EXPLAIN doesn’t show any difference.

trang added a commit that referenced this pull request Jun 27, 2015
Optimization of "not translated into" filter
trang merged commit 6ceada2 into dev Jun 27, 2015
trang deleted the optimization-not-translated-into branch June 28, 2015 17:41