Optimize and fix the Sphinx indexes SQL query. #486
By taking advantage of cross-linking, we can get both direct and indirect
The only downside of this approach is that it doesn’t work when two
select s.sentence_id, s.translation_id from sentences_translations as s left join sentences_translations s2 on s.translation_id = s2.sentence_id and s.sentence_id = s2.translation_id where s2.sentence_id is NULL;
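To make the effect of the query above easier to see, here is a small sketch that runs it against an in-memory SQLite database. The SQL is the same statement, only reformatted. The two-column table definition and the sample rows are made up for illustration and are not the real schema or data. Read on its own, the query returns every link (a, b) for which no reverse row (b, a) exists.

```python
import sqlite3

# The query from the discussion, reformatted for readability only.
ONE_WAY_LINKS_SQL = """
SELECT s.sentence_id, s.translation_id
FROM sentences_translations AS s
LEFT JOIN sentences_translations AS s2
       ON s.translation_id = s2.sentence_id
      AND s.sentence_id = s2.translation_id
WHERE s2.sentence_id IS NULL
"""


def find_one_way_links(rows):
    """Return links (a, b) that have no reciprocal row (b, a)."""
    conn = sqlite3.connect(":memory:")
    # Assumed minimal schema: only the two columns the query touches.
    conn.execute(
        "CREATE TABLE sentences_translations "
        "(sentence_id INTEGER, translation_id INTEGER)"
    )
    conn.executemany("INSERT INTO sentences_translations VALUES (?, ?)", rows)
    result = sorted(conn.execute(ONE_WAY_LINKS_SQL).fetchall())
    conn.close()
    return result


# Made-up sample data: 1 <-> 2 is stored in both directions, 3 -> 4 only one way.
sample_links = [(1, 2), (2, 1), (3, 4)]
print(find_one_way_links(sample_links))  # [(3, 4)]
```

The left join pairs each link with its mirror image, and the `IS NULL` filter keeps only the links whose mirror image is missing.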
Some quick comparison of the data returned by this new query shows that
Such a query returned many English sentences without any translation,
Running the query to generate indexes for Turkish sentences (about 200k)
Are you talking about a query whose results are visible to the end user, or one that will be processed further? Currently, whatever we display to the user makes a firm distinction between direct and indirect translations. If I were to do a query, there are situations (especially as an advanced contributor looking for sentences to link) where I might want to know indirect translations, but most of the time, I would want to know only direct translations. I would rather have to do a separate query for each one.
We should fix the five sentences with broken links. I'll write that up as a separate issue.
I’m talking about the query used to tell Sphinx what sentences are translated into what languages, so that one can perform a search based on that criteria. We don’t make a distinction between “indirectly translated into language X” and “directly translated into language X” while performing a search. Sphinx just returns “every sentence in language X directly or indirectly translated into language Y”, and then, for each of these sentences, we look up translations and display them. So it’s a two-step process and I was only talking about the first step. Technically, we were using a single query before too, but it was two queries joined by an SQL union, which performed slower.
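As a rough illustration of that first step, here is a sketch of the older union-style formulation: one sub-query for direct translations (one hop along a link), one for indirect translations (two hops), merged with `UNION`. This is not the actual indexing query. The `sentences` table with `id` and `lang` columns, the function name `sentences_translated_into`, and the sample data are all assumptions made for the example, and links are assumed to be stored in both directions.

```python
import sqlite3

# Direct translations are one hop along a link; indirect translations are
# two hops, skipping paths that lead straight back to the starting sentence.
TRANSLATION_LANGS_SQL = """
SELECT st.sentence_id AS id, t.lang AS lang
FROM sentences_translations AS st
JOIN sentences AS t ON t.id = st.translation_id
UNION
SELECT st1.sentence_id, t.lang
FROM sentences_translations AS st1
JOIN sentences_translations AS st2 ON st2.sentence_id = st1.translation_id
JOIN sentences AS t ON t.id = st2.translation_id
WHERE st2.translation_id != st1.sentence_id
"""


def sentences_translated_into(conn, source_lang, target_lang):
    """IDs of sentences in source_lang with a direct or indirect
    translation in target_lang (step one of the two-step search)."""
    query = f"""
    SELECT s.id
    FROM sentences AS s
    JOIN ({TRANSLATION_LANGS_SQL}) AS tl ON tl.id = s.id
    WHERE s.lang = ? AND tl.lang = ?
    ORDER BY s.id
    """
    return [row[0] for row in conn.execute(query, (source_lang, target_lang))]


def build_sample_db():
    """Hypothetical data: 1 (eng) <-> 2 (fra) <-> 3 (jpn); 4 (eng) is unlinked."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sentences (id INTEGER PRIMARY KEY, lang TEXT)")
    conn.execute(
        "CREATE TABLE sentences_translations "
        "(sentence_id INTEGER, translation_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sentences VALUES (?, ?)",
        [(1, "eng"), (2, "fra"), (3, "jpn"), (4, "eng")],
    )
    conn.executemany(
        "INSERT INTO sentences_translations VALUES (?, ?)",
        [(1, 2), (2, 1), (2, 3), (3, 2)],
    )
    return conn


conn = build_sample_db()
print(sentences_translated_into(conn, "eng", "fra"))  # [1] (direct)
print(sentences_translated_into(conn, "eng", "jpn"))  # [1] (indirect, via 2)
```

The search step does not record whether a match was direct or indirect. That distinction only comes back in the second step, when the translations of each returned sentence are looked up for display.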