Prioritizing Search Results (Enhancement) #1713

ckjpn · 2018-11-02T01:58:33Z

Perhaps it's not possible now, and maybe never will be, but it would likely be useful to prioritize, or at least have the option to prioritize, search results as follows.

sentences with audio
sentences tagged OK
sentences by native speakers
and finally all other sentences.

For people searching for English sentences, the following would be a useful sequence.

sentences with audio
sentences on List 907
sentences by native speakers
and finally all other sentences.

It would probably also be good to give members the option to skip the List 907 step if they think that's not a good way to prioritize sentences.

alanfgh · 2018-11-03T21:04:25Z

What is the problem that this suggestion is meant to fix?

People can do this prioritization right now. That is, they can conduct advanced searches for sentences with audio, and/or sentences on any list they desire, and/or by native speakers. As for forcing this prioritization on them, I think it's a bad idea. First of all, there are bound to be people who want to search preferentially for sentences without these characteristics. For instance, they might want to find sentences WITHOUT audio (because they want to add it themselves, or because they like longer or more complicated sentences, which are less likely to have audio). Secondly, entrenching a list written by one person in the site's search algorithm would subject others to that person's prejudices and vastly reduce the diversity of sentences found by people using the search. The sentences in the Tatoeba corpus are already far more homogeneous than they should be, due largely to the efforts of a single person in this direction (in part by discouraging others from translating or recording audio for sentences not on or originating from his list). It almost seems like the problem that the suggestion is meant to fix is other people's ability to choose sentences for themselves.

ckjpn · 2018-11-05T01:07:28Z

I should have said to offer members this as an option.

I think it would be a good default option, though.

New users would likely have a more positive experience on the website if this were implemented for the main "simple" search.

Members who prefer other searches should be allowed to override this in advanced searches.

ckjpn · 2018-11-21T00:36:45Z

"People can do this prioritization right now."

People can't really do this prioritization right now without doing multiple searches.

This is just a follow-up, which may not matter if this kind of prioritization is not possible.

What happens now is that a member can search for sentences with audio and, if no such sentences are found, then do a search for sentences by native speakers and, if none are found, do another search with no limitations.

It would be nice if, when searching, you could see the sentences with audio, followed by those without audio by native speakers, followed by all remaining sentences. This would be especially useful for search results with only a few audio files.

The point is to give visitors a positive experience, much like Google does when they prioritize search results.

For students of English (my students, for example), it would be useful and beneficial if they could see sentences that they could trust, with some certainty, to be correct and natural-sounding. Sentences that have been recorded have gone through one extra proofreading, so are likely to be the most trustworthy, followed by sentences on List 907, followed by sentences owned by people who claim they are native speakers. The least trustworthy would be the remaining owned sentences, followed by the unowned sentences. It's often difficult for non-native speakers to assess whether a sentence is good or not.

For students of Japanese, and maybe most other languages, the sequence would likely be 1) sentences with audio 2) sentences tagged OK by native speakers 3) sentences by those claiming to be native speakers 4) all other sentences. At least with Japanese sentences, it seems that a lot of the orphan sentences and sentences owned by non-native speakers are about the same quality.
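To make the idea concrete: the request is for results that are sorted by tier instead of being filtered by one criterion. Here is a minimal sketch in Python. The `Sentence` class and its fields (`has_audio`, `on_list`, `by_native`) are made up for illustration and are not Tatoeba's actual data model; `on_list` stands in for "on List 907" or "tagged OK", depending on the language.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    # Hypothetical fields for illustration only; not Tatoeba's real schema.
    text: str
    has_audio: bool = False
    on_list: bool = False    # e.g. on List 907, or tagged OK
    by_native: bool = False  # owner claims to be a native speaker


def tier(s: Sentence, use_list_tier: bool = True) -> int:
    """Return the first tier a sentence qualifies for (lower = shown earlier)."""
    if s.has_audio:
        return 0
    if use_list_tier and s.on_list:
        return 1
    if s.by_native:
        return 2
    return 3


def prioritize(results, use_list_tier: bool = True):
    """Order results by tier.

    Python's sorted() is stable, so sentences within the same tier keep
    whatever order the underlying search already gave them.
    """
    return sorted(results, key=lambda s: tier(s, use_list_tier))
```

Passing `use_list_tier=False` corresponds to letting a member skip the list step: sentences that are only on the list then fall through to the native-speaker tier or to the remaining sentences.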

ckjpn · 2019-03-14T01:58:44Z

This is just to help clarify what I think would be useful.

If a student wants to study sentences with the word "email", it would be useful if sentences were shown in this order.

With Audio
https://tatoeba.org/eng/sentences/search?query=email&from=eng&to=und&orphans=no&unapproved=no&user=&tags=&list=&has_audio=yes&trans_filter=limit&trans_to=und&trans_link=&trans_user=&trans_orphan=&trans_unapproved=&trans_has_audio=&sort=words
email - with audio (108 results)
And then the additional 80 sentences that can be found with this search.
https://tatoeba.org/eng/sentences/search?query=email&from=eng&to=und&orphans=no&unapproved=no&user=&tags=&list=907&has_audio=&trans_filter=limit&trans_to=und&trans_link=&trans_user=&trans_orphan=&trans_unapproved=&trans_has_audio=&sort=words
email - on List 907 (188 results)
And then the additional sentences shown on this search, that haven't already been shown.
https://tatoeba.org/eng/sentences/search?query=email&from=eng&to=und&orphans=no&unapproved=no&native=yes&user=&tags=&list=&has_audio=&trans_filter=limit&trans_to=und&trans_link=&trans_user=&trans_orphan=&trans_unapproved=&trans_has_audio=&sort=words
email - by self-proclaimed native speakers (232 results)
And then the additional sentences shown on this search, that haven't already been shown.
https://tatoeba.org/eng/sentences/search?query=email&from=eng&to=und
email - all results (329 results)

Perhaps if this is not possible, then maybe you could make it possible to do the 2nd search with a way to filter out sentences found by the first search (etc.). This would allow students to manually do 4 searches to get similar results.
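The manual fallback (run the searches one after another and skip anything already shown) could look roughly like the sketch below. The function name `merge_searches` and the sentence IDs are invented for the example; each list stands for the IDs returned by one of the four searches above.

```python
def merge_searches(*searches):
    """Concatenate result lists in order, dropping IDs that were already shown."""
    seen = set()
    merged = []
    for results in searches:
        for sentence_id in results:
            if sentence_id not in seen:
                seen.add(sentence_id)
                merged.append(sentence_id)
    return merged


# Made-up sentence IDs standing in for the four searches.
with_audio = [1, 2]
on_list_907 = [1, 2, 3]
by_native = [2, 3, 4]
all_results = [5, 4, 3, 2, 1]

print(merge_searches(with_audio, on_list_907, by_native, all_results))
# prints [1, 2, 3, 4, 5]
```

Each later search only contributes the sentences that earlier searches did not already return, which is the "filter out sentences found by the first search" behavior described above.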

ckjpn · 2019-06-28T23:03:17Z

This Wall post is somewhat related.

https://tatoeba.org/eng/wall/show_message/32118#message_32118
Feature suggestion: Sort sentences by what should be translated first.

trang · 2019-07-28T00:43:55Z

The prioritization that is suggested here is way too biased for us to consider it as a default. I think we've started to agree that randomizing, rather than trying to guess what the user wants, is the better default option (cf. Wall thread).

Implementing custom levels of prioritization as an option might be possible, but it would overcomplicate the search feature. In your example @ckjpn, even if someone could configure that they first see sentences with audio, then sentences in list 907, then sentences from native speakers, they would end up looking only at the sentences with audio. People rarely go past the first few pages when using the search.
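As a rough illustration using the "email" numbers quoted earlier in the thread (108 results with audio), and assuming a page size of 10 results, which is a value picked only for this example:

```python
AUDIO_RESULTS = 108  # "email - with audio" count quoted in the thread
PAGE_SIZE = 10       # assumed page size, for illustration only

# Number of leading pages that would contain nothing but sentences with audio
# if the audio tier were always shown first.
full_audio_pages = AUDIO_RESULTS // PAGE_SIZE
print(full_audio_pages)  # prints 10
```

Under that assumption, a user would have to go past ten full pages before reaching the first sentence from the next tier.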

For me it's clear we won't implement this feature, so I will close this issue.

jiru added the enhancement label Nov 2, 2018

ckjpn mentioned this issue Mar 6, 2019

Allow searches that are sorted by, rather than restricted by, a criterion #1804

Open

trang closed this as completed Jul 28, 2019
