Scores from words file are not used for ql_textscore computation #1133

aindlq · 2023-11-02T20:58:14Z

It looks like, when doing a text search with ql:contains-entity and ql:contains-word, the ql_textscore_* variable simply holds the number of matching documents per entity and does not take the score column from the words file into account.

From the documentation:

The SCORE(?text) returns the number of matching records (sums of the score in the wordsfile, see above).

To me this looks like a bug, because ordering by the "real" score is extremely useful. What is the expected behavior?
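
To make the difference concrete, here is a minimal sketch in Python. It is not QLever's implementation. The row layout (word, is_entity, record_id, score), the entity names, and all values are illustrative assumptions, and `entity_scores` is a hypothetical helper. It compares a per-entity count of matching records with a per-entity sum of the score column:

```python
from collections import defaultdict

# Hypothetical words-file rows: (word, is_entity, record_id, score).
# The column layout and every value here are assumptions for illustration.
ROWS = [
    ("madonna", 0, 1, 3),
    ("<Artwork_A>", 1, 1, 3),
    ("madonna", 0, 2, 1),
    ("<Artwork_A>", 1, 2, 1),
    ("madonna", 0, 3, 5),
    ("<Artwork_B>", 1, 3, 5),
]


def entity_scores(rows, word):
    """Return {entity: (matching_record_count, summed_score)} for records containing `word`."""
    # Records in which the search word occurs as a plain (non-entity) word.
    records_with_word = {
        rec for w, is_ent, rec, _ in rows if not is_ent and w == word
    }
    counts = defaultdict(int)
    sums = defaultdict(int)
    for w, is_ent, rec, score in rows:
        if is_ent and rec in records_with_word:
            counts[w] += 1       # what the score variable appears to hold today
            sums[w] += score     # what the documentation describes
    return {entity: (counts[entity], sums[entity]) for entity in counts}


scores = entity_scores(ROWS, "madonna")
by_count = sorted(scores, key=lambda e: scores[e][0], reverse=True)
by_sum = sorted(scores, key=lambda e: scores[e][1], reverse=True)
```

With this made-up data the two orderings disagree: `<Artwork_A>` ranks first by record count, while `<Artwork_B>` ranks first by summed score, which is why the distinction matters for ORDER BY.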


joka921 · 2023-11-23T14:42:57Z

@NickG-1 is currently working on a thorough refactoring of the text index that also exports the real score.
However, we will (at least temporarily) drop the TEXTLIMIT feature (it doesn't quite fit into the SPARQL standard, and we don't find ourselves using it very often).
Would that be an issue for you?

aindlq · 2023-11-24T07:31:11Z

@joka921 thank you for the update! That is good to know.

I haven't used it so far precisely because it is a non-standard SPARQL extension, and all our tooling expects standard SPARQL at various levels of the system. So having it specified with a magic predicate is much preferable to a non-standard TEXTLIMIT.

In my view, something like a text limit will certainly be necessary at some point, because otherwise one can run into trouble with search queries that return too many documents.

For example, in our dataset about works of art, querying for an "anonymous" author or for "Madonna" artworks produces too many matching documents. But it is definitely not a showstopper.

One more thing: when working with bigger documents, I think it is more convenient to get back just the matched document ID rather than the whole document text.
