No relevance for strings <4 Characters #1562

Open
pbksol opened this issue May 27, 2019 · 8 comments

@pbksol
commented May 27, 2019

In B2, no relevance data is returned when searching for strings shorter than 4 characters. This is a bit of a show stopper, especially when dealing with IT-centric content, where almost every abbreviation has three letters :-/

I was not able to find any configuration to change this behavior. Is there any? Thanks :-)

@thorsten thorsten self-assigned this May 27, 2019
@thorsten

Owner

commented May 27, 2019

Please use Elasticsearch and the search results will improve drastically. MySQL restricts full-text search for words shorter than 4 characters.
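
As a rough illustration of why a minimum word length makes short terms invisible to a full-text index, here is a toy sketch in Python. It is not MySQL's implementation; `build_index` and `MIN_TOKEN_LEN` are hypothetical names, and the threshold of 4 is simply the one mentioned in this comment.

```python
import re
from collections import defaultdict

# Assumption: the 4-character threshold mentioned in this comment.
MIN_TOKEN_LEN = 4


def build_index(docs, min_len=MIN_TOKEN_LEN):
    """Map each lowercased token of at least min_len characters to the ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"\w+", text.lower()):
            # Tokens below the threshold never enter the index.
            if len(token) >= min_len:
                index[token].add(doc_id)
    return dict(index)


docs = {140: "Das ist ein Test mit XML!"}
index = build_index(docs)
print("xml" in index)   # False: three letters, below the threshold
print("test" in index)  # True: four letters, indexed
```

With a threshold of 3 (`build_index(docs, min_len=3)`), the same toy index would contain "xml".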

@thorsten thorsten closed this May 27, 2019
@pbksol

Author

commented May 27, 2019

Search (in the FAQ or via the API) returns results for words with 3 letters. No trouble there. It's just that relevance is not returned for these.

@thorsten thorsten reopened this May 27, 2019
@thorsten

Owner

commented May 27, 2019

Which version do you use?

@thorsten thorsten added the Bug label May 27, 2019
@pbksol

Author

commented May 27, 2019

V3b2

@thorsten thorsten added this to To Do in 3.0.0-beta.3 via automation May 28, 2019
@thorsten thorsten added this to the 3.0 milestone May 28, 2019
@thorsten thorsten moved this from To Do to In progress in 3.0.0-beta.3 Jul 31, 2019
@thorsten

Owner

commented Jul 31, 2019

This is what my current test version, running on Docker, returns for a 3-letter search via the API:

http://localhost:8080/api.php?action=search&lang=de&q=XML

```
[
  {
    id: 140,
    lang: "de",
    question: "Das ist ein Test mit XML!",
    answer: "<?xml?> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don&#39;t forget me this weekend!</body> </note>",
    keywords: "",
    category_id: 40,
    score: 5.6890445,
    link: "http://localhost:8080/index.php?action=faq&cat=40&id=140&artlang=de"
  },
  {
    id: 124,
    lang: "de",
    question: "Wann wird von einer Auftragstasche ein JDF und eine HTML-Tasche geschrieben? Wann erhält das Prinect-System eine Terminänderung? Wann ist ein Auftrag in Stratos iPoint verfügbar?",
    answer: "Wann wird in Prinance für Prinect ein JDF geschrieben? Beim Aufrufen des ...",
    keywords: "",
    category_id: 36,
    score: 3.898917,
    link: "http://localhost:8080/index.php?action=faq&cat=36&id=124&artlang=de"
  }
]
```
@pbksol

Author

commented Aug 1, 2019

That's what B2 returns, yep. But if you search for anything with more than three characters, the API returns relevance_thema, relevance_content, and relevance_keywords.

In your case, relevance_thema should be "1" for your ID 140, because "XML" occurs once in the question (or Thema).
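
To make the expectation concrete, here is a minimal Python sketch that counts occurrences the way I describe it here. `count_occurrences` is a hypothetical helper; whether the server computes relevance as a plain occurrence count is my assumption, not something the API documents.

```python
import re


def count_occurrences(term, text):
    """Count case-insensitive, non-overlapping occurrences of term in text."""
    if not term:
        return 0
    return len(re.findall(re.escape(term), text, flags=re.IGNORECASE))


# Question text of ID 140 from the response quoted above.
question = "Das ist ein Test mit XML!"
print(count_occurrences("XML", question))  # 1
```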

If you try to weight responses from the FAQ based on relevance, every three-letter acronym always returns a relevance of 0.

And yes, the relevance stuff is not documented in the API documentation at all, but it has been there since V2.x and it is the only way to sort responses based on overall relevance.
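
For context, this is roughly how a client could weight and sort on those three fields. It is only a sketch: the field names are the ones listed above, while the weights, the sample values, and the helper names are made up for illustration.

```python
# Hypothetical weights; how the three values should be combined is not documented.
WEIGHTS = {
    "relevance_thema": 3.0,
    "relevance_content": 1.0,
    "relevance_keywords": 2.0,
}


def overall_relevance(result):
    """Weighted sum of the relevance fields; missing or empty fields count as 0."""
    return sum(weight * float(result.get(key) or 0) for key, weight in WEIGHTS.items())


def sort_by_relevance(results):
    """Return the results ordered from most to least relevant."""
    return sorted(results, key=overall_relevance, reverse=True)


# Illustrative values only, not real API output.
results = [
    {"id": 1, "relevance_thema": 0, "relevance_content": 2, "relevance_keywords": 0},
    {"id": 2, "relevance_thema": 1, "relevance_content": 0, "relevance_keywords": 0},
]
print([r["id"] for r in sort_by_relevance(results)])  # [2, 1]
```

With every field at 0, as happens for three-letter terms, all results tie and the ordering carries no information.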

@thorsten

Owner

commented Aug 1, 2019

Ah, now I get it, I'll check that

@thorsten

Owner

commented Sep 22, 2019

We fold all the relevance information into the score value, so the API cannot return the individual keys...
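
A client that previously sorted on the separate relevance keys could sort on `score` instead. The sketch below assumes the endpoint returns valid JSON with a numeric `score` per result; the sample is abridged from the response quoted earlier in this thread, and `rank_by_score` is a hypothetical helper name.

```python
import json

# Abridged from the response quoted earlier in this thread, with keys quoted as valid JSON.
SAMPLE = """[
  {"id": 124, "lang": "de", "score": 3.898917},
  {"id": 140, "lang": "de", "score": 5.6890445}
]"""


def rank_by_score(payload):
    """Parse a search response and order results by descending score."""
    results = json.loads(payload)
    # Results without a score are treated as 0.0 and sink to the end.
    return sorted(results, key=lambda r: float(r.get("score", 0.0)), reverse=True)


for result in rank_by_score(SAMPLE):
    print(result["id"], result["score"])
# 140 5.6890445
# 124 3.898917
```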
