Search should only work for keywords longer than 2/3 characters #274

Closed
fatcrobat opened this issue Jan 14, 2019 · 4 comments

Comments

@fatcrobat

commented Jan 14, 2019

Affected version(s)

3.5-4.6

Description

Contao search currently has no declared minimum character length for search keywords. On websites with a large number of news items and pages (big tl_search and tl_search_index tables), a search for a keyword with fewer than 3 or 4 characters slows down the database. Elasticsearch's term suggester, for instance, also has a minimum word length of 4 characters by default. (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-term.html)

How to reproduce

If you search, for instance, for the term "vob a 2019" with fuzzy search enabled, you end up with 3 search keywords:

  • vob
  • a
  • 2019
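To illustrate why the short keyword is the expensive one, here is a minimal Python sketch. It assumes that fuzzy search behaves roughly like a substring match against the indexed words; that is a simplification for illustration, not Contao's actual query. The word list and both function names are hypothetical.

```python
from typing import Dict, List


def split_keywords(query: str) -> List[str]:
    """Split a search query into whitespace-separated keywords."""
    return query.split()


def count_substring_matches(keyword: str, indexed_words: List[str]) -> int:
    """Count indexed words that contain the keyword (stand-in for fuzzy matching)."""
    needle = keyword.lower()
    return sum(1 for word in indexed_words if needle in word.lower())


# Hypothetical sample of indexed words, not real tl_search_index data.
SAMPLE_INDEX = [
    "vob", "vergabe", "bauleistungen", "2019", "2018", "ausschreibung", "vertrag",
]

keywords = split_keywords("vob a 2019")
match_counts: Dict[str, int] = {
    kw: count_substring_matches(kw, SAMPLE_INDEX) for kw in keywords
}

for kw, count in match_counts.items():
    print(f"{kw!r} matches {count} indexed word(s)")
```

Even in this tiny sample the single-letter keyword matches more words than the other two combined, and the gap grows with the size of the index.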

@fatcrobat fatcrobat changed the title Search should only work for keywords longer than 2 characters Search should only work for keywords longer than 2/3 characters Jan 14, 2019

@leofeyer leofeyer added feature and removed up for discussion labels Feb 14, 2019

@leofeyer leofeyer modified the milestone: 4.8.0 Feb 14, 2019

@leofeyer

Member

commented Feb 14, 2019

As discussed in Mumble on February 14th, we want to strip keywords that fall below a configurable minimum length. Also, the search results should show a hint if keywords have been stripped.
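A rough sketch of that behaviour in Python, for discussion only. It is not the code from the eventual commit; `filter_keywords`, `build_hint`, the default of 3 and the hint wording are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FilteredQuery:
    """Keywords that will be searched and keywords that were dropped."""
    kept: List[str] = field(default_factory=list)
    stripped: List[str] = field(default_factory=list)


def filter_keywords(query: str, min_length: int = 3) -> FilteredQuery:
    """Drop keywords shorter than min_length (hypothetical default of 3)."""
    result = FilteredQuery()
    for keyword in query.split():
        if len(keyword) < min_length:
            result.stripped.append(keyword)
        else:
            result.kept.append(keyword)
    return result


def build_hint(filtered: FilteredQuery, min_length: int) -> Optional[str]:
    """Return a user-facing hint, or None if no keyword was stripped."""
    if not filtered.stripped:
        return None
    ignored = ", ".join(filtered.stripped)
    return f"Keywords shorter than {min_length} characters were ignored: {ignored}"


filtered = filter_keywords("vob a 2019", min_length=3)
print(filtered.kept)
print(build_hint(filtered, 3))
```

Keeping the stripped keywords in the result, instead of silently discarding them, is what makes the hint possible.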

@leofeyer leofeyer added this to the 4.8.0 milestone Feb 14, 2019

@leofeyer

Member

commented Jun 28, 2019

On second thought, we should probably not even index words that fall below the minimum length in the first place, so the DB is not bloated with keywords that will never be found.

@contao/developers WDYT?

@Toflar

Member

commented Jun 28, 2019

Indexing and searching are two separate disciplines, so I think we should keep that as is for now.
You might want different configurations for different root pages while the index stays the same.
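To make the argument concrete, here is a small Python sketch in which one shared index serves two root pages with different minimum lengths applied at search time. The root page names, the index contents, the `search` function and the "match any keyword" behaviour are all hypothetical and only illustrate the separation.

```python
from typing import Dict, List

# Hypothetical per-root-page setting: minimum keyword length at search time.
MIN_LENGTH_BY_ROOT: Dict[str, int] = {"shop": 3, "intranet": 2}

# One shared index (word -> page IDs); short words are indexed as well.
SHARED_INDEX: Dict[str, List[int]] = {
    "vob": [1, 4],
    "ab": [2],
    "a": [1, 2, 3, 4],
    "2019": [4],
}


def search(query: str, root_page: str, default_min_length: int = 3) -> List[int]:
    """Return IDs of pages matching any keyword that passes the root page's limit."""
    min_length = MIN_LENGTH_BY_ROOT.get(root_page, default_min_length)
    keywords = [kw for kw in query.lower().split() if len(kw) >= min_length]
    page_ids = set()
    for keyword in keywords:
        page_ids.update(SHARED_INDEX.get(keyword, []))
    return sorted(page_ids)


print(search("vob ab", "shop"))
print(search("vob ab", "intranet"))
```

If short words were dropped while indexing, the second root page could never find "ab", whatever its own setting.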

@leofeyer leofeyer self-assigned this Jul 4, 2019

leofeyer added a commit that referenced this issue Jul 4, 2019
@leofeyer

Member

commented Jul 4, 2019

Implemented in a16279b.

@leofeyer leofeyer closed this Jul 4, 2019
