* is removed from search query by ignore_chars #1072

Closed
sanikolaev opened this issue Mar 14, 2023 · 1 comment

sanikolaev (Collaborator) commented Mar 14, 2023

If you create an index with ignore_chars = * and infix search enabled (min_infix_len), no infix query can match, because * is stripped from the search keyword:

mysql> drop table if exists t; create table t(f text) min_infix_len='2' ignore_chars = '*'; insert into t values(1, 'abcdefabc'); select * from t where match('*def*'); show meta;    
--------------    
drop table if exists t    
--------------    
    
Query OK, 0 rows affected (0.03 sec)    
    
--------------    
create table t(f text) min_infix_len='2' ignore_chars = '*'    
--------------    
    
Query OK, 0 rows affected (0.01 sec)    
    
--------------    
insert into t values(1, 'abcdefabc')    
--------------    
    
Query OK, 1 row affected (0.00 sec)    
    
--------------    
select * from t where match('*def*')    
--------------    
    
Empty set (0.00 sec)    
    
--------------    
show meta    
--------------    
    
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| total          | 0     |
| total_found    | 0     |
| total_relation | eq    |
| time           | 0.000 |
| keyword[0]     | def   |
| docs[0]        | 0     |
| hits[0]        | 0     |
+----------------+-------+
7 rows in set (0.01 sec)    

The expected behaviour is that an unescaped * in the search keyword is kept intact so it can act as a wildcard; only an escaped * should be removed by ignore_chars.

sanikolaev added the bug label Mar 14, 2023

tomatolog (Contributor) commented

This should be fixed in 21966fb and ec5105c. You only need to update the daemon to get the fix, after which wildcards in the query are no longer affected by ignore_chars.

However, there is still no way to escape wildcard characters in the query: when min_infix_len or min_prefix_len is enabled, any query term containing the wildcard characters *, ? or % is expanded.

ignore_chars is always applied at indexing time, including when a document is inserted into an RT index.
For an index WITHOUT min_infix_len or min_prefix_len, ignore_chars is also applied as is to query terms at search time.
For an index WITH min_infix_len or min_prefix_len, ignore_chars is also applied to query terms at search time, but the characters *, ? and % are left untouched even if they are listed in ignore_chars.

For example, indexing the document test*2 a+b into an index with ignore_chars = *+ produces the tokens test2 and ab.

The query test*2 a+b against an index with ignore_chars = *+ is handled as follows:

  • an index WITHOUT min_infix_len or min_prefix_len turns the query into test2 ab
  • an index WITH min_infix_len or min_prefix_len turns the query into test*2 ab, and test*2 is also expanded as a wildcard term
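
The rules above can be sketched as a small Python model. This is only an illustration of the behaviour described in this comment, not the daemon's actual code: the function names (strip_chars, index_tokens, query_terms) and the wildcards_enabled flag are hypothetical, and splitting on whitespace is a simplifying assumption that stands in for the real tokenizer.

```python
# Toy model of how ignore_chars is applied, as described in this thread.
# All names here are hypothetical; this is not Manticore's implementation.

WILDCARDS = set("*?%")  # wildcard characters named in the comment above


def strip_chars(text, chars):
    """Return text with every character found in chars removed."""
    return "".join(ch for ch in text if ch not in chars)


def index_tokens(document, ignore_chars):
    """Indexing side: ignore_chars is always applied to every token."""
    ignored = set(ignore_chars)
    # Assumption: whitespace splitting approximates the tokenizer.
    return [strip_chars(word, ignored) for word in document.split()]


def query_terms(query, ignore_chars, wildcards_enabled):
    """Query side: wildcards_enabled stands for min_infix_len or
    min_prefix_len being set on the index."""
    ignored = set(ignore_chars)
    if wildcards_enabled:
        # Wildcard characters survive even if listed in ignore_chars.
        ignored -= WILDCARDS
    return [strip_chars(word, ignored) for word in query.split()]


if __name__ == "__main__":
    print(index_tokens("test*2 a+b", "*+"))
    print(query_terms("test*2 a+b", "*+", wildcards_enabled=False))
    print(query_terms("test*2 a+b", "*+", wildcards_enabled=True))
```

With the inputs from the example, the model gives test2 ab for indexing and for a query without wildcard support, and test*2 ab for a query with min_infix_len or min_prefix_len enabled. The model stops at term normalization and does not attempt the wildcard expansion itself.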
