Duplicate records during SELECT #946

Closed
blacksoul000 opened this issue Nov 20, 2022 · 5 comments

blacksoul000 commented Nov 20, 2022

Query to run:
SELECT id FROM indexed_2e5916bd_4e53_46b0_b2ee_a87689625c70 WHERE MATCH('-test') order by id asc LIMIT 50 OPTION not_terms_only_allowed=1;
I expect unique ids, but some of them are duplicated.
The table was created with the query below, and the id field is auto-generated by Manticore.
Query:

CREATE TABLE IF NOT EXISTS indexed_2e5916bd_4e53_46b0_b2ee_a87689625c70 (object_id string, parent_id string, X0037001F text, LONG_X0037001F text, X0E04001F text, LONG_X0E04001F text, X0E03001F text, LONG_X0E03001F text, X0C1F001F text, LONG_X0C1F001F text, X0C1A001F text, LONG_X0C1A001F text, X0065001F text, LONG_X0065001F text, X0042001F text, LONG_X0042001F text, p16_string text, p16_long_string text, p3586_string text, p3586_long_string text, p4096_string text, p4096_long_string text, p26_string text, p26_long_string text) expand_keywords = '1' min_infix_len = '2'    

Version: Manticore 5.0.2 348514c@220530 dev

indexed_2e5916bd_4e53_46b0_b2ee_a87689625c70.zip

@sanikolaev
Collaborator

@blacksoul000 The archive is corrupted. Please reupload.

sanikolaev added the "waiting" label (Waiting for the original poster (in most cases) or something else) on Nov 24, 2022

@sanikolaev
Collaborator

@blacksoul000 Never mind, I've found the original archive.

@sanikolaev
Collaborator

I confirm the issue can be reproduced with both 5.0.2 and one of the latest commits on the master branch. This archive includes not only the data but also a config:
issue-946.tgz

sanikolaev removed the "waiting" label (Waiting for the original poster (in most cases) or something else) on Nov 24, 2022

@githubmanticore
Contributor

➤ Sergey Nikolaev commented:

Here's what the issue looks like live:

snikolaev@dev:~/issue-946$ ~/searchd -c configless.conf 
Manticore 5.0.3 d761c0f@221116 dev 
Copyright (c) 2001-2016, Andrew Aksyonoff 
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com) 
Copyright (c) 2017-2022, Manticore Software LTD (https://manticoresearch.com) 
 
[37:56.330] [3694847] WARNING: Error initializing columnar storage: daemon requires columnar library v16 (trying to load v15) 
[37:56.348] [3694847] using config file 'configless.conf' (161 chars)... 
starting daemon version '5.0.3 d761c0f@221116 dev' ... 
listening on 127.0.0.1:9315 for mysql 
listening on 127.0.0.1:9316 for sphinx and http(s) 
precaching index 'indexed_2e5916bd_4e53_46b0_b2ee_a87689625c70' 
precached 1 indexes in 0.150 sec 
snikolaev@dev:~/issue-946$ mysql -P9315 -h0 
Welcome to the MySQL monitor.  Commands end with ; or \g. 
Your MySQL connection id is 1 
Server version: 5.0.3 d761c0f@221116 dev git branch master...origin/master 
 
Copyright (c) 2000, 2021, Oracle and/or its affiliates. 
 
Oracle is a registered trademark of Oracle Corporation and/or its 
affiliates. Other names may be trademarks of their respective 
owners. 
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. 
 
mysql> SELECT id FROM indexed_2e5916bd_4e53_46b0_b2ee_a87689625c70 WHERE MATCH('-test') order by id asc LIMIT 50 OPTION not_terms_only_allowed=1; 
+------+ 
| id   | 
+------+ 
|   50 | 
|   55 | 
|   57 | 
|   59 | 
|  161 | 
|  167 | 
|  169 | 
|  171 | 
|  173 | 
|  175 | 
|  178 | 
|  180 | 
|  182 | 
|  184 | 
|  186 | 
|  188 | 
|  191 | 
|  193 | 
|  196 | 
|  198 | 
|  200 | 
|  203 | 
|  206 | 
|  208 | 
|  214 | 
|  214 | 
|  216 | 
|  216 | 
|  218 | 
|  218 | 
+------+ 
30 rows in set (0.07 sec) 

WHERE MATCH('-test') may be the root cause.
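
For anyone checking their own result sets for the same symptom, here is a minimal sketch of a client-side duplicate check. It is not part of Manticore; the helper name `find_duplicate_ids` is made up for illustration, and the sample ids are copied from the tail of the output above.

```python
from collections import Counter


def find_duplicate_ids(ids):
    """Return a dict mapping each id that occurs more than once to its count."""
    counts = Counter(ids)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Last eight ids from the result set shown above.
tail_ids = [206, 208, 214, 214, 216, 216, 218, 218]

print(find_duplicate_ids(tail_ids))  # {214: 2, 216: 2, 218: 2}
```

An empty dict means every id in the list is unique.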

@githubmanticore
Contributor

➤ Stan commented:

Fixed duplicate documents in the result set for queries that use the not_terms_only_allowed option against an RT index with killed documents, in commit 1d3f0da.
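
The fix note mentions killed documents. As a purely illustrative toy model of how unfiltered killed rows could surface as duplicates, consider the sketch below. This is an assumption for explanation only: the chunk layout, the `killed` flag, and the function `scan_all` are hypothetical and do not reflect Manticore's actual storage format or code.

```python
# Toy model only: each "chunk" is a list of (doc_id, killed) rows.
# Hypothetical layout, NOT Manticore's real on-disk structures.


def scan_all(chunks, skip_killed):
    """Enumerate every row in every chunk, as a negation-only query would."""
    result = []
    for chunk in chunks:
        for doc_id, killed in chunk:
            if skip_killed and killed:
                continue  # superseded row, must not reach the result set
            result.append(doc_id)
    return sorted(result)


chunks = [
    [(208, False), (214, True), (216, True)],  # older rows, two of them killed
    [(214, False), (216, False)],              # live versions of the same ids
]

print(scan_all(chunks, skip_killed=False))  # [208, 214, 214, 216, 216]
print(scan_all(chunks, skip_killed=True))   # [208, 214, 216]
```

In this model, a full enumeration that ignores the killed flag returns both the old and the live row for the same id, which matches the shape of the duplicates seen in the report.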
