indextool --buildidf is not working #1100

Closed
sanikolaev opened this issue Apr 19, 2023 · 5 comments


@sanikolaev
Collaborator

sanikolaev commented Apr 19, 2023

MRE:

➜  ~ cat csv.conf  
searchd {  
    listen = 9315:mysql41  
    log = searchd.log  
    pid_file = searchd.pid  
    binlog_path =  
}  
  
source src {  
    type = csvpipe  
    csvpipe_command = echo "1,a"; echo "2,b"  
    csvpipe_field = f  
}  
  
index idx {  
    type = plain  
    source = src  
    path = /tmp/idx  
}  
  
  
➜  ~ indexer -c csv.conf --all  
Manticore 6.0.5 70662654f@230405 dev  
Copyright (c) 2001-2016, Andrew Aksyonoff  
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)  
Copyright (c) 2017-2023, Manticore Software LTD (https://manticoresearch.com)  
  
WARNING: Error initializing columnar storage: daemon requires columnar library v21 (trying to load v18)  
WARNING: Error initializing secondary index: daemon requires secondary library v8 (trying to load v6)  
using config file '/Users/sn/csv.conf'...  
indexing table 'idx'...  
collected 2 docs, 0.0 MB  
creating lookup: 0.0 Kdocs, 100.0% done  
sorted 0.0 Mhits, 100.0% done  
total 2 docs, 2 bytes  
total 0.029 sec, 68 bytes/sec, 68.15 docs/sec  
total 3 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg  
total 15 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg  
  
➜  ~ indextool -c csv.conf --buildidf /tmp/idx.spd --out /tmp/global.idf  
Error initializing columnar storage: daemon requires columnar library v21 (trying to load v18)Error initializing secondary index: daemon requires secondary library v8 (trying to load v6)Manticore 6.0.5 70662654f@230405 dev  
Copyright (c) 2001-2016, Andrew Aksyonoff  
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)  
Copyright (c) 2017-2023, Manticore Software LTD (https://manticoresearch.com)  
  
using config file '/Users/sn/csv.conf'...  
read 0.0 of 0.0 MB, 100.0% done  
0 documents, 0 words (0 read, 0 merged, 0 skipped)  
writing /tmp/global.idf (0M)...  
finished in 0.0 sec  
➜  ~ ls -la /tmp/global.idf  
ls: /tmp/global.idf: No such file or directory  

Expected: more than 0 documents found, and the output file exists.

@githubmanticore
Contributor

➤ Sergey Nikolaev commented:

Sergey will make a cleaner test without these warnings:

WARNING: Error initializing columnar storage: daemon requires columnar library v21 (trying to load v18)  
WARNING: Error initializing secondary index: daemon requires secondary library v8 (trying to load v6)  

@sanikolaev
Collaborator Author

Retest:

➜  ~ cat csv.conf
searchd {
    listen = 9315:mysql41
    log = searchd.log
    pid_file = searchd.pid
    binlog_path =
}

source src {
    type = csvpipe
    csvpipe_command = echo "1,acd def ghi"; echo "2,abc ghi xyz"; echo "3,def hjk kjh"
    csvpipe_field = f
}

index idx {
    type = plain
    source = src
    path = /tmp/idx
}

➜  ~ indexer -c csv.conf --all
Manticore 6.0.5 19c3ca50e@230428 dev (columnar 2.0.5 24e76dd@230422) (secondary 2.0.5 24e76dd@230422)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2023, Manticore Software LTD (https://manticoresearch.com)

using config file '/Users/sn/csv.conf'...
indexing table 'idx'...
collected 3 docs, 0.0 MB
creating secondary index
creating lookup: 0.0 Kdocs, 100.0% done
sorted 0.0 Mhits, 100.0% done
total 3 docs, 33 bytes
total 0.023 sec, 1375 bytes/sec, 125.01 docs/sec
total 3 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 15 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg

➜  ~ indextool -c csv.conf --buildidf /tmp/idx.spd --out /tmp/global.idf
Manticore 6.0.5 19c3ca50e@230428 dev (columnar 2.0.5 24e76dd@230422) (secondary 2.0.5 24e76dd@230422)
Copyright (c) 2001-2016, Andrew Aksyonoff
Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Copyright (c) 2017-2023, Manticore Software LTD (https://manticoresearch.com)

using config file '/Users/sn/csv.conf'...
read 0.0 of 0.0 MB, 100.0% done
0 documents, 0 words (0 read, 0 merged, 0 skipped)
writing /tmp/global.idf (0M)...
finished in 0.0 sec

➜  ~ ls -la /tmp/global.idf
ls: /tmp/global.idf: No such file or directory

@githubmanticore
Contributor

➤ Aleksey N. Vinogradov commented:

This is a misunderstanding.

First, you need to dump the dictionary of an index into a file (using something like ./indextool --dumpdict idx --stats > dump_of_idx.txt or similar).

Then you can run indextool --buildidf dump_of_idx.txt --out global.idf. The IDF builder doesn't understand .spd files (which are not a dictionary at all) or even .spi files, only textual dumps, which usually look like this:

keyword,docs,hits,offset 
abc,1,1,19 
acd,1,1,29 
def,2,2,1 
ghi,2,2,10 
hjk,1,1,39 
kjh,1,1,24 
xyz,1,1,34 
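
The two-step workflow described above can be sketched as a small shell script. This is only an illustration under stated assumptions: is_dict_dump and build_global_idf are hypothetical helper names, not part of indextool; the header check assumes a dump contains the keyword,docs,hits,offset line shown in the sample; and passing -c to both steps follows the MRE rather than anything the tool is known to require.

```shell
#!/bin/sh
# Sketch only: is_dict_dump and build_global_idf are hypothetical helpers,
# not part of indextool.

# Succeeds if the file contains the CSV header that a textual dictionary
# dump is expected to have (assumption based on the sample above).
is_dict_dump() {
    grep -q '^keyword,docs,hits,offset' "$1"
}

# Usage: build_global_idf CONFIG TABLE OUT_IDF
# Step 1 dumps the dictionary as text, step 2 builds the IDF file from it.
build_global_idf() {
    conf=$1
    table=$2
    out=$3
    dump=$(mktemp) || return 1

    # Step 1: textual dictionary dump (not the .spd or .spi files).
    if ! indextool -c "$conf" --dumpdict "$table" --stats > "$dump"; then
        rm -f "$dump"
        return 1
    fi

    # Guard against passing something that is not a dictionary dump.
    if ! is_dict_dump "$dump"; then
        echo "not a dictionary dump: $dump" >&2
        rm -f "$dump"
        return 1
    fi

    # Step 2: build the IDF file from the dump.
    indextool -c "$conf" --buildidf "$dump" --out "$out"
    status=$?
    rm -f "$dump"

    # The output should exist and be non-empty.
    [ "$status" -eq 0 ] && [ -s "$out" ]
}
```

With the configuration from the retest, this might be called as build_global_idf csv.conf idx /tmp/global.idf. The guard in the middle is there to catch the mistake from the original report, where a .spd file was passed to --buildidf instead of a textual dump.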

@githubmanticore
Contributor

➤ Aleksey N. Vinogradov commented:

However, the output file was not being closed correctly. Usually that is not an issue, but for the sake of completeness I've fixed it.

@githubmanticore
Contributor

➤ Aleksey N. Vinogradov commented:

The file closing is fixed in cc23131.
Generation of global.idf works correctly.
