rocksdb backend not working #22

Closed
tmm1 opened this issue May 24, 2016 · 15 comments · Fixed by #23

tmm1 commented May 24, 2016

I have written some very straightforward bleve code to index and search data. It works great with the default boltdb backend.

Today I tried to switch to the rocksdb backend, but it's not working. I am unable to retrieve any data out of the database. All search requests return 0 results, and DocCount() returns 0 as well.

I have confirmed rocksdb is being used correctly. I also see the database size growing on disk after I index documents:

$ cat index_meta.json
{"storage":"rocksdb","index_type":"upside_down"}

$ du -sh store
 36M    store

$ ls -1 store/*.sst | wc -l
      19

I installed rocksdb using brew:

/usr/local/Cellar/rocksdb/4.5.1/lib/librocksdb.4.5.1.dylib

and am using blevex/rocksdb as follows:

val, err = bleve.NewUsing(path, indexMapping(), bleve.Config.DefaultIndexType, rocksdb.Name, nil)

mschoch commented May 24, 2016

I don't have a lot of ideas, but that is very strange. Can you build bleve_dump with RocksDB support and run it on the index?

tmm1 commented May 24, 2016

I was able to build and run bleve_dump, and I see a lot of "Dictionary Term:" and "Document:" entries but I'm not sure what I'm looking at.

If I check DocCount() after indexing, it shows the correct value. But once I close the index and re-open in another process, it's back down to 0.

tmm1 commented May 24, 2016

I created an empty boltdb and rocksdb store and ran bleve_dump on both. The output was identical.

Then I used bleve_index to add a single document to each store, and bleve_query to look it up. The rocksdb store returns no results:

$ cat input/key.json
{"name":"MYFUNC"}

$ bleve_index -index boltdb input
2016/05/24 15:40:16 Indexing: key
$ bleve_index -index rocksdb input
2016/05/24 15:40:35 Indexing: key

$ bleve_query -index boltdb -fields "name:MYFUNC"
1 matches, showing 1 through 1, took 1.956353ms
    1. key (0.306853)
    name
        MYFUNC
$ bleve_query -index rocksdb -fields "name:MYFUNC"
No matches

The rocksdb dump appears to have binary garbage strewn through it and doesn't look like the boltdb dump.

Dictionary Term: `MYFUNCd^A^@MYFUNC` Field: 0 Count: 1
Key:   64 00 00 4d 59 46 55 4e 43 64 01 00 4d 59 46 55 4e 43
Value: 01

The rocksdb dump is also missing Field: and Backindex DocId: entries altogether.
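
To make the "binary garbage" easier to read, here is a minimal Go sketch that decodes the key bytes from the dump above. It assumes a dictionary row key is laid out as a `d` byte, a little-endian uint16 field number, and then the raw term bytes; that layout is inferred from the dump itself and is not confirmed anywhere in this thread. The names `dictKey`, `parseDictKey`, and `decodeDumpHex` are hypothetical helpers, not part of bleve.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"strings"
)

// dictKey is a hypothetical, simplified view of one dictionary row key.
type dictKey struct {
	Field uint16
	Term  []byte
}

// parseDictKey assumes the layout: 'd', uint16 field (little-endian), term bytes.
// This layout is an assumption made for illustration.
func parseDictKey(key []byte) (dictKey, error) {
	if len(key) < 3 || key[0] != 'd' {
		return dictKey{}, fmt.Errorf("not a dictionary key: % x", key)
	}
	return dictKey{
		Field: binary.LittleEndian.Uint16(key[1:3]),
		Term:  key[3:],
	}, nil
}

// decodeDumpHex converts space-separated hex, as printed in the dump, into bytes.
func decodeDumpHex(s string) ([]byte, error) {
	return hex.DecodeString(strings.ReplaceAll(s, " ", ""))
}

func main() {
	// Key bytes copied from the rocksdb dump in this comment.
	raw, err := decodeDumpHex("64 00 00 4d 59 46 55 4e 43 64 01 00 4d 59 46 55 4e 43")
	if err != nil {
		panic(err)
	}
	outer, err := parseDictKey(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("outer:    field=%d term=%q\n", outer.Field, outer.Term)

	// Check whether the term itself contains what looks like a second key header.
	if i := bytes.IndexByte(outer.Term, 'd'); i >= 0 {
		if inner, err := parseDictKey(outer.Term[i:]); err == nil {
			fmt.Printf("embedded: field=%d term=%q\n", inner.Field, inner.Term)
		}
	}
}
```

Under that assumption, the single key in the dump reads as a field 0 key for MYFUNC with what looks like a complete field 1 key for MYFUNC appended to it, as if two keys had been stored as one.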

tmm1 commented May 24, 2016

Tried facebook/rocksdb@master (librocksdb.4.8.0.dylib) with the same result.

@steveyen What version of rocksdb are you using?

tmm1 commented May 25, 2016

Same behavior with rocksdb v4.1 and v3.13. I even tried Go 1.5.1.

I also spun up a brand new Ubuntu Xenial VM with golang-1.6-go and librocksdb4.1 and tried it there. Same behavior.

mschoch commented May 25, 2016

So, obviously I haven't used rocksdb in a while on my local machine. What I have that still works is bleve_create, bleve_index, and bleve_query, all built against rocksdb-3.11.2. And your example operations work fine here:

$ echo '{"name":"MYFUNC"}' > /tmp/doc.json
$ DYLD_LIBRARY_PATH=/Users/mschoch/Documents/research/rocksdb-rocksdb-3.11.2/ bleve_create -index /tmp/rocksdb.bleve
2016/05/24 20:21:17 Created bleve index at: /tmp/rocksdb.bleve
$ DYLD_LIBRARY_PATH=/Users/mschoch/Documents/research/rocksdb-rocksdb-3.11.2/ bleve_index -index /tmp/rocksdb.bleve /tmp/doc.json
2016/05/24 20:22:20 Indexing: doc
$ DYLD_LIBRARY_PATH=/Users/mschoch/Documents/research/rocksdb-rocksdb-3.11.2/ bleve_query -index /tmp/rocksdb.bleve/ -fields "name:MYFUNC"
2016/05/24 20:22:53 i see field: 0 - 'name'
2016/05/24 20:22:53 i see field: 1 - '_all'
2016/05/24 20:22:53 field cache: &{map[name:0 _all:1] 1 {{0 0} 0 0 0 0}}
2016/05/24 20:22:53 new term reader for field 0 term myfunc
2016/05/24 20:22:53 got back index row: Backindex DocId: `doc` Term Entries: [term:"myfunc" field:0  term:"myfunc" field:1 ], Stored Entries: [field:0 ]
2016/05/24 20:22:53 field 0 has name 'name'
2016/05/24 20:22:53 got back index row: Backindex DocId: `doc` Term Entries: [term:"myfunc" field:0  term:"myfunc" field:1 ], Stored Entries: [field:0 ]
2016/05/24 20:22:53 field 0 has name 'name'
2016/05/24 20:22:53 see field named: 'name'
1 matches, showing 1 through 1, took 2.78853ms
    1. doc (0.306853)
    name
        MYFUNC

Obviously at the time I last built these I had a bunch of debug statements in the library somewhere. But, it seems it did work at some point.

I think some testing was done with 4.1. Let me try with that and current bleve and see what happens.

Also, can you post the SHA for your github.com/tecbot/gorocksdb repo?

tmm1 commented May 25, 2016

I can try with 3.11.2. Are you sure bleve_create created a rocksdb index for you? I thought it always created boltdb indexes.

tecbot/gorocksdb@59ab8de

tmm1 commented May 25, 2016

Ah, you can create a rocksdb store with bleve_create -index /tmp/test.rocksdb -store rocksdb

Here's what the bleve_dump looks like for me after the commands you ran:

Dictionary Term: `myfuncd^@^@myfunc` Field: 1 Count: 1
Key:   64 01 00 6d 79 66 75 6e 63 64 00 00 6d 79 66 75 6e 63
Value: 01

InternalStore - Key: _mapping (5f 6d 61 70 70 69 6e 67) Val: {"default_mapping":{"enabled":true,"dynamic":true,"default_analyzer":""},"type_field":"_type","default_type":"_default","default_analyzer":"standard","default_datetime_parser":"dateTimeOptional","default_field":"_all","byte_array_converter":"json","store_dynamic":true,"index_dynamic":true,"analysis":{}} (7b 22 64 65 66 61 75 6c 74 5f 6d 61 70 70 69 6e 67 22 3a 7b 22 65 6e 61 62 6c 65 64 22 3a 74 72 75 65 2c 22 64 79 6e 61 6d 69 63 22 3a 74 72 75 65 2c 22 64 65 66 61 75 6c 74 5f 61 6e 61 6c 79 7a 65 72 22 3a 22 22 7d 2c 22 74 79 70 65 5f 66 69 65 6c 64 22 3a 22 5f 74 79 70 65 22 2c 22 64 65 66 61 75 6c 74 5f 74 79 70 65 22 3a 22 5f 64 65 66 61 75 6c 74 22 2c 22 64 65 66 61 75 6c 74 5f 61 6e 61 6c 79 7a 65 72 22 3a 22 73 74 61 6e 64 61 72 64 22 2c 22 64 65 66 61 75 6c 74 5f 64 61 74 65 74 69 6d 65 5f 70 61 72 73 65 72 22 3a 22 64 61 74 65 54 69 6d 65 4f 70 74 69 6f 6e 61 6c 22 2c 22 64 65 66 61 75 6c 74 5f 66 69 65 6c 64 22 3a 22 5f 61 6c 6c 22 2c 22 62 79 74 65 5f 61 72 72 61 79 5f 63 6f 6e 76 65 72 74 65 72 22 3a 22 6a 73 6f 6e 22 2c 22 73 74 6f 72 65 5f 64 79 6e 61 6d 69 63 22 3a 74 72 75 65 2c 22 69 6e 64 65 78 5f 64 79 6e 61 6d 69 63 22 3a 74 72 75 65 2c 22 61 6e 61 6c 79 73 69 73 22 3a 7b 7d 7d)
Key:   69 5f 6d 61 70 70 69 6e 67
Value: 7b 22 64 65 66 61 75 6c 74 5f 6d 61 70 70 69 6e 67 22 3a 7b 22 65 6e 61 62 6c 65 64 22 3a 74 72 75 65 2c 22 64 79 6e 61 6d 69 63 22 3a 74 72 75 65 2c 22 64 65 66 61 75 6c 74 5f 61 6e 61 6c 79 7a 65 72 22 3a 22 22 7d 2c 22 74 79 70 65 5f 66 69 65 6c 64 22 3a 22 5f 74 79 70 65 22 2c 22 64 65 66 61 75 6c 74 5f 74 79 70 65 22 3a 22 5f 64 65 66 61 75 6c 74 22 2c 22 64 65 66 61 75 6c 74 5f 61 6e 61 6c 79 7a 65 72 22 3a 22 73 74 61 6e 64 61 72 64 22 2c 22 64 65 66 61 75 6c 74 5f 64 61 74 65 74 69 6d 65 5f 70 61 72 73 65 72 22 3a 22 64 61 74 65 54 69 6d 65 4f 70 74 69 6f 6e 61 6c 22 2c 22 64 65 66 61 75 6c 74 5f 66 69 65 6c 64 22 3a 22 5f 61 6c 6c 22 2c 22 62 79 74 65 5f 61 72 72 61 79 5f 63 6f 6e 76 65 72 74 65 72 22 3a 22 6a 73 6f 6e 22 2c 22 73 74 6f 72 65 5f 64 79 6e 61 6d 69 63 22 3a 74 72 75 65 2c 22 69 6e 64 65 78 5f 64 79 6e 61 6d 69 63 22 3a 74 72 75 65 2c 22 61 6e 61 6c 79 73 69 73 22 3a 7b 7d 7d

Document: key Field 0, Array Positions: [116 1 0 109 121 102 117 110 99 13823 101 121 116 0 0 109 121 102 117 110 99 13823 101 121 102 0 0 102 1 0 98 107 101 121], Type: t Value: MYFUNC^A<80><80><80><FC>^C^@^A^@^F^@^A<80><80><80><FC>^C^@^A^@^F^@name<FF>_all<FF>

The Dictionary Term and Document both look corrupted.

tmm1 commented May 25, 2016

Was unable to build latest blevex against 3.11.2:

Undefined symbols for architecture x86_64:
  "_rocksdb_writebatch_deletev", referenced from:
      _blevex_rocksdb_execute_direct_batch in batchex.cgo2.o
  "_rocksdb_writebatch_mergev", referenced from:
      _blevex_rocksdb_execute_direct_batch in batchex.cgo2.o
  "_rocksdb_writebatch_putv", referenced from:
      _blevex_rocksdb_execute_direct_batch in batchex.cgo2.o
ld: symbol(s) not found for architecture x86_64

Looks like the batchex code is new since then, so I suspect that's where the bug is. The corrupted data in the index looks a lot like a buffer over-read, and there's a lot of pointer manipulation happening in the BatchEx implementation that might be related.

mschoch commented May 25, 2016

Yeah, you're right, the first round wasn't actually using RocksDB. Here it is done correctly, and it still seems to work:

$ DYLD_LIBRARY_PATH=/Users/mschoch/Documents/research/rocksdb-rocksdb-3.11.2/ bleve_create -store rocksdb -index /tmp/rocksdb.bleve
2016/05/24 20:42:51 Created bleve index at: /tmp/rocksdb.bleve
$ DYLD_LIBRARY_PATH=/Users/mschoch/Documents/research/rocksdb-rocksdb-3.11.2/ bleve_index -index /tmp/rocksdb.bleve /tmp/doc.json
2016/05/24 20:43:05 Indexing: doc
$ DYLD_LIBRARY_PATH=/Users/mschoch/Documents/research/rocksdb-rocksdb-3.11.2/ bleve_query -index /tmp/rocksdb.bleve/ -fields "name:MYFUNC"
2016/05/24 20:43:11 i see field: 0 - 'name'
2016/05/24 20:43:11 i see field: 1 - '_all'
2016/05/24 20:43:11 field cache: &{map[name:0 _all:1] 1 {{0 0} 0 0 0 0}}
2016/05/24 20:43:11 new term reader for field 0 term myfunc
2016/05/24 20:43:11 got back index row: Backindex DocId: `doc` Term Entries: [term:"myfunc" field:0  term:"myfunc" field:1 ], Stored Entries: [field:0 ]
2016/05/24 20:43:11 field 0 has name 'name'
2016/05/24 20:43:11 got back index row: Backindex DocId: `doc` Term Entries: [term:"myfunc" field:0  term:"myfunc" field:1 ], Stored Entries: [field:0 ]
2016/05/24 20:43:11 field 0 has name 'name'
2016/05/24 20:43:11 see field named: 'name'
1 matches, showing 1 through 1, took 888.488µs
    1. doc (0.306853)
    name
        MYFUNC

$ /Users/mschoch/Documents/research/rocksdb-rocksdb-3.11.2/ldb dump --db=/tmp/rocksdb.bleve/store/
bdoc ==> 


myfunc
Keys in range: 1

tmm1 commented May 25, 2016

I reverted back to 998e4a0 (before BatchEx was added) and things are working now!

mschoch commented May 25, 2016

Regarding batchex, it's possible. We actually had problems with a similar approach in another kv store and disabled it.

You should be able to borrow these same lines to disable the whole batchex codepath:

https://github.com/blevesearch/blevex/blob/master/forestdb/writer.go#L33-L40
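
For readers who don't want to follow the link, here is a hedged sketch of the general shape such a workaround could take: the extended batch constructor hands back an ordinary Go-allocated buffer plus a regular batch, so the specialized path is never used. Every type below (`KVBatch`, `KVBatchOptions`, `mapBatch`, `Writer`) is a local stand-in written for this example; the linked forestdb lines are the authoritative reference and may differ.

```go
package main

import "fmt"

// KVBatch and KVBatchOptions are local stand-ins, not the real bleve types.
type KVBatch interface {
	Set(key, val []byte)
}

type KVBatchOptions struct {
	TotalBytes int // total key/value bytes the caller plans to write
}

// mapBatch is a toy "regular" batch that copies what it is given.
type mapBatch struct {
	ops map[string][]byte
}

func (b *mapBatch) Set(key, val []byte) {
	b.ops[string(key)] = append([]byte(nil), val...)
}

// Writer is a stand-in for a KV store writer.
type Writer struct{}

// NewBatch returns the ordinary batch implementation.
func (w *Writer) NewBatch() KVBatch {
	return &mapBatch{ops: map[string][]byte{}}
}

// NewBatchEx skips any specialized code path: it allocates a plain Go buffer
// of the requested size and falls back to the ordinary batch.
func (w *Writer) NewBatchEx(opts KVBatchOptions) ([]byte, KVBatch, error) {
	return make([]byte, opts.TotalBytes), w.NewBatch(), nil
}

func main() {
	w := &Writer{}
	buf, batch, err := w.NewBatchEx(KVBatchOptions{TotalBytes: 16})
	if err != nil {
		panic(err)
	}
	// The caller carves keys out of buf and hands them to the regular batch.
	key := buf[:9]
	copy(key, "d\x00\x00MYFUNC")
	batch.Set(key, []byte{1})
	fmt.Printf("buffer bytes: %d, queued ops: %d\n", len(buf), len(batch.(*mapBatch).ops))
}
```

The trade-off is that whatever the extended path was meant to save is given up in exchange for going through the batch code that is already known to work.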

mschoch commented May 25, 2016

OK, glad to hear you got something working. I'll talk to @steveyen tomorrow and we can decide what to do. At a minimum, we will probably apply the workaround I described above to disable it, and maybe we'll rip it out completely.

tmm1 commented May 25, 2016

2190c1b is broken
ed7ccd7 is broken
a896a01 is broken
998e4a0 is working

tmm1 commented May 25, 2016

@mschoch Thanks for your help! I opened #23 with BatchEx disabled and confirmed everything is working as expected for me with that change.
