add badger db support #591
Comments
If someone from the community wants to add support for it, that would be great. Internally we're focusing on support for moss in the bleve 1.x line. And in bleve 2.x we'll most likely be moving to a custom format and not using a generic k/v store in the default index implementation.
ok.. well, good to know the roadmap.
Sure, let's leave it open.
I'm planning on giving it a shot. Anyone else working on it?
@akhenakh That would be awesome, please! :)
Hey @akhenakh, great work so far! I'm having a look at it and it's looking great. Was hoping I could help out. I'm just taking a look at Badger, and given that they don't have any plans to support snapshots (dgraph-io/badger#39), what's the plan here? I believe they talked a bit about dumping the database concurrently, but the database won't be frozen in time.
Yeah, snapshots are not going to happen soon, but I'm not sure how that would impact Bleve. Any clue, @bt?
Pretty much all of the search side of Bleve requires a snapshot to work correctly. The reason is that multiple iterators are created, and they have to be guaranteed to see the same underlying data. If they don't, you'll get inconsistent search results. A simple example would be a phrase search for "fat cat". If version 1 of a document has "fat man" and the new version of the document has "old cat", without snapshots it's possible that a search for the phrase "fat cat" is successful, even though neither the old nor the new version of the document ever contained that phrase. You might think that atomic batches prevent this, but without snapshots it's possible for the first iterator (looking for "fat") to see the old version of the document, and the second iterator (looking for "cat") to see the new version of the document.
thanks for the clarification @mschoch
Hey @mschoch, just as a clarification, to make sure I'm getting this correct: if I execute a search, and while the search is running the store is updated with a new document AND that document matches the search phrase, the search shouldn't see the new document, because the search should only be looking at the snapshot made at the time the search was first executed. Am I correct? In that case, even the workaround suggested in the linked issue above would not work, since if the iterator moves as new data is indexed, it would pick up the new data while the search runs. If that's true, then I guess the only way would be to implement snapshotting in Badger, if that's part of their plans.
(Author of Badger here) While we don't have any plans to support snapshots natively in Badger, this can be a very simple implementation above it. You can create a simple layer above Badger which appends the current timestamp (or index) to keys as a suffix. Then if you need a snapshot at a given time, you can append that timestamp to the search key and do an iteration from there. The first key you encounter would be the value just at or before the snapshot. This layer can have a very simple API, and would also provide others with snapshot functionality above Badger, which would be nice.
Hey @manishrjain, this sounds like a great idea! So this implementation should be outside of Badger and act as some sort of intermediary "bridge" to support snapshotting between Badger and Bleve?
Yeah, this should be outside of Badger. It can be a simple layer which just adds the timestamp or index to the keys being written, and then on a lookup does a single-key iteration. If you need some help, or some APIs in Badger, we'd be happy to provide them -- we just don't want this versioning logic to live within Badger, to keep things simple and performant. For example, for iteration we could probably provide an API which just gives you the first key on or after the search key -- this would avoid you having to create an iterator (and pay for the overhead).
Hey @manishrjain, so just before I start hacking away at a solution for this, I just wanted to confirm my algorithm (sorry, I'm not too familiar with the underlying technology behind KV-stores). Your proposed solution, as I understand it, is that I could build a layer on top of Badger which would append a timestamp to each key being written. So, given a scenario where the same key is written with different values at several points in time, if a snapshot is taken at one of those times, a lookup through this layer should return the value that was written at or before that time and ignore anything written after it. This is how I'm understanding the possible implementation of snapshotting into Badger. Please let me know if this is what you're thinking or if I'm just completely out of my mind 😄
Close, but more complex than needed. You just want to append a padded timestamp to the key as a suffix (not a prefix). Then, when you need a snapshot for t=4, you can do a reverse iteration starting from the search key with the t=4 suffix, to get the first key which is lower than or equal to it. Note that binary.BigEndian gives you an encoding where byte-wise key comparison matches the numeric ordering of the timestamps. In fact, we could add an API in Badger for this. This works pretty well. The only annoying part is the need for doing reverse iteration. Forward iteration works nicer and is easier to understand, but would require the timestamp to be encoded in a way where the latest timestamp generates the smallest byte slice. Not sure if there's already a way to do that nicely. Irrespective, the above would give you snapshots.
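For concreteness, here is a rough sketch of such a layer in Go (assuming a reasonably recent Badger release; the package and helper names are made up and are not Badger or Bleve APIs). Writes go to `key + big-endian timestamp`, and a snapshot read is a single reverse seek:

```go
package kvsnap

import (
	"encoding/binary"

	"github.com/dgraph-io/badger"
)

// versionedKey appends an 8-byte big-endian timestamp, so byte-wise key
// ordering matches the numeric ordering of the timestamps. A real layer
// would also need to escape or terminate user keys so one key can never be
// a prefix of another.
func versionedKey(key []byte, ts uint64) []byte {
	out := make([]byte, len(key)+8)
	copy(out, key)
	binary.BigEndian.PutUint64(out[len(key):], ts)
	return out
}

// setAt stores the value of key as written at timestamp ts.
func setAt(txn *badger.Txn, key, val []byte, ts uint64) error {
	return txn.Set(versionedKey(key, ts), val)
}

// getAt returns the newest value of key written at or before snapshot time ts.
func getAt(txn *badger.Txn, key []byte, ts uint64) ([]byte, error) {
	opts := badger.DefaultIteratorOptions
	opts.Reverse = true
	it := txn.NewIterator(opts)
	defer it.Close()

	// In reverse mode, Seek lands on the largest key <= the seek key,
	// i.e. the version written just at or before ts.
	it.Seek(versionedKey(key, ts))
	if it.ValidForPrefix(key) {
		return it.Item().ValueCopy(nil)
	}
	return nil, badger.ErrKeyNotFound
}
```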
Good to see I'm on the right track! So if I was to implement the above, wouldn't there be some performance degradation, given that we'll be storing the value multiple times? And how would you propose purging old snapshots?
Storing the value multiple times won't really degrade performance per se, but it will cause your value log and LSM tree to be bigger in size. I think what you could do is iterate over the LSM tree periodically and delete the keys below a certain time threshold. Badger has a key-only iteration that you can use for this purpose. If the LSM tree is in memory, as we recommend, it is very cheap to iterate over (in the benchmarks, it's blazing fast!).
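A hedged sketch of that purge, continuing the hypothetical kvsnap layer above (the function name is an assumption): a key-only pass collects stale versions, then a second transaction deletes them. A real implementation would keep at least the newest version of each key and split the deletes to avoid `ErrTxnTooBig`:

```go
// purgeBefore deletes versioned keys whose timestamp suffix is below cutoff.
func purgeBefore(db *badger.DB, cutoff uint64) error {
	var stale [][]byte

	// Key-only pass over the LSM tree: cheap, no value log reads.
	err := db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.PrefetchValues = false
		it := txn.NewIterator(opts)
		defer it.Close()
		for it.Rewind(); it.Valid(); it.Next() {
			k := append([]byte(nil), it.Item().Key()...)
			if len(k) >= 8 && binary.BigEndian.Uint64(k[len(k)-8:]) < cutoff {
				stale = append(stale, k)
			}
		}
		return nil
	})
	if err != nil {
		return err
	}

	// Second pass: delete the stale versions.
	return db.Update(func(txn *badger.Txn) error {
		for _, k := range stale {
			if err := txn.Delete(k); err != nil {
				return err
			}
		}
		return nil
	})
}
```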
To get a forward timestamp, I'm using a reverse timestamp: `math.MaxInt64 - t.UnixNano()`. Would this work in this case?
Yeah, that should work. Put the result in a uint64, and do big endian encoding to generate the suffix.
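Continuing the earlier kvsnap sketch, the inverted-timestamp suffix could look like this (helper name assumed; it additionally needs the `math` and `time` imports):

```go
// reverseTsSuffix inverts the timestamp so the newest write produces the
// smallest suffix and therefore sorts first under forward iteration.
func reverseTsSuffix(t time.Time) []byte {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], uint64(math.MaxInt64-t.UnixNano()))
	return buf[:]
}
```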
@bt are you working on this over my branch, or do you plan to do it as a different implementation? I'm asking because I don't want us to do the work twice :)
Bleve also makes heavy use of iterators, so you'll also need an iterator which only sees the correct values for the snapshot, silently skipping over values with a too-old or too-new timestamp. You wouldn't want to pull the values associated with those if you can avoid it, so you might want to use a key-only iterator again, and then do get operations for the keys you've determined are the ones you care about.
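A rough sketch of such a snapshot-aware, key-only scan over the versioned layout sketched earlier (names are assumptions, not an existing API): it surfaces, for each user key, only the newest version written at or before the snapshot timestamp, and callers can then fetch just those keys:

```go
// scanAt visits, for each user key, the versioned key of the newest value
// written at or before snapshotTs, silently skipping versions that are too
// new. Callers can then txn.Get just the keys they care about.
// (Continuing the kvsnap sketch; also needs the "bytes" import.)
func scanAt(txn *badger.Txn, snapshotTs uint64, visit func(userKey, versionedKey []byte) error) error {
	opts := badger.DefaultIteratorOptions
	opts.PrefetchValues = false // key-only iteration
	it := txn.NewIterator(opts)
	defer it.Close()

	var curUser, candidate []byte
	flush := func() error {
		if candidate == nil {
			return nil
		}
		err := visit(curUser, candidate)
		candidate = nil
		return err
	}

	for it.Rewind(); it.Valid(); it.Next() {
		k := append([]byte(nil), it.Item().Key()...)
		if len(k) < 8 {
			continue
		}
		user, ts := k[:len(k)-8], binary.BigEndian.Uint64(k[len(k)-8:])
		if !bytes.Equal(user, curUser) {
			if err := flush(); err != nil { // finished the previous user key
				return err
			}
			curUser = user
		}
		if ts <= snapshotTs {
			candidate = k // newest visible version seen so far for curUser
		}
	}
	return flush()
}
```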
Hey @akhenakh, I haven't started coding this yet as I'm still trying to get my head around how it would all work together, so you're free to start us off and I'll make some PRs as I go through the code and understand how you structure it :) Thanks!
Hey @akhenakh, did you make much progress? I didn't manage to find your fork. Anything I can help with? 👍
Hey @bt, sorry I missed your message, here is my WIP: https://github.com/akhenakh/bleve/commits/badger. I haven't worked on it since we talked. I had some chats with @manishrjain on the Badger Slack, and he provided some more details on a possible implementation.
Thanks for the reply @manishrjain. The current issue right now is the fact that the implementation is passing around the same Txn instance, and so in the test, when it tries to create a second iterator, it causes a panic. I'll try to do a refactor tomorrow to see if I can get it to work, but I've already tried a lot of things today and none seem to work (at least without changing the test function, of course).
@manishrjain is there a way I can get a snapshot of the BadgerDB at a certain ts (without using ManagedDB)? I noticed you mentioned that using ManagedDB is literally like playing with fire; I'm trying to abstract as much of that out so that it's handled by BadgerDB, but it seems that I might need to use ManagedDB to implement Badger... I know that creating a new transaction will also snapshot at that time, but given the way that Bleve's KVStore works, it means I will have to manage this single Txn, and that seems pretty bad considering Badger emphasises concurrent Txn use...
If it is just a test which is failing, better to modify the test, I think. You could use ManagedDB, but again, there's some understanding that'd be required. For read-only transactions, multiple transactions get assigned the same timestamp. You could make use of that fact in the default DB implementation. They'd get a new timestamp if there are update transactions in between. There's no way you can specify the timestamp to use unless you're using ManagedDB.
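For reference, a hedged sketch of what managed mode looks like; this follows the newer Badger v2 API (where `DefaultOptions` takes a path), the timestamps are arbitrary, and in this mode the application is entirely responsible for choosing them:

```go
package main

import (
	"log"

	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	// Managed mode: the application supplies read and commit timestamps.
	db, err := badger.OpenManaged(badger.DefaultOptions("/tmp/badger-managed"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Write a value at commit timestamp 10.
	wtxn := db.NewTransactionAt(10, true)
	defer wtxn.Discard()
	if err := wtxn.Set([]byte("k"), []byte("v")); err != nil {
		log.Fatal(err)
	}
	if err := wtxn.CommitAt(10, nil); err != nil {
		log.Fatal(err)
	}

	// Read with a snapshot pinned at timestamp 10; later writes are invisible.
	rtxn := db.NewTransactionAt(10, false)
	defer rtxn.Discard()
	item, err := rtxn.Get([]byte("k"))
	if err != nil {
		log.Fatal(err)
	}
	val, _ := item.ValueCopy(nil)
	log.Printf("k = %s", val)
}
```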
hey @bt. Where's the code? I can take a crack at it. There are a ton of examples of using the ManagedDB transactions and handling the time issue out there in GitHub land: https://github.com/search?o=desc&p=3&q=badger+ManagedDB&s=indexed&type=Code
- Support multiple read-only iterators, as per request from blevesearch/bleve#591.
- Disallow the public APIs from accepting keys prefixed with `!badger!`, fixing dgraph-io/badger#582.
- Disallow closing iterator twice, fixing dgraph-io/badger#588.
@manishrjain thanks !!
@bt The changes you wanted for Badger have happened
Hey @manishrjain, thanks for making the change :) I've pretty much given up on the integration, mainly because BadgerDB seems to take so much RAM that it's not practical for my use-case (though I've isolated the problem to actually be due to using upsidedown). So I've stopped using BadgerDB with Bleve for now and will probably look at it again later.
There are no plans for scorch to ever use a KV store. The KV serialization requirements are part of the reason that the upsidedown approach doesn't perform as well.
@mschoch oooh okay -- so in that case, are there any plans to improve upsidedown? From my benchmarking, that's where the RAM usage comes from.
Is it Badger taking up RAM, or the cost of serialization which is causing the RAM usage? If it is Badger, I'd like to identify the cause -- maybe share a heap profile or something.
@bt Yeah, I knew about scorch but forgot about it being upsidedown-index only :) So just use scorch and move forward? Why do you need the WAL? EDIT: you need the WAL so that you can use Raft for HA?
From my perspective, the ability to plug-in any key/value store was a cool (possibly unique) idea we explored. It turns out to not be a very good choice if you want bleve to perform well. I do not believe you can plug-in a better/faster K/V store to fix upsidedown's performance. Working with bleve users over this same period of time, I've realized most users simply want a correct and fast full-text search index. They don't really care about the underlying K/V store. That said, I know there exists a group (probably small) that really liked the pluggable K/V store aspect of bleve. With that in mind, we have no immediate plans to remove it.
If you could articulate more of these benefits, we can have a discussion about which aspects of scorch relate to this, and whether there are analogues or missing features.
@manishrjain Unfortunately I've long since moved on from the implementation I had for Badger, and so reverting my Git repo now would be quite hard. The most I have is the PDF visualisation of the memory which I've attached (https://github.com/blevesearch/bleve/files/2424157/file.pdf), not sure if it'll help. I think a lot of it comes from the amount of data that needs to be serialised when using upsidedown.
@gedw99 I'm using a WAL to ensure that data received gets indexed even in the case where the application crashes. I serialise the data coming in and then replay it back on application start.
So one of the features (mainly the more important one to me now) is to ensure durability by implementing a WAL. I've made a very simple one (that's really just factored for my own use-case), but now my issue is more around the fact that I'm not sure what to use as the reference point -- that is, how do I know that my current index has reached the end of the WAL? I'm about to implement that now, and my plan is to use the CurRootEpoch as reported by the Stats to give something like a batchNumber to my index batches. Then, I plan to register an event callback to watch for an EventKindPersisterProgress, which I will fork Bleve to include the persisted epoch in. Not entirely sure if this will work but I *THINK* it will 😂 Additionally, the next thing on my list is to implement high availability and sharding, and so I believe I would need snapshotting to do this. I haven't given too much thought to that yet though, so I'm unsure of the way forward for that specific case. I have been trying to follow cbft and cbgt but some parts are too advanced for me 😦
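For illustration, a very rough sketch of the kind of WAL described above (not the actual implementation): length-prefixed records appended to a file, synced on write, and replayed into the index on startup:

```go
package wal

import (
	"encoding/binary"
	"io"
	"os"
)

type WAL struct {
	f *os.File
}

func Open(path string) (*WAL, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0644)
	if err != nil {
		return nil, err
	}
	return &WAL{f: f}, nil
}

// Append durably records one serialised entry (4-byte length prefix + body).
func (w *WAL) Append(entry []byte) error {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(entry)))
	if _, err := w.f.Write(append(hdr[:], entry...)); err != nil {
		return err
	}
	return w.f.Sync()
}

// Replay streams every entry back, e.g. to re-index after a crash.
func (w *WAL) Replay(apply func(entry []byte) error) error {
	if _, err := w.f.Seek(0, io.SeekStart); err != nil {
		return err
	}
	var hdr [4]byte
	for {
		if _, err := io.ReadFull(w.f, hdr[:]); err == io.EOF {
			return nil
		} else if err != nil {
			return err
		}
		entry := make([]byte, binary.BigEndian.Uint32(hdr[:]))
		if _, err := io.ReadFull(w.f, entry); err != nil {
			return err
		}
		if err := apply(entry); err != nil {
			return err
		}
	}
}
```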
Maybe this helps: https://github.com/mosuka/blast/blob/master/README.md
The WAL is an event store using a CQRS pattern, I presume?
I think you can do what you need with Blast or Bleve, but Blast is really nice in that it's not dependent on Couchbase and uses RocksDB for its DB. RocksDB is pretty stable with golang now, thanks to the cockroachdb guys pounding on it.
Maybe this helps??
Thanks, just had a look -- it seems to be completely dependent on Bleve: https://github.com/mosuka/blast/blob/master/index/index.go. Also, it indexes into BoltDB using upsidedown.
The design of bleve is such that it is durable by default. What does this mean? Specifically, unless you've overridden some configuration, when you call Index() or Batch(), once those methods return (without error) you are guaranteed that those documents have been indexed durably (not just in memory). Some applications can tolerate losing data (with some bounds), and applications like that can relax this behaviour through configuration. If you use that setting, when the method returns, the data is guaranteed to be searchable, but not guaranteed to be written to disk yet.

Typically a WAL is used when you want the data to be durable quickly. The question is: do you want that data to be searchable or not? If not, you can just put the WAL in front of Bleve. If you do want it to be searchable, then right now you need to wait for it to be indexed and persisted. It's not clear to me that Bleve should support data that is persisted but not yet indexed.

Regarding snapshotting, I'm still not clear what is missing. The IndexReader returned by the Reader() method on an Index is a stable snapshot. By combining this with your own sequence numbers (stored using the internal storage API) you can do scatter/gather across multiple shards at the same "point in time". I agree cbft and cbgt aren't the easiest code to follow.
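A hedged sketch of that internal-storage idea (the `wal_seq` key and the sequence value are made up for illustration): the sequence number is written in the same batch as the documents, so the persisted index always records how far it has caught up:

```go
package main

import (
	"encoding/binary"
	"log"

	"github.com/blevesearch/bleve"
)

func main() {
	idx, err := bleve.New("example.bleve", bleve.NewIndexMapping())
	if err != nil {
		log.Fatal(err)
	}
	defer idx.Close()

	// Index a document and record the WAL sequence number it came from,
	// atomically in the same batch.
	batch := idx.NewBatch()
	if err := batch.Index("doc1", map[string]interface{}{"body": "fat cat"}); err != nil {
		log.Fatal(err)
	}
	seq := make([]byte, 8)
	binary.BigEndian.PutUint64(seq, 42) // last WAL entry covered by this batch
	batch.SetInternal([]byte("wal_seq"), seq)

	if err := idx.Batch(batch); err != nil {
		log.Fatal(err)
	}

	// On restart, read it back to know where WAL replay should resume.
	last, err := idx.GetInternal([]byte("wal_seq"))
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("resume after sequence %d", binary.BigEndian.Uint64(last))
}
```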
Oh! I kept thinking of the reader at the K/V store level rather than the IndexReader. Perfect, seems like this will do what I need.
@bt you're right. It's not using RocksDB at all for the index, and so it's upsidedown. I think that using scorch is the best way forward too, since it's the supported path. I might have a crack at getting Blast to use scorch too.
Looks like someone was successful: https://github.com/alash3al/bbadger
How's the state of all this?
add moss and badger indexers, note: badger indexer has issues blevesearch/bleve/issues/591 change up apis
https://github.com/dgraph-io/badger
I am proposing that this DB be supported by bleve.
Badger outperforms RocksDB in their benchmarks, which is outstanding.
Here is a blog post with the benchmarks:
https://open.dgraph.io/post/badger/
My understanding is that they will be adding snapshotting also.