-
Notifications
You must be signed in to change notification settings - Fork 708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Question] Any chance bringing the forks together? #69
Comments
Yeah! ngaut's (being used for the tikv project) is the main one I would On Sep 29, 2016 07:49, "Mario Mueller" notifications@github.com wrote:
|
Hi, @xenji We would like to merge all of the changes to the repo. It's just there is a problem: |
I'm happy cutting a new major number for an api breaking new version On Sep 30, 2016 07:57, "goroutine" notifications@github.com wrote:
|
Seems like a plan. Good to have you all in the conversation! Should we revive this discussion and dive deeper when @spacejam is back from the Alps? I think this needs more than just a bit of talking, doesn't it? |
@spacejam If you stick to Semantic Versioning you don't even need to cut a new major version as long as you don't have a 1.0.0 release yet. |
Any updates on all this? |
Hey Everyone, I'd just like to say, I'm keen to help out with improving the general quality of this wrapper (such as writing API docs and improving the tests). I also need some more features of RocksDB for my project, which I'm happy to put some of my own time in to implementing. It looks like some of this work may have already been done in the other forks though. I'm not sure what I should do as my availability to help out will deminish after this week but any work I do now will probably conflict or duplicate with what has been done in the other forks. I'm happy to help out in any way I can to get the changes in the forks pushed upstream so let me know if you need an extra pair of hands. |
Thanks @ngaut Just been having a look. It appears the fork was made a while ago and both itself and upstream have had significant changes since then. This complicates things a little bit. I reckon our best course of action is:
Does this sound OK? |
SGTM. @kaedroho |
Here's my opinions. In general, I didn't notice a single change that had documentation. Some parts do not have tests and the tests that are there could be improved. Code cleanupThere has been some major refactoring and cleanups in the fork, but upstream has been heavily changed too. I don't think it's worth, and may not even be feasible to port these. If we must reproduce these cleanups, we should probably get them out of the way first so avoid merge conflicts in new changes. Static linkingI think it would be great to get this upstream. I had problems with jemalloc segfaulting under a dynamically linked RocksDB. The ethcore fork have implemented this as well. They appear to have done it differently and also have Windows support so I think it's worth looking into both implementations. IteratorsThere's been a lot of work around refactoring the iterator API and I disagree with some of the design decisions. I think we should discuss this part of the API before making any changes. OptionsMost of these changes follow a similar pattern by just exposing new functions in the There's one change that I disagree with though, https://github.com/ngaut/rust-rocksdb/pull/31. This adds an upper-bound to range iterators but it's not part of the standard RocksDB API. The PR also changes the read methods to consume the |
Thanks for your excellent work. @kaedroho About Options: Others sounds great to me. |
Thank you very much for stepping forward to help with this, @kaedroho! I can imagine the upper bound being a nice way to help with things like multiplexing higher level lexicographic ranges of sharded databases, and getting higher performance than filtering results after the fact. I'm in favor of keeping it. In general I'm in favor of making the merge path for @ngaut's changes as seamless as possible, as he is doing a ton of great work that I'd love to merge in over time. To that end, I'm generally in favor of keeping whatever API semantics his version has (maybe rethinking ownership of ReadOption though), breaking the ones here where necessary, adding proper documentation (I can help with this over time). I can commit a couple hours per week to this to help this effort, but it may look more like a whole day per month or something like that due to random traveling. Since you want to cram work into this week, I'll try to pay extra close attention to this to clear blockers on my end to your progress. |
Ahh, my mistake! It looks like a bit of dead code got through on that pull request (which led me to think this was custom). See: https://github.com/ngaut/rust-rocksdb/pull/31/files#diff-15fa9749208206c5a07bbe0cc9aa11abR45 This new attribute is the reason why the ownership rules had to change, but it doesn't look like it was ever used anywhere, so I think it can just be removed. I'll just port the |
No problem! This wrapper has been really useful to me so far, glad I can help make it better!
I agree, the TiKV team have been making some really useful changes. But I think it's a shame that both TiKV and Ethcore have decided that they must maintain their own forks in isolation of each other. Both projects have implemented static linking (a huge bit of work) when only one of them really needed to and new projects in the future will not benefit from either of their efforts! I hope this effort would be a step towards bringing all the teams together, so we can all work on the same codebase and benefit from each other's contributions. I've just submitted the first batch of work which pulls across most of the changes to the Next, I'm going to have a look at the static linking implementations. I'll probably write up an issue describing what I've learned and propose an implementation before starting work. |
Sorry for the delay (had to leave my desk for a while). Here's my quick analysis of how the two forks have implemented static linking. tl;dr The only notable difference between the two is how they both build RocksDB. I prefer the approach taken in ethcore because of it's cross-plafrom support and it doesn't do any downloads during compilation. Please let me know if you disagree with my observations/opinions. The approach I'd like to take is to copy the implementation from the ethcore fork, tweak it to fix any compatibility issues and make it use git-submodules instead of bundling RocksDB/snappy. Please let me know if that sounds OK to you. What they both do
What they do differently
My opinions: I prefer the way the ethcore fork works here. Performing extra downloads during a build feels very ugly to me, download links may break in the future and we can’t be sure the build server has an internet connection. I don’t like having full copies of dependencies bundled in the repo, I think git-submodules would be a good fit here instead. The ngaut fork builds with bz2, libz, lz4 and snappy whereas the ethcore fork only builds with snappy. This would obviously, improve compile time considerably but I wonder is snappy is sufficient for all use cases?
My opinions: Again, I prefer the approach the ethcore fork has taken here. Using a shell script would make cross-platform support very difficult. Different distributions of Linux may not have all the required tools installed to run it and there are many different shells to take into account. It will need to be completely reimplemented for Windows support. Ethcore keeps all of it’s build logic inside the build.rs file. This file can be built and run on any platform that Rust supports. Despite the name, the gcc-rs crate works well with Visual C++ on Windows (the ethcore fork has Windows support). The only downside I can think of with the ethcore approach is that every .cc file needs to be listed in build.rs. This list may need to be updated when RocksDB is updated. The linker should give us an error message when this list is incorrect so there’s little chance of mistakes. |
Actually we thought about using git submodule and gcc-rs at first. But it seems ugly to me to track all the .cc files again which was done in the Makefile already. And git-submodule is too slow for me to clone a new repository from the beginning. I think almost all the linux distribution and OS X have bash support. Windows can use PowerShell get the job done, though It's not implemented yet in the ngaut branch. If build server doesn't have a network connection, then it will be hard for it to build a rust project. Because the crates registry, crates source are needed to be updated (or initialized) via a connection to github.com and crates.io. Please note that, gcc-rs just wraps a gcc process, so it depends on gcc compiler. Rocksdb can't be compiled without any external tools (yet). |
Thanks for chiming in @BusyJay!
Ahh, good point. Having to pull down a 6000-commit history when you only need the state at one time is a bit heavy. This won't be a problem when rust-rocksdb is installed from crates.io though as the checked out submodules will be bundled in the crate at the point of publishing it.
I agree, but it's the lesser of two evils in my opinion
I'll elaborate on this a bit
|
Would you consider a PR that changed the FFI module to something like https://github.com/alexreg/rocksdb-sys/blob/master/src/lib.rs? This is up-to-date with RocksDB (v4.11.2), and stays faithful to the C API – no renamings of types, and arguably more complete tests (see https://github.com/alexreg/rocksdb-sys/blob/master/tests/c_test.rs in the same repo, since they've been ported from the C tests). I'd also recommend forking the FFI stuff into a separate Note that type aliases could then be given where no interop is needed. And for the enums, a simple numerical mapping could be made. |
@alexreg I've forked the FFI module into a separate -sys crate in my "implement static linking" PR (#80) submitted earlier today. Hopefully this will be looked at soon! Those FFI bindings are much more complete than what we currently have (implementing 274 library functions vs 116 in this repo), and the test suite is a massive bonus. I think pulling this in would be very helpful*. But I'd still like the common, everyday API to feel "rusty", following Rust's naming conventions. The FFI module could become a behind the scenes thing which follows RocksDB's naming conventions (and is public for those who want more control). The enums currently defined in FFI will probably have to be redefined somewhere else. * I'm not a maintainer though, @spacejam could give an official answer to your question |
@kaedroho How did you get librocksdb to link (statically) against libjemalloc by the way? Specifically, the version of libjemalloc.a that Rust uses. Edit: |
Peeps, that is how open source collaboration should work! Thank you, you are awesome! :D |
I'd like to continue to pull in changes over time from the various forks, so this is by no means over, but for now we've wrapped up a nice big chunk. Today 0.5.0 will be released, which includes the great work you've all done in this issue, and others! I've been pleasantly surprised by the amount of contribution that this issue has stirred up, and blown away by the amount of effort put in by @kaedroho and @alexreg ! Everyone who contributed, this project is in a much nicer place now after your efforts :) Thank you! |
Glad to help. Thanks likewise for maintaining this project so diligently. :-)
|
Hey there!
First, thanks for making this all happen, it helps me a lot in my daily work!
Secondly: Is there a chance, that you/we can get all forks on one table and see how this can be beneficial for the project? For example: You and also https://github.com/ngaut/rust-rocksdb seem to have made pretty good progress, but the branches are too diverged and I lack of knowledge to merge them my self.
So I try and shoutout to @ngaut, @BusyJay and @zhangjinpeng1987 to join the discussion if those two forks should be one again?
Hope that helps somehow ...
Cheers,
Mario
The text was updated successfully, but these errors were encountered: