Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for opening a column family with options #136

Merged

Conversation

garyttierney
Copy link
Contributor

I've hit an issue when I create a column family with a custom compare function and then try to
re-open the same database. Since I can't pass in options to open_cf, my existing column familes
are opened with the default options and I get an error.

This adds a new DB::open_cf_descriptors method that allows passing in Options for
each column family, and a new ColumnFamilyDescriptor type was added to contain
the congfiguration for each column family.

Adds a new `DB::open_cf_descriptors` method that allows passing in Options for
each column family.  A new `ColumnFamilyDescriptor` type was added to contain
the congfiguration for each column family. This fixes an issue where a column
family was created with non-default options, and then fails to re-open due to a
config mismatch.
@vmx
Copy link
Contributor

vmx commented Aug 28, 2017

This change is similar to #130. I might even like this one better. Though I would make it a breaking change and change open_cf_descriptors() to just open_cf().

@garyttierney
Copy link
Contributor Author

Apologies @vmx, I completely missed your open pull request or I'd have pulled that instead! I'd be happy to update this to be a breaking change if you'd rather go with this PR over your own :).

@vmx
Copy link
Contributor

vmx commented Aug 29, 2017

It would be great to get feedback from others, especially @spacejam on which approach he prefers and if he agrees with me that this should be a breaking change or not.

@spacejam
Copy link
Member

spacejam commented Sep 2, 2017

I like coupling column family identifiers to their options, reducing the chance of messing up the order by users. I see it being relatively inexpensive in maintenance costs to keep open_cf_descriptors as well as open_cf and it's nice that this doesn't break existing code. Maybe if more people feel strongly about there being one, we can merge this as-is for now, and open an issue for collapsing them into a single endpoint in the next breaking release? I'd like to get more people's support for that kind of breaking change I think, since it's frustrating to have to go back and change existing code for users.


assert!(DB::destroy(&Options::default(), path).is_ok());
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a test exercising the new open_cf_descriptors method?

@@ -63,6 +63,11 @@ pub struct DB {
path: PathBuf,
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a short comment describing this public struct?

src/db.rs Outdated
pub fn open_cf<P: AsRef<Path>>(opts: &Options, path: P, cfs: &[&str]) -> Result<DB, Error> {
let cfs_v = cfs.to_vec().iter().map(|name| ColumnFamilyDescriptor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to use ColumnFamilyDescriptor::new(name, Options::defaut()) instead?

@vmx
Copy link
Contributor

vmx commented Sep 2, 2017

@spacejam My point about making this one the open_cf() is that I don't really see a use case for the original function. As soon as you need any options you can't use it. And at least I use column families especially because i needed different options.

So I'm in favour for keeping the API as small as possible, especially since we are before 1.0.

@garyttierney
Copy link
Contributor Author

I've made the requested changes and added some crate level documentation on ColumnFamilyDescriptors.

With regards to folding open_cf_descriptors into open_cf, an automatic conversion from &str to a ColumnFamilyDescriptor with default options could be done to keep backwards compatibility I think:

pub fn open_cf<P, C>(opts: &Options, path: P, cfs: &[C]) -> Result<DB, Error>
       where P: AsRef<Path>, C: Into<ColumnFamilyDescriptor> {}

impl From<&str> for ColumnFamilyDescriptor {
    fn from(name: &str) -> Self {
        ColumnFamilyDescriptor::new(name, Options::default())
    }
}

Though, I think in most cases when you want to open a ColumnFamily you also want to define a set of per-cf options.

@spacejam
Copy link
Member

spacejam commented Sep 4, 2017

@garyttierney I like the idea of using an automatic conversion to support the new use case while preserving backwards compatibility, and reduce API bloat. The implementation may require a couple more lifetimes, but it's feasible. @vmx what do you think about this approach?

@vmx
Copy link
Contributor

vmx commented Sep 4, 2017

That sounds like a good solution, I wasn't even aware that things like that are possible with Rust.

@spacejam
Copy link
Member

spacejam commented Sep 5, 2017

Awesome! That feels like a good way forward to me at this point. Great idea, @garyttierney ! Would you mind implementing open_cf such that it accepts an Into<ColumnFamilyDescriptor>? And removing open_cf_descriptors in its favor.

When playing around with the idea I made this quick PoC of migrating a &[&str to an Into<_>, which is a bit ugly, but in case it's helpful here it is! https://play.rust-lang.org/?gist=401075448daa3d550e4d29d726788b4b&version=stable

@garyttierney
Copy link
Contributor Author

@spacejam yep, happy to go ahead and implement this :-) I'll get some changes pushed later on tonight for review.

@vmx
Copy link
Contributor

vmx commented Sep 18, 2017

@garyttierney Can I help somehow with this pull request? If yes, tell me which part is missing and I'll be working on it.

@garyttierney
Copy link
Contributor Author

garyttierney commented Sep 18, 2017

@vmx sorry for the delay. I got a bit stuck keeping backwards compatibility for the open_cfs method!

Currently open_cf_descriptors takes a Vec<ColumnFamilyDescriptor> and would be able to take a Vec<Into<ColumnFamilyDescriptor>>. However, this method signature isn't compatible with existing calls to open_cf which accepts &[&str].

The solution there might be to accept &[C] where C: Into<ColumnFamilyDescriptor> but I've had some issues with this also:

pub fn open_cf<P, C>(opts: &Options, path: P, cfs: &[C]) -> Result<DB, Error>
       where P: AsRef<Path>, C: Into<ColumnFamilyDescriptor> 
{
    for cf in cfs.into_iter() {  // cf = &ColumnFamilyDescriptor, need ColumnFamilyDescriptor
    }
}

I'm relatively new to Rust, so if you have any ideas there I'd appreciate the help. Thanks!

@vmx
Copy link
Contributor

vmx commented Sep 18, 2017

I'm still trying to get this working and I think I'll get this working. Though I really wonder if it's worth it.

I still don't see the point of the original API, I think if you use Column Families, the usual case is to have custom options. I would make it a breaking change with a note in the README saying:

If you want to use column families like prior to this release you can do so with converting the Column Family names into ColumnFamilyDescriptors via:
your_cfs_as_str.to_vec().iter().map(|name| ColumnFamilyDescriptor::new(*name, Options::default())).collect();

Or opening a bug with the exact error message and then posting this line so you can find it if you hit the problem.

@vmx
Copy link
Contributor

vmx commented Sep 18, 2017

Here's the code I came up with. The problem is that it needs to clone the elements of the colum family slice even if it is of type ColumFamilyDescriptor already.

diff --git a/src/db.rs b/src/db.rs
index 6660863..f9c3995 100644
--- a/src/db.rs
+++ b/src/db.rs
@@ -569,6 +569,12 @@ impl ColumnFamilyDescriptor {
     }
 }
 
+impl<'a> From<&'a str> for ColumnFamilyDescriptor {
+    fn from(name: &'a str) -> ColumnFamilyDescriptor {
+        ColumnFamilyDescriptor::new(name, Options::default())
+    }
+}
+
 impl DB {
     /// Open a database with default options.
     pub fn open_default<P: AsRef<Path>>(path: P) -> Result<DB, Error> {
@@ -579,14 +585,16 @@ impl DB {
 
     /// Open the database with the specified options.
     pub fn open<P: AsRef<Path>>(opts: &Options, path: P) -> Result<DB, Error> {
-        DB::open_cf(opts, path, &[])
+        let empty: &[&str] = &[];
+        DB::open_cf(opts, path, empty)
     }
 
     /// Open a database with the given database options and column family names.
     ///
     /// Column families opened using this function will be created with default `Options`.
-    pub fn open_cf<P: AsRef<Path>>(opts: &Options, path: P, cfs: &[&str]) -> Result<DB, Error> {
-        let cfs_v = cfs.to_vec().iter().map(|name| ColumnFamilyDescriptor::new(*name, Options::default())).collect();
+    pub fn open_cf<P, C>(opts: &Options, path: P, cfs: &[C]) -> Result<DB, Error>
+            where P: AsRef<Path>, C: Clone + Into<ColumnFamilyDescriptor> {
+        let cfs_v = cfs.into_iter().map(|cf| cf.clone().into()).collect();
 
         DB::open_cf_descriptors(opts, path, cfs_v)
     }
diff --git a/src/lib.rs b/src/lib.rs
index 788e97a..07c947b 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -81,6 +81,7 @@ pub struct DB {
 /// A descriptor for a RocksDB column family.
 ///
 /// A description of the column family, containing the name and `Options`.
+#[derive(Clone)]
 pub struct ColumnFamilyDescriptor {
     name: String,
     options: Options,
@@ -165,6 +166,7 @@ pub struct BlockBasedOptions {
 ///    DB::open(&opts, path).unwrap()
 /// }
 /// ```
+#[derive(Clone)]
 pub struct Options {
     inner: *mut ffi::rocksdb_options_t,
 }

@vmx
Copy link
Contributor

vmx commented Sep 19, 2017

Although the above code compiles, it doesn't work (it segfaults), probably due to the cloning of the Options.

I think all boils down to that open_cf gets the strings as slice (kind of borrowed) and that open_cf_descriptor gets the vector of ColumFamilyDescriptors moved (which makes sense). So with keeping API compatibility you'd borrow the vector and then internally clone it. I think that's counter intuitive.

@vmx
Copy link
Contributor

vmx commented Sep 25, 2017

I've create a version of open_cf_descriptor which borrows the vector of ColumnFamilyDescriptor (https://github.com/vmx/rust-rocksdb/tree/feature/open-cfs-with-options). I haven't found a clean way to convert the &[&str] into a &Vec<ColumnFamilyDescriptor> while not doing any conversion if it's a &Vec<ColumnFamilyDescriptor>.

@vmx
Copy link
Contributor

vmx commented Sep 26, 2017

I've spent another few hours. This time I come to the conclusion that converting a reference to another one — in our case a &[&str]] into a &Vec<ColumnFamilyDescriptor> (or &[ColumnFamilyDescriptor]) — is not possible with the From trait as something needs to hold the actual ColumnFamilydescriptors. Anything created within the fn from() won't live long enough.

@vmx
Copy link
Contributor

vmx commented Oct 20, 2017

This pull request relates to the discussions at #145.

BusyJay pushed a commit to BusyJay/rust-rocksdb that referenced this pull request Nov 4, 2017
@spacejam spacejam merged commit 06a3927 into rust-rocksdb:master Feb 10, 2018
spacejam added a commit that referenced this pull request Feb 10, 2018
Open cfs with options - fixes merge conflict in #136
@spacejam
Copy link
Member

crate version 0.9.0 has been cut with these changes included!

rleungx pushed a commit to rleungx/rust-rocksdb that referenced this pull request Jun 17, 2020
Update rocksdb with the following changes:
```
cb7efe6d7 2020-01-08 zbk602423539@gmail.. Add oldest snapshot sequence property (#6228) (rust-rocksdb#141)
c7d775168 2020-01-07 zbk602423539@gmail.. statistics: make ticker and histogram extendible (rust-rocksdb#136)
ebec1bd8a 2020-01-07 bupt2013211450@gma.. add multibatch write into memtable (rust-rocksdb#131)
4dcfb8789 2020-01-07 zbk602423539@gmail.. fix invalid register for .seh_savexmm (rust-rocksdb#140)
```

Signed-off-by: Yi Wu <yiwu@pingcap.com>
rleungx pushed a commit to rleungx/rust-rocksdb that referenced this pull request Jun 23, 2020
Update rocksdb with the following changes:
```
cb7efe6d7 2020-01-08 zbk602423539@gmail.. Add oldest snapshot sequence property (#6228) (rust-rocksdb#141)
c7d775168 2020-01-07 zbk602423539@gmail.. statistics: make ticker and histogram extendible (rust-rocksdb#136)
ebec1bd8a 2020-01-07 bupt2013211450@gma.. add multibatch write into memtable (rust-rocksdb#131)
4dcfb8789 2020-01-07 zbk602423539@gmail.. fix invalid register for .seh_savexmm (rust-rocksdb#140)
```

Signed-off-by: Yi Wu <yiwu@pingcap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants