Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

document project and API #42

Merged
merged 4 commits into from
Jun 22, 2018
Merged

document project and API #42

merged 4 commits into from
Jun 22, 2018

Conversation

mykmelez
Copy link
Contributor

@mykmelez mykmelez commented Jun 19, 2018

This branch adds documentation for the project itself (via Markdown in README.md) as well as the Rust crate that it produces (as embedded comments in lib.rs), with the goal being to cleanly distinguish between those two kinds of docs. The only overlap between the two is the one-line description ("a simple, humane, typed Rust interface to LMDB"), which I've made consistent between the README, the comments, and the "description" field of the GitHub repo page.

For crate docs, I took inspiration from the rand, lazy_static, and serde docs while treating libc and winapi as counter-examples. The usage example itself is adapted from examples/simple-store.rs and is tested by cargo test, to ensure that it doesn't bitrot.

For project docs, I looked at a variety of different projects and mostly stuck to the basics, which we can flesh out further over time.

@ncalexan You've been thinking about Rust crate docs lately for Mentat, so perhaps you have some insight here.

@ncloudioj I'd appreciate your input as someone relatively new to rkv but experienced with LMDB (and the author of the simple-store.rs example!).

Finally, writing these docs raised a few implementation issues that I'll investigate/file separately:

  • it's unclear what happens if one Manager consumer calls get_or_create with Rkv::new and another calls it with Rkv::new_with_capacity
  • when retrieving or creating a new environment handle, the name of the method is get_or_create ("create" appears second), but when retrieving or creating a new store handle, the name of the method is create_or_open ("create" appears first)
  • it's unclear if/how it's possible to open multiple stores within a single transaction, which LMDB itself supports

@mykmelez mykmelez self-assigned this Jun 19, 2018
@mykmelez mykmelez changed the title WIP: project and API documentation document project and API Jun 20, 2018
@mykmelez mykmelez requested a review from ncalexan June 20, 2018 00:44

rkv is a usable Rust wrapper around LMDB.
```sh
cargo doc --open
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps cargo doc --no-deps --open?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nvm, just notice that there are some cross references on deps' doc in the crate document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my intent is to reference dependencies in the rkv docs, and to ensure that I'm referencing the correct versions of those dependencies (i.e. the versions that rkv is actually dependent on, rather than the latest versions available on crates.io), I reference the docs created by the cargo doc invocation that created the rkv docs.

//! - integers (`Value::I64`, `Value::U64`)
//! - floats (`Value::F64`)
//! - strings (`Value::Str`)
//! - blobs (`Value::Blob`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Json, Instant, and UUID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was to avoid displaying an exhaustive list and mention only a small set of common types. In 630743a, I've added this statement to be more explicit about that:

//! See [Value](value/enum.Value.html) for the complete list of supported types.

@ncloudioj
Copy link
Member

LGTM, r=nanj.

it's unclear what happens if one Manager consumer calls get_or_create with Rkv::new and another calls it with Rkv::new_with_capacity

Isn't each enviroment guarded by a RwLock? I'd assume those writers (Rkv::new and variants) would be serialized by this lock, and the last one wins. This also reminds me that LMDB allows you to open an environment in the read-only mode (MDB_RDONLY). We might not want to use this option because it could prevent the application from creating write transactions later on.

when retrieving or creating a new environment handle, the name of the method is get_or_create ("create" appears second), but when retrieving or creating a new store handle, the name of the method is create_or_open ("create" appears first)

To be more consistent, we shall rename those store methods.

it's unclear if/how it's possible to open multiple stores within a single transaction, which LMDB itself supports

Good catch! The current rkv::Store abstraction doesn't support that because it wraps the transaction into the Reader/Writer. To support multi-store reads/writes, it needs to take the transaction out from store, perhaps something like,

let txn = rkv.write();
let store_foo = rkv.create_or_open("foo");
let store_bar = rkv.create_or_open("bar");
store_foo.write(txn, "key0", "value0");
store_bar.write(txn, "key1", "value1");
txn.commit();

The downside is that users can't use the Reader/Writer any more. Another potential approach, which reuses the design of Writer/Reader, is introduce a MultiStore so that multiple stores could be get_or_created at the same time in a single transaction. Its read/write API will be slightly different,

let store_names = vec!["foo", "bar", "baz"];
let mega_store = rkv.create_or_open(&store_names);
let writer = mega_store.write();
writer.write("foo", "key0", "value0"); // it takes a store name here
writer.write("bar", "key1", "value1");
writer.commit();

Would love to hear your thoughts on this.

@mykmelez mykmelez mentioned this pull request Jun 20, 2018
@mykmelez
Copy link
Contributor Author

Isn't each enviroment guarded by a RwLock? I'd assume those writers (Rkv::new and variants) would be serialized by this lock, and the last one wins.

They're guarded by a RwLock, but this issue is about creation of the environments themselves.

There are currently three Rkv methods that create environments: new, with_capacity, and from_env. If one consumer opens an environment using Rkv::new (which uses the default capacity of 5 databases), and then a second consumer tries to open the same environment with Rkv::with_capacity(10), then the Manager's RwLock will serialize those calls, but what should the result of the second call be? If we return the cached environment, then we're returning an environment with a different capacity than the consumer requested.

At the moment, we actually avoid this problem, since Manager::get_or_create only accepts an Rkv::* callback that takes a single parameter, which means it only accepts Rkv::New, since Rkv::with_capacity and Rkv::from_env both take two parameters. But that actually raises a new issue: how do you use the Manager to protect an environment with a non-default configuration?

I've filed #43 on this issue.

@mykmelez
Copy link
Contributor Author

To be more consistent, we shall rename those store methods.

Good idea, I filed #44 on this (and #45 on a parameter ordering consistency issue I also noticed).

@mykmelez
Copy link
Contributor Author

The current rkv::Store abstraction doesn't support that because it wraps the transaction into the Reader/Writer. To support multi-store reads/writes, it needs to take the transaction out from store, perhaps something like,

Would love to hear your thoughts on this.

Good question! And a tough one. I filed #46 on it and put some thoughts there.

Copy link
Member

@ncalexan ncalexan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, and I learned a thing about GitHub badges!

@@ -1,24 +1,49 @@
[![Travis CI Build Status](https://travis-ci.org/mozilla/rkv.svg?branch=master)](https://travis-ci.org/mozilla/rkv)
[![Appveyor Build Status](https://ci.appveyor.com/api/projects/status/lk936u5y5bi6qafb/branch/master?svg=true)](https://ci.appveyor.com/project/mykmelez/rkv/branch/master)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be pointing to "project/mkymelez"? Can we get it under Mozilla's banner?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confusingly, the AppVeyor project at https://ci.appveyor.com/project/mykmelez/rkv builds the mozilla/rkv repo, while another project I created at https://ci.appveyor.com/project/mykmelez/rkv-iy5s5 builds the mykmelez/rkv repo.

That's because the "mykmelez" in the URL refers to the GitHub account name of the AppVeyor user who created the project, not the GitHub account/organization of the repository being built.

Interestingly, the github-owners mailing list is just discussing this issue, as @Mossop has an AppVeyor account with the username "Mozilla" that he can use to create projects with the URL prefix https://ci.appveyor.com/project/mozilla/. (I'm not actually sure how he obtained that username, as I can't find a way to change my own; it appears to be tied to my GitHub account name.)

In any case, that conversation seems to be leaning toward getting rid of that AppVeyor account in favor of individual Mozillians creating AppVeyor projects for their mozilla/* repos as desired, which is completely self-service and doesn't require central administration through Dave.

I'm open to counter-arguments, if you think that the URL issue will confound users enough to make it worth maintaining that central account.

Also note that it's technically possible to create a badge URL that refers to a GitHub repository rather than an AppVeyor project. Here's such a URL for mozilla/rkv:

https://ci.appveyor.com/api/projects/status/github/mozilla/rkv?svg=true&branch=master

The issue with that URL is that it'll stop working if someone else sets up a second project for the same repository, and literally any AppVeyor user can create a project for a public GitHub repository, whether or not they have any relationship to it. See appveyor/ci#1525 for the gory details.

And even with such a badge URL, the project URL—i.e. the URL you visit when you click the badge—will still reflect the account name of the user who created it. Thus, for the project I created to build mozilla/rkv, the best we could do is:

[![Appveyor Build Status](https://ci.appveyor.com/api/projects/status/github/mozilla/rkv?svg=true&branch=master)](https://ci.appveyor.com/project/mykmelez/rkv/branch/master)

Which doesn't seem worth it, given how easy it is for anyone to DOS the badge URL.

## Feature choices
```sh
cargo build
```

If you specify the `backtrace` feature, backtraces will be enabled in `failure`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not relevant to this review, but can you talk me through why you're doing this? @grigoryk just converted Mentat to failure and maybe we want to grow this feature as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When @rnewman added this note to the README (a07fb4e), he also added this comment to Cargo.toml:

# Get rid of failure's dependency on backtrace. Eventually
# backtrace will move into Rust core, but we don't need it here.

README.md Outdated

rkv relies on the latest [rustfmt](https://github.com/rust-lang-nursery/rustfmt) for code formatting, so please make sure your pull request passes the rustfmt before submitting it for review. See rustfmt's [quick start](https://github.com/rust-lang-nursery/rustfmt#quick-start) for installation details.

Please also observe Mozilla's [Community Participation Guidelines](https://www.mozilla.org/en-US/about/governance/policies/participation/) while contributing to this project.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "We follow Mozilla's ...".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds better, fixed in f74b3aa.

Note that https://help.github.com/articles/adding-a-code-of-conduct-to-your-project/ recommends a CODE_OF_CONDUCT.md file, and at least some adopters of the Contributor Covenant emphasize their code with a top-level section in their README, such as failure:

https://github.com/rust-lang-nursery/failure#code-of-conduct

We might consider doing that as well. I'll confer with the experts!

@mykmelez mykmelez merged commit 12e5a10 into mozilla:master Jun 22, 2018
@mykmelez mykmelez deleted the document-rkv branch June 22, 2018 01:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants