13642 keepstore librados backend #71

Open: wants to merge 5 commits into base: main

Conversation

jrandall
Contributor

@jrandall jrandall commented Jul 6, 2018

Implements a librados (ceph) backend for keepstore.

Uses the go-ceph (https://github.com/ceph/go-ceph) rados bindings to the librados library.

Implements RadosVolume and TestableRadosVolume, and includes mock code for testing without access to an actual ceph cluster (although go-ceph is required even for the mock tests because some go-ceph structs are used directly by the mocks).

Runs the standard set of generic tests and some additional tests on context cancellation.

In addition to the mock implementation, a real implementation can be used for the tests. The context cancellation tests are skipped in that case, because the required races cannot be guaranteed when running against a real-world system. The real backend is selected in much the same way as with the Azure tests: pass a test argument (-test.rados-pool-volume <poolname>) and set the other necessary -rados-* connection parameters as normal.
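A minimal sketch of how a test flag like this can switch between the mock and a real pool, assuming the standard library `flag` package. Only the flag name `-test.rados-pool-volume` comes from this PR; `chooseBackend` and the pool name `keeptest` are hypothetical and used for illustration only.

```go
package main

import (
	"flag"
	"fmt"
)

// chooseBackend reports which backend the tests would use for the given
// arguments: the mock when no pool is named, otherwise the named real pool.
func chooseBackend(args []string) (string, error) {
	fs := flag.NewFlagSet("keepstore-test", flag.ContinueOnError)
	pool := fs.String("test.rados-pool-volume", "",
		"run tests against this real rados pool instead of the mock")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if *pool == "" {
		return "mock", nil
	}
	return "real:" + *pool, nil
}

func main() {
	for _, args := range [][]string{nil, {"-test.rados-pool-volume", "keeptest"}} {
		backend, err := chooseBackend(args)
		fmt.Println(backend, err)
	}
}
```

A real test file would more likely register the flag on the global flag set so that `go test` parses it alongside its own flags; the separate `FlagSet` here just keeps the sketch self-contained.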

A helper script (run-rados-test-with-docker-ceph-demo.sh) is provided that spins up a "real" ceph cluster running in a ceph/demo docker container (https://hub.docker.com/r/ceph/demo/) and runs the keepstore rados tests against it. It should work on ubuntu and debian, but I haven't tested it on any other systems. Also note that Ceph OSDs (including those in ceph/demo) are somewhat finicky about the capabilities of the underlying filesystem, so it may not work on all docker volume backends (see http://docs.ceph.com/docs/jewel/rados/configuration/filesystem-recommendations/#filesystem-background-info).

In my testing, this implementation passes all tests using the mocks, using the ceph/demo docker container, and also using an actual production ceph cluster on real hardware.

This PR also includes one small change to the generic tests: the testStatus test in volume_generic_test.go now makes a PutRaw call before calling Status and asserting that BytesUsed is non-zero. My argument here is that it is not reasonable to require the backend to report non-zero bytes used if there are no objects in the backend store!
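To make the reasoning concrete, here is a small sketch with a hypothetical in-memory volume. `memVolume` and `VolumeStatus` are stand-ins written for this comment, not the keepstore types. An empty store reports zero bytes used, so the non-zero assertion only makes sense after something has been written.

```go
package main

import "fmt"

// VolumeStatus is a hypothetical stand-in for the status a volume reports.
type VolumeStatus struct {
	BytesUsed uint64
}

// memVolume is a hypothetical in-memory backend used only for illustration.
type memVolume struct {
	blocks map[string][]byte
}

func newMemVolume() *memVolume {
	return &memVolume{blocks: map[string][]byte{}}
}

// PutRaw stores a copy of the block under the given locator.
func (v *memVolume) PutRaw(loc string, data []byte) {
	v.blocks[loc] = append([]byte(nil), data...)
}

// Status sums the sizes of all stored blocks.
func (v *memVolume) Status() VolumeStatus {
	var used uint64
	for _, b := range v.blocks {
		used += uint64(len(b))
	}
	return VolumeStatus{BytesUsed: used}
}

func main() {
	v := newMemVolume()
	fmt.Println(v.Status().BytesUsed) // nothing stored yet
	v.PutRaw("acbd18db4cc2f85cedef654fccc4a4d8", []byte("foo"))
	fmt.Println(v.Status().BytesUsed) // one 3-byte block stored
}
```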

Arvados-DCO-1.1-Signed-off-by: Joshua C. Randall <jcrandall@alum.mit.edu>
@jrandall jrandall force-pushed the 13642-keepstore-librados-backend-pr branch from a42ccc4 to 57262cf Compare July 7, 2018 00:03
@tetron
Member

tetron commented Jul 13, 2018

@jrandall this is awesome! Aside from the regular code review, we need to figure out what our process is for integrating features we don't use ourselves. Looks like there are some tests, so that's good.

I'm curious, do you run a few keepstore servers and have them talk to the Ceph nodes, or run a keepstore server on every compute node?

@jrandall
Contributor Author

We generally run a keepstore on every compute node (and also some dedicated ones for loading data in and out from external sites via keep-proxy/keep-web).

@tomclegg tomclegg self-assigned this Sep 5, 2018
@tomclegg
Member

tomclegg commented Sep 6, 2018

(see further discussion at https://dev.arvados.org/issues/13642)
