
Request for (Very) High Level Status on RAMCloud v. RAFT #28

Closed
gshanemiller opened this issue Feb 23, 2019 · 1 comment

Comments


gshanemiller commented Feb 23, 2019

Hi,

I am trying to get up to speed with distributed systems, focusing on linearizable KV stores, for which RAMCloud is an exemplar and a particularly good match for my own application needs. However, I am a little lost on the following very high-level feature sets.

RAFT demonstrates a single-leader consensus system for finite state machines. So if one were to implement RAFT as part of an all-in-DRAM KV store, one high-level consequence is that the total storage capacity would be limited to a single server, and, just as importantly, all client requests would have to be directed to the current leader. The leader could conceivably become I/O or CPU bound even while otherwise operational. In section 6 of Ongaro's RAFT paper he touches on this and on linearizability, noting that LogCabin does not support read-only requests on followers. He does discuss stratagems for reading on replicas, but client writes on replicas are never entertained. So linearizability, if it reasonably includes writes, cannot be implemented in RAFT in a way that leaves the client free to interact with any server in the RAFT cluster.
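To make sure I have this funnel right, here is a minimal sketch of my mental model of leader-only request handling. All the names here (Role, Response, RaftKvServer) are mine, not LogCabin's actual API:

```cpp
#include <optional>
#include <string>

// Illustrative sketch only: a Raft-backed KV server that serves requests
// only while it is the leader; followers redirect clients to the leader.
enum class Role { Follower, Candidate, Leader };

struct Response {
    bool ok;
    std::optional<std::string> redirectTo;  // last known leader address
    std::optional<std::string> value;
};

class RaftKvServer {
    Role role = Role::Follower;
    std::string leaderHint = "node-1:5254";  // placeholder address
public:
    Response handle(const std::string& op, const std::string& key) {
        if (role != Role::Leader) {
            // Followers cannot serve reads or writes linearizably without
            // extra machinery, so they bounce the client to the leader.
            return {false, leaderHint, std::nullopt};
        }
        // Leader path: append op to the replicated log, wait for a quorum
        // to commit, apply it to the state machine, then reply (all elided).
        return {true, std::nullopt, apply(op, key)};
    }
private:
    std::string apply(const std::string&, const std::string&) { return "v"; }
};
```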

While RAMCloud uses RAFT via LogCabin as its consensus and log replication implementation, I gather that RAMCloud supersedes all these limitations. Indeed, from your paper:

"For a single coordinator to manage a large cluster without becoming a performance bottleneck, it must not be involved in high-frequency operations such as those that read and write RAMCloud objects. Each client library maintains a cache of configuration information for recently accessed tables, which allows it to identify the appropriate server for a read or write request without involving the coordinator."

And it is in this way that RAMCloud is able to claim linearizability, i.e. treating the reads and writes of a key's value so that the total storage of all servers appears like a CPU register on a single core: atomic changes with a recency guarantee. As just explained, each client request is directed to the single server owning that key, where the client operation (read, write, increment, CAS) is performed in a linearizable manner. Thus linearizability is in effect regardless of where the client request came from, because key partitioning sort-of, kind-of reduces each key's traffic to the single-leader RAFT case.
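To make the quoted mechanism concrete, here is a hedged sketch of how I picture the client library resolving a key to its owning server from its cache. TabletMap, fetchFromCoordinator, and the bucket scheme are my simplifications; real RAMCloud tablets cover key-hash ranges:

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Illustrative sketch: a per-table cache mapping hash buckets to server
// addresses, consulted on every read/write so the coordinator stays out
// of the high-frequency path.
struct TabletMap {
    std::vector<std::string> buckets;  // hash bucket -> owning server address
};

class ClientLib {
    std::unordered_map<std::string, TabletMap> cache;  // table -> tablet map
public:
    std::string serverFor(const std::string& table, const std::string& key) {
        auto it = cache.find(table);
        if (it == cache.end()) {
            // Cache miss: one coordinator round trip, then serve from cache.
            it = cache.emplace(table, fetchFromCoordinator(table)).first;
        }
        // If the cache is stale, the contacted server would reject the
        // request and the client would refresh this entry (not shown).
        uint64_t h = std::hash<std::string>{}(key);
        return it->second.buckets[h % it->second.buckets.size()];
    }
private:
    TabletMap fetchFromCoordinator(const std::string&) {
        return TabletMap{{"master-a:1100", "master-b:1100"}};  // placeholder
    }
};
```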

This clarification is important because it deals with potential confusion about consistency: there is no eventual consistency in RAMCloud ... it's linearizable and thus always consistent.
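Concretely, my picture of why per-key operations come out linearizable is that each key has exactly one owning master, which applies operations on that key one at a time. A sketch under those assumptions, eliding the durable replication and exactly-once (RIFL) machinery:

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <optional>
#include <string>

// Illustrative sketch: all operations on a key serialize at its single
// owning master, so reads always observe the latest committed write.
struct Object {
    std::string value;
    uint64_t version = 0;  // bumped on every successful write
};

class Master {
    std::map<std::string, Object> store;
    std::mutex mtx;  // stand-in for the master's serialized dispatch
public:
    std::optional<std::string> read(const std::string& key) {
        std::lock_guard<std::mutex> g(mtx);
        auto it = store.find(key);
        if (it == store.end()) return std::nullopt;
        return it->second.value;  // always the latest committed value
    }
    // CAS-style conditional write: succeeds only against the latest version.
    bool conditionalWrite(const std::string& key, const std::string& val,
                          uint64_t expectedVersion) {
        std::lock_guard<std::mutex> g(mtx);
        Object& obj = store[key];
        if (obj.version != expectedVersion) return false;  // stale caller
        obj.value = val;
        ++obj.version;
        return true;
    }
};
```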

With this in mind, a RAMCloud cluster could still become I/O or CPU bound if the keys are not well distributed and many client requests land on the same server, just as with the RAFT implementation.
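A toy illustration of that skew concern, assuming a simple hash partitioning of keys across servers: if 90% of requests hit one hot key, one server absorbs roughly 90% of the traffic no matter how many servers there are.

```cpp
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Counts simulated requests per server under a skewed workload where
// 90% of operations target a single hot key.
int main() {
    const int servers = 4;
    std::vector<long> load(servers, 0);
    for (int i = 0; i < 100000; ++i) {
        std::string key = (i % 10 != 0) ? std::string("hotkey")
                                        : "key" + std::to_string(i);
        load[std::hash<std::string>{}(key) % servers]++;
    }
    for (int s = 0; s < servers; ++s)
        std::printf("server %d: %ld requests\n", s, load[s]);
    return 0;
}
```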

Is this more or less how things work? Are there more conceptual gaps between RAFT and RAMCloud with respect to the guarantees the overall system promises and implements?

Regards


johnousterhout commented Mar 1, 2019 via email
