Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: skip persisting no side effect log entries in WAL #5912

Closed
mitake opened this issue Jul 11, 2016 · 7 comments
Closed

RFC: skip persisting no side effect log entries in WAL #5912

mitake opened this issue Jul 11, 2016 · 7 comments
Assignees

Comments

@mitake
Copy link
Contributor

mitake commented Jul 11, 2016

Hi,

I'm trying to implement an idea about skipping persisting no side effect log entries in WAL. Because of the persistence process, linearizable range requests of current etcd incurs disk write. I think persisting log entries that do not have side effect is not required because replaying the entries during recovery means nothing.

The most straightforward way for reducing the overhead is removing the entries from the log. However, it produces holes in the log and it goes against the design of Raft and make the recovery process very complicated. So I tried to implement the idea with this way:

  1. if saved entries have no side effect, queue them in memory
  2. during actual saving, persist the queued entries

It would produce batched write and I'm not sure it is enough light always. But having threshold would solve this problem.

The change affects the core part of Raft and is breaking some unit tests for now. However, as I show in the below, performance improvement is significant (more than 20x). So I'd like to hear opinions from other developers whether this change is worth to be tried seriously or not. I'm still not sure that the idea violates the design principles of Raft and etcd. The correct implementation seems to be possible but I cannot say there's no critical pitfalls yet (e.g. managing HardState will be tricky).

Current very rough draft is here: https://github.com/mitake/etcd/tree/skip-persist-in-wal

Below is the performance comparison:

  • master
bench with linearizable range
 10000 / 10000 Boooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00%2m7s
Summary:
  Total:        127.0084 secs.
  Slowest:      0.1915 secs.
  Fastest:      0.0065 secs.
  Average:      0.0127 secs.
  Stddev:       0.0064 secs.
  Requests/sec: 78.7349

Response time histogram:
  0.006 [1]     |
  0.025 [9479]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.043 [515]   |∎∎
  0.062 [0]     |
  0.080 [0]     |
  0.099 [0]     |
  0.118 [0]     |
  0.136 [0]     |
  0.155 [0]     |
  0.173 [2]     |
  0.192 [3]     |

Latency distribution:
  10% in 0.0080 secs.
  25% in 0.0087 secs.
  50% in 0.0114 secs.
  75% in 0.0145 secs.
  90% in 0.0195 secs.
  95% in 0.0254 secs.
  99% in 0.0299 secs.
  • with this change
bench with linearizable range
 10000 / 10000 Boooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00%5s

Summary:
  Total:        5.3315 secs.
  Slowest:      0.0262 secs.
  Fastest:      0.0004 secs.
  Average:      0.0005 secs.
  Stddev:       0.0008 secs.
  Requests/sec: 1875.6623

Response time histogram:
  0.000 [1]     |
  0.003 [9937]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.006 [3]     |
  0.008 [22]    |
  0.011 [12]    |
  0.013 [23]    |
  0.016 [1]     |
  0.018 [0]     |
  0.021 [0]     |
  0.024 [0]     |
  0.026 [1]     |

Latency distribution:
  10% in 0.0004 secs.
  25% in 0.0005 secs.
  50% in 0.0005 secs.
  75% in 0.0005 secs.
  90% in 0.0005 secs.
  95% in 0.0005 secs.
  99% in 0.0008 secs.
@heyitsanthony
Copy link
Contributor

@mitake my only concern with this patch is the hasSideEffects bit (judging by the comments, it seems you agree). I think the way to avoid coupling wal to etcdserver semantics is to have:

type WAL.Config struct {
    Dirpath string
    Snap walpb.Snapshot
    // IsIdempotent returns false if entry has side effects
    IsIdempotent(*raftpb.Entry) bool
}

and let etcdserver set IsIdempotent when configuring its WAL. If the field isn't set, WAL will default to func(*raftpb.Entry) bool { return false }.

@xiang90
Copy link
Contributor

xiang90 commented Jul 11, 2016

Actually for l-read, there is a better way to do it. We already implemented the l-read feature inside raft, so that it does not need to hit disk at all. I am not sure if we really need this for other kinds of readonly requests. Even if we need, we could use similar approach.

See #5193 for more details.

@mitake
Copy link
Contributor Author

mitake commented Jul 12, 2016

@heyitsanthony yes, lettinng etcd server judge about the side effect of entries would be safer. I agree.

@xiang90 IIUC, the recent l-read improvement is throughput oriented (Section 6.4 of the dissertation). Of course it will improve the read throughput but it would add latency to the request and in the worst case the latency is the heartbeat interval (100ms default and shouldn't be so short for avoiding frequent elections). On the other hand, skipping persistence does not introduce the additional latency cost.

Of course I'm not against the l-read improvement and understand my approach should be analyzed carefully for ensuring that it preserves the safety properties. And I will not be surprised even if the analysis concludes that the way isn't safe ;)

@xiang90 xiang90 self-assigned this Jul 19, 2016
@mitake
Copy link
Contributor Author

mitake commented Jul 28, 2016

Maybe we should ask about this idea on the raft-dev mailing list? Someone else might try this kind of idea in the past and have experience.

@xiang90
Copy link
Contributor

xiang90 commented Jul 28, 2016

@mitake Sure. Probably you want to ask this in the mailing list.

@mitake
Copy link
Contributor Author

mitake commented Jul 28, 2016

@xiang90 yes, I'll post to the mailing list and cc you people.

@xiang90
Copy link
Contributor

xiang90 commented Aug 24, 2016

closing this in favor of #6212.

@xiang90 xiang90 closed this as completed Aug 24, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants