Introduce file system and implement streaming encoding #91

Merged: 3 commits, Aug 20, 2021

@tabokie tabokie commented Aug 20, 2021

Signed-off-by: tabokie <xy.tao@outlook.com>

This PR rearranges internal data access around a standard reader/writer. External IO logic can then be embedded through the new `FileSystem` interface, which allows building a wrapper around that reader/writer.
This interface is not ideal, as it does not ensure that an implementation actually uses the raw IO object. More work is needed to factor out the code's dependence on `LogFd` and to make `FileSystem` a standalone abstraction.

This PR also implements streaming encoding. `LogBatch` now encodes user data immediately, instead of keeping a copy and encoding it just before the actual write.
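
To illustrate the difference between deferred and streaming encoding, here is a minimal sketch. The types `DeferredBatch` and `StreamingBatch` and the length-prefixed layout are hypothetical and only for illustration; they are not the actual `LogBatch` format in this PR.

```rust
/// Hypothetical batch that keeps a copy of every entry and encodes at write time.
pub struct DeferredBatch {
    entries: Vec<Vec<u8>>,
}

impl DeferredBatch {
    pub fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Stores an owned copy of the payload; nothing is encoded yet.
    pub fn add(&mut self, payload: &[u8]) {
        self.entries.push(payload.to_vec());
    }

    /// Encodes all stored entries in one pass, right before writing.
    pub fn encode(&self) -> Vec<u8> {
        let mut buf = Vec::new();
        for e in &self.entries {
            buf.extend_from_slice(&(e.len() as u32).to_le_bytes());
            buf.extend_from_slice(e);
        }
        buf
    }
}

/// Hypothetical batch that encodes each entry as soon as it is added.
pub struct StreamingBatch {
    buf: Vec<u8>,
}

impl StreamingBatch {
    pub fn new() -> Self {
        Self { buf: Vec::new() }
    }

    /// Appends the length prefix and payload directly to the output buffer,
    /// so no intermediate copy of the entry is kept.
    pub fn add(&mut self, payload: &[u8]) {
        self.buf.extend_from_slice(&(payload.len() as u32).to_le_bytes());
        self.buf.extend_from_slice(payload);
    }

    /// The buffer is already in its encoded form and can be written as is.
    pub fn encoded(&self) -> &[u8] {
        &self.buf
    }
}
```

Both variants produce the same bytes in this sketch; the streaming one simply avoids holding a second copy of each entry until write time.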

Stress results:

Master:
- Thread 1
[write]
Throughput(QPS) = 8004.80
Latency(μs) min = 70, avg = 116.66, p50 = 112, p90 = 143, p95 = 156, p99 = 204, p99.9 = 497, max = 3997
Fairness = 100.0%
Write Bandwidth = 5.9MiB/s
- Thread 2
[write]
Throughput(QPS) = 17403.63
Latency(μs) min = 61, avg = 107.43, p50 = 102, p90 = 129, p95 = 143, p99 = 189, p99.9 = 406, max = 3653
Fairness = 100.0%
Write Bandwidth = 13.0MiB/s
- Thread 4
[write]
Throughput(QPS) = 30859.47
Latency(μs) min = 63, avg = 122.39, p50 = 114, p90 = 150, p95 = 167, p99 = 324, p99.9 = 581, max = 6083
Fairness = 99.7%
Write Bandwidth = 23.1MiB/s
- Thread 8
[write]
Throughput(QPS) = 47897.03
Latency(μs) min = 67, avg = 160.13, p50 = 154, p90 = 216, p95 = 237, p99 = 299, p99.9 = 675, max = 4987
Fairness = 99.5%
Write Bandwidth = 35.7MiB/s
- Thread 16
[write]
Throughput(QPS) = 52373.21
Latency(μs) min = 70, avg = 298.32, p50 = 288, p90 = 440, p95 = 517, p99 = 729, p99.9 = 1236, max = 11711
Fairness = 99.3%
Write Bandwidth = 39.2MiB/s

Patched:
- Thread 1
[write]
Throughput(QPS) = 8918.95
Latency(μs) min = 53, avg = 91.23, p50 = 88, p90 = 113, p95 = 122, p99 = 156, p99.9 = 387, max = 6179
Fairness = 100.0%
Write Bandwidth = 6.6MiB/s
- Thread 2
[write]
Throughput(QPS) = 19137.79
Latency(μs) min = 46, avg = 85.84, p50 = 82, p90 = 103, p95 = 115, p99 = 147, p99.9 = 362, max = 4955
Fairness = 100.0%
Write Bandwidth = 14.2MiB/s
- Thread 4
[write]
Throughput(QPS) = 34229.31
Latency(μs) min = 48, avg = 98.22, p50 = 95, p90 = 121, p95 = 133, p99 = 168, p99.9 = 443, max = 8055
Fairness = 99.9%
Write Bandwidth = 25.6MiB/s
- Thread 8
[write]
Throughput(QPS) = 50313.45
Latency(μs) min = 49, avg = 140.74, p50 = 134, p90 = 197, p95 = 218, p99 = 278, p99.9 = 733, max = 6119
Fairness = 99.5%
Write Bandwidth = 37.7MiB/s
- Thread 16
[write]
Throughput(QPS) = 54520.49
Latency(μs) min = 55, avg = 274.42, p50 = 268, p90 = 406, p95 = 477, p99 = 650, p99.9 = 1113, max = 18015
Fairness = 99.7%
Write Bandwidth = 40.7MiB/s

Signed-off-by: tabokie <xy.tao@outlook.com>
@tabokie tabokie merged commit 130d3b6 into tikv:master Aug 20, 2021
@tabokie tabokie deleted the fs2 branch August 20, 2021 11:33
@zhangjinpeng87

@tabokie Please fill in the description. Without a full description, it is hard for others who want to learn more about this PR.

self.buf.encode_u64(0)?;
self.buf.encode_u64(0)?;
} else if self.buf_state != BufState::EntriesFilled {
// return error

FIXME

enum BufState {
Uninitialized,
EntriesFilled,
Sealed(usize),

Add a comment explaining the meaning of this counter.

ftruncate(self.0, offset as i64).map_err(|e| from_nix_error(e, "ftruncate"))
}

pub fn allocate(&self, offset: usize, size: usize) -> IoResult<()> {

Add a test that an allocated file reads back as zeros.
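
A rough sketch of what such a check could look like. As an assumption for portability, it extends the file with the standard library's `File::set_len` rather than the `allocate` method under review; the helper name `extended_region_is_zero` is hypothetical.

```rust
use std::fs::{self, OpenOptions};
use std::io::{Read, Seek, SeekFrom};
use std::path::Path;

/// Creates a file at `path`, extends it to `size` bytes, and reports whether
/// every byte reads back as zero. The file is removed afterwards.
/// Note: this uses `File::set_len` as a stand-in for the PR's `allocate`.
pub fn extended_region_is_zero(path: &Path, size: u64) -> std::io::Result<bool> {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    // Extending a file with `set_len` fills the new range with zeros.
    file.set_len(size)?;
    file.seek(SeekFrom::Start(0))?;
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;
    drop(file);
    fs::remove_file(path)?;
    Ok(buf.len() as u64 == size && buf.iter().all(|b| *b == 0))
}
```

A real test for `allocate` would call it in place of `set_len` and keep the same read-back assertion.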

tabokie added a commit that referenced this pull request Sep 7, 2021
Signed-off-by: tabokie <xy.tao@outlook.com>
tabokie added a commit to tabokie/raft-engine that referenced this pull request Sep 8, 2021
Signed-off-by: tabokie <xy.tao@outlook.com>
tabokie added a commit to tabokie/raft-engine that referenced this pull request Sep 8, 2021
Signed-off-by: tabokie <xy.tao@outlook.com>
tabokie added a commit that referenced this pull request Sep 8, 2021
1. Append compressed entries to log batch buffer instead of replacing the buffer.
2. Avoid allocating twice when creating a `protobuf::Message` from a decompressed slice
3. Use entry reference for log batch append

Bench results:
```
Patched:
test log_batch::tests::bench_log_batch_add_entry_and_encode ... bench:     136,299 ns/iter (+/- 44,912)
test log_batch::tests::bench_log_batch_add_entry_and_encode ... bench:     131,405 ns/iter (+/- 82,630)

Master:
test log_batch::tests::bench_log_batch_add_entry_and_encode ... bench:     199,314 ns/iter (+/- 95,954)
test log_batch::tests::bench_log_batch_add_entry_and_encode ... bench:     197,948 ns/iter (+/- 16,842)

Without streaming encoding (#91):
test log_batch::tests::bench_log_batch_add_entry_and_encode ... bench:     407,738 ns/iter (+/- 142,417)
test log_batch::tests::bench_log_batch_add_entry_and_encode ... bench:     411,079 ns/iter (+/- 111,409)
```

Signed-off-by: tabokie <xy.tao@outlook.com>
tabokie added a commit that referenced this pull request Sep 9, 2021
Replace file system (#91) with an Allocator-like type `FileBuilder`.

`FileBuilder` is the better fit: it is strongly typed with our own file writer/reader, so that 1) we don't need to propagate the `Send` and `Sync` trait bounds outside of this library, and 2) user writers or readers are guaranteed to use our writer/reader as a backend.

There are still some reasonable requirements on user types, such as no buffering and no altering of length, which are documented in the trait definition.
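
The idea can be sketched as follows. This is an illustration under assumptions, not the trait from the commit: `LogWriter`, the trait shape, `XorBuilder`, and `XorWriter` are all hypothetical names. The point is that the builder receives the library's concrete writer, so a user wrapper has to sit on top of it, and the example wrapper neither buffers nor changes the data length.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for the library's own file writer.
pub struct LogWriter {
    data: Vec<u8>,
}

impl LogWriter {
    pub fn new() -> Self {
        Self { data: Vec::new() }
    }

    pub fn bytes(&self) -> &[u8] {
        &self.data
    }
}

impl Write for LogWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.data.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical builder trait: the input is the concrete `LogWriter`,
/// so user writers are built around the library's backend.
pub trait FileBuilder {
    type Writer: Write;
    fn build_writer(&self, inner: LogWriter) -> Self::Writer;
}

/// Example user type: masks every byte with a fixed key (illustration only).
pub struct XorBuilder {
    pub key: u8,
}

pub struct XorWriter {
    inner: LogWriter,
    key: u8,
}

impl XorWriter {
    pub fn into_inner(self) -> LogWriter {
        self.inner
    }
}

impl Write for XorWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Length-preserving transform, forwarded immediately (no buffering).
        let masked: Vec<u8> = buf.iter().map(|b| b ^ self.key).collect();
        self.inner.write_all(&masked)?;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

impl FileBuilder for XorBuilder {
    type Writer = XorWriter;

    fn build_writer(&self, inner: LogWriter) -> XorWriter {
        XorWriter { inner, key: self.key }
    }
}
```

Because the wrapper writes exactly as many bytes as it receives and forwards them right away, offsets tracked by the library stay valid for the wrapped file.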

Signed-off-by: tabokie <xy.tao@outlook.com>