forked from valyala/ybc
YBC - Yet another Blob Cache library.
This library implements a fast in-process blob cache with persistence support.
===============================================================================
YBC features.
* Huge amounts of data can be cached. Cache size may exceed available RAM size
by multiple orders of magnitude.
* Very large objects may be cached (up to 2^64 bytes). Think of media files
such as videos, audios and images.
* Very large cache size is supported (up to 2^64 items and up to 2^64 bytes).
In practice cache size is limited by free space on backing store,
not by available RAM size.
* Persistence support. Cached objects survive process restart if the cache
is explicitly backed by files.
* Data file layout is optimized for HDD and SSD devices. The code prefers
sequential I/O over random I/O. If random I/O is inevitable, the code tries
hard to localize it within the smallest possible memory region.
* Automatic robust recovery from corrupted index files. Corruptions in data
files may go unnoticed for performance reasons, since validating multi-GB
blobs on every access may be quite expensive.
* Built-in handling of dogpile effect (aka 'thundering herd').
* Cached objects may be sharded among available backing store devices.
Such sharding may linearly increase cache performance if frequently accessed
items don't fit available physical RAM.
* The library is thread-safe out of the box.
* Concurrent atomic updates. It is safe to update an item from multiple threads
while other threads read the old value stored under the same key.
* 'Add transaction' support allows constructing an object's value on the fly
without serializing it into a temporary buffer (aka 'zero-copy').
An uncommitted transaction can be rolled back at any time.
Think of videos streamed directly into the cache without temporary buffers,
or complex objects assembled from multiple distinct places before being
stored into the cache.
* Readers and writers don't block each other while reading/writing data
from/into the cache. The speed is actually limited by hardware memory
bandwidth (if frequently accessed items fit RAM) or backing store
random I/O bandwidth (if frequently accessed items don't fit RAM).
* Instant invalidation of all items in the cache regardless of cache size.
* Optimization for multi-tiered memory hierarchy in modern CPUs. The code avoids
unnecessary random memory accesses and tightly packs frequently accessed data
in order to reduce working set size and increase CPU cache hit ratio.
* The code avoids using dynamic memory allocations in frequently executed paths,
so cache performance is almost independent of the efficiency of the provided
malloc() implementation.
* The code avoids using expensive system calls in frequently executed paths.
* The code depends only on OS-supplied libraries.
* The code can be easily ported to new platforms. All platform-specific
functions and types are hidden behind platform-independent wrappers.
* The public interface (ybc.h) is designed with forward compatibility in mind.
It doesn't expose private structures' contents, so those can be freely
modified in future versions of the library without breaking applications
dynamically linked against older versions.
* Small library size (can be packed into 22 KB with -Os).
===============================================================================
Use-cases.
* CDN cache.
* File hosting cache.
* Memcache-like shared cache.
* Per-process local cache in web-servers.
* Web-proxy or web-accelerator cache.
* Web-browser cache.
* Out-of-GC blob cache for programming languages with GC support.
================================================================================
Credits.
YBC design is inspired by Varnish's "Notes from the Architect" -
https://www.varnish-cache.org/trac/wiki/ArchitectNotes .
================================================================================
FAQ.
Q: Give me performance numbers!
A: Well, it achieves 10M qps for get() calls and 4M qps for set() calls in one
thread on my not-so-fast laptop with a 2.50 GHz Intel i5 CPU.
Q: Is this yet another boring cache implementation?
A: No. YBC is designed for modern OSes running on modern hardware. It takes
advantage of OS and hardware features while simultaneously avoiding
their weaknesses. This results in high performance and low resource
usage. Read the 'YBC features' chapter for more details.
Q: OK, how to use it in my program?
A: 1. Investigate API provided by YBC at ybc.h . See tests/ folder
for examples on how to properly use the API.
2. Use this API in your program.
3. Build either object file (ybc-[32|64]-[debug|release].o) or library
(libybc-[debug|release].so) using the corresponding build target
in the provided Makefile.
4. Link the object file or library against your program.
5. ?????
6. PROFIT!!!
Also take a look at the bindings/ folder. It contains YBC bindings for various
programming languages if you don't like programming in C. Though currently
only Go is supported :)
Other folders worth investigating:
* libs/ folder contains additional libraries built on top of YBC.
* apps/ folder contains applications built on top of YBC. Currently
there are two apps here with self-descriptive names:
* memcached - memcache server implementation on top of Go bindings
for YBC. Unlike the original memcache server, it supports cache
sizes exceeding available RAM by multiple orders of magnitude.
It also provides cache persistence, so cached data survives server
restarts or server crashes.
* memcached-bench - benchmark tool for memcached servers.
Makefile already contains build targets for all these apps.
Q: Why do recently added items sometimes disappear from the cache before
their TTL expires?
A: Because YBC is a cache, not a storage. They serve different purposes:
- Storage is used for items' persistence. It guarantees that added items
exist until they are explicitly deleted.
- Cache is used as a performance booster. It doesn't guarantee that added
items exist until they are explicitly deleted.
YBC may delete any item at any time due to various reasons, which depend
on implementation details. So always expect that an added item may disappear
at any time during subsequent requests.
Q: Can YBC cache small items, not large blobs?
A: Yes. YBC works well with small items - it even supports zero-byte keys and
zero-byte values.
Q: Why can't YBC adjust backing files' size on demand, i.e. automatically
shrink files when the number of items in the cache is small and automatically
expand files when the number of items exceeds the current files' capacity?
A: Because this is a stupid idea, for the following reasons:
- If cache size isn't bound, then backing files may eat all the available
space on storage devices.
- Automatic file size adjustment may significantly increase file
fragmentation, which will lead to performance degradation.
- File size adjustment may lead to arbitrarily long delays in request
serving.
- This will complicate the implementation, which may result in more bugs.
Q: What's the purpose of cache persistence?
A: Cache persistence allows skipping the cache warm-up step after an
application restart. This step can be very expensive and time-consuming
for large caches and slow backends. You can also make a 'golden' copy
of a persistent cache and start every node in the application cluster
with its own copy of the 'golden' cache, thus avoiding the warm-up step
on all nodes in the cluster.
Q: Why doesn't cache performance scale on multiple CPUs, but actually drops?
A: Because the cache uses a global lock to keep your data in a consistent
state. But this lock is held only for a very short period of time during
'lookup' and 'new add transaction' operations. The cache doesn't hold
the lock while data is read from or written to the cache. So your application
should scale well on multiple CPUs if it works with cache items exceeding
1 KB in size, or if it is busy with other tasks besides hammering the cache
in a tight loop.
Q: Are cache files platform-neutral?
A: No. Cache files are tightly tied to the platform they were created on.
Cache files created on one platform (for example, x86) will be completely
broken when read on another platform (for example, x86-64), though they
will most likely appear empty.
Supporting platform interoperability would complicate the code and probably
degrade performance. So copy cache files only between machines with
identical platforms.