Using fatcache #12
Hi, is fatcache currently in production use? I'm interested in contributing to optimizing memcache for SSDs, and would like to start with performance analysis to expose optimization opportunities. Should I start with fatcache, or is there another deployment you would recommend? Thanks.

Comments
We are currently not running fatcache in production. Given the work at hand (a data insight project), my guess is that we will start looking at multiple cache solutions at the beginning of next year, and make a case for using fatcache for the long-tail part as we gain more insight into the data.
Hi Yue, thank you. I am an SSD architect at Intel. I'd like to determine the performance of memcache on our future SSDs. Annie
@annief the current perf numbers with a commodity Intel 320 SSD (https://github.com/twitter/fatcache/blob/master/notes/spec.md) can be found here: https://github.com/twitter/fatcache/blob/master/notes/performance.md One design decision in fatcache that limits how much it gets out of the SSD is that IO to disk is synchronous. Since fatcache is single threaded and uses asynchronous IO for the network, the synchronous IO to the SSD limits its performance (throughput, to be more precise). I believe we can make fatcache IO async by using epoll and the Linux AIO framework -- https://code.google.com/p/kernel/wiki/AIOUserGuide. If we can fix this, fatcache would become a viable alternative to in-memory caches.
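A minimal sketch of that epoll + Linux AIO combination, assuming libaio and an eventfd to deliver disk completions into the same event loop that watches sockets (the device path, sizes, and filename are placeholders, not fatcache's actual code):

```c
/* Minimal sketch: one event loop for sockets and SSD reads.
 * Linux kernel AIO (libaio) completions are surfaced through an
 * eventfd registered with epoll. Build with: gcc -o aio aio.c -laio
 * The device path is a placeholder. */
#define _GNU_SOURCE                             /* for O_DIRECT */
#include <fcntl.h>
#include <libaio.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

int main(void) {
    io_context_t ctx = 0;
    io_setup(128, &ctx);                        /* kernel AIO context */

    int efd = eventfd(0, EFD_NONBLOCK);         /* completion "doorbell" */
    int ep  = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };
    epoll_ctl(ep, EPOLL_CTL_ADD, efd, &ev);     /* client sockets join here too */

    int fd = open("/dev/sdb", O_RDONLY | O_DIRECT);   /* placeholder SSD */
    void *buf;
    posix_memalign(&buf, 512, 4096);            /* O_DIRECT wants aligned buffers */

    struct iocb cb, *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, 4096, 0);       /* read one page at offset 0 */
    io_set_eventfd(&cb, efd);                   /* bump efd when the read lands */
    io_submit(ctx, 1, cbs);                     /* returns without blocking */

    struct epoll_event out[16];
    int n = epoll_wait(ep, out, 16, -1);        /* the single-threaded loop */
    for (int i = 0; i < n; i++) {
        if (out[i].data.fd == efd) {
            uint64_t ndone;
            read(efd, &ndone, sizeof(ndone));   /* number of finished iocbs */
            struct io_event evs[16];
            int r = io_getevents(ctx, 1, 16, evs, NULL);
            for (int j = 0; j < r; j++)
                printf("read completed: %ld bytes\n", (long)evs[j].res);
        } /* else: a client socket is readable; parse memcache requests */
    }
    io_destroy(ctx);
    return 0;
}
```

The key property is that io_submit returns immediately, so the single thread keeps servicing the network while reads are in flight; the eventfd makes disk completions look like just another readable fd.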
We have a pull request adding async support to fatcache and I should look at it real soon. A collaboration would be fantastic.
I see: being both single threaded and synchronous can be a limit. Just so you know, I am very new to fatcache. I appreciate you helping me understand it better.
Would it be useful if I did a performance analysis to determine which performance bottlenecks in fatcache are worth fixing?
Happy to help with any questions you have about fatcache and/or its architecture. Please feel free to ask anything.
> Would making fatcache multithreaded (vs async) be a better longer-term scaling strategy?

I'm betting on async disk IO. If you look at the perf numbers in performance.md, you will notice that the trick is to increase the average queue size to get 100% utilization (see the %util iostat numbers). To do that, I had to run 8 instances of fatcache on a single SSD. It would be nice to get 100% util with just a single fatcache instance. Furthermore, as we evolve the fatcache architecture it would be nice to let a single fatcache instance talk to multiple SSDs at the same time (think of lvm), which would essentially allow us to use commodity SSDs that have limited parallelism (queue length).
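A minimal sketch of that lvm-style idea, assuming round-robin striping of slabs across devices (the device count, slab size, and function names are illustrative assumptions, not fatcache's real layout):

```c
/* Minimal sketch: stripe logical slabs over several SSDs so a single
 * process keeps a deep aggregate queue. Device count, slab size, and
 * function names are illustrative assumptions, not fatcache's layout. */
#define _GNU_SOURCE                           /* for O_DIRECT */
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>

#define NDEV      4
#define SLAB_SIZE (1024 * 1024)               /* assume 1 MB slabs */

static int dev_fd[NDEV];                      /* one fd per SSD */

static void stripe_open(const char *paths[]) {
    for (int i = 0; i < NDEV; i++)
        dev_fd[i] = open(paths[i], O_RDWR | O_DIRECT);
}

/* Logical slab id -> (device fd, byte offset). Round-robin placement
 * sends consecutive slabs to different SSDs, so outstanding reads
 * spread across all device queues instead of piling up on one. */
static void stripe_locate(uint32_t slab_id, int *fd, off_t *off) {
    *fd  = dev_fd[slab_id % NDEV];
    *off = (off_t)(slab_id / NDEV) * SLAB_SIZE;
}
```

Striping consecutive slabs across devices spreads outstanding IOs over several hardware queues, which is exactly the aggregate queue depth a single commodity SSD cannot provide on its own.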
> Is fatcache an extension of twemcache or memcache with SSD consideration? Does it not inherit memcache's existing multithreaded framework?

No, it doesn't. It implements the memcache ASCII protocol (https://github.com/twitter/fatcache/blob/master/notes/memcache.txt). But instead of being multithreaded, its architecture is single threaded, which was popularized by key-value stores like redis. Single threaded makes sense because you can always run multiple instances of fatcache on one machine, or on multiple machines, to scale horizontally, with some kind of sharding layer on the client to route traffic to one of the fatcache instances.
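A minimal sketch of such a client-side sharding layer, assuming plain modulo hashing over a static instance list (production clients typically use consistent hashing or a proxy instead; the endpoints are placeholders):

```c
/* Minimal sketch: client-side sharding across fatcache instances.
 * FNV-1a picks the instance; the endpoints are placeholders. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static const char *instances[] = {            /* placeholder endpoints */
    "10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211",
};
#define NINSTANCES (sizeof(instances) / sizeof(instances[0]))

/* FNV-1a: a simple, well-known string hash. */
static uint64_t fnv1a(const char *key, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= (uint8_t)key[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Every client must agree on this mapping for the shards to line up. */
static const char *route(const char *key) {
    return instances[fnv1a(key, strlen(key)) % NINSTANCES];
}

int main(void) {
    printf("GET user:42 -> %s\n", route("user:42"));
    return 0;
}
```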
> What in fatcache inherently requires serialization? I assume that's why you chose a single-threaded implementation?

Single threaded makes reasoning simple :) There is nothing in fatcache that is CPU intensive enough to warrant multithreading. Most of the work in fatcache is handling network IO and disk IO. Currently network IO is async, but disk IO is not. Sync disk IO is limiting fatcache performance.
Absolutely!
I understand what you mean now. You are counting on multiple instances (processes) to scale, not necessarily multiple threads. Makes good sense. Are the default settings optimal for a fatcache/memcache deployment? I'm looking for help from folks who run or oversee a memcache deployment to help get the right settings. Thanks.
We do use jumbo frames when possible. A single get/set can be split across multiple packets, but for most workloads we observe, simple key/value requests are usually smaller than one MTU, even without the jumbo frame config.
This is a great discussion. I have a couple of questions:

- Are there any other performance numbers available? I'm curious about how things work out when storing larger values.
- Would it make sense to use several SSDs in a box instead of one?
- annief, have you been testing this on your newer hardware?

Thanks!
Hi Archie, I am behind on what I said I would do. No performance numbers yet; my apologies. I will get back into that mode. Per your question on size, the way I think about it is: … Per your question on the number of SSDs: if your goal is to achieve the maximum possible throughput, more SSDs is better.
Can someone who has experience with production memcache help guide us on these questions? Thanks!
Hi Annie, looks like you closed the issue? We run memcache in production; get in touch.
Hi Archie, I clicked the wrong button! Let me reopen the issue. What is a good next step to incorporate your workload insights?
Hi Annie, just drop me a message -- my email's on my GitHub page.
Page 2 of this deck has a few numbers -- https://speakerdeck.com/manj/caching-at-twitter-with-twemcache-twitter-open-source-summit; it is a year old though. @thinkingfish can possibly answer your questions.
Regarding latencies, here is a profile of Twemcache by Rao Fu: …

When it comes to throughput, because of the existence of multiget and the …
This depends on the bottleneck. The overly simplified answer is "yes". If a …

-Yao (@thinkingfish)
Hi,
a. The challenge is knowing the size. Is the size obtained via trial and error to meet a given cache hit ratio?
In your setup, what would the requests/s-to-size ratio need to be to motivate moving to higher-capacity, lower-cost SSDs?
Hmmm, there are many variables that it depends on. On the service side, …

Let's fix the key/value size first. Assuming we have a 100-byte key+value, …

Now we can vary key/value size. If we go down in key/value size, the …

At a high level the incentive comes from datasets with a long tail (large …

All this calculation can be better presented with a chart. I haven't got …

-Yao (@thinkingfish)
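A rough sketch of this kind of back-of-envelope calculation, with every number below assumed purely for illustration (the item size, SSD capacity, and IOPS figures are not measurements from this thread):

```latex
% All numbers are illustrative assumptions, not measurements.
% Item size s = 100 B, SSD capacity C = 300 GB, SSD read rate R = 40k IOPS.
\[
  \text{items per SSD} = \frac{C}{s} = \frac{300\times10^{9}}{100} = 3\times10^{9}
\]
\[
  \text{sustainable request density} = \frac{R}{C}
    = \frac{4\times10^{4}\ \text{req/s}}{300\ \text{GB}} \approx 133\ \text{req/s per GB}
\]
% A workload whose requests/s-to-size ratio sits below R/C is
% capacity-bound, not IOPS-bound: the long tail can live on flash
% without demanding throughput the SSD could never deliver anyway.
```

Under those assumed numbers, a dataset demanding fewer than roughly 133 req/s per GB would keep the SSD below its IOPS ceiling, which is the regime where trading DRAM for flash pays off.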