thijsterlouw edited this page Sep 13, 2010 · 8 revisions

This is probably the best-performing Erlang Memcached client out there. It is thoroughly optimized for speed:

  • uses the Memcached binary protocol (sequence numbers + faster parsing)
  • avoids blocking the gen_server that maintains the persistent connections to the Memcached server
    • that means using {active, true} on the TCP connection.
    • If you use {active, false} (as almost every other client does), you block the gen_server process while waiting in gen_tcp:recv/2. The gen_server’s mailbox can then grow huge, which in turn means many requests time out on the Erlang side (or you just get very bad performance)
  • it uses dynamically compiled code to store the configuration. This is especially important for the Ketama-style consistent-hash ring and the binary-search lookup of a server address from the hashed key
    • if you store the config in a process, then you have the bottleneck of sequential access
    • if you read the config only from a config file / the environment, you don’t get dynamic updates (unless you reload the environment, which becomes messy)
    • you could use an ETS table to store the points on the continuum, but if you store all as one key, it’s very slow to copy the entire ring to the requesting process
    • if you store the ring points as separate keys in ETS, it is already much faster, but you still have to access ETS many times, which is slower than necessary
    • the best solution is to dynamically compile the configuration and have a process responsible for maintaining the state and updating the file
      • even then, several options exist for the matching: gb_trees, case matching, etc. It turns out that Erlang’s case matching doesn’t scale well with many points on the ring (160 points × number of servers); gb_trees does much better. We are also experimenting with a driver (C code) for this part: you would gain more speed, but flexibility becomes a bit harder.
  • this client supports the concept of Memcached pools: you can put servers A, B, and C in one pool and servers D, E, and F in another. With many servers, or while migrating servers, this is a huge benefit. Fewer servers per pool also makes multi-get requests faster.
  • supports a very fast JSON encoder/decoder via a customized eep0018/yajl extension
  • avoids calls to erlang:now() by using NIFs (C code)
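To make the non-blocking point above concrete, here is a minimal sketch of a connection owner that uses {active, true}. This is not the client’s actual code (the module name mc_conn and record fields are invented for illustration): responses arrive as ordinary messages in handle_info/2, so the process never parks in gen_tcp:recv/2 and its mailbox keeps draining. It assumes, for brevity, that one TCP message is one complete response; a real client reassembles frames and matches binary-protocol sequence numbers.

```erlang
-module(mc_conn).
-behaviour(gen_server).
-export([start_link/2, request/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Socket plus a FIFO queue of callers still waiting for a reply.
-record(state, {socket, pending = queue:new()}).

start_link(Host, Port) ->
    gen_server:start_link(?MODULE, {Host, Port}, []).

request(Pid, Packet) ->
    gen_server:call(Pid, {request, Packet}).

init({Host, Port}) ->
    %% {active, true}: incoming data is delivered as {tcp, ...} messages,
    %% so this process is never blocked inside gen_tcp:recv/2.
    {ok, Socket} = gen_tcp:connect(Host, Port, [binary, {active, true}]),
    {ok, #state{socket = Socket}}.

handle_call({request, Packet}, From, #state{socket = S, pending = Q} = State) ->
    ok = gen_tcp:send(S, Packet),
    %% Don't reply yet: park the caller and keep serving other requests.
    {noreply, State#state{pending = queue:in(From, Q)}}.

handle_info({tcp, _S, Data}, #state{pending = Q0} = State) ->
    %% Simplification: one message == one complete response.
    {{value, From}, Q} = queue:out(Q0),
    gen_server:reply(From, Data),
    {noreply, State#state{pending = Q}};
handle_info({tcp_closed, _S}, State) ->
    {stop, normal, State}.

handle_cast(_Msg, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

With {active, false} the equivalent code would sit in gen_tcp:recv/2 inside handle_call/3, and every other caller’s request would pile up in the mailbox until the socket read returned.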
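The dynamically-compiled-configuration trick can be sketched as follows (the module names config_compile and config_ring are hypothetical; the client’s real code generator is more elaborate). A throwaway module is built from abstract forms and hot-loaded, after which every process reads the configuration as a plain local function call: no message to a config process, no ETS copy.

```erlang
-module(config_compile).
-export([store/1]).

%% Compile Config into a module config_ring that exports get/0
%% returning Config as a literal, then hot-load it.
store(Config) ->
    Forms = [
        {attribute, 1, module, config_ring},
        {attribute, 2, export, [{get, 0}]},
        {function, 3, get, 0,
         %% erl_parse:abstract/1 turns the term into abstract syntax,
         %% so the compiled get/0 simply returns the literal.
         [{clause, 3, [], [], [erl_parse:abstract(Config)]}]}
    ],
    {ok, config_ring, Bin} = compile:forms(Forms),
    {module, config_ring} = code:load_binary(config_ring, "config_ring.erl", Bin),
    ok.
```

A single process owns the right to call store/1 (so updates are serialized), while any number of readers call config_ring:get() concurrently with no bottleneck.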
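The binary-search lookup on the continuum can also be sketched, assuming the ring has been prebuilt as a tuple of {HashPoint, Server} pairs sorted by hash point (module and function names are illustrative, not the client’s actual API). The lookup finds the first point with hash >= the key’s hash, wrapping to the first point if the key hashes past the end of the ring.

```erlang
-module(ring_lookup).
-export([lookup/2]).

%% Points is a tuple of {HashPoint, Server} sorted ascending by HashPoint.
lookup(KeyHash, Points) ->
    N = tuple_size(Points),
    case search(KeyHash, Points, 1, N) of
        none -> element(2, element(1, Points));   %% wrap around the ring
        Idx  -> element(2, element(Idx, Points))
    end.

%% Binary search for the smallest index whose point is >= Hash.
search(_Hash, _Points, Lo, Hi) when Lo > Hi ->
    none;
search(Hash, Points, Lo, Hi) ->
    Mid = (Lo + Hi) div 2,
    {Point, _Server} = element(Mid, Points),
    if
        Point < Hash ->
            search(Hash, Points, Mid + 1, Hi);
        true ->
            %% Mid qualifies; see if an earlier point also does.
            case search(Hash, Points, Lo, Mid - 1) of
                none -> Mid;
                Idx  -> Idx
            end
    end.
```

A gb_trees-based variant stores the same points as a tree and walks gb_trees:iterator_from(Hash, Tree) to the next point, falling back to gb_trees:smallest/1 for the wrap-around; both approaches are O(log N) in the number of ring points.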

TODO:

  • removing (and adding back) servers that go down (and up), for example in the case of network splits