sduwangning edited this page Aug 24, 2012 · 5 revisions

Introduction

Squid-ssd (by renren.com) is a modified Squid optimized for mixed SSD+HDD setups, based on the squid-2.7.STABLE9 code.

Squid-ssd is suitable for serving millions of small files such as avatars, pictures, and photos. For serving big files, you can use the other store types Squid has had for years.

Build & Configuration

Please use the 'coss-index' branch. It adds indexing, which cuts startup time from hours down to the 10-second range.

$ git clone https://github.com/renren/squid-ssd.git
$ cd squid-ssd
$ git checkout coss-index

To build squid-ssd, pass the --enable-coss-aio-ops option to the configure script, then run make && make install.

In the config file, squid-ssd adds a 'backstore' option to the coss store. Below is an example configuration:

cache_dir coss /mnt/ssd0/stripe 100000        \
        backstore=/mnt/hdd0/stripe,400000     \
        backstore=/mnt/hdd1/stripe,400000     \
        backstore=/mnt/hdd2/stripe,400000     \
        max-size=1024768 block-size=1024
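
To make the layout of this example explicit, here is a minimal Python sketch that parses a cache_dir line of this shape and sums the capacity of each tier. It assumes the backstore sizes use the same unit as the primary cache_dir size (megabytes in stock Squid), which this page does not state outright. The function name parse_coss_cache_dir is hypothetical and not part of squid-ssd.

```python
def parse_coss_cache_dir(line):
    """Split a 'cache_dir coss ...' line into primary store, backstores, options."""
    # Treat line-continuation backslashes as plain whitespace.
    tokens = line.replace("\\", " ").split()
    if tokens[:2] != ["cache_dir", "coss"]:
        raise ValueError("not a coss cache_dir line")
    primary = {"path": tokens[2], "size_mb": int(tokens[3])}
    backstores, options = [], {}
    for tok in tokens[4:]:
        key, _, value = tok.partition("=")
        if key == "backstore":
            # backstore=<path>,<size>  (size unit assumed to be MB)
            path, _, size = value.rpartition(",")
            backstores.append({"path": path, "size_mb": int(size)})
        else:
            options[key] = int(value)
    return primary, backstores, options


EXAMPLE = r"""
cache_dir coss /mnt/ssd0/stripe 100000        \
        backstore=/mnt/hdd0/stripe,400000     \
        backstore=/mnt/hdd1/stripe,400000     \
        backstore=/mnt/hdd2/stripe,400000     \
        max-size=1024768 block-size=1024
"""

primary, backstores, options = parse_coss_cache_dir(EXAMPLE)
hdd_total_mb = sum(b["size_mb"] for b in backstores)
print("SSD tier:", primary["size_mb"], "MB")
print("HDD tier:", hdd_total_mb, "MB across", len(backstores), "backstores")
```

Under that assumption, the example gives the HDD tier twelve times the capacity of the SSD tier.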

For a clean setup, you can use squid -z to initialize the coss stripe files. Squid-ssd is also designed to be compatible with legacy coss stripe files. Suppose, as in the example above, you already have three coss stripe files /mnt/hdd{0,1,2}/stripe on three different HDDs, and you have just installed an SSD and mounted it at /mnt/ssd0. You can simply run dd if=/mnt/hdd0/stripe of=/mnt/ssd0/stripe bs=1MiB count=100000 to initialize the SSD stripe file.

Architecture

Storing small files is a big problem for most conventional native filesystems: there is too much metadata, and metadata and data are scattered all over the disk, so allocating, writing, reading, moving, and deleting files all involve heavy random reads and writes. For a cache like Squid, the same problem exists and is amplified, because a cache serves heavy random reads.

The OS filesystem (memory) caching mechanism may help if all that metadata is cached in memory. But caching data is also important, and sometimes more important, because serving data from the memory cache avoids disk I/O, including metadata I/O.

Squid's COSS store is built on the principle that bundling pieces together reduces the aforementioned overhead. Small files are packed into one big COSS file; reads are still mostly random, but writes are made as sequential as possible. A few big files also carry far less metadata overhead than millions of small files. For real-world maintenance, COSS is also an optimized (though not problem-free) solution, because loading a COSS file is entirely sequential.
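
The packing idea can be illustrated with a short Python sketch. It is not the COSS on-disk format: the PackedStore class, its in-memory index, and the 1024-byte alignment are assumptions made for illustration. It only shows how many small objects can share one large file, with writes appended sequentially and reads served by offset.

```python
import io


class PackedStore:
    """Toy append-only store: many small objects packed into one big file."""

    def __init__(self, fileobj, block_size=1024):
        self.f = fileobj
        self.block_size = block_size
        self.index = {}      # key -> (offset, length), kept in memory
        self.write_pos = 0   # next block-aligned write offset

    def put(self, key, data):
        # Writes always go to the current end, so the disk sees sequential writes.
        self.f.seek(self.write_pos)
        self.f.write(data)
        self.index[key] = (self.write_pos, len(data))
        # Round the next write position up to a block boundary.
        blocks = -(-len(data) // self.block_size)
        self.write_pos += blocks * self.block_size

    def get(self, key):
        # Reads seek to wherever the object landed, so they remain random.
        offset, length = self.index[key]
        self.f.seek(offset)
        return self.f.read(length)


store = PackedStore(io.BytesIO())
store.put("avatar/1.jpg", b"a" * 300)
store.put("avatar/2.jpg", b"b" * 1500)
store.put("avatar/3.jpg", b"c" * 10)
print(store.index)
```

The only per-object metadata here is one index entry, instead of one inode plus directory entry per object as on a regular filesystem.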

The COSS store has many merits, but the world is not perfect yet: random reads within a few big files are still random reads. COSS also introduces extra overhead for reallocating hot content, even if that overhead is not heavy.

So using SSDs is a logical choice. Current-generation SSDs are excellent at sequential writes and at both sequential and random reads, and they have decent random write capability (compared to HDDs).

SSDs get cheaper every day, but they are still expensive compared to HDDs. An all-SSD setup may or may not be a good idea.

If your workload has a high repeat rate (that is, on average and fairly evenly, each cached object is accessed many times) and the cache eviction and replacement rate is low (so total disk capacity is not an issue), use an all-SSD setup. You can even use a 10G NIC to match the throughput capability.

Nevertheless, if you run a popular web 2.0 site, you are almost certainly not so lucky. Avatars should be OK. Some pictures and photos can become hot spots, but only for a short time while the related content is new. They will be dug up some months later, accessed a few times, and forgotten under the dust again.

So you face these challenges when applying SSDs in CDN nodes:

  1. You need to serve huge traffic while keeping responses quick
  2. You need fast writes for a modest volume of back-to-origin fetches
  3. You need large capacity to reduce back-to-origin fetches, but SSDs are expensive

Squid-ssd addresses those challenges well:

  1. Users read from SSD; back-to-origin fetches are written to SSD
  2. Objects are evicted from SSD to HDD, and from HDD to nothing
  3. Objects are promoted from HDD to SSD to avoid back-to-origin fetches
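
The three behaviors above can be sketched as a toy two-tier cache in Python. This is an illustration only: the TieredCache class, the LRU eviction order, and the promote-on-first-hit rule are assumptions, not squid-ssd's actual policy, which this page does not specify.

```python
from collections import OrderedDict


class TieredCache:
    """Toy SSD+HDD cache; each tier holds a fixed number of objects (LRU assumed)."""

    def __init__(self, ssd_slots, hdd_slots, fetch_from_origin):
        self.ssd = OrderedDict()
        self.hdd = OrderedDict()
        self.ssd_slots = ssd_slots
        self.hdd_slots = hdd_slots
        self.fetch_from_origin = fetch_from_origin
        self.origin_fetches = 0

    def _put_ssd(self, key, value):
        self.ssd[key] = value
        self.ssd.move_to_end(key)
        while len(self.ssd) > self.ssd_slots:
            old_key, old_value = self.ssd.popitem(last=False)
            self.hdd[old_key] = old_value        # evicted from SSD to HDD
            self.hdd.move_to_end(old_key)
            while len(self.hdd) > self.hdd_slots:
                self.hdd.popitem(last=False)     # evicted from HDD to nothing

    def get(self, key):
        if key in self.ssd:                      # user reads come from SSD
            self.ssd.move_to_end(key)
            return self.ssd[key]
        if key in self.hdd:
            value = self.hdd.pop(key)            # promoted from HDD to SSD
        else:
            value = self.fetch_from_origin(key)  # back-to-origin fetch
            self.origin_fetches += 1
        self._put_ssd(key, value)                # new and promoted objects land on SSD
        return value


cache = TieredCache(ssd_slots=2, hdd_slots=2,
                    fetch_from_origin=lambda k: "body of " + k)
for key in ["a", "b", "c", "a", "d", "e", "f", "a"]:
    cache.get(key)
print(list(cache.ssd), list(cache.hdd), cache.origin_fetches)
```

In this run, object "a" is requested three times but fetched from the origin only once, because the later requests find it on the HDD tier and promote it back to SSD.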

TODOs

  1. Use fallocate to initialize the coss stripe file on filesystems that support the fallocate system call (ext4, xfs)
  2. Add an index for the coss file to speed up startup. This code is in the coss-index branch; however, it has not been reviewed or fully tested.
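
For the first TODO item, preallocating a stripe file might look like the following Python sketch. It uses os.posix_fallocate from the Python standard library (available on Linux and most Unix systems) rather than the C call squid-ssd itself would need. The preallocate_stripe helper and the temporary path are hypothetical.

```python
import os
import tempfile


def preallocate_stripe(path, size_bytes):
    """Reserve size_bytes for a stripe file without writing its contents."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Ask the filesystem to allocate the whole range up front.
        os.posix_fallocate(fd, 0, size_bytes)
    finally:
        os.close(fd)


# Demonstrate on a small temporary file instead of a real stripe.
with tempfile.TemporaryDirectory() as tmp:
    stripe = os.path.join(tmp, "stripe")
    preallocate_stripe(stripe, 16 * 1024 * 1024)
    allocated = os.path.getsize(stripe)
print(allocated)
```

Compared with filling the file through dd, this lets a filesystem with native fallocate support reserve the space without writing every block.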

Limitations

Squid 2.7 is single-process and does not scale well on multi-core systems (however, you can run multiple instances when convenient). You may see throughput capped at 400-600 Mbps when serving small files (such as avatars) because one core is close to 100% utilized.
