Overhauls the image cache to be truly optional
Fixes LP Bug#874580 - keyerror 'location' when fetch errors
Fixes LP Bug#817570 - Make new image cache a true extension
Fixes LP Bug#872372 - Image cache has virtually no unit test coverage

* Adds unit tests for the image cache (coverage goes from 26% to 100%)
* Removes caching logic from the images controller and places it into
  a removable transparent caching middleware
* Adds a functional test case that verifies caching of an image
  and subsequent cache hits
* Removes the image_cache_enabled configuration variable, since it's
  now enabled by simply including the cache in the application
  pipeline
* Adds a single glance-cache.conf to etc/ that replaces the
  multiple glance-pruner.conf, glance-reaper.conf and
  glance-prefetcher.conf files
* Adds documentation on enabling and configuring the image cache

TODO: Add documentation on the image cache utilities, like reaper,
      prefetcher, etc.

Change-Id: I58845871deee26f81ffabe1750adc472ce5b3797
jaypipes committed Oct 19, 2011
1 parent e764565 commit ad9e9ca
Showing 14 changed files with 835 additions and 138 deletions.
48 changes: 48 additions & 0 deletions doc/source/configuring.rst
@@ -468,6 +468,54 @@ To set up a user named ``glance`` with minimal permissions, using a pool called
  ceph-authtool --gen-key --name client.glance --cap mon 'allow r' --cap osd 'allow rwx pool=images' /etc/glance/rbd.keyring
  ceph auth add client.glance -i /etc/glance/rbd.keyring

Configuring the Image Cache
---------------------------

Glance API servers can be configured to have a local image cache. Caching of
image files is transparent and happens using a piece of middleware that can
optionally be placed in the server application pipeline.

Enabling the Image Cache Middleware
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To enable the image cache middleware, you would insert the cache middleware
into your application pipeline **after** the appropriate context middleware.

The cache middleware should be in your ``glance-api.conf`` in a section titled
``[filter:cache]``. It should look like this::

  [filter:cache]
  paste.filter_factory = glance.api.middleware.cache:filter_factory


For example, suppose your application pipeline in the ``glance-api.conf`` file
looked like so::

  [pipeline:glance-api]
  pipeline = versionnegotiation context apiv1app

In the above application pipeline, you would add the cache middleware after the
context middleware, like so::

  [pipeline:glance-api]
  pipeline = versionnegotiation context cache apiv1app

And that would give you a transparent image cache on the API server.
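
The ordering requirement can be illustrated with a small sketch. In a Paste Deploy pipeline the first filter listed is the outermost wrapper, so a request passes through ``versionnegotiation``, then ``context``, then ``cache`` before reaching the application. Because the cache middleware reads ``request.context``, the context middleware has to run first. The helpers below (``make_filter``, ``build_pipeline``) are hypothetical stand-ins written for this illustration and are not Glance or Paste Deploy code.

```python
# Hypothetical stand-ins for illustration; these are not Glance classes.
def make_filter(name, log):
    """Return a filter that records its name before calling the wrapped app."""
    def filter_(app):
        def wrapped(request):
            log.append(name)
            return app(request)
        return wrapped
    return filter_


def build_pipeline(filters, app):
    """Wrap app so that the first listed filter is the outermost one."""
    for f in reversed(filters):
        app = f(app)
    return app


log = []
names = ["versionnegotiation", "context", "cache"]
pipeline = build_pipeline([make_filter(n, log) for n in names],
                          lambda request: "response")
result = pipeline({})
# log now holds the filter names in the order the request reached them
```

Placing ``cache`` before ``context`` in the list would make the cache filter run before any context had been attached to the request.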

Configuration Options Affecting the Image Cache
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One main configuration file option affects the image cache.

* ``image_cache_datadir=PATH``

Required when image cache middleware is enabled.

Default: ``/var/lib/glance/image-cache``

This is the root directory where the image cache will write its
cached image files. Make sure the directory is writeable by the
user running the ``glance-api`` server.

Configuring the Glance Registry
-------------------------------
33 changes: 17 additions & 16 deletions etc/glance-api.conf
@@ -164,18 +164,6 @@ rbd_store_pool = images
 # For best performance, this should be a power of two
 rbd_store_chunk_size = 8

-# ============ Image Cache Options ========================
-
-image_cache_enabled = False
-
-# Directory that the Image Cache writes data to
-# Make sure this is also set in glance-pruner.conf
-image_cache_datadir = /var/lib/glance/image-cache/
-
-# Number of seconds after which we should consider an incomplete image to be
-# stalled and eligible for reaping
-image_cache_stall_timeout = 86400
-
 # ============ Delayed Delete Options =============================

 # Turn on/off delayed delete
@@ -188,15 +176,25 @@ scrub_time = 43200
 # Make sure this is also set in glance-scrubber.conf
 scrubber_datadir = /var/lib/glance/scrubber

+# =============== Image Cache Options =============================
+
+# Directory that the Image Cache writes data to
+image_cache_datadir = /var/lib/glance/image-cache/
+
 [pipeline:glance-api]
 pipeline = versionnegotiation context apiv1app
 # NOTE: use the following pipeline for keystone
 # pipeline = versionnegotiation authtoken auth-context apiv1app

+# To enable transparent caching of image files replace pipeline with below:
+# pipeline = versionnegotiation context cache apiv1app
+# NOTE: use the following pipeline for keystone auth (with caching)
+# pipeline = versionnegotiation authtoken auth-context cache apiv1app
+
 # To enable Image Cache Management API replace pipeline with below:
-# pipeline = versionnegotiation context imagecache apiv1app
+# pipeline = versionnegotiation context cachemanage apiv1app
 # NOTE: use the following pipeline for keystone auth (with caching)
-# pipeline = versionnegotiation authtoken auth-context imagecache apiv1app
+# pipeline = versionnegotiation authtoken auth-context cachemanage apiv1app

 [pipeline:versions]
 pipeline = versionsapp
@@ -210,8 +208,11 @@ paste.app_factory = glance.api.v1:app_factory
 [filter:versionnegotiation]
 paste.filter_factory = glance.api.middleware.version_negotiation:filter_factory

-[filter:imagecache]
-paste.filter_factory = glance.api.middleware.image_cache:filter_factory
+[filter:cache]
+paste.filter_factory = glance.api.middleware.cache:filter_factory
+
+[filter:cachemanage]
+paste.filter_factory = glance.api.middleware.cache_manage:filter_factory

 [filter:context]
 paste.filter_factory = glance.common.context:filter_factory
56 changes: 56 additions & 0 deletions etc/glance-cache.conf
@@ -0,0 +1,56 @@
[DEFAULT]
# Show more verbose log output (sets INFO log level output)
verbose = True

# Show debugging output in logs (sets DEBUG log level output)
debug = False

log_file = /var/log/glance/image-cache.log

# Send logs to syslog (/dev/log) instead of to file specified by `log_file`
use_syslog = False

# Directory that the Image Cache writes data to
image_cache_datadir = /var/lib/glance/image-cache/

# Number of seconds after which we should consider an incomplete image to be
# stalled and eligible for reaping
image_cache_stall_timeout = 86400

# image_cache_invalid_entry_grace_period - seconds
#
# If an exception is raised as we're writing to the cache, the cache-entry is
# deemed invalid and moved to <image_cache_datadir>/invalid so that it can be
# inspected for debugging purposes.
#
# This is the number of seconds to leave these invalid images around before
# they are eligible to be reaped.
image_cache_invalid_entry_grace_period = 3600

# Maximum size of the image cache, in bytes (1073741824 bytes = 1 GiB)
image_cache_max_size_bytes = 1073741824

# Percentage of the cache that should be freed (in addition to the overage)
# when the cache is pruned
#
# A percentage of 0% means we prune only as many files as needed to remain
# under the cache's max_size. This is space efficient but will lead to
# constant pruning as the size bounces just-above and just-below the max_size.
#
# To mitigate this 'thrashing', you can specify an additional amount of the
# cache that should be tossed out on each prune.
image_cache_percent_extra_to_free = 0.20

# Address to find the registry server
registry_host = 0.0.0.0

# Port the registry server is listening on
registry_port = 9191

[app:glance-pruner]
paste.app_factory = glance.image_cache.pruner:app_factory

[app:glance-prefetcher]
paste.app_factory = glance.image_cache.prefetcher:app_factory

[app:glance-reaper]
paste.app_factory = glance.image_cache.reaper:app_factory
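
The interaction between ``image_cache_max_size_bytes`` and ``image_cache_percent_extra_to_free`` described in the comments above can be worked through with a short sketch. It assumes that the extra amount is computed as a fraction of the maximum cache size and that nothing is pruned while the cache is at or under the limit; both are readings of the config comments, not the pruner's actual code, and ``bytes_to_free`` is a hypothetical helper.

```python
def bytes_to_free(current_size, max_size, percent_extra_to_free):
    """Return how many bytes one prune pass should try to free.

    Assumption: the extra amount is a fraction of max_size, and no
    pruning happens unless the cache is over its limit.
    """
    overage = current_size - max_size
    if overage <= 0:
        return 0
    extra = int(max_size * percent_extra_to_free)
    return overage + extra


MAX_SIZE = 1073741824                    # image_cache_max_size_bytes (1 GiB)
current = MAX_SIZE + 100 * 1024 * 1024   # cache is 100 MiB over the limit

# With 0.0 only the overage is freed, so the next cached image may
# push the cache over the limit again straight away.
minimal = bytes_to_free(current, MAX_SIZE, 0.0)

# With 0.20 the prune also frees an extra fifth of the maximum size,
# leaving headroom before the next prune is needed.
with_headroom = bytes_to_free(current, MAX_SIZE, 0.20)
```

Under these assumptions a larger value trades some cache capacity for fewer prune passes.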
180 changes: 180 additions & 0 deletions glance/api/middleware/cache.py
@@ -0,0 +1,180 @@
# vim: tabstop=4 shiftwidth=4 softtabstop=4

# Copyright 2011 OpenStack LLC.
# All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

"""
Transparent image file caching middleware, designed to live on
Glance API nodes. When images are requested from the API node,
this middleware caches the returned image file to local filesystem.
When subsequent requests for the same image file are received,
the local cached copy of the image file is returned.
"""

import httplib
import logging
import re
import shutil

from glance import image_cache
from glance import registry
from glance.api.v1 import images
from glance.common import exception
from glance.common import utils
from glance.common import wsgi

import webob

logger = logging.getLogger(__name__)
get_images_re = re.compile(r'^(/v\d+)*/images/(.+)$')


class CacheFilter(wsgi.Middleware):

    def __init__(self, app, options):
        self.options = options
        self.cache = image_cache.ImageCache(options)
        self.serializer = images.ImageSerializer()
        logger.info(_("Initialized image cache middleware using datadir: %s"),
                    options.get('image_cache_datadir'))
        super(CacheFilter, self).__init__(app)

    def process_request(self, request):
        """
        For requests for an image file, we check the local image
        cache. If present, we return the image file, appending
        the image metadata in headers. If not present, we pass
        the request on to the next application in the pipeline.
        """
        if request.method != 'GET':
            return None

        match = get_images_re.match(request.path)
        if not match:
            return None

        image_id = match.group(2)
        if self.cache.hit(image_id):
            logger.debug(_("Cache hit for image '%s'"), image_id)
            image_iterator = self.get_from_cache(image_id)
            context = request.context
            try:
                image_meta = registry.get_image_metadata(context, image_id)

                response = webob.Response()
                return self.serializer.show(response, {
                    'image_iterator': image_iterator,
                    'image_meta': image_meta})
            except exception.NotFound:
                # Interpolate after translation so the message catalog
                # lookup uses the untranslated format string
                msg = _("Image cache contained image file for image '%s', "
                        "however the registry did not contain metadata for "
                        "that image!") % image_id
                logger.error(msg)
                return None

        # Make sure we're not already prefetching or caching the image
        # that just generated the miss
        if self.cache.is_image_currently_prefetching(image_id):
            logger.debug(_("Image '%s' is already being prefetched,"
                           " not tee'ing into the cache"), image_id)
            return None
        elif self.cache.is_image_currently_being_written(image_id):
            logger.debug(_("Image '%s' is already being cached,"
                           " not tee'ing into the cache"), image_id)
            return None

        # NOTE(sirp): If we're about to download and cache an
        # image which is currently in the prefetch queue, just
        # delete the queue items since we're caching it anyway
        if self.cache.is_image_queued_for_prefetch(image_id):
            self.cache.delete_queued_prefetch_image(image_id)
        return None

    def process_response(self, resp):
        """
        We intercept the response coming back from the main
        images Resource, caching image files to the cache
        """
        if self.get_status_code(resp) != httplib.OK:
            return resp

        request = resp.request
        if request.method != 'GET':
            return resp

        match = get_images_re.match(request.path)
        if match is None:
            return resp

        image_id = match.group(2)
        if not self.cache.hit(image_id):
            # Make sure we're not already prefetching or caching the image
            # that just generated the miss
            if self.cache.is_image_currently_prefetching(image_id):
                logger.debug(_("Image '%s' is already being prefetched,"
                               " not tee'ing into the cache"), image_id)
                return resp
            if self.cache.is_image_currently_being_written(image_id):
                logger.debug(_("Image '%s' is already being cached,"
                               " not tee'ing into the cache"), image_id)
                return resp

            logger.debug(_("Tee'ing image '%s' into cache"), image_id)
            # TODO(jaypipes): This is so incredibly wasteful, but because
            # the image cache needs the image's name, we have to do this.
            # In the next iteration, remove the image cache's need for
            # any attribute other than the id...
            image_meta = registry.get_image_metadata(request.context,
                                                     image_id)
            resp.app_iter = self.get_from_store_tee_into_cache(
                image_meta, resp.app_iter)
        return resp

    def get_status_code(self, response):
        """
        Returns the integer status code from the response, which
        can be either a Webob.Response (used in testing) or httplib.Response
        """
        if hasattr(response, 'status_int'):
            return response.status_int
        return response.status

    def get_from_store_tee_into_cache(self, image_meta, image_iterator):
        """Called if cache miss"""
        with self.cache.open(image_meta, "wb") as cache_file:
            for chunk in image_iterator:
                cache_file.write(chunk)
                yield chunk

    def get_from_cache(self, image_id):
        """Called if cache hit"""
        with self.cache.open_for_read(image_id) as cache_file:
            chunks = utils.chunkiter(cache_file)
            for chunk in chunks:
                yield chunk


def filter_factory(global_conf, **local_conf):
    """
    Factory method for paste.deploy
    """
    conf = global_conf.copy()
    conf.update(local_conf)

    def filter(app):
        return CacheFilter(app, conf)

    return filter
@@ -27,9 +27,9 @@
 logger = logging.getLogger('glance.api.middleware.image_cache')


-class ImageCacheFilter(wsgi.Middleware):
+class CacheManageFilter(wsgi.Middleware):
     def __init__(self, app, options):
-        super(ImageCacheFilter, self).__init__(app)
+        super(CacheManageFilter, self).__init__(app)

         map = app.map
         resource = cached_images.create_resource(options)
@@ -52,6 +52,6 @@ def filter_factory(global_conf, **local_conf):
     conf.update(local_conf)

     def filter(app):
-        return ImageCacheFilter(app, conf)
+        return CacheManageFilter(app, conf)

     return filter
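
The tee behaviour that the new cache middleware relies on can be sketched in isolation. The example below is a simplified illustration written for Python 3 with in-memory files; ``tee_into_cache`` and ``read_chunks`` are hypothetical names that mirror the shape of ``get_from_store_tee_into_cache`` and ``get_from_cache`` in the diff, and the ``io.BytesIO`` objects merely stand in for the backend store and the cache file.

```python
import io


def tee_into_cache(chunks, cache_file):
    """Write each chunk to cache_file, then yield it to the caller."""
    for chunk in chunks:
        cache_file.write(chunk)
        yield chunk


def read_chunks(fileobj, chunk_size=4):
    """Yield successive chunks from fileobj until it is exhausted."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


store = io.BytesIO(b"fake image data")   # stands in for the backend store
cache = io.BytesIO()                     # stands in for the cache file

# Cache miss: the client consumes the stream while the cache fills.
served_on_miss = b"".join(tee_into_cache(read_chunks(store), cache))

# Cache hit: a later request is served from the cached copy only.
cache.seek(0)
served_on_hit = b"".join(read_chunks(cache))
```

Because the generator is lazy, the cache is only written as the client reads the response. In this sketch a client that stops reading early would leave a partial cache file behind.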
