Commit
Merge branch 'mixin' into main
JWCook committed Feb 20, 2021
2 parents cf3f8ea + d4b2f61 commit 8fca61f
Showing 13 changed files with 240 additions and 210 deletions.
57 changes: 42 additions & 15 deletions README.md
Original file line number Diff line number Diff line change
@@ -16,10 +16,11 @@ Not to be confused with [aiohttp-cache](https://github.com/cr0hn/aiohttp-cache),
for the aiohttp web server. This package is, as you might guess, specifically for the **aiohttp client**.

## Development Status
**This is an early work in progress and not yet fully functional!**
**This is an early work in progress!**

The current state is a mostly working drop-in replacement for `aiohttp.ClientSession`.
However, most cache operations are still synchronous, have had minimal testing, and likely have lots of bugs.
Breaking changes should be expected until a `1.0` release.

## Installation
Requires python 3.7+
@@ -54,35 +55,61 @@ Here is a simple example using an endpoint that takes 1 second to fetch.
After the first request, subsequent requests to the same URL will return near-instantly; so,
fetching it 10 times will only take ~1 second instead of 10.
```python
from aiohttp_client_cache import CachedSession
from aiohttp_client_cache import CachedSession, SQLiteBackend

async with CachedSession(backend='sqlite') as session:
async with CachedSession(cache=SQLiteBackend()) as session:
for i in range(10):
await session.get('http://httpbin.org/delay/1')
```

## Cache Backends
Several backends are available.
The default backend is `sqlite`, if installed; otherwise it falls back to `memory`.
`aiohttp-client-cache` can also be used as a mixin, if you happen to have other mixin classes that
you want to combine with it:
```python
from aiohttp import ClientSession
from aiohttp_client_cache import CacheMixin

* `sqlite` : SQLite database (requires [aiosqlite](https://github.com/omnilib/aiosqlite))
* `redis` : Stores all data in a redis cache (requires [redis-py](https://github.com/andymccurdy/redis-py))
* `mongodb` : MongoDB database (requires [pymongo](https://pymongo.readthedocs.io/en/stable/))
* `gridfs` : MongoDB GridFS enables storage of documents greater than 16MB (requires pymongo)
* `memory` : Not persistent, simply stores all data in memory
class CustomSession(CacheMixin, CustomMixin, ClientSession):
pass
```
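The `CustomMixin` above is a placeholder. To see how Python's method resolution order composes such mixins, here is a self-contained sketch using toy stand-ins (none of these are the real aiohttp classes; the caching logic is purely illustrative):

```python
# Toy stand-ins for ClientSession and CacheMixin, to show cooperative
# multiple inheritance; NOT the real aiohttp / aiohttp-client-cache classes
class FakeClientSession:
    def request(self, url):
        return f'fetched {url}'

class FakeCacheMixin:
    def request(self, url):
        # Check a (toy) cache first, then fall through to the next class in the MRO
        self.cache = getattr(self, 'cache', {})
        if url not in self.cache:
            self.cache[url] = super().request(url)
        return self.cache[url]

class LoggingMixin:
    def request(self, url):
        self.log = getattr(self, 'log', [])
        self.log.append(url)
        return super().request(url)

class CustomSession(FakeCacheMixin, LoggingMixin, FakeClientSession):
    pass

session = CustomSession()
session.request('http://example.com')
session.request('http://example.com')
# The logging mixin only sees the first (uncached) request
```

Because `FakeCacheMixin` comes first in the MRO, the cache check runs before any other mixin, which is the same reason `CacheMixin` is listed first in the README example.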

## Cache Backends
Several backends are available. If one isn't specified, a simple in-memory cache will be used.

* `SQLiteBackend`: Uses a [SQLite](https://www.sqlite.org) database
(requires [aiosqlite](https://github.com/omnilib/aiosqlite))
* `DynamoDBBackend`: Uses an [Amazon DynamoDB](https://aws.amazon.com/dynamodb/) database
(requires [boto3](https://github.com/boto/boto3))
* `RedisBackend`: Uses a [Redis](https://redis.io/) cache
(requires [redis-py](https://github.com/andymccurdy/redis-py))
* `MongoDBBackend`: Uses a [MongoDB](https://www.mongodb.com/) database
(requires [pymongo](https://pymongo.readthedocs.io/en/stable/))
* `GridFSBackend`: Uses a [MongoDB GridFS](https://docs.mongodb.com/manual/core/gridfs/) database,
which enables storage of documents greater than 16MB
(requires [pymongo](https://pymongo.readthedocs.io/en/stable/))

You can also provide your own backend by subclassing `aiohttp_client_cache.backends.BaseCache`.
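A custom storage class might look something like the following. This is only a sketch: the actual abstract interface is defined by `aiohttp_client_cache.backends.BaseCache`, and the method names used here (`read`/`write`/`delete`/`clear`) are assumptions for illustration:

```python
# Minimal sketch of a custom async storage class; the real required interface
# is BaseCache's abstract methods, which may differ from the names below
import asyncio

class InMemoryCache:
    def __init__(self):
        self._data = {}

    async def read(self, key):
        return self._data.get(key)

    async def write(self, key, value):
        self._data[key] = value

    async def delete(self, key):
        self._data.pop(key, None)

    async def clear(self):
        self._data.clear()

async def demo():
    cache = InMemoryCache()
    await cache.write('key', 'response')
    return await cache.read('key')

result = asyncio.run(demo())
```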

## Expiration
If you are using the `expire_after` parameter , responses are removed from the storage the next time
the same request is made. If you want to manually purge all expired items, you can use
If you are using the `expire_after` parameter, expired responses are removed from the storage the
next time the same request is made. If you want to manually purge all expired items, you can use
`CachedSession.remove_expired_responses`. Example:

```python
session = CachedSession(expire_after=1)
await session.remove_expired_responses()
session = CachedSession(expire_after=3) # Cached responses expire after 3 hours
await session.remove_expired_responses() # Remove any responses over 3 hours old
```
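The expiration logic can be sketched as follows. This mirrors the behavior described above (an integer `expire_after` is a number of hours, `None` means never expire), but it is an illustration, not the library's exact implementation:

```python
# Sketch of expire_after normalization and the expiration check;
# illustrative only, not the library's internal code
from datetime import datetime, timedelta

def normalize_expire_after(expire_after):
    """Accept either a number of hours or a timedelta"""
    if expire_after is not None and not isinstance(expire_after, timedelta):
        expire_after = timedelta(hours=expire_after)
    return expire_after

def is_expired(created_at, expire_after):
    """A response with no expiration never expires"""
    if expire_after is None:
        return False
    return datetime.utcnow() - created_at >= expire_after

expire_after = normalize_expire_after(3)
fresh = is_expired(datetime.utcnow(), expire_after)                      # just created
stale = is_expired(datetime.utcnow() - timedelta(hours=4), expire_after)  # 4 hours old
```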

## Conditional Caching
Caching behavior can be customized by defining various conditions:
* Response status codes
* Request HTTP methods
* Request headers
* Specific request parameters
* Custom filter function

See [CacheBackend](https://aiohttp-client-cache.readthedocs.io/en/latest/modules/aiohttp_client_cache.backends.base.html)
docs for details.
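Combining the conditions above, the cache-or-not decision can be sketched like this. `SimpleResponse` is a stand-in for `aiohttp.ClientResponse`, and while the parameter names mirror the `CacheBackend` docstring (`allowed_codes`, `allowed_methods`, `filter_fn`), this is an illustration rather than the library's internal logic:

```python
# Sketch of conditional caching: status code, HTTP method, and a custom
# filter function must all pass for a response to be cached
from collections import namedtuple

# Stand-in for aiohttp.ClientResponse, for illustration only
SimpleResponse = namedtuple('SimpleResponse', ['method', 'status', 'url'])

def is_cacheable(response, allowed_codes=(200,), allowed_methods=('GET', 'HEAD'),
                 filter_fn=lambda r: True):
    return (
        response.status in allowed_codes
        and response.method in allowed_methods
        and filter_fn(response)
    )

ok = is_cacheable(SimpleResponse('GET', 200, 'http://example.com/data'))
post = is_cacheable(SimpleResponse('POST', 200, 'http://example.com/data'))
filtered = is_cacheable(
    SimpleResponse('GET', 200, 'http://example.com/private'),
    filter_fn=lambda r: 'private' not in r.url,
)
```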

## Credits
Thanks to [Roman Haritonov](https://github.com/reclosedev) and
[contributors](https://github.com/reclosedev/requests-cache/blob/master/CONTRIBUTORS.rst)
2 changes: 1 addition & 1 deletion aiohttp_client_cache/__init__.py
@@ -1,4 +1,4 @@
__version__ = '0.1.2'
__version__ = '0.1.3'

try:
from aiohttp_client_cache.backends import * # noqa
73 changes: 23 additions & 50 deletions aiohttp_client_cache/backends/__init__.py
@@ -1,55 +1,28 @@
from importlib import import_module
from logging import getLogger
from typing import Optional, Type

from aiohttp_client_cache.backends.base import ( # noqa
from aiohttp_client_cache.backends.base import ( # noqa: F401
BaseCache,
CacheController,
CacheBackend,
DictCache,
ResponseOrKey,
)

logger = getLogger(__name__)


def import_member(qualname: str) -> Optional[Type]:
"""Attempt to import a class or other module member by qualified name"""
try:
module, member = qualname.rsplit('.', 1)
return getattr(import_module(module), member)
except (AttributeError, ImportError) as e:
logger.debug(f'Could not load {qualname}: {str(e)}')
return None


# Import all backends for which dependencies have been installed
BACKEND_QUALNAMES = {
'dynamodb': 'aiohttp_client_cache.backends.dynamodb.DynamoDbController',
'gridfs': 'aiohttp_client_cache.backends.gridfs.GridFSController',
'memory': 'aiohttp_client_cache.backends.base.CacheController',
'mongodb': 'aiohttp_client_cache.backends.mongo.MongoDBController',
'redis': 'aiohttp_client_cache.backends.redis.RedisController',
'sqlite': 'aiohttp_client_cache.backends.sqlite.SQLiteController',
}
BACKEND_CLASSES = {name: import_member(qualname) for name, qualname in BACKEND_QUALNAMES.items()}


def init_backend(
backend: Optional[str] = None, cache_name: str = 'http-cache', *args, **kwargs
) -> CacheController:
"""Initialize a backend by name; defaults to ``sqlite`` if installed, otherwise ``memory``"""
logger.info(f'Initializing backend: {backend}')
if isinstance(backend, CacheController):
return backend
if not backend:
backend = 'sqlite' if 'sqlite' in BACKEND_CLASSES else 'memory'
backend = backend.lower()

if backend not in BACKEND_QUALNAMES:
raise ValueError(f'Invalid backend: {backend}')
backend_class = BACKEND_CLASSES.get(backend)
if not backend_class:
raise ImportError(f'Dependencies not installed for backend {backend}')

logger.info(f'Found backend type: {backend_class}')
return backend_class(cache_name, *args, **kwargs)
# Import all backends for which dependencies are installed
try:
from aiohttp_client_cache.backends.dynamodb import DynamoDBBackend
except ImportError:
DynamoDBBackend = None # type: ignore
try:
from aiohttp_client_cache.backends.gridfs import GridFSBackend
except ImportError:
GridFSBackend = None # type: ignore
try:
from aiohttp_client_cache.backends.mongo import MongoDBBackend
except ImportError:
MongoDBBackend = None # type: ignore
try:
from aiohttp_client_cache.backends.redis import RedisBackend
except ImportError:
RedisBackend = None # type: ignore
try:
from aiohttp_client_cache.backends.sqlite import SQLiteBackend
except ImportError:
SQLiteBackend = None # type: ignore
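The try/except blocks above implement a common optional-dependency pattern: a backend whose dependencies are missing is replaced with `None`, so callers can check availability before using it. The same pattern in isolation (`some_missing_backend` is a deliberately nonexistent module):

```python
# Optional-import pattern: a backend that fails to import becomes None,
# so availability can be checked at runtime instead of raising at import time
try:
    from some_missing_backend import FancyBackend  # hypothetical, does not exist
except ImportError:
    FancyBackend = None

backend_available = FancyBackend is not None
```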
47 changes: 36 additions & 11 deletions aiohttp_client_cache/backends/base.py
@@ -15,13 +15,39 @@
logger = getLogger(__name__)


class CacheController:
"""Class to manage higher-level cache operations.
Handles cache expiration, and generating cache keys, and managing redirect history.
Basic storage operations are handled by :py:class:`.BaseCache`.
To extend this with your own custom backend, implement a subclass of :py:class:`.BaseCache`
to use as :py:attr:`CacheController.responses` and :py:attr:`CacheController.response_aliases`.
class CacheBackend:
"""Base class for cache backends. This manages higher-level cache operations,
including cache expiration, generating cache keys, and managing redirect history.
If instantiated directly, ``CacheBackend`` will use a non-persistent in-memory cache.
Lower-level storage operations are handled by :py:class:`.BaseCache`.
To extend this with your own custom backend, implement one or more subclasses of
:py:class:`.BaseCache` to use as :py:attr:`CacheBackend.responses` and
:py:attr:`CacheBackend.response_aliases`.
Args:
cache_name: Cache prefix or namespace, depending on backend; see notes below
expire_after: Number of hours after which a cache entry will expire; set ``None`` to
never expire
allowed_codes: Only cache responses with these status codes
allowed_methods: Only cache requests with these HTTP methods
include_headers: Make request headers part of the cache key
ignored_params: List of request parameters to be excluded from the cache key.
filter_fn: Function that takes a :py:class:`aiohttp.ClientResponse` object and
returns a boolean indicating whether or not that response should be cached. Will be
applied to both new and previously cached responses
The ``cache_name`` parameter will be used as follows depending on the backend:
* ``sqlite``: Cache filename prefix, e.g. ``my_cache.sqlite``
* ``mongodb``: Database name
* ``redis``: Namespace, meaning all keys will be prefixed with ``'cache_name:'``
Note on cache key parameters: Set ``include_headers=True`` if you want responses to be
cached under different keys if they only differ by headers. You may also provide
``ignored_params`` to ignore specific request params. This is useful, for example, when
requesting the same resource with different credentials or access tokens.
"""

def __init__(
@@ -30,10 +56,9 @@ def __init__(
expire_after: Union[int, timedelta] = None,
allowed_codes: tuple = (200,),
allowed_methods: tuple = ('GET', 'HEAD'),
filter_fn: Callable = lambda r: True,
include_headers: bool = False,
ignored_params: Iterable = None,
**kwargs,
filter_fn: Callable = lambda r: True,
):
self.name = cache_name
if expire_after is not None and not isinstance(expire_after, timedelta):
@@ -205,8 +230,8 @@ def filter_ignored_params(d):

# TODO: Support yarl.URL like aiohttp does?
class BaseCache(metaclass=ABCMeta):
"""A wrapper for the actual storage operations. This is separate from
:py:class:`.CacheController` to simplify writing to multiple tables/prefixes.
"""A wrapper for lower-level cache storage operations. This is separate from
:py:class:`.CacheBackend` to allow a single backend to contain multiple cache objects.
This is no longer using a dict-like interface due to lack of python syntax support for async
dict operations.
10 changes: 6 additions & 4 deletions aiohttp_client_cache/backends/dynamodb.py
@@ -5,19 +5,21 @@
from boto3.resources.base import ServiceResource
from botocore.exceptions import ClientError

from aiohttp_client_cache.backends import BaseCache, CacheController, ResponseOrKey
from aiohttp_client_cache.backends import BaseCache, CacheBackend, ResponseOrKey
from aiohttp_client_cache.forge_utils import extend_signature


class DynamoDbController(CacheController):
class DynamoDBBackend(CacheBackend):
"""DynamoDB cache backend.
See :py:class:`.DynamoDbCache` for backend-specific options
See `DynamoDB Service Resource
<https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#service-resource>`_
for more usage details.
"""

def __init__(self, cache_name: str, *args, **kwargs):
super().__init__(cache_name, *args, **kwargs)
@extend_signature(CacheBackend.__init__)
def __init__(self, cache_name: str = 'http-cache', **kwargs):
super().__init__(cache_name=cache_name, **kwargs)
self.responses = DynamoDbCache(cache_name, 'responses', **kwargs)
self.redirects = DynamoDbCache(cache_name, 'urls', connection=self.responses.connection)

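The `@extend_signature(CacheBackend.__init__)` decorator used in these backends makes a subclass `__init__` that only accepts `**kwargs` advertise the parent's full parameter list for docs and introspection. The project's real implementation lives in `aiohttp_client_cache.forge_utils` and uses the `forge` library; the `inspect`-based version below is only a rough stand-in to show the idea:

```python
# Rough stand-in for extend_signature: copy a template function's signature
# onto a **kwargs-only method; NOT the project's forge-based implementation
import inspect

def extend_signature(template_fn):
    def decorator(fn):
        fn.__signature__ = inspect.signature(template_fn)
        return fn
    return decorator

class Parent:
    def __init__(self, cache_name='http-cache', expire_after=None):
        self.cache_name = cache_name
        self.expire_after = expire_after

class Child(Parent):
    @extend_signature(Parent.__init__)
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

# Introspection now sees the parent's parameters, not just **kwargs
params = list(inspect.signature(Child.__init__).parameters)
```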
10 changes: 6 additions & 4 deletions aiohttp_client_cache/backends/gridfs.py
@@ -4,17 +4,19 @@
from gridfs import GridFS
from pymongo import MongoClient

from aiohttp_client_cache.backends import BaseCache, CacheController, ResponseOrKey
from aiohttp_client_cache.backends import BaseCache, CacheBackend, ResponseOrKey
from aiohttp_client_cache.backends.mongo import MongoDBCache
from aiohttp_client_cache.forge_utils import extend_signature


class GridFSController(CacheController):
class GridFSBackend(CacheBackend):
"""An async-compatible interface for caching objects in MongoDB GridFS.
Use this if you need to support documents greater than 16MB.
"""

def __init__(self, cache_name: str, *args, connection: MongoClient = None, **kwargs):
super().__init__(cache_name, *args, **kwargs)
@extend_signature(CacheBackend.__init__)
def __init__(self, cache_name: str = 'http-cache', connection: MongoClient = None, **kwargs):
super().__init__(cache_name=cache_name, **kwargs)
self.responses = GridFSCache(cache_name, connection)
self.keys_map = MongoDBCache(cache_name, 'http_redirects', self.responses.connection)

10 changes: 6 additions & 4 deletions aiohttp_client_cache/backends/mongo.py
@@ -3,14 +3,16 @@

from pymongo import MongoClient

from aiohttp_client_cache.backends import BaseCache, CacheController, ResponseOrKey
from aiohttp_client_cache.backends import BaseCache, CacheBackend, ResponseOrKey
from aiohttp_client_cache.forge_utils import extend_signature


class MongoDBController(CacheController):
class MongoDBBackend(CacheBackend):
"""MongoDB cache backend"""

def __init__(self, cache_name: str, *args, connection: MongoClient = None, **kwargs):
super().__init__(cache_name, *args, **kwargs)
@extend_signature(CacheBackend.__init__)
def __init__(self, cache_name: str = 'http-cache', connection: MongoClient = None, **kwargs):
super().__init__(cache_name=cache_name, **kwargs)
self.responses = MongoDBPickleCache(cache_name, 'responses', connection)
self.keys_map = MongoDBCache(cache_name, 'urls', self.responses.connection)

10 changes: 6 additions & 4 deletions aiohttp_client_cache/backends/redis.py
@@ -3,14 +3,16 @@

from redis import Redis, StrictRedis

from aiohttp_client_cache.backends import BaseCache, CacheController, ResponseOrKey
from aiohttp_client_cache.backends import BaseCache, CacheBackend, ResponseOrKey
from aiohttp_client_cache.forge_utils import extend_signature


class RedisController(CacheController):
class RedisBackend(CacheBackend):
"""Redis cache backend"""

def __init__(self, cache_name: str, *args, **kwargs):
super().__init__(cache_name, *args, **kwargs)
@extend_signature(CacheBackend.__init__)
def __init__(self, cache_name: str = 'http-cache', **kwargs):
super().__init__(cache_name=cache_name, **kwargs)
self.responses = RedisCache(cache_name, 'responses', **kwargs)
self.redirects = RedisCache(cache_name, 'urls', connection=self.responses.connection)

31 changes: 17 additions & 14 deletions aiohttp_client_cache/backends/sqlite.py
@@ -2,35 +2,38 @@
import pickle
import sqlite3
from contextlib import asynccontextmanager
from os.path import splitext
from typing import AsyncIterator, Iterable, Optional, Union

import aiosqlite

from aiohttp_client_cache.backends import BaseCache, CacheController, ResponseOrKey
from aiohttp_client_cache.backends import BaseCache, CacheBackend, ResponseOrKey
from aiohttp_client_cache.forge_utils import extend_signature


class SQLiteController(CacheController):
"""SQLite cache backend.
class SQLiteBackend(CacheBackend):
"""An async SQLite cache backend.
Reading is fast, saving is a bit slower. It can store a large amount of data
with low memory usage.
The path to the database file will be ``<cache_name>.sqlite``, or just ``<cache_name>`` if a
different file extension is specified.
Args:
cache_name: database filename prefix
extension: Database file extension
cache_name: Database filename
"""

def __init__(self, cache_name: str, *args, extension: str = '.sqlite', **kwargs):
super().__init__(cache_name, *args, **kwargs)
self.redirects = SQLiteCache(cache_name + extension, 'urls')
self.responses = SQLitePickleCache(cache_name + extension, 'responses')
@extend_signature(CacheBackend.__init__)
def __init__(self, cache_name: str = 'http-cache', **kwargs):
super().__init__(cache_name=cache_name, **kwargs)
path, ext = splitext(cache_name)
cache_path = f'{path}{ext or ".sqlite"}'

self.redirects = SQLiteCache(cache_path, 'urls')
self.responses = SQLitePickleCache(cache_path, 'responses')

class SQLiteCache(BaseCache):
"""An async interface for caching objects in a SQLite database

It's possible to create multiple SqliteCache instances, which will be stored as separate
tables in one database.
class SQLiteCache(BaseCache):
"""An async interface for caching objects in a SQLite database.
Example:
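The filename rule described in the `SQLiteBackend` docstring (append `.sqlite` only when the cache name has no extension of its own) can be sketched in isolation; this reflects the documented intent rather than the exact code above:

```python
# Filename rule from the SQLiteBackend docstring: add '.sqlite' only when
# the cache name has no extension of its own
from os.path import splitext

def get_cache_path(cache_name):
    path, ext = splitext(cache_name)
    return f'{path}{ext or ".sqlite"}'

default = get_cache_path('http-cache')   # no extension -> '.sqlite' is appended
custom = get_cache_path('my_cache.db')   # a user-supplied extension is kept
```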
