Commit
1 parent
064d6b1
commit 11af09e
Showing 11 changed files with 357 additions and 124 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,49 +1,105 @@ | ||
### MimicDB: An Isomorphic Key-Value Store for S3 | ||
## MimicDB | ||
|
||
#### S3 Metadata without the Latency or Costs | ||
[![PyPI](http://img.shields.io/pypi/v/mimicdb.svg?style=flat)](https://pypi.python.org/pypi/mimicdb/) | ||
[![Build Status](http://img.shields.io/travis/nathancahill/mimicdb/master.svg?style=flat)](https://travis-ci.org/nathancahill/mimicdb) | ||
[![Coverage Status](http://img.shields.io/coveralls/nathancahill/mimicdb/master.svg?style=flat)](https://coveralls.io/r/nathancahill/mimicdb) | ||
|
||
MimicDB is a local database of the metadata of objects stored on S3. Many tasks like listing, searching keys and calculating storage usage can be completely handled locally, without the latency or costs of calling the S3 API. | ||
|
||
On average, tasks like these are __2000x__ faster using MimicDB. | ||
#### Python Implementation of MimicDB | ||
|
||
The Python implementation works with the Boto library. | ||
|
||
#### Installation | ||
|
||
By default, MimicDB requires Redis (although other backends can be used instead). | ||
|
||
__Boto__ | ||
```python | ||
>>> c = S3Connection(KEY, SECRET) | ||
>>> bucket = c.get_bucket('bucket_name') | ||
>>> start = time.time() | ||
>>> bucket.get_all_keys() | ||
>>> print time.time() - start | ||
0.425064992905 | ||
``` | ||
$ pip install redis | ||
$ pip install mimicdb | ||
``` | ||
|
||
#### Quickstart | ||
|
||
__Boto + MimicDB__ | ||
```python | ||
>>> c = S3Connection(KEY, SECRET) | ||
>>> bucket = c.get_bucket('bucket_name') | ||
>>> start = time.time() | ||
>>> bucket.get_all_keys() | ||
>>> print time.time() - start | ||
0.000198841094971 | ||
If you're using Boto already, replace ```boto``` imports with ```mimicdb``` imports. | ||
|
||
Change: | ||
``` | ||
from boto.s3.connection import S3Connection | ||
from boto.s3.key import Key | ||
``` | ||
|
||
#### Key Value Store | ||
To: | ||
``` | ||
from mimicdb.s3.connection import S3Connection | ||
from mimicdb.s3.key import Key | ||
``` | ||
|
||
MimicDB uses a Redis backend to store S3 metadata. Data is stored in the following layout. | ||
Additionally, import the MimicDB object itself, and initiate the backend: | ||
``` | ||
from mimicdb import MimicDB | ||
MimicDB() | ||
``` | ||
|
||
`mimicdb` A set of buckets | ||
After establishing a connection for the first time, sync the connection to save the metadata locally: | ||
``` | ||
conn = S3Connection(KEY, SECRET) | ||
conn.sync() | ||
``` | ||
|
||
`mimicdb:bucket` A set of keys | ||
Or sync only a couple buckets from the connection: | ||
``` | ||
conn.sync('bucket1', 'bucket2') | ||
``` | ||
|
||
`mimicdb:bucket:key` A hash of key metadata (size and MD5) | ||
After that, upload, download and list as you usually would. API calls that can be responded to locally will return instantly without hitting S3 servers. API calls that are made to S3 using MimicDB will be mimicked locally to ensure consistency with the remote servers. | ||
|
||
The `mimicdb` prefix can additionally use an optional `namespace` string, which allows multiple S3 connections to share the same backend. In that case, the layout looks like this: | ||
Pass ```force=True``` to most functions to force a call to the S3 API. This also updates the local database. | ||
|
||
`mimicdb:namespace` | ||
#### Alternate Backends | ||
|
||
`mimicdb:namespace:bucket` | ||
Besides the default Redis backend, MimicDB has SQLite and in-memory backends available. | ||
``` | ||
from mimicdb.backends.sqlite import SQLite | ||
MimicDB(SQLite()) | ||
``` | ||
``` | ||
from mimicdb.backends.memory import Memory | ||
MimicDB(Memory()) | ||
``` | ||
|
||
#### Documentation | ||
|
||
[mimicdb.readthedocs.org](http://mimicdb.readthedocs.org) | ||
|
||
#### Contributing | ||
|
||
|
||
1. Fork the repo. | ||
2. Run tests to ensure a clean, working slate. | ||
3. Improve/fix the code. | ||
4. Add test cases if new functionality is introduced or a bug is fixed (100% test coverage). | ||
5. Ensure tests pass. | ||
6. Push to your fork and submit a pull request to the develop branch. | ||
|
||
#### Tests | ||
|
||
`mimicdb:namespace:bucket:key` | ||
Run tests after installing nose and coverage. | ||
|
||
#### Implementation | ||
``` | ||
$ nosetests --with-coverage --cover-package=mimicdb | ||
``` | ||
|
||
Integration testing is provided by Travis-CI at [travis-ci.org/nathancahill/mimicdb](https://travis-ci.org/nathancahill/mimicdb) | ||
|
||
Test coverage reporting is provided by Coveralls at [coveralls.io/r/nathancahill/mimicdb](https://coveralls.io/r/nathancahill/mimicdb) | ||
|
||
MimicDB is currently implemented in Python via Boto. If you're using Boto already, the MimicDB Python library works as a drop in replacement. | ||
#### Benchmarks | ||
|
||
Run ```benchmarks.py``` in the root of the repo: | ||
|
||
``` | ||
$ python benchmarks.py | ||
Boto Time: 0.338411092758 | ||
MimicDB Time: 0.00015789039612 | ||
Factor: 2143x faster | ||
``` |
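The local-first behaviour the README describes (answer from the local database when possible, go to S3 only when forced, and refresh the local copy afterwards) can be sketched roughly as follows. This is an illustration only, not MimicDB's code: `FakeRemote`, `LocalFirstBucket` and their methods are hypothetical stand-ins, and falling back to the remote when nothing has been synced yet is an assumption of this sketch.

```python
class FakeRemote(object):
    """Hypothetical stand-in for the S3 API; counts how often it is called."""

    def __init__(self, keys):
        self._keys = set(keys)
        self.calls = 0

    def list_keys(self):
        self.calls += 1
        return set(self._keys)


class LocalFirstBucket(object):
    """Hypothetical wrapper: serve listings locally unless force=True."""

    def __init__(self, remote):
        self._remote = remote
        self._local = None  # no local metadata until the first sync

    def sync(self):
        # Copy the remote listing into the local store.
        self._local = self._remote.list_keys()

    def list_keys(self, force=False):
        if force or self._local is None:
            # A forced (or first) call hits the remote and refreshes the local copy.
            self.sync()
        return set(self._local)


remote = FakeRemote(['a.txt', 'b.txt'])
bucket = LocalFirstBucket(remote)
bucket.sync()                          # one remote call
local_listing = bucket.list_keys()     # served locally, no remote call
forced_listing = bucket.list_keys(force=True)  # second remote call
```

In this sketch the remote is contacted twice in total: once for the initial sync and once for the forced listing.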
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,17 @@ | ||
from redis import StrictRedis | ||
from .s3 import tpl | ||
"""Python implementation of MimicDB | ||
""" | ||
|
||
|
||
class MimicDB(object): | ||
def __init__(self, *args, **kwargs): | ||
def __init__(self, backend=None, namespace=None): | ||
"""Initialize the MimicDB backend with an optional namespace. | ||
""" | ||
Initialize the MimicDB object by passing the Redis connection parameters: | ||
:host='localhost' | ||
:port=6379 | ||
:db=0 | ||
if not backend: | ||
from .backends.default import Redis | ||
backend = Redis() | ||
|
||
The Redis connection is accessed elsewhere in the module by importing | ||
mimicdb, then calling mimicdb.redis | ||
""" | ||
if kwargs and 'namespace' in kwargs: | ||
tpl.set_namespace(kwargs.pop('namespace')) | ||
globals()['backend'] = backend | ||
|
||
globals()['redis'] = StrictRedis(*args, **kwargs) | ||
if namespace: | ||
from .backends import tpl | ||
tpl.set_namespace(namespace) |
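The `namespace` argument is handed to `tpl.set_namespace`, and the earlier README text gives the resulting key layout: `mimicdb`, `mimicdb:bucket` and `mimicdb:bucket:key`, with the namespace inserted after the prefix when one is set. The `tpl` module itself is not shown in this diff, so the following is only a minimal sketch of such a layout; the helper names (`connection_key`, `bucket_key`, `object_key`) are hypothetical.

```python
PREFIX = 'mimicdb'


def _join(namespace, *parts):
    # Build "mimicdb[:namespace][:part...]", skipping the namespace when unset.
    pieces = [PREFIX]
    if namespace:
        pieces.append(namespace)
    pieces.extend(parts)
    return ':'.join(pieces)


def connection_key(namespace=None):
    """Name of the set that holds all bucket names."""
    return _join(namespace)


def bucket_key(bucket, namespace=None):
    """Name of the set that holds the key names of one bucket."""
    return _join(namespace, bucket)


def object_key(bucket, key, namespace=None):
    """Name of the hash that holds one key's metadata (size and MD5)."""
    return _join(namespace, bucket, key)
```

With a namespace, several S3 connections can share one backend without their key names colliding.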
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
"""Base class for MimicDB backends | ||
""" | ||
|
||
class Backend(object): | ||
def __init__(self, *args, **kwargs): | ||
pass | ||
|
||
def delete(self, *names): | ||
pass | ||
|
||
def sadd(self, name, *values): | ||
pass | ||
|
||
def srem(self, name, *values): | ||
pass | ||
|
||
def sismember(self, name, value): | ||
pass | ||
|
||
def smembers(self, name): | ||
pass | ||
|
||
def hmset(self, name, mapping): | ||
pass | ||
|
||
def hgetall(self, name): | ||
pass |
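The base class names the set and hash operations a backend is expected to provide. One way to check a custom backend against that interface is a small conformance routine, sketched below. `DictBackend` and `check_backend` are hypothetical names that do not appear in the commit, and the expected results follow Redis-style set and hash semantics, which is an assumption about what callers rely on.

```python
class DictBackend(object):
    """Hypothetical minimal backend, used only to exercise the check below."""

    def __init__(self):
        self._data = {}

    def delete(self, *names):
        for name in names:
            self._data.pop(name, None)

    def sadd(self, name, *values):
        self._data.setdefault(name, set()).update(values)

    def srem(self, name, *values):
        self._data.get(name, set()).difference_update(values)

    def sismember(self, name, value):
        return value in self._data.get(name, set())

    def smembers(self, name):
        return set(self._data.get(name, set()))

    def hmset(self, name, mapping):
        self._data[name] = dict(mapping)

    def hgetall(self, name):
        return dict(self._data.get(name, {}))


def check_backend(backend):
    """Exercise the set and hash operations; raise AssertionError on a mismatch."""
    backend.sadd('mimicdb:bucket', 'a', 'b')
    assert backend.sismember('mimicdb:bucket', 'a')
    backend.srem('mimicdb:bucket', 'a')
    assert not backend.sismember('mimicdb:bucket', 'a')
    assert set(backend.smembers('mimicdb:bucket')) == {'b'}

    backend.hmset('mimicdb:bucket:b', {'size': '3', 'md5': 'abc'})
    assert backend.hgetall('mimicdb:bucket:b') == {'size': '3', 'md5': 'abc'}

    backend.delete('mimicdb:bucket', 'mimicdb:bucket:b')
    assert not backend.smembers('mimicdb:bucket')
    assert not backend.hgetall('mimicdb:bucket:b')
    return True


result = check_backend(DictBackend())
```

Because the base methods simply `pass`, a backend that forgets to override one of them returns `None` silently; a check like this would surface that early.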
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
|
||
from redis import StrictRedis | ||
|
||
from . import Backend | ||
|
||
|
||
class Redis(Backend): | ||
def __init__(self, *args, **kwargs): | ||
self._redis = StrictRedis(*args, **kwargs) | ||
|
||
def keys(self, pattern='*'): | ||
return self._redis.keys(pattern) | ||
|
||
def delete(self, *names): | ||
return self._redis.delete(*names) | ||
|
||
def sadd(self, name, *values): | ||
return self._redis.sadd(name, *values) | ||
|
||
def srem(self, name, *values): | ||
return self._redis.srem(name, *values) | ||
|
||
def sismember(self, name, value): | ||
return self._redis.sismember(name, value) | ||
|
||
def smembers(self, name): | ||
return self._redis.smembers(name) | ||
|
||
def hmset(self, name, mapping): | ||
return self._redis.hmset(name, mapping) | ||
|
||
def hgetall(self, name): | ||
return self._redis.hgetall(name) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
|
||
|
||
from . import Backend | ||
|
||
|
||
class Memory(Backend): | ||
def __init__(self): | ||
self._data = dict() | ||
|
||
def keys(self, pattern='*'): | ||
pattern = pattern.replace('*', '') | ||
return [key for key in self._data if key.startswith(pattern)] | ||
|
||
def delete(self, *names): | ||
for name in names: | ||
self._data.pop(name, None) | ||
|
||
def sadd(self, name, *values): | ||
if name in self._data: | ||
self._data[name].update(values) | ||
else: | ||
self._data[name] = set(values) | ||
|
||
def srem(self, name, *values): | ||
if name in self._data: | ||
self._data[name].difference_update(values) | ||
|
||
def sismember(self, name, value): | ||
if name in self._data: | ||
return value in self._data[name] | ||
|
||
return False | ||
|
||
def smembers(self, name): | ||
return self._data.get(name, set()) | ||
|
||
def hmset(self, name, mapping): | ||
self._data[name] = mapping | ||
|
||
def hgetall(self, name): | ||
return self._data.get(name, dict()) |
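`Memory.keys` strips every `*` from the pattern and then matches on the prefix. That agrees with glob matching for patterns such as `mimicdb:*`, but not when `*` appears in the middle of the pattern. If fuller glob semantics were wanted, the standard-library `fnmatch` module is one option. The sketch below compares the two approaches; `prefix_keys` and `glob_keys` are hypothetical helper names, and `fnmatch` patterns are similar to, but not guaranteed identical to, Redis key patterns.

```python
import fnmatch


def prefix_keys(data, pattern='*'):
    # Same approach as Memory.keys: drop every '*' and match on the prefix.
    prefix = pattern.replace('*', '')
    return sorted(key for key in data if key.startswith(prefix))


def glob_keys(data, pattern='*'):
    # Shell-style matching: '*' may appear anywhere in the pattern.
    return sorted(key for key in data if fnmatch.fnmatchcase(key, pattern))


data = {
    'mimicdb:photos': None,
    'mimicdb:photos:cat.jpg': None,
    'mimicdb:docs': None,
}

# Trailing wildcard: both approaches return the same keys.
same = prefix_keys(data, 'mimicdb:*') == glob_keys(data, 'mimicdb:*')

# Wildcard in the middle: the prefix approach looks for 'mimicdb::cat.jpg'.
prefix_result = prefix_keys(data, 'mimicdb:*:cat.jpg')
glob_result = glob_keys(data, 'mimicdb:*:cat.jpg')
```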
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
|
||
import sqlite3 | ||
|
||
from . import Backend | ||
|
||
|
||
class SQLite(Backend): | ||
def __init__(self, *args, **kwargs): | ||
if not args and not kwargs: | ||
args = [':memory:'] | ||
|
||
self._sqlite = sqlite3.connect(*args, **kwargs) | ||
|
||
def keys(self, pattern='*'): | ||
c = self._sqlite.cursor() | ||
pattern = pattern.replace('*', '%') | ||
c.execute('SELECT name FROM sqlite_master WHERE type=? AND name LIKE ?', ('table', pattern)) | ||
|
||
return [row[0] for row in c.fetchall()] | ||
|
||
def delete(self, *names): | ||
c = self._sqlite.cursor() | ||
|
||
for name in names: | ||
c.execute('DROP TABLE IF EXISTS "%s"' % (name,)) | ||
|
||
self._sqlite.commit() | ||
|
||
def sadd(self, name, *values): | ||
c = self._sqlite.cursor() | ||
c.execute('CREATE TABLE IF NOT EXISTS "%s" (member text)' % (name,)) | ||
|
||
for value in values: | ||
c.execute('INSERT INTO "%s" VALUES (?)' % (name,), (value,)) | ||
|
||
self._sqlite.commit() | ||
|
||
def srem(self, name, *values): | ||
c = self._sqlite.cursor() | ||
c.execute('CREATE TABLE IF NOT EXISTS "%s" (member text)' % (name,)) | ||
|
||
for value in values: | ||
c.execute('DELETE FROM "%s" WHERE member=?' % (name,), (value,)) | ||
|
||
self._sqlite.commit() | ||
|
||
def sismember(self, name, value): | ||
c = self._sqlite.cursor() | ||
c.execute('CREATE TABLE IF NOT EXISTS "%s" (member text)' % (name,)) | ||
c.execute('SELECT * FROM "%s" WHERE member=?' % (name,), (value,)) | ||
|
||
return c.fetchone() is not None | ||
|
||
def smembers(self, name): | ||
c = self._sqlite.cursor() | ||
c.execute('CREATE TABLE IF NOT EXISTS "%s" (member text)' % (name,)) | ||
c.execute('SELECT * FROM "%s"' % (name,)) | ||
|
||
return [row[0] for row in c.fetchall()] | ||
|
||
def hmset(self, name, mapping): | ||
c = self._sqlite.cursor() | ||
c.execute('CREATE TABLE IF NOT EXISTS "%s" (size text, md5 text)' % (name,)) | ||
c.execute('INSERT INTO "%s" VALUES (?, ?)' % (name,), (mapping['size'], mapping['md5'])) | ||
|
||
self._sqlite.commit() | ||
|
||
def hgetall(self, name): | ||
c = self._sqlite.cursor() | ||
c.execute('CREATE TABLE IF NOT EXISTS "%s" (size text, md5 text)' % (name,)) | ||
c.execute('SELECT * FROM "%s"' % (name,)) | ||
|
||
row = c.fetchone() | ||
|
||
if row: | ||
return dict(size=row[0], md5=row[1]) | ||
|
||
return dict() |
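The SQLite backend stores each set as a table with one `member` column. As written, nothing stops the same member from being inserted twice, and values that are interpolated directly into the SQL string break on embedded quotes. A minimal sketch of one alternative, assuming a `UNIQUE` column with `INSERT OR IGNORE` for set semantics and `?` placeholders for values, follows. The function names mirror the backend's methods but this is not the commit's code, and `quote_identifier` is a hypothetical helper.

```python
import sqlite3


def quote_identifier(name):
    # Table names cannot be bound as parameters, so quote them explicitly;
    # embedded double quotes are doubled, per SQL identifier quoting rules.
    return '"%s"' % name.replace('"', '""')


def sadd(conn, name, *values):
    table = quote_identifier(name)
    conn.execute('CREATE TABLE IF NOT EXISTS %s (member TEXT UNIQUE)' % table)
    # "?" placeholders let sqlite3 handle quoting of the values themselves.
    conn.executemany('INSERT OR IGNORE INTO %s VALUES (?)' % table,
                     [(value,) for value in values])
    conn.commit()


def smembers(conn, name):
    table = quote_identifier(name)
    conn.execute('CREATE TABLE IF NOT EXISTS %s (member TEXT UNIQUE)' % table)
    return set(row[0] for row in conn.execute('SELECT member FROM %s' % table))


conn = sqlite3.connect(':memory:')
sadd(conn, 'mimicdb:bucket', 'a.txt', 'it\'s "quoted".txt')
sadd(conn, 'mimicdb:bucket', 'a.txt')  # duplicate, ignored by the UNIQUE constraint
members = smembers(conn, 'mimicdb:bucket')
```

The same pattern would apply to the hash tables: deleting the existing row before inserting would keep `hgetall` from returning stale metadata after a second `hmset`.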
File renamed without changes.