Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache manager refactor #40

Merged
merged 28 commits into from Feb 6, 2016
Merged

Cache manager refactor #40

merged 28 commits into from Feb 6, 2016

Conversation

@c-w
Copy link
Owner

@c-w c-w commented Feb 1, 2016

@cpeel: I did a bit of refactoring on your metadata cache manager code in order to gain an understanding of it. I'd like to get some feedback on whether I understood everything correctly before I merge.

The gist of this refactor:

  • Create sqlite/bsddb subclasses of MetadataCache to avoid store-type checking in methods
  • Create testing MetadataCache to remove testing-only code in the api.
  • Fall back to sqlite cache when the user doesn't have bsddb installed.
c-w added 24 commits Feb 1, 2016
This will allow for easier instantiation of test cache managers.
This lets us make lines shorter!
Really there are only two supported stores:
- Sleepycat/BDB
- Sqlite
The sqlite-based MetadataCache should work on all systems (given that
sqlite is in the Python standard library). Thus we can use this cache as
a fall-back in case we don't manage to use Sleepycat (e.g. on Py3/Win).
Also:
- remove explicit dependency on bsddb3 under py2

There are some issues installing BSD-DB on Windows (see #26, #31) so
this fix should make gutenberg work on Windows Python 2 again.
On Windows Python 3, we can't use the standard-library bsddb module and
we can't fall-back to bsddb3 so we use the slow SQLite cache instead.

def set_metadata_cache_manager(cache_manager):
Copy link
Contributor

@cpeel cpeel Feb 2, 2016

The set_metadata_cache() function is not just for testing, but rather an integral part of my branch. Without a way to explicitly set a cache, you can't control where the cache is stored on the filesystem, or in the new code, control which of the two backend types you want to use. This level of control is vital when using this as part of a larger service.

For an example of what I'm trying to achieve with this module, see:
https://github.com/cpeel/openlibrary/blob/095df4b3be522eef218813a0106ee2ce7934f120/openlibrary/plugins/gutenberg/pgebook.py
and
https://github.com/cpeel/openlibrary/blob/73533a3e011074e2ffbfe416f241a6aa14fb5b6e/scripts/dev-instance/populate_gutenberg_cache.py

Copy link
Owner Author

@c-w c-w Feb 5, 2016

Fixed in 8df3e24.

c-w
Copy link
Owner

c-w commented on 9feaeae Feb 3, 2016

Fixed in d887023.

cpeel
Copy link
Contributor

cpeel commented on 9feaeae Feb 2, 2016

I'm a bit leery of having a version-less github repository as a core requirement. It was one thing for testing purposes, but quite another to say that whatever anyone commits to the rdflib-sqlalchemy github repo is good enough and won't ever break this module. This also means that this package now explicitly depends on SQLAlchemy as well, which might be overkill for some people (although it doesn't bother me any).

@cpeel
Copy link
Contributor

@cpeel cpeel commented Feb 2, 2016

Overall I like your refactor -- very clean. My one big concern is the removal of set_metadata_cache[_manager]() as that function enables one of the core requirements of what I was trying to achieve -- the ability to control where the cache ends up on the filesystem and what type of cache it is.

@@ -23,9 +23,5 @@ class CacheAlreadyExists(Error):
pass


class CacheNotRemovable(Error):
Copy link
Contributor

@cpeel cpeel Feb 2, 2016

Built-in python errors (almost) all end in Exception, Error, or Warning (https://docs.python.org/2/library/exceptions.html). See also PEP8: https://www.python.org/dev/peps/pep-0008/#exception-names

(I should have raised this comment against the commit when the exceptions.py file went in, as they're all this way.)

Copy link
Owner Author

@c-w c-w Feb 5, 2016

Fixed in 8921ee1.

This reverts commit 7c8f7c1 and adds some more documentation on how to
use the method.
@c-w
Copy link
Owner Author

@c-w c-w commented Feb 5, 2016

  • Added exception suffixes in 8921ee1.
  • Froze dependency on sqlalchemy-rdflib in d887023.
  • Brought back set_metadata_cache in 8df3e24.

cpeel
Copy link
Contributor

cpeel commented on 8df3e24 Feb 6, 2016

s/backen/backend/

@cpeel
Copy link
Contributor

@cpeel cpeel commented Feb 6, 2016

Ship it! :)

@c-w
Copy link
Owner Author

@c-w c-w commented Feb 6, 2016

           (`-,-,
           ('(_,( )
            _   `_'
         __|_|__|_|_
       _|___________|__
      |o o o o o o o o/   
     ~'`~'`~'`~'`~'`~'`~

@c-w c-w closed this Feb 6, 2016
@c-w c-w reopened this Feb 6, 2016
c-w added a commit that referenced this issue Feb 6, 2016
@c-w c-w merged commit c7343e9 into master Feb 6, 2016
0 of 2 checks passed
@c-w c-w deleted the cache-manager-refactor branch Feb 6, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants