Conversation
Assigned to @dhermes for initial review. |
/cc @waprin |
w00t, I made that! Run What is #319 anyway? |
@jonparrott Why always sending PRs with so much stuff? Can't we break them into tiny digestible pieces? Reviewers are people too 😁 |
Sorry! I tried to make this one pretty small. Is there a better way for me to break this up? I'm happy to do so. The only thing I can think of is to split it into separate PRs for LockedStorage, DictionaryStorage, updates to File storage, updates to flask_util. But there's all quite intertwined. |
(I feel like this one just looks big because of the tests. Really I'm just shuffling code around) |
I was trying to be complain-y and funny at once. I can handle it. |
# limitations under the License. | ||
|
||
"""Dictionary storage for OAuth2 Credentials. | ||
""" |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
You're not a person, you're a lobster. |
13ede5b
to
2dc4815
Compare
Docs fixed. |
If self._key is a callable, it will return the result of calling | ||
self._key. | ||
""" | ||
if callable(self._key): |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
Finished review. This is what I would've liked to have seen
Boiling the ocean in a commit makes it less likely to be reviewed quickly and increases the chances that issues/problems will slip through the review. |
I can still do that if it's easier for you. |
No worries, review wasn't too hard. Just file it away for future code changes. |
You got it. I thought of adding just locked storage and dictionary storage then doing another PR to refactor file storage, but it was such a small change. |
2dc4815
to
65152b8
Compare
65152b8
to
315aa18
Compare
|
||
Args: | ||
dictionary: A dictionary or dictionary-like object. | ||
key: A hashable or a function returning a hashable. The credentials |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
315aa18
to
658da6d
Compare
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
"""Oauth2client.file tests |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
658da6d
to
7c04fd4
Compare
@jonparrott Not sure why our discussion got folded under by GitHub. As I recall, |
Yeah let's keep the discussion above the fold, github collapses comments on outdated commits. If @nathanielmanistaatgoogle is willing to allow drastic, breaking changes, IMO storage could be more easily expressed with something like this: # Could likely just be dropped altogether.
class Storage(object):
def get(self):
abstract()
def set(self, credentials):
abstract()
def delete(self):
abstract()
class DictionaryStorage(Storage):
# Implements get, set, and delete
pass
class LockedStorage(Storage):
def __init__(self, storage, lock=None):
self._storage = storage
if lock is None:
lock = threading.Lock()
self._lock = lock
def get(self):
with self._lock:
self._storage.get()
# And so on...
# Usage
unlocked_storage = DictionaryStorage({}, key='credentials')
locked_storage = LockedStorage(unlocked_storage) |
def locked_delete(self): | ||
"""Remove the credentials from the dictionary, if they exist.""" | ||
key = self._get_key() | ||
try: |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
7c04fd4
to
d230fd2
Compare
def locked_delete(self): | ||
"""Remove the credentials from the dictionary, if they exist.""" | ||
key = self._get_key() | ||
self._dictionary.pop(key, None) |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
Let's do the drastic breaking change now. I agree that the current Tell me more about the "contextual" key creation circumstance? I'm having a hard time wrapping my head around needing a |
Are you comfortable for what I've proposed in terms of re-implementing Storage? At least as a starting point?
Take a look at Implementation / usage suggestions in #319, though relevant stuff included below. Basically imagine that you have a central database (e.g., Redis) and you want to keep track of the credentials for all of the users in your application. You can't just use def get_storage_for_user(request):
key = request['user_id']
return KeyValueStorage(redis_instance, key=key)
oauth2 = flask_util.UserOAuth2(app, storage=get_storage_for_user) Again, totally fine not having that has part of the |
Yes, I'm comfortable with what you've proposed - especially the way that I will take you up on the offer of keeping the indirection entirely in the web app helpers. It's too weird to have in the generally-unaffiliated |
Great. :) Not yet, but I'll watch it tonight.
You got it. Put this PR on ice. I'm going to rip Storage's heart out and fix everything that it breaks and submit a PR for that. Sorry, @dhermes, it'll be slightly big as it's boiling the ocean by definition. |
Interesting point of contention here between where we want to take storage and where it is. It seems the reason this locking logic is in storage is not only due to thread safety, but also due to this: def _refresh(self, http_request):
"""Refreshes the access_token.
This method first checks by reading the Storage object if available.
If a refresh is still needed, it holds the Storage lock until the
refresh is completed.
Args:
http_request: callable, a callable that matches the method
signature of httplib2.Http.request, used to make the
refresh request.
Raises:
HttpAccessTokenRefreshError: When the refresh fails.
"""
if not self.store:
self._do_refresh_request(http_request)
else:
self.store.acquire_lock()
try:
new_cred = self.store.locked_get()
if (new_cred and not new_cred.invalid and
new_cred.access_token != self.access_token and
not new_cred.access_token_expired):
logger.info('Updated access_token read from Storage')
self._updateFromCredential(new_cred)
else:
self._do_refresh_request(http_request)
finally:
self.store.release_lock() Credentials holds the storage lock while refreshing credentials to prevent duplicate refresh requests. Any ideas on this? My two so far:
|
FWIW That is for thread-safety, in order to make the refresh thread-safe. |
Putting on the credential itself wouldn't work, as there can be multiple instances of a single credential. Is storage the right place for this? It seems like a convenient place, but not necessarily the correct place. |
Not sure it's the right place, but you should try to limit the scope of your changes for everyone's sake (manage complexity). The abstract concept of a store has traditionally been used for a file-system and for GAE. In these cases, the lock is for the resource (e.g. writing to the datastore and writing to the filesystem). So acquiring the lock happens very likely on different instances of In Py27 on GAE, with threading turned on, certain resources are shared between requests. In particular, |
Right, so there's two separate issues here:
Should we split refresh locking into its own concern? |
This is a rough-cut to sanity check the concept. Docstrings and full tests still missing. New: * `client.Storage` is an abstract base class. * `locked_storage.LockedStorage` provides locking based on context managers and defaults to `threading.Lock`. * `locked_storage.threadsafe` is a class-level decorator to provide thread safety. This is applied to `FileStorage` and `KeyringStorage`. Changed: * All `Storage` subclasses have been updated to use the new base class. * `file.Storage` renamed to `file_storage.FileStorage`. * `keyring_storage.Storage` renamed to `keyring_storage.KeyringStorage`. * `django_orm.Storage` renamed to `django_orm.DjangoOrmStorage`. **Questionable changes** As mentioned in [a comment on googleapis#344], `Credentials` has the unfortunate behavior of using the Storage's lock to prevent concurrently refreshing the credentials. IMO, Refresh locking should be handled as a separate concern. For now, `Storage` implements a no-op `lock()`.
This is a rough-cut to sanity check the concept. Docstrings and full tests still missing. New: * `client.Storage` is an abstract base class. * `locked_storage.LockedStorage` provides locking based on context managers and defaults to `threading.Lock`. Changed: * All `Storage` subclasses have been updated to use the new base class. * `file.Storage` renamed to `file_storage.FileStorage`. * `keyring_storage.Storage` renamed to `keyring_storage.KeyringStorage`. * `django_orm.Storage` renamed to `django_orm.DjangoOrmStorage`. * `KeyringStorage` and `FileStorage` are no longer threadsafe by default. **Questionable changes** As mentioned in [a comment on googleapis#344], `Credentials` has the unfortunate behavior of using the Storage's lock to prevent concurrently refreshing the credentials. IMO, Refresh locking should be handled as a separate concern. For now, `Storage` implements a no-op `lock()`.
Jon, I am not totally sure those are two separate issues. In my mind they are both about synchronizing concurrent access to the same object, in this case the credentials. In the first case you are using the lock just to read/write the value safely, while in the second case you are holding the lock while you refresh the credentials so you don't refresh the credentials more time than you need to. So in other words I think it's totally sane to use the same lock for both. In your example with Redis and Zookeeper, if you're using Zookeeper to protect a Redis entry, then you should be using the Zookeeper lock any time you write the credentials, not just during credentials refresh. If your credentials store is in a distributed store then just adding thread/file locking isn't enough in any of the cases. I think part of the reason things are getting confusing is we keep saying "thread-safety" referring to the Storage lock, but isn't necessarily just a lock for threads. To me, if you were going to introduce different locking, it would be to introduce separate read/write locks, that way readers can still read the old credentials during the refresh process. But I think that's overcomplicating things unless long refreshes holding onto locks for too long actually pops up as an issue for someone. Hopefully that makes sense, apologies if I'm confused on this complicated topic. |
Fair point, we can continue the discussion in the PR for the storage On Mon, Nov 30, 2015, 4:54 PM Bill Prin notifications@github.com wrote:
|
* LockedStorage - implements a generic storage base class that uses a `threading.Lock` like-object. * DictionaryStorage - implements an optionally-locked storage over a dictionary-like object. Additionally: * Updated `file.Storage` to use `LockedStorage`. * Remove `flask_util.FlaskSessionStorage` and replaced it with `DictionaryStorage`.
d230fd2
to
dc542ab
Compare
I'm going to start a fresh PR for this. |
DictionaryStorage - implements an optionally-locked storage over a dictionary-like object.
Additionally:
Storage
now includes optional locking logic. Previously this was implemented in subclasses.flask_util.FlaskSessionStorage
and replaced it withDictionaryStorage
.