async impl of DynamoDBBackend #37

akursar · 2021-03-26T17:12:58Z

Draft PR to raise some questions I had while starting to work on this.

The minimal integration tests work for me if I run the dynamodb local service, but I have to run that as root to workaround perms issues sqlite raises when trying to write to the mounted volume. I'm not sure if this is specific to my setup.

I'll raise other questions inline.

aiohttp_client_cache/backends/dynamodb.py

JWCook · 2021-03-26T17:22:49Z

For the DynamoDB container file permissions, you could manually specify the container's user, use a bind mount to a local directory, and chown it to the same user.

However, since this is just used for tests and persistence isn't important, using an in-memory database is going to be simpler. I just updated the docker-compose.yml in the main branch.

aiohttp_client_cache/backends/dynamodb.py

akursar · 2021-03-26T17:31:31Z

Maybe tangential, but how are exceptions expected to be handled? I think that for most exceptions outside initial init, they should be swallowed. If I time out in reading from my cache, I would expect my session to fetch from origin, and if the save_response similarly times out, I still expect to get the response. Is that atypical? i.e. if I want that behavior, should I impl a custom cache that likely wraps and caches, or should the base backend do some of that exception handling? Basically in my actual impl, I swallow all exceptions during read/write and emit logs/metrics but I still want my response data from origin when my cache is falling over.

JWCook · 2021-03-26T18:05:36Z

how are exceptions expected to be handled? I think that for most exceptions outside initial init, they should be swallowed.

Right. If there's an exception when fetching a cached response, CacheBackend.get_response() returns None and CachedSession._request() makes a new request. Errors aren't currently handled in CacheBackend.save_response(), but adding a try-except block and error logging there would be good.

akursar · 2021-03-29T13:55:37Z

how are exceptions expected to be handled? I think that for most exceptions outside initial init, they should be swallowed.

Right. If there's an exception when fetching a cached response, CacheBackend.get_response() returns None and CachedSession._request() makes a new request. Errors aren't currently handled in CacheBackend.save_response(), but adding a try-except block and error logging there would be good.

Well, CacheBackend.get_response catches KeyError/TypeError right now, but for example if there's some sort of timeout in reading from the cache, it looks like that's uncaught. I'm not sure if you'd rather have catch-all except Exception: throughout CacheBackend to simplify cache impls or if that should be caught per-cache. If you think there should be more handling in that base backend, I can add that in a separate PR too. LMK if that doesn't make sense or if I'm missing some higher-level catching.

JWCook · 2021-03-30T19:12:14Z

if there's some sort of timeout in reading from the cache

Good point. If there's something like a timeout or other backend-specific thing you know you want to ignore, you could handle those in the backend class, like:

except botocore.exceptions.ClientError as error:
    if error.response['Error']['Code'] == 'TimeoutError':   # Or whatever error code(s) you want to check for
        pass

For non-backend-specific error handling in get_response(), I'd be hesitant to put a broad catch Exception in there, as I'd generally like to only handle errors for things that this library actually does, e.g. everything in between getting data from the backend and returning it to the client application. For example, if I got a ClientError because of bad IAM permissions, that would be a problem in my application that I'd want to fix, not something to quietly ignore, right? Similarly, I'd want to see any file permissions errors for SQLite, or config/networking errors for other backends. I'm open to suggestions, though.

Also, I added AttributeError and PickleError to the list of exceptions to catch, since I found that those could potentially be raised during deserialization.

akursar · 2021-04-07T17:38:48Z

@JWCook what do you think, re: mypy errors with type return. I used AsyncIterable here.

JWCook · 2021-04-07T18:39:47Z

what do you think, re: mypy errors with type return. I used AsyncIterable here.

I had to think about that one for a minute. I believe your type hint is correct. From the mypy docs:

"The return type annotation should be the same as the type of the value you expect to get back when await-ing the coroutine."

Currently in the other backend classes (and base class), they're typed as Iterable since that's what they return after being awaited, e,g, usage looks like for value in await cache.values().

Since you're yielding within an async for and it works with __aiter__, AsyncIterable is the correct type. I like that better than the current Iterators, so I can go ahead and change the rest of the keys() and values() methods to match, if that works for you.

akursar · 2021-04-07T18:49:44Z

what do you think, re: mypy errors with type return. I used AsyncIterable here.

I had to think about that one for a minute. I believe your type hint is correct. From the mypy docs:

"The return type annotation should be the same as the type of the value you expect to get back when await-ing the coroutine."

Currently in the other backend classes (and base class), they're typed as Iterable since that's what they return after being awaited, e,g, usage looks like for value in await cache.values().

Since you're yielding within an async for and it works with __aiter__, AsyncIterable is the correct type. I like that better than the current Iterators, so I can go ahead and change the rest of the keys() and values() methods to match, if that works for you.

That would be great! Thanks!

JWCook · 2021-04-07T20:40:49Z

Is this ready to merge? If so, would you mind rebasing and squashing your commits? Otherwise this looks good to me!

akursar · 2021-04-08T15:29:56Z

Is this ready to merge? If so, would you mind rebasing and squashing your commits? Otherwise this looks good to me!

Yes. Commits squashed. But you know github lets you do a squash commit from the PR. Curious if you've found issues with that, as I've gotten accustomed to cleaning up commit log, e.g. adding conventional commit tags, during the squash commit in github and am always curious to hear what others are doing.

JWCook · 2021-04-08T16:06:00Z

Sure, just wanted to give you the chance to make any commit log edits yourself, if you wanted to.

The 'Squash and merge' option from GitHub only lets the reviewer/merger edit the squashed commit message, right? As far as I know, the only way to let the PR author make those edits is to do an interactive rebase before merging. That's how my team at work does it with GitLab, at least.

Does GitHub have a feature to squash & edit commits as the PR author?

JWCook · 2021-04-08T16:14:25Z

Your changes are in the latest pre-release build, 0.3.0.dev1.

Thanks for the contribution, and the other suggestions along the way! This has been really helpful.

akursar · 2021-04-08T16:21:03Z

Sure, just wanted to give you the chance to make any commit log edits yourself, if you wanted to.

The 'Squash and merge' option from GitHub only lets the reviewer/merger edit the squashed commit message, right? As far as I know, the only way to let the PR author make those edits is to do an interactive rebase before merging. That's how my team at work does it with GitLab, at least.

Gotcha. That makes sense.

Does GitHub have a feature to squash & edit commits as the PR author?

Not that I know of. We've gotten into the habit of updating the PR title because github seems to use this as the default message for the squash commit... but seemingly not always. I'm not sure if that's a UI/caching thing, but our habit is to use the PR title as the staged commit log message by convention/ shared expectation, and double-check that it matches as the merger, heh.

async impl of DynamoDBBackend