hitting Google+ API quota #22

Closed
snarfed opened this issue Jan 6, 2014 · 14 comments

Comments

@snarfed
Owner

snarfed commented Jan 6, 2014

we currently hit our daily quota for Google+ API calls at around 6:30-7pm or so. :/ we're currently limited to the courtesy limit of 10k requests per day, and the G+ API doesn't yet support billing, so the only option is to request more quota.

https://code.google.com/apis/console/b/0/?noredirect&pli=1#project:1029605954231:quotas
https://support.google.com/plus/contact/request_quota?id=1029605954231

example log: https://www.brid.gy/log?start_time=1388982947&key=aglzfmJyaWQtZ3lyKQsSDkdvb2dsZVBsdXNQYWdlIhUxMDM2NTEyMzE2MzQwMTgxNTg3NDYM

HttpError 403 when requesting https://www.googleapis.com/plus/v1/people/me/activities/public?alt=json&maxResults=20 returned "Daily Limit Exceeded"

@snarfed
Owner Author

snarfed commented Jan 6, 2014

we're currently refreshing the oauth token on every single G+ request, i think due to #7, which probably multiplies the total number of calls by at least 2x. fixing that would help here.
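a minimal sketch of the fix: cache the access token and only refresh when it's about to expire, instead of refreshing per request. the names here (`TokenCache`, `refresh`) are illustrative, not the actual bridgy code:

```python
import time

class TokenCache:
    """Reuse a cached OAuth access token until shortly before it expires."""

    def __init__(self, refresh, skew=60):
        self.refresh = refresh    # callable returning (access_token, ttl_secs)
        self.skew = skew          # refresh a bit early to avoid races
        self.token = None
        self.expires_at = 0

    def get(self, now=None):
        now = time.time() if now is None else now
        if self.token is None or now >= self.expires_at - self.skew:
            self.token, ttl = self.refresh()
            self.expires_at = now + ttl
        return self.token
```

with this, N API calls cost one token refresh per expiry window instead of N refreshes.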

@snarfed
Owner Author

snarfed commented Jan 9, 2014

the G+ API sends ETags and supports conditional GETs. e.g. this kind of request returns 304:

curl -v -H 'Authorization: Bearer [TOKEN]' -H 'If-None-Match: "[ETAG]"' 'https://www.googleapis.com/plus/v1/people/me/activities/public'

these requests will still count against the quota themselves - https://groups.google.com/d/topic/google-calendar-api/bvb3zsF095Q/discussion - but we could still use them to short circuit the poll and not fetch comments, +1s, and reshares for old activities.
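the short-circuit could look something like this sketch, where `fetch` stands in for whatever issues the conditional GET (returning status, ETag, and body); the function and parameter names are hypothetical:

```python
def poll_activities(fetch, cached_etag):
    """Fetch the activities feed only if it changed since the last poll.

    fetch: callable taking an If-None-Match value and returning
           (status_code, etag, body).
    Returns (etag, body). body is None on a 304, which tells the caller
    the feed is unchanged, so it can skip refetching comments, +1s, and
    reshares for old activities.
    """
    status, etag, body = fetch(cached_etag)
    if status == 304:
        # unchanged. this request still counts against quota, but it
        # saves all of the follow-up per-activity calls.
        return cached_etag, None
    return etag, body
```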

@snarfed snarfed closed this as completed in 326d61b Jan 9, 2014
@snarfed
Owner Author

snarfed commented Jan 9, 2014

at a higher level, fixing #17 would ideally fix this for all sources.

@snarfed snarfed reopened this Jan 9, 2014
@snarfed
Owner Author

snarfed commented Jan 10, 2014

I'm going to do #17, which should give me enough breathing room for a while here.

@snarfed snarfed closed this as completed Jan 10, 2014
@snarfed
Owner Author

snarfed commented Sep 17, 2014

reopening; hit this again yesterday and today. :(

@snarfed snarfed reopened this Sep 17, 2014
@snarfed
Owner Author

snarfed commented Sep 18, 2014

this got bad suddenly. bridgy was humming along fine using just ~60% of the quota per day, but then suddenly spiked. today we burned through all of our quota before 8am! (quotas reset at midnight.) from https://code.google.com/apis/console/b/0/?pli=1#project:1029605954231:stats :

[screenshot: API usage graph from the console, 2014-09-18 9:50 AM]

i suspect #44 (batching API calls), but i pushed that live on 9/7, and usage didn't spike until 9/15. hrmph.

@snarfed
Owner Author

snarfed commented Sep 18, 2014

maybe some G+ user who signed up recently is really, really, really active?

snarfed added a commit that referenced this issue Sep 18, 2014
another vain effort to stem the bleeding from #22 :/
@snarfed
Owner Author

snarfed commented Sep 20, 2014

ok, i think i see what's going on here. we memcache the last known number of comments, +1s, and reshares for each activity, and only refetch if we see that number change. however, we're on shared memcache, which is really limited: average 1-2MB and under 1h lifetime at most. so, we're losing those cached counts and refetching pretty much all comments/+1s/reshares on all G+ polls. ugh.

example log.

solution is probably to write these values to the datastore somewhere so that we're not at memcache's mercy.
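a rough sketch of that idea, keeping the per-activity response counts on the source entity itself (which is persisted in the datastore) so evictions can't force a full refetch. this is illustrative only, not the actual bridgy models:

```python
class Source:
    """Stand-in for a datastore-backed source entity."""

    def __init__(self):
        # maps activity id -> (comments, plusones, reshares)
        self.response_counts = {}

def activities_to_refetch(source, activities):
    """Return only the activities whose response counts changed.

    Unchanged activities are skipped entirely, saving the per-activity
    comment/+1/reshare fetches that were burning quota.
    """
    changed = []
    for a in activities:
        counts = (a['comments'], a['plusones'], a['reshares'])
        if source.response_counts.get(a['id']) != counts:
            changed.append(a)
            source.response_counts[a['id']] = counts
    return changed
```

since the counts live on the entity, they survive memcache evictions; memcache only helps as ndb's cache layer.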

snarfed added a commit to snarfed/granary that referenced this issue Sep 22, 2014
it's an alternative to the cache kwarg, for returning only new activities and responses since the call that returned that response.

for snarfed/bridgy#22
snarfed added a commit that referenced this issue Sep 22, 2014
needed so we can use it to determine when an activity has new comments/likes/etc without depending on memcache (and thus failing when it evicts). for #22; details in #22 (comment)

!!! NOT ENTIRELY SAFE !!! this sometimes raises `ValueError: Circular reference detected` on real world G+ data, probably due to how i hack up the inner objects during a poll. whee.

don't worry, this will get reverted immediately and never deployed. i'm only pushing it to origin for posterity.
@snarfed
Owner Author

snarfed commented Sep 22, 2014

i implemented this for G+ in snarfed/granary@497445d, 836fdb5, and 6d906b3. it was brittle and didn't fully work, though, and more complicated than i'd like, so i'm reverting.

instead of storing the entire last response in a Source property, i'm going to try storing just the keys and values that we currently put into memcache.

as a side benefit, this should keep memcache space usage roughly constant, since the same data is just moving from native memcache into the sources, which are cached by ndb. full responses from get_activities_response() are way bigger, e.g. mine for G+ is currently 323KB (uncompressed)!
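to make the size difference concrete, here's a toy comparison with made-up data (the real response shapes differ); only the keys and counts we compare on poll need to be stored:

```python
import json

# fake full get_activities_response()-style payload: lots of content
full_response = {'items': [{'id': 'act%d' % i,
                            'object': {'content': 'x' * 500,
                                       'replies': {'totalItems': 3}}}
                           for i in range(200)]}

# the compact form: just activity id -> response count
counts = {item['id']: item['object']['replies']['totalItems']
          for item in full_response['items']}

full_size = len(json.dumps(full_response))
compact_size = len(json.dumps(counts))
assert compact_size < full_size / 10  # more than an order of magnitude smaller
```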

snarfed added a commit that referenced this issue Sep 22, 2014
snarfed added a commit that referenced this issue Sep 22, 2014
snarfed added a commit to snarfed/granary that referenced this issue Sep 22, 2014
@snarfed
Owner Author

snarfed commented Sep 22, 2014

holy crap this was so much easier. why didn't i do this in the first place?! 🤦

@snarfed
Owner Author

snarfed commented Sep 22, 2014

also i think snarfed/granary@3ed91d2 fixed a bug that meant we, uh, weren't caching response counts for twitter at all, so we were always refetching favorites, RTs, and replies for every tweet we saw.

disappointing that tests didn't catch that. also disappointing that i'm too lazy to add a test for it now. :P

@snarfed
Owner Author

snarfed commented Sep 22, 2014

hot damn, this had some serious side benefits.

memcache hit ratio:
[chart: memcache hit ratio]

total memcache usage:
[chart: total memcache usage]

memcache ops/s:
[chart: memcache ops/s]

on the other hand, latency bounced upward for a bit because the datastore caches were all cold initially, so we had to refetch all of the cached values to populate them. long term though, latency should drop a bit.
[chart: latency]

@kylewm
Contributor

kylewm commented Sep 22, 2014

I don't totally understand what happened here, but those charts/improvements are awesome 😮 👏👏👏

@snarfed
Owner Author

snarfed commented Sep 23, 2014

thanks @kylewm!

finally applied the silly untested caching bug fix to G+ as well, and it's working like a charm. here's the before and after. the fix took effect at midnight 9/23. the two flat periods were when we'd hit the quota and G+ was rejecting all calls. looks like somewhere around a 10x drop in total API calls. woot!

[screenshot: API calls graph, before and after, 2014-09-23 12:25 PM]

i'm going to bump G+ poll frequency back up to every 20m.
