twitter: fully support oversized block lists #764

Open
snarfed opened this issue Aug 9, 2017 · 2 comments

snarfed commented Aug 9, 2017

we recently started fetching and obeying twitter block lists in #473. more implementation details in snarfed/granary#40. we use twitter's blocks/ids endpoint (with count=5000), which is rate limited to 15 calls per 15 minutes per user. this means that we can only fetch 75k members (15 calls × 5000 ids) of a user's block list per rate limit window.
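
as a rough illustration of that arithmetic, here's a small python sketch. the constant and function names are made up for this example and aren't taken from bridgy or granary.

```python
import math

IDS_PER_CALL = 5000     # count=5000 on blocks/ids
CALLS_PER_WINDOW = 15   # rate limit: 15 calls per 15 minute window, per user


def max_ids_per_window():
    """Most block list members we can fetch in one rate limit window."""
    return IDS_PER_CALL * CALLS_PER_WINDOW


def windows_needed(block_list_size):
    """How many rate limit windows a full fetch of this block list takes."""
    calls = math.ceil(block_list_size / IDS_PER_CALL)
    return math.ceil(calls / CALLS_PER_WINDOW)
```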

there are currently three bridgy twitter users with >75k users on their block lists. one has 149996 total (!); i haven't counted the others. they now hit the rate limit and fail pretty much every poll. we cache block lists, but we don't split and coalesce block list fetches across polls.
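
one way the split-and-coalesce idea could look, as a minimal sketch and not something bridgy does: spend at most one rate limit window's worth of calls per poll, remember the cursor, and only swap in the new block list once the last page arrives. `fetch_page`, the `state` dict, and its keys are hypothetical; the `ids` and `next_cursor_str` fields and the `-1` / `0` cursor values follow twitter's cursored responses.

```python
def fetch_some_blocked_ids(fetch_page, state, max_calls=15):
    """Fetch up to max_calls pages, resuming from the cursor saved in state.

    fetch_page(cursor) is a hypothetical callable that returns one decoded
    blocks/ids response: {'ids': [...], 'next_cursor_str': '...'}.
    state is a dict that the caller persists between polls (hypothetical).
    Returns True once the whole block list has been fetched.
    """
    cursor = state.get('cursor', '-1')          # -1 asks for the first page
    ids = state.setdefault('partial_ids', [])   # pages collected so far

    for _ in range(max_calls):
        page = fetch_page(cursor)
        ids.extend(page['ids'])
        cursor = page['next_cursor_str']
        if cursor == '0':                       # 0 means there are no more pages
            state['blocked_ids'] = ids          # complete list, safe to use
            state['partial_ids'] = []
            state['cursor'] = '-1'              # next fetch starts over
            return True

    state['cursor'] = cursor                    # resume here on the next poll
    return False
```

a caller would keep using the previous `state['blocked_ids']` until this returns True, so a half-fetched list never replaces a complete one.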

those three users' last webmentions from bridgy were 2 mos ago, 3 mos ago, and never, and the solutions to this are awkward at best, so i'm not prioritizing this right now.

snarfed added the listen label Aug 9, 2017
snarfed changed the title from "twitter: handle oversized block lists" to "twitter: fully support oversized block lists" Aug 23, 2017

snarfed commented Aug 23, 2017

my stopgap solution to this is to gracefully handle when we get rate limited and use the partial block list contents that we've fetched so far (06f5987). the cap i ended up with is actually 40k, not 75k, since i memcache block lists, and memcache values are limited to 1MB.
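
a minimal sketch of that kind of stopgap, under assumptions: `RateLimited` and `fetch_page` are hypothetical stand-ins (the real code gets a 429 from the HTTP layer), and this is not the code from 06f5987.

```python
class RateLimited(Exception):
    """Hypothetical stand-in for whatever error an HTTP 429 raises."""


MAX_CACHED_IDS = 40000  # the cap described above


def fetch_blocked_ids(fetch_page):
    """Return as many blocked ids as we can get, up to MAX_CACHED_IDS.

    fetch_page(cursor) is a hypothetical callable that returns one decoded
    blocks/ids response, or raises RateLimited when twitter says 429.
    """
    ids = []
    cursor = '-1'
    try:
        while cursor != '0' and len(ids) < MAX_CACHED_IDS:
            page = fetch_page(cursor)
            ids.extend(page['ids'])
            cursor = page['next_cursor_str']
    except RateLimited:
        pass  # rate limited: fall through and use the partial list
    return ids[:MAX_CACHED_IDS]
```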

snarfed commented Sep 23, 2019

oof, hit a blocklist last night (vhfmag's) where 40k was still over 1MB. error here; log:

urlopen GET https://api.twitter.com/1.1/blocks/ids.json?count=5000&stringify_ids=true&cursor=-1 {} (local/lib/python2.7/site-packages/oauth_dropins/webutil/util.py:1316)
...
urlopen GET https://api.twitter.com/1.1/blocks/ids.json?count=5000&stringify_ids=true&cursor=1645386315131274403 {} (local/lib/python2.7/site-packages/oauth_dropins/webutil/util.py:1316)
Error 429, response body: u'{"errors":[{"message":"Rate limit exceeded","code":88}]}' (local/lib/python2.7/site-packages/oauth_dropins/webutil/util.py:1107)
Updating vhfmag (Twitter) /twitter/vhfmag : {u'poll_status': u'error', u'last_activity_id': u'1175780014345863169', u'last_public_post': datetime.datetime(2019, 9, 22, 14, 32, 55), u'recent_private_posts': 0} (models.py:262)
Values may not be more than 1000000 bytes in length; received 1087968 bytes (local/lib/python2.7/site-packages/webapp2.py:1590)
Traceback (most recent call last):
...
  File "tasks.py", line 95, in post
    self.poll(source)
  File "tasks.py", line 195, in poll
    self.backfeed(source, responses, activities=activities)
  File "tasks.py", line 329, in backfeed
    if source.is_blocked(resp):
  File "twitter.py", line 172, in is_blocked
    memcache.set(cache_key, self.blocked_ids, time=BLOCKLIST_CACHE_TIME)
...
  File "python27_lib/versions/1/google/appengine/api/memcache/__init__.py", line 238, in _validate_encode_value
    'received %d bytes' % (MAX_VALUE_SIZE, len(stored_value)))
ValueError: Values may not be more than 1000000 bytes in length; received 1087968 bytes

the memcache docs are light on size calculation details, but they do say "Any type. If complex, will be pickled." i guess i could pickle the list, measure, and incrementally cut it down until it's under 1MB, but for now i'm just going to drop the cutoff down to 35k.
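
the pickle, measure, and trim idea might look roughly like this. it's a sketch under assumptions: `trim_to_fit` and `step` are made-up names, and memcache's own serialization may not produce exactly the same byte count as `pickle.dumps`, so a real cutoff would want some margin.

```python
import pickle

MAX_VALUE_SIZE = 1000000  # byte limit from the memcache error message above


def trim_to_fit(ids, max_bytes=MAX_VALUE_SIZE, step=1000):
    """Drop ids from the end until the pickled list fits in max_bytes.

    Returns a new list; the input list is left untouched.
    """
    ids = list(ids)
    while ids and len(pickle.dumps(ids)) > max_bytes:
        del ids[-step:]  # cut another chunk off the end and re-measure
    return ids
```

re-pickling on every pass is wasteful but simple; with a step of 1000 it only takes a handful of passes for lists in the tens of thousands.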

snarfed added a commit that referenced this issue Sep 23, 2019