Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed a bug in RateLimitExceeded with incorrect key to response dict #276

Closed
wants to merge 1 commit into from
Closed

Conversation

north1
Copy link

@north1 north1 commented Jan 16, 2014

RateLimitExceeded.init was setting sleep_time to response['ratelimit'] which was throwing a KeyError exception every time I hit it. Swtiched to 'delay' which seemed to fix the functionality.

If this isn't correct, ignore this, but it fixed my problem and I thought I'd share it.

If this isn't the correct branch to send the pull request to but the fix should still be pulled, please let me know.

RateLimitExceeded.__init__ was setting sleep_time to response['ratelimit'] which was throwing a KeyError exception every time I hit it. Swtiched to 'delay' which seemed to fix the functionality.
@bboe
Copy link
Member

bboe commented Jan 16, 2014

We'll have to confirm that it was changed on the server side before accepting.

@north1
Copy link
Author

north1 commented Jan 16, 2014

Sounds good :) My own code was throwing that error enough that I noticed the problem.

@bboe
Copy link
Member

bboe commented Jan 16, 2014

The code here leads me to believe that the item name is still ratelimit (FYI: I added that code on reddit's side): https://github.com/reddit/reddit/blob/master/r2/r2/lib/jsonresponse.py#L65

Can you reproduce the problem and provide a log? Perhaps you're hitting something else.

@north1
Copy link
Author

north1 commented Jan 17, 2014

Alright, sorry I couldn't get to this yesterday. Bug is very reproduceable. Try running this script:

import praw
r = praw.Reddit(user_agent='rate limit test by /u/neph001')
while(not r.is_logged_in()):
    try:
        r.login('a', 'a')
    except praw.errors.InvalidUserPass:
        pass

Give it a few seconds, and it crashes throwing this error:

Traceback (most recent call last):
  File "/home/neph/PycharmProjects/praw_ratelimit/praw_ratelimit.py", line 9, in <module>
    r.login('a', 'a')
  File "/usr/local/lib/python2.7/dist-packages/praw/__init__.py", line 1157, in login
    self.request_json(self.config['login'], data=data)
  File "/usr/local/lib/python2.7/dist-packages/praw/decorators.py", line 172, in wrapped
    return_value))
  File "/usr/local/lib/python2.7/dist-packages/praw/errors.py", line 325, in __init__
    self.sleep_time = self.response['ratelimit']
KeyError: 'ratelimit'

So I went in and ran PyCharm's debugger to take a look at the errors source code as the error is thrown, and the response dict appears to look like this:

response={u'errors': [[u'RATELIMIT', u'you are doing that too much. try again in 3 minutes.', u'vdelay']]}

Which is weird because the RateLimitExceeded class is trying to assign response['ratelimit'] to RateLimitExceeded.sleep_time, which I'm fairly certain is used as an int.

Thoughts? Am I missing something obvious?

@bboe
Copy link
Member

bboe commented Jan 17, 2014

I see. This appears to be an inconsistency on reddit's end as I wrote PRAW's rate limit error handling functionality for submission and comments occurring too often. It appears the login rate limiting behaves differently. I'll look into it a bit.

@bboe
Copy link
Member

bboe commented Jan 17, 2014

Relevant IRC discussion: https://botbot.me/freenode/reddit-dev/msg/9889683/

@Deimos said he'll look into it.

@north1
Copy link
Author

north1 commented Jan 17, 2014

Cool! Well even if my 'fix' isn't helpful, I'm glad that I pointed out an actual bug and I wasn't just crazy. I would like to reiterate that when I changed my own copy of praw to use 'delay' instead of 'ratelimit' it did fix this issue for me, though now I suspect it also broke the RateLimitExceeded error for other limits (comments, etc). I might test that for fun.

In any case if there's anything else I can do to help, I'd be happy to. I'd like to stay in the loop at least.

@tjvezina
Copy link

Hi there - I'm having the exact same issue and error output as north1 mentioned above. I'm using PRAW for a bot that's been running for about a week and a half now without any problems, but about an hour ago I started to get this KeyError: 'ratelimit' issue. I tried north1's fix by swapping 'ratelimit' out for 'delay', but I just get the same crash and error.

I know the potential difference between login ratelimits and post/comment ratelimits is being looked into, but is there a short-term solution to this so I can keep my bot running? I tried taking it offline for 30 minutes but it continues to get the error as soon as I start it up again.

@north1
Copy link
Author

north1 commented Jan 21, 2014

tjvezina: Is your bot having this error thrown for logging in and out too much? If so...why is it doing that?

I ended up finding this because I was developing a desktop application and I was testing part of the script where it should keep asking for you to log in again if you pass it the wrong credentials, but I'd imagine you'd hard code the correct credentials into the bot and just keep it logged in?

In any case, there is certainly a short term solution: use a Try/Except to catch the KeyError exception.

@tjvezina
Copy link

This bot is running on Google App Engine, I got a sort of "template" from another mod, which sends an HTML request to the bot every 2 minutes, at which point it logs in, performs its intended tasks, and then exits.

I'm not sure if there's any way I could have a single instance running continuously, instead of a new instance every 2 minutes that only lives for ~15 seconds. My concern then would be if that single instance crashed, in which case it would stay crashed until I found out and manually launched another instance, I think.

As for the try/except idea, I threw that in for now, and it does stop the crashing, but of course the bot is still failing to log in.

@north1
Copy link
Author

north1 commented Jan 21, 2014

Well, it should eventually succeed when the ratelimit is no longer exceeded, which I think is what would happen even if this bug were not present. In both your case and mine, an error being thrown is the correct functionality.

This bug (and this pull request) is that, due to an error within the source code for the error (technically within Reddit's source code it seems, not PRAW's) the wrong error is being thrown. Praw already has a built in handler for the RateLimitExceeded exception that will wait the given amount of time to try again automatically (I think - bboe will have to back me up here or correct me) but in the case of login, Reddit's API handles things a bit incongruously.

My suggestion is to, catching the KeyError exception, manually wait through the delay time. You can do this without changing anything by just...letting it log in again on the next pass in 2 minutes.

@tjvezina
Copy link

That's a very a good idea. The version I have running right now seems to occasionally work (about once every 3 instances) which is much better than nothing. I'll try getting it to manually wait out the ratelimit in a bit - thanks for the help!

@tjvezina
Copy link

Hi there, me again - over the past week my bot has been bouncing between successfully running and getting the same 'ratelimit' KeyError as always when attempting to log in. Yesterday it totally flat-lined and repeatedly got the KeyError for about 12 hours, rendering the bot useless.

I took it offline yesterday afternoon, rewrote it to only need one login per day, and attempted to re-deploy this morning, only to hit the same KeyError as always (after 13 hours offline). I'm not exactly sure what I'm looking for now, but I guess I have 2 questions:

Since I'm getting this KeyError: 'ratelimit' error instead of a proper ExceededRateLimit error, is there any way to figure out how long reddit is trying to tell my bot to wait before logging in again?

And more generally, any ideas about why I'm getting so many of these errors, even after being offline so long?

Thanks again for all your help!

@bboe
Copy link
Member

bboe commented Jan 28, 2014

In my experience you should only get login rate-limited if you are entering the wrong password. If this is your bot, there is no reason for that to occur. My only other speculation is that you are starting too many instances of the bot at one time. If you're running more than one concurrent instance, then you should make use of praw-multiprocess.

@tjvezina
Copy link

Yeah, it definitely has the correct password, and there's only ever one instance running (a new instance is created every 2 minutes, which typically takes 15-20 seconds to complete execution). I've even experimented with a different version of the bot that only signs in once and executes in an infinite loop, sleeping for 2 minutes between rounds, but same problem.

I've even created a whole new application under Google App Engine, created a brand new reddit account for the bot (verified email and all), and launched the rewritten version of my python script, but it somehow still gets the KeyError every time I launch an instance.

Another thing is, I'm able to test the bot through localhost just fine, but the error returns as soon as it goes live. Is it possible that reddit has somehow blocked the application on Google App Engine via IP address or somesuch?

@bboe
Copy link
Member

bboe commented Jan 28, 2014

Yes, accessing reddit's API via GAE has a number of restrictions as it proxies external requests through only a few IPs. I would suggest talking with someone on #reddit-dev in IRC (freenode).

Here's my suggestion: Cookies for reddit users are valid indefinitely until a password change occurs (or the logout all other sessions feature is utilized). Adjust your bot to never login; just manually set the cookie. Then you shouldn't have to worry about this problem.

@Damgaard
Copy link
Contributor

You might be under heavier API restrictions if you are sending requests from a GAE server. Reddit's API restrictions are both on a per user and per IP basis. The precise details of this is unknown, the anti-spammer code is closed source. But if there are other programs on GAE accessing reddit, then you could be under much heavier restrictions. It would be a bit extreme if that's the case, but considering some of the other weird things we've encountered with GAE it wouldn't surprise me.

Have you considered using Heroku instead? I know there's quite a few people using it to host their bots/scripts for it, but we've never had a single bug, issue or even question related to Heroku. It seems it just works. The guy responsible for python stuff at Heroku is the same guy who wrote the requests library that PRAW uses to send and receive internet messages. So it's no surprise that it works so well there.

@tjvezina
Copy link

@bboe That's good to know. I get what you're saying about the cookies - I have no idea how to implement that but it's definitely something to look into, and I'll check out the IRC channel after work, thanks!

@Damgaard I've never heard of Heroku, I just signed up and started looking though the docs. It looks pretty interesting, and if it's already hosting other bots, hopefully I can get it worked for me as well.

@bboe
Copy link
Member

bboe commented Jan 28, 2014

Hopefully Heroku will work for you. I added #278 so that we can make dealing with cookies simpler.

@north1
Copy link
Author

north1 commented Jan 28, 2014

@bboe: Back on topic, has there been any updates regarding the original discrepancy between how ratelimit works with submissions and comments, and how it works with logins? I feel like I should stay on top of the issue and help however I can since I'm the dude who found it.

If you're curious how I found it (or how I failed to log in that many times in the first place) it's not a bot. I've been working on this project and was testing that the login handler worked correctly, since it needs to handle human input an not a hard coded username/password.

@Damgaard
Copy link
Contributor

It looks like no solution is coming upstream in the near future, so we should get a workaround in PRAW. Exceptions crashing is not excatly userfriendly. Since we still expect most responses to contain the 'ratelimit' variable, I don't think we should replace it with 'delay', which is only good for this particular case. Here is what I propose

        # Login ratelimits throw a special form of this exception, with the sleep_time
        # given in the "delay" variable instead. See https://github.com/praw-dev/praw/pull/276
        # for more information.
        if "delay" in self.response:
            self.sleep_time = self.response['delay']
        else:
            self.sleep_time = self.response['ratelimit']

With an entry in the changelog as well.

@Damgaard Damgaard added the Bug label Apr 29, 2014
@north1
Copy link
Author

north1 commented Apr 30, 2014

That looks acceptable. Super weird that this ever even came up at all.

@bboe
Copy link
Member

bboe commented Apr 30, 2014

@north1 can you confirm that this is still an issue?

@Damgaard were you ever able to reproduce the problem?

@bboe
Copy link
Member

bboe commented May 6, 2014

I was able to reproduce the problem (too many invalid login attempts) so I'll merge this stuff.

@bboe
Copy link
Member

bboe commented May 6, 2014

Actually, I have a response that does not contain the 'ratelimit' field, but it also does not have a 'delay' field. In what cases does a 'delay' value appear?

@bboe
Copy link
Member

bboe commented May 9, 2014

Fixed on reddit's end with respect to rate limiting on login. No need for a PRAW update. I couldn't find anything that indicates a delay field. Please create a new issue if something like that comes up. Thanks for reporting!

@bboe bboe closed this May 9, 2014
@bboe
Copy link
Member

bboe commented May 9, 2014

reddit-archive/reddit@e4a2d38

The fix on reddit's end for reference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants