
Reviews: Timeouts during review assignments not able to "resume from previous" #1076

Open
connorferster opened this issue Sep 20, 2022 · 9 comments

@connorferster

Hi @slarse,

I tried using the REPOBEE_4 features for the review and ran into a problem with GitHub: I am getting timeout errors and now "You have exceeded a secondary rate limit and have been temporarily blocked from content creation..." errors.

Unfortunately, it seems that the .json assignments file gets recreated from scratch every time, which means I now have a bunch of students who have been assigned multiple workbooks to review, some students who have been assigned one workbook to review, and some students who I cannot assign workbooks to at all because I am now blocked from GitHub :(

Perhaps the workflow would be best as follows:

  1. Add a flag in the command to overwrite an existing .json file
  2. Check for an existing .json file
  3. If none exists, then create a new one and populate it with the assignments
  4. Once populated, initiate creating repos on GitHub
  5. If the repo assignment already exists on GitHub, move onto the next in the .json file

Additionally, is there a way to set the rate limit and timeout limit in repobee so I don't have timeout errors and rate limit blocks?

Thanks!

@connorferster
Author

connorferster commented Sep 20, 2022

Okay, I was able to get the assignments removed (end command works with all assigned repos; excellent!) but I am unable to create assignments because I keep getting this secondary rate limit exceeded. I cannot assign reviews because of it. :(

Update
I am at 61 of 70 reviews completed and I am unsure if the non-REPOBEE_4 setting is able to resume from where it left off. I just tried resuming the assignments but GitHub has shut me out again.

I will open another issue for this secondary rate limit thing.

Update 2
I could not even create a separate issue for the secondary rate limit because my account was limited by that very rate limit :(((

@slarse
Collaborator

slarse commented Sep 20, 2022

Hi @connorferster,

What version of RepoBee are you running? Version 3.8.0 has patches for the secondary rate limit that should solve the problem. If you are running 3.8.0 and are still running into the rate limit, please let me know and I'll try to investigate.

@slarse
Collaborator

slarse commented Sep 20, 2022

The rate limiting thing is fairly new, and there have been a number of unexpected problems with it lately. It's hard for me to test it directly as I've no "large class of students" to experiment with on github.com.

I'd very much appreciate it if you could give feedback on whether the rate limit mitigation works, because this has been reported twice (#1065, #991), but I've not received any confirmation yet that my fixes have had the desired effect.

> Unfortunately, it seems that the .json assignments file that is created gets recreated from scratch every time

Yes, this is a consequence of the new commands being built directly on top of the legacy ones. The .json file is not what RepoBee uses to actually allocate the reviews; it's just a "hey, I did this" kind of statement.

This worked out perfectly before GitHub introduced the secondary rate limit, but now it's obviously insufficient.

> Perhaps the workflow would be best as follows:

The reviews assign command must become idempotent (like pretty much all other commands are), and the only way to achieve this reliably is to have another command generate the review allocations, and then feed that as input to reviews assign. It'll be somewhat less user friendly but will completely avoid the trap of failing in assigning reviews halfway through the process.

@connorferster
Author

Thanks @slarse,

It turns out that I had 3.7.1 installed. I just updated to 3.8.0 and am trying to create my review allocations again and I hit the error again. Now, just prior to upgrading, I tried to run it with 3.7.1 thinking that "it has been 9 hours since I last ran it - probably ok" but I got booted again. So I waited about 10-15 minutes before running again and I am still getting booted. I think that the time between each iteration (retry-after period) may need to be between 5 and 10 seconds? It seems that GitHub does not publish the value intentionally.

@connorferster
Author

connorferster commented Sep 21, 2022

> Perhaps the workflow would be best as follows:
>
> The reviews assign command must become idempotent (like pretty much all other commands are), and the only way to achieve this reliably is to have another command generate the review allocations, and then feed that as input to reviews assign. It'll be somewhat less user friendly but will completely avoid the trap of failing in assigning reviews halfway through the process.

That sounds completely reasonable to me! Thank you also for the new vocabulary word: idempotent :)

@slarse
Collaborator

slarse commented Sep 21, 2022

> Thanks @slarse,
>
> It turns out that I had 3.7.1 installed. I just updated to 3.8.0 and am trying to create my review allocations again and I hit the error again. Now, just prior to upgrading, I tried to run it with 3.7.1 thinking that "it has been 9 hours since I last ran it - probably ok" but I got booted again. So I waited about 10-15 minutes before running again and I am still getting booted. I think that the time between each iteration (retry-after period) may need to be between 5 and 10 seconds? It seems that GitHub does not publish the value intentionally.

Did you hit the error right away, or did it start running OK and then you hit the limit? If it's the former, your credentials might still have a "temporary ban" on them. If it's the latter, then RepoBee is still too aggressive. To the best of my knowledge, I'm following all of the best practices that GitHub outlines here: https://docs.github.com/en/rest/guides/best-practices-for-integrators#dealing-with-secondary-rate-limits
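The linked guidance boils down to: on a secondary rate limit response, honor the Retry-After header when it is present, and otherwise back off a conservative default before retrying. A rough illustration of that retry loop, where `send` and the response shape are assumptions for the sketch rather than RepoBee's actual HTTP layer:

```python
import time

def send_with_backoff(send, max_retries=3, default_wait=60, sleep=time.sleep):
    """Retry a request that hit a secondary rate limit (HTTP 403),
    honoring the Retry-After header when GitHub provides one."""
    response = send()
    for _ in range(max_retries):
        if response.status_code != 403:
            return response
        # No Retry-After means "create content at a reasonable pace";
        # fall back to a conservative default wait.
        sleep(int(response.headers.get("Retry-After", default_wait)))
        response = send()
    return response
```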

I will have to set up an experiment of my own to sort this out.

@connorferster
Author

Hi @slarse,

It would run about 20 students and then hit the limit.

Something "similar" happens (in quotes because no actual error pops up) when I run the "end" command. It will run for most of the students (say, 40) and then stop with no message. I would run the "end" command again and it would do maybe 20 more students, and run it again to get the rest of the students. Then I would run it one last time, where it returns nothing, just to make sure that it deleted all of the review teams.

I have opted to cancel the peer review portion of the course this time around. If you make some updates and would like to test them out, let me know and I will ask a subset of students if they are willing to let me run tests with their submissions.

@slarse
Collaborator

slarse commented Sep 21, 2022

I'm sorry you had to cancel the peer review, this secondary rate limit thing is really a nasty business to deal with. I realize now that there's one thing I haven't accounted for:

> Requests that create content which triggers notifications, such as issues, comments and pull requests, may be further limited and will not include a Retry-After header in the response. Please create this content at a reasonable pace to avoid further limiting.

Which is just... Well I don't even know how to act upon that, what is reasonable? I'd wager that's the problem here (because the peer review commands indeed open issues).

I will run some experiments of my own and get back to you.

@slarse
Collaborator

slarse commented Sep 25, 2022

So, I ran a little experiment with 2 students over 30 assignments (meaning a total of 60 review issues). I didn't encounter any rate limiting problems, and the actual API request load should be very similar to having e.g. 60 students on 1 assignment.

@connorferster When you run these commands, does it feel like they run fast? It should take around 1 to 2 seconds to assign each review; if it runs faster, the local rate limiting of modify requests isn't working as it should. It should feel like it's running pretty darn slow.

> I would run the "end" command again and it would do maybe 20 more students.

This I was actually able to reproduce. I will do some further debugging there.

Regardless, the only surefire way to solve this is to redesign the review assignment such that the JSON file with allocations is generated first, so that even if GitHub is being grumpy you can just continue the allocation later. It would require checking if there are open issues, though, but that's a read request on the API, which GitHub is generous with; it's just the modify requests (PUT, POST, DELETE) that it's gotten really stingy with lately.

Perhaps it would also be useful to be able to tune the local rate limiting, such that if you're having problems with it you can further reduce the request frequency. What do you think? It would be incredibly simple to implement.
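A tunable local limiter like the one floated above could be as simple as enforcing a minimum spacing between consecutive modify requests. A minimal sketch, where the class name and the 1.5-second default are assumptions (matching the 1-2 second pacing described earlier) rather than RepoBee's implementation:

```python
import time

class ModifyRateLimiter:
    """Enforce a minimum, user-tunable interval between modify requests."""

    def __init__(self, min_interval_seconds=1.5):
        self.min_interval = min_interval_seconds
        self._last_request = 0.0

    def wait(self):
        """Sleep just long enough to keep requests min_interval apart."""
        elapsed = time.monotonic() - self._last_request
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last_request = time.monotonic()
```

A user hitting secondary rate limits could then raise the interval (slower but safer), while everyone else keeps the default.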

@slarse slarse added the github label Sep 25, 2022