Feature: Random delays #51
Comments
How much of a delay would be a good amount, and how much variation would be enough to not trigger a block? Edit: Would it be enough to generate a random number from, say, 50 to 100, and add it to the base SLEEP_DURATION here?
@SpyrosRoum the delay you linked to only happens when the queue is empty. If that sleep happens too many times, the application will just end, since it means we ran out of links to visit. However, it's possible that the queue is empty for a little while, while some other threads are repopulating it. So we make the inactive threads sleep for a little bit (SLEEP_DURATION) so that they're not running for nothing. We need to add another type of delay for when the queue is still populated, so that we don't exceed the website's limits. Do you want to work on this? I can assign you to it and you can ask all the questions you need here :) Thanks a lot for giving it a try!
Ohh you are right, I didn't pay much attention at all (I just did a ctrl + f for sleep_duration xd). So we would want to use a random delay only if we successfully got something from the queue, which means we would sleep after we handled the url in the Ok arm of that same match. Right? I would agree to be assigned to it, but I feel like what I am thinking is too easy to not be already done by one of you guys, so I may have the wrong idea of what I am getting myself into here. If it is just adding a random delay after handling the message then sure, I can do that.
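To make the two kinds of delay discussed above concrete, here is a minimal sketch of a worker loop, not SuckIT's actual code. It assumes a standard-library `mpsc` channel as the queue; `run_worker`, `MAX_EMPTY_POLLS`, the `handle` and `politeness_delay` callbacks, and the value given to `SLEEP_DURATION` are all hypothetical names and numbers chosen for illustration.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

// Hypothetical values: the thread does not state the real ones.
const SLEEP_DURATION: Duration = Duration::from_millis(10);
const MAX_EMPTY_POLLS: u32 = 3;

/// Pulls URLs from `queue` and passes each one to `handle`.
/// Sleeps for `politeness_delay()` after every handled URL, and for
/// `SLEEP_DURATION` whenever the queue is found empty.
/// Returns the number of URLs handled.
fn run_worker(
    queue: &Receiver<String>,
    mut handle: impl FnMut(&str),
    mut politeness_delay: impl FnMut() -> Duration,
) -> usize {
    let mut handled = 0;
    let mut empty_polls = 0;
    while empty_polls < MAX_EMPTY_POLLS {
        match queue.try_recv() {
            Ok(url) => {
                empty_polls = 0;
                handle(&url);
                handled += 1;
                // The new delay: only after a URL was actually processed.
                thread::sleep(politeness_delay());
            }
            Err(TryRecvError::Empty) => {
                // The existing delay: the queue may be refilled by other
                // threads, so wait a little before polling again.
                empty_polls += 1;
                thread::sleep(SLEEP_DURATION);
            }
            // Every sender is gone, so no more URLs can arrive.
            Err(TryRecvError::Disconnected) => break,
        }
    }
    handled
}
```

Passing the delay in as a closure keeps the loop independent of how the random duration is produced, so a fixed delay can be used in tests.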
That sounds like a good idea! The reason we haven't done it yet is that there were other, more important features to implement, and we're also quite busy. It's marked as a good first issue, so it is one of the easy ones, no worries :) Some require a bit more understanding of other parts of the code, but this one is fairly simple. I'll assign you :)
Amazing, I'll get right to it. This will inevitably slow down the whole thing, so I guess we want the delay to be as low as possible. Also, is using the rand lib acceptable? Edit: Oh, also, since I am kinda new to GitHub too, should I fork, create a new branch, and then push to your master from my feature branch?
I have zero idea about what a good delay would be haha. I'm sure there are some articles related to scraping that can help. Using a lib if you can't find the required function in the standard lib is perfectly fine! And regarding the process: you should fork, make your changes, and then open a pull request against our master :) We'll review it then. Thanks again!
Alright, I have some good and some bad news. I added a base of 2 seconds and an extra random number from 0 to 5, so in total the delay would be between 2 and 7 seconds. This means running Oh, and the good news is that it's working with a random delay now.
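The delay described above (a 2 second base plus a random 0 to 5 seconds) could be computed roughly like this. This is a sketch under assumptions, not the code from the pull request: to stay dependency-free it uses a small xorshift generator as a stand-in for the rand lib mentioned earlier, and `BASE_DELAY_MS`, `MAX_EXTRA_MS`, `XorShift64` and `random_delay` are hypothetical names.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Values taken from the comment above: 2 s base, up to 5 s extra.
const BASE_DELAY_MS: u64 = 2_000;
const MAX_EXTRA_MS: u64 = 5_000;

/// Tiny xorshift64 generator, used here only as a stand-in for a real
/// random number library. Not suitable for anything security related.
struct XorShift64(u64);

impl XorShift64 {
    /// Seeds the generator from the system clock.
    fn from_clock() -> Self {
        let nanos = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_nanos() as u64)
            .unwrap_or(0);
        // xorshift must never start from zero, so force the low bit on.
        XorShift64(nanos | 1)
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Returns a delay between 2 and 7 seconds, inclusive.
fn random_delay(rng: &mut XorShift64) -> Duration {
    // The modulo introduces a negligible bias, which is fine for a delay.
    let extra_ms = rng.next_u64() % (MAX_EXTRA_MS + 1);
    Duration::from_millis(BASE_DELAY_MS + extra_ms)
}
```

A worker would create one generator with `XorShift64::from_clock()` and call `random_delay(&mut rng)` after each handled URL. With a roughly uniform spread over 2 to 7 seconds, the average wait is about 4.5 seconds per URL for each worker that sleeps this way.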
Thanks for the work, you rock :) With this option enabled, yes, SuckIT will be slow, but otherwise the performance will be the same as before.
This is needed to avoid IP banning