How to avoid getting banned/blocked? #15266

Closed
zefoo opened this issue Jan 16, 2018 · 4 comments

Comments

@zefoo zefoo commented Jan 16, 2018

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like this: [x])
  • Use the Preview tab to see what your issue will actually look like

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2018.01.14. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2018.01.14

Before submitting an issue make sure you have:

  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

I've read the docs about how to know when an IP gets banned/blocked. I've read bug reports of people trying to figure out if they've been blocked/banned by YouTube. I've read that the solution is to contact YouTube to get an IP whitelisted, or in some cases to go to the browser, enter a captcha, and restart youtube-dl.

I haven't been blocked/banned, only looking at my options to avoid it from happening in the first place.

  1. How do I not get blocked/banned in the first place? Aside from using proxies. No more than 60 requests an hour? A minute? Based off what you experienced folks have seen, what is acceptable to avoid getting banned? (I have a digital ocean droplet that seems to currently work)

  2. If I do decide to go proxies, what proxy service do you recommend? (This would likely come way later, ideally I don't want to get banned in the first place) I've used Stormproxies before on a craigslist scraping research project, it worked well.

  3. If YouTube does block an IP, do they release the block after 1 day? 3 days? A week? Any insight here? Or is the only option to request a whitelist? I see some people reporting a ban lifted, although I believe that was after a month.

A bunch of these questions are trial and error. Once we have clarity here, or if someone who is a heavy user of youtube-dl knows, this could be good to add to the docs too.

Thank you for your time, and for such a great project.

@dstftw dstftw (Collaborator) commented Jan 16, 2018

There are no clear answers for these questions.

@dstftw dstftw closed this Jan 16, 2018

@zefoo zefoo (Author) commented Jan 16, 2018

@dstftw I'm not looking for exact answers, simply estimates or "I've downloaded 1 video a minute and never had issues". Any more insight here, from anyone? Thank you.

@Hrxn Hrxn commented Jan 16, 2018

  1. It's hard to get banned in the first place. I don't think the number of videos is relevant at all. You should be able to easily download dozens of gigabytes in one run. Connecting from a data center (Digital Ocean?) might be an issue: if they identify such address ranges, they might put some limitations in place. Access from a residential IP address (i.e. dynamic IP) is definitely the safer way.

  2. Not sure. Any rotating proxy. Something like Stormproxies for example, yes. Because they "simulate" access via Residential IP, right?

  3. It depends. Not sure what you have to do to get an IP address banned in the first place, to be honest.
    Normal service providers don't use address bans if avoidable, because you'll block innocent accounts as well on a shared network, intranet, university network etc. Only after heavy abuse from one specific address, then yes, maybe. While not permanent, the duration of the ban will vary and only an insider can give you an exact answer here.

Okay, now I got to ask: What are you actually planning to do? I'm trying to understand the use case here..
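
For readers who still want to space out their requests, here is a minimal sketch of a pacing helper. It is not part of youtube-dl: the name `paced` and the default delays of 5 to 15 seconds are assumptions made for illustration. As noted above, nobody in this thread knows what thresholds YouTube applies, if any.

```python
import random
import time


def paced(items, min_delay=5.0, max_delay=15.0, sleep=time.sleep):
    """Yield each item, pausing a random interval between consecutive items.

    The delay bounds are illustrative guesses, not documented YouTube limits.
    """
    if min_delay < 0 or max_delay < min_delay:
        raise ValueError("need 0 <= min_delay <= max_delay")
    for index, item in enumerate(items):
        # No pause before the first item; a jittered pause before every later one.
        if index > 0:
            sleep(random.uniform(min_delay, max_delay))
        yield item
```

Iterating with `for url in paced(urls):` and processing one URL per iteration keeps a randomized gap between requests. The `sleep` parameter exists only so the pause can be replaced in tests.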

@zefoo zefoo (Author) commented Jan 16, 2018

@Hrxn Thank you for taking the time to respond, I appreciate it.

  1. Dozens of gigabytes? Oh, cool, I had no idea. That's basically the type of answer I'm looking for. Residential IPs are better, got it.

  2. Yup.

  3. Oh, I thought "getting banned" would be more common. For example, with Craigslist, if I made too many requests in a day they blocked an IP for 24 hours. They may have had other logic at play. The news websites were similar. My only previous experience with anything crawling-related was with Craigslist or with a news website; all of this is for research purposes. Sentiment analysis, mostly. With YouTube, I'm not even downloading the original video, I'm only interested in the transcript. Speaking of which, this makes me wonder if the video is even downloaded if I select 'skip_download': True.

Sounds like I'm probably good with this, either way. I don't even know what YouTube would base a ban on either, e.g. a GB download limit, or too many requests? Sounds like it's not as common as I thought it would be.
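
On the `'skip_download': True` question, here is a minimal sketch of a transcript-only setup using youtube-dl's embedded Python API. The option keys are youtube-dl's documented `YoutubeDL` parameters; the helper names `transcript_only_options` and `fetch_transcripts` and the default values are assumptions for illustration. With `skip_download` set, the media file itself should not be fetched, though the video page and subtitle files are still requested.

```python
def transcript_only_options(languages=("en",), output_template="%(id)s.%(ext)s"):
    """Build YoutubeDL options that write subtitles but skip the media file."""
    return {
        "skip_download": True,       # do not download the video/audio itself
        "writesubtitles": True,      # write uploader-provided subtitles
        "writeautomaticsub": True,   # also write auto-generated captions
        "subtitleslangs": list(languages),
        "outtmpl": output_template,  # output filename template
    }


def fetch_transcripts(urls, options=None):
    """Download subtitles for the given URLs (hypothetical helper, needs network)."""
    # Imported lazily so the options helper works without youtube-dl installed.
    import youtube_dl

    with youtube_dl.YoutubeDL(options or transcript_only_options()) as ydl:
        ydl.download(list(urls))
```

Calling `fetch_transcripts(["<video URL>"])` would write subtitle files named after each video ID into the current directory, assuming the video has subtitles in the requested languages.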
