Fixed 4chan #107
Basically, when a request to 4chan carries a Referer header, it tells 4chan which host the request came from, so 4chan can detect that you are crawling and responds with a captcha. This change stops your host from being sent to 4chan as the referrer, so you no longer get a captcha.
However, a Host header is still sent to Cloudflare, which is necessary. It only contains i.4cdn.org, though.
As an example: If I ran this code at crawler.net
4chan doesn't like that many requests from the same referrer (and sending a referrer implies crawling as well). However, I'm not sure how page loading works on their end; they probably turn off referrers in their own code or have some sort of workaround. (Or, when browsing 4chan directly, the referrer would be 4chan.org.)
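
For illustration only, here is a rough sketch of the idea in Python. It is not the code from this change. The helper names `strip_referer` and `build_request` and the image path are hypothetical; the sketch just shows a request built without a Referer header, while the Host header is still derived from the URL (i.4cdn.org) when the request is sent.

```python
from urllib.request import Request


def strip_referer(headers):
    """Return a copy of `headers` with any Referer entry removed (case-insensitive)."""
    return {name: value for name, value in headers.items() if name.lower() != "referer"}


def build_request(url, headers):
    """Build a request that never carries a Referer header.

    urllib fills in the Host header from the URL when the request is
    actually sent, so it does not need to be set here.
    """
    return Request(url, headers=strip_referer(headers))


if __name__ == "__main__":
    # Hypothetical image URL and crawler host, used only for illustration.
    req = build_request(
        "https://i.4cdn.org/g/example.jpg",
        {"User-Agent": "Mozilla/5.0", "Referer": "https://crawler.net/"},
    )
    print(req.host)  # the host the request will be sent to
    print(sorted(name.lower() for name, _ in req.header_items()))
```

Filtering case-insensitively matters because header names may arrive as `Referer`, `referer`, or `REFERER` depending on where they were set.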