add extra useragent. #152

fazialnjd · 2022-11-14T19:39:10Z

Hi.
My crawling process makes many requests, and despite using a fake user agent, my IP is still blocked.
Please add more fake user agents, such as ones for iPhone and Android devices.
For example, look at the fake user agents on this site:

Please also add the ability to delete a fake user agent from the list of fake user agents, so that it is not used again
and I can avoid being blocked.
Thank you.

melroy89 · 2022-11-14T22:16:43Z

Maybe you should also try to limit the number of requests you make per second or per minute. Since your IP is banned now, no fake user agent string will help you with that.

If you are using the Scrapy framework, for example, you have an option called DOWNLOAD_DELAY:

The amount of time (in secs) that the downloader should wait before downloading consecutive pages from the same website. This can be used to throttle the crawling speed to avoid hitting servers too hard.

See also another scrapy option called CONCURRENT_REQUESTS_PER_DOMAIN.
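
As a rough illustration, the two options could be combined in a Scrapy project's settings.py like this. The values are assumptions chosen for the example, not recommendations from the Scrapy documentation, and AUTOTHROTTLE_ENABLED is an additional Scrapy setting that was not mentioned above.

```python
# settings.py of a Scrapy project (illustrative values, tune them per target site)

# Wait this many seconds between consecutive requests to the same website.
DOWNLOAD_DELAY = 2.0

# Allow only one request at a time per domain.
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Optional (not mentioned above): let Scrapy adapt the delay to server load.
AUTOTHROTTLE_ENABLED = True
```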

If, however, you use your own scripts without Scrapy, consider adding sleeps to your crawling process.
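
A minimal sketch of what such sleeps could look like in a plain Python script. The helper name `throttled` and the delay bounds are hypothetical, chosen only for this example; the randomized pause is one possible choice, not something a particular library requires.

```python
import random
import time
from typing import Callable, Iterable, Iterator


def throttled(
    urls: Iterable[str],
    min_delay: float = 2.0,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield URLs one at a time, pausing a random interval between them.

    The `sleep` parameter is injectable so the pacing can be tested
    without actually waiting.
    """
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every URL except the first one.
            sleep(random.uniform(min_delay, max_delay))
        yield url
```

In a crawler, the loop body would fetch each URL, for example `for url in throttled(links): fetch(url)`, where `fetch` stands in for whatever request function the script already uses.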

melroy89 · 2022-11-14T22:21:41Z

Also are you using Amazon AWS?

fazialnjd · 2022-11-14T23:01:17Z

> Also are you using Amazon AWS?

No, I am not.

I use the Python googlesearch library, which is based on requests and beautifulsoup, and I have also used time.sleep.
I have an API that receives almost 200 Google page links per request, and I get blocked when I send more requests.
The IP is blocked for a few hours, and after that I can make requests again. (The exact duration of the block is not known.)

I am trying to prevent IP bans by using fake user agents and a proxy.
Your list has 260 fake user agents, and I choose from them randomly, so some of them may be used several times. I therefore need more fake user agents;
I wish the number could be increased to 500.
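
For illustration, the repetition and removal described here can be handled on the caller's side with a small wrapper around any list of user agent strings. This is only a sketch: `UserAgentPool` is a hypothetical class written for this example and is not part of this library's API. It also only changes the User-Agent header, so it does nothing against a ban that is tied to the IP address.

```python
import random
from typing import List


class UserAgentPool:
    """Hand out user agents in shuffled order, with no repeats within a pass."""

    def __init__(self, user_agents: List[str]) -> None:
        # De-duplicate while keeping the original order.
        self._agents = list(dict.fromkeys(user_agents))
        self._queue: List[str] = []

    def next(self) -> str:
        """Return the next user agent; reshuffle once every one was used."""
        if not self._agents:
            raise LookupError("no user agents left in the pool")
        if not self._queue:
            self._queue = random.sample(self._agents, len(self._agents))
        return self._queue.pop()

    def remove(self, user_agent: str) -> None:
        """Drop a user agent so it is never handed out again."""
        self._agents = [ua for ua in self._agents if ua != user_agent]
        self._queue = [ua for ua in self._queue if ua != user_agent]
```

Compared with picking at random on every request, this uses each string once before any of them comes up again, and `remove` lets the caller retire a string that seems to trigger blocking.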

Thanks for the help.

melroy89 · 2022-11-15T20:49:37Z

Related to: #109
and: #61

We want to switch to another source and also add mobile platforms.

melroy89 closed this as completed Mar 21, 2023
