Add PetalBot to bots list
#544
Closed
See GitHub issue #540.
Some notes on this bot.
This bot says it's from aspiegel, and from day one we have seen it coming from HUAWEI CLOUD servers! After the USA started banning Huawei, they admitted that this bot is theirs and that it is going to be their search engine; see here: https://consumer.huawei.com/en/mobileservices/search/
Screenshot, as they will probably take down their webpage:
Since the USA started trying to ban Huawei, TikTok, etc., this bot has been hitting our test servers over 1000 times a day!
We label this bot as bad, and its user agent keeps changing as time goes on. Sometimes they use fake user agents to hide, yet the same IP address then hits your server again in less than a second and then displays a user agent. The regex in this PR allows for a bot version number. In the future PetalBot may look like this:
PetalBot/1.0
so we decided to allow a version number now.
Note that this bot is always changing and is crawling the internet to gather more information than it needs to do its search engine job! In our view it is data gathering for the Chinese government. The search engine aims to crawl apps only, yet it keeps coming back to our test servers, hitting them thousands of times a day!
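As a rough illustration of the version-number idea, the sketch below matches the bot name with or without a trailing `/1.0`-style version. The pattern, the names `PETALBOT_RE`, `is_petalbot` and `petalbot_token`, and the sample strings are assumptions made up for this example; they are not the regex added in this PR.

```python
import re

# Hypothetical pattern (not the regex from this PR): the literal name
# "PetalBot", optionally followed by "/" and a dotted version number
# such as 1.0 or 2.3.1. Matching is case-insensitive.
PETALBOT_RE = re.compile(r"PetalBot(?:/\d+(?:\.\d+)*)?", re.IGNORECASE)


def is_petalbot(user_agent: str) -> bool:
    """Return True if the user-agent string contains the PetalBot token."""
    return PETALBOT_RE.search(user_agent) is not None


def petalbot_token(user_agent: str):
    """Return the matched token (with version, if present) or None."""
    match = PETALBOT_RE.search(user_agent)
    return match.group(0) if match else None


if __name__ == "__main__":
    # Made-up sample strings for illustration only.
    samples = [
        "Mozilla/5.0 (compatible; PetalBot)",
        "Mozilla/5.0 (compatible; PetalBot/1.0)",
        "Mozilla/5.0 (X11; Linux x86_64) Firefox/80.0",
    ]
    for ua in samples:
        print(ua, "->", petalbot_token(ua))
```

Making the version group optional means the same rule keeps matching if the bot starts appending a version later, which is the reason given above for adding it now.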
User agents hitting our test servers:
UA: