Merge pull request #2477 from a-chacon/add_bot_user_agent_generator
feat: add bot_user_agent method to generate web crawlers' user agents
thdaraujo committed Jul 4, 2022
2 parents 85016fe + c4c5f65 commit 2333f9e
Showing 4 changed files with 66 additions and 1 deletion.
6 changes: 5 additions & 1 deletion doc/default/internet.md
@@ -1,7 +1,7 @@
# Faker::Internet

```ruby
# Keyword arguments: name, username, email, password, domain_name, user_agent, uuid etc...
Faker::Internet.user #=> { username: 'alexie', email: 'alexie@example.net' }
Faker::Internet.user('username', 'email', 'password') #=> { username: 'alexie', email: 'alexie@example.net', password: 'DtEf9P8wS31iMyC' }

@@ -79,5 +79,9 @@ Faker::Internet.slug(words: 'foo bar', glue: '-') #=> "foo-bar"
Faker::Internet.user_agent #=> "Mozilla/5.0 (compatible; MSIE 9.0; AOL 9.7; AOLBuild 4343.19; Windows NT 6.1; WOW64; Trident/5.0; FunWebProducts)"
Faker::Internet.user_agent(vendor: :firefox) #=> "Mozilla/5.0 (Windows NT x.y; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0"

# Keyword arguments: vendor
Faker::Internet.bot_user_agent #=> "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Faker::Internet.bot_user_agent(vendor: :duckduckbot) #=> "Mozilla/5.0 (compatible; DuckDuckBot-Https/1.1; https://duckduckgo.com/duckduckbot)"

Faker::Internet.uuid #=> "929ef6ef-b11f-38c9-111b-accd67a258b2"
```
17 changes: 17 additions & 0 deletions lib/faker/default/internet.rb
@@ -521,6 +521,23 @@ def user_agent(legacy_vendor = NOT_GIVEN, vendor: nil)
sample(agents)
end

##
# Generates a web crawler's user agent.
#
# @return [String]
#
# @param vendor [String, Symbol] Name of the vendor. Supported vendors are googlebot, bingbot, duckduckbot, baiduspider and yandexbot. Any other value selects a random vendor.
#
# @example
# Faker::Internet.bot_user_agent #=> "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
# Faker::Internet.bot_user_agent(vendor: 'googlebot') #=> "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/99.0.4844.84 Safari/537.36"
# Faker::Internet.bot_user_agent(vendor: 'bingbot') #=> "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/86.0.4240.68 Safari/537.36 Edg/86.0.622.31"
def bot_user_agent(vendor: nil)
agent_hash = translate('faker.internet.bot_user_agent')
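# Use the requested vendor's list, or a random vendor's list when the argument is missing or unknown.
agents = vendor.respond_to?(:to_sym) && agent_hash[vendor.to_sym] || agent_hash[sample(agent_hash.keys)]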
sample(agents)
end

##
# Generated universally unique identifier
#
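
The fallback in `bot_user_agent` can be tried outside Faker. The following is a minimal sketch, not the library's code: `SAMPLE_BOT_AGENTS` and `pick_bot_user_agent` are hypothetical names, the hash stands in for `translate('faker.internet.bot_user_agent')`, and `Array#sample` replaces Faker's `sample` helper.

```ruby
# Hypothetical stand-in for the translated locale hash (vendor => list of agents).
SAMPLE_BOT_AGENTS = {
  googlebot: ['Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'],
  yandexbot: ['Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)']
}.freeze

# Hypothetical helper that mirrors the lookup expression in bot_user_agent.
def pick_bot_user_agent(vendor = nil, agents: SAMPLE_BOT_AGENTS)
  # nil and Integer values do not respond to to_sym, so they skip the lookup.
  # Unknown keys return nil from the hash. Both cases fall through to a random vendor.
  known = vendor.respond_to?(:to_sym) && agents[vendor.to_sym]
  list = known || agents[agents.keys.sample]
  list.sample
end

puts pick_bot_user_agent(:googlebot)  # always a Googlebot string
puts pick_bot_user_agent('yandexbot') # strings are converted with to_sym
puts pick_bot_user_agent(:ie)         # unknown vendor, random entry
puts pick_bot_user_agent(1)           # not symbol-like, random entry
```

Because `&&` binds tighter than `||`, the expression reads as "(symbol-like and known) or random", which is why invalid arguments never raise.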
29 changes: 29 additions & 0 deletions lib/locales/en/internet.yml
@@ -124,3 +124,32 @@ en:
- Opera/9.80 (X11; Linux i686; Ubuntu/14.10) Presto/2.12.388 Version/12.16
safari:
- Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A
bot_user_agent:
googlebot:
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/83.0.4103.122 Safari/537.36
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/99.0.4844.84 Safari/537.36
- Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/87.0.4280.90 Safari/537.36
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Safari/537.36 Googlebot-Image/1.0
bingbot:
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/86.0.4240.68 Safari/537.36 Edg/86.0.622.31
- Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534 +(KHTML, like Gecko) BingPreview/1.0b
- Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0; BingPreview/1.0b) like Gecko
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/98.0.4758.102 Safari/537.36
- Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
duckduckbot:
- DuckDuckBot-Https/1.1; (+https://duckduckgo.com/duckduckbot)
- Mozilla/5.0 (compatible; DuckDuckBot-Https/1.1; https://duckduckgo.com/duckduckbot)
- DuckDuckBot/1.1; (+http://duckduckgo.com/duckduckbot.html)
- DuckDuckBot-Https/1.1; (+https://duckduckgo.com/duckduckbot)
- Mozilla/5.0 (compatible; DuckDuckBot-Https/1.1; https://duckduckgo.com/duckduckbot)
baiduspider:
- Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; Baiduspider-render/2.0 ; +http://www.baidu.com/search/spider.html)
- Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; Baiduspider-render/2.0 ; Smartapp; +http://www.baidu.com/search/spider.html)
- Mozilla/5.0 (compatible; Baiduspider-render/2.0 ; +http://www.baidu.com/search/spider.html)
yandexbot:
- Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
- Mozilla/5.0 (compatible; YandexDirect/3.0; +http://yandex.com/bots)
- Mozilla/5.0 (compatible; YandexMetrika/2.0; +http://yandex.com/bots yabs01)
- Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.268
- Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)
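
The locale data above maps each vendor key to a list of user agent strings. As a rough illustration of that shape, the sketch below parses a small fragment with Ruby's standard `yaml` library. `LOCALE_FRAGMENT` and `BOT_AGENTS` are hypothetical names, and Faker itself loads locales through its translation layer rather than like this.

```ruby
require 'yaml'

# A shortened copy of the structure added to internet.yml (indentation restored by hand).
LOCALE_FRAGMENT = <<~YAML
  bot_user_agent:
    duckduckbot:
      - DuckDuckBot/1.1; (+http://duckduckgo.com/duckduckbot.html)
      - Mozilla/5.0 (compatible; DuckDuckBot-Https/1.1; https://duckduckgo.com/duckduckbot)
    yandexbot:
      - Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
YAML

# symbolize_names turns the vendor keys into symbols, matching the
# agent_hash[vendor.to_sym] lookup used by bot_user_agent.
BOT_AGENTS = YAML.safe_load(LOCALE_FRAGMENT, symbolize_names: true)[:bot_user_agent]

BOT_AGENTS.each do |vendor, agents|
  puts "#{vendor}: #{agents.length} entries"
end
```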
15 changes: 15 additions & 0 deletions test/faker/default/test_faker_internet.rb
@@ -338,6 +338,21 @@ def test_user_agent_with_invalid_argument
assert @tester.user_agent(vendor: 1).match(/Mozilla|Opera/)
end

def test_bot_user_agent_with_no_argument
# The BingPreview entries in the bingbot list contain neither "Bot" nor "bot".
assert @tester.bot_user_agent.match(/Baiduspider|Bot|bot|BingPreview/)
end

def test_bot_user_agent_with_valid_argument
assert @tester.bot_user_agent(vendor: :duckduckbot).match(/DuckDuckBot/)
assert @tester.bot_user_agent(vendor: 'duckduckbot').match(/DuckDuckBot/)
end

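def test_bot_user_agent_with_invalid_argument
# Invalid vendors fall back to a random vendor, so the pattern must cover every entry,
# including the BingPreview strings in the bingbot list.
assert @tester.bot_user_agent(vendor: :ie).match(/Baiduspider|Bot|bot|BingPreview/)
assert @tester.bot_user_agent(vendor: nil).match(/Baiduspider|Bot|bot|BingPreview/)
assert @tester.bot_user_agent(vendor: 1).match(/Baiduspider|Bot|bot|BingPreview/)
end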

def test_uuid
uuid = @tester.uuid
assert_equal(36, uuid.size)
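
An assertion on randomly sampled output is only stable if its pattern matches every entry that can be returned. The sketch below checks a candidate pattern against three of the bingbot strings from the locale file. `BINGBOT_SAMPLE` and `uncovered` are hypothetical names used only for illustration.

```ruby
# Three of the five bingbot entries from lib/locales/en/internet.yml.
BINGBOT_SAMPLE = [
  'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/86.0.4240.68 Safari/537.36 Edg/86.0.622.31',
  'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534 +(KHTML, like Gecko) BingPreview/1.0b',
  'Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0; BingPreview/1.0b) like Gecko'
].freeze

# Returns the entries that the pattern fails to match.
def uncovered(agents, pattern)
  agents.reject { |agent| agent.match?(pattern) }
end

narrow = /Baiduspider|Bot|bot/
wide = /Baiduspider|Bot|bot|BingPreview/

puts uncovered(BINGBOT_SAMPLE, narrow).length # the two BingPreview entries are missed
puts uncovered(BINGBOT_SAMPLE, wide).length   # every entry is covered
```

Running the same check over the full locale hash before choosing a test pattern avoids intermittent failures when a rarely sampled entry comes up.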