False Positives with Robot Detection #136

Open
McA opened this Issue Apr 9, 2018 · 2 comments

McA commented Apr 9, 2018

I just stumbled on some user agent strings that version 3.16 falsely detects as robots. They come from devices made by the Chinese vendor 'CUBOT' (https://en.wikipedia.org/wiki/Cubot).

One example:
Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/39.0.0.0 Mobile Safari/537.36

Is there a way to exclude these from the 'general' bot detection based on substring matching?

Best regards
Andreas
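
For illustration, the false positive can be reproduced with a minimal sketch of fragment-based detection. The fragment list and the `looks_like_robot` helper below are hypothetical stand-ins, not the library's actual code. The point is only that a case-insensitive search for `bot` also matches the device name `CUBOT_NOTE_S`.

```python
# Hypothetical fragment list; the library's real list is not shown in this thread.
ROBOT_FRAGMENTS = ("bot", "crawler", "spider")


def looks_like_robot(user_agent: str) -> bool:
    """Naive check: flag any UA that contains a robot fragment (case-insensitive)."""
    ua = user_agent.lower()
    return any(fragment in ua for fragment in ROBOT_FRAGMENTS)


cubot_ua = (
    "Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 "
    "Chrome/39.0.0.0 Mobile Safari/537.36"
)

# "cubot_note_s" contains "bot", so the phone is misclassified as a robot.
is_robot = looks_like_robot(cubot_ua)
```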

McA commented Apr 9, 2018

I thought I could attach a file with further user agent strings. Since I can't, here they are:
Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.93 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.84 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.107 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.111 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/39.0.0.0 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/39.0.0.0 Mobile Safari/537.36 ACHEETAHI/1
Mozilla/5.0 (Linux; Android 5.1; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/39.0.0.0 Mobile Safari/537.36 [Pinterest/Android]
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.83 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.84 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.111 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.83 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3271.3 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/59.0.3071.125 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/59.0.3071.125 Mobile Safari/537.36 [Pinterest/Android]
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/62.0.3202.84 Mobile Safari/537.36
Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/63.0.3239.111 Mobile Safari/537.36
Mozilla/5.0 (Linux; U; Android 5.1; en-US; CUBOT_NOTE_S Build/LMY47I) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 UCBrowser/11.3.8.976 U3/0.8.0 Mobile Safari/534.30

Owner

oalders commented Apr 9, 2018

Hi Andreas,

Yes, I'm sure we can handle this case. I don't personally have the spare time to implement it right now, but if someone can implement it I'm happy to review and release.

All the best,

Olaf

oalders added a commit that referenced this issue Oct 10, 2018

Merge pull request #140 from reneeb/reneeb_fix136
Define exceptions for ROBOT_FRAGMENTS (fix for #136)
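
The thread does not show how the pull request implements these exceptions, so the following is only a sketch of one possible approach, under stated assumptions: known harmless tokens are removed from the user agent before the fragment search runs. Both lists and the helper name are hypothetical.

```python
# Hypothetical lists; the real ROBOT_FRAGMENTS contents are not shown in this thread.
ROBOT_FRAGMENTS = ("bot", "crawler", "spider")
ROBOT_FRAGMENT_EXCEPTIONS = ("cubot",)  # tokens that contain a fragment but are not robots


def looks_like_robot(user_agent: str) -> bool:
    """Fragment check that ignores known exceptions such as the CUBOT vendor name."""
    ua = user_agent.lower()
    # Strip the exception tokens first so they cannot trigger a fragment match.
    for exception in ROBOT_FRAGMENT_EXCEPTIONS:
        ua = ua.replace(exception, "")
    return any(fragment in ua for fragment in ROBOT_FRAGMENTS)


cubot_ua = (
    "Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S Build/MRA58K) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/63.0.3239.111 Mobile Safari/537.36"
)
googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

cubot_is_robot = looks_like_robot(cubot_ua)
googlebot_is_robot = looks_like_robot(googlebot_ua)
```

Because only the exception token itself is removed, a user agent that mentions CUBOT and also carries a genuine robot fragment elsewhere would still be flagged.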

oalders added a commit that referenced this issue Oct 10, 2018

v3.18
    - Define exceptions for ROBOT_FRAGMENTS (fix for #136) (GH#140) (Renee)
    - Add another test for SeznamBot. UA string was provided in #131. (GH#142) (Renee)
    - Fix (GH#119): add researchscan.comsys.rwth-aachen.de as a robot (GH#141) (Renee)