This repository has been archived by the owner on Sep 7, 2023. It is now read-only.

Commit

Merge pull request #1749 from unixfox/patch-1
[fix] Force Google old UI with a new user agent
asciimoo committed Nov 26, 2019
2 parents 42d5e2c + 8f51430 commit 2a527b8
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions searx/engines/google.py
@@ -199,8 +199,9 @@ def request(query, params):
     params['headers']['Accept-Language'] = language + ',' + language + '-' + country
     params['headers']['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'

-    # Force Internet Explorer 12 user agent to avoid loading the new UI that Searx can't parse
-    params['headers']['User-Agent'] = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"
+    # Force Safari 3.1 on Mac OS X (Leopard) user agent to avoid loading the new UI that Searx can't parse
+    params['headers']['User-Agent'] = ("Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4)"
+                                       "AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1")

     params['google_hostname'] = google_hostname

7 comments on commit 2a527b8

@return42
Contributor

I doubt that this patch does what you think. The result of the string concatenation is:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4)AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1

Do you see the missing space after the closing parenthesis, in front of 'AppleWebK..'? Is this really a valid (or expected) value for a User-Agent field?
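
The effect can be reproduced in isolation. A minimal sketch, assuming nothing beyond standard Python (the variable name `patched` is only for illustration): adjacent string literals inside parentheses are joined by the parser with no separator, so any space has to be written inside one of the literals.

```python
# Adjacent string literals are concatenated by the Python parser with no separator.
patched = ("Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4)"
           "AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1")

# No space ends up between the closing parenthesis and "AppleWebKit".
print(")AppleWebKit" in patched)   # True
print(") AppleWebKit" in patched)  # False
```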

@unixfox
Member

@unixfox unixfox commented on 2a527b8 Dec 1, 2019

The concatenation is incorrect, but the patch still works thanks to the closing parenthesis just before AppleWebKit.
You can see it in action here: https://developers.whatismybrowser.com/useragents/parse/#parse-useragent
Try removing the parenthesis and you will see that whatismybrowser.com interprets it as a different user agent.

Anyway, in the end it's my fault: I don't often work with Python, which is why I made this mistake when I tried to make the patch PEP 8 compliant.

@return42
Contributor

> You can see it in action here: https://developers.whatismybrowser.com/useragents/parse/#parse-useragent

Ah, thanks for sharing the link. One note: we do not know how the peer interprets it .. for developers.whatismybrowser.com it seems to be OK to omit the space, but you already said ..

> Anyway, in the end it's my fault: I don't often work with Python, which is why I made this mistake when I tried to make the patch PEP 8 compliant.

Ah, OK .. no problem. I recommend an editor with good Python support. I prefer Emacs, others prefer PyCharm. Such editors can help with auto-indent and more:

params['headers']['User-Agent'] = (
    "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4)"
    " AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1"
)
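
One way to make a forgotten separator impossible is to keep the tokens apart and join them explicitly. This is only a sketch of an alternative, not what the patch or searx does; the names `UA_TOKENS` and `user_agent` are hypothetical.

```python
# Hypothetical alternative: keep each User-Agent token separate and let
# str.join insert the single spaces, so a separator cannot be forgotten.
UA_TOKENS = [
    "Mozilla/5.0",
    "(Macintosh; U; Intel Mac OS X 10_5_4)",
    "AppleWebKit/525.18",
    "(KHTML, like Gecko)",
    "Version/3.1.2",
    "Safari/525.20.1",
]
user_agent = " ".join(UA_TOKENS)
print(user_agent)
```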

Another hint: do you know The Hitchhiker’s Guide to Python? It is a really good starting point .. and reference .. for me and others .. all the time :)

If I can help you any more with Python, don't hesitate to contact me directly at markus.heiser@darmarit.de

@unixfox
Member

@unixfox unixfox commented on 2a527b8 Dec 2, 2019

> Ah, thanks for sharing the link. One note: we do not know how the peer interprets it .. for developers.whatismybrowser.com it seems to be OK to omit the space, but you already said ..

I do think that Google parses the User-Agent the same way as whatismybrowser, because I get the old UI with this user agent:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4)AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1

But not with this one:

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1
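
Why a tolerant parser might accept the first string but not the second can be sketched with a toy tokenizer. This is an assumption for illustration only: `TOKEN` and `products` are hypothetical names, and nothing here is known about how Google or whatismybrowser actually parse the header.

```python
import re

# Hypothetical tokenizer, NOT Google's or whatismybrowser's actual parser:
# a token is either a "(comment)" or a "product/version" pair.
TOKEN = re.compile(r"\((?P<comment>[^)]*)\)|(?P<product>[^\s/()]+)/(?P<version>[^\s()]+)")


def products(user_agent):
    """Return the product names found outside parenthesized comments."""
    return [m.group("product") for m in TOKEN.finditer(user_agent) if m.group("product")]


missing_space = ("Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4)"
                 "AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1")
missing_paren = ("Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4"
                 "AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1")

print(products(missing_space))  # AppleWebKit is still found
print(products(missing_paren))  # AppleWebKit is swallowed by the unclosed comment
```

In this sketch the closing parenthesis acts as a delimiter on its own, so the missing space does not hide the AppleWebKit token, whereas the missing parenthesis does.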

> Ah, OK .. no problem. I recommend an editor with good Python support. I prefer Emacs, others prefer PyCharm. Such editors can help with auto-indent and more:

I'm using Visual Studio Code, which is a good editor with the Python extension. The extension does have auto-indent, but nothing that handles the concatenation automatically.
I made the mistake mostly because I'm not used to the concatenation style that PEP 8 requires.

@return42
Contributor

@return42 return42 commented on 2a527b8 Dec 3, 2019

> I do think that Google parses the User-Agent the same way as whatismybrowser, because I get the old UI with this user agent:

Thanks for verifying both headers (the one with the space and the one without the needed space) .. This is the reason why I said first:

> I doubt that this patch does what you think.

I guess your conclusion is false: for whatismybrowser it does not matter if the space is missing, for Google it matters .. this is what your test with both headers shows.

In the end Google thinks it can't detect the browser, so it responds with the old UI ... that is my guess and why I started this discussion.

@unixfox
Member

@unixfox unixfox commented on 2a527b8 Dec 3, 2019

No, it doesn't matter for Google; read the user agents again. The first one is the same as in the patch.
The second one is there to check that the parenthesis is indeed important for getting a valid user agent.
Google interprets it the same way as whatismybrowser.

@return42
Contributor

> read the user agents again. The first one is the same as in the patch. ... The second one is there to check that the parenthesis is indeed important

Aargh, my fault .. I hadn't realized that you were talking about the "missing parenthesis", while I was talking about the "missing space" .. after this closing parenthesis ...

Thanks for clarifying!
