[fix] Force Google old UI #1597

Merged
merged 2 commits into from May 29, 2019

4 participants
@unixfox
Contributor

commented May 22, 2019

More details about this PR here: #1596.

In summary: Google sometimes serves its new UI, which Searx can't parse. By sending the user agent of Internet Explorer 12, we get the old UI by default, because Google knows that IE doesn't support the new one.

@immanuelfodor

left a comment

I tested your PR with the commands below and it did not solve the original problem. I'm still getting the message "Sorry! we didn't find any results. Please use another query or search in more categories." :S

git clone https://github.com/asciimoo/searx.git ~/docker/searx

# enable checkout to "hidden" PR branch
cat ~/docker/searx/.git/config
grep "origin/pr" ~/docker/searx/.git/config
sed -i -E 's#(asciimoo\/searx\.git)#\1\n        fetch = +refs/pull/*/head:refs/remotes/origin/pr/*#' ~/docker/searx/.git/config
cat ~/docker/searx/.git/config

# checkout to https://github.com/asciimoo/searx/pull/1597
cd ~/docker/searx
git fetch origin
git checkout pr/1597
git config user.email randomuser@gmail.com
git config user.name randomuser
git merge master
git status

# start testing
docker build -t searx -f ~/docker/searx/Dockerfile ~/docker/searx/
docker run -d --name searx -p 8888:8888 -e IMAGE_PROXY=False -e BASE_URL=https://domain.tld -e TINI_SUBREAPER=True searx

The docker logs contain only messages from werkzeug. If you have any idea how to get more verbose output for debugging, I can try it.

@immanuelfodor

commented May 25, 2019

I tried to debug with these steps:

docker exec -it searx sh
~ $ vi searx/engines/google.py

Then add to line ~213:

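    # Dump Google's raw response to a file and to the container log for inspection
    with open('google.html', 'wb') as f:  # binary mode, since we write encoded bytes
        f.write(resp.text.encode('utf-8'))
    print(resp.text.encode('utf-8'))

Restart the container and follow the logs:

docker restart searx
docker logs -f searx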

Made a search and the classes are still obfuscated:

<div class="ZINbbc xpd O9g5cc uUPGi"><div><div class="jfp3ef"><a href="/url?q=https://www.linkedin.com/company/asdasdasdasds&amp;sa=U&amp;ved=2ahUKEwjq0frGzLbiAhXHLlAKHSTGBBsQFjAEegQIBxAB&amp;usg=AOvVaw3q4RreQiKUrUXDYy9QFB4G"><div class="BNeawe vvjwJb AP7Wnd">asdasd | LinkedIn</div><div class="BNeawe UPmit AP7Wnd">https://www.linkedin.com › company › asdasdasdasds</div></a></div><div class="NJM3tb"></div><div class="jfp3ef"><div><div class="BNeawe s3v9rd AP7Wnd"><div><div><div class="BNeawe s3v9rd AP7Wnd">Learn about working at asdasd. Join LinkedIn today for free. See who you know at asdasd, leverage your professional network, and get hired.</div></div></div></div></div></div></div></div>
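A quick way to see which class names a dumped response actually uses is to collect them from the `<div>` elements. The following is a minimal sketch using only the Python 3 standard library; it is not part of searx, and the names `DivClassCollector` and `div_classes` are hypothetical. The sample string is a shortened version of the markup above.

```python
from html.parser import HTMLParser


class DivClassCollector(HTMLParser):
    """Collects the class tokens seen on <div> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == 'div':
            for name, value in attrs:
                if name == 'class' and value:
                    self.classes.update(value.split())


def div_classes(html_text):
    """Return the set of class names used on <div> elements in html_text."""
    collector = DivClassCollector()
    collector.feed(html_text)
    return collector.classes


# Shortened sample taken from the response quoted above
sample = ('<div class="ZINbbc xpd O9g5cc uUPGi">'
          '<div class="BNeawe vvjwJb AP7Wnd">asdasd | LinkedIn</div></div>')
print(sorted(div_classes(sample)))
```

Running it against the saved `google.html` (read the file and pass its text to `div_classes`) would show at a glance whether the response still uses the obfuscated class names.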
@unixfox

Contributor Author

commented May 25, 2019

Can you check whether line 203 of your searx/engines/google.py file is exactly the same as the line in my PR: https://github.com/asciimoo/searx/pull/1597/files#diff-ed0043204dab3ab8a98a1e916146b068R203?

@immanuelfodor

commented May 25, 2019

Yes, it's the same on the pr/1597 branch.

I also tried many other user agent strings there, starting from IE9, but the responses are always the same.

@unixfox

Contributor Author

commented May 25, 2019

That's strange... Do you get the error message "Sorry! we didn't find any results. Please use another query or search in more categories." on every search? Before my patch I sometimes got that error, but not on every search.
In which country is your instance located?

@immanuelfodor

commented May 25, 2019

Yes, each and every search. I noticed this behavior this morning, which is why I came here right away. Previously, a few taps on the search button solved it, but since today I can't get a response at all. The instance is in Budapest, Hungary.

@@ -199,6 +199,9 @@ def request(query, params):
     params['headers']['Accept-Language'] = language + ',' + language + '-' + country
     params['headers']['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'

+    # Force Internet Explorer 12 user agent to avoid loading the new UI that Searx can't parse
+    params['headers']['user-agent'] = "Mozilla / 5.0(MSIE 12.0; Trident / 7.0; rv: 11.0) like Gecko"

@immanuelfodor

immanuelfodor May 25, 2019

I think I've found the problem here:

  • Your user agent string is malformed: it has extra spaces in some places and is missing spaces in others.
  • The header name should be written as "User-Agent".

I tried with this line and now it works! (IE9 on Win7: https://en.wikipedia.org/wiki/Internet_Explorer_9#User_agent_string)

params['headers']['User-Agent'] = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

Please update the PR and I'll mark it as done :)
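To illustrate the two points above, here is a minimal sketch of setting the header on a plain `params` dict shaped like the one passed to `request(query, params)`. It is not the actual searx code: the helper name `force_legacy_user_agent` and the constant `IE9_WIN7_UA` are hypothetical. One thing that is certain is that a plain Python dict treats `user-agent` and `User-Agent` as different keys, so a lower-case key would sit next to any existing `User-Agent` entry instead of replacing it; how the HTTP layer then resolves the two is outside the scope of this sketch.

```python
# Hypothetical helper, not part of searx: forces an old-IE User-Agent on a
# params dict shaped like the one an engine's request() function receives.
IE9_WIN7_UA = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"


def force_legacy_user_agent(params, user_agent=IE9_WIN7_UA):
    headers = params.setdefault('headers', {})
    # Drop any differently-cased variants so only one User-Agent entry remains.
    for name in list(headers):
        if name.lower() == 'user-agent':
            del headers[name]
    headers['User-Agent'] = user_agent
    return params


# Example: a stray lower-case key is replaced, other headers are untouched.
params = {'headers': {'user-agent': 'something else', 'Accept': 'text/html'}}
force_legacy_user_agent(params)
print(params['headers'])
```

Removing the differently-cased duplicates first is a design choice for this sketch; it guarantees that exactly one user agent value reaches whatever sends the request.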

@unixfox

unixfox May 25, 2019

Author Contributor

Oh! Nice find! I used a user agent string that a website in the first Google results gave me, and it worked for me, which is why I didn't worry about the formatting.

@immanuelfodor

left a comment

Thanks for the edit. I approve the changes, although I don't have write access to the repo, so it just means it finally works for me :)

@kvch If you would be so kind as to have a look at this PR and maybe release a minor version: it fixes the broken Google search engine, which is a huge win.

@Pofilo

Pofilo approved these changes May 27, 2019

Collaborator

left a comment

We don't know how long it will work, but let's enjoy it while it lasts!
@kvch, I'll let you merge if you agree.

@dalf dalf merged commit cbd1ebd into asciimoo:master May 29, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed