Conversation
I tested your PR with these commands and it did not solve the original problem; I'm still getting the message "Sorry! we didn't find any results. Please use another query or search in more categories." :S
git clone https://github.com/asciimoo/searx.git ~/docker/searx
# enable checkout to "hidden" PR branch
cat ~/docker/searx/.git/config
grep "origin/pr" ~/docker/searx/.git/config
sed -i -E 's#(asciimoo\/searx\.git)#\1\n fetch = +refs/pull/*/head:refs/remotes/origin/pr/*#' ~/docker/searx/.git/config
cat ~/docker/searx/.git/config
# checkout to https://github.com/asciimoo/searx/pull/1597
cd ~/docker/searx
git fetch origin
git checkout pr/1597
git config user.email randomuser@gmail.com
git config user.name randomuser
git merge master
git status
# start testing
docker build -t searx -f ~/docker/searx/Dockerfile ~/docker/searx/
docker run -d --name searx -p 8888:8888 -e IMAGE_PROXY=False -e BASE_URL=https://domain.tld -e TINI_SUBREAPER=True searx
In the docker logs, there are only messages from werkzeug. If you have any idea how to debug this with more verbose output, I can try.
I tried to debug with these steps:
docker exec -it searx sh
~ $ vi searx/engines/google.py
Then add to line ~213:
file = open('google.html', 'w')
file.write(resp.text.encode('utf-8'))
print(resp.text.encode('utf-8'))
file.close()
docker restart searx
docker logs -f searx
Made a search and the classes are still obfuscated:
<div class="ZINbbc xpd O9g5cc uUPGi"><div><div class="jfp3ef"><a href="/url?q=https://www.linkedin.com/company/asdasdasdasds&sa=U&ved=2ahUKEwjq0frGzLbiAhXHLlAKHSTGBBsQFjAEegQIBxAB&usg=AOvVaw3q4RreQiKUrUXDYy9QFB4G"><div class="BNeawe vvjwJb AP7Wnd">asdasd | LinkedIn</div><div class="BNeawe UPmit AP7Wnd">https://www.linkedin.com › company › asdasdasdasds</div></a></div><div class="NJM3tb"></div><div class="jfp3ef"><div><div class="BNeawe s3v9rd AP7Wnd"><div><div><div class="BNeawe s3v9rd AP7Wnd">Learn about working at asdasd. Join LinkedIn today for free. See who you know at asdasd, leverage your professional network, and get hired.</div></div></div></div></div></div></div></div>
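For anyone repeating the dump step above on Python 3: writing encoded bytes to a file opened in text mode raises a TypeError there, so a variant that writes text with an explicit encoding may be safer. This is only a sketch; `dump_response_html` is a hypothetical helper name, and it assumes `resp.text` is a text string as in the steps above.

```python
import io


def dump_response_html(text, path="google.html"):
    """Write response text to `path` as UTF-8; return the number of characters written."""
    # io.open with an explicit encoding accepts text directly,
    # so no manual .encode('utf-8') call is needed.
    with io.open(path, "w", encoding="utf-8") as handle:
        return handle.write(text)
```

Inside the engine it would be called as `dump_response_html(resp.text)` at the spot where the debug lines were added.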
Can you check if in your
Yes, it's the same on the pr/1597 branch. I also tried out many other user agent strings there, starting from IE9, but the responses are always the same.
That's strange... you have the error message
Yes, on each and every search. I noticed this behavior this morning, which is why I came here right away. Previously, a few taps on the search button solved it, but starting from today I could not get a response anymore. It's in Budapest, Hungary.
searx/engines/google.py
Outdated
@@ -199,6 +199,9 @@ def request(query, params):
params['headers']['Accept-Language'] = language + ',' + language + '-' + country
params['headers']['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'

# Force Internet Explorer 12 user agent to avoid loading the new UI that Searx can't parse
params['headers']['user-agent'] = "Mozilla / 5.0(MSIE 12.0; Trident / 7.0; rv: 11.0) like Gecko"
I think I've found the problem here:
- Your user agent string is not well formatted: extra and/or missing spaces, etc.
- The header's name should be written as "User-Agent" to be valid.
I tried with this line and now it works! (IE9 on Win7: https://en.wikipedia.org/wiki/Internet_Explorer_9#User_agent_string)
params['headers']['User-Agent'] = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"
Please update the PR and I'll mark it as done :)
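A small sketch of why the key spelling can matter when headers live in a plain Python dict: keys that differ only by case are separate entries, so a lowercase `user-agent` does not replace an existing `User-Agent`. Whether `params['headers']` already holds a `User-Agent` entry at this point in searx is an assumption here, and `set_header` is a hypothetical helper, not part of searx.

```python
IE9_UA = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"


def set_header(headers, name, value):
    """Set `name` in a plain dict, replacing any key that differs only by case."""
    for key in list(headers):
        if key.lower() == name.lower():
            del headers[key]
    headers[name] = value
    return headers


# Assumed starting point: a default agent is already present under "User-Agent".
plain = {"User-Agent": "default-agent"}
plain["user-agent"] = IE9_UA
two_keys = sorted(plain)  # both spellings now coexist in the dict

# Replacing case-insensitively leaves exactly one entry.
fixed = set_header({"User-Agent": "default-agent", "user-agent": "x"}, "User-Agent", IE9_UA)
```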
Oh! Nice find! I used the user agent string that a website in the first Google results gave me, and it worked for me, so I didn't pay attention to the formatting.
Thanks for the edit. I approve the changes, although I don't have write access to the repo, so it just means it finally works for me :)
@kvch If you would be so kind as to have a look at this PR and maybe release a minor version: it fixes the broken Google search engine, which is a huge win.
We don't know how long it will work, but let's enjoy it while it lasts!
@kvch, I'll let you merge if you agree.
More details about this PR here: #1596.
In summary: Google sometimes serves its new UI, which Searx can't parse. By sending the user agent of Internet Explorer 12, the request gets the old UI by default, because Google knows that IE doesn't support its new UI.
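As an illustration of the approach in the summary, here is a standalone sketch that builds (but does not send) a request carrying an Internet Explorer user agent. It is not searx's code: it uses the standard library's `urllib.request`, the IE9 string suggested in the review, and a hypothetical helper name `build_search_request`.

```python
from urllib.request import Request

# IE9 on Windows 7 user agent string, as suggested in the review above.
IE9_USER_AGENT = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"


def build_search_request(query_url, user_agent=IE9_USER_AGENT):
    """Return a Request object with the given User-Agent header; nothing is sent."""
    return Request(query_url, headers={"User-Agent": user_agent})


req = build_search_request("https://www.google.com/search?q=searx")
```

Note that `urllib` stores header names capitalized as `User-agent`, so that is the spelling to pass to `req.get_header` when inspecting the request.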