searx.me doesn't display Google results #1089

Closed
ghost opened this Issue Nov 22, 2017 · 36 comments

@ghost

ghost commented Nov 22, 2017

I don't know if this is the appropriate place to ask, since this issue is specific to a single instance of searx, but since yesterday I can't seem to obtain results from Google using searx.me.
If I disable all engines but Google, I get no results at all. I tried other public instances, including search.homecomputing.fr and searx.info, and they don't seem to have this problem.
Does anyone else have similar issues?

@dalf

Collaborator

dalf commented Nov 22, 2017

Confirmed. It may be related to 9ab8536.

@asciimoo

Owner

asciimoo commented Nov 22, 2017

Sorry, I misconfigured something after the upgrade to 0.13.0. It should work now.

Btw, I don't recommend using searx.me: it is still very crowded and returns errors because of the massive number of requests. Check out other public instances: https://github.com/asciimoo/searx/wiki/Searx-instances or host your own =)

@ghost

ghost commented Nov 22, 2017

Nope, it still doesn't seem to work. Tried on Firefox 57.0 and Chromium 62.0.3202.94 with a clean history and cache, and with no extensions.
Can others confirm?

EDIT: same problem with searx.info. This isn't an issue specific to searx.me anymore.

@asciimoo

Owner

asciimoo commented Nov 22, 2017

This is strange... It doesn't throw any error, and it works on my local instance.

@asciimoo

Owner

asciimoo commented Nov 22, 2017

It seems Google is serving CAPTCHAs. I still don't know why, or how to solve it. Any ideas?

Perhaps it is related to 9ab8536 as @dalf mentioned.

@asciimoo

Owner

asciimoo commented Nov 22, 2017

I reverted 9ab8536 on searx.me and the Google engine works again, so the problem is probably related to 9ab8536.

Any ideas how to fix the language setting without getting CAPTCHAs?

@asciimoo

Owner

asciimoo commented Nov 22, 2017

Hopefully 6fdb664 and 6eb9503 will solve both problems. I've deployed it to searx.me.

@muppeth

muppeth commented Nov 23, 2017

Is there a way to solve the CAPTCHA issue once it occurs?
I updated to the latest searx (git pull of master), but Google still times out with a 503 error. Opening the Google search links from the searx host takes me to the warning page with the CAPTCHA.

@asciimoo

Owner

asciimoo commented Nov 23, 2017

@muppeth upgrade to the latest master and wait a few minutes or hours before restarting searx. The CAPTCHA page says that "[...] The block will expire shortly after those requests stop.".

@muppeth

muppeth commented Nov 23, 2017

@asciimoo thanks. I've disabled the Google search engine until the cooldown is over.

@dalf

Collaborator

dalf commented Nov 23, 2017

Technical note on search.py: we could imagine an engine throwing a SearxThrottleException that says how many minutes or hours to wait before using that engine again. search.py could then enforce the wait. This would avoid endless CAPTCHAs (search.py already does something similar, but I guess it doesn't wait long enough because it was written for a different use case).
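A minimal sketch of the idea above, not searx's actual code. The exception name comes from the comment; its constructor, the `SuspensionTable` helper, and `run_engines` are hypothetical names introduced only for illustration, and engines are modelled as plain callables.

```python
import time


class SearxThrottleException(Exception):
    """Hypothetical: raised by an engine that wants to be left alone for a while."""

    def __init__(self, engine_name, wait_seconds):
        super().__init__(f"{engine_name} throttled for {wait_seconds}s")
        self.engine_name = engine_name
        self.wait_seconds = wait_seconds


class SuspensionTable:
    """Remembers until when each engine must be skipped (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._until = {}

    def suspend(self, exc):
        # Record the moment at which the engine may be queried again.
        self._until[exc.engine_name] = self._clock() + exc.wait_seconds

    def is_suspended(self, engine_name):
        return self._clock() < self._until.get(engine_name, float("-inf"))


def run_engines(engines, query, table):
    """Query every engine that is not suspended; suspend those that ask for it."""
    results = {}
    for name, search in engines.items():
        if table.is_suspended(name):
            continue  # do not hit a blocked engine again
        try:
            results[name] = search(query)
        except SearxThrottleException as exc:
            table.suspend(exc)
    return results
```

The clock is injectable so the wait can be tested without sleeping. How long an engine should request (minutes versus hours) is left to the engine, as the comment suggests.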

@asciimoo

Owner

asciimoo commented Nov 23, 2017

It seems the problem is solved. Feel free to reopen if you notice otherwise.

@asciimoo asciimoo closed this Nov 23, 2017

@muppeth

muppeth commented Nov 24, 2017

Hmmm... Got CAPTCHA'ed already last evening, but I decided to update to 13.1 and switch off Google search for a few hours. Once enabled again, it worked fine. This morning I noticed we are blocked again :(

@asciimoo

Owner

asciimoo commented Nov 24, 2017

@muppeth interesting... It has worked fine on searx.me since 0.13.1. Is your instance public?

@muppeth

muppeth commented Nov 24, 2017

@asciimoo yes it is.

@asciimoo

Owner

asciimoo commented Nov 24, 2017

@muppeth perhaps some scripts are abusing it and the block is justified. Do you use a firewall or any other defense mechanism to protect your instance? I'm using filtron on searx.me.

@Pofilo

Collaborator

Pofilo commented Dec 1, 2017

I have had the same issue on my own instance since yesterday (13.1 installed).
It used to work fine before, and I don't have anything in the logs.
Is there a version before 0.13 where it used to work?

@Pofilo

Collaborator

Pofilo commented Dec 1, 2017

It is due to the CAPTCHAs. Wasn't that supposed to be solved?

@asciimoo

Owner

asciimoo commented Dec 1, 2017

@Pofilo it is solved in v0.13.1. It works well on searx.me. Have you tried suspending the Google engine for 1-2 hours?

@Pofilo

Collaborator

Pofilo commented Dec 2, 2017

@asciimoo I moved to v0.13.1 the hour it was released!
After 21 hours without using Google, I still have the issue.
I was also using autocomplete with Google (not since yesterday, though).
One other thing: my searches are in French (fr-FR, so google.fr), which may be related.

@asciimoo

Owner

asciimoo commented Dec 2, 2017

This is strange... Are you sure that nobody is abusing your instance?

@muppeth

muppeth commented Dec 2, 2017

@Pofilo You should definitely enable access logs and watch what's going on for a while. In my case it was a bot sending an enormous number of queries of the same type for a day or two ("Download mp3", "Download", etc.). Then at some point it just stopped (I guess it's the same bot jumping from one searx instance to another).

Setting up filtron (something I haven't done yet) will probably save you from this happening in the future.
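As an illustration of what following the access logs could look like, here is a rough sketch that counts requests per client IP and per query string. It assumes a common/combined-style access log from a reverse proxy and queries sent as `GET /?q=...`; both are assumptions (an instance configured for POST searches will not expose `q` in the log), and `summarize` is a hypothetical helper, not part of searx or filtron.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Assumed log shape: IP ident user [timestamp] "METHOD target PROTOCOL" ...
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>[A-Z]+) (?P<target>\S+) [^"]*"'
)


def summarize(lines):
    """Return (requests per IP, occurrences per lower-cased query)."""
    per_ip = Counter()
    per_query = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        per_ip[match.group("ip")] += 1
        q_values = parse_qs(urlsplit(match.group("target")).query).get("q")
        if q_values:
            per_query[q_values[0].lower()] += 1
    return per_ip, per_query
```

Calling `per_ip.most_common(10)` and `per_query.most_common(10)` on the result would surface a single client repeating the same kind of query, which is the pattern described above.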

@Pofilo

Collaborator

Pofilo commented Dec 2, 2017

I'm the only one using my instance (the logs confirm that).
I will check more closely whether something on my server is talking to Google (or whether something is talking to Google through my server).

@dalf

Collaborator

dalf commented Dec 2, 2017

The old commit 52e615d is nearly the same, but if you have time, you can give it a try:
https://raw.githubusercontent.com/asciimoo/searx/52e615dede8538c36f569d2cf07835427a9a0db6/searx/engines/google.py

@asciimoo

Owner

asciimoo commented Dec 2, 2017

@dalf If @Pofilo is using fr-FR, I think the code is identical on that call path, which is why I don't understand the bug.

@Pofilo

Collaborator

Pofilo commented Dec 5, 2017

Even with the language set to all languages, I have the problem.
This is what I get from the logs:

DEBUG:urllib3.connectionpool:https://www.google.com:443 "GET /search?q=test&start=0&gws_rd=cr&gbv=1&lr=lang_en&ei=x HTTP/1.1" 302 405
DEBUG:urllib3.connectionpool:https://ipv4.google.com:443 "GET /sorry/index?continue=https://www.google.com/search%3Fq%3Dtest%26start%3D0%26gws_rd%3Dcr%26gbv%3D1%26lr%3Dlang_en%26ei%3Dx&q=EgSVyjQ0GJmpmdEFIhkA8aeDS3PI0bodzHZIBxboq0U--AsOajJWMgFy HTTP/1.1" 503 2702

We can see the request is correct according to 6fdb664.

My instance was shut down all weekend plus Monday, and nothing went to Google from my server's IP. I don't really understand what's happening...

EDIT: should I open a new ticket about this?
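To make the redirect chain in those debug lines easier to spot, here is a small sketch that pulls the origin, path, and status code out of urllib3 connection-pool debug output and flags requests that landed on Google's `/sorry/` page. The regular expression is written against the two lines quoted above only; `parse_requests` and `captcha_hits` are hypothetical helper names.

```python
import re

# Matches debug lines shaped like the ones quoted in the comment above.
LOG_RE = re.compile(
    r'DEBUG:urllib3\.connectionpool:(?P<origin>https?://\S+) '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)


def parse_requests(log_text):
    """Extract one dict per logged request, in log order."""
    return [
        {
            "origin": m.group("origin"),
            "path": m.group("path"),
            "status": int(m.group("status")),
        }
        for m in LOG_RE.finditer(log_text)
    ]


def captcha_hits(entries):
    """Keep only the requests that were sent to the /sorry/ CAPTCHA page."""
    return [e for e in entries if e["path"].startswith("/sorry/")]
```

Applied to the excerpt above, this would show a 302 from www.google.com followed by a 503 from ipv4.google.com on a `/sorry/` path, which is the CAPTCHA block rather than a malformed request.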

@ghost

ghost commented Dec 9, 2017

Got the same problem after upgrading to 13.1. As soon as I switch back to my 12.0 backup, everything works flawlessly.

@Pofilo

Collaborator

Pofilo commented Dec 11, 2017

@himBeere @asciimoo okay, thanks for the information.
I'll switch back to 12.0 and report back too.

@Pofilo

Collaborator

Pofilo commented Dec 12, 2017

Okay, after switching back, it was not working either (I mean Google results).
After 1 day off (searx instance down), I still have the problem in 12.0 too.

However, in 13.1, I don't understand why 6fdb664 is not working for me.

@Pofilo

Collaborator

Pofilo commented Dec 12, 2017

Okay, I also have an OpenVPN instance on my server (used only by me, with no requests to Google), and somehow Google is able to detect it and block it.

I'll do some research to try to solve this problem.

@ghost

ghost commented Dec 14, 2017

I just tried again. Now I have results from Google with version 13.1. Strange, but good enough for me.

cheers
t.

@ghost

ghost commented Dec 15, 2017

And the problem occurs again. searx - 0.13.1 -> google (unexpected crash: CAPTCHA required)

@Pofilo

Collaborator

Pofilo commented Dec 15, 2017

Me too. The problem doesn't seem to be directly related to searx.
Even when searx is down for a week, I still have the issue (with openvpn and unbound down too).

I don't know if my server has a service sending requests to Google (I really don't think so), or if my IP was somehow blacklisted at some point, or maybe this reason.

@ghost

ghost commented Dec 15, 2017

OK, I see. I thought 12.0 worked without problems. I have to re-check. Maybe 12.0 just doesn't show the error message?

@ghost

ghost commented Dec 15, 2017

Hm. Switched back to 12.0. There is no error message regarding Google, but the results contain nothing from Google.

@Pofilo

Collaborator

Pofilo commented Dec 15, 2017

The result of a request asking for a CAPTCHA looks like this:

https://ipv4.google.com/sorry/index?continue=https://www.google.com/search%3Fq%3Dtest...

Since we do get a response, I guess this is why there are no logs about it.
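Because the CAPTCHA page arrives as an ordinary HTTP response, one way an engine could notice it is by inspecting the final URL after redirects. The following is only a sketch under that assumption, not the check searx actually performs; `is_google_captcha_url` and `original_search_url` are hypothetical names.

```python
from urllib.parse import parse_qs, urlsplit


def is_google_captcha_url(url):
    """True if the URL points at a google.com host and a /sorry/ path."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    on_google = host == "google.com" or host.endswith(".google.com")
    return on_google and parts.path.startswith("/sorry/")


def original_search_url(url):
    """Recover the search URL carried in the `continue` parameter, if any."""
    values = parse_qs(urlsplit(url).query).get("continue")
    return values[0] if values else None
```

A caller could treat a positive `is_google_captcha_url` result as an error to report, instead of parsing the page and silently returning zero results.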
