It's possible to get Google results even when getting blocked by Google Recaptcha. #159

unixfox · 2021-06-20T18:15:35Z

On the mobile UI of Google Search, the "More results" button is not affected by Google's rate limiting: I can still make requests with it while the regular Google search is actively blocking me. You can even fetch the first page of results by modifying the request.

The results come back in an unusual raw format, but you can extract the needed HTML from it. If you spoof a desktop browser user agent, that HTML seems to be the same as what the regular Google search returns.

RAW response data: https://gist.github.com/unixfox/0cb7eebd3add42bfcbf42ea29a063b89#file-raw-txt

HTML manually parsed from the RAW response data: https://gist.github.com/unixfox/0cb7eebd3add42bfcbf42ea29a063b89#file-beautifier-html

Here is how to make the request:

URL: https://www.google.com/search?vet=12ahUKEwjE4O6xoajxAhWL_KQKHVCLBKoQxK8CegQIAhAG..i&ved=2ahUKEwjE4O6xoajxAhWL_KQKHVCLBKoQqq4CegQIAhAI&yv=3&q=test&prmd=vmin&ei=c0fQYITbBIv5kwXQlpLQCg&start=0&sa=N&asearch=arc&async=arc_id:srp_510,ffilt:all,ve_name:MoreResultsContainer,next_id:srp_5,use_ac:true,_id:arc-srp_510,_pms:qs,_fmt:pc
Headers:
- user-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0
- accept: */*
- sec-fetch-site: same-origin
- sec-fetch-mode: cors
- sec-fetch-dest: empty
- referer: https://www.google.com/
- accept-encoding: gzip, deflate
- accept-language: en-US,en;q=0.9

The query parameters are very similar to the ones used by the regular Google search. Not all of them are required, and some can be omitted. The same goes for the headers.
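As a rough illustration, the request described above could be assembled like this in Python. This is only a sketch: the function name `build_mobile_request` is hypothetical, and the reduced parameter set (`q`, `start`, `asearch`, `async`) is an assumption based on the remark that some parameters can be omitted. Which ones are strictly required has not been established here. The snippet only builds the URL and headers and does not send anything.

```python
from urllib.parse import urlencode

# Value of the "async" parameter, copied verbatim from the URL above.
ASYNC_VALUE = (
    "arc_id:srp_510,ffilt:all,ve_name:MoreResultsContainer,"
    "next_id:srp_5,use_ac:true,_id:arc-srp_510,_pms:qs,_fmt:pc"
)

# Headers copied from the list above; some of them may be optional.
HEADERS = {
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0",
    "accept": "*/*",
    "sec-fetch-site": "same-origin",
    "sec-fetch-mode": "cors",
    "sec-fetch-dest": "empty",
    "referer": "https://www.google.com/",
    "accept-encoding": "gzip, deflate",
    "accept-language": "en-US,en;q=0.9",
}


def build_mobile_request(query, start=0):
    """Return (url, headers) for the "More results" endpoint.

    Hypothetical helper: the subset of query parameters kept here is an
    assumption, not a verified minimal set.
    """
    params = {
        "q": query,
        "start": start,        # result offset, 0 for the first page
        "asearch": "arc",
        "async": ASYNC_VALUE,
    }
    # Keep ":" and "," unescaped so the async value looks like the original URL.
    url = "https://www.google.com/search?" + urlencode(params, safe=":,")
    return url, dict(HEADERS)


if __name__ == "__main__":
    url, headers = build_mobile_request("test")
    print(url)
```

The returned pair can be handed to any HTTP client. If Google rejects the reduced parameter set, the remaining parameters from the full URL above (`vet`, `ved`, `yv`, `prmd`, `ei`, `sa`) can be added back one at a time.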

@return42 Any idea how we could implement that in Searx?


unixfox · 2021-06-21T08:40:03Z

@return42 I just updated my comment with a simpler way of extracting the data. Please have a look at it again.

return42 · 2021-06-21T08:50:44Z

Thanks!! .. I will have a look, but give me some time :-)

unixfox · 2021-06-21T09:55:57Z

Well, I tried it myself: it only needs a couple of modified lines, and it works! Here is the patch:

diff --git a/searx/engines/google.py b/searx/engines/google.py
index 841212e0..ae8e6ab5 100644
--- a/searx/engines/google.py
+++ b/searx/engines/google.py
@@ -273,6 +273,8 @@ def request(query, params):
         'ie': "utf8",
         'oe': "utf8",
         'start': offset,
+        'asearch': "arc",
+        'async': "arc_id:srp_510,ffilt:all,ve_name:MoreResultsContainer,next_id:srp_5,use_ac:true,_id:arc-srp_510,_pms:qs,_fmt:pc"
     })
 
     if params['time_range'] in time_range_dict:
@@ -282,9 +284,7 @@ def request(query, params):
     params['url'] = query_url
 
     params['headers'].update(lang_info['headers'])
-    params['headers']['Accept'] = (
-        'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'
-    )
+    params['headers']['Accept'] = ('*/*')
 
     return params

The issue is that Searx can't find the number of results. This thing:


SecureCPU · 2022-06-29T16:28:06Z

How can I implement this in whoogle? I just got rate limited.

unixfox · 2022-06-29T17:00:38Z

> How can I implement this in whoogle? I just got rate limited.

You are on the searxng repository, not whoogle. Please open an issue on the correct project.

dalf added a commit that referenced this issue Jun 21, 2021

[experimental] google: use the mobile UI

95e634a

related to #159

dalf mentioned this issue Jun 21, 2021

[experimental] google: use the mobile UI #160

Merged

dalf added a commit that referenced this issue Jun 21, 2021

[mod] google: add "use_mobile_ui" parameter to use mobile endpoint.

7a5c364

disable by default, it has to be enabled in settings.yml related to #159

unixfox closed this as completed Jun 21, 2021

return42 mentioned this issue Jun 21, 2021

improve & document google engine #165

Merged

dalf referenced this issue in dalf/searxng Jun 22, 2021

[mod] google: add "use_mobile_ui" parameter to use mobile endpoint.

1fbe333

disable by default, it has to be enabled in settings.yml related to #159

unixfox mentioned this issue Aug 13, 2021

[FEATURE] anti-captcha support. benbusby/whoogle-search#211

Open

MarcAbonce pushed a commit to MarcAbonce/searxng that referenced this issue Sep 23, 2021

[mod] google: add "use_mobile_ui" parameter to use mobile endpoint.

8bf216e

disable by default, it has to be enabled in settings.yml related to searxng#159

aw-jansen mentioned this issue Jun 10, 2022

[BUG] Instance has been ratelimited benbusby/whoogle-search#707

Open

1 task

gromans35614 mentioned this issue Aug 1, 2022

SearXNG not searching Google for responses #1600

Closed

unixfox mentioned this issue Aug 9, 2022

Google search internal API with JSON results #1642

Closed
