[BUG/FEATURE] Bypass EU cookie consent #243

Closed
2 of 8 tasks
Suika opened this issue Mar 30, 2021 · 26 comments
Labels
bug Something isn't working

Comments

@Suika
Contributor

Suika commented Mar 30, 2021

Describe the bug
I use a proxy setup that rotates proxies in the background. From time to time, when I connect through the EU (DE in this particular instance), whoogle displays a "Cookie consent" form from Google.
After reloading the page 2-3 times it vanishes; nonetheless, it shows up every ~1-3 requests made via the address bar search.

To Reproduce
Steps to reproduce the behavior:

  1. Add your instance to your browser, in my case ungoogled-chromium
  2. Either proxy or VPN the instance to the EU (DE)
  3. Perform a search via the URL bar
  4. See error

Deployment Method

  • Heroku (one-click deploy)
  • Docker
  • run executable
  • pip/pipx
  • Other: [describe setup]

Version of Whoogle Search

  • Latest build from [source] (i.e. GitHub, Docker Hub, pip, etc)
  • Version [version number]
  • Not sure

Desktop (please complete the following information):

  • OS: [Win/Linux]
  • Browser [ungoogled-chromium]
  • Version [88.0.4324.182]

Additional context
I want non-consensual google search

@Suika Suika added the bug Something isn't working label Mar 30, 2021
@readall

readall commented Mar 30, 2021

This seems to be some kind of change from Google. I am also facing the same on my Docker instance.

@benbusby
Owner

That's annoying. I'll try to replicate on my end and see what I can do.

@readall I assume your instance is also running in the EU?

@Hunam6

Hunam6 commented Mar 30, 2021

I assume your instance is also running in the EU?

Mine is too

@benbusby
Owner

From a cursory bit of testing, it looks like accepting this dialog sets a CONSENT cookie with a value that roughly matches the following format YES+[country code].[language]+[V#]+[B?]+[###].

Examples:

Netherlands (UK English) -- CONSENT: YES+NL.en-GB+V9+BX+723

Sweden (Swedish) -- CONSENT: YES+SE.sv+V10+B+256

I tried manually swapping the Sweden and Netherlands cookies, and it didn't seem to have a problem with that, so I'm curious if there's just a default value that can be set in requests.send() to bypass this altogether. Unfortunately it seems that whatever cryptic method they're using to generate the exact cookie value isn't easily accessible, but that might not be an issue (for now).
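To make the format guess above concrete, here is a minimal sketch that splits a CONSENT value into its apparent parts. The regular expression and the field names (`country`, `language`, `version`, `b_flag`, `number`) are assumptions inferred only from the two sample values in this comment; Google does not document this format, so treat it as illustrative.

```python
import re

# Hypothetical pattern inferred from the two sample cookies in this thread:
#   YES+NL.en-GB+V9+BX+723
#   YES+SE.sv+V10+B+256
# The field names are made up for illustration.
CONSENT_RE = re.compile(
    r"^YES\+(?P<country>[A-Z]{2})\.(?P<language>[A-Za-z-]+)"
    r"\+V(?P<version>\d+)\+(?P<b_flag>B[A-Z]*)\+(?P<number>\d{3})$"
)


def parse_consent(value):
    """Return the named parts of a CONSENT value, or None if it does not match."""
    match = CONSENT_RE.match(value)
    return match.groupdict() if match else None


print(parse_consent("YES+NL.en-GB+V9+BX+723"))
print(parse_consent("YES+SE.sv+V10+B+256"))
```

Values that do not follow this shape return `None`, which is expected given how little is known about the real format.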

I'll try to look into this some more soon. Thanks everyone for reporting it!

@dr460nf1r3
Contributor

Screenshot_FireDragon_1

I can confirm that this happens since yesterday every few searches.

@benbusby benbusby changed the title [BUG] Cookie consent shows up every X searches [BUG/FEATURE] Bypass EU cookie consent Mar 30, 2021
@Suika
Contributor Author

Suika commented Mar 30, 2021

@benbusby ok, here is a strange one. Set cookie CONSENT=PENDING+600 and you should be fine for as many queries as you can do. Set it to ~220 you might get a consent page, set it to <=210 and you will always get the consent page. Max is 999 and it seems to be doing fine, anything higher resets the number to 060.

Don't explicitly know how it is tied together, but that one cookie seems to keep the consent away for a duration of time.
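A small sketch of how that cookie value could be built, assuming the three-digit counter behaves as observed above (values up to 999 accepted, higher ones reset). The helper name `pending_consent_cookie` is hypothetical, and the zero-padding is a guess based on the "060" reset value.

```python
def pending_consent_cookie(counter=999):
    """Build a Cookie header value of the form CONSENT=PENDING+NNN.

    The 0..999 range and the zero-padding are assumptions taken from the
    observations in this thread, not from any documented behavior.
    """
    if not 0 <= counter <= 999:
        raise ValueError("counter must be between 0 and 999")
    return f"CONSENT=PENDING+{counter:03d}"


print(pending_consent_cookie(600))
```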

@readall

readall commented Mar 31, 2021

I assume your instance is also running in the EU?

@benbusby Yes, my VPS is hosted within EU.

@readall

readall commented Mar 31, 2021

@benbusby ok, here is a strange one. Set cookie CONSENT=PENDING+600 and you should be fine for as many queries as you can do. Set it to ~220 you might get a consent page, set it to <=210 and you will always get the consent page. Max is 999 and it seems to be doing fine, anything higher resets the number to 060.

@Suika if you don't mind, where in the code can this setting be made? I would like to try that.

@Suika
Contributor Author

Suika commented Mar 31, 2021

@benbusby ok, here is a strange one. Set cookie CONSENT=PENDING+600 and you should be fine for as many queries as you can do. Set it to ~220 you might get a consent page, set it to <=210 and you will always get the consent page. Max is 999 and it seems to be doing fine, anything higher resets the number to 060.

@Suika if you don't mind, where in the code can this setting be made? I would like to try that.

Quick and dirty would be https://github.com/benbusby/whoogle-search/blob/develop/app/request.py#L210-L212: just slap in another entry like 'cookie': 'CONSENT=PENDING+999'
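As a rough illustration of that suggestion, the sketch below adds a `cookie` entry to a headers dict and attaches it to an outgoing request. It uses the standard library's `urllib.request` as a stand-in so it runs without extra packages; the function name `build_search_request`, the default user agent, and the example URL are assumptions, and the actual code in `app/request.py` may look different.

```python
import urllib.request


def build_search_request(query_url, user_agent="Mozilla/5.0"):
    """Build (but do not send) a request that carries the CONSENT cookie."""
    headers = {
        "User-Agent": user_agent,
        # The extra entry suggested in this comment.
        "cookie": "CONSENT=PENDING+999",
    }
    return urllib.request.Request(query_url, headers=headers)


req = build_search_request("https://www.google.com/search?q=test")
# urllib normalizes header names, so the entry is stored as "Cookie".
print(req.get_header("Cookie"))
```

No network call is made here; the request object is only constructed so the header can be inspected.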

@readall

readall commented Mar 31, 2021

Quick and dirty would be https://github.com/benbusby/whoogle-search/blob/develop/app/request.py#L210-L212: just slap in another entry like 'cookie': 'CONSENT=PENDING+999'
@Suika
Thanks for this hack. In effect, the cookie consent appears for a very brief moment (a few milliseconds) and then the search results load. So the irritation is temporarily addressed.

@Suika
Contributor Author

Suika commented Mar 31, 2021

Quick and dirty would be https://github.com/benbusby/whoogle-search/blob/develop/app/request.py#L210-L212: just slap in another entry like 'cookie': 'CONSENT=PENDING+999'
@Suika
Thanks for this hack. In effect, the cookie consent appears for a very brief moment (a few milliseconds) and then the search results load. So the irritation is temporarily addressed.

You probably saw that just after you changed it. That is normal. Any subsequent searches should not show anything. At least it doesn't for me.

benbusby added a commit that referenced this issue Mar 31, 2021
@benbusby
Owner

Just pushed a hotfix that includes @Suika's fix while I look into this a bit more. It's available on the develop and heroku-app-beta branches, as well as the beta and buildx-experimental images (once the CI build finishes).

@federicotorrielli
Contributor

From today on it's not possible anymore to search in the EU; this pops up every time on 0.3.1. Can we get a new release? Thanks!

@Hunam6

Hunam6 commented Apr 1, 2021

@federicotorrielli
I'm using the heroku-app-beta branch as specified in the previous comment by @benbusby and it works as intended.

@jeroenev

jeroenev commented Apr 1, 2021

From today on it's not possible anymore to search in the EU; this pops up every time on 0.3.1. Can we get a new release? Thanks!

same for me, it used to show every few searches, but now it shows every single time I click search.

@federicotorrielli
Contributor

federicotorrielli commented Apr 1, 2021

whoogle beta branch isn't really working for me right now btw, I tried creating more instances but this keeps showing:
image

@Hunam6

Hunam6 commented Apr 1, 2021

@federicotorrielli I can relate, I've had this bug since I updated my fork.
This bug was introduced here: whoogle-mirror/whoogle-search@449999d...282208f

@gpopesc

gpopesc commented Apr 1, 2021

Use the beta tag in docker compose or the docker CLI to pull the corresponding image.
The latest version, 0.3.1, doesn't work for EU users.
It works for me:
image: benbusby/whoogle-search:beta

@LeonMusCoden

buildx-experimental works again, thanks.

docker pull benbusby/whoogle-search:buildx-experimental
docker run --publish 5000:5000 --detach --name whoogle-search benbusby/whoogle-search:buildx-experimental

@benbusby
Owner

benbusby commented Apr 1, 2021

The heroku-app-beta branch should be fixed now (minor regression, unrelated to this issue).

@inlophe

inlophe commented Apr 2, 2021

Is it just me, or are the beta and buildx-experimental docker tags stuck at starting for anyone else? I tried both compose and running directly on 2 different servers, and all of them get stuck at starting, then change to unhealthy after a while. But when I revert to the latest tag, it works (but with the EU cookie consent, ofc). Running it with -it produces the same log for latest and beta/buildx-experimental.

@gpopesc

gpopesc commented Apr 2, 2021

@inlophe no problem for me running beta version since yesterday
version: "3"
services:
  whoogle-search:
    image: benbusby/whoogle-search:beta
    container_name: whoogle-search
    ports:
      - 6051:5000
    restart: unless-stopped

@Suika
Contributor Author

Suika commented Apr 7, 2021

Sooo, did this fix it? Well, if the non-consensual searching ever breaks, the only way will be to give consent, as in ytdl-org/youtube-dl@14f29f0.
Unless there is a loophole. Either way, if it's fixed, close the issue?

@jeroenev

jeroenev commented Apr 7, 2021

Currently the 4.0 release seems to work again for me

@dr460nf1r3
Contributor

I also did not experience issues after the fix was pushed, seems like it is indeed fixed

@benbusby benbusby closed this as completed Apr 7, 2021
return42 added a commit to return42/searxng that referenced this issue Jun 18, 2021
In the EU there exists a "General Data Protection Regulation" [1] aka GDPR (BTW:
very user friendly!) which requires consent to tracking.  To get the consent
from the user, google-news requests are redirected to confirm and get a CONSENT
Cookie from https://consent.google.de/s?continue=...

This patch adds a CONSENT Cookie to the google-news request to avoid
redirection.
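For illustration only, a minimal sketch of how a client might recognize the consent redirect described above by checking the host of the final URL. The helper name `is_consent_redirect` is hypothetical, and matching on the `consent.google.` host prefix is an assumption based solely on the URL quoted in this message.

```python
from urllib.parse import urlparse


def is_consent_redirect(url):
    """Return True if the URL points at a consent.google.* host."""
    host = urlparse(url).hostname or ""
    return host.startswith("consent.google.")


print(is_consent_redirect("https://consent.google.de/s?continue=https://news.google.com/"))
print(is_consent_redirect("https://news.google.com/"))
```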

The behavior of the CONSENT cookies over all google engines seems similar but
the pattern is not yet fully clear to me, here are some random samples from my
analysis ..

Using common google search from different domains::

    google.com:        CONSENT=YES+cb.{{date}}-14-p0.de+FX+816
    google.de:         CONSENT=YES+cb.{{date}}-14-p0.de+FX+333
    google.fr:         CONSENT=YES+srp.gws-{{date}}-0-RC2.fr+FX+826

When searching about videos (google-videos)::

    google.es:         CONSENT=YES+srp.gws-{{date}}-0-RC2.es+FX+076
    google.de:         CONSENT=YES+srp.gws-{{date}}-0-RC2.de+FX+171

Google news has only one domain for all languages::

    news.google.com:   CONSENT=YES+cb.{{date}}-14-p0.de+FX+816

Using google-scholar search from different domains::

    scholar.google.de: CONSENT=YES+cb.{{date}}-14-p0.de+FX+333
    scholar.google.fr: does not use such a cookie / did not ask the user
    scholar.google.es: does not use such a cookie / did not ask the user

Interim summary:

  Pattern is unclear and I won't apply the CONSENT cookie to all google engines.
  More experience is needed before we generalize the CONSENT cookies over all
  google engines.

Related:

- e9a6ab4 [fix] youtube - send CONSENT Cookie to not be redirected
- benbusby/whoogle-search#311
- benbusby/whoogle-search#243

[1] https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>