[BUG] WHOOGLE_MINIMAL removes search results #634

Closed
2 of 8 tasks
DUOLabs333 opened this issue Feb 1, 2022 · 17 comments · Fixed by #637
Labels
bug Something isn't working

Comments

@DUOLabs333
Contributor

DUOLabs333 commented Feb 1, 2022

Describe the bug
After some time, in some cases, a Whoogle search leads to an empty search page.

To Reproduce
Steps to reproduce the behavior:

  1. Run a search (the bug does not reproduce reliably)
  2. See blank page

Deployment Method

  • Heroku (one-click deploy)
  • Docker
  • run executable
  • pip/pipx
  • Other: [describe setup]

Version of Whoogle Search

  • Latest build from [source] (i.e. GitHub, Docker Hub, pip, etc)
  • Version [version number]
  • Not sure

Desktop (please complete the following information):

  • OS: Arch Linux
  • Browser: firefox
  • Version [e.g. 22]

Additional context
image

@DUOLabs333 DUOLabs333 added the bug Something isn't working label Feb 1, 2022
@bruvv
Contributor

bruvv commented Feb 1, 2022

Log files are essential to debug this....

@DUOLabs333
Contributor Author

Whoogle doesn't output anything.

@glitsj16
Contributor

glitsj16 commented Feb 1, 2022

I've seen this too, on 0.7.1 and also on 0.7.0 (Arch Linux, same whoogle setup). Removing WHOOGLE_MINIMAL=1 from whoogle.env (or setting WHOOGLE_MINIMAL=0) fixes this for me. Perhaps Google changed something in those 'People also ask' cards that needs adjusting?

@benbusby
Owner

benbusby commented Feb 1, 2022

Yep, looks like WHOOGLE_MINIMAL no longer works due to some behind the scenes changes to search result formatting.

Edit: I'd be cautious about any fix at this point actually. It seems like they're doing A/B testing with how they format search results, so a fix for one response body might not work for the other and vice versa. I'm getting 50/50 odds of receiving a weird left-aligned search result response w/o div backgrounds vs the usual center-aligned results.

@benbusby benbusby changed the title [BUG] Whoogle page is empty [BUG] WHOOGLE_MINIMAL removes search results Feb 1, 2022
@DUOLabs333
Contributor Author

Oh, foo -- I really liked WHOOGLE_MINIMAL too.

@DUOLabs333
Contributor Author

Also, side note -- body {max-width: 100% !important; /* or whatever width you want */} doesn't work anymore -- it's right aligned now.

@benbusby
Owner

benbusby commented Feb 2, 2022

Yeah, likely related to the new result formatting that they're applying. I've noticed some styles are being forced as !important, which overrides styles that Whoogle applies (even with !important). I'm going to keep an eye on what gets changed in the next few days. Over the course of the day I've noticed that my prior 50/50 odds of getting their new result format have gone up to almost 100% new formatted results, so this might be a permanent change. I just pushed 9ba7333, which should eliminate some of the weird white box-shadow elements I was seeing, which looked really bad for dark themes in particular.

@DUOLabs333
Contributor Author

I'm trying out a way to fix MINIMAL -- in collapse_sections, is there a way to remove a section? Setting result_children=[] just causes a blank page.

@benbusby
Owner

benbusby commented Feb 2, 2022

If the section is a BS4 element, then you can call decompose() on that element (i.e. elem.decompose()).

@DUOLabs333
Contributor Author

I got something that basically works -- it should work with both types of response bodies.

@DUOLabs333
Contributor Author

When you remove a section, a line is left behind -- is there a way to remove that?

image

@benbusby
Owner

benbusby commented Feb 2, 2022

Hmm, I'm not sure really. I'm assuming it's leftover CSS, like a border being applied to an empty div (or something along those lines). Maybe for whatever section you're removing, it would be more accurate to remove its immediate parent element instead?

@DUOLabs333
Contributor Author

I removed the "Top stories" section. How would I remove the parent element?

@benbusby
Owner

benbusby commented Feb 2, 2022

By calling <elem>.parent.decompose() instead of just decompose(), although I can't promise that's the solution, just a hunch.
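
A minimal sketch of the difference between the two calls, assuming a simplified page where each section sits inside its own bordered wrapper div. The markup, the class names `wrapper` and `section`, and the helper `remove_section` are all made up for illustration and are not taken from Whoogle or from Google's real result markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the border lives on the wrapper, not on the section.
SAMPLE_HTML = """
<div id="main">
  <div class="wrapper" style="border-top: 1px solid">
    <div class="section">Top stories</div>
  </div>
  <div class="wrapper" style="border-top: 1px solid">
    <div class="section">A regular result</div>
  </div>
</div>
"""


def remove_section(soup, title):
    """Remove the wrapper around the section whose text equals `title`.

    Calling section.decompose() alone would leave the empty wrapper (and
    its border) in the tree; decomposing the parent removes both.
    """
    for section in soup.find_all("div", class_="section"):
        if section.get_text(strip=True) == title:
            section.parent.decompose()
            return True
    return False


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
remove_section(soup, "Top stories")
```

Whether the parent is the right element to remove depends on where the leftover style is actually applied, so this only illustrates the hunch described above.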

@DUOLabs333
Contributor Author

Oh, I got it to work: I just called result.decompose()

@DUOLabs333
Contributor Author

I'll make a pull request soon.

@glitsj16
Contributor

glitsj16 commented Feb 2, 2022

Confirming #637 fixes WHOOGLE_MINIMAL=1 here.

benbusby pushed a commit that referenced this issue Feb 2, 2022
Google's latest formatting changes broke the modifications made when enabling
`WHOOGLE_MINIMAL`. This updates the result filtering to work with the new
changes.

Fixes #634
TrueMysterious added a commit to TrueMysterious/whoogle-search that referenced this issue Feb 28, 2022
* Make `/config` directory writable by all (benbusby#616)

The `/config` directory needs to be writable by all in order to run the container
as a non-root user.

* Run container as non-root `whoogle` user (benbusby#617)

Creates a non-root user ("whoogle"), and runs the container as that user.

* Fix docker-compose.yml permission errors (benbusby#623)

* Refactor Docker CI workflows

Split previous docker test CI into one for PRs and one for triggering
the main buildx workflow that deploys new images to Docker Hub.

Note that this needs to be further refactored soon to use reusable
workflows. The main portion of docker/docker-compose tests is duplicated
between the new main + test workflows.

* Move bangs init to bg thread

Initializing the DDG bangs when running whoogle for the first time
creates an indeterminate amount of delay before the app becomes usable,
which makes usability tests (particularly w/ Docker) unreliable. This
moves the bang json init to a background thread and writes a temporary
empty dict to the bangs json file until the full bangs json can be used.
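
The pattern described above can be sketched roughly as follows. This is not the project's code: `load_full_bangs`, `write_bangs`, and `init_bangs` are hypothetical names, and the bang data is a placeholder standing in for the real download.

```python
import json
import threading


def load_full_bangs() -> dict:
    """Hypothetical stand-in for the slow fetch/parse of the full bang list."""
    return {"!ex": {"url": "https://example.com/?q={}"}}


def write_bangs(path: str) -> None:
    """Overwrite the placeholder file with the full bang data."""
    data = load_full_bangs()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)


def init_bangs(path: str) -> threading.Thread:
    """Write an empty dict immediately, then fill the file in the background."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({}, f)
    worker = threading.Thread(target=write_bangs, args=(path,), daemon=True)
    worker.start()
    return worker
```

The caller can start serving requests as soon as `init_bangs` returns; bang lookups simply find nothing until the worker finishes.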

* Remove trailing whitespace

* Use `test` image tag for docker-compose tests

Also adds the ability to overwrite the image in docker-compose.yml,
which allows the CI build to use the same image for all docker tests.
The default is still 'benbusby/whoogle-search' though.

* Remove bash dependency

Depending on bash wasn't strictly necessary, as the two minimal scripts
in the repo were both nearly POSIX anyways.

Aside from simplifying the repo's dependencies a little bit, this also
helps reduce the overall Docker image size as an added bonus.

* Add nightly container vuln scan

Introduces a new 'scan' workflow for scanning the main branch container for
vulnerabilities nightly. By default, this will fail for any 'medium' or higher
vulnerability. 

Fixes benbusby#613

* Bump version to 0.7.1

* Remove broken public instance [skip ci]

search.exonip.de now redirects to Startpage

Fixes benbusby#635

* Run buildx workflow on new tag

Fixes benbusby#630

* Add note for fosshost instance [skip ci]

The fosshost team decommissioned the region that Whoogle was hosted in,
but hasn't provided an option to transfer the domain record to the new VM. Until
that is fixed, the instance is inaccessible.

* Read `WHOOGLE_CONFIG_DISABLE` var as bool in app init

Fixes benbusby#636, which pointed out that the var was being interpreted as
"active" (config hidden) regardless of the value that was set.
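
A minimal sketch of reading such a variable as a boolean, assuming a hypothetical helper `read_bool_env`. The set of accepted truthy spellings is an assumption, not the project's actual rule.

```python
import os


def read_bool_env(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean.

    A plain truthiness check on the raw string treats "0" and "false"
    as enabled, because any non-empty string is truthy.
    """
    value = os.getenv(name)
    if value is None:
        return default
    # Assumed truthy spellings; everything else counts as False.
    return value.strip().lower() in ("1", "true", "t", "yes", "on")
```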

* Override new Google search result formatting

There have been some recent formatting changes made by Google for search
results that do not look good (especially for dark themes). This
mostly overrides those styles to resemble the original Whoogle
result formatting.

* Amend body width formatting in search css

`min-width` is a better field to override than `max-width`, since some
users prefer full width results.

* Push images to ghcr.io

Alternative container registries like ghcr.io are a good option for anyone
seeking to avoid things like Docker Hub's latest changes to rate limiting.

* Fix incorrect min-width for mobile screen sizes

min-width was previously set to 736px for all screen sizes, which forced
content off screen for smaller devices such as mobile phones. This
modifies the search stylesheet to only apply a min-width style to
devices > 800px wide.

* Update minimal mode for new Google formatting (benbusby#637)

Google's latest formatting changes broke the modifications made when enabling
`WHOOGLE_MINIMAL`. This updates the result filtering to work with the new
changes.

Fixes benbusby#634

* Fix Sinhala translation for farside search (benbusby#594)

* Use consistent header for all result types (benbusby#535)

Introduces a header for switching between result types (i.e. "All", "News",
etc) that is consistent between the different result types. Previously, image
results had a tab header that was formatted in a drastically different manner,
which was jarring when switching from a different result page to the Images
page.

Created a G class enum to reference class names returned in search
results. As noted in the class doc, this should only be used/updated as
a last resort, as class names change frequently. For some instances,
such as replacing the tbm tab, it's a lot easier to just replace by
header name than attempting to replace it based on how the element is
structured.

Also updated a few styles to revert the latest styling changes being
applied by Google.

Co-authored-by: jacr13 <ramos.joao@protonmail.com>
Co-authored-by: Ben Busby <contact@benbusby.com>

* Clean "Show more results" of all site blocks (benbusby#646)

* Add new instance to readme [skip ci] (benbusby#647)

https://whoogle.esmailelbob.xyz

* Add public instance to instance list [skip ci]

https://whoogle.esmailelbob.xyz

Amendment to benbusby#647

* Add gowogle.voring.me as public instance (benbusby#650)

Also removes fosshost instance from readme

From @benbusby:
I'm unable to get in touch with fosshost support about the whoogle
instance being unavailable, and am no longer interested in
maintaining the instance due to the lack of communication.

* Add new public instances to txt list [skip ci]

Missing from benbusby#650

* Check for soup body in `remove_site_blocks` (benbusby#651)

Fixes error with `remove_site_blocks` in the Images tab

* Fix `collapse_sections` for `MINIMAL_MODE` (benbusby#654)

* Fix "my ip" search regression

Removes dependency on class names for creating the "my ip" info card in
the results list for searches pertaining to the user's public IP.

Adds test to prevent this from happening again.

Note to anyone reading this and looking to contribute: please avoid
using hardcoded class names at all costs. This approach of
creating/removing content just results in issues if/when Google decides
to introduce/remove class names from the result page.

Fixes benbusby#657

* Check for updates using 24 hour time delta

Rather than only checking for an available update on app init, the check
for updates now performs the check once every 24 hours on the first
request sent after that period.

This also now catches the requests.exceptions.ConnectionError that is
thrown if the app is initialized without an active internet connection.

Fixes benbusby#649
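
One way to sketch the "once every 24 hours, on the first request after that period" behavior is shown below. `UpdateChecker`, `fetch_latest`, and `maybe_check` are hypothetical names, and the built-in `ConnectionError` is used only as a stand-in so the sketch needs no third-party package; a real version would catch the exception named in the commit message.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional


class UpdateChecker:
    """Run an update check at most once per interval, triggered lazily."""

    def __init__(self, fetch_latest: Callable[[], str], current: str,
                 interval: timedelta = timedelta(hours=24)):
        self.fetch_latest = fetch_latest  # hypothetical: returns latest version
        self.current = current
        self.interval = interval
        self.last_check: Optional[datetime] = None
        self.has_update = False

    def maybe_check(self, now: Optional[datetime] = None) -> bool:
        """Call on each request; only hits the network when the interval passed."""
        now = now or datetime.now()
        if self.last_check is not None and now - self.last_check < self.interval:
            return self.has_update
        self.last_check = now
        try:
            self.has_update = self.fetch_latest() != self.current
        except ConnectionError:
            # No network: keep the previous answer and try again next interval.
            pass
        return self.has_update
```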

* Give `Accept-Language` div its own class (benbusby#659)

Fixes accidental assignment of "get-only" class to the
"Accept-Language" config option

* Ensure valid str->float conv in currency calc

Currency amounts returned by Google seem to randomly include unicode
chars ('\xa0' noted in benbusby#642) which broke the currency calculator
included in the project. This ensures that only strings that can be
converted to float are ever used in the conversion.

Fixes benbusby#642
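
A minimal sketch of a guarded conversion, assuming a hypothetical helper `to_float`. Stripping `'\xa0'` before converting is an illustrative choice here; the commit message only says that unconvertible strings are no longer used.

```python
from typing import Optional


def to_float(text: str) -> Optional[float]:
    """Return the float value of `text`, or None if it cannot be converted."""
    # Drop non-breaking spaces, which may appear inside or around the amount.
    cleaned = text.replace("\xa0", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None
```

Callers can then skip the currency card whenever `to_float` returns `None` instead of raising mid-request.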

* [Docker] Split config dir creation/set permissions

If the config dir already exists, setting the mode (`-m 777`) doesn't
actually work as it should. This change splits the command into two
separate commands for directory creation and enabling the directory to
be writable by all.

Fixes benbusby#658

* Upgrade Python image in Dockerfile (benbusby#669)

Vulnerable Python image upgraded to python:3.11.0a5-alpine

* Configure setup() using setup.cfg (benbusby#667)

Dependencies are not read from requirements.txt intentionally, so only
direct dependencies without version pinning are included.

Setuptools documentation:
https://setuptools.pypa.io/en/latest/userguide/declarative_config.html

* Update ad filter

Recent changes to ads in search results caused Whoogle to display ads
for certain searches. In particular, ads recently started appearing
grouped into one div, as opposed to a singular ad per div. This was
accompanied by the div label "ads" (instead of just "ad"), which threw
off the existing ad filter. The ad keyword blacklist has been updated
accordingly, and has been enhanced to only check against alpha chars for
each label.

This only seems to have affected English language searches, and only for
very specific searches.
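
The "alpha chars only" comparison can be sketched like this. `AD_KEYWORDS` is a hypothetical two-entry subset and `is_ad_label` is an illustrative name; the real blacklist is larger and is not reproduced here.

```python
# Hypothetical subset of the keyword blacklist, for illustration only.
AD_KEYWORDS = {"ad", "ads"}


def is_ad_label(label: str) -> bool:
    """Compare only the alphabetic characters of a label to the blacklist."""
    letters = "".join(ch for ch in label if ch.isalpha()).lower()
    return letters in AD_KEYWORDS
```

Because the whole stripped label must match a keyword, a label such as "Adobe" is not flagged, while "Ads ·" is.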

Co-authored-by: ras07 <17038818+ras07@users.noreply.github.com>
Co-authored-by: nakoo <4975021+nakoo@users.noreply.github.com>
Co-authored-by: Ben Busby <contact@benbusby.com>
Co-authored-by: DUO Labs <dvdugo333@gmail.com>
Co-authored-by: සයුරි | Sayuri <85907926+sayuri-gi@users.noreply.github.com>
Co-authored-by: Joao A. Candido Ramos <joao.candido@etu.unige.ch>
Co-authored-by: jacr13 <ramos.joao@protonmail.com>
Co-authored-by: Esmail EL BoB <github.defilable@simplelogin.co>
Co-authored-by: Kainoa Kanter <44733677+ThatOneCalculator@users.noreply.github.com>
Co-authored-by: Nitish Yadav <nitishy01@gmail.com>
Co-authored-by: Albony Cal <67057319+Albonycal@users.noreply.github.com>
Co-authored-by: jan Anja <bs@sysrq.in>
4 participants