[BUG] search for bitcoin gives Internal Server Error #642

Closed
1 of 2 tasks
eddik1970 opened this issue Feb 4, 2022 · 16 comments
Labels
bug Something isn't working

Comments

@eddik1970

Describe the bug

Whenever my search includes the phrase bitcoin I get:

Internal Server Error
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

To Reproduce
Steps to reproduce the behavior:

  1. Go to search page
  2. Search for a phrase including bitcoin
  3. See error

Deployment Method

  • Docker on CentOS 8

Version of Whoogle Search

  • Version v0.7.1

Desktop (please complete the following information):

Smartphone (please complete the following information):

  • Device: Samsung S8
  • OS: Android
  • Browser: Opera
  • Version

Additional context
I use the following docker-compose:

whoogle-search:
  image: benbusby/whoogle-search
  container_name: whoogle-search
  restart: unless-stopped
  pids_limit: 50
  mem_limit: 256mb
  memswap_limit: 256mb
  # user debian-tor from tor package
  user: '1500'
  security_opt:
    - no-new-privileges
  cap_drop:
    - ALL
  tmpfs:
    - /config/:size=10M,uid=102,gid=102,mode=1700
    - /var/lib/tor/:size=10M,uid=102,gid=102,mode=1700
    - /run/tor/:size=1M,uid=102,gid=102,mode=1700
  environment: # Uncomment to configure environment variables
    # Basic auth configuration, uncomment to enable
    #- WHOOGLE_USER=<auth username>
    #- WHOOGLE_PASS=<auth password>
    # Proxy configuration, uncomment to enable
    #- WHOOGLE_PROXY_USER=<proxy username>
    #- WHOOGLE_PROXY_PASS=<proxy password>
    #- WHOOGLE_PROXY_TYPE=<proxy type (http|https|socks4|socks5)
    #- WHOOGLE_PROXY_LOC=<proxy host/ip>
    # Site alternative configurations, uncomment to enable
    # Note: If not set, the feature will still be available
    # with default values.
    # - WHOOGLE_CONFIG_TOR=1
    - WHOOGLE_ALT_TW=farside.link/nitter
    - WHOOGLE_ALT_YT=farside.link/invidious
    - WHOOGLE_ALT_IG=farside.link/bibliogram/u
    - WHOOGLE_ALT_RD=farside.link/libreddit
    - WHOOGLE_ALT_MD=farside.link/scribe
    - WHOOGLE_ALT_TL=lingva.ml
    - WHOOGLE_ALT_IMG=imgin.voidnet.tech
    - WHOOGLE_ALT_WIKI=wikiless.org
  #env_file: # Alternatively, load variables from whoogle.env
    #- whoogle.env
  ports:
    - 5000:5000

@eddik1970 added the bug label on Feb 4, 2022

@DUOLabs333
Contributor

(Possibly) related: only on "bitcoin" does WHOOGLE_MINIMAL not work? Maybe "bitcoin" is interfering somehow?

@DUOLabs333
Contributor

Interesting -- sections such as "People also ask" do not show up in result_children. Maybe Google is making an exception for this page?

@DUOLabs333
Contributor

Can you show us the log?

@eddik1970
Author

How do I find the log in a Docker container?

@DUOLabs333
Contributor

It should be with docker logs, but I'm not sure.

@benbusby
Owner

benbusby commented Feb 4, 2022

My guess is that it has something to do with the currency converter that appears when searching for bitcoin. Logs would definitely help.

Also, what settings do you have configured for your whoogle instance (i.e. language, country, etc)?

@DUOLabs333
Contributor

Weird, I can't get the problem to show up with any other currency, crypto or otherwise.

@eddik1970
Author

eddik1970 commented Feb 7, 2022

> It should be with docker logs, but I'm not sure.

I cannot see anything in the logs except normal activity, such as Tor bootstrapping. Is there some extra logging I could turn on?

Also, you write "I can't get the problem to show up with any other currency, crypto or otherwise". Does that mean it crashes for you on bitcoin, or not? Other cryptocurrencies work fine here.

I have not set Country, Interface or search language, but the interface is in Norwegian, so I guess that comes from the browser.

@DUOLabs333
Contributor

@eddik1970 Can you try the latest tag?

@eddik1970
Author

I did a docker-compose pull and up, but unfortunately the problem is still there.

But now I got something in the log:

Feb 16 07:34:23.000 [notice] While bootstrapping, fetched this many bytes: 561668 (consensus network-status fetch); 13280 (authority cert fetch); 2621895 (microdescriptor fetch)
ERROR:app:Exception on /search [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/whoogle/app/routes.py", line 92, in decorated
    return f(*args, **kwargs)
  File "/whoogle/app/routes.py", line 51, in decorated
    return f(*args, **kwargs)
  File "/whoogle/app/routes.py", line 359, in search
    conversion = check_currency(str(response))
  File "/whoogle/app/utils/results.py", line 255, in check_currency
    'currencyValue2': float(currency2[0]),
ValueError: could not convert string to float: '391\xa0529.23'
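
For readers following along, the last line of the traceback can be reproduced in isolation. This is a minimal sketch that only uses the string shown in the log; the variable names are illustrative and not taken from Whoogle. It shows that `'\xa0'` is a single character (a no-break space, probably acting as a thousands separator here) and that `float()` rejects it when it sits inside the number.

```python
import unicodedata

# The exact value from the traceback above.
raw = '391\xa0529.23'

# '\xa0' is one character, not the four characters backslash, x, a, 0.
separator = raw[3]
separator_name = unicodedata.name(separator)  # 'NO-BREAK SPACE'

# float() tolerates whitespace around a number, but not inside it.
try:
    float(raw)
    conversion_failed = False
except ValueError as exc:
    conversion_failed = True
    print(exc)
```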

@DUOLabs333
Contributor

Oh, seems simple enough to fix -- just have to remove non-numeric characters (except periods of course).

@benbusby
Owner

@DUOLabs333 yes and no. That solution would work, but the bigger issue here is that there's a Unicode char (\xa0) in the currency string returned by Google for some reason, which includes a 0. So it'd probably be better to fix it like this:

amount = ''.join([i if ord(i) < 128 else '' for i in str(response)])
conversion = check_currency(amount)

Might need further sanitizing though, I haven't looked yet.
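
The suggestion above can be tried on the failing value on its own. This is a sketch under one assumption: the filter is applied to the extracted amount rather than to the whole response string, and the helper name `strip_non_ascii` is made up for illustration.

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character whose code point is outside the ASCII range."""
    return ''.join(ch for ch in text if ord(ch) < 128)


# The no-break space (code point 160) is removed; digits and '.' are kept.
cleaned = strip_non_ascii('391\xa0529.23')
value = float(cleaned)
print(cleaned, value)
```

Note that this removes the whole `\xa0` character, so no stray digit is left behind in the result.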

@DUOLabs333
Contributor

DUOLabs333 commented Feb 16, 2022

It's still Unicode, right? So wouldn't something like re.sub('[^\d\.]', '', currency[0]) work?

@benbusby
Owner

Yep, that works as well
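
The regex variant can be sketched the same way. The helper name `keep_digits_and_dots` is hypothetical, and the pattern is written as a raw string to avoid invalid-escape warnings. The second call is an assumed input, not one seen in this thread, and shows a limitation shared by both approaches: a value that uses a comma as the decimal mark is silently changed rather than rejected.

```python
import re


def keep_digits_and_dots(text: str) -> str:
    """Remove everything except digits and periods."""
    return re.sub(r'[^\d.]', '', text)


# The value from the traceback: the no-break space is dropped.
from_log = keep_digits_and_dots('391\xa0529.23')

# Assumed input with '.' as thousands separator and ',' as decimal mark:
# the comma is dropped, so the result parses but no longer means the same number.
comma_decimal = keep_digits_and_dots('391.529,23')

print(from_log, comma_decimal)
```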

@DUOLabs333
Contributor

Should I make a pull request?

@benbusby
Owner

@DUOLabs333 I got it, thanks though
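
The fix that actually landed is not shown in this thread. As a rough illustration only, one way to make sure a bad amount can never raise inside the request handler is to combine the cleanup with a guarded conversion. `parse_amount` is a hypothetical helper, not Whoogle's code.

```python
import re
from typing import Optional


def parse_amount(text: str) -> Optional[float]:
    """Return the amount as a float, or None if it cannot be converted."""
    cleaned = re.sub(r'[^\d.]', '', text)
    try:
        return float(cleaned)
    except ValueError:
        # Nothing numeric left, or malformed input such as '1.2.3'.
        return None


print(parse_amount('391\xa0529.23'))
print(parse_amount('n/a'))
```

A caller could then skip rendering the currency card when the result is `None` instead of returning an Internal Server Error.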

TrueMysterious added a commit to TrueMysterious/whoogle-search that referenced this issue Feb 28, 2022
* Make `/config` directory writable by all (benbusby#616)

The `/config` directory needs to be writable by all in order to run the container
as a non-root user.

* Run container as non-root `whoogle` user (benbusby#617)

Creates a non-root user ("whoogle"), and runs the container as that user.

* Fix docker-compose.yml permission errors (benbusby#623)

* Refactor Docker CI workflows

Split previous docker test CI into one for PRs and one for triggering
the main buildx workflow that deploys new images to Docker Hub.

Note that this needs to be further refactored soon to use reusable
workflows. The main portion of docker/docker-compose tests is duplicated
between the new main + test workflows.

* Move bangs init to bg thread

Initializing the DDG bangs when running whoogle for the first time
creates an indeterminate amount of delay before the app becomes usable,
which makes usability tests (particularly w/ Docker) unreliable. This
moves the bang json init to a background thread and writes a temporary
empty dict to the bangs json file until the full bangs json can be used.

* Remove trailing whitespace

* Use `test` image tag for docker-compose tests

Also adds the ability to overwrite the image in docker-compose.yml,
which allows the CI build to use the same image for all docker tests.
The default is still 'benbusby/whoogle-search' though.

* Remove bash dependency

Depending on bash wasn't strictly necessary, as the two minimal scripts
in the repo were both nearly POSIX anyways.

Aside from simplifying the repo's dependencies a little bit, this also
helps reduce the overall Docker image size as an added bonus.

* Add nightly container vuln scan

Introduces a new 'scan' workflow for scanning the main branch container for
vulnerabilities nightly. By default, this will fail for any 'medium' or higher
vulnerability. 

Fixes benbusby#613

* Bump version to 0.7.1

* Remove broken public instance [skip ci]

search.exonip.de now redirects to startpage

Fixes benbusby#635

* Run buildx workflow on new tag

Fixes benbusby#630

* Add note for fosshost instance [skip ci]

The fosshost team decommissioned the region that Whoogle was hosted in,
but hasn't provided an option to transfer the domain record to the new VM. Until
that is fixed, the instance is inaccessible.

* Read `WHOOGLE_CONFIG_DISABLE` var as bool in app init

Fixes benbusby#636, which pointed out that the var was being interpreted as
"active" (config hidden) regardless of the value that was set.

* Override new Google search result formatting

There have been some recent formatting changes made by Google for search
results that do not look good (especially for dark themes). This
mostly overrides those styles to resemble the original Whoogle
result formatting.

* Amend body width formatting in search css

`min-width` is a better field to override than `max-width`, since some
users prefer full width results.

* Push images to ghcr.io

Alternative container registries like ghcr.io are a good option for anyone
seeking to avoid things like docker hub's latest changes to rate limiting

* Fix incorrect min-width for mobile screen sizes

min-width was previously set to 736px for all screen sizes, which forced
content off screen for smaller devices such as mobile phones. This
modifies the search stylesheet to only apply a min-width style to
devices > 800px wide.

* Update minimal mode for new Google formatting (benbusby#637)

Google's latest formatting changes broke the modifications made when enabling
`WHOOGLE_MINIMAL`. This updates the result filtering to work with the new
changes.

Fixes benbusby#634

* Fix Sinhala translation for farside search (benbusby#594)

* Use consistent header for all result types (benbusby#535)

Introduces a header for switching between result types (i.e. "All", "News",
etc) that is consistent between the different result types. Previously, image
results had a tab header that was formatted in a drastically different manner,
which was jarring when switching from a different result page to the Images
page.

Created a G class enum to reference class names returned in search
results. As noted in the class doc, this should only be used/updated as
a last resort, as class names change frequently. For some instances,
such as replacing the tbm tab, it's a lot easier to just replace by
header name than attempting to replace it based on how the element is
structured.

Also updated a few styles to revert the latest styling changes being
applied by Google.

Co-authored-by: jacr13 <ramos.joao@protonmail.com>
Co-authored-by: Ben Busby <contact@benbusby.com>

* Clean "Show more results" of all site blocks (benbusby#646)

* Add new instance to readme [skip ci] (benbusby#647)

https://whoogle.esmailelbob.xyz

* Add public instance to instance list [skip ci]

https://whoogle.esmailelbob.xyz

Amendment to benbusby#647

* Add gowogle.voring.me as public instance (benbusby#650)

Also removes fosshost instance from readme

From @benbusby:
I'm unable to get in touch with fosshost support about the whoogle
instance being unavailable, and am no longer interested in
maintaining the instance due to the lack of communication.

* Add new public instances to txt list [skip ci]

Missing from benbusby#650

* Check for soup body in `remove_site_blocks` (benbusby#651)

Fixes error with `remove_site_blocks` in the Images tab

* Fix `collapse_sections` for `MINIMAL_MODE` (benbusby#654)

* Fix "my ip" search regression

Removes dependency on class names for creating the "my ip" info card in
the results list for searches pertaining to the user's public IP.

Adds test to prevent this from happening again.

Note to anyone reading this and looking to contribute: please avoid
using hardcoded class names at all costs. This approach of
creating/removing content just results in issues if/when Google decides
to introduce/remove class names from the result page.

Fixes benbusby#657

* Check for updates using 24 hour time delta

Rather than only checking for an available update on app init, the check
for updates now performs the check once every 24 hours on the first
request sent after that period.

This also now catches the requests.exceptions.ConnectionError that is
thrown if the app is initialized without an active internet connection.

Fixes benbusby#649

* Give `Accept-Language` div its own class (benbusby#659)

Fixes accidental assignment of "get-only" class to the
"Accept-Language" config option

* Ensure valid str->float conv in currency calc

Currency amounts returned by google seem to randomly include unicode
chars ('\xa0' noted in benbusby#642) which broke the currency calculator
included in the project. This ensures that only strings that can be
converted to float are ever used in the conversion.

Fixes benbusby#642

* [Docker] Split config dir creation/set permissions

If the config dir already exists, setting the mode (`-m 777`) doesn't
actually work as it should. This change splits the command into two
separate commands for directory creation and enabling the directory to
be writable by all.

Fixes benbusby#658

* Upgrade Python image in Dockerfile (benbusby#669)

Vulnerable Python image upgraded to python:3.11.0a5-alpine

* Configure setup() using setup.cfg (benbusby#667)

Dependencies are not read from requirements.txt intentionally, so only
direct dependencies without version pinning are included.

Setuptools documentation:
https://setuptools.pypa.io/en/latest/userguide/declarative_config.html

* Update ad filter

Recent changes to ads in search results caused Whoogle to display ads
for certain searches. In particular, ads recently started appearing
grouped into one div, as opposed to a singular ad per div. This was
accompanied by the div label "ads" (instead of just "ad"), which threw
off the existing ad filter. The ad keyword blacklist has been updated
accordingly, and has been enhanced to only check against alpha chars for
each label.

This only seems to have affected English language searches, and only for
very specific searches.

Co-authored-by: ras07 <17038818+ras07@users.noreply.github.com>
Co-authored-by: nakoo <4975021+nakoo@users.noreply.github.com>
Co-authored-by: Ben Busby <contact@benbusby.com>
Co-authored-by: DUO Labs <dvdugo333@gmail.com>
Co-authored-by: සයුරි | Sayuri <85907926+sayuri-gi@users.noreply.github.com>
Co-authored-by: Joao A. Candido Ramos <joao.candido@etu.unige.ch>
Co-authored-by: jacr13 <ramos.joao@protonmail.com>
Co-authored-by: Esmail EL BoB <github.defilable@simplelogin.co>
Co-authored-by: Kainoa Kanter <44733677+ThatOneCalculator@users.noreply.github.com>
Co-authored-by: Nitish Yadav <nitishy01@gmail.com>
Co-authored-by: Albony Cal <67057319+Albonycal@users.noreply.github.com>
Co-authored-by: jan Anja <bs@sysrq.in>