This repository has been archived by the owner on Sep 7, 2023. It is now read-only.

[mod] by default allow only HTTPS, not HTTP #2641

Merged
merged 1 commit into from Mar 12, 2021

dalf
Contributor

@dalf dalf commented Mar 8, 2021

What does this PR do?

This PR disables HTTP by default and allows only the HTTPS protocol.

HTTP can be explicitly enabled per engine using enable_http: True.

Why is this change important?

PR #2373 detects whether an engine uses HTTP instead of HTTPS by calling the request function of each engine.
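As a rough illustration of this kind of check (not the actual code from #2373), the sketch below calls an engine's request function and inspects the scheme of the URL it produces. The engine function, the contents of params, and the helper name uses_plain_http are all hypothetical.

```python
from urllib.parse import urlsplit


def example_engine_request(query, params):
    # Hypothetical engine: it only fills in the URL that should be fetched.
    params['url'] = 'http://example.org/search?q=' + query
    return params


def uses_plain_http(request_func):
    """Return True if the URL built by request_func uses the http scheme."""
    params = {'url': None, 'pageno': 1}  # assumed minimal params
    request_func('test', params)
    url = params.get('url')
    return url is not None and urlsplit(url).scheme == 'http'


print(uses_plain_http(example_engine_request))
```

A check like this only sees the URL left in params; it cannot see any request an engine sends on its own.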

But some engines send HTTP(S) requests in the request function:

Also, some other engines send additional requests in the response function:

This PR minimizes the number of HTTP(S) requests sent each time searx starts (and each time it is tested).

In addition, it makes sure searx never sends an HTTP request, even if an engine sends more than one request (unless HTTP is explicitly enabled).
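One way to get this behavior with the requests library is to enforce the restriction at the session level rather than per engine: a session with no adapter mounted for http:// refuses every plain-HTTP URL, however many requests an engine sends. The sketch below is an assumption about the general technique, not a copy of searx/poolrequests.py; build_session and its enable_http parameter are illustrative names.

```python
import requests
from requests.adapters import HTTPAdapter


def build_session(enable_http=False):
    """Build a session that only speaks HTTPS unless enable_http is set."""
    session = requests.Session()
    # A fresh Session mounts adapters for both 'http://' and 'https://';
    # drop them and mount only the schemes we want to allow.
    session.adapters.clear()
    session.mount('https://', HTTPAdapter())
    if enable_http:
        session.mount('http://', HTTPAdapter())
    return session


session = build_session(enable_http=False)
try:
    # Rejected while looking up an adapter, before any connection is opened.
    session.get('http://example.org/', timeout=5)
except requests.exceptions.InvalidSchema as exc:
    print('blocked:', exc)
```

Because the check happens when the session looks up an adapter for the URL, it applies to every request made through that session, including extra requests sent from an engine's request or response function.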

How to test this PR locally?

  • In settings.yml, set enable_http: False in the library genesis section.
  • Run a search using this engine.
  • The response should be:
    image
  • The log should show:
ERROR:searx.search.processor.online:engine library genesis : requests exception(search duration : 0.08051466941833496 s, timeout: 7.0 s) : No connection adapters were found for 'http://libgen.rs/search.php?req=time'
Traceback (most recent call last):
  File "/home/alexandre/code/searx/searx/search/processors/online.py", line 143, in search
    search_results = self._search_basic(query, params)
  File "/home/alexandre/code/searx/searx/search/processors/online.py", line 123, in _search_basic
    response = self._send_http_request(params)
  File "/home/alexandre/code/searx/searx/search/processors/online.py", line 95, in _send_http_request
    response = req(params['url'], **request_args)
  File "/home/alexandre/code/searx/searx/poolrequests.py", line 209, in get
    return request('get', url, **kwargs)
  File "/home/alexandre/code/searx/searx/poolrequests.py", line 181, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/home/alexandre/code/searx/local/py3/lib/python3.8/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/alexandre/code/searx/local/py3/lib/python3.8/site-packages/requests/sessions.py", line 649, in send
    adapter = self.get_adapter(url=request.url)
  File "/home/alexandre/code/searx/local/py3/lib/python3.8/site-packages/requests/sessions.py", line 742, in get_adapter
    raise InvalidSchema("No connection adapters were found for {!r}".format(url))
requests.exceptions.InvalidSchema: No connection adapters were found for 'http://libgen.rs/search.php?req=time'

@dalf dalf requested review from asciimoo and kvch March 8, 2021 10:49
@kvch kvch merged commit a1a492b into searx:master Mar 12, 2021
MarcAbonce added a commit to MarcAbonce/searx that referenced this pull request Mar 15, 2021
regression from searx#2641
most onion websites only serve HTTP, so it must be enabled
jhigginbotham added a commit to jhigginbotham/searx that referenced this pull request Mar 19, 2021
Added a line to the yacy entry to enable HTTP if the local yacy instance isn't using HTTPS. Otherwise, an error will be thrown in the logs: "No connection adapters were found for 'http://localhost:8090/yacysearch.json...'". This is likely related to ticket searx#2641 that forces HTTPS by default.
@dalf dalf deleted the disable_http_by_default branch April 27, 2021 06:37