This repository has been archived by the owner on Sep 7, 2023. It is now read-only.

Problem with lobste.rs (and possibly other search engines): timeout #1738

Closed
x-0n opened this issue Nov 5, 2019 · 5 comments · Fixed by #2253

x-0n commented Nov 5, 2019

Issue:

When running a search, lobste.rs fails for me. On the results page, I get the notification

"Engines cannot retrieve results: lobste.rs (timeout)".

In /var/log/uwsgi/uwsgi.log it looks like this:

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (2): lobste.rs:443
WARNING:searx.search:engine timeout: lobste.rs
ERROR:searx.search:engine lobste.rs : HTTP requests timeout(search duration : 8.756943225860596 s, timeout: 8.0 s) : ReadTimeout

In my settings.yml, I have the following settings:

  - name : lobste.rs
    engine : xpath
    search_url : https://lobste.rs/search?utf8=%E2%9C%93&q={query}&what=stories&order=relevance
    results_xpath : //li[contains(@class, "story")]
    url_xpath : .//span[@class="link"]/a/@href
    title_xpath : .//span[@class="link"]/a
    content_xpath : .//a[@class="domain"]
    categories : it
    shortcut : lo
    timeout : 10.0
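
(One thing worth checking, depending on the searx version: per-engine timeouts can be capped by the global outgoing settings, so an engine `timeout : 10.0` may still be clamped to a lower value. The option names below follow the searx settings documentation and may not exist in every release; treat this as a sketch, not a confirmed fix.)

```yaml
# Hypothetical settings.yml fragment -- verify option names for your version
outgoing:
  request_timeout: 3.0       # default timeout applied to engines
  max_request_timeout: 15.0  # upper bound for per-engine "timeout" overrides
```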

But the preferences page in the web UI shows this:
[screenshot of the engine preferences page]

Expected result

  • lobste.rs should respect the timeout set in settings.yml and thus not be cut off by the default maximum timeout

Steps to reproduce

  • Activate lobste.rs in prefs
  • run any search query

Version

  • docker image searx/searx:latest (0.15.0-186-42d5e2c0)
@return42
Contributor

No need to activate lobste.rs in the prefs, just use !lo foo. With that I can confirm the behavior.

@return42 return42 added the bug label Dec 10, 2019
@return42
Contributor

return42 commented Dec 14, 2019

I tested from different nodes in DE and never got a response in under 6 s.

curl https://lobste.rs/search?q=foo >/dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 37971    0 37971    0     0   6027      0 --:--:--  0:00:06 --:--:--  8254

Set the timeout to 10 s and then got this error:

File "/share/searx/searx/engines/xpath.py", line 53, in extract_url
    raise Exception('Empty url resultset')

A fixed url_xpath (see below) helped, but results_xpath returns \n on some elements, which causes the next exception.

I will stop here, maybe there's someone who wants to pick up the baton.

modified   searx/settings.yml
@@ -421,11 +421,13 @@ engines:
     engine : xpath
     search_url : https://lobste.rs/search?utf8=%E2%9C%93&q={query}&what=stories&order=relevance
     results_xpath : //li[contains(@class, "story")]
-    url_xpath : .//span[@class="link"]/a/@href
+    url_xpath : .//span[contains(concat(' ', @class, ' '), ' link ')]/a/@href
     title_xpath : .//span[@class="link"]/a
     content_xpath : .//a[@class="domain"]
     categories : it
     shortcut : lo
+    timeout: 10
+    disabled: True
 
   - name : microsoft academic
     engine : microsoft_academic
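
(Why the token-safe XPath in the diff is needed: `@class="link"` only matches when `link` is the element's entire class attribute, while `contains(concat(' ', @class, ' '), ' link ')` matches `link` as one token among several. A pure-Python sketch of the distinction; the multi-class values below are made up for illustration, not taken from lobste.rs markup.)

```python
# XPath's contains(concat(' ', @class, ' '), ' link ') emulates a
# whole-token class match; this helper performs the same test in Python.
def has_class_token(class_attr: str, token: str) -> bool:
    # Pad both strings with spaces so tokens match only at word boundaries.
    return f" {token} " in f" {class_attr} "

# Exact comparison (what @class="link" does) misses multi-class elements:
assert "link h-entry" != "link"
# Token matching finds "link" among several classes...
assert has_class_token("link h-entry", "link")
# ...but does not match substrings of other class names:
assert not has_class_token("hyperlink", "link")
```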

@unixfox
Member

unixfox commented Mar 4, 2020

@return42 why don't you push your commit into the master branch of searx?

@return42
Contributor

return42 commented Mar 4, 2020

Because it is only a partial bugfix, not the solution. Read again:

A fixed url_xpath helped, but results_xpath returns \n on some elements, which causes the next exception.
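
(The \n problem could be worked around generically by filtering whitespace-only nodes out of the XPath result list before URL extraction. A minimal sketch in plain Python; the helper name is hypothetical and is not searx's actual extract_url.)

```python
# Hypothetical sketch: XPath text results sometimes include nodes that are
# only whitespace (e.g. "\n"); dropping them before URL extraction avoids
# an "Empty url resultset"-style failure on otherwise valid result items.
def clean_xpath_results(raw_results):
    """Strip each string result and drop whitespace-only entries."""
    return [r.strip() for r in raw_results if r.strip()]

raw = ["\n", "https://lobste.rs/s/abc123/some_story", "   "]
print(clean_xpath_results(raw))  # only the real URL survives
```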

@kvch
Member

kvch commented Oct 9, 2020

I could not reproduce the timeout problem you are describing. Whenever I changed the timeout and restarted searx, the UI displayed the correct value and the search was either done or not depending on the setting.

I have submitted a PR with an XPATH fix which closes the issue. @x-0n if the timeout problem still persists, feel free to reopen this issue.
