SearXNG sometimes redirects to main page without query #3191

Open
mooreye opened this issue Feb 8, 2024 · 25 comments
Labels
area: limiter (Things about the limiter bot protection) · bug (Something isn't working)

Comments

@mooreye

mooreye commented Feb 8, 2024

See: libredirect/browser_extension#880

Problem encountered in Firefox extension Libredirect, but they believe this is SearXNG's problem.

@mooreye mooreye added the bug Something isn't working label Feb 8, 2024
@unixfox
Member

unixfox commented Feb 8, 2024

It's possible that something does not work great with the anti bot protection.

@return42
Member

return42 commented Feb 8, 2024

Can't reproduce the issue with my instance https://darmarit.org/searx/ or https://paulgo.io/ you mentioned in libredirect/browser_extension#880 (comment) .. not sure what the issue is you have.

@return42 return42 closed this as completed Feb 8, 2024
@Austin-Olacsi
Contributor

Austin-Olacsi commented Feb 9, 2024

I can say for sure that this is not an issue with Libredirect. It has happened to me on some occasions, and I do not have this extension.

I don't know how to reproduce it, unfortunately. It just happened to me on my private instance a few moments ago. It usually does not happen and seems very random.

@return42 return42 reopened this Feb 10, 2024
@gabeklavans

gabeklavans commented Feb 14, 2024

Happens to me with enough frequency to be annoying (about once a week). Both when I search using Firefox opensearch (no extensions), and when I use a redirect on iOS safari. It seems to happen with about the same frequency with both access methods.

The only other variables for me are my DNS provider (cloudflare, on which I have proxy turned off), and nginx being my reverse proxy. But I'm not sure if the problem lies with either of those.

I will try searching directly from the search home page for a while to see if it still occurs.

@MadAim123

MadAim123 commented Feb 16, 2024

I have the same problem. I tested a lot of instances from https://searx.space/ and it happens randomly. I tried it with different browsers and devices and can't resolve it. I don't have any special plugins or extensions installed.

@return42
Member

@MadAim123

@return42 when I tested it just now (Firefox, Edge, Chrome on Windows), I couldn't reproduce it.
But it happens randomly, so maybe it will happen tomorrow or in 2 weeks, or maybe not. Did you change something in the routing? Or do you use vanilla?

@return42
Member

Vanilla installed by the scripts ..

./utils/searxng.sh install all
./utils/searxng.sh install apache

Not sure it's related .. I'm using Apache .. AFAIK the Docker images are using Caddy

@MadAim123

MadAim123 commented Feb 16, 2024

@return42 I tried it again now, after closing my browser and waiting 1 hour, and it happened with your instance too.
I clicked on your link (https://darmarit.org/searx/search?q=test&language=de&safesearch=0&categories=general&time_range=month) and was redirected to
https://darmarit.org/searx/

When I clicked a second time, I was shown the results and was not redirected to the start page.

With: Firefox, Windows. No extra plugins or extensions installed, cookies are saved, JavaScript is enabled.

@return42
Member

It's a pity, I can't reproduce this issue ..

Cookies are saved

Did you save any SearXNG preferences? If so, which prefs .. I am still looking for clues .. maybe this is related?

@unixfox
Member

unixfox commented Feb 16, 2024

@return42 what are the possible ways that the anti bot may force a redirect?

I can only see that it's the case when there are too many requests:

return flask.redirect(flask.url_for('index'), code=302)

@return42
Member

what are the possible ways that the anti bot may force a redirect?

Good point 👍 / I didn't have this in mind -->

return flask.redirect(flask.url_for('index'), code=302)

This redirects a browser to the index page (/) in cases where, for whatever reason, the browser has not requested the CSS ping.

The index page is not covered by the bot detection and can be loaded even if the IP has too many counts in the ip_limit.SUSPICIOUS_IP_WINDOW time window.

When the index page is loaded by the browser, the browser will send a CSS-ping request ..

suspicious = link_token.is_suspicious(network, request, True)
if not suspicious:
    # this IP is no longer suspicious: release ip again / delete the counter of this IP
    drop_counter(redis_client, 'ip_limit.SUSPICIOUS_IP_WINDOW' + network.compressed)
    return None

and the ip_limit.SUSPICIOUS_IP_WINDOW time window for this IP will be dropped.

This method is intended to ensure that a normal user is never blocked, even if their IP (for unknown reasons) has had too many accesses in the ip_limit.SUSPICIOUS_IP_WINDOW time window: the window is dropped when the index page is loaded.

One reason why a normal user ends up in the time window may be that requests are still coming from the same subnet that do not trigger a CSS ping request. Example: there is a bot and a normal user in the subnet ... then the normal user should not be blocked, even if we have to let the bot pass.
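As an illustration of the mechanism described above, here is a toy in-memory model. It is only a sketch under stated assumptions: a plain dict stands in for the Redis sliding window, a set stands in for the expiring ping key, and the class name, method names, and the threshold of 3 are all hypothetical rather than taken from SearXNG.

```python
class SuspiciousIpModel:
    """Toy in-memory model of the behavior described in this thread.

    Assumptions: a dict replaces the Redis sliding window, a set replaces
    the expiring ping key, and max_suspicious is an arbitrary value.
    """

    def __init__(self, max_suspicious=3):
        self.max_suspicious = max_suspicious
        self.counters = {}   # network -> searches seen without a CSS ping
        self.pinged = set()  # networks whose browser fetched the CSS ping

    def load_index(self, network):
        """The index page is always served; a browser then sends the CSS ping."""
        self.pinged.add(network)

    def search(self, network):
        """Return "results" or "redirect-to-index" for one search request."""
        if network in self.pinged:
            # Ping seen: no longer suspicious, so the counter is dropped.
            self.counters.pop(network, None)
            return "results"
        count = self.counters.get(network, 0) + 1
        self.counters[network] = count
        if count > self.max_suspicious:
            return "redirect-to-index"
        return "results"


model = SuspiciousIpModel(max_suspicious=3)
outcomes = [model.search("192.0.2.0/24") for _ in range(4)]
model.load_index("192.0.2.0/24")  # the browser follows the redirect
outcomes.append(model.search("192.0.2.0/24"))
print(outcomes)
```

In this model the fourth search without a ping is redirected, and the search after the index page has been loaded succeeds again, which matches the "second click works" reports earlier in the thread.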

I'll have to analyze this in more detail ... the key is generated here

def get_ping_key(network: IPv4Network | IPv6Network, request: flask.Request) -> str:
    """Generates a hashed key that fits (more or less) to a *WEB-browser
    session* in a network."""
    return (
        PING_KEY
        + "["
        + secret_hash(
            network.compressed + request.headers.get('Accept-Language', '') + request.headers.get('User-Agent', '')
        )
        + "]"
    )

And its lifetime is:

PING_LIVE_TIME = 3600
"""Livetime (sec) of the ping-key from a client (request)"""

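To make the inputs of the ping key concrete, the following sketch rebuilds a key from the same three values that get_ping_key() concatenates. The prefix, the secret, and the SHA-256 based secret_hash() below are hypothetical stand-ins, not the upstream definitions; only the choice of inputs (network, Accept-Language, User-Agent) comes from the snippet quoted above.

```python
import hashlib
from ipaddress import ip_network

PING_KEY = "example.ping"           # hypothetical prefix
SECRET = "example-instance-secret"  # hypothetical per-instance secret


def secret_hash(name):
    """Hypothetical stand-in for the secret_hash() helper used upstream."""
    return hashlib.sha256((name + SECRET).encode("utf-8")).hexdigest()


def ping_key(network, accept_language, user_agent):
    """Build a key from the same three inputs as get_ping_key()."""
    return PING_KEY + "[" + secret_hash(network.compressed + accept_language + user_agent) + "]"


net = ip_network("192.0.2.0/24")
key_a = ping_key(net, "en-US,en;q=0.5", "ExampleBrowser/1.0")
key_b = ping_key(net, "en-US,en;q=0.5", "ExampleBrowser/1.0")
key_c = ping_key(net, "de-DE,de;q=0.5", "ExampleBrowser/1.0")
print(key_a == key_b, key_a == key_c)
```

Identical inputs give the same key, while a change in any one of the three inputs gives a different key. Under this reading, a search request whose network or headers differ from those of the request that sent the CSS ping would not find the stored ping.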
@mooreye
Author

mooreye commented Feb 16, 2024

Did you save any SearXNG preferences? If so, which prefs .. I am still looking for clues .. maybe this is related?

For what it's worth, my browser with Libredirect extension has forgetful settings (no cookies, cache, site data saved), all is lost on browser quit and I still experienced the issue on SearXNG instances that I visit for the first time in current session.

@KrypticKahos

If additional information is required for this bug, I should be able to assist. I'm having a similar issue, except that it occurs on every search. My setup is as follows:
I have two instances of SearXNG running, one on an Unraid machine and the other on a Raspberry Pi. I have a local reverse proxy set up for failover.

If I connect directly to the specific SearXNG instances, I have no issues and searching works as expected. However, if I use the reverse proxy address, my searches always redirect to the home page in Firefox. This issue persists on a MacBook as well as a Windows 11 machine.
With the exact same setup, I don't have any issues in Chrome.

@unixfox
Member

unixfox commented Apr 16, 2024

If it's personal usage you shouldn't have the anti bot features enabled.

It's not useful.

@KrypticKahos

If it's personal usage you shouldn't have the anti bot features enabled.

It's not useful.

I don't believe that is my issue. I don't have any rate limiters enabled that I know of, and the config shows it disabled in Docker.

@Kostrol

Kostrol commented May 27, 2024

This is a big, big issue when you are going between a lot of random instances with libredirect daily for each search.

  1. query a new search from the address bar
  2. sometimes land on empty page on certain SearXNG instances

Sometimes this will happen multiple times in a row and can get frustrating quickly.

A few selected offending instances I've found so far that show this behavior:
https://search.starless.one/
https://s.mble.dk/
https://paulgo.io/
https://search.in.projectsegfau.lt/
https://ooglester.com

@Fauli1221

I'm running into this on my own instance constantly

is there any config I can tweak that makes it less prominent?

@unixfox
Member

unixfox commented Jun 4, 2024

It's very easy to replicate, at least on these instances:

Just disable JavaScript and you will get the bug very easily on the first visit.

@ianweedlun

Same deal for me, goes away if I relaunch Firefox. Clearing cache/cookies beforehand doesn't seem to make a difference.

@gabeklavans

I think it's pretty clear this is a confirmed issue. I would suggest anyone considering adding a comment to only add potential solutions or discussions for solutions, so that everyone watching this issue doesn't keep getting "same here" notifications 😁

@unixfox unixfox mentioned this issue Jun 20, 2024
@dalf
Member

dalf commented Jun 20, 2024

@return42 @unixfox @Bnyro would it make sense to change this line:

return flask.redirect(flask.url_for('index'), code=302)

to something like this:

return flask.redirect(flask.url_for('index', error="too_many_query"), code=302) 

The error parameter is made up; the point is to let the users who have reported the issue here help debug it by looking at the URL.
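The idea of carrying the redirect reason in the URL can be sketched with the standard library alone. This is not Flask code: index_url_with_reason() is a hypothetical stand-in for what flask.url_for('index', error=...) would produce, and the parameter name "error", like the value "too_many_query", is the made-up example from the comment above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def index_url_with_reason(index_url, reason):
    """Append a diagnostic 'error' query parameter to the index URL."""
    return index_url + "?" + urlencode({"error": reason})


def redirect_reason(landing_url) -> Optional[str]:
    """Read the diagnostic parameter back from the URL a user landed on."""
    values = parse_qs(urlsplit(landing_url).query).get("error")
    return values[0] if values else None


landing = index_url_with_reason("https://example.org/searx/", "too_many_query")
print(landing)
print(redirect_reason(landing))
```

A user who lands on the main page could then paste the URL from the address bar into a report, and the reason could be read directly from it.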

@return42
Member

return42 commented Jul 2, 2024

To all users affected by these unintentional redirects to SearXNG's main page: a recent discussion on #searxng:matrix.org suggests that this problem may be caused by an inadequate proxy setup.

Make sure that X-Forwarded-For is passed correctly from the proxy: it must be a list of IPs in which the last IP is the proxy's IP and the second to last (the first) is the client's IP. Some proxies transmit an X-Real-IP header; this should then be trustworthy, so please check whether it is the real IP of the client.

The following error messages in the logs suggest an incorrect configuration:

WARNING:searx.botdetection: IP from X-Real-IP (random-ip) is not equal to IP from X-Forwarded-For (proxy-ip)
WARNING:searx.botdetection: IP from WSGI environment (proxy-ip) is not equal to IP from X-Real-IP (random-ip)

If the client IP is determined incorrectly, for example if the IP of the proxy is mistakenly used instead of the client IP, then the entire bot defense does not work and you are redirected to the main page again and again!
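A quick way to reason about such warnings is to compare the address derived from each source. The helper below is a hypothetical diagnostic, not part of SearXNG: each argument is the single address your deployment derives from one source (or None if that source is absent), and the message wording is illustrative only.

```python
from ipaddress import ip_address


def find_ip_mismatches(forwarded_for_ip, real_ip, wsgi_ip):
    """Return one description per pair of sources that disagree.

    Hypothetical helper: each argument is the address derived from one
    source, or None when that source is not available.
    """
    sources = [
        ("X-Forwarded-For", forwarded_for_ip),
        ("X-Real-IP", real_ip),
        ("WSGI environment", wsgi_ip),
    ]
    # Parse the addresses so that equivalent spellings compare as equal.
    present = [(name, ip_address(value)) for name, value in sources if value]
    problems = []
    for i, (name_a, ip_a) in enumerate(present):
        for name_b, ip_b in present[i + 1:]:
            if ip_a != ip_b:
                problems.append(f"{name_a} ({ip_a}) differs from {name_b} ({ip_b})")
    return problems


# Documentation addresses: 203.0.113.7 plays the client, 198.51.100.1 the proxy.
for line in find_ip_mismatches("203.0.113.7", "198.51.100.1", "198.51.100.1"):
    print(line)
```

An empty result means all available sources agree on one address. Any entry in the list points at the header or setting to inspect in the proxy configuration.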

@return42
Member

return42 commented Jul 2, 2024

The client IP is determined by the get_real_ip(..) function. This function is also used by the "Self information" plugin.

A very simple and minimal check that client IP is determined correctly can be done with this plugin:

  1. first activate the plugin in your preferences ...

image

  2. Use the search term my ip .. and the IP of your client should be reported at the top of the result list ..

image

The example in the screenshot is a request from my subnet and the IP is the IP of my DSL router.

You can verify the IP at https://whatismyipaddress.com/

@myned

myned commented Jul 12, 2024

I can confirm that this is related to at least a single proxy setup.

In my case, the Self Information plugin was originally returning the IP of my proxy because of a few reasons:

  • X-Forwarded-For: CLIENTIP, PROXYIP (correct, but get_real_ip() returns PROXYIP in searxng logs, fixed by adding x_for = 2 to limiter.toml)
  • X-Real-IP: PROXYIP (incorrect because the searxng-docker Caddyfile uses header_up X-Real-IP {remote_host}, fixed by using {client_ip} with trusted_proxies)
  • WSGI environment: PROXYIP (incorrect due to Configurable x_for for both the WSGI app and for botdetection #3632, but did not affect plugin result)

After the above fixes, the plugin now correctly returns the client IP.
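The first point in the list above (x_for = 2 picking CLIENTIP instead of PROXYIP) can be read as counting entries from the right of the X-Forwarded-For list. The sketch below follows the way Werkzeug's ProxyFix middleware treats its x_for argument. Whether get_real_ip() does exactly the same is an assumption here, and pick_forwarded_ip() is a hypothetical name.

```python
from typing import Optional


def pick_forwarded_ip(header_value: str, trusted: int) -> Optional[str]:
    """Return the entry `trusted` positions from the right, or None.

    With trusted=1 the last entry is used; with trusted=2 the one before it.
    """
    if trusted < 1:
        return None
    values = [part.strip() for part in header_value.split(",") if part.strip()]
    if len(values) < trusted:
        return None
    return values[-trusted]


# Documentation addresses: 203.0.113.7 plays the client, 198.51.100.1 the proxy.
header = "203.0.113.7, 198.51.100.1"
print(pick_forwarded_ip(header, 1))
print(pick_forwarded_ip(header, 2))
```

With a header of the form CLIENTIP, PROXYIP, a count of 1 selects the proxy's address and a count of 2 selects the client's, which is consistent with the fix reported above.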
