Some engines don't obey chosen language #1838

Open
geckolinux opened this issue Feb 9, 2020 · 24 comments · May be fixed by #1866

Comments

@geckolinux geckolinux commented Feb 9, 2020

Hi there, I noticed that when enabling Yandex I get Russian results mixed in, despite having set English as the language:

Screenshot_2020-02-08 patagonia - searx

And Bing news always returns results only in German:

Screenshot_2020-02-08 coronavirus - searx

@tbergeron tbergeron commented Feb 25, 2020

I'm wondering the same! I use a German-hosted mirror and all news content is in German except for Faroo's content. Does anyone have an idea for a workaround?

Collaborator

@return42 return42 commented Feb 25, 2020

Yandex and Faroo do not support languages. To see which engines support languages, take a look at Preferences / Engines / column "Selected language".

@return42 return42 closed this Feb 25, 2020

@tbergeron tbergeron commented Feb 25, 2020

@return42 I'm talking about Google, Bing, etc. i.e. http://0x0.st/iqDv.png

@tbergeron tbergeron commented Feb 25, 2020

@return42 both Google and Bing support "selected language" http://0x0.st/iqD3.png

so this issue is valid.

This happens because the mirror we use is hosted in another country so when fetching results from Google, Google thinks we're from that country. So when Searx sends a request to Google or Bing it should specify the language it wants results for. (I'll dig up the code to see if I can PR anything)
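A minimal sketch of what "specify the language it wants results for" could look like inside an engine. It assumes the searx convention of an engine-level `request(query, params)` function that fills in `params['url']`; the base URL and the `lang` query parameter are hypothetical placeholders, not the real Google or Bing parameters.

```python
from urllib.parse import urlencode

# Hypothetical upstream endpoint, used only for illustration.
BASE_URL = "https://news.example.org/search?"


def request(query, params):
    """Build the outgoing URL, forwarding the user's language if one is set."""
    args = {"q": query}
    language = params.get("language", "all")
    if language != "all":
        # 'lang' is an assumed parameter name; each real engine has its own.
        args["lang"] = language
    params["url"] = BASE_URL + urlencode(args)
    return params


if __name__ == "__main__":
    print(request("patagonia", {"language": "en"})["url"])
```

If the language is left out of the outgoing URL, the upstream service is free to guess from the server's IP address, which would match the behaviour described above.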

Author

@geckolinux geckolinux commented Feb 25, 2020

I agree that this issue is indeed valid, because Bing is supposed to support language filtering, and it frequently contaminates SearX results with wrong-language (mostly German) results.

Collaborator

@return42 return42 commented Feb 25, 2020

Sorry, I can't reproduce it .. e.g. my host is located in Germany and when I test Google with English selected

I get only English results; if I test with German selected, I get only German results from Google:

Bing also seems to work

If you think I am wrong, ask to reopen the issue and give a concrete example / Thanks!

Collaborator

@return42 return42 commented Feb 25, 2020

BTW: please use searx's search syntax to test explicit combinations of languages and engines .. no need to activate the search engine in the preferences.
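As a rough illustration of such a test query, the sketch below assembles a search string that pins both language and engine, then turns it into a shareable URL. The `:en` language prefix appears elsewhere in this thread; the `!` engine prefix with the shortcut `bin` (intended as Bing News) and the instance URL are assumptions, so check the shortcuts listed in your instance's preferences.

```python
from urllib.parse import urlencode


def build_test_query(terms, language=None, engine=None):
    """Return a query string such as ':en !bin coronavirus'."""
    parts = []
    if language:
        parts.append(":" + language)  # language selector, e.g. ':en'
    if engine:
        parts.append("!" + engine)    # engine shortcut (assumed name)
    parts.append(terms)
    return " ".join(parts)


def build_test_url(instance, terms, language=None, engine=None):
    """Return a GET URL for the given instance that carries the test query."""
    query = build_test_query(terms, language, engine)
    return instance.rstrip("/") + "/?" + urlencode({"q": query})


if __name__ == "__main__":
    # 'searx.example.org' is a placeholder instance.
    print(build_test_url("https://searx.example.org", "coronavirus", "en", "bin"))
```

A URL built this way can be pasted into a report as-is, so the maintainers see exactly which engine and language were requested.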

@tbergeron tbergeron commented Feb 25, 2020

That's the funny thing. Even on my side when using :en patagonia in "News" it works. If I do patagonia I get German results. This should let you replicate this issue.

And it only bugs out when using POST!!! Just noticed this! I switched to GET to send you test URLs and both work :-/

So please try to do this:

  • Use English & POST in Preferences
  • Click on the News tab and search for :en patagonia: you won't see German results
  • Then on the same News tab search for patagonia and you should see German results even if English is selected

i.e. with GET: http://0x0.st/iqDk.png
with POST: http://0x0.st/iqDn.png

EDIT: Make sure to run the search / refresh a few times after swapping settings.
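To compare the two methods outside the browser, one could build the same search as a GET and as a POST request and send both to an instance. The sketch below only constructs the requests; the instance URL is a placeholder, and the form fields (`q`, `language`, `category_news`) are taken from the result URLs posted later in this thread, on the assumption that the instance accepts the same fields via POST.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder instance; replace with the instance under test.
INSTANCE = "https://searx.example.org/"


def build_requests(query, language="en"):
    """Return (get_request, post_request) carrying identical search fields."""
    encoded = urlencode({"q": query, "language": language, "category_news": "on"})
    get_req = Request(INSTANCE + "?" + encoded, method="GET")
    post_req = Request(INSTANCE, data=encoded.encode("ascii"), method="POST")
    return get_req, post_req


if __name__ == "__main__":
    get_req, post_req = build_requests("patagonia")
    print(get_req.get_method(), get_req.full_url)
    print(post_req.get_method(), post_req.full_url, post_req.data)
    # Pass either request to urllib.request.urlopen() to actually send it.
```

Sending both several times, as the EDIT above suggests, would help separate a real GET/POST difference from ordinary variation between runs.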

@tbergeron tbergeron commented Feb 25, 2020

So the language parameter doesn't seem to be passed to the engine when using POST. Only when using GET for some reason.

@tbergeron tbergeron commented Feb 25, 2020

Never mind, even when using GET, after refreshing a few times to see more results, Google and Bing are still in German... http://0x0.st/iqD7.png

Bonus fun fact: when using the language drop-down on the search page, switching to French actually displays French results from Bing and Google XD but switching back to English shows German results.

Collaborator

@return42 return42 commented Feb 25, 2020

Please do not mix up the bing and bing-news engines .. use search syntax and give me an example .. screenshots are not useful for debugging ;). I also cannot see different POST/GET results on my instance .. BTW we made some bugfixes to language support these days / did you update your instance?

@tbergeron tbergeron commented Feb 25, 2020

I understand. I'm still a bit new using Searx so I'm figuring out the UX and everything at once so thanks for your patience :P I wasn't aware of Searx having separate engines for search/news.

This issue seems to be only happening with Google News and Bing News engine after double checking "general" results. So I now know a bit more of where to focus to debug this issue.

This is not my instance so I cannot alter anything on it unfortunately. It's running on 0.16.0 so maybe it's not totally up-to-date? Do you see anything weird with these settings: http://0x0.st/iqDC.png

I'll do my best to come up with fully reproducible steps. At least now I know it's related only to the Bing News and Google News engines.

Collaborator

@return42 return42 commented Feb 25, 2020

> I understand. I'm still a bit new using Searx

welcome :)

I can't speak for other instances .. if unsure, use my instance and give me a concrete example with search syntax .. if you find a different result between GET and POST, name it so I can reproduce the misbehaviour / thanks!

@tbergeron tbergeron commented Feb 25, 2020

> welcome :)

thanks a lot :D

First thing when trying your instance: I clicked on News, searched for Coronavirus and behold! German results! http://0x0.st/iqkK.png (after the digg results; yes I know... another screenshot XD)

I'll do my best to hunt down that bug but it is definitely reproducible everywhere I tried, damn it :P I see there have been language-related changes in engines on Jan 7 (b63d645) but I doubt this is the reason why this happens... let's get my hands dirty! hehe

Thanks again for your warm welcome and help!

Collaborator

@return42 return42 commented Feb 25, 2020

The changes I mentioned are from the last few days ... #1860 should be related to this behaviour.

> Thanks again for your warm welcome and help!

No problem .. but .. please give me a search syntax ;)

Author

@geckolinux geckolinux commented Feb 25, 2020

  • search.disroot.org/?q=coronavirus&time_range=&language=en&category_news=on
    • Some Dutch results mixed in when enabling Bing, this instance is located in Holland.
  • searx.info/?q=coronavirus&time_range=&language=en&category_news=on
    • All German results, instance located in Germany
Collaborator

@return42 return42 commented Feb 25, 2020

Again, no need to switch to a category, no need to enable an engine in the preferences .. to see the misbehaviour of an engine, select this engine explicitly with search syntax.

Anyway, I see that bing-news puts German results on top for servers hosted in Germany, even if we select the English language ... same with google-news

For now the only thing I can say is that (some) news engines do not support languages .. I will have a look at bing-news and google-news.

@return42 return42 reopened this Feb 25, 2020

@tbergeron tbergeron commented Feb 25, 2020

> Again, no need to switch to a category, no need to enable a engine in the preferences .. to see misbehaviour of a engine select this engine explicit with search syntax.

Good to know even if it's disabled in preferences, we can still use the shortcuts. I wasn't aware of that. Sorry for not being able to come up with search syntax right away. I now understand better the concept. Thanks for your patience with us, I forked the code and will play around to learn the development environment and might start to contribute soon. I really love the project, great idea and great execution! Keep up the great work!

return42 added a commit to return42/searx that referenced this issue Feb 25, 2020
closes: asciimoo#1838

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

@tbergeron tbergeron commented Feb 25, 2020

@return42 just saw the commit already passing through, you are a mad man! Thanks so much for already figuring this out for us! How do releases work? I guess this change will be available on your instance sooner than any other since I guess admins need to update their instances manually? Sorry for throwing so many questions at you 😆

Collaborator

@return42 return42 commented Feb 25, 2020

> I guess this change will be available on your instance sooner than any other

First the PR needs to be reviewed & merged ..

> I guess admins need to update manually their instances?

We recommend updating from the git repo

See also my remark on why versioning does not make much sense / engines are a moving target, this issue is the best example

Collaborator

@return42 return42 commented Feb 25, 2020

@geckolinux @tbergeron

If you find more engines not working, please open a new issue with the search syntax explicitly selecting the engine and language, and we can react much faster ;)

Another tip: it is very easy to test an up-to-date instance locally (in your download folder, where you can delete it after your tests)

$ cd ~/Download
$ git clone https://github.com/asciimoo/searx/
$ cd searx
$ make run
Author

@geckolinux geckolinux commented Feb 25, 2020

@return42 Hey, cool, I didn't know it was that easy to test it. I just did as suggested.

So I'm still seeing that with Bing and Google news engines, it's giving me local results in a non-English language, despite me having specified English.

Collaborator

@return42 return42 commented Feb 25, 2020

The PR #1866 is not yet merged .. if you want to merge the PR locally just pull it:

git pull origin refs/pull/1866/head

.. if started, first stop make run with [CTRL-C] ..

Author

@geckolinux geckolinux commented Feb 25, 2020
