Google News engine gives bad URLs #1959
Comments
For reference / I addressed this issue in my rework of locales & languages:
has been merged, but this problem still needs to be investigated / issue still exists --> :en !gon abcnews
Google-News returns internal links where the origin URL is encoded in a base64 (RFC 2045 aka URL-safe) string. Closes: searxng#1959 Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
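To illustrate the idea in the commit message above, here is a minimal Python sketch of recovering an origin URL from a URL-safe base64 token. It is not the code from the PR: the helper name `decode_embedded_url`, the assumption that padding has been stripped, and the assumption that the URL sits among non-printable bytes are all hypothetical. The `-`/`_` alphabet that `base64.urlsafe_b64decode` accepts is the one specified in RFC 4648.

```python
import base64
import re
from typing import Optional


def decode_embedded_url(token: str) -> Optional[str]:
    """Hypothetical helper: decode a URL-safe base64 token and return
    the first http(s) URL found in the decoded bytes, if any."""
    # Assumption: trailing '=' padding was stripped, so restore it.
    padded = token + "=" * (-len(token) % 4)
    raw = base64.urlsafe_b64decode(padded)
    # Assumption: the URL is surrounded by control or non-ASCII bytes,
    # so a run of printable ASCII starting at "http" delimits it.
    match = re.search(rb"https?://[\x21-\x7e]+", raw)
    return match.group(0).decode("ascii") if match else None
```

Searching for the URL inside the decoded bytes, rather than assuming a fixed prefix length, is a deliberate choice here: it tolerates a varying number of leading control bytes.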
@AlyoshaVasilieva sorry for the late response to your issue report. I have now implemented #2306, which fixes the issue you reported. Would you like to test #2306? Thanks 👍
The PR fixed every news source I can find (thanks!), except one: Des Moines Register.
The URL after the control character (U+0001?) seems valid.
Follow up of 8de8070 to fix the issue reported by AlyoshaVasilieva [1]. [1] searxng#1959 (comment) Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
@AlyoshaVasilieva thanks for your elaborate tests 👍 ... they help me make the URL decoding more robust. I fixed the issue: If you see other issues, please let me know / thanks.
Version of SearXNG, commit number if you are using on master branch and stipulate if you forked SearXNG
2022.11.11-3a765113
How did you install SearXNG?
Script, also occurs in public instances
What happened?
Google News encodes the result URLs, and SearXNG does not decode them
How To Reproduce
Bing, Qwant give URL:
https://abcnews.go.com/Politics/ranked-choice-voting-decide-alaskas-senate-race/story?id=93063277
Google News gives URL:
https://abcnews.go.com/Politics/ranked-choice-voting-decide-alaskas-senate-race/story?id\\u003d93063277
The Google URL does not work.
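The `\u003d` in the Google News URL is a JSON-style Unicode escape for `=` (U+003D). As a rough illustration only, and not the fix that was eventually merged, a Python sketch that turns such escapes back into literal characters could look like this. The function name `unescape_unicode` is hypothetical.

```python
import re


def unescape_unicode(url: str) -> str:
    """Hypothetical helper: replace \\uXXXX escapes (with one or more
    leading backslashes) by the character they encode."""
    return re.sub(
        r"\\+u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        url,
    )


# In a Python string literal, "\\u003d" is one backslash followed by u003d.
broken = (
    "https://abcnews.go.com/Politics/"
    "ranked-choice-voting-decide-alaskas-senate-race/story?id\\u003d93063277"
)
fixed = unescape_unicode(broken)
```

With this input, `fixed` ends in `story?id=93063277`, which matches the URL that Bing and Qwant return.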
Repro 2:
Search "McConnell dismisses Scott's GOP leadership challenge"
Qwant gives URL:
https://abcnews.go.com/Politics/mcconnell-dismisses-scotts-gop-leadership-challenge-votes/story?id=93354279
Google gives URL:
https://abcnews.go.com/Politics/mcconnell-dismisses-scotts-gop-leadership-challenge-votes/story/x1aY/x17/x1dL/x0c/x0c/xd9/x0eL/xcc/xcdM/x0c/x8d/xceH/x97'
Expected behavior
URL is decoded properly