This repository has been archived by the owner on Sep 7, 2023. It is now read-only.

Add new engine: SJP - Słownik języka polskiego #2736

Merged
merged 3 commits into from Apr 16, 2021

plague-doctor
Contributor

What does this PR do?

Adds a new search engine: SJP - Słownik języka polskiego (The Great Dictionary of Polish).

The engine aggregates three major dictionaries of the Polish language:

  • Wielki słownik ortograficzny PWN (The Great Spelling Dictionary of the Polish Language)
  • Słownik języka polskiego PWN (Dictionary of Polish)
  • Słownik poprawnej polszczyzny (Dictionary of Correct Polish)

The engine presents responses in the infobox.
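
As a rough illustration of what "presents responses in the infobox" could look like, here is a minimal sketch. It assumes the infobox result shape that other searx engines use (a dict with an `infobox` title, a `content` string and a list of `urls`); the helper name `build_infobox`, the sample definitions and the example.org URL are hypothetical and are not taken from this PR.

```python
def build_infobox(word, definitions, source_url):
    """Build one infobox-style result dict (hypothetical helper, not from the PR)."""
    return {
        # Title shown at the top of the infobox.
        "infobox": word,
        # Body text: here the definitions are simply joined into one string.
        "content": "; ".join(definitions),
        # Links listed under the infobox.
        "urls": [{"title": "SJP", "url": source_url}],
    }


# Placeholder data for illustration only.
results = [
    build_infobox(
        "słownik",
        ["definition one", "definition two"],
        "https://example.org/sjp/slownik",
    )
]
```

An engine's response function would append such a dict to its list of results; how the definitions are grouped per dictionary is left open here.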

Why is this change important?

Searx currently lacks engines that help users improve their vocabulary. I am trying to fill this gap with wordnik.com for English (already in PR 2735) and now with SJP for Polish.

How to test this PR locally?

Run make run and search for, e.g., !sjp słownik.


raise_for_httperror(resp)
dom = fromstring(resp.text)
word = extract_text(dom.xpath('//*[@id="content"]/div/div[1]/div/div[1]/div[1]/div[2]/div/div/div[2]/div/div'))
Member

Do you mind simplifying the XPath expressions in the PR?
The page you are scraping is indeed very messy, but it would be nice to use simpler expressions so we could minimize the chance of this engine breaking if the page gets redesigned.

For starters, I think this line could be simplified to

word = extract_text(dom.xpath('//div[@class="query"]'))

I have started to simplify the other expressions, but it is a bit hard as I do not speak Polish (besides "polak, wegier, dwa bratanki // i do szabli, i do szklanki" :D :D :D) so I have no idea if I am selecting the right parts of the page.
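
To make the point about redesigns concrete, here is a small sketch comparing a positional path with a class-based one. It uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml so that it runs without dependencies; the two page snippets are hypothetical, heavily simplified markup and not the real site, and `text_of` is an illustrative helper rather than searx's `extract_text`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the scraped page.
PAGE = """
<html><body>
  <div id="content">
    <div>
      <div class="banner">ad</div>
      <div class="query">słownik</div>
    </div>
  </div>
</body></html>
"""

# The same content after an imagined redesign: the banner is gone and the
# wrapper element has changed from <div> to <section>.
REDESIGNED = """
<html><body>
  <div id="content">
    <section>
      <div class="query">słownik</div>
    </section>
  </div>
</body></html>
"""

# Depends on the exact nesting and on the element's position among siblings.
POSITIONAL = "./body/div/div/div[2]"
# Depends only on the class attribute of the target element.
BY_CLASS = ".//div[@class='query']"


def text_of(markup, path):
    """Return the stripped text of the first match, or None if nothing matches."""
    root = ET.fromstring(markup.strip())
    node = root.find(path)
    return None if node is None else "".join(node.itertext()).strip()


old_positional = text_of(PAGE, POSITIONAL)
old_by_class = text_of(PAGE, BY_CLASS)
new_positional = text_of(REDESIGNED, POSITIONAL)
new_by_class = text_of(REDESIGNED, BY_CLASS)
```

In this toy case both expressions find the word on the original markup, but only the class-based one still matches after the layout change. A class-based selector can of course break too if the class is renamed; it just depends on fewer details of the page.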

Contributor Author

Good advice @kvch.
Well, I can only say "egészségedre!" in your language. Probably the most important "do szklanki" :-)

Member

@kvch kvch left a comment

Please try to simplify the XPath expressions.

@kvch kvch merged commit 8362257 into searx:master Apr 16, 2021
Member

kvch commented Apr 16, 2021

Thank you!
