Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multizim search is allowed only on a monolanguage set of books #838

Merged
merged 7 commits into from
Nov 1, 2022

Conversation

veloman-yunkan
Copy link
Collaborator

Quickfix for #785

I started the work on this PR with some clean-up since the prospective location of the intended fix was not correctly identified in the beginning. The fix proper is limited to the small change in getSearchInfo() (and the newly introduced helper tiny function getLanguages() on which it depends). It is properly validated by the updated and enhanced server_search unit-test.

@codecov
Copy link

codecov bot commented Oct 21, 2022

Codecov Report

Base: 71.03% // Head: 71.06% // Increases project coverage by +0.03% 🎉

Coverage data is based on head (f0f31f6) compared to base (8cc1c47).
Patch coverage: 100.00% of modified lines in pull request are covered.

❗ Current head f0f31f6 differs from pull request most recent head d1b8519. Consider uploading reports for the commit d1b8519 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #838      +/-   ##
==========================================
+ Coverage   71.03%   71.06%   +0.03%     
==========================================
  Files          53       53              
  Lines        3701     3705       +4     
  Branches     2052     2062      +10     
==========================================
+ Hits         2629     2633       +4     
  Misses       1070     1070              
  Partials        2        2              
Impacted Files Coverage Δ
include/library.h 100.00% <ø> (ø)
src/server/internalServer.h 89.47% <ø> (ø)
src/tools/archiveTools.cpp 86.66% <ø> (-0.35%) ⬇️
src/book.cpp 87.83% <100.00%> (ø)
src/library.cpp 83.52% <100.00%> (ø)
src/server/internalServer.cpp 88.90% <100.00%> (+0.39%) ⬆️
src/server/request_context.cpp 83.80% <100.00%> (+0.98%) ⬆️
src/server/request_context.h 100.00% <100.00%> (ø)
include/name_mapper.h 71.42% <0.00%> (-28.58%) ⬇️
src/opds_dumper.cpp 99.16% <0.00%> (-0.02%) ⬇️
... and 1 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

src/book.cpp Show resolved Hide resolved
@@ -306,6 +316,10 @@ SearchInfo InternalServer::getSearchInfo(const RequestContext& request) const
{
auto bookIds = selectBooks(request);
checkBookNumber(bookIds.second, m_multizimSearchLimit);
if ( getLanguages(*mp_library, bookIds.second).size() != 1 ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It be better to extend getBooksLanguages and/or getBooksLanguagesWithCount with a bookIds parameter instead of implementing a new helper method.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not certain that this would be better but implemented it anyway. See the fixup commit. It ended up even worse than I expected it to be. I think you will agree that we better not include that change in the PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you will agree that we better not include that change in the PR.

I don't see why. Can you explain why you don't want ot include this change ?
You indeed have to pass the set of bookId to inner functions but there is not a lot we can do.
I would have add an overloading method (std::vector<std::string> getBooksLanguages(const BookIdCollection& bookIds) const and std::vector<std::string> getBooksLanguages() const) instead of a default argument but it would not fundamentally change your implementation.


Another solution would be to add another intermediate class like BookSet and implement some helper methods on it (getLanguages, getPublisher, ...) and have the Library inherit on BookSet and have a BookSet getBookSet(std::set<std::string> bookIds). (And we could have OPDSDumper also working on a BookSet).
But if you go this way, it should be in another PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why. Can you explain why you don't want ot include this change ?

Because it is too extensive for the small issue it is intended to address. Also in its current form it makes the API slightly inconsistent. An optional bookIds parameter was added only to Library::getBooksLanguages() and Library::getBooksLanguagesWithCounts() (the enhanced version of the latter not being justified by any current usage). From the abstract API's point of view it is not clear why Library::getBooksCategories(), Library::getBooksCreators() and Library::getBooksPublishers() don't deserve the same enhancement (in which case the change would become even larger).

You indeed have to pass the set of bookId to inner functions but there is not a lot we can do.

For the time being we can just get along with a custom small & simple helper function which keeps the total complexity of the full code at a lower level. Enhancing API's on every opportunity is not a good strategy. The ultimate goal should be maintaining globally optimal complexity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let move on with your small helper. If we need API enhancement in the future, we will make it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I undid/dropped the fixup commit made in response to this review comment.

static/i18n/en.json Show resolved Hide resolved
src/server/request_context.cpp Show resolved Hide resolved
Multizim search requires that all selected books be in the same
language.

No new URL query parameter was introduced for specifying the intended
search language - `books.filter.lang` can be used for that purpose.

The server_search unit-test was updated to use a slightly cheating
library xml file where the language of example.zim was tweaked from "en"
to "eng" in order to match that of zimfile.zim. Note that this change
drops from the tested server two other goofy ZIM files corner_cases.zim
and poor.zim that have been/are included in ServerTest.
Before this change RequestContext::get_query() returned a reordered
query string (alphabetically sorted by the parameter names).

This fix facilitiates testing of responses where the request URL appears
in the response.
@veloman-yunkan
Copy link
Collaborator Author

There was a tiny logical conflict with #837 which was resolved when rebasing after the latter was merged.

@mgautierfr mgautierfr merged commit a52138e into master Nov 1, 2022
@mgautierfr mgautierfr deleted the language_handling_during_search branch November 1, 2022 17:28
mgautierfr added a commit that referenced this pull request Nov 30, 2022
* [API Break] Remove wrapper around libzim (@mgautierfr #789)
* Allow kiwix-serve to use custom resource files (@veloman-yunkan #779)
* Properly handle searchProtocolPrefix when rendering search result (@veloman-yunkan #823)
* Prevent search on multi language content (@veloman-yunkan #838)
* Use new `zim::Archive::getMediaCount` from libzim (@mgautierfr #836)
* Catalog:
 - Include tags in free text catalog search (@veloman-yunkan #802)
 - Illustration's url is based on book's uuid (@veloman-yunkan #804)
 - Cleanup of the opds-dumper (@veloman-yunkan #829)
 - Allow filtering of catalog content using multiple languages (@veloman-yunkan #841)
 - Make opds-dumper respect the namemapper (@mgautierfr #837)
* Server:
 - Correctly handle `\` in suggestion json generation (@veloman-yunkan #843)
 - Better http caching (@veloman-yunkan #833)
 - Make `/suggest` endpoint thread-safe (@veloman-yunkan #834)
 - Better redirection of main page (@veloman-yunkan #827)
 - Remove jquery (@mgautierfr @juuz0 #796)
 - Better Viewer of zim content :
   . Introduce `/content` endpoints (@veloman-yunkan #806)
   . Switch to iframe based content viewer (@veloman-yunkan #716)
 - Optimised design of the welcome page:
   . Alignement (@juuz0 @kelson42 #786)
   . Exit download modal on pressing escape key (@Juzz0 #800)
   . Add favicon for different devices (@Juzz0 #805)
   . Fix auto hidding of the toolbar (@veloman-yunkan #821)
   . Allow user to filter books by tags in the front page (@juuz0 #711)
* CI :
  - Trigger CI on pull_request (@kelson42 #791)
  - Drop Ubuntu Impish packaging (@legoktm #825)
  - Add Ubuntu Kinetic packaging (@legoktm #801)
* Testing:
  - Test ICULanguageInfo (@veloman-yunkan #795)
  - Introduce fake `test` language to test i18n (@veloman-yunkan #848)
* Fix documentation (@kelson42 #816)
* Udpate translation (#787 #839 #847)
@mgautierfr mgautierfr mentioned this pull request Nov 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants