Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zimdump list --ns=<N> doesn't work as a filter #171

Closed
asashnov opened this issue Oct 29, 2020 · 10 comments
Closed

zimdump list --ns=<N> doesn't work as a filter #171

asashnov opened this issue Oct 29, 2020 · 10 comments

Comments

@asashnov
Copy link
Contributor

asashnov commented Oct 29, 2020

Program: zimdump (for zim-tools)
Version: current master (243e869)

Actual result:

The output of those commands below are equal:

$ ~/oz/zim-tools/build/src/zimdump list ~/oz/ZIM.files/storybox_en_2020-05.zim | wc -l
591

$ ~/oz/zim-tools/build/src/zimdump list --ns=M ~/oz/ZIM.files/storybox_en_2020-05.zim | wc -l
591

Expected result:

$ ~/oz/zim-tools/build/src/zimdump list --ns=M ~/oz/ZIM.files/storybox_en_2020-05.zim
Counter
Creator
Date
Description
Flavour
Language
Name
Publisher
Scraper
Source
Tags
Title

I expect --ns= to work as a filter, listing entries only from the specified namespace, skipping the others.

@kelson42
Copy link
Contributor

I believe I can confirm this problem I was already impacted myself. Is that easy to fix?

@kelson42 kelson42 pinned this issue Nov 11, 2020
@kelson42
Copy link
Contributor

@maneeshpm Would you like to look at this bug? This is a pretty anooying one for users.

@maneeshpm
Copy link
Contributor

Sure @kelson42, I'll start working on this one.

@maneeshpm
Copy link
Contributor

@kelson42 Even though we are providing the --ns= option, there is no handler for it in any of the zimdump subcmds. I could not find any function in libzim/archive.cpp that could directly help us filter by namespace. So I think we can add a condition like if (path[0] != nsfilter) continue if the ns filter is provided. Is there any better method?

@kelson42
Copy link
Contributor

@veloman-yunkan Do you believe his is the proper way of doing?

@veloman-yunkan
Copy link
Collaborator

@maneeshpm I guess you missed the zim::Archive::findByPath() method:

      /** Find a range of entry starting with path.
       *
       * The path is the "long path". (Ie, with the namespace)
       *
       * @param path The path prefix to search for.
       * @return A range starting from the first entry starting with path
       *         and ending past the last entry.
       *         If no entry starts with `path`, begin == end.
       */
      EntryRange<EntryOrder::pathOrder>  findByPath(std::string path) const;

@maneeshpm
Copy link
Contributor

Thanks for pointing it out @veloman-yunkan, I tried it, findByPath() throws a runtime error because when a namespace url like A is passed to this function, another function zim::parseLongPath() is called which throws a runtime error in such a case.

std::tuple<char, std::string> zim::parseLongPath(const std::string& longPath)
{
  /* Index of the namespace char; discard '/' from absolute paths */
  const unsigned int i = (longPath[0] == '/') ? 1 : 0;
  if (i + 2 >= longPath.size() || longPath[i] == '/' || longPath[i+1] != '/')
    throw std::runtime_error("Cannot parse path");

But modifying this function such that it returns the namespace along with an empty string as path when such a url is passed

std::tuple<char, std::string> zim::parseLongPath(const std::string& longPath)
{
  /* Index of the namespace char; discard '/' from absolute paths */
  const unsigned int i = (longPath[0] == '/') ? 1 : 0;
  if (i + 2 >= longPath.size() && longPath[i] != '/') 
    return std::make_tuple(longPath[i], "");
  if (i + 2 >= longPath.size() || longPath[i] == '/' || longPath[i+1] != '/')
    throw std::runtime_error("Cannot parse path");

After this modification:

$ zimdump list --ns=M khan-academy-videos_fr_amine_2020-06.zim                                                                                             
M/Counter
M/Creator
M/Date
M/Description
M/Flavour
M/Language
M/Name
M/Publisher
M/Scraper
M/Tags
M/Title

@kelson42 I believe this is the expected behavior.

@kelson42
Copy link
Contributor

@maneeshpm Yes, seems to be the proper behaviour but you should not have to modify the libzim. I let you sort ot these details and the review with @veloman-yunkan

@kelson42
Copy link
Contributor

@maneeshpm Please, prepare a PR (even in draft), it is not a good idea to work on code (review) base on the ticket.

@maneeshpm
Copy link
Contributor

Ohk @kelson42, thanks for the information.

maneeshpm added a commit to maneeshpm/zim-tools that referenced this issue Jan 18, 2021
- Throws runtime error in call to findByPath()->findx()->parseLongPath().
maneeshpm added a commit to maneeshpm/zim-tools that referenced this issue Jan 21, 2021
- changed function to use a brute force approach
- provides a work around to the parseLongPath
maneeshpm added a commit to maneeshpm/zim-tools that referenced this issue Jan 21, 2021
maneeshpm added a commit to maneeshpm/zim-tools that referenced this issue Jan 25, 2021
maneeshpm added a commit to maneeshpm/libzim that referenced this issue Jan 26, 2021
maneeshpm added a commit to maneeshpm/zim-tools that referenced this issue Jan 28, 2021
kelson42 added a commit that referenced this issue Feb 4, 2021
…s-a-filter

Fixes #171 zimdump list --ns=<N> doesn't work as a filter
@kelson42 kelson42 unpinned this issue Feb 4, 2021
@kelson42 kelson42 added this to the 2.2.0 milestone Sep 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants