Fix parseLongPath() to handle namespaces #479
@veloman-yunkan The changes have been included. |
Did you run the unit tests? The parseLongPath unit test is failing. Please review and modify that unit test accordingly, to reflect the updated functionality of parseLongPath(). Try to add more corner cases if needed.
Good! We are about to converge. Only need to take consistency seriously.
Great!
The two comments below are just suggestions. Since it's a matter of personal taste, feel free to ignore them. But we also need to get back to your observation regarding Archive::findByPath(). Since parseLongPath() now doesn't reject inputs of the form A/ or /A/, Archive::findByPath() will work incorrectly in those cases. Please check that hypothesis in the test/find.cpp unit test and propose a fix for it, too.
@veloman-yunkan
Archive::EntryRange<EntryOrder::pathOrder> Archive::findByPath(std::string path) const
{
  /* Removing trailing slash */
  if(path.back() == '/') path.pop_back();
  entry_index_t begin_idx, end_idx;
  if (m_impl->hasNewNamespaceScheme()) {
    ...
  } else {
    ...
  }
  return Archive::EntryRange<EntryOrder::pathOrder>(m_impl, begin_idx.v, end_idx.v);
}
Works as expected and passes the build tests. |
Why is it failing? (And how?) It should not. The path of an item is just a string of bytes. There are no semantics associated with it (except the namespace part). If we search for a |
Approving. Please squash all your commits into one - we don't need all the history of this PR.
@mgautierfr makes a good point
Please also add corresponding test cases to |
@mgautierfr Sorry, I meant it fails for all namespace urls with trailing slash. This is an implementation issue in Suppose we pass
The list should only output entries starting with |
You are right. But then, your proposed fix ( |
@mgautierfr I understand now, thanks for pointing it out! Perhaps I should modify the condition to check if it's actually a namespace url and only then remove the trailing slash if required. |
16df835 to 9134973
9134973 to 01f08fa
@mgautierfr I think your review is again needed here. |
@maneeshpm Please rebase your branch on latest |
0172ab3 to c90fe22
@maneeshpm You have only one commit now in the PR; I am not sure whether you have squashed everything together or lost some commits. To do a rebase, you need to resync your fork (https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/syncing-a-fork) and then run in your feature branch:
|
@kelson42 Thanks for the info. I am never going to forget that 😅, sorry for the clutter. I have squashed the commits and this pr contains all the intended changes. |
There is another use case not handled: The increment of the last char is a bit more complex than it seems at first sight; maybe we need a specific method. It is a matter of style, but I prefer to have the test written on several lines:
|
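One possible reading of the remark about incrementing the last char: a plain `path.back()++` gives a wrong upper bound when the last byte is 0xFF, because the byte wraps around. The following is a minimal sketch only. `prefixUpperBound` is a hypothetical helper that does not exist in libzim, and it assumes that paths are ordered as unsigned bytes, which is how `std::string` compares.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper (not part of libzim): returns the smallest string that
// is greater than every string starting with `prefix`, or std::nullopt when
// no such bound exists (empty prefix, or a prefix made only of 0xFF bytes).
std::optional<std::string> prefixUpperBound(std::string prefix)
{
  // Trailing 0xFF bytes cannot be incremented without wrapping, so drop them
  // and increment the byte before them instead.
  while (!prefix.empty() && static_cast<unsigned char>(prefix.back()) == 0xFF) {
    prefix.pop_back();
  }
  if (prefix.empty()) {
    return std::nullopt;  // the range extends to the end of the index
  }
  // Increment through unsigned char to avoid signed overflow on char.
  const auto last = static_cast<unsigned char>(prefix.back());
  prefix.back() = static_cast<char>(last + 1);
  return prefix;
}
```

With this sketch, a caller would treat `std::nullopt` as "the range runs to the end of the index" rather than looking up a bound.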
@mgautierfr That's a valid point; parsing the path beforehand seems to be the correct approach. I agree, in the future having a more sophisticated method for this would be a nice idea.
Archive::EntryRange<EntryOrder::pathOrder> Archive::findByPath(std::string path) const
{
  entry_index_t begin_idx, end_idx;
  if (path.empty() || path == "/") {
    begin_idx = m_impl->getStartUserEntry();
    end_idx = m_impl->getEndUserEntry();
  } else if (m_impl->hasNewNamespaceScheme()) {
    begin_idx = m_impl->findx('C', path).second;
    path.back()++;
    end_idx = m_impl->findx('C', path).second;
  } else {
    char ns;
    std::tie(ns, path) = parseLongPath(path);
    begin_idx = m_impl->findx(ns, path).second;
    if (path.empty()){
      ns++;
      end_idx = m_impl->findx(ns, path).second;
    } else {
      path.back()++;
      end_idx = m_impl->findx(ns, path).second;
    }
  }
  return Archive::EntryRange<EntryOrder::pathOrder>(m_impl, begin_idx.v, end_idx.v);
}
I think this is pretty much exhaustive and covers all the unit tests. After this modification, all the invalid paths like the one you mentioned will be handled by parseLongPath by throwing an
will be changed from |
We should not change the API based on the nature of an internal method. For a long time, a (long) path was composed of a namespace and a short path. The new API removes this. |
@mgautierfr That makes sense. Since we are trying to hide the old namespace scheme, I think we should place |
@mgautierfr This fix will prevent any error from being thrown in parseLongPath and rather, return an empty range if an "unknown" URL is passed to the function. |
d41a7bc to e7a26f7
We should try/catch only the parsing of the url. |
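The suggestion to try/catch only the parsing can be sketched as below. This is an illustration under stated assumptions, not the code merged in the PR: `findByPathSketch`, `parseSketch`, `Entry` and `IndexRange` are hypothetical names, the index is modeled as a sorted `std::vector` instead of libzim's `m_impl`, the stand-in parser is deliberately minimal (it does not accept a leading slash), and the upper bound uses the same naive last-character increment as the snippet discussed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

using Entry = std::pair<char, std::string>;            // (namespace, short path)
using IndexRange = std::pair<std::size_t, std::size_t>; // [begin, end)

// Hypothetical, minimal stand-in for parseLongPath(): accepts "N" or "N/rest".
std::tuple<char, std::string> parseSketch(const std::string& p)
{
  if (p.empty() || p[0] == '/' || (p.size() > 1 && p[1] != '/')) {
    throw std::runtime_error("Cannot parse path");
  }
  return std::make_tuple(p[0], p.size() > 2 ? p.substr(2) : std::string());
}

// `sorted` must be ordered by Entry's operator<.
IndexRange findByPathSketch(const std::vector<Entry>& sorted, const std::string& longPath)
{
  if (longPath.empty() || longPath == "/") {
    return {0, sorted.size()};  // everything
  }

  char ns;
  std::string path;
  try {
    // Only the parsing is guarded: a malformed path yields an empty range.
    std::tie(ns, path) = parseSketch(longPath);
  } catch (const std::runtime_error&) {
    return {0, 0};
  }

  // The lookup sits outside the try block, so its own errors still propagate.
  const Entry lower{ns, path};
  Entry upper = lower;
  if (path.empty()) {
    ++upper.first;           // whole namespace: next namespace char
  } else {
    ++upper.second.back();   // naive increment, ignores the 0xFF corner case
  }
  const auto b = std::lower_bound(sorted.begin(), sorted.end(), lower);
  const auto e = std::lower_bound(sorted.begin(), sorted.end(), upper);
  return {static_cast<std::size_t>(std::distance(sorted.begin(), b)),
          static_cast<std::size_t>(std::distance(sorted.begin(), e))};
}
```

Keeping the `try` block this narrow means that only a parse failure is translated into an empty range; an exception thrown by the lookup itself would not be silently swallowed.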
e7a26f7 to 24330dd
@mgautierfr Understood, I've made the necessary changes. Is this code structure fine? |
We are good.
But it is a matter of preference, it works anyway. If @veloman-yunkan is ok, we can merge. |
24330dd to 7662fa4
Thanks for the suggestion @mgautierfr. This looks much neater and more concise. I will try to follow it in the future as well. |
@maneeshpm Your branch would benefit from being rebased on our git master, so we can merge it. |
Return empty range for unknown paths
7662fa4 to 6d8de41
@kelson42 rebased to master. Thanks! |
@maneeshpm It seems your latest PR breaks on Windows, see https://github.com/openzim/libzim/runs/1825653239?check_suite_focus=true |
@kelson42 Looks like |
@maneeshpm Yes please. You should have write permission now on this repo; please make your PR here (and not in your fork). |
Fixes #477
If a path of length 1 is passed and qualifies as a namespace character, return (ns, "")
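The behavior described in this thread (a lone namespace character returns `(ns, "")`, and inputs of the form `A/` or `/A/` are no longer rejected) can be illustrated with a small sketch. `parseLongPathSketch` and `throwsOnParse` are hypothetical names, and this is not the libzim implementation; in particular, it assumes that any character other than `/` qualifies as a namespace character, which the thread does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <tuple>

// Hypothetical sketch of a long-path parser: "N/short/path" -> ('N', "short/path").
std::tuple<char, std::string> parseLongPathSketch(const std::string& longPath)
{
  // Skip one leading '/' so that "/A/foo" and "A/foo" parse the same way.
  const std::size_t i = (!longPath.empty() && longPath[0] == '/') ? 1 : 0;

  // A namespace character must exist and must not itself be '/'.
  if (i >= longPath.size() || longPath[i] == '/') {
    throw std::runtime_error("Cannot parse path");
  }
  // If anything follows the namespace character, it must be the '/' separator.
  if (i + 1 < longPath.size() && longPath[i + 1] != '/') {
    throw std::runtime_error("Cannot parse path");
  }

  const char ns = longPath[i];
  // A bare namespace ("A", "A/", "/A", "/A/") yields an empty short path.
  std::string shortPath = (i + 2 < longPath.size()) ? longPath.substr(i + 2) : std::string();
  return std::make_tuple(ns, shortPath);
}

// Small helper for writing corner-case checks.
bool throwsOnParse(const std::string& p)
{
  try {
    parseLongPathSketch(p);
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}
```

Corner cases worth covering in a unit test for such a parser include `"A"`, `"A/"`, `"/A/"`, `"A/foo/bar"`, and rejected inputs such as `""`, `"/"`, `"AB"` and `"//A"`.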