Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supporting entries ending in / #793

Closed
rgaudin opened this issue Jun 7, 2024 · 7 comments · Fixed by #797
Closed

Supporting entries ending in / #793

rgaudin opened this issue Jun 7, 2024 · 7 comments · Fixed by #797
Assignees
Milestone

Comments

@rgaudin
Copy link
Member

rgaudin commented Jun 7, 2024

There is nothing in the ZIM spec that prevents using entries ending in / and thus we ended up in #779 with a ZIM (quartierschinois) that has a /-ending main page.

Nothing prevents URLs to end in slash neither.

One popular convention though is that the path is composed of segments separated by / and the last part is the document. As such, a /-ending path can be considered equivalent to the same path without that trailing slash.
This convention is still implemented in some web servers, browsers, and libraries.

A swift URL has a path property that we use all over the code base to extract the ZIM entry path.

In the swift version we use, URL.path removes that trailing slash, breaking lookup of ZIM entries with /-ending path.

In later version of swift (macOS 13+), Apple added an URL.path() that keeps the trailing slash and works as intended with those entries.

What should we do?

  1. Consider /-ending ZIM entries incorrect and fix ZIM files to not use them?
  2. Fix by using URL.path() and bump our support from macOS12+ to macOS13+ and iOS15+ to iOS16+ ?
  3. Fix for all versions by normalizing path ourselves (URL.hasDirectoryPath tells us if URL ends in /) ?

I'm in favor of the latter, until we drop those versions support at the end of the year.

@kelson42
Copy link
Contributor

kelson42 commented Jun 7, 2024

Fix 1 if possible in good conditions, not because this is wrong, but because this can be a source of trouble without bringing clear advantage.

AND

Fix 2 because Kiwix should work properly anyway and macOS15 will be released within a few month (so we can just bump the version of Swift... but maybe better in 3.5.0.

@kelson42 kelson42 added this to the 3.4.0 milestone Jun 7, 2024
@benoit74
Copy link

benoit74 commented Jun 7, 2024

Fix 1 is not easy in good conditions in warc2zim. It would means removing the trailing / in all entries which have one (quite possible) but it means we have a risk of conflict, probably rare but mostly impossible to solve. It is perfectly legit for a website to have two different resources at two different URLs differing only by a trailing / (e.g. www.example.com/myresource and www.example.com/myresource/).

Since ZIM entries are just a "string", mostly mapped to a URL, and URLs ending with a slash are legit, I don't get why we would impose more constraints that necessary on ZIM entries.

@benoit74
Copy link

benoit74 commented Jun 7, 2024

Plus removing a trailing slash in the path of a ZIM entry corresponding to an HTML document means that all relative URLs in the document have to be adapted because we are not anymore in a "folder" but an a "file".

@kelson42
Copy link
Contributor

kelson42 commented Jun 7, 2024

@BPerlakiH The we have the point 2 left!

@benoit74
Copy link

benoit74 commented Jun 7, 2024

Why not point 3 for 3.4, and point 2 for 3.5? (I mean, the code of point 3 seems a bit custom but rather plain simple to write)

@rgaudin
Copy link
Member Author

rgaudin commented Jun 7, 2024

Why not point 3 for 3.4, and point 2 for 3.5? (I mean, the code of point 3 seems a bit custom but rather plain simple to write)

😇

@kelson42
Copy link
Contributor

kelson42 commented Jun 7, 2024

Why not point 3 for 3.4, and point 2 for 3.5? (I mean, the code of point 3 seems a bit custom but rather plain simple to write)

😇

Sounds a bit like a hack to me, which will be later rolled back. this is why I prefer #2 straight.

@BPerlakiH BPerlakiH linked a pull request Jun 8, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants