Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error/Warning for unescaped filenames in manifest #819

Closed
kmaginot opened this issue Dec 5, 2017 · 9 comments
Closed

Error/Warning for unescaped filenames in manifest #819

kmaginot opened this issue Dec 5, 2017 · 9 comments
Assignees
Labels
status: ready for implem The issue is ready to be implemented type: feature The issue describes a new feature request

Comments

@kmaginot
Copy link

kmaginot commented Dec 5, 2017

Hi,
I've noticed that Epubcheck emits the warning "Filename contains spaces, therefore URI escaping is necessary. Consider removing spaces from filename." (PKG_010) for filenames containing spaces, but does not produce an error if these filenames are not URI-escaped when they are referenced in the manifest.

Apple Transporter will reject epub files with unescaped spaces in paths in the manifest (but does not mind spaces in the actual filenames as long as they are escaped in the manifest), so it would be great if you could consider adding this as a validation error.

Please see the attached file for an example (Github does not allow epubs, therefore renamed to .zip)
alice.zip

Many thanks!

@tofi86 tofi86 added status: needs review Needs to be reviewed by a team member before further processing type: bug The issue describes a bug labels Dec 5, 2017
@kmaginot kmaginot changed the title Error/Warning für unescaped filenames in manifest Error/Warning for unescaped filenames in manifest Dec 19, 2017
@vlfors
Copy link

vlfors commented Feb 20, 2018

Hi all! I’m doing this the task. After the fix some tests have begun to fall. Most of these tasks I know how fix, but test “interesting_paths_epub2” no have check parameters. It needs change package. It is permissible?

@rdeltour
Copy link
Member

I'm finally looking into this (sorry for the late answer!). Unfortunately I couldn't reproduce: with alice sample EPUB, EPUBCheck does report the spaces as PKG_010:

alice.epub: Errors: 0; Warnings: 2
WARNING: alice.epub:OEBPS/chapter 1.html: Filename contains spaces, therefore URI escaping is necessary. Consider removing spaces from filename.
WARNING: alice.epub:OEBPS/chapter 2.html: Filename contains spaces, therefore URI escaping is necessary. Consider removing spaces from filename.

I'm closing as invalid. Feel free to reopen if the issue persists on your side or if you think I misinterpreted it!

@rdeltour rdeltour added type: not an issue The issue is rejected (not an actual issue or not relevant) and removed status: needs review Needs to be reviewed by a team member before further processing type: bug The issue describes a bug labels Jan 18, 2019
@rdeltour rdeltour self-assigned this Jan 18, 2019
@mattgarrish
Copy link
Member

Isn't the problem that epubcheck warns that the filenames should be URI escaped, but then doesn't report that the names are used without URI escaping in the package document?

The warning doesn't go away if you percent encode the spaces in the manifest entries, so I'm assuming the package document is not where the warning is coming from (if it is, it's confusing that it references the files themselves with line and offset of -1,-1).

@rdeltour
Copy link
Member

Oh right I see. I assumed the warning came from the OPF, but yes you're right it doesn't go away when the URLs are properly encoded there.

Isn't it a bit overzealous to report these as warnings however?

In any case, yes a check should be added to report the non-escaped URI in OPF: I assume that should be an error, correct?

@rdeltour rdeltour reopened this Jan 19, 2019
@rdeltour rdeltour removed the type: not an issue The issue is rejected (not an actual issue or not relevant) label Jan 19, 2019
@rdeltour
Copy link
Member

Isn't it a bit overzealous to report these as warnings however?

I mean, as far as I understand this only comes from a Note, saying:

Authors (…) might find it is best to restrict their File Names to the [US-ASCII] range.

@mattgarrish
Copy link
Member

mattgarrish commented Jan 19, 2019

Isn't it a bit overzealous to report these as warnings however?

There was a long discussion about banning spaces in file names during 3.1 because they tend to cause problems, but we decided not to. That sprang up because we had some examples in OCF where the container.xml references had spaces that weren't percent encoded. Did this warning maybe spring out of that discussion?

The lack of URI encoding is definitely a problem that needs reporting, but if we implement that then I'm not sure it matters that we also warn about the file names themselves - at least not until OCF makes some statement about their status.

@rdeltour rdeltour added the type: feature The issue describes a new feature request label Jan 19, 2019
@rdeltour
Copy link
Member

OK, so I suggest:

  • keep the WARNING when a file name contains spaces (PKG_010), and change the message to "File name contains spaces, which is discouraged.". The message’s location is the resource which file name is in question (and column/line -1).
  • add a new check for the OPF item elements, and report an ERROR when the URI is not properly escaped. The message’s location is the OPF item element.

@mattgarrish OK for you?

@rdeltour rdeltour added the status: in discussion The issue is being discussed by the development team label Jan 19, 2019
@mattgarrish
Copy link
Member

Sure, that works for now. Let me reopen that issue and see if we can get better guidance within the spec.

@rdeltour rdeltour added status: ready for implem The issue is ready to be implemented and removed status: in discussion The issue is being discussed by the development team labels Jan 19, 2019
bmaupin added a commit to bmaupin/go-epub that referenced this issue Jun 23, 2021
This allows us to test the actual contents of the manifest in the package file, and allows us to move the filename with space test out of the EPUB validation test, which is desirable because it throws a validation warning even when the filename is properly escaped (w3c/epubcheck#819)

Also move fixXMLId closer to where it's used
@rdeltour
Copy link
Member

rdeltour commented Dec 8, 2022

The URI parsing aspect is covered since #1396. The warning on non-recommended character will be dealt in #1384. Closing.

@rdeltour rdeltour closed this as completed Dec 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: ready for implem The issue is ready to be implemented type: feature The issue describes a new feature request
Projects
None yet
Development

No branches or pull requests

5 participants