Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Content-Disposition header in case of filenames with unicode characters or spaces #68

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

jbuchermn
Copy link

@jbuchermn jbuchermn commented Jul 13, 2022

Fixes #54

The implementation is not the most elegant but should be quite easy to understand. Feel free to suggest improvements.

Edit: After recent changes no longer related to #54 but exclusively about Unicode filenames.

Instead of

    Content-Disposition: "attachment; file=a b.pdf"

this outputs

    Content-Disposition: attachment; file="a b.pdf"

The quotation marks are introduced always; therefore this breaks
compatibility if filenames are quoted manually. Then again, the
quotation marks around "attachment; file=..." where introduced
before this change already if e.g. the filename contained umlaute
and didn't respect quoted filenames.
@snoyberg
Copy link
Owner

This change introduces a breaking API change, which I'd rather avoid. It also doesn't seem strictly necessary: from what I can tell, doing the encoding when generating the headers with a filename= value would work as well. But I'm not entirely certain this change is correct: is it actually valid to use the same encoding mechanism inside a header value like you're doing, as opposed to for the entire header value?

@jbuchermn
Copy link
Author

jbuchermn commented Jul 14, 2022

By breaking API change you mean users calling the method with quoted filenames I assume? I get that that might break stuff (although the parameter is named filename and a string with quotes around the actual filename is something else in my opinion) - I could add in a sanitize method removing possibly inserted quotes?

About what the various relevant RFCs state, I have to admit I don't know, what is valid and correct in theory. Possibly, only providing ASCII filenames in quotes (Content-Disposition: attachment; filename="ascii_name.xyz") is valid...

However, for the practical part: I have tested various cases and have not been able to successfully send a unicode filename to GMail. GMail does not seem to support escaping the whole header (in one case the filename was quoted before encoding, in the other case not - both don't work)

Content-Disposition: =?utf-8?Q?attachment=3B_filename=3Dunic=C3=B6de_filename=2Epdf?=
Content-Disposition: =?utf-8?Q?attachment=3B_filename=3D=22quoted_unic=C3=B6de_filename=2Epdf=22?=

but has no issues with:

Content-Disposition: attachment; filename="=?utf-8?Q?unic=C3=B6de_filename=2Epdf?="

which means with the current version of mime-mail there is no possibility of sending attachments with unicode filenames to a gmail client (unless I'm missing something else...)

Another way could be provide a way to use filename* instead of filename, but that's probably a bigger change.

@jbuchermn
Copy link
Author

I did some more tests without adding explicit quotes. With that there should not be any behaviour change for ASCII filenames.

Using the following filename parameters:

"ascii filename.txt"
"\"quoted ascii filename.txt\""
"unicöde filename.txt"
"\"quoted unicöde filename.txt\""

I sent mails to three different accounts with and without my PR:

  • On GMail and Outlook, using the proposed change I got all 4 attachments, without it only the ascii ones.
  • On ionos, the change did not matter - I only got the quoted attachments.

@snoyberg
Copy link
Owner

By breaking API change you mean users calling the method with quoted filenames I assume?

No, I mean the fact that the public API for this package includes the Header type and its definition is changing. This can break existing user code, which we'd like to avoid.

I still think it should be possible to do the escaping without changing that API, specifically by changing how the original filename= values are generated. And it would be great to be able to document the relevant standard that specifies that this escaping can happen inside the filename value (though, admittedly, I've been pretty bad about that up until now).

@jbuchermn
Copy link
Author

Okay, I get it - changing the Header type indeed does not seem to be justified by this corner case.

What I can tell (or guess), skipping over (some of the) relevant RFCs and more pragmatically testing it:

  • The filename* parameter has been introduced explicitly to handle different encodings. It's got a quite weird syntax and I don't know how well it is supported but should be the theoretical way to go.
  • I don't know if encoding the whole header attachment; filename=unicöde.txt is theoretically valid, but in practice it does not work for at least some widely used mail clients.
  • I don't think encoding just the filename= parameter (like I'm doing) is conforming to the standard, but it works well in practice.

Considering these points you are right to oppose the change. Although it would be nice to have a way of pragmatically allowing for unicode filenames. Maybe just adding encodeIfNeeded to the public API so that users can encode filenames before passing them to simpleMailInMemory?

@snoyberg
Copy link
Owner

I don't mind including the encodeIfNeeded export, but I really do think there's a possibility to do this encode in the current codepaths. Regardless, if you want this included, please:

  • Add a minor version bump
  • Add a changelog entry
  • Add a @since comment on the newly exported identifier

@jbuchermn
Copy link
Author

jbuchermn commented Jul 20, 2022

It's possible to do it by changing a detail in partToPair (my original PR minus the Header API change). However I'm no longer that convinced that this is valid and standard behaviour, although it works in practice.

Do you have in mind another way of enabling this encode?

@snoyberg
Copy link
Owner

No I don’t have anything else in mind

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Space in filename causes error when using mime-mail-ses
2 participants