Sendgrid does not encode non-ascii attachment filenames correctly #362

Open
hypirion opened this issue Jun 11, 2019 · 2 comments

@hypirion
commented Jun 11, 2019

When sending attachments with non-ASCII filenames, they are delivered to the
recipient without being correctly encoded. Consider the following sample
program:

package main

import (
	"encoding/base64"
	"fmt"
	// "mime"
	"os"

	"github.com/sendgrid/sendgrid-go"
	"github.com/sendgrid/sendgrid-go/helpers/mail"
)

func main() {
	from := mail.NewEmail("FromUser", "from@email.com")
	subject := "Broken attachment filename"
	myUser := "my@email.here"
	to := mail.NewEmail("ToUser", myUser)
	htmlContent := "<strong>check the filename</strong>"
	message := mail.NewV3MailInit(from, subject, to, mail.NewContent("text/html", htmlContent))
	client := sendgrid.NewSendClient(os.Getenv("SENDGRID_API_KEY"))

	att := mail.NewAttachment()
	att.SetContent(base64.StdEncoding.EncodeToString([]byte("test æøå abc")))
	filename := "ø-ÆØÅ.txt"
	// filename = mime.QEncoding.Encode("utf-8", filename)
	att.SetFilename(filename)
	att.SetType("text/plain")
	att.SetDisposition("attachment")

	message.AddAttachment(att)

	response, err := client.Send(message)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	} else {
		fmt.Println(response.StatusCode)
		fmt.Println(response.Body)
		fmt.Println(response.Headers)
	}
}

It successfully terminates and sends an email to the recipient. However, the
message encodes the attachment as follows:

--a8ed743f89bdbb97282bba7019318bdb5aa645e8322e6981a166e42c312b
Content-Disposition: attachment; filename="ø-ÆØÅ.txt"
Content-Transfer-Encoding: base64
Content-Type: text/plain; name="ø-ÆØÅ.txt"

dGVzdCDDpsO4w6UgYWJj
--a8ed743f89bdbb97282bba7019318bdb5aa645e8322e6981a166e42c312b--

As header information must be encoded in us-ascii according to the email spec,
this may cause wonky behaviour in clients, especially old ones. It seems likely
to be the root cause of sendgrid/sendgrid-csharp#667 and
sendgrid/sendgrid-nodejs#941.

I would expect sendgrid to automatically translate the attachment name in both
the content-disposition and the content-type to a valid encoding scheme.

The equivalent curl command is

#!/usr/bin/env bash

toUser=my@email.here
curl -X "POST" "https://api.sendgrid.com/v3/mail/send" -H "Authorization: Bearer ${SENDGRID_API_KEY}" -H "Content-Type: application/json" -d '{"from":{"name":"FromUser","email":"from@email.com"},"subject":"Broken attachment filename","personalizations":[{"to":[{"name":"ToUser","email":"'"${toUser}"'"}]}],"content":[{"type":"text/html","value":"\u003cstrong\u003echeck the filename\u003c/strong\u003e"}],"attachments":[{"content":"dGVzdCDDpsO4w6UgYWJj","type":"text/plain","filename":"ø-ÆØÅ.txt","disposition":"attachment"}]}'

and yields the exact same result.

I tried encoding the filename as an RFC 2047 Q-encoded word: the filename changes from
"ø-ÆØÅ.txt" to "=?utf-8?q?=C3=B8-=C3=86=C3=98=C3=85.txt?=" in the request
body, and the result is (perhaps not surprisingly) passed through literally:

--73df3983a5d7e1192126ed5e684669c4e80ec74ce0b211af42d238466dd2
Content-Disposition: attachment; filename="=?utf-8?q?=C3=B8-=C3=86=C3=98=C3=85.txt?="
Content-Transfer-Encoding: base64
Content-Type: text/plain; name="=?utf-8?q?=C3=B8-=C3=86=C3=98=C3=85.txt?="

dGVzdCDDpsO4w6UgYWJj
--73df3983a5d7e1192126ed5e684669c4e80ec74ce0b211af42d238466dd2--
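
For reference, a minimal sketch of how the Q-encoded filename above can be produced with Go's standard `mime` package (this is what the commented-out line in the sample program does). The helper name `encodeFilenameQ` is made up for illustration and is not part of sendgrid-go.

```go
package main

import (
	"fmt"
	"mime"
)

// encodeFilenameQ is a hypothetical helper: it returns filename as an
// RFC 2047 Q-encoded word when it contains non-ASCII bytes, and returns
// it unchanged when it is plain printable ASCII.
func encodeFilenameQ(filename string) string {
	return mime.QEncoding.Encode("utf-8", filename)
}

func main() {
	// Prints: =?utf-8?q?=C3=B8-=C3=86=C3=98=C3=85.txt?=
	fmt.Println(encodeFilenameQ("ø-ÆØÅ.txt"))
}
```

The result would then be passed to `att.SetFilename`. Whether SendGrid accepts or re-encodes such a value is exactly the open question in this issue.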

This seems to work for most email clients, but encoded words are not really
allowed inside quoted strings. To support as many clients as possible, other
mail providers I've seen tend to use the quoted encoded-word form in the
content-type and the extended parameter value form in the content-disposition, like so:

--73df3983a5d7e1192126ed5e684669c4e80ec74ce0b211af42d238466dd2
Content-Type: text/plain; name="=?utf-8?q?=C3=B8-=C3=86=C3=98=C3=85.txt?="
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename*=utf-8''%C3%B8-%C3%86%C3%98%C3%85.txt

dGVzdCDDpsO4w6UgYWJj
--73df3983a5d7e1192126ed5e684669c4e80ec74ce0b211af42d238466dd2--
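
A rough sketch of how the `filename*` parameter value form could be built in Go, assuming the filename is valid UTF-8 and following the `attr-char` set from RFC 5987 (every other byte is percent-encoded). The names `attrChars` and `encodeExtValue` are hypothetical; nothing here is something SendGrid or sendgrid-go is known to do.

```go
package main

import (
	"fmt"
	"strings"
)

// attrChars holds the non-alphanumeric bytes that RFC 5987 allows
// unescaped in an extended parameter value.
const attrChars = "!#$&+-.^_`|~"

// encodeExtValue is a hypothetical helper that returns s in the
// charset'language'value form with an empty language tag, e.g.
// utf-8''%C3%B8.txt. It assumes s is valid UTF-8.
func encodeExtValue(s string) string {
	var b strings.Builder
	b.WriteString("utf-8''")
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case 'a' <= c && c <= 'z', 'A' <= c && c <= 'Z', '0' <= c && c <= '9',
			strings.IndexByte(attrChars, c) >= 0:
			// Allowed as-is.
			b.WriteByte(c)
		default:
			// Everything else, including each byte of a multi-byte
			// UTF-8 sequence, becomes %XX with uppercase hex digits.
			fmt.Fprintf(&b, "%%%02X", c)
		}
	}
	return b.String()
}

func main() {
	fmt.Println("Content-Disposition: attachment; filename*=" + encodeExtValue("ø-ÆØÅ.txt"))
}
```

Since the JSON API only takes a single `filename` field, this form would have to be generated on the server side when the MIME message is assembled.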

What is the right path forward to fix this? I would assume this is an issue in SendGrid: application/json payloads are UTF-8 by default (and setting application/json; charset=UTF-8 seems to do nothing), so I would expect SendGrid to handle the header encoding for me. But if you think the onus is on the client libraries, I'd happily submit a patch, provided I know which encoding scheme the server expects for attachment filenames.

@thinkingserious

Contributor

commented Jun 11, 2019

Hello @hypirion,

The best way forward right now is to vote on the issues you referenced and open a ticket with support to gain more visibility.

Thank you for taking the time to write out this issue in such detail!

With Best Regards,

Elmer

@hypirion

Author

commented Jun 12, 2019

@thinkingserious: Thank you for the pointers, I've added a support request and will follow it up with them :)
