net/mail: ParseAddress doesn't handle ISO-8859-15, windows-1252 etc charsets #7079

Closed
gopherbot opened this issue Jan 8, 2014 · 10 comments

@gopherbot

by Famcool:

Before filing a bug, please check whether it has been fixed since the
latest release. Search the issue tracker and check that you're running the
latest version of Go:

What steps will reproduce the problem?
If possible, include a link to a program on play.golang.org.
http://play.golang.org/p/rH-eI5y1f8

What is the expected output?
The address decoded.

What do you see instead?
Error: mail: missing word in phrase

Which compiler are you using (5g, 6g, 8g, gccgo)?
6g

Which operating system are you using?
linux

Which version are you using?  (run 'go version')
1.2

Please provide any additional information below.
@rsc
Contributor

rsc commented Mar 3, 2014

Comment 1:

Labels changed: added release-none.

Status changed to Accepted.

@gopherbot
Author

Comment 2:

CL https://golang.org/cl/101330049 mentions this issue.

@griesemer
Contributor

Comment 3:

Labels changed: added repo-main.

@gopherbot
Author

Comment 4:

CL https://golang.org/cl/132680044 mentions this issue.

@bradfitz changed the title from "net/mail: enable ISO-8859-15 charset support" to "net/mail: ParseAddress doesn't handle ISO-8859-15, windows-1252 etc charsets" on Mar 30, 2015
@rsc added this to the Unplanned milestone on Apr 10, 2015
@gopherbot
Author

CL https://golang.org/cl/7890 mentions this issue.

@mikioh modified the milestones: Unplanned, Go1.5 on May 15, 2015
@alexcesaro
Contributor

This issue is not really fixed since decoding an address before parsing it does not always work:

to := "=?ISO-8859-15?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>"

// Parsing the address still fails because of the unhandled charset.
addr, err := mail.ParseAddress(to)
fmt.Println(addr) // <nil>
fmt.Println(err)  // mail: missing word in phrase: charset not supported: "iso-8859-15"

// Decoding the address before parsing it also fails because mail.ParseAddress then
// fails parsing it.
var dec mime.WordDecoder
dec.CharsetReader = func(charset string, input io.Reader) (io.Reader, error) {
    return input, nil
}

d, err := dec.DecodeHeader(to)
fmt.Println(err) // <nil>

addr, err = mail.ParseAddress(d)
fmt.Println(addr) // <nil>
fmt.Println(err)  // mail: no angle-addr

Possible solutions:

  • Stop returning unhandled charset errors in mail.ParseAddress and return the undecoded name so the user can decode it with the new mime functions.
  • Add a new function to net/mail like ParseAddress but that also needs a CharsetReader as an argument.
  • Add a new function to net/mail to register a global CharsetReader.

/cc @bradfitz
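
To illustrate why the second attempt in the snippet above goes wrong, here is a minimal sketch that repeats the pass-through CharsetReader and checks the decoded string. The helper name decodePassThrough is hypothetical. Because the reader hands back the raw ISO-8859-15 bytes unchanged, the 0xF8 byte ends up in the result as-is and the string is not valid UTF-8, which is what the address parser then trips over.

```go
package main

import (
	"fmt"
	"io"
	"mime"
	"unicode/utf8"
)

// decodePassThrough (hypothetical helper) decodes a header with a
// CharsetReader that returns its input unchanged, as in the snippet above.
// It reports the decoded string and whether that string is valid UTF-8.
func decodePassThrough(header string) (decoded string, validUTF8 bool, err error) {
	dec := mime.WordDecoder{
		CharsetReader: func(charset string, input io.Reader) (io.Reader, error) {
			// No conversion: the raw charset bytes flow straight through.
			return input, nil
		},
	}
	decoded, err = dec.DecodeHeader(header)
	if err != nil {
		return "", false, err
	}
	return decoded, utf8.ValidString(decoded), nil
}

func main() {
	to := "=?ISO-8859-15?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>"
	d, ok, err := decodePassThrough(to)
	fmt.Printf("decoded=%q validUTF8=%v err=%v\n", d, ok, err)
}
```

A CharsetReader that actually converts the bytes to UTF-8 avoids the invalid string, but whether the parser then accepts a non-ASCII display name depends on the Go version.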

@bradfitz
Contributor

I'm not sure I care enough for Go 1.5.

Perhaps the minimum thing we can do is return an exported error type from mail.ParseAddress for an unknown charset (instead of using fmt.Errorf) so callers know to use mime.WordDecoder with a CharsetReader for that specific charset.

@alexcesaro
Contributor

That does not work, as mail.ParseAddress then returns mail: no angle-addr because of the special characters (see the end of the snippet above).

@bradfitz
Contributor

I meant from the first ParseAddress call.

I was going to say let's make the mail.ParseAddress func act identically as it did in Go 1.4 for unknown charsets, but it just returned an error then too, so it's already identical.

I also considered just saying that we return the unable-to-decode part as the Name in e.g. &mail.Address{Name: "=?ISO-8859-15?Q?Keld_J=F8rn_Simonsen?=", Address: "keld@dkuug.dk"} and document on mail.Address.Name that if it contains "=?" then the word-encoded part(s) couldn't be decoded, but that's ambiguous: an encoded-word might decode successfully to something that itself contains "=?".
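
The ambiguity can be sketched with a hand-built encoded-word whose payload is itself the text "=?hello?=". The input string and the helper name decodeWord are made up for illustration; only mime.WordDecoder.Decode is the real API here.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// decodeWord (hypothetical helper) decodes a single RFC 2047 encoded-word
// with the default decoder, which handles utf-8, iso-8859-1 and us-ascii.
func decodeWord(word string) (string, error) {
	return new(mime.WordDecoder).Decode(word)
}

func main() {
	// Q-encoded payload: =3D is '=', =3F is '?'.
	word := "=?utf-8?q?=3D=3Fhello=3F=3D?="
	s, err := decodeWord(word)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// The decode succeeded, yet the result still contains "=?", so a caller
	// could not use that marker to tell "undecoded" from "decoded".
	fmt.Printf("%q contains \"=?\": %v\n", s, strings.Contains(s, "=?"))
}
```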

Yeah, maybe we just add a new func like mail.ParseAddressWithDecoder and take a *mime.WordDecoder. But I feel that that will be both too wordy and too limiting.

How about this:

package mail // in net/mail

type AddressParser struct {
    // WordDecoder optionally specifies a decoder for RFC 2047 encoded-words.
    WordDecoder *mime.WordDecoder

    // any future options here (inevitable) like sloppiness modes
}

func (p *AddressParser) Parse(address string) (*Address, error)
func (p *AddressParser) ParseList(list string) ([]*Address, error)

I know we already have a private type addrParser []byte in that package, which might be confusing. I'm not proposing to export that. Maybe we rename it, though. It would have to be changed to a different representation in any case, to have a pointer to the *AddressParser, or at least the WordDecoder.

And the addrParser (lowercase) would still be made per-call (and consumed). I don't know why it's backed by a []byte instead of the input string. That seems like a mistake.
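
A minimal sketch of how a caller might use a type of this shape, assuming it is available in net/mail as proposed (current Go releases do ship mail.AddressParser with a WordDecoder field). The helpers latin9Reader and parseLatin9 and the override table are illustrative assumptions, not part of the standard library; a real program would more likely use a charset package for the conversion.

```go
package main

import (
	"fmt"
	"io"
	"mime"
	"net/mail"
	"strings"
)

// latin9Overrides lists the eight code points where ISO-8859-15 differs
// from ISO-8859-1; every other byte maps to the rune of the same value.
var latin9Overrides = map[byte]rune{
	0xA4: 0x20AC, 0xA6: 0x0160, 0xA8: 0x0161, 0xB4: 0x017D,
	0xB8: 0x017E, 0xBC: 0x0152, 0xBD: 0x0153, 0xBE: 0x0178,
}

// latin9Reader (hypothetical helper) converts ISO-8859-15 input to UTF-8.
func latin9Reader(input io.Reader) (io.Reader, error) {
	raw, err := io.ReadAll(input)
	if err != nil {
		return nil, err
	}
	var sb strings.Builder
	for _, b := range raw {
		if r, ok := latin9Overrides[b]; ok {
			sb.WriteRune(r)
		} else {
			sb.WriteRune(rune(b))
		}
	}
	return strings.NewReader(sb.String()), nil
}

// parseLatin9 (hypothetical helper) parses one address, decoding
// ISO-8859-15 encoded-words through the parser's WordDecoder.
func parseLatin9(address string) (*mail.Address, error) {
	p := mail.AddressParser{
		WordDecoder: &mime.WordDecoder{
			// The mime package passes the charset name in lower case.
			CharsetReader: func(charset string, input io.Reader) (io.Reader, error) {
				if charset != "iso-8859-15" {
					return nil, fmt.Errorf("unsupported charset %q", charset)
				}
				return latin9Reader(input)
			},
		},
	}
	return p.Parse(address)
}

func main() {
	addr, err := parseLatin9("=?ISO-8859-15?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("name=%q address=%q\n", addr.Name, addr.Address)
}
```

Because the decoder runs inside the parser, the encoded-word is decoded only after the address has been tokenized, so the decode-then-parse problem described earlier in the thread does not arise.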

@bradfitz reopened this on May 21, 2015
@gopherbot
Author

CL https://golang.org/cl/10392 mentions this issue.
