mime.ParseMediaType can't handle non-ASCII filename #1119

Closed
gopherbot opened this issue Sep 17, 2010 · 6 comments

Comments

@gopherbot
Contributor

by maddogfyg:

What steps will reproduce the problem?
1. Upload a file from <input type="file" name="someinput" ......
2. If the file name includes non-ASCII characters (such as "付云阁.jpg", my
Chinese name), the post data looks like:

Content-Disposition: form-data; name="someinput"; filename="付云阁.jpg"

But mime.ParseMediaType just returns "", nil.
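For anyone reproducing this today, a minimal sketch follows. It assumes a current Go release, where mime.ParseMediaType returns three values (the 2010 version described here returned two), and it passes the header value as a UTF-8 Go string. The helper name parseDisposition is made up for illustration.

```go
package main

import (
	"fmt"
	"mime"
)

// parseDisposition is a hypothetical helper: it parses a Content-Disposition
// header value and returns the disposition type and the filename parameter.
func parseDisposition(v string) (string, string, error) {
	mediatype, params, err := mime.ParseMediaType(v)
	if err != nil {
		return "", "", err
	}
	return mediatype, params["filename"], nil
}

func main() {
	// Header value from the report, with the non-ASCII filename in a quoted string.
	v := `form-data; name="someinput"; filename="付云阁.jpg"`
	mediatype, filename, err := parseDisposition(v)
	fmt.Printf("%q %q %v\n", mediatype, filename, err)
}
```

An empty media type or a non-nil error here would indicate the behaviour described in this report.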


In the mime package, two funcs cause this problem:

func consumeValue(v string) (value, rest string) {
    if !strings.HasPrefix(v, `"`) {
        return consumeToken(v)
    }

    // parse a quoted-string
    rest = v[1:] // consume the leading quote
    buffer := new(bytes.Buffer)
    var idx, rune int
    var nextIsLiteral bool
    for idx, rune = range rest {
        switch {
        case nextIsLiteral:
            if rune >= 0x80 {
                return "", v
            }
            buffer.WriteRune(rune)
            nextIsLiteral = false
        case rune == '"':
            return buffer.String(), rest[idx+1:]
        case IsQText(rune):
            buffer.WriteRune(rune)
        case rune == '\\':
            nextIsLiteral = true
        default:
            return "", v
        }
    }
    return "", v
}

// IsQText returns true if rune is in 'qtext' as defined by RFC 822.
func IsQText(rune int) bool {
    // CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)
    // qtext       =  <any CHAR excepting <">,     ; => may be folded
    //                "\" & CR, and including
    //                linear-white-space>
    switch rune {
    case '"', '\\', '\r':
        return false
    }
    //return rune < 0x80
    return true
}

RFC 822 was published in 1982 and was designed for ASCII-based systems. Go
handles UTF-8 very well, so the 0x80 limit is unnecessary.
The last statement in func IsQText, "return rune < 0x80", should therefore be
"return true", as in the listing above.
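In current Go syntax, where rune is a built-in type rather than an int variable name, the two behaviours being compared can be sketched as below. Both function names are hypothetical, and neither is the mime package's actual code.

```go
package main

import "fmt"

// isQTextStrict follows the RFC 822 definition literally: any ASCII
// character except '"', '\\' and '\r'.
func isQTextStrict(r rune) bool {
	switch r {
	case '"', '\\', '\r':
		return false
	}
	return r < 0x80
}

// isQTextRelaxed is the variant proposed in this report: the same three
// exclusions, but no upper bound on the code point.
func isQTextRelaxed(r rune) bool {
	switch r {
	case '"', '\\', '\r':
		return false
	}
	return true
}

func main() {
	for _, r := range `付a"` {
		fmt.Printf("%q strict=%v relaxed=%v\n", r, isQTextStrict(r), isQTextRelaxed(r))
	}
}
```

The two predicates differ only for code points at or above 0x80, which is exactly the range a UTF-8 filename falls into.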
@adg
Contributor

adg commented Sep 21, 2010

Comment 1:

Owner changed to steph...@golang.org.

@gopherbot
Contributor Author

Comment 2 by stephenm@golang.org:

RFC 2616 (the HTTP/1.1 spec) states that quoted strings in header values can contain
only ISO 8859-1 characters, and that non-ISO 8859-1 text needs to be encoded
using RFC 2047. This is repeated in RFC 5987.
Having said that, support for UTF-8 encoding using RFC 5987 should be added. Also, if
there's a way to correctly handle ISO 8859-1 characters in the filename, it
would make sense to implement that as well.
For exhaustive background reading, see http://greenbytes.de/tech/tc2231/
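To illustrate the RFC 5987 form mentioned here, the sketch below parses a filename*= parameter whose value is "付云阁.jpg" percent-encoded as UTF-8. It assumes a current Go release, in which mime.ParseMediaType decodes such extended parameters; the helper name extendedFilename and the "attachment" disposition are illustrative choices, not taken from the report.

```go
package main

import (
	"fmt"
	"mime"
)

// extendedFilename is a hypothetical helper that returns the decoded
// filename parameter of a Content-Disposition header value.
func extendedFilename(v string) (string, error) {
	_, params, err := mime.ParseMediaType(v)
	if err != nil {
		return "", err
	}
	return params["filename"], nil
}

func main() {
	// charset'language'percent-encoded-bytes, with the language left empty.
	v := `attachment; filename*=UTF-8''%E4%BB%98%E4%BA%91%E9%98%81.jpg`
	name, err := extendedFilename(v)
	fmt.Printf("%q %v\n", name, err)
}
```

The header itself stays pure ASCII in this form, so it does not depend on how the parser treats bytes above 0x7F inside a quoted string.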

@gopherbot
Copy link
Contributor Author

Comment 3 by maddogfyg:

Hi,
Thanks for your reply.
So far I have had no problems with non-Latin-1 characters since changing the
source code. I'll keep checking and let you know if anything goes wrong.
Thanks.
Yunge

@rsc
Contributor

rsc commented Oct 11, 2010

Comment 4:

Status changed to Accepted.

@bradfitz
Contributor

Comment 5:

Owner changed to @bradfitz.

Status changed to Started.

@bradfitz
Contributor

Comment 6:

This issue was closed by revision 98176b7.

Status changed to Fixed.

@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.