proposal: mime: handling duplicate media parameters #28618

Closed
neganovalexey opened this issue Nov 6, 2018 · 6 comments

@neganovalexey
Contributor

commented Nov 6, 2018

It is possible to receive an email that does not follow the specification, i.e., an email with a Content-Type header like

text/plain; charset=UTF-8; charset=UTF-8; format=flowed

The Go standard library does not allow duplicate media parameters, so it is impossible to parse such a header. But the two instances of the 'charset' parameter have the same value here, so the charset can be determined unambiguously.
I suggest making the mime.ParseMediaType function more tolerant of such errors. It should not fail when the duplicate parameters it detects have the same value. So as not to change the behavior of any existing Go program, a special error value could be returned along with the parsed media type and parameters.
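
To make the suggestion concrete, here is a minimal sketch of the behavior I have in mind, written as a wrapper around mime.ParseMediaType rather than as a change to the standard library. The names ParseMediaTypeLenient, ErrDuplicateParam and splitParams are hypothetical and exist only for this illustration. The sketch compares the raw parameter text, so charset=UTF-8 and charset="UTF-8" would be treated as conflicting; a real implementation would probably compare decoded values.

```go
package main

import (
	"errors"
	"fmt"
	"mime"
	"strings"
)

// ErrDuplicateParam is a hypothetical sentinel error: the header contained a
// repeated parameter, but every repetition carried the same value.
var ErrDuplicateParam = errors.New("duplicate media parameter with identical value")

// splitParams splits a header value on ';' while ignoring semicolons that
// appear inside quoted strings.
func splitParams(s string) []string {
	var parts []string
	var b strings.Builder
	inQuote, escaped := false, false
	for _, r := range s {
		switch {
		case escaped:
			escaped = false
		case inQuote && r == '\\':
			escaped = true
		case r == '"':
			inQuote = !inQuote
		case r == ';' && !inQuote:
			parts = append(parts, b.String())
			b.Reset()
			continue
		}
		b.WriteRune(r)
	}
	return append(parts, b.String())
}

// ParseMediaTypeLenient (hypothetical) drops repeated parameters whose raw
// value is identical, rejects repeated parameters whose values differ, and
// hands the cleaned header to mime.ParseMediaType.
func ParseMediaTypeLenient(v string) (string, map[string]string, error) {
	parts := splitParams(v)
	seen := make(map[string]string) // lower-cased name -> raw value text
	kept := []string{parts[0]}
	dup := false
	for _, p := range parts[1:] {
		name, value, ok := strings.Cut(p, "=")
		key := strings.ToLower(strings.TrimSpace(name))
		value = strings.TrimSpace(value)
		if prev, exists := seen[key]; ok && exists {
			if prev != value {
				return "", nil, fmt.Errorf("conflicting values for parameter %q", key)
			}
			dup = true
			continue
		}
		if ok {
			seen[key] = value
		}
		kept = append(kept, p)
	}
	mediaType, params, err := mime.ParseMediaType(strings.Join(kept, ";"))
	if err == nil && dup {
		// Results are valid; the error only reports that the input was malformed.
		err = ErrDuplicateParam
	}
	return mediaType, params, err
}

func main() {
	mt, params, err := ParseMediaTypeLenient("text/plain; charset=UTF-8; charset=UTF-8; format=flowed")
	fmt.Printf("mediaType = %q, params = %v, err = %v\n", mt, params, err)
}
```

A caller that wants the tolerant behavior checks for ErrDuplicateParam and uses the returned values anyway; a caller that treats every non-nil error as fatal keeps the strict behavior.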

@gopherbot gopherbot added this to the Proposal milestone Nov 6, 2018

@gopherbot gopherbot added the Proposal label Nov 6, 2018

@ianlancetaylor

Contributor

commented Nov 6, 2018

It would help if you could point to packages that generate this invalid information, and if you could point to how other MIME parsing code, in other languages, handles this case. That is, we want to accept data that is out there in the wild, but we want to avoid being unnecessarily loose. Thanks.

@neganovalexey

Contributor Author

commented Nov 7, 2018

@ianlancetaylor I have found the following examples in Python, Java, and C#. All of them correctly return the charset as "UTF-8"; none throws an error.

Example in Python 3:

>>> import cgi
>>> mimetype, options = cgi.parse_header("text/plain; charset=UTF-8; charset=UTF-8; format=flowed")
>>> print(mimetype)
text/plain
>>> print(options)
{'charset': 'UTF-8', 'format': 'flowed'}

Example in Java:

import org.apache.http.entity.ContentType;
import java.nio.charset.Charset;

class TestContentTypeParsing {
    public static void main(String[] args) {
        ContentType contentType = ContentType.parse("text/plain; charset=UTF-8; charset=UTF-8; format=flowed");
        Charset charset = contentType.getCharset();
        System.out.println(charset);
    }
}

prints "UTF-8"

Example in C#:

using System;
using System.Net.Mime;

public class TestContentTypeParsing
{
    public static void Main()
    {
        var contentType = new ContentType("text/plain; charset=UTF-8; charset=UTF-8; format=flowed");
        Console.WriteLine("{0} ({1})", contentType.MediaType, contentType.CharSet);
    }
}

prints "text/plain (UTF-8)"

@neganovalexey

Contributor Author

commented Nov 7, 2018

The example in Go:

package main

import (
	"fmt"
	"mime"
)

func main() {
	mediaType, params, err := mime.ParseMediaType("text/plain; charset=UTF-8; charset=UTF-8; format=flowed")
	fmt.Printf("mediaType = '%s', params = %+v, err = %v\n", mediaType, params, err)
}

prints "mediaType = '', params = map[], err = mime: duplicate parameter name"
https://play.golang.org/p/kEMGv60ElWz

@ianlancetaylor

Contributor

commented Dec 5, 2018

Thanks, that wasn't quite what I was asking. Can you describe which programs generate the duplicate information? I'm trying to understand why we should make this change. If no program generates duplicates, then it seems to me that stricter is better, as it avoids any confusion about which parameter applies.

@rsc rsc added the WaitingForInfo label Dec 12, 2018

@rsc

Contributor

commented Dec 12, 2018

@neganovalexey you wrote "It is possible to receive an email ...". The important question is "is it likely?" Do you have instances of this happening in real use cases? If not, then we are unlikely to add what might end up being some kind of security hole (by picking one or the other differently from other software) for purely speculative motivations.

@gopherbot


commented Jan 12, 2019

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@gopherbot gopherbot closed this Jan 12, 2019
