Remove “Content-Type: application/x-www-form-urlencoded; charset” advice #69762

Closed
vadmium opened this issue Nov 7, 2015 · 6 comments
Labels
docs Documentation in the Doc dir

Comments

@vadmium
Member

vadmium commented Nov 7, 2015

BPO 25576
Nosy @orsenthil, @bitdancer, @vadmium
Files
  • urlencoded-charset.patch
  • urlencoded-charset.2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-11-24.23:38:26.470>
    created_at = <Date 2015-11-07.08:43:40.644>
    labels = ['docs']
    title = 'Remove “Content-Type: application/x-www-form-urlencoded; charset” advice'
    updated_at = <Date 2015-11-24.23:38:26.469>
    user = 'https://github.com/vadmium'

    bugs.python.org fields:

    activity = <Date 2015-11-24.23:38:26.469>
    actor = 'martin.panter'
    assignee = 'docs@python'
    closed = True
    closed_date = <Date 2015-11-24.23:38:26.470>
    closer = 'martin.panter'
    components = ['Documentation']
    creation = <Date 2015-11-07.08:43:40.644>
    creator = 'martin.panter'
    dependencies = []
    files = ['40970', '40983']
    hgrepos = []
    issue_num = 25576
    keywords = ['patch']
    message_count = 6.0
    messages = ['254263', '254316', '254332', '254347', '254361', '255302']
    nosy_count = 5.0
    nosy_names = ['orsenthil', 'r.david.murray', 'docs@python', 'python-dev', 'martin.panter']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue25576'
    versions = ['Python 3.4', 'Python 3.5', 'Python 3.6']

    @vadmium
    Member Author

    vadmium commented Nov 7, 2015

    I understand using a “charset” parameter with “Content-Type: application/x-www-form-urlencoded” is not standardized. Since bpo-11082, the documentation advises to use it, but I propose to remove this advice.

    HTML 5 mentions setting a _charset_ parameter, and mentions decoding with a default of UTF-8 (not Latin-1!), but does not mention any Content-Type parameters.

    There seems to be confusion about what encoding it actually represents. According to <https://bugzilla.mozilla.org/show_bug.cgi?id=7533>, Mozilla briefly set this “charset” parameter a long time ago, but it would have corresponded to the urlencode(encoding=...) argument. The Python documentation currently suggests calling data.encode("utf-8"), which is misleading, because the urlencode() output is already guaranteed to be ASCII text. Any non-ASCII characters and bytes will already have been character-encoded and percent-encoded by urlencode(). So I also propose to change the examples to data.encode("ascii").
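As a rough illustration of the point above (not taken from the patch; the field values and the example.com URL are made up), the sketch below shows that the character encoding is chosen by the urlencode(encoding=...) argument, and that the resulting string is pure ASCII either way, so data.encode("ascii") is sufficient:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields containing non-ASCII characters.
fields = {"name": "Zoë", "city": "São Paulo"}

# The character encoding is applied here, before percent-encoding.
query_utf8 = urlencode(fields, encoding="utf-8")
query_latin1 = urlencode(fields, encoding="latin-1")

# Both results are plain ASCII strings; only the percent-escapes differ.
data = query_utf8.encode("ascii")

# Placeholder URL; no network access happens until urlopen() is called.
req = Request("http://example.com/form", data=data)
```

Calling encode("utf-8") on the same string would produce identical bytes, which is why it suggests a choice of encoding that is not actually being made at that step.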

    @vadmium vadmium added the docs Documentation in the Doc dir label Nov 7, 2015
    @bitdancer
    Member

    Although I didn't read through the whole thing, the Mozilla bug discussion indicates that this is the correct way to specify the charset; the problem was that a lot of buggy software didn't handle setting it to latin-1. Is the same true for setting it to utf-8?

    Agreed about the encode call.

    @vadmium
    Member Author

    vadmium commented Nov 8, 2015

    I think the server bugs referenced by the Mozilla bug are mainly about servers that do not recognize the content type at all, due to the presence of any charset parameter. They probably do something like “if headers['Content-Type'] == 'application/x-www-form-urlencoded'” without checking for parameters first. So it wouldn’t matter if it was charset=latin-1 or charset=utf-8.
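To make the suspected failure mode concrete, here is a minimal sketch of the two kinds of check. Both function names are hypothetical and do not come from any real server; the first mirrors the exact-match comparison guessed at above, and the second strips parameters before comparing:

```python
FORM_TYPE = "application/x-www-form-urlencoded"

def is_form_naive(content_type):
    # Exact string comparison: any parameter makes the match fail.
    return content_type == FORM_TYPE

def is_form_tolerant(content_type):
    # Drop everything after the first ";" and compare the bare media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == FORM_TYPE

header = "application/x-www-form-urlencoded; charset=utf-8"
naive_result = is_form_naive(header)        # False
tolerant_result = is_form_tolerant(header)  # True
```

With the naive check, the value of the charset parameter is irrelevant: its mere presence is enough to make the server miss the form data.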

    A couple of comments in the Mozilla bug say that including “charset” is specified by an HTTP standard, but I suspect this may be a mistake. Perhaps this is the best evidence for my argument, from <http://www.w3.org/TR/html/forms.html#url-encoded-form-data>:

    '''
    Parameters on the “application/x-www-form-urlencoded” MIME type are ignored. In particular, this MIME type does not support the “charset” parameter.
    '''
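Read together with the UTF-8 decoding default mentioned earlier in this thread, the quoted rule suggests a receiver along the following lines. This is only a sketch under those assumptions: decode_form is a hypothetical helper, not an API of any library, and it deliberately ignores whatever “charset” the sender supplied:

```python
from urllib.parse import parse_qs

def decode_form(body, content_type):
    # Compare only the bare media type; parameters such as charset are ignored.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/x-www-form-urlencoded":
        raise ValueError("not a URL-encoded form: %r" % content_type)
    # The body itself is ASCII; percent-escapes are decoded as UTF-8.
    return parse_qs(body.decode("ascii"), encoding="utf-8")

result = decode_form(
    b"name=Zo%C3%AB",
    "application/x-www-form-urlencoded; charset=latin-1",
)
```

Here the misleading charset=latin-1 parameter has no effect on the decoded value.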

    @bitdancer
    Member

    OK, I'll accept that as authoritative :)

    One very minor comment in the review, otherwise looks good to me.

    @vadmium
    Member Author

    vadmium commented Nov 8, 2015

    The second version of the patch changes some more examples in the how-to to data.encode("ascii"). I’ll leave this open for a bit in case Senthil is around and wants to comment (seeing as he added the text I am removing).

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 24, 2015

    New changeset 16fec577fd8b by Martin Panter in branch '3.4':
    Issue bpo-25576: Remove application/x-www-form-urlencoded charset advice
    https://hg.python.org/cpython/rev/16fec577fd8b

    New changeset 95ae5262d27c by Martin Panter in branch '3.5':
    Issue bpo-25576: Merge www-form-urlencoded doc from 3.4 into 3.5
    https://hg.python.org/cpython/rev/95ae5262d27c

    New changeset d52521d13a64 by Martin Panter in branch 'default':
    Issue bpo-25576: Merge www-form-urlencoded doc from 3.5
    https://hg.python.org/cpython/rev/d52521d13a64

    New changeset 671429cc1d96 by Martin Panter in branch 'default':
    Issue bpo-25576: Apply fix to new urlopen() doc string
    https://hg.python.org/cpython/rev/671429cc1d96

    @vadmium vadmium closed this as completed Nov 24, 2015
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022