CVE-2019-18348: CRLF injection via the host part of the url passed to urlopen() #82757

Closed
ret2libc mannequin opened this issue Oct 24, 2019 · 13 comments
Labels
release-blocker, stdlib (Python modules in the Lib dir), type-security (A security issue)

Comments

@ret2libc
Mannequin

ret2libc mannequin commented Oct 24, 2019

BPO 38576
Nosy @gpshead, @vstinner, @larryhastings, @benjaminp, @ned-deily, @mcepl, @koobs, @stratakis, @miss-islington, @tirkarthi, @epicfaace, @ware, @ret2libc, @tapakund, @b1tninja
PRs
  • bpo-38576: Disallow control characters in hostnames in http.client #18995
  • [3.8] bpo-38576: Disallow control characters in hostnames in http.client (GH-18995) #19000
  • [3.7] bpo-38576: Disallow control characters in hostnames in http.client (GH-18995) #19001
  • [3.6] bpo-38576: Disallow control characters in hostnames in http.client (GH-18995) #19002
  • [2.7] closes bpo-38576: Disallow control characters in hostnames in http.client. #19052
  • [3.5] closes bpo-38576: Disallow control characters in hostnames in h… #19231
  • [3.5] closes bpo-38576: Disallow control characters in hostnames in h… #19300
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/benjaminp'
    closed_at = <Date 2020-03-19.01:35:47.815>
    created_at = <Date 2019-10-24.07:51:18.864>
    labels = ['type-security', 'library', 'release-blocker']
    title = 'CVE-2019-18348: CRLF injection via the host part of the url passed to urlopen()'
    updated_at = <Date 2022-02-28.20:23:55.129>
    user = 'https://github.com/ret2libc'

    bugs.python.org fields:

    activity = <Date 2022-02-28.20:23:55.129>
    actor = 'ned.deily'
    assignee = 'benjamin.peterson'
    closed = True
    closed_date = <Date 2020-03-19.01:35:47.815>
    closer = 'benjamin.peterson'
    components = ['Library (Lib)']
    creation = <Date 2019-10-24.07:51:18.864>
    creator = 'rschiron'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38576
    keywords = ['patch']
    message_count = 13.0
    messages = ['355294', '357073', '357442', '362353', '364190', '364191', '364192', '364193', '364207', '364208', '364499', '364584', '371922']
    nosy_count = 18.0
    nosy_names = ['gregory.p.smith', 'vstinner', 'larry', 'benjamin.peterson', 'ned.deily', 'mcepl', 'python-dev', 'koobs', 'cstratak', 'miss-islington', 'xtreak', 'epicfaace', 'ware', 'rschiron', 'tapakund', 'Anselmo Melo', 'b1tninja', 'kim']
    pr_nums = ['18995', '19000', '19001', '19002', '19052', '19231', '19300']
    priority = 'release blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue38576'
    versions = ['Python 2.7']


    Copy-pasted from https://bugs.python.org/issue30458#msg347282

    ================
    The commit b7378d7 is incomplete: it doesn't seem to check for control characters in the "host" part of the URL, only in the "path" part of the URL. Example:
    ---
    try:
        from urllib import request as urllib_request  # Python 3
    except ImportError:
        import urllib2 as urllib_request  # Python 2
    import socket

    def bug(*args):
        # Fail loudly if urlopen() gets as far as opening a socket.
        raise Exception(args)

    # urlopen() must not call create_connection()
    socket.create_connection = bug
    urllib_request.urlopen('http://127.0.0.1\r\n\x20hihi\r\n :11211')
    ---

    The URL comes from the first message of this issue:
    https://bugs.python.org/issue30458#msg294360

    Development branches 2.7 and master produce a similar output:
    ---
    Traceback (most recent call last):
     ...
    Exception: (('127.0.0.1\r\n hihi\r\n ', 11211), ..., None)
    ---

    So urllib2/urllib.request actually attempts a real network connection (including a DNS query), whereas it should reject control characters in the "host" part of the URL.
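As a rough illustration of the check being asked for here, the sketch below rejects a host that contains control characters before any connection is attempted. The helper name `check_host` and the exact character set (C0 controls, space, and DEL) are assumptions made for this example; this is not the code that was merged.

```python
import re

# Assumption: treat ASCII control characters, space, and DEL as disallowed.
_DISALLOWED_HOST_CHARS = re.compile(r'[\x00-\x20\x7f]')


def check_host(host):
    """Return host unchanged, or raise ValueError if it has control characters."""
    match = _DISALLOWED_HOST_CHARS.search(host)
    if match:
        raise ValueError(
            "host can't contain control characters: %r (found %r)"
            % (host, match.group())
        )
    return host


if __name__ == "__main__":
    for candidate in ("example.com", "127.0.0.1\r\n hihi\r\n "):
        try:
            check_host(candidate)
            print("accepted:", repr(candidate))
        except ValueError as exc:
            print("rejected:", exc)
```

Running the check before name resolution means the malformed host from the reproducer never reaches `socket.create_connection()`.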


    A second problem comes into play. Some C libraries, like glibc, truncate the hostname at the first newline character, so HTTP header injection is still possible in this case:
    https://bugzilla.redhat.com/show_bug.cgi?id=1673465
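To make the injection concrete, here is a deliberately simplified model of a client that formats the `Host` header from an unvalidated host string. `build_request` is a hypothetical helper written for this sketch and is not how `http.client` builds requests internally; it only shows why CRLF in the host ends up as extra lines on the wire.

```python
def build_request(host, port, path="/"):
    """Naively format an HTTP/1.1 request, trusting host as-is (unsafe)."""
    return "GET {} HTTP/1.1\r\nHost: {}:{}\r\n\r\n".format(path, host, port)


# Host taken from the reproducer in this issue.
raw = build_request("127.0.0.1\r\n hihi\r\n ", 11211)
lines = raw.split("\r\n")

if __name__ == "__main__":
    for line in lines:
        print(repr(line))
```

The `Host` header line ends right after `127.0.0.1`, and the attacker-controlled text follows as separate lines. If the resolver also truncates the hostname at the first newline, the connection goes to `127.0.0.1` while the injected lines are still sent.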


    According to RFC 3986, the "host" grammar doesn't allow any control characters. It looks like this:

       host        = IP-literal / IPv4address / reg-name

       ALPHA (letters)
       DIGIT (decimal digits)
       unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
       pct-encoded = "%" HEXDIG HEXDIG
       sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                   / "*" / "+" / "," / ";" / "="
       reg-name    = *( unreserved / pct-encoded / sub-delims )

       IP-literal  = "[" ( IPv6address / IPvFuture ) "]"
       IPvFuture   = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
       IPv6address =                            6( h16 ":" ) ls32
                   /                       "::" 5( h16 ":" ) ls32
                   / [               h16 ] "::" 4( h16 ":" ) ls32
                   / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
                   / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
                   / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
                   / [ *4( h16 ":" ) h16 ] "::"              ls32
                   / [ *5( h16 ":" ) h16 ] "::"              h16
                   / [ *6( h16 ":" ) h16 ] "::"
       h16         = 1*4HEXDIG
       ls32        = ( h16 ":" h16 ) / IPv4address
       IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
    ================
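The `reg-name` rule quoted above translates fairly directly into a regular expression. The sketch below covers only `reg-name` (not `IP-literal`), and the function name `is_reg_name` is made up for this example. `re.fullmatch` is used so that a trailing newline cannot slip past the pattern.

```python
import re

# reg-name = *( unreserved / pct-encoded / sub-delims )
_REG_NAME = re.compile(
    r"(?:"
    r"[A-Za-z0-9\-._~]"        # unreserved
    r"|%[0-9A-Fa-f]{2}"        # pct-encoded
    r"|[!$&'()*+,;=]"          # sub-delims
    r")*"
)


def is_reg_name(host):
    """Return True if host matches the RFC 3986 reg-name production."""
    return _REG_NAME.fullmatch(host) is not None


if __name__ == "__main__":
    for candidate in ("example.com", "ex%41mple", "127.0.0.1\r\n hihi\r\n "):
        print(repr(candidate), is_reg_name(candidate))
```

Dotted IPv4 addresses also match this pattern, since digits and "." are unreserved characters. Bracketed IPv6 literals do not and would need a separate check.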

    CVE-2019-18348 was assigned to this flaw, which is similar to CVE-2019-9947 and CVE-2019-9740 but it is about the *host* part of a url.

    @ret2libc ret2libc mannequin added the type-security (A security issue) label Oct 24, 2019
    @vstinner vstinner changed the title from "CVE-2019-18348 CRLF injection via the host part of the url passed to urlopen()" to "CVE-2019-18348: CRLF injection via the host part of the url passed to urlopen()" Oct 24, 2019
    @vstinner vstinner added the labels stdlib (Python modules in the Lib dir), 3.7 (EOL) end of life, 3.8 (EOL) end of life, 3.9 only security fixes on Nov 19, 2019
    @b1tninja
    Mannequin

    b1tninja mannequin commented Nov 20, 2019

    I can't see the specifics of that "restricted" Red Hat bug, but this was an interesting bug, and I wanted to ask whether the domain in such cases should perhaps be IDN / Punycode-encoded. For example, ://xn--n28h.ws/ is ://💩.la
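For context on the IDN question, Python's built-in `idna` codec (which implements the older IDNA 2003 rules) converts a Unicode hostname to its ASCII form. The hostname below is an arbitrary illustration, not one taken from this issue, and `to_ascii_hostname` is a hypothetical wrapper.

```python
def to_ascii_hostname(hostname):
    """Encode a Unicode hostname to ASCII using the stdlib 'idna' codec."""
    return hostname.encode("idna")


if __name__ == "__main__":
    # "b\u00fccher" is "bücher"; the label gains an "xn--" prefix.
    print(to_ascii_hostname("b\u00fccher.example"))
```

This addresses non-ASCII labels only; whether it should also play a role in rejecting control characters is the open question raised in the comment.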

    @ret2libc
    Mannequin Author

    ret2libc mannequin commented Nov 25, 2019

    The glibc issue mentioned in the first comment is CVE-2016-10739.

    @mcepl
    Mannequin

    mcepl mannequin commented Feb 20, 2020

    Just to say, this is reproducible only on rather old enterprise Linux distributions where the CVE-2016-10739 bug in glibc has not been fixed. I believe that means RHEL-6 and SUSE SLE-10, 11, and 12 (not sure whether it also applies to some old Debian releases).

    @gpshead
    Member

    gpshead commented Mar 14, 2020

    New changeset 9165add by Ashwin Ramaswami in branch 'master':
    bpo-38576: Disallow control characters in hostnames in http.client (GH-18995)
    9165add
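On an interpreter that includes this change, the effect can be observed roughly as follows. This is a sketch under the assumption that a host containing control characters is rejected with `http.client.InvalidURL` either when the connection object is created or when the request line is queued; `host_is_rejected` is a hypothetical helper, and no network traffic is generated because nothing is sent.

```python
import http.client


def host_is_rejected(host):
    """Return True if http.client refuses the host with InvalidURL."""
    try:
        conn = http.client.HTTPConnection(host)
        # putrequest() only buffers the request line; no socket is opened.
        conn.putrequest("GET", "/")
    except http.client.InvalidURL:
        return True
    return False


if __name__ == "__main__":
    print(host_is_rejected("127.0.0.1\r\n hihi\r\n :11211"))
    print(host_is_rejected("example.com"))
```

On an unpatched interpreter the first call would be expected to return False, which makes this a quick way to check whether a given build carries the fix.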

    @gpshead
    Member

    gpshead commented Mar 14, 2020

    Thanks for the PR Ashwin!

    @gpshead gpshead self-assigned this Mar 14, 2020
    @miss-islington
    Contributor

    New changeset 34f85af by Miss Islington (bot) in branch '3.7':
    bpo-38576: Disallow control characters in hostnames in http.client (GH-18995)
    34f85af

    @miss-islington
    Contributor

    New changeset ff69c9d by Miss Islington (bot) in branch '3.8':
    bpo-38576: Disallow control characters in hostnames in http.client (GH-18995)
    ff69c9d

    @ned-deily
    Member

    New changeset 83fc701 by Miss Islington (bot) in branch '3.6':
    bpo-38576: Disallow control characters in hostnames in http.client (GH-18995) (GH-19002)
    83fc701

    @gpshead
    Member

    gpshead commented Mar 15, 2020

    If anyone cares about 2.7, the *final* release is coming up in a few weeks. They'll need to figure out what it looks like there and get a 2.7 PR reviewed by the release manager.

    @gpshead gpshead removed the labels 3.7 (EOL) end of life, 3.8 (EOL) end of life, 3.9 only security fixes on Mar 15, 2020
    @gpshead gpshead closed this as completed Mar 15, 2020
    @gpshead
    Member

    gpshead commented Mar 18, 2020

    Marking as a 2.7 release blocker just to get Benjamin's attention as release manager before the final 2.7 release.

    @gpshead gpshead reopened this Mar 18, 2020
    @gpshead gpshead assigned benjaminp and unassigned gpshead Mar 18, 2020
    @benjaminp
    Contributor

    New changeset e176e0c by Matěj Cepl in branch '2.7':
    [2.7] closes bpo-38576: Disallow control characters in hostnames in http.client. (GH-19052)
    e176e0c

    @larryhastings
    Contributor

    New changeset 09d8172 by Tapas Kundu in branch '3.5':
    [3.5] closes bpo-38576: Disallow control characters in hostnames in http.client. (GH-19300)
    09d8172

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022