urllib2 authentication problem #37910

Closed
gazzadee mannequin opened this issue Feb 5, 2003 · 9 comments
Labels
stdlib Python modules in the Lib dir

Comments

@gazzadee
Mannequin

gazzadee mannequin commented Feb 5, 2003

BPO 680577
Nosy @birkenfeld
Files
  • urllib2_proxy_auth.py: Demonstrates authentication problem
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2006-05-03.05:33:17.000>
    created_at = <Date 2003-02-05.00:22:00.000>
    labels = ['library']
    title = 'urllib2 authentication problem'
    updated_at = <Date 2006-05-03.05:33:17.000>
    user = 'https://bugs.python.org/gazzadee'

    bugs.python.org fields:

    activity = <Date 2006-05-03.05:33:17.000>
    actor = 'georg.brandl'
    assignee = 'none'
    closed = True
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2003-02-05.00:22:00.000>
    creator = 'gazzadee'
    dependencies = []
    files = ['781']
    hgrepos = []
    issue_num = 680577
    keywords = []
    message_count = 9.0
    messages = ['14438', '14439', '14440', '14441', '14442', '14443', '14444', '14445', '14446']
    nosy_count = 4.0
    nosy_names = ['georg.brandl', 'jjlee', 'ghaering', 'gazzadee']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue680577'
    versions = ['Python 2.3']

    @gazzadee
    Mannequin Author

    gazzadee mannequin commented Feb 5, 2003

    I've found a problem with authentication in urllib2.

    When host names are matched up to find a password, putting
    the protocol in the address makes it look like a different
    address. For example:

    I create an HTTPBasicAuthHandler with an
    HTTPPasswordMgrWithDefaultRealm, and add the tuple
    (None, "http://proxy.blah.com:17828", "foo", "bar") to it.

    I then set up the proxy to use
    http://proxy.blah.com:17828 (which requires
    authentication).

    When I connect, the password lookup fails, because it
    is trying to find a match for "proxy.blah.com:17828"
    rather than "http://proxy.blah.com:17828".

    This problem doesn't exist if I pass
    "proxy.blah.com:17828" to the password manager.

    There seems to be some stuff in HTTPPasswordMgr to deal
    with variations on site names, but I guess it's not
    working in this case (unless this is intentional).

    Version Info:
    Python 2.2 (#1, Feb 24 2002, 16:21:58)
    [GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)]
    on linux-i386
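
The lookup described in this report can be checked against the password manager alone, with no proxy involved. The following is a minimal sketch, assuming a current Python 3, where urllib2 lives on as `urllib.request`; the report itself concerns Python 2.2, where the bare `host:port` lookup reportedly failed. The host name and credentials are the placeholders from the report, and `other.example.com` is an assumed name added only as a negative control.

```python
from urllib.request import HTTPPasswordMgrWithDefaultRealm

# Placeholder proxy address from the report, in both spellings.
PROXY_WITH_SCHEME = "http://proxy.blah.com:17828"
PROXY_BARE = "proxy.blah.com:17828"

mgr = HTTPPasswordMgrWithDefaultRealm()
# Register the credentials under the scheme-qualified form only.
mgr.add_password(None, PROXY_WITH_SCHEME, "foo", "bar")

# Look the credentials up under both spellings of the proxy address.
by_scheme = mgr.find_user_password(None, PROXY_WITH_SCHEME)
by_bare = mgr.find_user_password(None, PROXY_BARE)

# Negative control: a host that was never registered (assumed name).
unknown = mgr.find_user_password(None, "other.example.com:17828")

print(by_scheme, by_bare, unknown)
```

On a release that includes the fix, both spellings should return the stored credentials, while the unregistered host returns `(None, None)`.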

    @gazzadee gazzadee mannequin closed this as completed Feb 5, 2003
    @gazzadee gazzadee mannequin added the stdlib Python modules in the Lib dir label Feb 5, 2003
    @ghaering
    Mannequin

    ghaering mannequin commented Feb 7, 2003

    Logged In: YES
    user_id=163326

    Can you please retry with Python 2.2.2?

    It seems that a related bug was fixed for 2.2.2:
    http://python.org/2.2.2/NEWS.txt has an entry:

    """

    • In urllib2.py: fix proxy config with user+pass
      authentication. [SF patch 527518]
    """

    @gazzadee
    Mannequin Author

    gazzadee mannequin commented Feb 9, 2003

    Logged In: YES
    user_id=693152

    Okay, the same problem crops up in Python 2.2.2 running
    under cygwin on Win XP

    Version Info:
    Python 2.2.2 (#1, Dec 31 2002, 12:24:34)
    [GCC 3.2 20020927 (prerelease)] on cygwin

    Here's the pertinent section of my test file (passwords and
    URL changed to protect the innocent):

        # Setup proxy
        proxy_handler = ProxyHandler({"http": "http://blah.com:17828"})

        # Setup authentication
        pass_mgr = HTTPPasswordMgrWithDefaultRealm()
        for passwd in [
                (None, "http://blah.com:17828", "foo", "bar"),
                # (None, "blah.com:17828", "foo", "bar"),  # Works if this line is uncommented
                (None, "blah.com", "foo", "bar"),
                ]:
            print("Adding password set (%s, %s, %s, %s)" % passwd)
            pass_mgr.add_password(*passwd)
        auth_handler = HTTPBasicAuthHandler(pass_mgr)
        proxy_auth_handler = ProxyBasicAuthHandler(pass_mgr)

        # Now build a new URL opener and install it
        opener = build_opener(proxy_handler, proxy_auth_handler,
                              auth_handler, HTTPHandler)
        install_opener(opener)

        # Now try to open a file and see what happens
        request = Request("http://www.google.com")
        try:
            remotefile = urlopen(request)
        except HTTPError, ex:
            print("Unable to download file due to HTTP Error %d (%s)." % (ex.code, ex.msg))
            return

    @jjlee
    Mannequin

    jjlee mannequin commented Dec 1, 2003

    Logged In: YES
    user_id=261020

    The problem seems to be with the port (:17828), not the URL
    scheme (http:), because HTTPPasswordMgr.reduce_uri()
    removes the scheme.

    RFC 2617 (top of page 3) says nothing about removing the
    port from the URI. urllib2 does not remove the port, so this
    doesn't appear to be a bug.

    I guess gazzadee was doing a urlopen with a different
    canonical root URI (RFC 2617, top of page 3 again) to the one
    he gave in add_password (ie. the URL he passed to urlopen()
    had no explicit port number).

    @gazzadee
    Mannequin Author

    gazzadee mannequin commented Dec 16, 2003

    Logged In: YES
    user_id=693152

    This was a while ago, and my memory has faded. I'll try to
    respond intelligently.

    I think the question was with the way the password manager
    looks up passwords, rather than anything else.

    I am pretty sure that the problem is not to do with the URI
    passed to urlopen(). In the code shown below, the problem
    was solely dependent on whether I added the line:
    (None, "blah.com:17828", "foo", "bar")
    ...to the HTTPPasswordMgrWithDefaultRealm object.

    If that password set was added, then the password lookup for
    the proxy was successful, and urlopen() worked. If that
    password set was not included, then the password lookup for
    the proxy was unsuccessful (despite the inclusion of the
    other 2, similar, password sets - "http://blah.com:17828"
    and "blah.com"), and urlopen() would fail. Hence my
    suspicion that the password manager did not fully remove the
    scheme, despite attempts to do so.

    I'll see if I can set it up on the latest python and get it
    to happen again.

    By way of explanation: I was running an authenticating proxy
    on a non-standard port (to avoid clashing with the normal
    proxy) so that I could test how my download code would work
    through an authenticating proxy.

    @gazzadee
    Mannequin Author

    gazzadee mannequin commented Dec 16, 2003

    Logged In: YES
    user_id=693152

    Okay, I have attached a file that replicates this problem.

    If you run it as is (replacing the proxy name and address
    with something suitable), then it will fail (requiring proxy
    authentication).

    If you uncomment line 23 (which specifies the password
    without the scheme), then it will work successfully.

    Technical Info:

    • For a proxy, I am using Squid Cache version 2.4.STABLE7
      for i586-mandrake-linux-gnu...
    • I have replicated the problem with Python 2.2.2 on Linux,
      and Python 2.3.2 on Windows XP.

    @jjlee
    Mannequin

    jjlee mannequin commented Dec 16, 2003

    Logged In: YES
    user_id=261020

    Thanks! 
     
    It seems .reduce_uri() tries to cope with hostnames as well as 
    absoluteURIs.  I don't understand why it wants to do that, but it 
    fails, because it doesn't anticipate what urlparse does when a 
    port is present: 
     
    >>> urlparse.urlparse("foo.bar.com") 
    ('', '', 'foo.bar.com', '', '', '') 
    >>> urlparse.urlparse("foo.bar.com:80") 
    ('foo.bar.com', '', '80', '', '', '') 
     
    I haven't checked, but I assume it's just incorrect use of 
    urlparse to pass it a hostname. 
     
    Of course, if it's "fixed" to only accept absoluteURIs, it will 
    break existing code, so I guess it must be fixed for 
    hostnames. :-(( 
     
    Also, I think .is_suburi("/foo/spam", "/foo/eggs") should return
    False, but it returns True, and the .http_error_40x() methods use
    req.get_host() when they should be using req.get_full_url()
    (from a quick look at RFC 2617).
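
The two behaviours discussed in this comment (accepting either a bare `host:port` or an absolute URI, and comparing paths segment by segment) can be sketched as below. This is an illustration under stated assumptions, not the code from any patch: it is written for Python 3's `urllib.parse`, and `reduce_to_authority_and_path` is a hypothetical helper name. It relies only on the observation that a string without `//` yields an empty network location from `urlsplit`, however the remaining fields come out.

```python
from urllib.parse import urlsplit


def reduce_to_authority_and_path(uri):
    """Hypothetical helper: accept 'host[:port]' or an absolute URI."""
    parts = urlsplit(uri)
    if parts.netloc:
        # Absolute URI such as "http://foo.bar.com:80/spam".
        return parts.netloc, parts.path or "/"
    # Bare "host" or "host:port": urlsplit found no "//" authority,
    # so treat the whole string as the authority.
    return uri, "/"


def is_suburi(base, test):
    """Return True if `test` (authority, path) is at or below `base`."""
    if base == test:
        return True
    if base[0] != test[0]:
        return False
    # Compare on a segment boundary so "/foo/spam" does not match "/foo/eggs".
    prefix = base[1] if base[1].endswith("/") else base[1] + "/"
    return test[1].startswith(prefix)


bare = reduce_to_authority_and_path("foo.bar.com:80")
full = reduce_to_authority_and_path("http://foo.bar.com:80")
print(bare, full, is_suburi(bare, full))
```

With this approach the scheme-qualified and bare forms reduce to the same pair, so a password stored under one form can be found under the other.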

    @jjlee
    Mannequin

    jjlee mannequin commented Apr 15, 2006

    Logged In: YES
    user_id=261020

    This issue is fixed by patch 1470846.

    @birkenfeld
    Member

    Logged In: YES
    user_id=849994

    Closing accordingly.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022