Skip to content

Incorrect return type hints in urllib module #10466

@cushionbadak

Description

@cushionbadak

I found four incorrect return type hints in urllib. I hope this will contribute to improving annotations.

1. Incorrect type hint in urllib.request.HTTPPasswordMgr.reduce_uri

Overview

Target for Modification: HTTPPasswordMgr.reduce_uri in urllib/request.pyi
Existing Type Hint: (self, uri: str, default_port: bool = True) -> str
Suggested Type Hint: (self, uri: str, default_port: bool = True) -> tuple[str, str]

Explanation

The method HTTPPasswordMgr.reduce_uri returns two values.

class HTTPPasswordMgr:
    def reduce_uri(self, uri, default_port=True):
        ...
        return authority, path

The return type should be tuple[str, str] as it returns a pair of strings ('127.0.0.1:60709', '/') in the CPython test case BasicAuthTests.test_basic_auth_success in Lib/test/test_urllib2_localnet.py, through the add_password method call (add_password -> reduce_uri).
Note that the port number in the return value is determined dynamically.

2. Incorrect type hint in urllib.request.AbstractDigestAuthHandler.get_authorization

Overview

Target for Modification: AbstractDigestAuthHandler.get_authorization in urllib/request.pyi
Existing Type Hint: (self, req: Request, chal: Mapping[str, str]) -> str
Suggested Type Hint: (self, req: Request, chal: Mapping[str, str]) -> str | None

Explanation

The method AbstractDigestAuthHandler.get_authorization can return None in certain exception cases.

class AbstractDigestAuthHandler:
    def get_authorization(self, req, chal):
        try:
            ...
        except KeyError:
            return None

        H, KD = self.get_algorithm_impls(algorithm)
        if H is None:
            return None

        user, pw = self.passwd.find_user_password(realm, req.full_url)
        if user is None:
            return None
        ...

The occurrence of None as a return value, though hard to track, can be seen in the test case ProxyAuthTests.test_proxy_with_no_password_raises_httperror in test_urllib2_localnet.py. The test case asserts that the method call self.opener.open(self.URL) should raise an HTTPError.

Here's the traceback how open can call get_authorization method:

[Traceback Results at the beginning of the get_authorization]

(urllib/request.py) OpenerDirector.open
        response = meth(req, response)
-> HTTPErrorProcessor.http_response
        response = self.parent.error
-> OpenerDirector.error
        result = self._call_chain(*args)
-> ProxyDigestAuthHandler.http_error_407
        retry = self.retry_error_auth_reqed('proxy-authenticate', ...
-> AbstractDigestAuthHandler.http_error_auth_reqed
        return self.retry_http_digest_auth(req, authreq)
-> AbstractDigestAuthHandler.retry_http_digest_auth
        auth = self.get_authorization(req, chal)
-> AbstractDigestAuthHandler.get_authorization

3. Incorrect type hint in urllib.request.HTTPPasswordMgrWithPriorAuth.is_authenticated.

Overview

Target for Modification: HTTPPasswordMgrWithPriorAuth.is_authenticated in urllib/request.pyi
Existing Type Hint: (self, authuri: str) -> bool
Suggested Type Hint: (self, authuri: str) -> bool | None

Explanation

The method HTTPPasswordMgrWithPriorAuth.is_authenticated can return None in cases where it reaches the end of the function without hitting any return statement.

class HTTPPasswordMgrWithPriorAuth(HTTPPasswordMgrWithDefaultRealm):
    def is_authenticated(self, authuri):
        for default_port in True, False:
            reduced_authuri = self.reduce_uri(authuri, default_port)
            for uri in self.authenticated:
                if self.is_suburi(uri, reduced_authuri):
                    return self.authenticated[uri]

This edge case can be seen in the test case HandlerTests.test_basic_prior_auth_auto_send in test_urllib2.py, where the erroneous call is_authenticated(request_url + 'plain') returns None and still passes the assertion.

4. Incorrect type hint in urllib.request.ftpwrapper.retrfile

Overview

Target for Modification: ftpwrapper.retrfile in urllib/request.pyi
Existing Type Hint: (self, file: str, type: str) -> tuple[addclosehook, int]
Suggested Type Hint: (self, file: str, type: str) -> tuple[addclosehook, int | None]

Explanation

The second element in the return value of the method ftpwrapper.retrfile can be None value though the type hint suggesting it should be an integer.

class ftpwrapper:
    def init(self):
        import ftplib
        self.ftp = ftplib.FTP()

    def retrfile(self, file, type):
        ...
            conn, retrlen = self.ftp.ntransfercmd(cmd)
        ...
        return (ftpobj, retrlen)

The implementation of FTP.ntransfercmd demonstrates that its second return value can indeed be None, a fact currently not reflected in the type hint.

# Lib/ftplib.py
class FTP:
    def ntransfercmd(self, cmd, rest=None):
        size = None
        ...
        if resp[:3] == '150':
            size = parse150(resp)
        return conn, size

def parse150(resp):
    ...
    if not m:
        return None
    return int(m.group(1))

def ntransfercmd(self, cmd: str, rest: int | str | None = None) -> tuple[socket, int]: ...

The test case TimeoutTest.test_ftp_basic in test_urllib2net.py has how to occur this type hint mismatch by calling _urlopen_with_retry (which is wrapped urlopen function).

# Connecting to remote hosts is flaky.  Make it more robust by retrying
# the connection several times.
_urlopen_with_retry = _wrap_with_retry_thrice(urllib.request.urlopen,
                                              urllib.error.URLError)

class TimeoutTest(unittest.TestCase):
    def test_ftp_basic(self):
        self.assertIsNone(socket.getdefaulttimeout())
        with socket_helper.transient_internet(self.FTP_HOST, timeout=None):
            u = _urlopen_with_retry(self.FTP_HOST)
            ...

In the context of the above test case, urlopen may call retrfile with the following stack trace:

[Traceback Results at the beginning of the retrfile]

(urllib/request.py) urlopen
        return opener.open(url, data, timeout)
-> OpenerDirector.open
        response = self._open(req, data)
-> OpenerDirector._open
        result = self._call_chain(self.handle_open, protocol, protocol + ...
-> OpenerDirector._call_chain
        result = func(*args)
-> FTPHandler.ftp_open
        fp, retrlen = fw.retrfile(file, type)
-> ftpwrapper.retrfile

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions