urllib.splitport -- is it official or not? #71672
Comments
I've seen and written some code that uses urllib.splitport() [1], but it's not in the export list, nor in the docs. However I see no easy other way to perform the same function. Should we make it official, or get rid of it? It's used internally in urllib/request.py [2]. There's a test for it in test_urlparse.py [3], but another test [4] also acknowledges that it's "undocumented" (which suggests that the author of that test didn't know what to do with it either). Same question for the others in that list [4]: References: |
splitport() doesn't work with IPv6 ("[::1]", see bpo-18191), nor with authority ("user:password@example.com"). Note that there is an almost duplicate function, splitnport(). The existence of two similar functions that behave differently in corner cases looks confusing. And it seems splitport() and splitnport() are not always used correctly internally (see bpo-20271). |
Previous discussion: bpo-1722, bpo-11009. In Python 2, most of the split- functions _have_ been in urllib.__all__ since revision 5d68afc5227c (2.1). Also, since revision c3656dca65e7 (bpo-1722, 2.7.4), the RST documentation does mention that at least some of them are deprecated in favour of the “urlparse” module. However, there are no index entries, and splitport() is not mentioned by name. In Python 3, these functions wandered into urllib.parse. There is no RST documentation, and the functions are not in __all__ (which was added for bpo-13287 in 3.3). I think you can use the documented urllib.parse API instead of splitport(), but it is borderline unwieldy:
>>> import urllib.parse
>>> from urllib.parse import urlsplit, SplitResult
>>> netloc = "[::1]:80"
>>> urllib.parse.splitport(netloc) # [Brackets] kept!
('[::1]', '80')
>>> split = urlsplit("//" + netloc); (split.hostname, split.port)
('::1', 80)
>>> split = SplitResult("", netloc, path="", query="", fragment=""); (split.hostname, split.port)
('::1', 80)
I opened bpo-23416 with a suggestion that would make SplitResult a bit simpler to use here. But maybe it makes the implementation too complicated. I don’t think the non-split-names (Quoter, etc) are in much doubt. They were never in __all__. |
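As a hedged illustration of the documented-API route shown above, the sketch below wraps the `urlsplit("//" + netloc)` trick in a small helper. The name `split_host_port` is hypothetical and not part of urllib. Unlike splitport(), it strips IPv6 brackets, lowercases the host, and returns the port as an int (or None when absent), so it is not a drop-in replacement.

```python
from urllib.parse import urlsplit


def split_host_port(netloc):
    """Hypothetical helper: return (hostname, port) for a bare netloc string.

    Prefixing "//" makes urlsplit() treat the input as an authority
    component rather than as a path.
    """
    parts = urlsplit("//" + netloc)
    return parts.hostname, parts.port


if __name__ == "__main__":
    for netloc in ("[::1]:80", "parrot:88", "user:password@example.com"):
        print(netloc, "->", split_host_port(netloc))
```

The helper also copes with the "user:password@example.com" authority form mentioned earlier in the thread, because urlsplit() separates the userinfo before extracting the host.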
Aha. I see you are referring to this note in the 2.7 docs for urllib:
This is somewhat ironic because those functions still exist in urllib.parse. I've rewritten my code using your suggestions of using urllib.parse.urlparse(). Shall we just close this issue or is there still an action item? (Maybe actually delete those functions whose deletion has been promised so long ago, or at least rename them to _splitport() etc.?) |
Probably a rename is good. The question then becomes whether the old names should raise a DeprecationWarning for a release? |
I think that we should encourage everyone to use the higher level functions like urlparse() or urlsplit() and then get the .port from the named tuple result. Two things to do.
|
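To make the "higher level functions" suggestion concrete, here is a minimal sketch that reads the authority pieces from the result of urlsplit() on a full URL. The function name `describe_authority` and the example URL are assumptions for illustration only; the attributes used (`username`, `password`, `hostname`, `port`) are the documented ones on the result object.

```python
from urllib.parse import urlsplit


def describe_authority(url):
    """Hypothetical helper: collect the authority parts of a full URL."""
    parts = urlsplit(url)
    return {
        "username": parts.username,  # None when no userinfo is present
        "password": parts.password,  # None when no password is present
        "hostname": parts.hostname,  # lowercased, IPv6 brackets removed
        "port": parts.port,          # int, or None when not specified
    }


if __name__ == "__main__":
    print(describe_authority("http://User:Pass@www.python.org:8080/doc"))
```

This covers roughly what splituser(), splitpasswd() and splitport() were used for, though the return types differ (for example, the port is an int rather than a string).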
Would it be OK for me to work on this? |
Go for it, Cheryl! |
Skimming the issue I can't even figure out what the task is -- Cheryl, I suppose you have, could you post a brief summary of your plan here? |
Thank you. From my understanding, urllib didn't officially support the split* functions (splittype, splithost, splitport, splitnport, splituser, splitpasswd, splitattr, splitquery, splitvalue, splittag) even though they were migrated to urllib.parse. The 2.7 documentation recommended using urlparse and stated that these split* functions would not be exposed in Python 3, but they are. The proposal would be as Senthil suggested: to raise a DeprecationWarning if the current names are used and to rename them with a single underscore (to _split*). However, I did have some questions.
|
I don't think it is worth changing the implementations to be in terms of urlsplit or urlparse. This is proposed for splithost in <https://github.com/python/cpython/pull/1849>, but I suspect it would change the behaviour in some corner cases. See bpo-22852 for some deficiencies with urlsplit.
|
Martin, thank you for the information and for pointing out those other related issues. It makes sense to separate the security or bug issues from this change. |
Thanks! This is now fixed for Python 3.8 \o/ |
This change made test_urlparse fail when run with -We, or just produce a lot of warnings in the default mode.
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1113, in test_splitattr
self.assertEqual(splitattr('/path;attr1=value1;attr2=value2'),
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1103, in splitattr
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitattr() is deprecated as of 3.8, use urllib.parse.urlparse() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1006, in test_splithost
self.assertEqual(splithost('//www.example.org:80/foo/bar/baz.html'),
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 977, in splithost
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splithost() is deprecated as of 3.8, use urllib.parse.urlparse() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1077, in test_splitnport
self.assertEqual(splitnport('parrot:88'), ('parrot', 88))
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1049, in splitnport
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitnport() is deprecated as of 3.8, use urllib.parse.urlparse() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1050, in test_splitpasswd
self.assertEqual(splitpasswd('user:ab'), ('user', 'ab'))
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1013, in splitpasswd
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitpasswd() is deprecated as of 3.8, use urllib.parse.urlparse() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1066, in test_splitport
self.assertEqual(splitport('parrot:88'), ('parrot', '88'))
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1026, in splitport
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitport() is deprecated as of 3.8, use urllib.parse.urlparse() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1091, in test_splitquery
self.assertEqual(splitquery('http://python.org/fake?foo=bar'),
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1073, in splitquery
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitquery() is deprecated as of 3.8, use urllib.parse.urlparse() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1101, in test_splittag
self.assertEqual(splittag('http://example.com?foo=bar#baz'),
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1088, in splittag
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splittag() is deprecated as of 3.8, use urllib.parse.urlparse() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 998, in test_splittype
self.assertEqual(splittype('type:opaquestring'), ('type', 'opaquestring'))
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 956, in splittype
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splittype() is deprecated as of 3.8, use urllib.parse.urlparse() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1035, in test_splituser
self.assertEqual(splituser('User:Pass@www.python.org:080'),
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1000, in splituser
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splituser() is deprecated as of 3.8, use urllib.parse.urlparse() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1124, in test_splitvalue
self.assertEqual(splitvalue('foo=bar'), ('foo', 'bar'))
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 1117, in splitvalue
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.splitvalue() is deprecated as of 3.8, use urllib.parse.parse_qsl() instead
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1131, in test_to_bytes
result = urllib.parse.to_bytes('http://www.python.org')
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 920, in to_bytes
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.to_bytes() is deprecated as of 3.8
======================================================================
Traceback (most recent call last):
File "/home/serhiy/py/cpython-gc/Lib/test/test_urlparse.py", line 1137, in test_unwrap
url = urllib.parse.unwrap('<URL:type://host/path>')
File "/home/serhiy/py/cpython-gc/Lib/urllib/parse.py", line 940, in unwrap
DeprecationWarning, stacklevel=2)
DeprecationWarning: urllib.parse.unwrap() is deprecated as of 3.8 |
Serhiy, Thanks for finding this. I've submitted a PR to fix the tests. |
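For readers wondering how tests of a deprecated function can stay green under -We (warnings treated as errors), one common approach is to assert the warning explicitly. The sketch below is an assumption about the general technique, not the content of the PR mentioned above; `splitvalue` here is a self-contained stand-in rather than the urllib.parse function.

```python
import io
import unittest
import warnings


def splitvalue(attr):
    """Stand-in for a deprecated function (not the urllib.parse one)."""
    warnings.warn(
        "splitvalue() is deprecated, use urllib.parse.parse_qsl() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    name, sep, value = attr.partition("=")
    return name, (value if sep else None)


class SplitValueTest(unittest.TestCase):
    def test_splitvalue(self):
        # assertWarns captures the warning, so it never escalates to an
        # error even when warnings are configured to raise.
        with self.assertWarns(DeprecationWarning):
            result = splitvalue("foo=bar")
        self.assertEqual(result, ("foo", "bar"))


def run_with_warnings_as_errors():
    """Run the suite with warnings turned into errors, similar to -We."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SplitValueTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), warnings="error")
    return runner.run(suite)


if __name__ == "__main__":
    print("success:", run_with_warnings_as_errors().wasSuccessful())
```

Without the `assertWarns` block, the same test would fail under the "error" filter, because the DeprecationWarning would be raised as an exception, as in the tracebacks above.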
Please refer to bpo-35891 for a description of an important use-case broken by the planned removal of splituser. |
Follow-up: I created bpo-45084 to remove these undocumented and deprecated functions in Python 3.11. |