urlunsplit for itms-services scheme returns invalid url #104139

Closed
AndyQ opened this issue May 3, 2023 · 4 comments
Labels
3.12 (bugs and security fixes); stdlib (Python modules in the Lib dir); type-bug (An unexpected behavior, bug, or error); type-feature (A feature request or enhancement)

Comments

AndyQ commented May 3, 2023

Relating to a Werkzeug issue (pallets/werkzeug#2691): when parsing an iOS app install URL, e.g.
itms-services://?action=download-manifest&url=https://theacmeinc.com/abcdefeg, urlunsplit returns an invalid URL.

e.g.

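from urllib.parse import urlparse, urlunsplit

vals = urlparse("itms-services://?action=download-manifest&url=https://theacmeinc.com/abcdefeg")
print(vals)

# urlunsplit takes (scheme, netloc, path, query, fragment); params is not one of its fields
newURL = urlunsplit((vals.scheme, vals.netloc, vals.path, vals.query, vals.fragment))
print(newURL)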

prints:

ParseResult(scheme='itms-services', netloc='', path='', params='', query='action=download-manifest&url=https://theacmeinc.com/abcdefeg', fragment='')

itms-services:?action=download-manifest&url=https://theacmeinc.com/abcdefeg

Note the newURL is missing the // after the itms-services scheme.

This scheme is used to install ad-hoc and enterprise iOS apps.

Your environment

Tested on Apple M1 Max - 13.4 Beta (22F5049e)
Python: 3.10.10

For more details on the scheme here is a link to the Apple documentation (look for the "Use a website to distribute the app" section).
https://support.apple.com/en-gb/guide/deployment/depce7cefc4d/web

@AndyQ AndyQ added the type-bug An unexpected behavior, bug, or error label May 3, 2023
@arhadthedev arhadthedev added the stdlib Python modules in the Lib dir label May 3, 2023

davidism commented May 3, 2023

I don't think this scheme is standard. https://url.spec.whatwg.org/#url-serializing says "If url's host is non-null: Append "//" to output." This itms-services URL doesn't have a host, so it shouldn't require a //. I think there's already some special casing for schemes in the uses_netloc list, so a similar "add //" rule could be added for this scheme.

# A classification of schemes.
# The empty string classifies URLs with no scheme specified,
# being the default value returned by “urlsplit” and “urlparse”.
uses_relative = ['', 'ftp', 'http', 'gopher', 'nntp', 'imap',
                 'wais', 'file', 'https', 'shttp', 'mms',
                 'prospero', 'rtsp', 'rtspu', 'sftp',
                 'svn', 'svn+ssh', 'ws', 'wss']
uses_netloc = ['', 'ftp', 'http', 'gopher', 'nntp', 'telnet',
               'imap', 'wais', 'file', 'mms', 'https', 'shttp',
               'snews', 'prospero', 'rtsp', 'rtspu', 'rsync',
               'svn', 'svn+ssh', 'sftp', 'nfs', 'git', 'git+ssh',
               'ws', 'wss']
uses_params = ['', 'ftp', 'hdl', 'prospero', 'http', 'imap',
               'https', 'shttp', 'rtsp', 'rtspu', 'sip', 'sips',
               'mms', 'sftp', 'tel']
# These are not actually used anymore, but should stay for backwards
# compatibility. (They are undocumented, but have a public-looking name.)
non_hierarchical = ['gopher', 'hdl', 'mailto', 'news',
                    'telnet', 'wais', 'imap', 'snews', 'sip', 'sips']
uses_query = ['', 'http', 'wais', 'imap', 'https', 'shttp', 'mms',
              'gopher', 'rtsp', 'rtspu', 'sip', 'sips']
uses_fragment = ['', 'ftp', 'hdl', 'http', 'gopher', 'news',
                 'nntp', 'wais', 'https', 'shttp', 'snews',
                 'file', 'prospero']
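
To make the uses_netloc special casing concrete, here is a small sketch of how membership in that list appears to change what urlunsplit emits when the netloc is empty. The scheme name "example-scheme" is made up for illustration and is assumed not to be in the list.

```python
from urllib.parse import urlunsplit, uses_netloc

# "http" is listed in uses_netloc, so an empty netloc still serializes with "//".
listed = urlunsplit(("http", "", "", "a=1", ""))

# "example-scheme" is a hypothetical scheme, assumed absent from uses_netloc.
assert "example-scheme" not in uses_netloc
unlisted = urlunsplit(("example-scheme", "", "", "a=1", ""))

print(listed)    # http://?a=1
print(unlisted)  # example-scheme:?a=1
```

The second result has the same shape as the itms-services output reported above: scheme, colon, then the query, with no "//".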


davidism commented May 8, 2023

@gpshead I know you were looking at some other urllib issues recently, could you comment on this?

@gpshead gpshead self-assigned this May 8, 2023
Member

gpshead commented May 8, 2023

I agree with @davidism that the WhatWG URL spec does not require // when there is no host.

Regardless, behavior-wise this seems to match our existing uses_netloc special case, so we can just add it to that list in Lib/urllib/parse.py and add test coverage in Lib/test/test_urlparse.py. I made a PR.
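
For readers wondering what such test coverage could look like, below is a rough sketch of a round-trip test. It is not the test from the PR; the class name and test name are invented here, and mock.patch.object is used only so the sketch behaves the same whether or not the running Python already lists the scheme.

```python
import unittest
import urllib.parse
from unittest import mock

URL = "itms-services://?action=download-manifest&url=https://theacmeinc.com/abcdefeg"


class ItmsServicesRoundTrip(unittest.TestCase):
    """Hypothetical test case, not the one in Lib/test/test_urlparse.py."""

    def test_round_trip_when_listed(self):
        # Build a copy of uses_netloc that is guaranteed to contain the scheme.
        listed = list(urllib.parse.uses_netloc)
        if "itms-services" not in listed:
            listed.append("itms-services")
        # Swap the module-level list only for the duration of the test.
        with mock.patch.object(urllib.parse, "uses_netloc", listed):
            parts = urllib.parse.urlsplit(URL)
            self.assertEqual(parts.scheme, "itms-services")
            self.assertEqual(parts.netloc, "")
            self.assertEqual(urllib.parse.urlunsplit(parts), URL)
```

Saved to a file, this should run with python -m unittest.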

Workaround for this to "work" on existing Pythons:

import urllib.parse

if "itms-services" not in urllib.parse.uses_netloc:
    urllib.parse.uses_netloc.append("itms-services")

I'm calling this a feature, as code will have to deal with Pythons that do not list it as such for a long time anyway via a code snippet like that, so backporting doesn't seem consistently helpful.
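
Since that snippet will likely live in application code for a while, one way to package it is a small idempotent helper, sketched below. The function name ensure_netloc_scheme is an assumption for illustration, not part of urllib.parse, and note that it mutates a module-level list shared by the whole process.

```python
import urllib.parse


def ensure_netloc_scheme(scheme: str) -> None:
    """Hypothetical helper: add scheme to uses_netloc if it is missing."""
    if scheme not in urllib.parse.uses_netloc:
        urllib.parse.uses_netloc.append(scheme)


ensure_netloc_scheme("itms-services")
ensure_netloc_scheme("itms-services")  # second call changes nothing

url = "itms-services://?action=download-manifest&url=https://theacmeinc.com/abcdefeg"
rebuilt = urllib.parse.urlunsplit(urllib.parse.urlsplit(url))
print(rebuilt == url)  # True
```

Calling the helper once at import time, before any URL is reassembled, should be enough; on Pythons that already list the scheme it does nothing.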

@gpshead gpshead added type-feature A feature request or enhancement 3.12 bugs and security fixes labels May 8, 2023
gpshead added a commit that referenced this issue May 9, 2023
Teach unsplit to retain the `"//"` when assembling `itms-services://?action=generate-bugs` style
[Apple Platform Deployment](https://support.apple.com/en-gb/guide/deployment/depce7cefc4d/web) URLs.
Member

gpshead commented May 9, 2023

fixed in 3.12.

@gpshead gpshead closed this as completed May 9, 2023
carljm added a commit to carljm/cpython that referenced this issue May 9, 2023
* main:
  pythongh-97696 Add documentation for get_coro() behavior with eager tasks (python#104304)
  pythongh-97933: (PEP 709) inline list/dict/set comprehensions (python#101441)
  pythongh-99889: Fix directory traversal security flaw in uu.decode() (python#104096)
  pythongh-104184: fix building --with-pydebug --enable-pystats (python#104217)
  pythongh-104139: Add itms-services to uses_netloc urllib.parse. (python#104312)
  pythongh-104240: return code unit metadata from codegen (python#104300)