Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Host/host:port and path aren't definitively separated by a slash #8

Closed
eode opened this issue Oct 24, 2018 · 7 comments
Closed

Host/host:port and path aren't definitively separated by a slash #8

eode opened this issue Oct 24, 2018 · 7 comments
Milestone

Comments

@eode
Copy link

eode commented Oct 24, 2018

>>> uri = URI('http://foo.com')
>>> uri.path = 'blah'
>>> str(uri)
http://foo.comblah

Expected: Either reject paths with an initial slash, or reject paths without one.

@eode
Copy link
Author

eode commented Oct 24, 2018

btw -- nice work, overall!

@eode
Copy link
Author

eode commented Oct 24, 2018

Actually, you can't reject paths with one. What an awkward state to be in. Is the extra slash in http://foo.bar//blah possibly meaningful?

I suppose there's no route to rejection. I guess expected behavior would be to implicitly add a slash, and if path is set manually with a slash, it results in a url like http://foo.com//blah.

@eode eode changed the title Host and path are merged if path has no '/' Host/host:port and path aren't definitively separated by a slash Oct 24, 2018
@amcgregor
Copy link
Member

amcgregor commented Oct 25, 2018

Hmm; it is true that you can provide paths that have no root. Technically, your own code above is buggy, though the URI class is not helping. Specifically, assign /blah, not blah, or use the division helper:

>>> uri = URI('http://foo.com') # no path part at all
>>> uri.path
PurePosixPath('.')
>>> uri.path = '/foo'
>>> uri
URI('http://foo.com/foo')
>>> uri = URI('http://foo.com')

# "natural" syntax for "extending" the path / resolving relative references
>>> uri / 'foo'
URI('http://foo.com/foo')

# scheme + authority part, plus a path part = rooted path at that authority over that protocol
>>> URI('http://foo.com') / URI('bar/baz')
URI('http://foo.com/bar/baz')

I'll leave this open for now as a reminder that non-root paths should be rooted for presentation, or, if there is an authority present, to only accept rooted paths. foo/bar isn't a valid URI path, where /foo/bar is. Good catch!

Edited to add: non-rooted paths remain useful, at least for URI without other, earlier components defined. This would allow you to, for example, use the division operator to combine two URI, the second being a URI fragment. (The second having uri.relative == True.). Think of relative links on a page; those are very much fragments, usually of non-root paths, and part of the reason why the division operator was added, reference issue #3.

Edited to add: I'm not seeing double slashes if you assign a rooted path. How are you getting that as a result?

@amcgregor amcgregor added this to the 2.0.1 milestone Oct 29, 2018
@amcgregor
Copy link
Member

>>> from uri import URI
>>> uri = URI('http://foo.com')  # not a fragment; has an authority
>>> uri.path = '/foo'; uri  # rooted path assignment is OK
URI('http://foo.com/foo')
>>> uri.path = 'foo'  # rootless path assignment is NOT OK

ValueError                                Traceback (most recent call last)
<ipython-input-3-d765cb2bdfb3> in <module>()
----> 1 uri.path = 'foo'

~/Projects/marrow/web/uri/uri/part/path.py in __set__(self, obj, value)
     28 
     29                 if obj.authority and not value.startswith('/'):
---> 30                         raise ValueError("Can only assign rooted paths to URI with authority.")
     31 
     32                 super(PathPart, self).__set__(obj, value)

ValueError: Can only assign rooted paths to URI with authority.

>>> uri = URI('http://foo.com')
>>> uri2 = URI()  # let's construct a fragment this time
>>> uri2.path = 'foo'; uri2
URI('foo')
>>> uri.resolve(uri2)  # relative reference resolution against an authority becomes rooted
URI('http://foo.com/foo')

Available in 84e0787 for testing prior to release. (Tests added in 74a16f6.)

@eode
Copy link
Author

eode commented Nov 1, 2018 via email

@amcgregor
Copy link
Member

amcgregor commented Nov 1, 2018

Marking as resolved. (Rejection is preferable to me; from the Zen of Python: errors should never pass silently, esp. as if you aren't expecting a relative path it might have security implications.). Edited to add: URI are not just URL. This is not a URL object, nor a URL-handling library, specifically. Your bug was intentionally trying to write "http://foo.com" + "blah", which URI happily (and incorrectly) allowed you to do. (2+2==42 → if you want this to be true, it's not a bug in Python's addition operator…)

@eode
Copy link
Author

eode commented Nov 2, 2018

I most definitely would not want that to be true -- what I want is for there to be an implicit slash if an unrooted path is attached to a host, or better yet, for the library to reject unrooted paths in that case. Which it does now, so everything's roly-poly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants