New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-19916: Rewrite URI handling within Location and LocationFactory #167
Conversation
I didn't ran tests and just assumed the original would work. It didn't. Problem were paths and uri's ending in '#predicted' which got scrubbed from the Location but shouldn't have. Added support for uri fragments, which will now preserve the '#<fragment>' at the end of paths. urllib.parse.urljoin also kept scrubbing important bits from paths so it was replaced with a string concatenation which should work as long as people don't start sending uri's or otherwise non-POSIX paths to `LocationFactory.fromPath` method.
Changed the `Location.fromUri` path dealing functionality to explicitly use `posixpath` module so that there are no misshaps across platforms since URI paths are guaranteed to be POSIX compliant. Splitting is also done on an explicitly given separator instead of relying on `posixpath.sep` or `os.path.sep`. The remaining local path related functionality is left OS dependant. Trimmed some unnecessary comments.
cc/ @DinoBektesevic This supersedes #160 |
Previously we were a bit loose with our definition of URI. This rewrite changes the API to Location such that it now takes a URI ParseResult as the root of the datastore and a relative path reflecting the position in the datastore. This means that we no longer have to deal with strange relative path URIs that are non-standard for files. This change removes fromUri from the LocationFactory API since it makes no sense to define a root URI for the datastore and then allow another URI to be provided. No butler code called fromUri. It also changes the Location constructor but only LocationFactory was calling it. There are some shenanigans with trying to deal with the difference between a posixpath and an os.path but given that I am testing this on a posix system there might be some inconsistencies still. Tests have been added.
Location reimplemented using this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks OK, though OS vs POSIX handling feels confusing in places (not sure how it can be improved). Few comments below.
return path | ||
|
||
# Have to restore the root directory after splitting | ||
posixpaths = path.split(posixpath.sep) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could path
be empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty as in giving a relative path of ""?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not know, I guess it depends on caller, my question was what happens in this code if path is empty (maybe accidentally).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the new implementation if you pass in the empty string you get back the empty string (which seems correct to me).
Makes more usage of pathlib.PurePath now. Also added more tests.
I've cleaned up the os2posix/posix2os code a bit, making use of PurePath and PurePosixPath. I think it might be a bit more robust now (with the caveat that it never gets called because I'm not on Windows). |
@DinoBektesevic I've now merged this PR so feel free to rebase #159 on top of master. Hopefully this will let you remove a lot of URI mangling from your PR. |
Previously we were a bit loose with our definition of URI. This rewrite changes the API to Location such that it now takes a URI ParseResult as the root of the datastore and a relative path reflecting the position in the datastore. This means that we no longer have to deal with strange relative path URIs that are non-standard for files.
This change removes fromUri from the LocationFactory API since it makes no sense to define a root URI for the datastore and then allow another URI to be provided. No butler code called fromUri. It also changes the Location constructor but only LocationFactory was calling it.
There are some shenanigans with trying to deal with the difference between a posixpath and an os.path but given that I am testing this on a posix system there might be some inconsistencies still.
Tests have been added. All tests pass.