Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-19916: Rewrite URI handling within Location and LocationFactory #167

Merged
merged 7 commits into from Jun 7, 2019

Conversation

timj
Copy link
Member

@timj timj commented May 30, 2019

Previously we were a bit loose with our definition of URI. This rewrite changes the API to Location such that it now takes a URI ParseResult as the root of the datastore and a relative path reflecting the position in the datastore. This means that we no longer have to deal with strange relative path URIs that are non-standard for files.

This change removes fromUri from the LocationFactory API since it makes no sense to define a root URI for the datastore and then allow another URI to be provided. No butler code called fromUri. It also changes the Location constructor but only LocationFactory was calling it.

There are some shenanigans with trying to deal with the difference between a posixpath and an os.path but given that I am testing this on a posix system there might be some inconsistencies still.

Tests have been added. All tests pass.

I didn't ran tests and just assumed the original would work. It
didn't. Problem were paths and uri's ending in '#predicted' which
got scrubbed from the Location but shouldn't have. Added support
for uri fragments, which will now preserve the '#<fragment>' at
the end of paths.

urllib.parse.urljoin also kept scrubbing important bits from paths
so it was replaced with a string concatenation which should work as
long as people don't start sending uri's or otherwise non-POSIX
paths to `LocationFactory.fromPath` method.
Changed the `Location.fromUri` path dealing functionality
to explicitly use `posixpath` module so that there are no misshaps
across platforms since URI paths are guaranteed to be POSIX
compliant. Splitting is also done on an explicitly given separator
instead of relying on `posixpath.sep` or `os.path.sep`. The
remaining local path related functionality is left OS dependant.

Trimmed some unnecessary comments.
@timj
Copy link
Member Author

timj commented May 30, 2019

cc/ @DinoBektesevic

This supersedes #160

Previously we were a bit loose with our definition of
URI.  This rewrite changes the API to Location such that
it now takes a URI ParseResult as the root of the datastore
and a relative path reflecting the position in the datastore.
This means that we no longer have to deal with strange
relative path URIs that are non-standard for files.

This change removes fromUri from the LocationFactory
API since it makes no sense to define a root URI for the
datastore and then allow another URI to be provided.
No butler code called fromUri.
It also changes the Location constructor but
only LocationFactory was calling it.

There are some shenanigans with trying to deal
with the difference between a posixpath and an os.path
but given that I am testing this on a posix system
there might be some inconsistencies still.

Tests have been added.
Location reimplemented using this class.
Copy link
Contributor

@andy-slac andy-slac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK, though OS vs POSIX handling feels confusing in places (not sure how it can be improved). Few comments below.

python/lsst/daf/butler/core/location.py Outdated Show resolved Hide resolved
python/lsst/daf/butler/core/location.py Show resolved Hide resolved
python/lsst/daf/butler/core/location.py Outdated Show resolved Hide resolved
python/lsst/daf/butler/core/location.py Outdated Show resolved Hide resolved
return path

# Have to restore the root directory after splitting
posixpaths = path.split(posixpath.sep)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could path be empty?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty as in giving a relative path of ""?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know, I guess it depends on caller, my question was what happens in this code if path is empty (maybe accidentally).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new implementation if you pass in the empty string you get back the empty string (which seems correct to me).

python/lsst/daf/butler/core/location.py Outdated Show resolved Hide resolved
python/lsst/daf/butler/core/location.py Outdated Show resolved Hide resolved
python/lsst/daf/butler/core/location.py Outdated Show resolved Hide resolved
python/lsst/daf/butler/core/location.py Outdated Show resolved Hide resolved
tests/test_location.py Show resolved Hide resolved
timj added 2 commits June 7, 2019 10:51
Makes more usage of pathlib.PurePath now. Also added more tests.
@timj
Copy link
Member Author

timj commented Jun 7, 2019

I've cleaned up the os2posix/posix2os code a bit, making use of PurePath and PurePosixPath. I think it might be a bit more robust now (with the caveat that it never gets called because I'm not on Windows).

@timj timj merged commit d531e82 into master Jun 7, 2019
@timj
Copy link
Member Author

timj commented Jun 7, 2019

@DinoBektesevic I've now merged this PR so feel free to rebase #159 on top of master. Hopefully this will let you remove a lot of URI mangling from your PR.

@timj timj deleted the tickets/DM-19916 branch June 7, 2019 23:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants