fix(ingest/s3): Converting windows style path to posix one on local fs #8757
# We need to make sure the path is in posix style which is not true on windows
full_path = PurePath(
    os.path.normpath(os.path.join(root, file))
).as_posix()
Do we need both os.path.normpath and PurePath.as_posix?
From the docs, both seem to do similar things.
https://docs.python.org/3/library/os.path.html#os.path.normpath
https://docs.python.org/3/library/pathlib.html
If I understand correctly, they don't do the same thing, even though there are similarities:
1. os.path.normpath normalizes the given path: all redundant separators and up-level references are collapsed.
For example, A//B, A/B/, A/./B and A/foo/../B are all normalized to A/B.
On Windows, any forward slash ('/') in the path is converted to a backslash ('\').
This is needed to make the directory separator consistent, and it also simplifies the path (which I think is a good thing).
2. PurePath's as_posix returns the path as a string with forward slashes (UNIX-style directory separators), which is what the s3 source and the urn expect.
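A minimal sketch of the difference between the two steps, runnable on any OS. It uses ntpath (the Windows implementation of os.path) and PureWindowsPath to emulate Windows behavior; the example path and variable names are made up for illustration and do not come from the PR.

```python
import ntpath  # Windows flavour of os.path, importable on every platform
from pathlib import PureWindowsPath

# Hypothetical messy Windows path with mixed separators, "." and "..".
raw = "C:\\data\\.\\bucket//sub\\..\\file.csv"

# Step 1: normpath collapses redundant separators and up-level references,
# but on Windows it standardizes on backslashes.
normalized = ntpath.normpath(raw)

# Step 2: as_posix only swaps the separators to forward slashes.
posix_style = PureWindowsPath(normalized).as_posix()

# as_posix alone does not collapse ".." segments.
uncollapsed = PureWindowsPath("C:\\data\\sub\\..\\file.csv").as_posix()

print(normalized)   # C:\data\bucket\file.csv
print(posix_style)  # C:/data/bucket/file.csv
print(uncollapsed)  # C:/data/sub/../file.csv
```

So normpath handles the simplification and as_posix handles the separator style; neither covers both on Windows.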
Right, they aren't the same. It sounds like normpath does something more than as_posix, yet it already does everything that as_posix might be doing (Linux-like / separators)?
I'm okay with keeping both, just trying to understand this better.
Unfortunately, normpath only makes the directory separators consistent, and on Windows it uses backslashes, which is the opposite of the UNIX separators we actually need.
PurePath creates the path based on the OS you run on (on Windows it is treated as a PureWindowsPath, while on UNIX it is a PurePosixPath), and calling as_posix on it gives the UNIX path separators that we need.
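To illustrate the point about PurePath picking its flavour from the running OS, here is a small sketch that wraps the expression from the diff in a helper. The name to_posix_path and the sample arguments are hypothetical, not part of the PR.

```python
import os
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def to_posix_path(root: str, file: str) -> str:
    """Join, normalize, then force forward slashes (same expression as the diff)."""
    return PurePath(os.path.normpath(os.path.join(root, file))).as_posix()


# PurePath instantiates the flavour that matches the current OS.
flavour = type(PurePath("x"))
expected_flavour = PureWindowsPath if os.name == "nt" else PurePosixPath

# Hypothetical relative inputs; the result has no backslashes on any OS.
result = to_posix_path("data/dir", "../file.csv")
print(flavour.__name__, result)
```

On Windows the normpath step yields backslashes and as_posix converts them; on UNIX the as_posix call is effectively a no-op, so the combined expression should be safe on both.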
I didn't understand the value added by PurePath.as_posix over os.path.normpath, but the change seems better than the original code.
lgtm