Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[train/tune] Use posix paths throughout library code #38319

Merged
merged 6 commits into from
Aug 15, 2023

Conversation

krfricke
Copy link
Contributor

Why are these changes needed?

We're currently using os.path in many places, but sometimes mix it with default unix paths, such as ~/ray_results. This can lead to conflicts in Windows.

For instance, os.path.expanduser("~/ray_results") will result in C:\\Users\\username/ray_results on windows systems.

While this works in practice (python's stdlib can handle these mixed paths), it is confusing in the output. Thus, we try to replace most occurrences with a posix-compatible path.

Second, and central to this PR, our detection of local/remote paths did not work on Windows. urllib.parse.urlparse detects a local windows path such as C:\\Users\\username as scheme C, so Tune thinks it's a URI.

This effectively lead to a situation where windows users experienced that a remote path was detected, but it was in fact a local path. Ray Tune tried to "sync up" local results to the same local folder. This lead to errors in the syncer code.

We thus update the _is_local_windows_path utility and use it in _split_remote_local_path to detect local windows directories.

Related issue number

Closes #36448

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Kai Fricke added 3 commits August 10, 2023 15:07
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Copy link
Contributor

@justinvyu justinvyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, have a few comments:

The main place where paths are stored now is StorageContext. I've actually had a TODO about how this handles windows and relative paths, so would be a good time to fix it up.

StorageContext takes in the exact user storage_path that's passed, as well as an optional storage_filesystem. Here's what I was imagining the path handling could look like:

class StorageContext:
    def __init__(self, storage_path: Tuple[str, os.PathLike], storage_filesystem: Optional[pyarrow.fs.FileSystem]):
        if storage_filesystem provided:
            # just convert storage_path to posix path (don't convert to an absolute path)
            # this is because the path should stay relative to the custom_fs
        elif local path (scheme == file or no scheme):
            # expand and convert to *absolute* posix path
        else: # do nothing for URIs, which will get handled by pyarrow.
            assert is_uri(storage_path)

Then, once we've converted it once in the constructor, we can keep using os.path.join using these paths elsewhere.

I wanted to avoid checking is_local_path in this class, but it seems like it's necessary to support relative / windows paths, since pyarrow will raise an error from the input to pyarrow.fs.from_uri.

Comment on lines +256 to +269
if (
self._legacy_local_experiment_path
and self._legacy_remote_experiment_path
and Path(self._legacy_local_experiment_path)
== Path(self._legacy_remote_experiment_path)
):
warnings.warn(
"The local experiment path is the same as the remote "
"experiment path. Set a different `storage_path` or raise an "
"issue on GitHub if this issue persists. Deactivating the"
"remote experiment path."
)
self._legacy_remote_experiment_path = None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the issue here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Windows, we resolved the default storage path to C:/Users/username/ray_results. When passing the path on, it was interpreted as a remote path in downstream objects (because we used urlparse to check for local/non-local paths and that doesn't work with windows paths).

So we end up in a situation where the remote path is C:/Users/username/ray_results, and the local path gets resolved again to the same path, C:/Users/username/ray_results. We write to the path locally and the syncer tries to sync up to that path. This results in various different error messages.

@@ -1153,14 +1153,14 @@ def _create_logger(
def _open_logfiles(self, stdout_file, stderr_file):
"""Create loggers. Open stdout and stderr logfiles."""
if stdout_file:
stdout_path = os.path.expanduser(os.path.join(self._logdir, stdout_file))
stdout_path = (Path(self._logdir) / stdout_file).expanduser().as_posix()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we can just add this Path(...).expanduser().as_posix() logic once at the user entrypoint, then just assume posix paths everywhere else in our own code, rather than need to patch it everywhere internally.

Ex:

RunConfig(storage_path)
Tuner.restore(...), Trainer.restore(...)
RAY_AIR_LOCAL_CACHE_DIR
ExperimentAnalysis(path)

In the case of self._logdir, it should already be expanded/as_posix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem comes up when Path is used again on windows, which will convert a posix path into a WindowsPath. If we only want to have as_posix() in one place, we can't use pathlib downstream.

@krfricke
Copy link
Contributor Author

I wanted to avoid checking is_local_path in this class, but it seems like it's necessary to support relative / windows paths, since pyarrow will raise an error from the input to pyarrow.fs.from_uri.

The crux is this line:

elif local path (scheme == file or no scheme):

the scheme check is insufficient on windows systems as the drive letter gets interpreted as a scheme:

>>> urllib.parse.urlparse("c:/foo/bar")
ParseResult(scheme='c', netloc='', path='/foo/bar', params='', query='', fragment='')

I'm also not convinced that _is_local_windows_path is a very elegant way to check for windows paths, but it should cover most cases and may be good enough.

So yes, let's use is_local_path. Feel free to move it to a more appropriate location.

Copy link
Contributor

@justinvyu justinvyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Still need to update StorageContext for the new codepath, but that can be in the follow-up with more general relative/windows path-handling support.

@krfricke krfricke merged commit 5bf5df5 into ray-project:master Aug 15, 2023
67 of 71 checks passed
@krfricke krfricke deleted the tune/posix-paths branch August 15, 2023 07:57
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
We're currently using `os.path` in many places, but sometimes mix it with default unix paths, such as `~/ray_results`. This can lead to conflicts in Windows.

For instance, `os.path.expanduser("~/ray_results")` will result in `C:\\Users\\username/ray_results` on windows systems.

While this works in practice (python's stdlib can handle these mixed paths), it is confusing in the output. Thus, we try to replace most occurrences with a posix-compatible path.

Second, and central to this PR, our detection of local/remote paths did not work on Windows. `urllib.parse.urlparse` detects a local windows path such as `C:\\Users\\username` as scheme `C`, so Tune thinks it's a URI.

This effectively lead to a situation where windows users experienced that a remote path was detected, but it was in fact a local path. Ray Tune tried to "sync up" local results to the same local folder. This lead to errors in the syncer code.

We thus update the `_is_local_windows_path` utility and use it in `_split_remote_local_path` to detect local windows directories.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: NripeshN <nn2012@hw.ac.uk>
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
We're currently using `os.path` in many places, but sometimes mix it with default unix paths, such as `~/ray_results`. This can lead to conflicts in Windows.

For instance, `os.path.expanduser("~/ray_results")` will result in `C:\\Users\\username/ray_results` on windows systems.

While this works in practice (python's stdlib can handle these mixed paths), it is confusing in the output. Thus, we try to replace most occurrences with a posix-compatible path.

Second, and central to this PR, our detection of local/remote paths did not work on Windows. `urllib.parse.urlparse` detects a local windows path such as `C:\\Users\\username` as scheme `C`, so Tune thinks it's a URI.

This effectively lead to a situation where windows users experienced that a remote path was detected, but it was in fact a local path. Ray Tune tried to "sync up" local results to the same local folder. This lead to errors in the syncer code.

We thus update the `_is_local_windows_path` utility and use it in `_split_remote_local_path` to detect local windows directories.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: harborn <gangsheng.wu@intel.com>
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
We're currently using `os.path` in many places, but sometimes mix it with default unix paths, such as `~/ray_results`. This can lead to conflicts in Windows.

For instance, `os.path.expanduser("~/ray_results")` will result in `C:\\Users\\username/ray_results` on windows systems.

While this works in practice (python's stdlib can handle these mixed paths), it is confusing in the output. Thus, we try to replace most occurrences with a posix-compatible path.

Second, and central to this PR, our detection of local/remote paths did not work on Windows. `urllib.parse.urlparse` detects a local windows path such as `C:\\Users\\username` as scheme `C`, so Tune thinks it's a URI.

This effectively lead to a situation where windows users experienced that a remote path was detected, but it was in fact a local path. Ray Tune tried to "sync up" local results to the same local folder. This lead to errors in the syncer code.

We thus update the `_is_local_windows_path` utility and use it in `_split_remote_local_path` to detect local windows directories.

Signed-off-by: Kai Fricke <kai@anyscale.com>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
We're currently using `os.path` in many places, but sometimes mix it with default unix paths, such as `~/ray_results`. This can lead to conflicts in Windows.

For instance, `os.path.expanduser("~/ray_results")` will result in `C:\\Users\\username/ray_results` on windows systems.

While this works in practice (python's stdlib can handle these mixed paths), it is confusing in the output. Thus, we try to replace most occurrences with a posix-compatible path.

Second, and central to this PR, our detection of local/remote paths did not work on Windows. `urllib.parse.urlparse` detects a local windows path such as `C:\\Users\\username` as scheme `C`, so Tune thinks it's a URI.

This effectively lead to a situation where windows users experienced that a remote path was detected, but it was in fact a local path. Ray Tune tried to "sync up" local results to the same local folder. This lead to errors in the syncer code.

We thus update the `_is_local_windows_path` utility and use it in `_split_remote_local_path` to detect local windows directories.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
We're currently using `os.path` in many places, but sometimes mix it with default unix paths, such as `~/ray_results`. This can lead to conflicts in Windows.

For instance, `os.path.expanduser("~/ray_results")` will result in `C:\\Users\\username/ray_results` on windows systems.

While this works in practice (python's stdlib can handle these mixed paths), it is confusing in the output. Thus, we try to replace most occurrences with a posix-compatible path.

Second, and central to this PR, our detection of local/remote paths did not work on Windows. `urllib.parse.urlparse` detects a local windows path such as `C:\\Users\\username` as scheme `C`, so Tune thinks it's a URI.

This effectively lead to a situation where windows users experienced that a remote path was detected, but it was in fact a local path. Ray Tune tried to "sync up" local results to the same local folder. This lead to errors in the syncer code.

We thus update the `_is_local_windows_path` utility and use it in `_split_remote_local_path` to detect local windows directories.

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ray component: RLlib: Last sync command failed: Sync process failed warning
2 participants