Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(directory): Introduce logical-path argument which allows a shell to explicitly specify both a logical and physical filesystem path #2104

Conversation

deadalusai
Copy link
Contributor

@deadalusai deadalusai commented Jan 9, 2021

Description

Introduce logical-path argument which allows a shell to explicitly specify both a logical and physical filesystem path.

Fix directory::module to consume both path and logical-path (if provided). The "logical" path is preferred when rendering the "display path", while the "physical" path is used to resolve the "read only" flag. Repo- and home-directory contraction is based on the logical path if it is set, or the physical path if it is not.

The custom "get_current_dir" logic has been removed entirely, and the directory module now relies on context.current_dir / context.logical_dir directly.

Changes have been made to init/starship.ps1 to work with this new flag:

  • Calculate and pass "physical" and "logical" paths explicitly (as other shells do not pass --logical-path they always rendering the based on the physical path)
  • Moved the "powershell provider prefix" cleanup code to the PowerShell prompt script itself - this code should now support any kind of powershell path prefix.

Motivation and Context

I discovered this issue while testing starship with the Nu shell. It's a WIP shell, so still has some bugs. I switched to passing --path to work around one such issue and found that starship was inconsistent with how it handled the argument.

This screenshot demonstrates the problem:
starship_current_dir_resolution
Specifically, invoking the command starship prompt --path C:\Windows:

  • When called with no PWD environment variable, prints C:/Windows {lock symbol} ✔️
  • When called with a PWD environment variable, prints {the PWD env variable} {lock symbol}

The issue is caused by the directory module function get_current_dir which looks to the PWD environment variable to select the path it is going to display to the user, overriding the provided --path argument.

To resolve this, I've introduced a new prompt argument --logical-path - rather than setting the PWD environment variable, a shell with "logical" path display requirements can instead provide --logical-path to set the logical path.

The same command as above, but with the fix applied:
image

See discussion on #2095 for more context.
This (abandoned?) PR #1209 also looked to introduce --logical-path as an alternative to PWD.

How Has This Been Tested?

  • I have tested using MacOS
  • I have tested using Linux
  • I have tested using Windows

I tested manually using the following scripts:

Windows/PowerShell:

function test ($Path, $LogicalPath) {
    "Testing: starship prompt --path '$Path' --logical-path '$LogicalPath'"
    target\debug\starship.exe prompt --path $Path --logical-path $LogicalPath
    "`n"
}

New-PSDrive -Name Sys32 -PSProvider FileSystem -Root 'C:\Windows\System32' -ErrorAction SilentlyContinue | Out-Null
New-PSDrive -Name Dev -PSProvider FileSystem -Root 'C:\Users\Ben Fox\Dev' -ErrorAction SilentlyContinue | Out-Null

test 'C:\Users\Ben Fox\Dev\starship' 'C:\Users\Ben Fox\Dev\starship'
test 'C:\Users\Ben Fox\Dev\starship' 'Dev:\starship'
test 'C:\Users\Ben Fox\Dev\starship' 'whatever\basically'
test 'C:\Windows\System32\' 'C:\Windows\System32\'
test 'C:\Windows\System32\' 'Sys32:\'
test 'HKEY_LOCAL_MACHINE\SOFTWARE' 'HKLM:\SOFTWARE'

(results)
image

Linux/bash:

function test() {
    echo Testing: starship prompt --path "$1" --logical-path "$2"
    target/debug/starship prompt --path "$1" --logical-path "$2"
    echo
}

test '~/Dev/starship' '~/Dev/starship'
test '~/Dev/starship' 'logical/path'
test '/' '/'
test '/' 'logical/path'

(results)
image

Checklist:

  • I have updated the documentation accordingly.
  • I have updated the tests accordingly.

… shell to explicitly specify both a logical and physical filesystem path

Fix `directory::module` to consume both path and logical-path (if provided).  The "logical" path is preferred when rendering the "display path", while the "physical" path is used to resolve the "read only" flag. Repo- and home-directory contraction behavior is maintained, based on the logical path if it is set, or the physical path if it is not.

The custom "get_current_dir" logic has been removed entirely, and the `directory` module now relies on `context.current_dir` / `context.logical_dir` entirely.

Changes have been made to `init/starship.ps1` to work with this new flag:
- Calculate and pass "physical" and "logical" paths explicitly (as other shells do not pass `--logical-path` that they fall back to rendering the physical path)
- Moved the "powershell provider prefix" cleanup code to the PowerShell script - this code _should_ now support any kind of powershell path prefix.
@deadalusai deadalusai force-pushed the master--refactor-directory-logical-path-handling branch from b5ea856 to 7ccb0c5 Compare January 9, 2021 04:04
@deadalusai
Copy link
Contributor Author

deadalusai commented Jan 9, 2021

Something I noted during testing is that navigating to a repository directory with a non-physical logical directory "breaks" the repo truncation behavior:

image

I don't think this is a regression, but something we could improve.

Edit:
A small change produces this result

diff --git a/src/modules/directory.rs b/src/modules/directory.rs
index 85da1c3..80a9a82 100644
--- a/src/modules/directory.rs
+++ b/src/modules/directory.rs
@@ -56,7 +56,7 @@ pub fn module<'a>(context: &'a Context) -> Option<Module<'a>> {
     let dir_string = repo
         .and_then(|r| r.root.as_ref())
         .filter(|root| *root != &home_dir)
-        .and_then(|root| contract_repo_path(&logical_dir, root))
+        .and_then(|root| contract_repo_path(&physical_dir, root))
         .unwrap_or_else(|| contract_path(&logical_dir, &home_dir, HOME_SYMBOL));

     log::debug!("Dir string: {}", dir_string);

image

… causing command line parsing issues.

This is a bit of a footgun!
The work-around chosen is to append a trailing space when a path string ends with a backslash, and then trim any extra whitespace away in the Context constructor.
Other alternatives considered and rejected:
1. Always trim trailing backslashes as the filesystem generally doesn't need them.
2. Escape trailing backslashes with another backslash. This proved complex as PS only quotes string args when the string includes some whitespace, and other backslashes within the string apparently don't need to be escaped.
@andytom andytom requested a review from a team January 9, 2021 09:30
src/modules/directory.rs Outdated Show resolved Hide resolved
src/configs/directory.rs Outdated Show resolved Hide resolved
src/init/starship.ps1 Outdated Show resolved Hide resolved
@davidkna
Copy link
Member

davidkna commented Jan 9, 2021

Something I noted during testing is that navigating to a repository directory with a non-physical logical directory "breaks" the repo truncation behavior:

image

I don't think this is a regression, but something we could improve.

Edit:
A small change produces this result

diff --git a/src/modules/directory.rs b/src/modules/directory.rs
index 85da1c3..80a9a82 100644
--- a/src/modules/directory.rs
+++ b/src/modules/directory.rs
@@ -56,7 +56,7 @@ pub fn module<'a>(context: &'a Context) -> Option<Module<'a>> {
     let dir_string = repo
         .and_then(|r| r.root.as_ref())
         .filter(|root| *root != &home_dir)
-        .and_then(|root| contract_repo_path(&logical_dir, root))
+        .and_then(|root| contract_repo_path(&physical_dir, root))
         .unwrap_or_else(|| contract_path(&logical_dir, &home_dir, HOME_SYMBOL));

     log::debug!("Dir string: {}", dir_string);

image

But this could cause unexpected behavious outside powershell (symlinks etc). Maybe a third --display-path option is needed after all.

@deadalusai
Copy link
Contributor Author

deadalusai commented Jan 10, 2021

But this could cause unexpected behavious outside powershell (symlinks etc).

Other shells do not pass --logical-path at all, which means that contract_repo_path will always be called against the physical path anyway.

… executables with strings which may contain characters which need to be escaped carefully.
These were in place to clean up extra whitespace sometimes injected by starship.ps1::prompt, and are no longer required with the new Invoke-Native helper in place.
@davidkna
Copy link
Member

In traditional shells:

The physical path is the path on disk with all symlinks resolved. (std::env::current_dir, std::fs::canonicalize($PWD))
The logical path is the path without any symlinks resolved. If you cd into a symlinked directory, the parent hierachy stays the same. ($PWD)

You can't just remove the logical handling for the traditional shells.

I think something like this might work for logical_dir:

  • --logical-path (if use_logical_path is false std::fs::canonicalize)
  • --path (if use_logical_path is false std::fs::canonicalize)
  • if use_logical_path $PWD else std::env::current_dir()
  • fall back to current_dir, (if use_logical_path is false std::fs::canonicalize)

@deadalusai
Copy link
Contributor Author

deadalusai commented Jan 10, 2021

In traditional shells:

The physical path is the path on disk with all symlinks resolved. (std::env::current_dir, std::fs::canonicalize($PWD))
The logical path is the path without any symlinks resolved. If you cd into a symlinked directory, the parent hierachy stays the same. ($PWD)

You can't just remove the logical handling for the traditional shells.

Ok, I think I understand what you're getting at here now.

It seems like we should aim to standardize what data current_dir and logical_dir hold, and populate them explicitly as part of the context population step. Would the following work?

  • current_dir - holds the phyisical current directory.
    Initialised from --path, $PWD or std::env::current_dir and canonicalized (if it is possible to canonicalize).

  • logical_dir - holds the logical current directory.
    Initialised from --logical-path, $PWD, or std::env::current_dir and not canonicalized.

use_logical_path would just select between those two values.

@davidkna
Copy link
Member

current_dir - holds the phyisical current directory.
Initialised from --path, $PWD or std::env::current_dir and canonicalized (if it is possible to canonicalize).

That would work, but current_dir doesn't really need to be canonicalized unless use_logical_dirs is false, because it only matters in the directory module. (std::env::current_dir is always canonicalized as far as I know.) Not sure how expensive canonicalize is.

logical_dir - holds the logical current directory.
Initialised from --logical-path, $PWD, or std::env::current_dir and not canonicalized.

I think part of the aim for this pr for logical_dir to prefer --path unless --logical-path is also given? Do you intend to keep that behaviour?

@deadalusai
Copy link
Contributor Author

deadalusai commented Jan 11, 2021

That would work, but current_dir doesn't really need to be canonicalized unless use_logical_dirs is false, because it only matters in the directory module. (std::env::current_dir is always canonicalized as far as I know.) Not sure how expensive canonicalize is.

Looking at the existing code, I can only see one call to canonicalize as part of real_path which is used by contract_repo_path. I haven't touched any of that code - I'm not sure we need extra canonicalize calls here?

I think part of the aim for this pr for logical_dir to prefer --path unless --logical-path is also given? Do you intend to keep that behaviour?

My initial goal at least was just to make it possible to override the logical path from the command line and the addition of --logical-path solves that issue. But I think you're right in that it might fit the principal of least surprise for --path to imply --logical-path by default as that's definitely what confused me initially.

@deadalusai deadalusai force-pushed the master--refactor-directory-logical-path-handling branch from 4986a1b to 7e9a8b4 Compare January 11, 2021 22:50
…ng it to `current_dir` but overridable by the `--logical-dir` flag.

- Restore `use_logical_path` config flag.
- Always attempt to contract repo paths from the `current_dir`.
@deadalusai deadalusai force-pushed the master--refactor-directory-logical-path-handling branch from 7e9a8b4 to 892ce25 Compare January 11, 2021 22:51
@davidkna
Copy link
Member

Looking at the existing code, I can only see one call to canonicalize as part of real_path which is used by contract_repo_path. I haven't touched any of that code - I'm not sure we need extra canonicalize calls here?

With the old code std::env::current_dir would have taken care of that, unless a fallback was used, which should almost never happen. All the other path sources need calls to canonicalize unless use_logical_path is false.

My initial goal at least was just to make it possible to override the logical path from the command line and the addition of --logical-path solves that issue. But I think you're right in that it might fit the principal of least surprise for --path to imply --logical-path by default as that's definitely what confused me initially.

I think having --path imply --logical-path would be best.

@deadalusai
Copy link
Contributor Author

@davidkna On further investigation it may not be safe to always canonicalize paths on Windows:

On Windows, this converts the path to use extended length path syntax

std::env::current_dir does not return extended length path syntax under Windows.

This keeps the two calls to contract_path in sync.
@deadalusai deadalusai force-pushed the master--refactor-directory-logical-path-handling branch from 54874f5 to 6e0a083 Compare January 11, 2021 23:23
@davidkna
Copy link
Member

davidkna commented Jan 12, 2021

@davidkna On further investigation it may not be safe to always canonicalize paths on Windows:

On Windows, this converts the path to use extended length path syntax

std::env::current_dir does not return extended length path syntax under Windows.

This might have the upside of getting starship to support long paths on windows! (I think it might be best to run that on current_dir after all) I think removing the \\?\ prefix on windows for display should work.

… when use_logical_path = false

- This requires some clean-up to remove the extended-path prefix on Windows
- The configured logical_dir is ignored entirely in this mode - we calculate a new logical_dir by cleaning up the physical_dir path for display.
- Test coverage
@deadalusai
Copy link
Contributor Author

This might have the upside of getting starship to support long paths on windows! (I think it might be best to run that on current_dir after all) I think removing the \\?\ prefix on windows for display should work.

Ok, I've implemented this now and I think it works pretty well, given how I understand what you're aiming for. Tested on Linux and Windows and added test coverage.

The use_logical_path flag now switches between these two routines for resolving the physical/logical paths:

  • use_logical_path = true
    • Physical path: current_dir
    • Logical path: logical_dir or current_dir if it's not set
  • use_logical_path = false
    • Physical path current_dir canonicalized (which resolves symlinks)
    • Logical path: current_dir canonicalized (and cleaned up when on Windows)

Example of the path canonicalization with use_logical_path = false on Windows:
image

And the same on Linux:
image

src/modules/directory.rs Outdated Show resolved Hide resolved
src/modules/directory.rs Outdated Show resolved Hide resolved
src/modules/directory.rs Outdated Show resolved Hide resolved
src/modules/directory.rs Outdated Show resolved Hide resolved
@deadalusai
Copy link
Contributor Author

@davidkna I think I've found a robust solution for dealing with with comparisons between verbatim and non-verbatim paths (verbatim is the Rust std lib's name for the Windows extended-path prefixes).

I've added two routines for doing equals and starts_with comparisons with Paths which knows how to compare verbatim and non-verbatim path prefixes. The Rust Path library is pretty robust here and does most of the heavy lifting.

If there are any other modules doing "starts with" checks on context.current_dir we would need to fix them.

@deadalusai
Copy link
Contributor Author

I've raised a PR to fix path-slash: rhysd/path-slash#5

…h` with variations of verbatim and non-verbatim paths
@deadalusai deadalusai force-pushed the master--refactor-directory-logical-path-handling branch from 81d5a40 to 0274ad7 Compare January 15, 2021 01:35
This fixes the issue with the trailing character of some Windows paths being truncated, e.g. `\\server\share` and `C:`
@deadalusai
Copy link
Contributor Author

Updated path-slash to latest. This fixes the issue with the trailing character of some Windows paths being truncated, e.g. \\server\share and C:.

src/init/starship.ps1 Outdated Show resolved Hide resolved
src/init/starship.ps1 Outdated Show resolved Hide resolved
src/init/starship.ps1 Outdated Show resolved Hide resolved
src/modules/utils/path.rs Outdated Show resolved Hide resolved
src/init/starship.ps1 Outdated Show resolved Hide resolved
- Use `ProcessStartInfo` to launch native executable, replacing manual UTF8 output encoding handling
- If we detect we're on PWSH6+ use the new `System.Diagnostics.ProcessStartInfo.ArgumentList` parameter, otherwise manually escape the argument string
- Move `Get-Cwd` and `Invoke-Native` into the prompt function scope so that they don't leak into the user's shell scope
@deadalusai deadalusai force-pushed the master--refactor-directory-logical-path-handling branch from 86bed93 to 00309ae Compare January 17, 2021 00:08
Copy link
Member

@davidkna davidkna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. But because this PR changes some of the fundamentals I'd prefer a second review.

@davidkna davidkna requested a review from a team January 18, 2021 11:42
…-path-handling

Conflicts:
- src/modules/directory.rs - Took local, ported remote changes (use context.get_home).
…l-path-handling

Conflicts:
- src/modules/directory.rs - ported new home_symbol handling
@deadalusai
Copy link
Contributor Author

@davidkna Is there still interest in merging this?

@davidkna
Copy link
Member

davidkna commented Feb 7, 2021

I agree that this should be merged soon. Unless someone finds any issues or objects to merging I think I'll merge this tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants