Introduce a cmdlet for manipulation of list strings #7975

Closed
mklement0 opened this Issue Oct 8, 2018 · 3 comments

Contributor

mklement0 commented Oct 8, 2018

Note:

  • Should this suggestion gain traction, an RFC is presumably needed.
  • By list string I mean a single string composed of items (tokens) separated by a separator, such as you would find in $env:PATH, which contains a list of directories.
  • This suggestion is in part motivated by wanting a platform-independent mechanism for manipulating environment variables such as $env:PATH safely and conveniently - see #5340 and PowerShell/PowerShell-RFC#92

Manipulating such lists with text parsing is cumbersome and error-prone, and sometimes requires platform-specific separators. A dedicated cmdlet, say Update-ListString (name negotiable), could ease the burden by offering the following operations, with support for adding and removing arrays of items:

  • adding items:
    • prepend (-Prepend)
    • append (-Append)
    • insert before or after a given item (-InsertBefore / -InsertAfter) - same as -Append, if the item doesn't exist
  • replacing items:
    • by name (item string) of an existing item (-Replace) - same as -Append, if the item doesn't exist.
  • removing items:
    • by name (item string) (-Remove)
    • removing duplicate items on request (-RemoveDuplicate), keeping only the first occurrence
    • remove empty items (duplicate separators) on request (-RemoveEmpty)
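
The removal operations listed above can be approximated today with -split, a filter, and -join. This is a minimal sketch, not the proposed cmdlet: the variable names are made up for illustration, the separator is treated as a literal rather than a regex, and matching by name is assumed to remove every occurrence.

```powershell
$sep      = ';'
$toRemove = 'b'                      # hypothetical item(s) to remove by name

# Split on the literal separator; doubled separators yield empty tokens.
$items = 'a;;b;a;c' -split [regex]::Escape($sep)

# -RemoveEmpty: drop the empty tokens.
$noEmpty = $items | Where-Object { $_ -ne '' }

# -RemoveDuplicate: keep only the first occurrence of each item.
$unique = $noEmpty | Select-Object -Unique

# -Remove: drop items by name (assumption: all occurrences, case-insensitively).
$remaining = $unique | Where-Object { $_ -notin $toRemove }

$result = $remaining -join $sep
$result                              # a;c
```

Note that Select-Object -Unique compares case-sensitively while -notin does not; a real cmdlet would have to pick one rule, which matters for paths on Windows versus Unix-like platforms.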

Operations should be idempotent by default (desired-state logic):

  • on adding, quietly do nothing if the item(s) already exist
  • on removing, ignore the case where item(s) don't exist

With -Force, add operations should enforce the specified position - -Prepend, -Append, -InsertAt, -InsertBefore/After - by moving a preexisting item accordingly.
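
The idempotent default and the -Force override described above could look roughly like the following. Add-ListStringItem and its parameters are hypothetical names for this sketch only; the separator is treated as a literal and comparison is case-sensitive, neither of which the proposal specifies.

```powershell
# Hypothetical function: idempotent append/prepend with an optional -Force.
function Add-ListStringItem {
    param(
        [Parameter(Mandatory)] [AllowEmptyString()] [string] $InputString,
        [Parameter(Mandatory)] [string[]] $Item,
        [string] $Separator = ';',   # literal separator, not a regex
        [switch] $Prepend,           # default position is the end (append)
        [switch] $Force
    )
    $items = [System.Collections.Generic.List[string]]::new()
    if ($InputString -ne '') {
        $items.AddRange([string[]] ($InputString -split [regex]::Escape($Separator)))
    }

    $toAdd = [System.Collections.Generic.List[string]]::new()
    foreach ($i in $Item) {
        $exists = $items.Contains($i)            # ordinal, case-sensitive
        # -Force: take the existing item out so it can be re-added at the
        # requested position (Remove() only removes the first occurrence).
        if ($exists -and $Force) { [void] $items.Remove($i) }
        # Without -Force an existing item is left where it is.
        if ((-not $exists) -or $Force) { $toAdd.Add($i) }
    }

    if ($Prepend) { $items.InsertRange(0, $toAdd) } else { $items.AddRange($toAdd) }
    $items -join $Separator
}

Add-ListStringItem -InputString 'a;b;c' -Item 'b'            # a;b;c
Add-ListStringItem -InputString 'a;b;c' -Item 'b' -Force     # a;c;b
Add-ListStringItem -InputString 'a;b;c' -Item 'd' -Prepend   # d;a;b;c
```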

Separators can be specified either:

  • freely, as a regex (as you would pass to -split), with -Separator
  • or via switch -AsDirList (name negotiable), in which case [IO.Path]::PathSeparator is used, representing the platform-appropriate separator used to separate entries in $env:PATH (and similar variables such as $env:CLASSPATH).
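
As a rough illustration of the two separator modes, using only existing PowerShell and .NET features (the variable names are arbitrary):

```powershell
# Regex separator, as -split accepts it: one or more ';' or ',' characters.
$parts = 'a;b,,c' -split '[;,]+'
$parts -join '|'                     # a|b|c

# Platform-appropriate separator for PATH-like variables:
# ';' on Windows, ':' on Unix-like platforms.
$sep  = [IO.Path]::PathSeparator
$dirs = $env:PATH -split [regex]::Escape($sep)
$dirs.Count
```

One thing a regex separator leaves open is which literal string to use when joining the items back together, so a real implementation would presumably need a rule for that.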

Open questions:

  • Is it necessary to also support by-index operations (-InsertAt, -RemoveAt, -ReplaceAt)? Conceivably, this could be added later.
  • How should escaping of separators embedded in items be handled? The simplest approach would be to disallow embedded separators in items (assume that existing items don't have embedded separators, and report an error on trying to set items containing them). Conceivably, support for escaping could be added later via an -EscapeChar parameter.
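
The "disallow embedded separators" option could be enforced with a check along these lines. Assert-NoEmbeddedSeparator is a hypothetical helper name, and the separator is assumed to be a literal string:

```powershell
# Hypothetical helper: reject items that contain the (literal) separator.
function Assert-NoEmbeddedSeparator {
    param(
        [Parameter(Mandatory)] [string[]] $Item,
        [string] $Separator = [IO.Path]::PathSeparator
    )
    foreach ($i in $Item) {
        if ($i.Contains($Separator)) {
            throw "Item '$i' contains the separator '$Separator' and cannot be stored unambiguously."
        }
    }
}

# Passes silently: no separator inside the item.
Assert-NoEmbeddedSeparator -Item 'C:\tools' -Separator ';'

# Reports an error: the item would be split in two when read back.
try   { Assert-NoEmbeddedSeparator -Item 'C:\odd;name' -Separator ';' }
catch { "Rejected: $_" }
```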

Examples:

# WISHFUL THINKING

PS> Update-ListString -Separator ';' 'a;b;c' -Append 'd', 'e'
a;b;c;d;e

PS> Update-ListString -Separator ';' 'a;b;c' -Prepend 'd'
d;a;b;c

PS> Update-ListString -Separator ';' 'a;b;c;d' -Remove 'd', 'b'
a;c

PS> Update-ListString -Separator ';' 'a;;c' -RemoveEmpty
a;c

PS> Update-ListString -Separator ';' 'a;b;a;c' -RemoveDuplicate
a;b;c

PS> Update-ListString -Separator ';' 'a;b;c' -Replace 'b', 'd'
a;d;c

# No-op, if entry already present
PS> Update-ListString -Separator ';' 'a;b;c' -Append 'b'
a;b;c

# Enforce specified position
PS> Update-ListString -Separator ';' 'a;b;c' -Append 'b' -Force
a;c;b

Environment data

Written as of:

PowerShell Core 6.1.0

HumanEquivalentUnit commented Oct 8, 2018

You've given it a general name, but the behaviour seems very specific to "$PATH variable". Are there any other use cases anyone can come up with (not counting CLASSPATH, but outside the world of environment variables for lists of paths)?

You provisionally call it Update-ListString, but lists in PS (Arrays, ArrayLists and Generic.Lists) don't have operations exactly comparable to this; no InsertBefore or InsertAfter, no RemoveDuplicates. And where we do have types with comparable operations, they don't behave like these described operations do. So that name feels a bit misleading.

Replace operations elsewhere do not append when the item is not found: the string replacement "words here".Replace('test', 'potato') does not produce "words here potato" (I wouldn't guess that 'replace' also doubled as 'Append').

ArrayList and generic List have a Remove() method which takes out the first occurrence and leaves the rest. If you filter a list for unique items with Select-Object -Unique, then you keep the first entry and remove the subsequent ones. Your cmdlet matches this behaviour - but is that right? If I want to get rid of Java and I -Remove 'c:\program files\java\bin', then -Remove will get rid of one entry but leave possible duplicates, and -RemoveDuplicate will get rid of duplicates and leave one. In order to get rid of Java completely, if I don't know whether there are duplicates or not, must I run the cmdlet twice? Once with -RemoveDuplicate and then again with -Remove to delete the remaining entry?
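
For reference, the existing behaviours being compared here can be checked directly. This sketch only exercises built-in .NET and PowerShell features (Select-Object -Unique is the built-in way to filter for unique items) and says nothing about what the proposed cmdlet would do:

```powershell
$list = [System.Collections.Generic.List[string]]::new()
$list.AddRange([string[]] ('java', 'git', 'java'))

# Remove() takes out only the first match; the second 'java' survives.
[void] $list.Remove('java')
$afterRemove = $list -join ';'
$afterRemove                         # git;java

# Select-Object -Unique keeps the first occurrence of each value.
$afterUnique = ('java', 'git', 'java' | Select-Object -Unique) -join ';'
$afterUnique                         # java;git

# Removing every occurrence in one pass needs a filter instead.
$afterFilter = ('java', 'git', 'java' | Where-Object { $_ -ne 'java' }) -join ';'
$afterFilter                         # git
```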

Then what happens if I have two entries for Java and I -InsertBefore, does it get inserted twice, once for each duplicate? If it's specific to PATH variables it should probably go before the first one only, but if it's a general "ListString" concept, should it go before every instance? Similarly, what if I -Replace a duplicate entry, does that replace both duplicates? I guess it would have to in order to make sense, but then I intuit that since -Remove has -RemoveDuplicate, -Replace should have a matching -ReplaceDuplicate.

ArrayLists and Generic Lists have Insert(index, value) so they don't need an InsertAt method, but your description here is strange:

"With -Force, add operations should enforce the specified position - -Prepend, -Append, -InsertAt, -InsertBefore/After - by moving a preexisting item accordingly."

If I try to prepend, how could it ever fail such that I need to -Force it? If I try to insert at position 10 but there are only 3 items in the list, what does forcing it do? Make 7 blank entries in between? If I try to insert at position 2 and I don't force it, it has to move the following items +1 index in order to work normally, so in what situation might I need to -Force it, and what would be different?
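
As a point of comparison, this is how the generic .NET list handles the two index cases raised above. It is only a sketch of existing List[string] behaviour; what a by-index parameter on the proposed cmdlet would do is not specified anywhere in this thread.

```powershell
$list = [System.Collections.Generic.List[string]]::new()
$list.AddRange([string[]] ('a', 'b', 'c'))

# Inserting at a valid index shifts the later items up by one.
$list.Insert(2, 'x')
$afterInsert = $list -join ';'
$afterInsert                         # a;b;x;c

# Inserting past the end is an error; the list is not padded with blanks.
$failed = $false
try   { $list.Insert(10, 'y') }
catch { $failed = $true }
$failed                              # True
```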

"(-InsertBefore / -InsertAfter) - same as -Append, if the item doesn't exist"

If I'm giving instructions based on my assumption that something will exist, but it doesn't exist, should that really be "make the change somewhere else instead", or should it really be an error?

I'm not against the idea overall, but it feels tailored to behaviours that make sense for environment variables full of paths, and not for "Lists" in string form. Escaping rules are likely to also be specific to environment variables. Unless there are other use cases, I think I'd prefer that this specific nature be reflected in whatever name it ends up with.

Contributor

rkeithhill commented Oct 9, 2018

I think the 90%+ use case is to manage environment variables. Plus there is already a separate need to surface the [System.Environment]::Get/SetEnvironmentVariable methods as cmdlets. I think it makes a lot of sense to create PathVariable management commands as part of that effort as described in PowerShell/PowerShell-RFC#92
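
For context, the .NET methods mentioned here can already be called directly from PowerShell. A minimal sketch follows; DEMO_LIST is a made-up variable name used only for illustration.

```powershell
# Read the process-level PATH (the same value $env:PATH exposes).
$path = [System.Environment]::GetEnvironmentVariable('PATH')

# Set a process-scoped variable; it is visible through the env: drive too.
[System.Environment]::SetEnvironmentVariable('DEMO_LIST', 'a;b;c')
$env:DEMO_LIST                       # a;b;c

# Persisted scopes are requested explicitly via EnvironmentVariableTarget.
$userPath = [System.Environment]::GetEnvironmentVariable(
    'PATH', [System.EnvironmentVariableTarget]::User)
```

The User and Machine targets are backed by the registry on Windows; as far as I know, .NET does not persist them on Unix-like platforms, where the last call simply returns nothing.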

Contributor

mklement0 commented Oct 9, 2018

Yes, managing $env:PATH-like variables was definitely the motivation for this proposal.

I can currently think of only $env:PATH, $env:CLASSPATH and $env:PATHEXT (Windows only), and for the latter, using a cmdlet is probably overkill.

I was hoping it would generalize to other use cases where a list of items must be kept inside a string, but perhaps that's too rare a use case to worry about.

I can now see that one or more *-PathVariable cmdlets - which optionally allow a variable of a different name to be targeted - are probably the better solution, so I'll close this.


It is now a moot point, but as for your concerns, @HumanEquivalentUnit:

I specifically chose the noun "ListString" to avoid confusion with .NET list types - comparing their methods with what I'm proposing should not enter the picture.

The guiding principle for the suggested operations was desired-state logic, which, for instance, means that something targeted is allowed not to exist, with reasonable fallback behavior.

To address one question of yours as an example:

"If I try to prepend, how could it ever fail such that I need to -Force it?"

It couldn't fail, but the suggested default behavior was to leave a preexisting item alone, irrespective of its current position; -Force would then be needed to force moving it to the beginning.

@mklement0 mklement0 closed this Oct 9, 2018
