-split should not return string[] if the input string is not split #19404

Closed
237dmitry opened this issue Mar 25, 2023 · 13 comments
Labels
Issue-Enhancement: the issue is more of a feature request than a bug
Resolution-By Design: the reported behavior is by design

Comments

@237dmitry

Summary of the new feature / enhancement

The split operator returns an array of strings in all cases, even if no separator is found and the input string is not split into substrings.

PS > $a = 'abc' -split '\d'
PS > $a.GetType().Name
String[]

PS > $a + 'def'
abc
def

It would be better if the output is of the same type as the input, if it has not changed as a result of the operation.

Proposed technical implementation details (optional)

No response

@237dmitry 237dmitry added Issue-Enhancement the issue is more of a feature request than a bug Needs-Triage The issue is new and needs to be triaged by a work group. labels Mar 25, 2023
@rhubarb-geek-nz

rhubarb-geek-nz commented Mar 25, 2023

Please leave this basic operator as it is. If you want to know that it was not split then do a count of the elements in the returned array. This is the very definition of a breaking change.

PS> ( 'foo/bar' -split '/').length
2
PS> ( 'foo' ).length              
3

If you returned a single string and then you check the length, you get the number of characters in the string, not the fact that one string was returned.

@237dmitry
Author

I do not look for workaround.

@rhubarb-geek-nz

rhubarb-geek-nz commented Mar 25, 2023

> I do not look for workaround.

Are you interested in the breaking change this would cause? For example, take the simple case of finding the last component of a file path to get the filename.

Pass in a long path

PS> $x = ("my/long/path" -split '/')
PS> $x[$x.length -1]                
path

and it gives the last part of the path

If you just pass in the last part, the same code works

PS> $x = ("path" -split '/')        
PS> $x[$x.length -1]        
path

But with the change to returning a single string if no splitting occurred

PS> $x = "path"                     
PS> $x[$x.length -1]
h

So it would break all existing code relying on the current documented behaviour.

@rhubarb-geek-nz

> I do not look for workaround.

PowerShell is in good company in simply returning a list of strings when splitting a string. For reference, see .NET, JavaScript, Java, Perl and Python.

Think of the advantages of the simple "workaround" of checking the number of returned strings: your code could work on current and previous versions of PowerShell without having to wait for a new release with this change.
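
A minimal sketch of that check, wrapped in a helper. The function name `Split-OrScalar` and its parameters are hypothetical and only for illustration; nothing like it ships with PowerShell. It relies only on the `-split` operator and the element count of the array it returns:

```powershell
# Hypothetical helper (not part of PowerShell): split, then unwrap a one-element result.
function Split-OrScalar {
    param(
        [string] $InputString,
        [string] $Pattern
    )

    # -split always returns an array, even when the pattern never matches.
    $parts = $InputString -split $Pattern

    if ($parts.Count -eq 1) {
        # Nothing was split off: hand back the single piece as a plain string.
        return $parts[0]
    }

    # Two or more pieces: the caller receives them as an array.
    return $parts
}

$one  = Split-OrScalar -InputString 'abc'   -Pattern '\d'
$many = Split-OrScalar -InputString 'a1b2c' -Pattern '\d'

$one -is [string]   # True
$many.Count         # 3
```

Because the check lives in script code rather than in the operator, it should behave the same on existing PowerShell versions.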

@jhoneill

I can almost guarantee that if it had worked like this in PowerShell version 1, someone would have asked for a change right away.

Consider:

PS>  $s = "Single"

PS>  $m = "many words"

PS>  $existing = "cat", "dog"

PS>  ($m  -split " ") + $existing
many
words
cat
dog

and

PS>  $s + $existing
Singlecat dog

Which way should ($s -split " ") + $existing work? If it had been the other way, people would be writing "I shouldn't need to test whether split returned an array; it should work like this":

PS>  ($s  -split " ") + $existing
Single
cat
dog

And I'm sure someone would have said "but you can wrap it in @() as a workaround". And there would have been the counter-argument: why does the operator work differently from the .Split methods of [regex] and [string]?
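
As a rough illustration of the two points above (the variable names are only for this sketch), the operator and both .NET Split methods give back a one-element array when nothing matches, and @() is what forces a bare string into an array before concatenation:

```powershell
$s        = 'Single'
$existing = 'cat', 'dog'

# The operator and the two .NET methods agree: one element when no separator is found.
$viaOperator = $s -split ' '
$viaRegex    = [regex]::Split($s, ' ')
$viaString   = $s.Split(' ')

$viaOperator.Count   # 1
$viaRegex.Count      # 1
$viaString.Count     # 1

# A bare string on the left makes + do string concatenation;
# wrapping it in @() makes + do array concatenation instead.
$asString = $s + $existing
$asArray  = @($s) + $existing

$asString -is [string]   # True
$asArray.Count           # 3
```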

As @rhubarb-geek-nz says, other languages, .NET itself, and 15 years of PowerShell all say the choice should be one way. You may have found one of the cases where it would be more convenient if the choice had gone against all those precedents.
But changing any existing operator is too big a breakage to be considered.
Adding a wholly new operator with different behaviour is unlikely. If it were my call, I would decline this in less time than it has taken to type this, but it isn't my call so I'll mark it for engine WG to look at.

@jhoneill jhoneill added WG-Engine core PowerShell engine, interpreter, and runtime and removed Needs-Triage The issue is new and needs to be triaged by a work group. labels Mar 25, 2023
@doctordns
Contributor

I am not sure that this is something needed today, and I agree that it's probably too late to make this change, as it would inevitably break far too much! I don't see the value in making this change.

And I kind of like -split always producing a string[]. I like consistency.

@237dmitry
Author

> So it would break all existing code relying on the current documented behaviour.

No! If the string has not been split, the result value must not be an array. This behavior will save users from additional checks.

@237dmitry
Author

> I am not sure that this is something needed today

What then is needed? Wrong table representation of PSCustomObject or maybe experimental $PSStyle.Formatting.CustomTableHeaderLabel? Today is an experiment, tomorrow nothing will change.

> And I kind of like -split always producing a string[]. I like consistency.

I do not view consistency at all. I view experiments on people.

@jhoneill

jhoneill commented Mar 25, 2023

> > So it would break all existing code relying on the current documented behaviour.
>
> No! If the string has not been split, the result value must not be an array. This behavior will save users from additional checks.

It is an array containing the parts of the string, whether it has been divided 0 times, once, or 100 times.

As a change, it will increase the number of checks. (a) If you had a time machine and could change this from PowerShell 1.0, for every user who wants an unsplit item NOT to be in an array (and if null is split, should it become an empty string? If the input is a number, should it remain a number?) there would be another user who wants it in an array and who would need to write a test.
(b) You don't have a time machine. Old scripts need to work, and this would break them. New scripts which might need to run on old versions would have to explicitly handle something which splits into a single piece.

@237dmitry
Author

> You don't have a time machine, old scripts need to work, this would break them.

Yes, I watch it every day, your PR has already hit my pocket and took a lot of time. Thank you personally for the tables floatings.

@jhoneill

> > And I kind of like -split always producing a string[]. I like consistency.
>
> I do not view consistency at all. I view experiments on people.

If you are going to argue that the operator always producing the same output type, regardless of input, is not consistency, that working like other languages and like other .NET types is not consistency, and that supporting the documented behaviour from 2007 is not consistency, then we will have to conclude that you are only here to argue.

This certainly comes under "disruptive behaviour" in the code of conduct. Your behaviour around the change to formatting numbers by default saw you asked to comply with the code. Since we can expect you to continue to argue whatever you are told, I'm going to remove the tag for the engine WG to look at this and mark it as by design.

> > You don't have a time machine, old scripts need to work, this would break them.
>
> Yes, I watch it every day, your PR has already hit my pocket and took a lot of time. Thank you personally for the tables floatings.

For the record: I made a suggestion that Format-Table should have a parameter for number format. The product team said it should really go beyond that and use the default number formatting for all floats. Personally, I don't think that is as good (what if I want all floats to be four decimal places, or want to turn thousand separators off?) but I did not argue. I didn't make the PR.

So far you have complained that if you put constants into a custom object, where the constant you give varies the formatting, consistent formatting should not apply. No other negative feedback has been received. No one else has supported your complaint. A pedantic reading of the code of conduct would say your post above is a breach of it.

Please re-read the code of conduct before posting further.

@jhoneill jhoneill added Resolution-By Design The reported behavior is by design. and removed WG-Engine core PowerShell engine, interpreter, and runtime labels Mar 25, 2023
@237dmitry
Author

> I don't think that is as good - what if I want all floats to be four decimal places or turn thousand separators off - but I did not argue. I didn't make the PR.

> I made a suggestion that said format-table should have a parameter for number format

It is very difficult for me to discuss this issue with you, I have to translate ten times from one language to another and I'm not sure that I express my thoughts correctly, one thing is clear, you are involved in a crime against users.

@jhoneill

Please re-read the code of conduct before posting further.
