Let "".split() split according to _any_ whitespace #86

Open
kseistrup opened this issue Dec 27, 2018 · 4 comments
@kseistrup

Python's "".split() method has a useful feature: when invoked without any arguments, it splits the string on runs of any whitespace and discards empty strings from the result:

>>> s = "X Y\tZ"
>>> s.split(" ")
['X', 'Y\tZ']
>>> s.split("\t")
['X Y', 'Z']
>>> s.split()
['X', 'Y', 'Z']

Having similar logic in ABS would make it much easier to split strings without having to resort to calling .trim() on the resulting elements.
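
To make the trimming problem concrete, here is a minimal Python sketch. The sample string and the helper name `split_and_clean` are made up for illustration and say nothing about how ABS implements `.split()` or `.trim()`:

```python
def split_and_clean(s, sep=" "):
    """Hypothetical helper: split on one separator, trim each piece, drop empties."""
    return [piece.strip() for piece in s.split(sep) if piece.strip()]


s = "X  Y   Z "

# Splitting on a single space keeps an empty string for every extra space.
print(s.split(" "))        # ['X', '', 'Y', '', '', 'Z', '']

# The manual clean-up that the argument-less form makes unnecessary.
print(split_and_clean(s))  # ['X', 'Y', 'Z']

# Python's argument-less split gives the same result in one call.
print(s.split())           # ['X', 'Y', 'Z']
```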

@odino
Collaborator

odino commented Dec 27, 2018

I'm on the fence on this. JavaScript uses "" as the default when no argument is passed, so I was thinking of doing the same, but I'm open to suggestions, especially considering I'm generally more inclined to lean towards what Python does over other languages :)

Which characters count as whitespace here? I see \t in the example; I had a look at the docs, but it wasn't clear to me. EDIT: found it.

@odino odino added the enhancement New feature or request label Dec 27, 2018
@odino odino added this to the cauldron milestone Dec 27, 2018
@kseistrup
Author

Caveat: I'm not a big fan of JavaScript, so I'm biased on this issue.

From a practical standpoint, if I wanted to do something with every character in a string in Python, I would use

for c in "xyz":
    do_something(c)

or

chars = list("xyz")

Also from a practical standpoint: if ABS were to adopt JavaScript's notion of the default, then to split a simple string like "X Y\tZ" I would have to loop over the string and its substrings with multiple calls to .split() and .trim(), which seems a bit over the top.
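
The loop described above might look roughly like this in Python. This is only a sketch of the workaround: the function name `split_on_each` and the separator list are assumptions chosen for illustration, not anything ABS provides:

```python
def split_on_each(s, separators=(" ", "\t", "\n")):
    """Hypothetical workaround: split on every separator in turn, then trim."""
    pieces = [s]
    for sep in separators:
        # Re-split every substring produced so far on the next separator.
        pieces = [part for piece in pieces for part in piece.split(sep)]
    # Trim each piece and drop the ones that end up empty.
    return [p.strip() for p in pieces if p.strip()]


print(split_on_each("X Y\tZ"))  # ['X', 'Y', 'Z']
print("X Y\tZ".split())         # ['X', 'Y', 'Z'], the same result in one call
```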

Also, if we look at how Bash does it: an unquoted variable expansion is split at the characters in $IFS, which defaults to " \t\n":

s=$(printf 'X Y\tZ')
for c in $s; do echo $c; done
X
Y
Z

PS: If ABS decides to go with Python's or Bash's strategy, it should probably consider letting .trim() without any arguments follow the same logic. E.g., in Python:

>>> '\t \nX\f\v'.strip()
'X'
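
As a small sanity check of that idea, the Python sketch below shows that the argument-less `strip()` and `split()` agree on what counts as whitespace. The variable names are illustrative only:

```python
s = "\t \nX\f\v"

# strip() without arguments removes leading and trailing whitespace.
stripped = s.strip()
print(repr(stripped))  # 'X'

# Every character that was removed is one that str.isspace() accepts.
removed = [c for c in s if c not in stripped]
print(all(c.isspace() for c in removed))  # True

# split() without arguments treats the same characters as separators.
print(s.split())  # ['X']
```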

@kseistrup
Author

PPS: I just tried JavaScript:

$ js52
js> "XYZ".split()
["XYZ"]
js> "XYZ".split("")
["X", "Y", "Z"]
js> ^D
$ node
> "XYZ".split()
[ 'XYZ' ]
> "XYZ".split("")
[ 'X', 'Y', 'Z' ]
> 
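
For comparison, a short Python sketch of the same two calls. In Python an empty separator is rejected with a `ValueError`, so splitting into characters goes through `list()` instead:

```python
s = "XYZ"

# No argument: split on whitespace; there is none, so the string comes back whole.
print(s.split())  # ['XYZ']

# An empty separator is an error in Python rather than a per-character split.
try:
    s.split("")
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)

# Per-character splitting is spelled list(s).
print(list(s))  # ['X', 'Y', 'Z']
```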

@odino
Collaborator

odino commented Dec 27, 2018 via email
