Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-thinking the command substitution operators #3924

Closed
daniel-shimon opened this issue Oct 23, 2020 · 55 comments · Fixed by #5377
Closed

Re-thinking the command substitution operators #3924

daniel-shimon opened this issue Oct 23, 2020 · 55 comments · Fixed by #5377

Comments

@daniel-shimon
Copy link
Contributor

daniel-shimon commented Oct 23, 2020

Abstract

This is a proposal for re-designing the command substitution (a.k.a command pipelines) mechanisms in xonsh.

A simple example for command substitution in bash:

echo hi $(whoami)

NOTE: This example is very basic and naïve. Please look at the usage examples below.

Motivation

Command substitutions have a range of usage scenarios, and we want them all to be easy and concise.
These usages differ by two factors - stripping and splitting.

  • Stripping: Most unix and windows cli commands append newlines to their output, which we might want to strip.
  • Splitting: A program might output multiple values separated by spaces, newlines, tabs or other tokens, which we might want to split to multiple arguments.

Some of the more common combinations aren't easily expressed in xonsh currently.

Examples for each scenario (commands with the corresponding output format):

Strip trailing newline No stripping Strip all whitespace
No splitting whoami, uname, hostname, nproc echo -n ... | wc -l
Splitting by newline find, git branch/log/diff --name-only
Splitting by whitespace Meaningless Meaningless groups, yay -Qdtq, pkg-config
Splitting by other token parsing /etc/passwd

Current xonsh behavior

Operator Strip Split Return value Mode
$() No No String Both
!() No No CommandPipeline object Python
@$() Whitespace Whitespace List of args Command
$[] No No None Both
![] No No CommandPipeline object Both

Current state and comparisons with other shells

The first three options are the more common ones according to our current examples.

Strip Split Xonsh Bash Zsh Fish
Trailing newlines No @($(CMD).rstrip('\n')) - "$(CMD)" -
Trailing newlines Newlines @($(CMD).splitlines('\n')) - - (CMD)
Whitespace Whitespace @$(CMD) $(CMD) $(CMD) (CMD | string split ' ' --no-empty)
No No $(CMD) - - -
Whitespace No @($(CMD).strip()) "$(CMD)" - -
No Newlines @($(CMD).split('\n')) - - -
Whitespace Newlines @($(CMD).strip().split('\n')) - - (CMD | string trim | string split ' ' --no-empty)
Trailing newlines Token @($(CMD).rstrip('\n').split(TOKEN)) - - (CMD | string split TOKEN --no-empty)
No Token @($(CMD).split(TOKEN)) - - -
Whitespace Token @($(CMD).strip().split(TOKEN)) - - (CMD | string split TOKEN --no-empty | string trim)

NOTE: As far as I can tell, none of the shells has an option to only strip the single last newline.

Current proposals

XEP-2 (by @anki-code)

See details and examples in the XEP - https://github.com/anki-code/xonsh-operators-proposal/blob/main/XEP-2.rst.
In short:

Mode Current Proposed
Python $() returns output string. $() returns CommandPipeline object.
Subproc $() returns output string. $() returns list of lines from output without trailing new line symbol.
Python $[] returns None. $[] returns HiddenCommandPipeline object.
Both !() exists. !() removed. UPD: discussion.
Both ![] exists. ![] removed. UPD: discussion.

CommandPipeline (CP) class changes:

  • Add str representation the same as CP.out.
  • Remove trailing new lines in CP.out, CP.lines and CP.__iter__.
  • Add all string methods i.e. the $().split() will return CP.out.split() that is IFS analogue in fact.
  • Add all string methods for lines i.e. $().lines_find(txt) will return [l.find(txt) for l in CP.lines].

This issue's initial proposal with influence from @adqm

  • Strip trailing newline in $().
  • Strip trailing newlines in !().__iter__ (i.e. use .splitlines()).

Pros:

  • No type-changing back-compatibility issues.
  • !() is consistantly iterated through.
  • Splitting by whitespace remains unchanged (@$()).
  • The complete output is still available easily via !().out.

Cons:

  • Python code that uses $() might need to change if the trailing newline is important.

laloch@98363df (by @laloch)

Add @$[] operator to split by newlines.

Pros:

  • Completely backwards-compatible.

Cons:

  • More arbitrary syntax.
  • Doesn't really extend @$() and $[] in a meaningful way.
  • Still doesn't solve the most common No split, Strip newline scenario.

Rejected proposals

Initial proposal in this issue

Run .rstrip('\n') in $().

Cons:

  • No easy way to use the complete output if a user needs it.

Add a config option to strip $()

Cons:

this is a case where configuration would be bad, because xonsh scripts would either run or fail based on how the environment was configured.

For community

⬇️ Please click the 👍 reaction instead of leaving a +1 or 👍 comment

@laloch
Copy link
Member

laloch commented Oct 23, 2020

Hi @daniel-shimon, this has been discussed (and rejected) several times already. The reasoning, if I remember correctly, was that the trailing newline is part of the captured output and therefore should be kept as is. What if the terminating newline character is significant to your pipeline? For instance echo "foo" | wc -l vs. echo "foo\n" | wc -l? What about outputs terminated by several newline characters? How many of them would you want to strip? rstrip(...) strips all of them...

Btw, I use git rebase -i @$(git merge-base HEAD master) several times a day and it does its thing just fine 😉

There's a little trick using abbrevs: You can define abbrevs["$$"] = '@($(<edit>).rstrip("\\n"))'.

@daniel-shimon
Copy link
Contributor Author

daniel-shimon commented Oct 23, 2020

I understand what you're saying but it still doesn't feel very helpful. I'm talking of course only about subprocesses which are substituted inside another command, in which case I can't really see how that's helpful.

In any case I guess this issue is a pretty opinionated thing if it has been discussed so many times.
Do you think adding a config option like $XONSH_STRIP_SUBPROC would be acceptable?

@scopatz
Copy link
Member

scopatz commented Oct 23, 2020

In any case I guess this issue is a pretty opinionated thing if it has been discussed so many times.

Yeah, it really is. The fundamental issue is that many CLIs add a trailing newline without detecting if they are connected to a TTY or not. Bash and other shells effectively add an rstrip operation when capturing (or really a split operation), effectively deciding what the output of subprocesses should be.

I personally think that a lot of Bash's syntax and magic is particularly confusing to new users.

Do you think adding a config option like $XONSH_STRIP_SUBPROC would be acceptable?

I wish it was. I think this is a case where configuration would be bad, because xonsh scripts would either run or fail based on how the environment was configured. It seems like it could cause a lot of downstream bugs. So I think that this is a scenario where a strong decision is needed.

I'm talking of course only about subprocesses which are substituted inside another command

Now this is an interesting idea. I am open to considering it. To-date all operators work exactly the same independent of their calling context. Things that would need to be fleshed out for this proposal.

  • What happens with the other subprocess operators depending on their calling modes: ![], !(), $[]
  • What do we do with the @$() operator?

I do like to keep an open mind, though! So I can be convinced away from the above opinions.

@anki-code
Copy link
Member

anki-code commented Oct 23, 2020

@scopatz what if create distinct operator for this?

@scopatz
Copy link
Member

scopatz commented Oct 23, 2020

what if create distinct operator for this?

That is what @$() was supposed to be 😉 I am in favor or removing syntax at this point, rather than adding. I really want to avoid the issues that Bash has where there are so many specialized syntax elements that no one can remember them. Xonsh should be easy for people to learn and use and consistent in its implementation.

If anything I think we should either fully implement @$() or remove it. The initial idea for @$() what that you could register transformation functions (like a decorator), that would modify output. For example, @upper$() would uppercase the output. Or you could apply many times, like @split@upper(). Then what we have now would just be the default value: @split$() == @$().

However, this was never fully done, so maybe it is better to drop the syntax entirely.

@daniel-shimon
Copy link
Contributor Author

I do like to keep an open mind, though! So I can be convinced away from the above opinions.

Good to hear! I was fearing this is a dead end.

  1. I agree with @anki-code that @$() should remain as-is since it's very xonsh-specific in nature and the ability to spread arguments is pretty cool
  2. I think the other subprocess operators shouldn't change as well, since if we're using them we're probably doing something more advanced than just command substitution.

The basic jist is that $() is used daily for command substitution, and keeping the trailing newlines isn't robust enough for day to day use (e.g. when dealing with paths you have to use @$() which is cumbersome and won't work for paths with spaces).

If we're using $() in more complex python code, we probably want to be accurate, and the command line(s) will probably be bigger such that adding a .rsplit() is more legitimate

@laloch
Copy link
Member

laloch commented Oct 23, 2020

I already have @$[] operator here long forgotten, that splits the captured output on newlines only, yielding

$ showcmd @$[echo "space here\nbreak here"]
['space here', 'break here']

instad of

$ showcmd @$(echo "space here\nbreak here")
['space', 'here', 'break', 'here']

I can open a PR if there's general interest in the functionality.
I do, however, agree with @scopatz in that we should not add more syntax for no good reason.

@anki-code
Copy link
Member

anki-code commented Oct 23, 2020

@laloch could you please share your PR? I want to understand the process of creating additional operators =)

@daniel-shimon
Copy link
Contributor Author

Another option is to only have $() strip the trailing newlines, while !() doesn't.
This way $() is robust for day-to-day command substitution but for full accuracy one can use !().out.

This resonates with $() being intended for simple usages (you can't use it to get the exit code etc), so I think adding the rsplit to make it more robust and useful is pretty sound

@daniel-shimon
Copy link
Contributor Author

Yeah I agree about not adding more syntax in this area

@laloch
Copy link
Member

laloch commented Oct 23, 2020

could you please share your PR? I want to understand the process of creating additional operators =)

Yeah, of course. Here you go: laloch/xonsh@98363df. It's pretty trivial, since it copies almost everything from @$().

@daniel-shimon
Copy link
Contributor Author

I like this logic when I thought about users from bash world and fast diving into xonsh. Looks consistent.

It's not just for users coming from a different shell, it's really more useful and intuitive since you want information from some standard unix command (e.g. whoami, uname, nproc etc), and they all append newlines to their outputs.

Here's another example from a classical daily shell use - make -j $(nprocs) fails. Yeah, you could use @$() but it doesn't really express what you want (stripping, not splitting), and I don't think this kind of simple classic shell usage should cost a lot of syntax overhead.

@anki-code
Copy link
Member

anki-code commented Oct 23, 2020

I have only one concern #3394 :)

@laloch
Copy link
Member

laloch commented Oct 23, 2020

#3394 is a bug - sometimes you need to call !(cmd).end() in order to get the resulting CommandPipeline attributes populated.

@scopatz
Copy link
Member

scopatz commented Oct 23, 2020

My main point here is that if we are going to make one subprocess operator modal (ie Python vs Subprocess), it is worth thinking about how all of them should behave in a modal fashion.

Also, if as part of this, we should decide if we can get deprecate @$() or if we should fully implement it, and how this operator would behave modally too.

@anki-code's suggestion here gets close to that, but is probably not the complete answer.

I guess what I want to see is a table with all 5 subprocess operators and how they would behave (ie what their return values would be) in both Python and subprocess mode. I think this would be a good exercise because such a tutorial would need to make it into the docs eventually anyway, if we proceeded down this path.

@anki-code
Copy link
Member

anki-code commented Oct 23, 2020

The PR - #3926

@daniel-shimon
Copy link
Contributor Author

daniel-shimon commented Oct 23, 2020

@anki-code if I understand correctly with your suggestion $() isn't modal?
Meaning $() always returns the rstripped output, and !() returns the complete output in subprocess mode and a CommandPipeline object in python mode?

This sounds like a pretty good idea to me - !() is used for fine-grained usages and $() just works for substitutions.

@daniel-shimon
Copy link
Contributor Author

@scopatz In this suggestion, the !() operator doesn't really act differently wrt to the current mode, it just returns a helpful thing in subprocess mode and returns the complete CommandPipeline object in python mode.

@anki-code
Copy link
Member

anki-code commented Oct 24, 2020

I've edited the description of the solution.

@daniel-shimon
Copy link
Contributor Author

@anki-code very nice!
Can you clarify that the behavior should be .rstrip('\n) like in other shells?
Also can you add the make - j $(nproc) example?

@scopatz
Copy link
Member

scopatz commented Oct 25, 2020

In my opinion, $() should return a string in both Python & subprocess mode, not a list. The other operators are free to return various other objects

@anki-code
Copy link
Member

anki-code commented Oct 25, 2020

@scopatz could you give an arguments why?

@scopatz
Copy link
Member

scopatz commented Oct 25, 2020

Anyway, on the technical side, the original design was for $ to indicate that a built-in Python object was returned (str or None), while ! indicated a more rich object, now CommandPipeline. () indicated capturing, ie nothing would be printed to stdout. [] means that output flows through to the screen.

So in this light, $() means capture the output and return a basic Python object. It is supposed to be the same as a the Bash operator $().

Now, I think there might be some confusion about what Bash does. In Bash, because everything is a string*, the $() just returns a string of stdout. I am not even sure that Bash strips the trailing newline. However, in most Bash usage, the Internal Field Separator $IFS variable actually determines how text strings are split by the Bash parser. By default this is set to do whitespace stripping. So in effect, $() in Bash works like @($(cmd).split()) would in xonsh.

In my opinion, $IFS is a bad idea because it can lead to all sorts of strange behavior, like 3rd party scripts not running because the calling process has set this variable incorrectly. $IFS also makes $() seem like it is doing more than it is.

I think using $() in xonsh to split into a list of arguments is a neat idea, but it would necessitate the addition of some default or configurable way to split those arguments. For example, should $() be split by lines or by whitespace (like effectively what Bash does)?

Like with Bash, splitting $() into a list of str and being able to configure how this split happens would have the following issues:

  • Some scripts would not be able to run with each other or
  • Every script would need to be very aware of how $() was being split.

This is why I think that $ simple return operators like $() and $[] should pick one uniform way of returning and stick to that. It builds a consistent ecosystem of code that can be run with each other. The approach in xonsh so far has been to thus allow the users / authors to explicitly split however they want. @($(cmd).split()) may seem like more work to write, but it is less work to inspect and debug later on.

So while I keep an open mind on what these various operators should return, I think we should also try to learn from some of the bad design of other shells.

I hope this makes sense!

*Yes, you can make arrays, but it is hard and annoying to do so.


On the less technical side, I think there are a lot of things that we all want fixed before 1.0. This is another problem that probably should be resolved (one way or the other).

Speaking only for myself, xonsh is something that I can only code on during the nights and weekends. I have other things to do in my free time as well. Because I am trying to support all parts of xonsh (review all PRs, put out releases, fix fundamental issues with the project, etc), I don't have as much time as even I would like to spend on each issue.

There are other forks of xonsh out there. Which is great! Xonsh has made the process of creating high quality shells so much easier, that I am happy that people have made improvements in the direction that they want to go. Many years worth of effort have gone into xonsh from a fairly big group of people.

I am sorry if you feel development is too slow. It is going as fast as it possibly can. An single person working on their own is always able to work faster than a group trying to make decisions.

On the other hand, working to improve xonsh as a group, even though it is harder and slower, does improve the shell for everyone in the long run. If a change is important and valuable, it will make it in. But the bigger the effects of the change, like with subprocess syntax, the more people are going to need to weigh in on the pros/cons and the more forethought will need to be put into it.

@daniel-shimon
Copy link
Contributor Author

I agree we should improve the efficiency of the discussion, and document all the arguments in a centralized way.

Following what you guys said, since this is such a major design decision in xonsh I suggest we take a step back from implementation details, gather concise examples, compare to other shells and document the pros-and-cons for each proposal.

I'll try to make this issue a mini-PEP for the subject.
@scopatz if you prefer some other medium, a separate repo could be cool.

I've updated the description to express all the things we've talked about.
@anki-code @laloch please let me know if I missed stuff and give us more examples so that I'll update the description.

Let's take a few days to get more examples, discuss more options and let more people from the community express their opinions 😄

@daniel-shimon daniel-shimon changed the title Strip away newlines in subcommand output $(...) Re-thinking the command substitution operators ($(), !(), etc) Oct 26, 2020
@daniel-shimon daniel-shimon changed the title Re-thinking the command substitution operators ($(), !(), etc) Re-thinking the command substitution operators Oct 26, 2020
@adqm
Copy link
Member

adqm commented Oct 26, 2020

Chiming in since I was one of those who was spam-mentioned in the other thread and because I was involved in the discussion and implementation of these operators the first time around, though I've not had a horse in this race for several years...

(BTW, hi again, @scopatz et al! It's been a long time; I hope all is well!)


Years ago when this conversation started, I was a proponent of stripping a trailing newline from the output of $(...) but leaving the output as a string; and leaving !(...) as-is. And I'm still a proponent of that change, FWIW (was that the original proposal in this thread? I haven't been following the conversation).

I'm not crazy about the implementation in #3926, since $(...) returning a list just doesn't feel right to me. You can always do list(!(...)) if you want something like that. I'd rather see fewer keystrokes for common operations, and more keystrokes for less common things, so I think it makes the most sense for $(...) to return a string with trailing newline stripped, and !(...) to return whatever fancy object xonsh returns nowadays for that kind of operation. So I would like to see:

$ user = $(whoami)

# easy, and common, and (i'm pretty sure) matches bash in terms of what is captured...
# if not, it at least matches my expectation and doesn't surprise me when i switch between shells
$ user = !(whoami).out 

# if you want the trailing newline for some reason
$ files  = list(!(ls))

# if you want a list instead of a generator for some reason

This is the behavior I have in my fork, and I'm happy with it; I've never been surprised by what I get out of either of these operators, and examples like the make -j $(nproc) work just fine. I have never used the middle example (with the trailing newline), though I do loop over !(...) directly regularly, and I occasionally make a list out of it (as above) for later use.


I feel like this small change is also all that one needs to provide a relatively easy way to get just about anything you would want from a command. In Python mode:

  • If you want the output as a string, you can do $(...), no need for any indexing, stripping, joining, etc
  • If that deleted some whitespace that you wanted, you have to type a bit more, but you can do !(...).out
  • If you want something to loop over the lines, you can do !(...)

Then, in terms of interpreting as arguments to a subprocess-mode command:

  • If you want the whole result as a single argument, you can do command $(...)
  • If you want the result as multiple arguments split by whitespace, you can do command @$(...)
  • If you want the result as multiple arguments split by newlines, you can do command @(!(...))

I feel like this structure, on the whole, is less work for the more common operations than the current behavior, and also when compared to the structure where $(...) returns a list-like thing. And if you want something more complicated than the above, I feel like you should be able to build it from those primitives.

Basically, I feel like the current behavior is mostly consistent, and that the single small change of r-stripping newlines by default in $(...) reduces end-user surprise. I'm not sure the additional changes of the structure proposed in #3926 provide much additional benefit, particularly when weighed against the cost of such a big (and backwards-incompatible) change.


I don't have the time or the energy to jump into the fray here, though, other than to say that I am probably -1 on #3926 if I still get a vote, to agree with @daniel-shimon that these operators are indeed worth thinking about more, and to present my view of how I would want things to work.

@jnoortheen
Copy link
Member

@scopatz I am in favor of removing @$. We could find some pythonic ways to do it instead. Having too many operators leads to confusion. Python is as simple as one can get. We could use/abuse the pipe operator https://pypi.org/project/pipe/ 😄. We can discuss that in another issue I think.

We can blame bash for all we want but it is everywhere, even dockerfilss, CI configs etc, Copying the syntax but not the behaviour leds to confusion and frustration. I myself copy pasted code and expected it to run (my bad it was in bash😄). I understand that Xonsh wants to establish high standards. But come on, bash is not going to be replaced anytime soon and xonsh is primarily interactive usage (Though the scripts can be in xonsh, I find myself using plain python+sh. It is way more maintainable and has complete toolchain to lint,format etc.,). So why not make it more usable and less surprising.

@anki-code Having said that, it is not fair to add such disruptive behaviour to an old project and have it by default. Many people would be having their scripts doing things using this behaviour. The best bet would be to make this a xontrib. Since this is a core code, making it modular, maybe it can be achieved.

@anki-code
Copy link
Member

anki-code commented Oct 26, 2020

Having said that, it is not fair to add such disruptive behaviour to an old project

@jnoortheen thanks for your attention and time! The backwards compatibility is not a strong argument and xonsh is not so old (5 years slow xonsh development vs 15 years of fast fish development vs 30 years zsh/bash). Bash becomes ugly because thinks about backwards compatibility every day.

@anki-code
Copy link
Member

anki-code commented Nov 1, 2020

image

Hello all! Hope your weekend going well.

I made a lot of homework during the week of silence. I took into account all the comments and suggestions in this and other topics. I want to told you about the result.

  1. I figured out one new approach that looks the best. I described this approach in distinct page as Xonsh Enrichment Proposal #2 (XEP-2). I'll not repeat it here just read it.

  2. I implemented the XEP-2 approach to allow everyone try it in action.

  3. To achieve point 2 I crated my-xonsh-fork script that allows completely rename xonsh source code to make new package name.

  4. As result I created xonsh2 repository that could be installed via pip, contains XEP-2 approach changes and will not affected to your installed xonsh on the host.

    You can install xonsh2 and: run it, create .xonshrc_2, add to the shebang (#!/usr/bin/env xonsh2), run scripts (xonsh2 my.xsh), put xontribs to xontrib2 directory and load it via xontrib2 load. Original xonsh will live shoulder to shoulder with xonsh2 in fact.

Finally, I want to ask maintainers and any people who want to help read the XEP-2 proposal, try this approach in action and give a feedback. Feel free to report bugs and features to the XEP-2 repo. Thanks!

PS: first feedback on XEP-2:

image

@anki-code
Copy link
Member

anki-code commented Nov 7, 2020

After creating XEP-2 I want to read the @scopatz comment and answer more detailed.

the original design was for $ to indicate that a built-in Python object was returned (str or None), while ! indicated a more rich object, now CommandPipeline

This just an idea. You are not described the benefits behind or arguments why we should support this idea.

We deleted ! operator in XEP-2 and have the same functionality with better use cases.

So in this light, $() means capture the output and return a basic Python object. It is supposed to be the same as a the Bash operator $().
...
In my opinion, $IFS is a bad idea because it can lead to all sorts of strange behavior

I agree with arguments against the IFS. But this can not be used as an argument that the $() operator should return string.

Yes, IFS is bad because changes behavior of $() in Bash. This only means that we shoud do the $() operator stable in xonsh. It could be a string or list or something else but the main requirement is: it should be unchangeable.

I think using $() in xonsh to split into a list of arguments is a neat idea, but it would necessitate the addition of some default or configurable way to split those arguments.

Here you contradict yourself. You criticize the IFS above for this. No, we shouldn't change the $() behavior externally. We should choose the most usable behavior and stay on it.

For example, should $() be split by lines or by whitespace (like effectively what Bash does)?

Rephrasing this questions: what $() should do by default? So we are here for discuss it. Our current suggestion is in XEP-2.

Like with Bash, splitting $() into a list of str and being able to configure how this split happens would have the following issues
...
This is why I think that $ simple return operators like $() and $[] should pick one uniform way of returning and stick to that.

We all already agree that IFS is bad and we should choose the good unchangeable behavior for $() by default.

We still not have an arguments why it should be a string.

The approach in xonsh so far has been to thus allow the users / authors to explicitly split however they want. @($(cmd).split())

This statement tells to us that we should have an ability to do this. But it does not tell to us that we should use $() operator for this.

There are four counter arguments against that $() operator should be a string in subprocess mode:

  1. This operator have a long and well known history - it used to split output to arguments.

  2. The Zen of Python told us: Beautiful is better than ugly, Simple is better than complex, Flat is better than nested, Readability counts. Now look here: id $(whoami) vs id @($(whoami).rsplit()). The left command is obviously in line with Python Zen principles. Even if you try to argue using the principle Explicit is better than implicit, you will lose to other principles and points 1,3,4 in this list.

  3. This operator does not return pure output. The output is modified (read below). You can not refer to use case where user get pure output and split it as he want.

  4. There is a technical and in most cases backwards compatible way to make $() pretty agile to solve explicitly splitting for complex usage. Proposed in XEP-2.

  5. Added later: Statistics regro/rever shows that getting string use case loses to getting arguments use case.

As result the statement about explicitly splitting still has no arguments about why $() should be a string in subprocess mode but has counterarguments.

I want to additionally demonstrate that current $() operator corrupts the output:

body = '1\n2\r\r\r\r3\r\n4\n6\r7'

$(echo -n @(body)) == body
#False
len($(echo -n @(body))) == len(body)
#False

$(echo -n @(body))
#'1\n2\n\n\n\n3\n4\n6\n7'

echo -n $(echo -n @(body)) | md5sum
# 8dae88fa7c93e91a59abea2686e64994  -

echo -n @(body) | md5sum
# 35364d7e6a46e5f560e2e9509be31349  -

# Prepared by xontrib-hist-format

This means that current $() is not pure output. Any arguments that based on that user has an output and he can split it any way are all wrong. User already got corrupted output and should invent the way to do something with this. Current xonsh has no pure logic. Current xonsh is already has biased pre assumption that $() should return corrupted output.

Now back to @scopatz comment:

In my opinion, $() should return a string in both Python & subprocess mode, not a list.

This statement:

  • Has no strong support behind.
  • Has no arguments behing.
  • Has reasonable counter arguments.
  • Has reasonable alternative described in XEP-2.

I commented on Anthony's answer line by line and showed the lack of arguments not to attack him. Anthony is the founder of this incredibly awesome project and his opinion requires special attention. In this case this opinion does not have strong support and I just wanted to show it and urge all of us to what is included in the title of this issue - thinking.

I'm looking forward to XEP-2 review from Anthony and xonsh/xore.

@scopatz
Copy link
Member

scopatz commented Nov 10, 2020

Given XEP-2 says that it is backwards incompatible (which it is) and you want a xonsh2, I don't know how to approach this in any short time frame.

@scopatz
Copy link
Member

scopatz commented Nov 10, 2020

That is to say, is there any way to change this so that it is actionable in the near future? Maybe have something which is mostly backwards compatible, so that we can move toward 1.0 with whatever changes we do want to make.

@jnoortheen
Copy link
Member

I think the decision should be left to user, instead of forcing them to adapt to a behaviour one seem fit.
So making this configurable with some list of choices(string names like no_new_line, raw ...) and a callback path(to return custom class) would suffice all use cases.

@daniel-shimon
Copy link
Contributor Author

@scopatz Why is backwards compatibility a big issue before we're at 1.0? If it's because of the time frame maybe we can split tasks to a few parallel PRs?

@scopatz
Copy link
Member

scopatz commented Nov 10, 2020

Yeah, I 100% agree that we are not at v1.0 yet and so we should feel free to make backwards-incompatible changes. However, I don't know that we can fundamentally break every xonsh script & module that has ever been written.

I would also like to avoid Python 2 ->3 transition issues, especially in pre-1.0 world.

I guess we could use the current parser to translate older APIs into the newer ones, if we did want to go down these route.

@scopatz
Copy link
Member

scopatz commented Nov 10, 2020

Also, is there a place I should be giving specific feedback on XEP-2?

@anki-code
Copy link
Member

If you want to ask questions or point out to grey zones that should be described or improved in the proposal the best way is to create an issue in the repo. I'll think about it. We already have a question from laloch that was answered.

Other thoughts feel free to put in any place you want.

PS: it will be great to know who is from xore team also have an interest in this process and from who we can wait a feedback.

@anki-code
Copy link
Member

anki-code commented Nov 16, 2020

Minute of statistics. I've counted the meaning of $() operator in regro/rever that written on xonsh:

  1. Only 5 lines (30%) use $() operator to get string. All other 12 lines (70%) use $() to get "arguments" and have strip() or splitlines() tails.

    cd /tmp
    git clone https://github.com/regro/rever/
    rm -rf rever/.git
    grep -ri '$(.*)' ./rever | grep -E -v '(appimage|\.rst|/test|filename=|Autogenerated)+'
    File Line Meaning
    rever/conda.xsh envs = json.loads($(conda env list --json))["envs"] str
    rever/conda.xsh $(conda shell.xonsh hook), str
    rever/activities/push_tag.xsh raw_remote = [r for r in $(git remote -v).split() argument
    rever/activities/pypi.xsh pypi_name = $($PYTHON setup.py --name).strip() argument
    rever/activities/gcloud.xsh account = $(gcloud config get-value account).strip() argument
    rever/vcsutils.xsh return $(git rev-parse --abbrev-ref HEAD).strip() argument
    rever/vcsutils.xsh return $(git rev-parse HEAD).strip() argument
    rever/vcsutils.xsh tag = $(git describe --abbrev=0 --tags), return tag.strip() argument
    rever/vcsutils.xsh root = $(git rev-parse --show-toplevel), return root.strip() argument
    rever/vcsutils.xsh lines = $(git log "--format=%aN<%aE>").strip().splitlines() arguments list
    rever/vcsutils.xsh for line in $(git shortlog @(args)).splitlines(): arguments list
    rever/vcsutils.xsh for line in $(git shortlog @(args)).splitlines(): arguments list
    rever/vcsutils.xsh for line in $(git log --encoding=utf-8 --full-history --reverse "--format=format:%ae:%at").splitlines(): arguments list
    rever/tools.xsh keys = $(gpg --list-keys) str
    rever/authors.xsh log = $(git log --sparse --full-history @(fmt) @(refs)) str
    rever/docker.xsh imageid = $(docker images -q @(image)).strip() mostly argument
    rever/docker.xsh out = $(docker run --help) str
  2. Usage of !() operator is low:

    grep -ri '!(.*)' ./rever
    ./rever/activities/ghpages.xsh:                p = !(git diff --exit-code --quiet).rtn
    ./rever/activities/ghpages.xsh:                q = !(git diff --cached --exit-code --quiet).rtn
    ./rever/activity.xsh:        p = !(which $REVER_VCS)
    ./rever/activity.xsh:            if !(which @(cmd)):
    ./rever/tools.xsh:    if not bool(!(which gpg)):
    
  3. Usage of ![] is the same as $[] because the returning value is not used in 83% of cases:

    grep -ri '!\[.*\]' ./rever
    ./rever/activities/ghrelease.xsh:    ![git archive -9 --format=tar.gz --prefix=@(folder_name)/ -o @(fname) @(tag)]
    ./rever/activities/ghpages.xsh:            ![git clone @(repo) @(repo_dir)]
    ./rever/activities/pypi.xsh:        p = ![$PYTHON setup.py @(build_commands)]
    ./rever/activities/pypi.xsh:            p = ![twine @(upload_args) $dist_dir/*]
    ./rever/activities/forge.xsh:            p = ![git clone @(feedstock_origin) @(feedstock_dir)]
    ./rever/activities/forge.xsh:        rerender = ![conda-smithy regenerate --check]
    ./rever/activities/gcloud.xsh:        ![gcloud auth application-default login]
    ./rever/activities/gcloud.xsh:        ![gcloud auth login]
    ./rever/activities/gcloud.xsh:            ![kubectl set image deployment/$GCLOUD_CONTAINER_NAME $GCLOUD_CONTAINER_NAME=$GCLOUD_DOCKER_HOST/$GCLOUD_DOCKER_ORG/$GCLOUD_DOCKER_REPO:$VERSION]
    ./rever/activities/gcloud.xsh:                ![kubectl rollout status deployment/$GCLOUD_CONTAINER_NAME]
    ./rever/activities/gcloud.xsh:        ![kubectl rollout undo deployment/$GCLOUD_CONTTAINER_NAME]
    ./rever/activities/docker.xsh:            ![docker build @(args)]
    ./rever/activities/docker.xsh:            ![docker push @(tag)]
    ./rever/activities/docker.xsh:            ![docker build @(args) .]
    ./rever/activities/docker.xsh:                ![docker push @(tag)]
    ./rever/vcsutils.xsh:    ![git init @(tempd)]
    ./rever/vcsutils.xsh:            ![git checkout -b __rever__]
    ./rever/vcsutils.xsh:            ![git commit --allow-empty -m 'Checking rever permissions']
    ./rever/vcsutils.xsh:            ![git push --force @(remote) __rever__:__rever__]
    ./rever/vcsutils.xsh:            ![git push --force @(remote) :__rever__]
    ./rever/tools.xsh:    ![cd @(d)]
    ./rever/tools.xsh:    ![cd @(old_d)]
    ./rever/docker.xsh:    ![docker build -t @(image) -f @(dockerfile) --no-cache .]
    ./rever/docker.xsh:    ![docker run -t @(env_args) @(mount_args) @(image) @(command)]
    

As result this statistics on the code that written by xonsh maintainers and contributors completely supports the XEP-2 changes where we use: $() to get arguments, $().str shortcut to rarely used use case of getting string, finally we merged !() ![] operators into $ and removed them.

@danmou
Copy link
Contributor

danmou commented Aug 21, 2023

@anki-code I like your proposal, especially as remembering the meaning of all the different command substitution operators is the most difficult part of using xonsh IMO. It doesn't look like there's been any progress or discussion for a few years, is there any plan to move forward with this proposal or is something blocking it?

@anki-code
Copy link
Member

I have no time to implement this but I still believe in this. There are no direct blockers.

@Matthieu-LAURENT39
Copy link
Contributor

This seems like a really good proposal.
One thing that isn't adressed, the .out attribute would contain the raw stdout, so wouldn't it make sense to have a .err attribute as well for raw stderr?

@FlyingWombat
Copy link

FlyingWombat commented Apr 2, 2024

Great proposal. Xonsh's current 5+ expansion operators seem redundant.

Question: Do we need a separate operator for uncaptured subprocess?

I'm new to xonsh, so I'm not familiar with all of the use cases, but at first glance it doesn't look like there is much use for an uncaptured subprocess operator.

In python mode, if you don't need to use the value of a $() subprocess result, then just don't use it and it will be removed when the scope ends. If you need to see stdout in the terminal, then print($(...)) should work, right?

In subproc mode, there doesn't seem to be a purpose to uncaptured subprocesses at all, over just making separate statements foo; bar.

@anki-code
Copy link
Member

Question: Do we need a separate operator for uncaptured subprocess?

The xonsh tutorial shows different use cases. In addition I would say that ![] is needed to explicitly tells that this is subprocess command.

gforsyth pushed a commit that referenced this issue May 2, 2024
### Motivation

* To have an ability to manage the output format added ``$XONSH_SUBPROC_OUTPUT_FORMAT`` to switch the way to return the output lines. Default ``stream_lines`` to return text. Alternative ``list_lines`` to return the list of lines. Also supported custom lambda function.
* Additionally the [proposal to change default behavior](#5377 (comment)) for a single line case.
* Closes #3924 as soft solution.

### Before

```xsh
mkdir -p /tmp/tst && cd /tmp/tst && touch 1 2 3

$(ls)
# '1\n2\n3\n'

id $(whoami)
# id: ‘pc\n’: no such user: Invalid argument

du $(ls)
# du: cannot access '1'$'\n''2'$'\n''3'$'\n': No such file or directory

ls $(fzf)
# ls: cannot access 'FUNDING.yml'$'\n': No such file or directory
```

### After

```xsh
mkdir -p /tmp/tst && cd /tmp/tst && touch 1 2 3

$XONSH_SUBPROC_OUTPUT_FORMAT = 'list_lines'

$(ls)
# ['1', '2', '3']

[f for f in $(ls)]
# ['1', '2', '3']

id $(whoami)
# uid=501(user) gid=20(staff)

du $(ls)
# 0 1
# 0 2
# 0 3

ls $(fzf)
# FUNDING.yml

# etc
mkdir -p /tmp/@($(whoami))/dir
cat /etc/passwd | grep $(whoami)
```

### Notes

* It will be good to improve parser for cases like `mkdir -p /tmp/$(whoami)/dir`. PR is welcome!
* I named the default mode as `stream_lines` (instead of just `stream` or `raw`) because in fact we transform raw output into stream of lines and possibly reduce the length of output ([replacing `\r\n` to `\n`](https://github.com/xonsh/xonsh/blob/c3a12b2a9c2958ce2a8e97a59b41f030f24cb45c/xonsh/procs/pipelines.py#L380-L383)). May be some day we need to add raw "stream" output format.
* Now anybody can implement bash `IFS` behavior in [bashisms](https://github.com/xonsh/xontrib-bashisms).

---------

Co-authored-by: a <1@1.1>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@anki-code
Copy link
Member

To notify everybody from this epic thread with good news.
The PR that introduces soft changes around treating the ending of the subprocess output has merged now. See #5377.
We did a great job in this thread. Thanks to all here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
10 participants