Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow a deeper integration with nushell #5088

Open
fdncred opened this issue Jul 15, 2023 · 38 comments
Open

allow a deeper integration with nushell #5088

fdncred opened this issue Jul 15, 2023 · 38 comments

Comments

@fdncred
Copy link

fdncred commented Jul 15, 2023

As a maintainer of nushell, I just want to say that we love the effort of uutils/coreutils. Good work y'all and please keep up this excellent work!

We love it so much that we really want to have a much tighter integration than just calling external executables. I myself, have started to experiment by using the uu_cp and uu_mv crates as internal nushell commands in this PR nushell/nushell#9463.

There are some rough edges that we'd need to work together on and I've tried to document different strategies in my PR. Without going into a lot of detail I'll just list a few things.

  • Nushell doesn't use clap so working with command line arguments and the uu Options structs is interesting. Right now, I've kind of hacked it together where nushell checks params, then uu checks params, which isn't ideal, but it's a start.
  • I've experimented with calling a command's uumain directly and passing args to it (and that can work), or rolling our own type of uumain. The benefit with rolling-our-own is we get the expected nushell "pretty" error messages. Error handling in general will need help and I'm not quite sure what to do here, specifically in regards to passing back a Span so we know where the error is.
  • Streaming. Some uu_ commands stream output as output is generated, this is great and we'd need a way to hook into this output in order to consume it as nushell data structures that generate the nushell tabular or record output. I'm not really sure what is appropriate here.

In general, as you might guess, some things in coreutils may need to be exposed publicly, like command Options, to enable others, like nushell, to utilize them, and other things may need a different level of surgery in nushell or in coreutils.

One thing is for sure, we'd love to partner up and find a way forward because, let's be honest, dealing with some of these file-io and other commands is just difficult, and it would just be great to have one source of truth that works cross-platform like the uutils/coreutils.

@amtoine
Copy link

amtoine commented Jul 15, 2023

as a maintainer of Nushell myself, i totally support the initiative of @fdncred here 😊

@sylvestre
Copy link
Sponsor Contributor

Sure, let's work together to make this happen!

@dmatos2012
Copy link
Contributor

As a small small contributor of both nushell and coreutils, Id love to see this happen and would like to help in making this possible. :)

@fdncred
Copy link
Author

fdncred commented Jul 18, 2023

@dmatos2012 The easiest way for you (or anyone else that is interested) to help is to start playing with my nushell PR listed above. If you checkout my PR and compile it with cargo build --features=nuuu then cp and mv will be using the uutils version. You can play around and see that some mv syntax does not work like mv a b c test/. It works in uutils mv but nushell doesn't allow it. This type of testing starts to reveal changes that will need to be made somewhere.

The current task is to really to find a way forward. e.g. How do we handle spans, which nushell requires for errors? How do we handle parameter parsing? How do we handle streaming.

I started with uu_cp because there's not a lot of output with copying. I figured it may be an easy place to start. Then I went to uu_mv, which quickly revealed syntax that I think just needs to be added to nushell.

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Aug 2, 2023

I've been thinking about this in the background and looked at nushell/nushell#9463. Here's what I think we should do:

Create a list of suitable utils for nu

Not all utils are suitable for nu. For example, using the uutils ls over the nu ls would probably be a bad idea. I like the start of mv and cp. In general, I think the utils that are suitable are:

  • relatively complex,
  • have very little output or the output is not the main function of the util,
  • do not rely on bash-isms.

For example:

  • Suitable: b2sum (and similar), cp, dd, rm
  • Unsuitable: test, ls, sort, true, false

There is a grey area of utils which have some significant output, but are relatively simple, such as wc. This could potentially be supported by creating some wordcount crate extracted from uutils to be shared between the projects.

Refactor those utils to expose their functionality

The nu cli interface is better than anything uutils could ever provide, so nu should do the argument parsing in my opinion. I think using uumain (like in nushell/nushell#9463) is probably also more complicated than necessary and might introduce subtle bugs. I think we should instead provide an interface that exposes the internals of the util directly:

pub fn cp(opts: CpOptions) -> UResult<()> { ... }

This can then be used by nu.

Ensure that uutils does not exit or panic

uutils has largely been written to be run as a separate process and has used std::process::exit quite freely in the past. We've been reducing this dramatically, but it's not yet fixed. There's also the occasional panic. Both of these would exit the nu process, which is of course unacceptable. The refactored utils should be checked to ensure that does not happen.

Non-goal for now: streaming output

If we focus on utils that are not focused on output, we can ignore the issue of generating streamed structured output from uutils. I'd love to tackle this in the future, but it would be very complicated. So, I think it's easier to focus on the quite large collection of other utils first.

I'd love to hear what you think of this!

@fdncred
Copy link
Author

fdncred commented Aug 2, 2023

Thanks @tertsdiepraam. I appreciate your thoughtfulness. Here's some of my thoughts.

I generally agree with this approach, which is why I started with cp. JT likes to say, "crawl, walk, run" which is basically what your proposal is. Start with the low hanging fruit and then look at the next lowest, and so on. So, yes, please. :) I'd love to move forward with this. I'm happy to just start with cp as a proof-of-concept. It seemed pretty straight forward.

Another thing that I started playing around with was porting your tests into nushell. As far as I'm concerned, the more tests, the better, especially when bridging platforms like this. It was not trivial for the couple tests I moved over. I'd like to somehow just dump the tests your community has created for individual commands. We may have to think a little more on that front.

I do know that with the mv command, there is some functionality that didn't work out-of-the-box, or said another way, nushell may have to change the way parameters are setup now for mv. IIRC, it was mv file file file dir. Nushell should support that, but the current code did not, so we'll have to investigate if this is a simple change or a more complex one.

Exposing uutils Options is a must. Agreed. If nushell is handling the command line parsing then we'll have the spans for "pretty" errors.

Exit/Panic - Ya, that would be bad but it seems like it's a known issue that we can continue to work on.

ls - I'm very keen to use your ls and stream the output somehow with a callback Fn or something. We have permission issues sometimes and those are just a pain to figure out because it's hard to reproduce them. I'm fine with waiting though. Another one that we have permission issues with is cd. I'm not sure I saw a uutils cd command though.

I'll take a look through the commands and throw a list together of ones that I'd consider the "crawl" stage, the post it here, but like I said, let's just plan on cp being first and let's just do one command. We just had a release, 0.83.1, so we have ~3 weeks until the next release 0.84.0. It would be cool to have uu_cp inside of nushell by then. If we can't do it by then, that's no biggy though.

@tertsdiepraam
Copy link
Member

Excellent! I love the "crawl, walk, run" saying!

Let's start with cp then. I think that it would be quite easy on the uutils side. We could just expose:

pub struct Options {

and

fn copy(sources: &[Source], target: &TargetSlice, options: &Options) -> CopyResult<()> {

and the types used in Options.

(Source and TargetSlice are just aliases for PathBuf and Path respectively)

I can add some additional documentation to those types and maybe clear up the naming a bit and then the rest would be up to you.

@fdncred
Copy link
Author

fdncred commented Aug 3, 2023

I took a swing at bucketing the commands into crawl, walk, run here https://docs.google.com/spreadsheets/d/1MP9V8BTtcy_3DIUT8025yTcGrjkIJ-ZKFT0uvJFw0Lo/edit#gid=0

I guessed at a bunch of them and some may be bucketed wrong, but it's a start.

@tertsdiepraam
Copy link
Member

I've made a PR to expose most of the necessary types of cp for use in nu: #5152. Feedback is welcome!

@dmatos2012
Copy link
Contributor

Now, building up on my comment on the PR, I have some pain points specifically for us to solve here. @fdncred. You can check my WIP branch building on yours (very ugly code now), to see some of updates.

  • Particularly pain point for me was parsing as GNU wants. I agree with @tertsdiepraam that we should not behave exactly like GNU. Specific problems:

  • Some GNU flags have only a short flag, but no long flag, and with it their behavior changes, so makes it very difficult to implement. For instance, I made it so that cp -u a b would work by having u being a NamedArg. So in this case -u is None and it uses a default. But if you do cp -u none a b ,-u is none, but if you do cp a b -u it wouldnt work since Nu would complain about "String is required for argument".

  • And as far as I am concerned, I dont think nu allows for no long flag, but only short flag, so we have to get around that as well.

  • GNU is quite specific with how it uses flags. So short flag, cp -u a b implies an specific update, update=older, but if you use the long flag, you use it differently cp --update=older a b (where = is required). Right now in my impl I have just parsed and made it work by doing cp --update older a b and cp -u older a b (without the = for simplicity). Therefore, I think we need to make a decision on how we are going to support the arguments, and what is allowed or not. Otherwise, it would become a headache to get the behavior exactly using the nushell parser.

  • Finally there is the problem of setting some Options that are private in uutils, such as Attribute that would be quite cumbersome for us to implement, so we need to keep working with them to make this fully functional and painless for both.

Once we define those, I think I could try to port more tests and improve the code, and add some docs.

Hopefully we can keep working together, to get close to a fully functional nushell using the cp from uutils :)

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Aug 12, 2023

Excellent observations! Thanks @dmatos2012!

Pedantic correction: in GNU, -u and --update are not separate but only the long version allows an optional argument. I think optional arguments in general might be a bit difficult to support in the nu parser.

One workaround is to create flags like --update-older, --update-all & --update-none and then make -u the short version of --update-older. Though that is a bit ugly I suppose.

An example of a short argument that does not have a long counterpart at all is -H. But there we could just come up with a long name for that.

As for Attributes, I think the solution there is to extract all of the parsing to a single function in uutils and expose that to avoid duplication.

For the first version, I think we should simplify as much as possible (e.g. no attributes or other complicated flags).

Edit: Maybe SyntaxShape could help for --update? Would something like this work?

SyntaxShape::OneOf(vec![
    SyntaxShape::Keyword(
        vec![b'='],
        SyntaxShape::String,
    ),
    SyntaxShape::Nothing,
])

@fdncred
Copy link
Author

fdncred commented Aug 12, 2023

@dmatos2012 Great write up. Thanks for your effort! I like your "ugly" code, mine would've looked worse. 😆

I agree with your points and @tertsdiepraam has some interesting thoughts for work-arounds. One comment I'll throw in is no one said we have to support every parameter in cp (or any other command). We can approach cp with the crawl, walk, run strategy as well, if we want to. In my MVP comment in the code, I was only supporting a handful of parameters. Having said that, I would prefer to support all the parameters, but I'd like to leave our options open for now.

As you brought up, we also need to determine how compatible we want to be with the GNU flags. I'm not sure myself. I want people to feel at home but nushell is already pretty different. I'm not convinced it's important that we be 100% compatible with GNU flags. I think if we have a way to do the same thing but with a different way, perhaps like --update-older, then that's not a bad way to go. We'd probably just want to include something in the help to explain that.

I'm going to try to get some other nushell core-team members to chime in here about these points. This is the "devil in the details" I was sure that we'd come across. There will be other problems with other commands like mv, I'm sure.

You can't tell but I'm bubbling over with excitement for this integration!

@tertsdiepraam
Copy link
Member

You can't tell but I'm bubbling over with excitement for this integration!

So am I!

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Aug 19, 2023

Quick question just to be sure: what platforms should nu support? I see Linux, Mac and Windows in the nushell CI. Are there other platforms that are officially supported as well? If there's any required platforms that we don't test with uutils, we should add those to our CI.

@fdncred
Copy link
Author

fdncred commented Aug 19, 2023

We're committed to MacOS, Linux, and Windows. However, we support people in other os's if we can. You can see that we have quite a few release assets to support people https://github.com/nushell/nushell/releases.

The only other thing I really see talked about on Discord is Android operating systems. I'm not familiar what those are but we allow people to put compile time #[cfg()] points for android.

@tertsdiepraam
Copy link
Member

Alright, our CI already covers all the nushell platforms then. Thanks for the super quick info!

@fdncred
Copy link
Author

fdncred commented Sep 19, 2023

Today is nushell 0.85.0 release day that will have the coreutils cp command in it 🥳. The release is being finished up as I write this. If anyone is interested in helping out, we'd like to add mv, and rm next. Any takers?

@tertsdiepraam
Copy link
Member

I'm happy to help in about a week. In the meantime if someone else is reading this: feel free to pick this up!

@dmatos2012
Copy link
Contributor

I was already starting with 'rm' but also not available to keep working on it till the end of the month. 🏖️⛰️. But after, I will be happy to help as well :)

So if anyone else wants to take it, go ahead.

@fdncred
Copy link
Author

fdncred commented Sep 20, 2023

I'd really love to get a "good amount" of commands added for the next release. I'm hoping for 1 a week, so that would be 4, but I'd love to see more.

@tertsdiepraam
Copy link
Member

Maybe to give this more visibility, you could open an issue per util you want to integrate? I think we could even make them good-first-issues. I think mv and rm are fine, but I'll need to check whether they are as easy to do as cp. What other utils did you have in mind?

@fdncred
Copy link
Author

fdncred commented Sep 20, 2023

Just looking at the crawl, walk, run spreadsheet, here's the commands IMO should come first. After 3, we can play around with the order.

Staying on the file-io theme

  1. cp - done
  2. mv add uutils/coreutils mv command into nushell nushell/nushell#10446
  3. rm (is there any value to rm + rmdir?) add uutils/coreutils rm command into nushell nushell/nushell#10447
  4. mkdir
  5. touch
  6. mktemp
  7. I really wish there was a cd command in coreutils. we have recurring problems with ours around permissions.
  8. df
  9. du (we already have a du, should we do something different? I kind of like the nushell version better right now)
  10. whoami

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Sep 20, 2023

  • rmdir can be useful to try to remove directories but fail if they are non-empty. I think this can be mostly useful in scripts. The --parents flag can be useful too.
  • I think cd might even be impossible to implement separately, because cd needs access to the internals of the shell. In any case it's not part of the GNU coreutils, so it wouldn't be in this repository.
  • df and du should probably be structured in nu, so they would be more difficult than the others, but interesting to figure out.
  • whoami should be very easy 😃

@tertsdiepraam
Copy link
Member

I've opened issues on our side as well, we can use those to track whether uutils is ready to be integrated and your issues for the actual integration.

@KAAtheWiseGit
Copy link
Contributor

KAAtheWiseGit commented Sep 25, 2023

After looking through the files I got the following questions:

  • mkdir and touch use options as a module with string literals. Would this need to be refactored into a struct?
  • touch does most of its work inside the main function (so does whoami, but the latter is literally two lines). I am unsure how it should be approached
  • mktemp has most of the documentation already in place. Would it be fine to just add it in the missing places and make exec, dry_exec, and Options/Params public?

Depending on the answers, I am ready to take on mktemp or mkdir, if no one else is doing them at the moment.

@tertsdiepraam
Copy link
Member

  • mkdir: the function to expose is exec. nushell doesn't need the string literals.
  • touch: needs significant refactoring. All the argument handling should be in uumain, which should indeed be refactored into some Options struct. Even for whoami I extracted the logic to a separate function for clarity.
  • mktemp: I think the dry_exec and exec should be combined with a flag for nushell. Otherwise, looks like it's mostly ready to go. I think Options should be exposes, Params should probably be internal.

If you're gonna tackle one, open an issue for it and we can talk specifics!

@fdncred
Copy link
Author

fdncred commented Oct 6, 2023

@tertsdiepraam when you have a sec, can you please provide an update on the cp bugs that were being addressed for nushell? I looked and can't seem to find the PR where the fix was. My memory is failing right now but it was something about time/date stamps. Maybe it was with the --update?

I just ask because we're getting closer to release time and we need that issue fixed before we release on 17 Oct. We'd like to have that landed well in advance of that, if it's not already landed, so that we can put in the uutils version of cp as the primary and have old-cp as a backup.

@tertsdiepraam
Copy link
Member

tertsdiepraam commented Oct 7, 2023

@fdncred Sure! As far as I know, there was this issue:

There is a workaround for that issue in place, but there's some work to be done on making mac more robust in general:

I don't recall any issues about datetime or --update though?
I tried these queries on both repos and nothing else seems to have popped up:

We have some work on compiling cp without clap: #5358 which is also for nushell.

Here's some issues and PRs on other utils:

I'll make a GitHub project for these so we can track them.

Edit: The project is here: https://github.com/orgs/uutils/projects/5/

@fdncred
Copy link
Author

fdncred commented Oct 15, 2023

@tertsdiepraam @sylvestre Is it possible to get a coreutils release today/tomorrow maybe? We're getting close to the nushell release date of Oct-17 and need a release in order to ship the uu_cp crate as our official cp command.

If it's not possible, we'll need to roll back this PR nushell/nushell#10678 and wait for our next release 4 weeks later.

Just let me know please. Sorry to be a pain.

@sylvestre
Copy link
Sponsor Contributor

let me try now :)

@tertsdiepraam
Copy link
Member

No need to be sorry! It makes perfect sense. We agreed to do the release before Oct 17, but I'm not sure when @sylvestre was planning to cut the release exactly.

@sylvestre
Copy link
Sponsor Contributor

Still working on it but blocked on
the remote server responded with an error (status 429 Too Many Requests): You have published too many updates to existing crates in a short period of time. Please try again after Sun, 15 Oct 2023 15:30:45 GMT or email help@crates.io to have your limit increased.

@sylvestre
Copy link
Sponsor Contributor

fixed https://crates.io/crates/coreutils
working on the release notes now

@fdncred
Copy link
Author

fdncred commented Oct 15, 2023

Thanks so much @sylvestre. ❤️ We really appreciate your support.

@sylvestre
Copy link
Sponsor Contributor

Thanks
Here are the release notes:
https://github.com/uutils/coreutils/releases/tag/0.0.22
and I am mentioning your work ;)

@dmatos2012
Copy link
Contributor

This is awesome! Sorry that I have been out of it for a bit, but where are we at now with the supported commands? I know the next commands were mv,rm,whoami but have they been merged, or there are still commands that can be taken? @fdncred

@fdncred
Copy link
Author

fdncred commented Oct 16, 2023

@dmatos2012 none of them have been merged yet. I think we only have a PR for whoami atm. You can see issues here https://github.com/nushell/nushell/issues?q=is%3Aissue+is%3Aopen+coreutils+label%3Acoreutils-uutils. Feel free to make more issues with the commands you'd like to see. Also, feel free to start with mv or rm. I don't see anyone working on them yet. We're trying to keep up and use the coreutils-utils label in PRs and Issues.

@fdncred
Copy link
Author

fdncred commented Jan 16, 2024

For anyone watching this, I've added a list of commands that we'd like to integrate.
nushell/nushell#11549

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

No branches or pull requests

6 participants