Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rsync globs behave differently in Nushell than in Bash #4609

Open
kubouch opened this issue Feb 23, 2022 · 5 comments
Open

rsync globs behave differently in Nushell than in Bash #4609

kubouch opened this issue Feb 23, 2022 · 5 comments
Labels
🐛 bug Something isn't working glob-expansion Specific behavior around file-system globbing with regular commands or `glob` quoting/expansion Issues related to string quoting and expansion of variable or glob patterns

Comments

@kubouch
Copy link
Contributor

kubouch commented Feb 23, 2022

Describe the bug

I used rsync -avhz name@server:/home/name/some/path/ destination --include="*/" --include="*.pth" --exclude="*" which did not fetch me only .pth files but all the files.

I need to elaborate on it more when I have the time, just wanted to log this in since there might be some misbehavior.

How to reproduce

  1. todo

Expected behavior

The rsync globs should work the same way in Nushell and in Bash.

Screenshots

No response

Configuration

key value
version 0.59.0
branch main
short_commit 41fa1ab
commit_hash 41fa1ab
commit_date 2022-02-21 15:46:19 +00:00
build_os linux-x86_64
rust_version rustc 1.58.1 (db9d1b20b 2022-01-20)
rust_channel stable-x86_64-unknown-linux-gnu
cargo_version cargo 1.58.0 (f01b232bc 2022-01-19)
pkg_version 0.59.0
build_time 2022-02-21 17:46:48 +02:00
build_rust_channel release
features dataframe, default, trash, which, zip
installed_plugins gstat

Additional context

No response

@kubouch kubouch added the 🐛 bug Something isn't working label Feb 23, 2022
@lukexor
Copy link

lukexor commented Feb 23, 2022

I'm not sure if this is just a glob issue or not. It seems like it's not properly passing parameters to rsync correctly.

I tried a minimal example and I still get different results from bash:

[15:03] v0.59.0 ❯ which rsync
╭───┬───────┬────────────────┬─────────╮
│ # │  arg  │      path      │ builtin │
├───┼───────┼────────────────┼─────────┤
│ 0 │ rsync │ /usr/bin/rsync │ false   │
╰───┴───────┴────────────────┴─────────╯

[15:01] v0.59.0 ❯ ls
╭───┬──────────────┬──────┬──────┬───────────────╮
│ # │     name     │ type │ size │   modified    │
├───┼──────────────┼──────┼──────┼───────────────┤
│ 0 │ someFile.txt │ file │  0 B │ 4 minutes ago │
│ 1 │ test.pth     │ file │  0 B │ 4 minutes ago │
╰───┴──────────────┴──────┴──────┴───────────────╯

[15:01] v0.59.0 ❯ rsync -avhz --include="test.pth" --exclude="someFile.txt" ./ dst
building file list ... done
created directory dst
./
someFile.txt
test.pth

sent 211 bytes  received 70 bytes  562.00 bytes/sec
total size is 0  speedup is 0.00

On bash:

~/dev/tmp [15:02] ❯ which rsync
/usr/bin/rsync

~/dev/tmp [15:02] ❯ rsync -avhz --include="test.pth" --exclude="someFile.txt" ./ dst
building file list ... done
created directory dst
./
test.pth

sent 150 bytes  received 48 bytes  396.00 bytes/sec
total size is 0  speedup is 0.00

My nushell version:

╭────────────────────┬──────────────────────────────────────────╮
│ version            │ 0.59.0                                   │
│ branch             │ main                                     │
│ short_commit       │ e1f98c1b                                 │
│ commit_hash        │ e1f98c1bfd16108e18641ef220090fa101057a77 │
│ commit_date        │ 2022-02-09 23:20:46 +00:00               │
│ build_os           │ macos-x86_64                             │
│ rust_version       │ rustc 1.58.1 (db9d1b20b 2022-01-20)      │
│ rust_channel       │ stable-x86_64-apple-darwin               │
│ cargo_version      │ cargo 1.58.0 (f01b232bc 2022-01-19)      │
│ pkg_version        │ 0.59.0                                   │
│ build_time         │ 2022-02-15 09:31:33 -08:00               │
│ build_rust_channel │ release                                  │
│ features           │ default, which                           │
│ installed_plugins  │                                          │
╰────────────────────┴──────────────────────────────────────────╯

@JCallicoat
Copy link

JCallicoat commented Mar 1, 2022

The problem seems to be that nu is passing the double quotes to rsync when the --exclude flag is using an = (using --debug=FILTER2 in rsync shows the rule that was created).

rsync --debug=FILTER2 --dry-run -avhz --exclude="b.txt" . dst/
[client] add_rule(- "b.txt")
sending incremental file list
[sender] pushing local filters for /tmp/foo2/
created directory dst
./
a.txt
b.txt

So it gets the exclude pattern including the quotes and b.txt doesn't match "b.txt" so it passes the filter.

Using the flag with = and no quotes works, or using quotes and a space:

rsync --debug=FILTER2 --dry-run -avhz --exclude=b.txt . dst/
[client] add_rule(- b.txt)
sending incremental file list
[sender] pushing local filters for /tmp/foo2/
[sender] hiding file b.txt because of pattern b.txt
created directory dst
./
a.txt
rsync --debug=FILTER2 --dry-run -avhz --exclude "b.txt" . dst/
[client] add_rule(- b.txt)
sending incremental file list
[sender] pushing local filters for /tmp/foo2/
[sender] hiding file b.txt because of pattern b.txt
created directory dst
./
a.txt

Bash and zsh both consume the quotes and just pass b.txt in all of these cases.

@lukexor
Copy link

lukexor commented Mar 1, 2022

@JCallicoat Seems to work for that example - but globs still seem to be an issue with or without the equals sign. In my tests it only seems to pick up one file that matches the glob instead of all.

[19:46] v0.59.0 ❯ ls
╭───┬───────────┬──────┬──────┬───────────────╮
│ # │   name    │ type │ size │   modified    │
├───┼───────────┼──────┼──────┼───────────────┤
│ 0 │ test1.txt │ file │  0 B │ 6 minutes ago │
│ 1 │ test2.txt │ file │  0 B │ 6 minutes ago │
│ 2 │ test3.md  │ file │  0 B │ 6 minutes ago │
╰───┴───────────┴──────┴──────┴───────────────╯

[19:46] v0.59.0 ❯ rsync -avvvhz --include *.txt --exclude *.md dst/
[client] add_rule(+ test1.txt)
[client] add_rule(- test3.md)
building file list ...
[sender] make_file(test2.txt,*,2)
done
server_recv(2) starting pid=5714
send_file_list done
send_files starting
recv_file_name(test2.txt)
received 1 names
recv_file_list done
get_local_name count=1 dst/
generator starting pid=5714 count=1
delta-transmission disabled for local transfer or --whole-file
recv_generator(test2.txt,0)
send_files(0, test2.txt)
test2.txt is uptodate
generate_files phase=1
send_files phase=1
recv_files(1) starting
recv_files(test2.txt)
recv_files phase=1
generate_files phase=2
send_files phase=2
send files finished
total: matches=0  hash_hits=0  false_alarms=0 data=0
recv_files phase=2
recv_files finished
generate_files phase=3
generate_files finished

sent 97 bytes  received 26 bytes  246.00 bytes/sec
total size is 0  speedup is 0.00

I don't use rsync enough to figure out why it's including test2.txt and not test1.txt but I suspect it's because the expansion of the command is:

rsync -avvvhz --include test1.txt test2.txt --exclude test3.md dst/

which seems like an unorthodox usage of rsync

@JCallicoat
Copy link

JCallicoat commented Mar 1, 2022

@lukexor Can confirm here. That seems to be the same issue with globs expanding in quoted strings you opened in #4631

Just running a little c program that prints out the arguments shows the different behaviors right now.

Quoted and equals passes through quotes literally:

./foo --include="*.txt" --exclude="*.c"
--include="*.txt"
--exclude="*.c"

Quoted and space expands glob to file list:

./foo --include "*.txt" --exclude "*.c"
--include
a.txt
b.txt
--exclude
foo.c

Both of those break rsync in different ways.

Unquoted and equals passes through glob literally:

./foo --include=*.txt --exclude=*.c
--include=*.txt
--exclude=*.c

This works correctly with rsync in my testing.

The different behaviors are pretty unexpected and non-intuitive.

Edit: FWIW, here's zsh in all three cases:

./foo --include="*.txt" --exclude="*.c"
--include=*.txt
--exclude=*.c
./foo --include "*.txt" --exclude "*.c"
--include
*.txt
--exclude
*.c
./foo --include=*.txt --exclude=*.c    
zsh: no matches found: --include=*.txt

@hustcer hustcer added the 0.60 label Mar 4, 2022
@hustcer hustcer added this to the v0.60.0 milestone Mar 5, 2022
@hustcer hustcer modified the milestones: v0.60.0, v0.61.0 Mar 18, 2022
@hustcer hustcer removed the 0.60 label Mar 18, 2022
@hustcer hustcer removed this from the v0.61.0 milestone Mar 26, 2022
@github-actions github-actions bot added the Stale used for marking issues and prs as stale label Jan 17, 2023
@sholderbach sholderbach added the glob-expansion Specific behavior around file-system globbing with regular commands or `glob` label Jul 9, 2023
@sholderbach sholderbach added quoting/expansion Issues related to string quoting and expansion of variable or glob patterns and removed Stale used for marking issues and prs as stale labels Aug 9, 2023
@sholderbach
Copy link
Member

The form without = should be fine as #4631 is solved but we need to check back if the --flag=value variants work or need work

IanManske pushed a commit that referenced this issue May 23, 2024
This PR is a complete rewrite of `run_external.rs`. The main goal of the
rewrite is improving readability, but it also fixes some bugs related to
argument handling and the PATH variable (fixes
#6011).

I'll discuss some technical details to make reviewing easier.

## Argument handling

Quoting arguments for external commands is hard. Like, *really* hard.
We've had more than a dozen issues and PRs dedicated to quoting
arguments (see Appendix) but the current implementation is still buggy.

Here's a demonstration of the buggy behavior:

```nu
let foo = "'bar'"
^touch $foo            # This creates a file named `bar`, but it should be `'bar'`
^touch ...[ "'bar'" ]  # Same
```

I'll describe how this PR deals with argument handling.

First, we'll introduce the concept of **bare strings**. Bare strings are
**string literals** that are either **unquoted** or **quoted by
backticks** [^1]. Strings within a list literal are NOT considered bare
strings, even if they are unquoted or quoted by backticks.

When a bare string is used as an argument to external process, we need
to perform tilde-expansion, glob-expansion, and inner-quotes-removal, in
that order. "Inner-quotes-removal" means transforming from
`--option="value"` into `--option=value`.

## `.bat` files and CMD built-ins

On Windows, `.bat` files and `.cmd` files are considered executable, but
they need `CMD.exe` as the interpreter. The Rust standard library
supports running `.bat` files directly and will spawn `CMD.exe` under
the hood (see
[documentation](https://doc.rust-lang.org/std/process/index.html#windows-argument-splitting)).
However, other extensions are not supported [^2].

Nushell also supports a selected number of CMD built-ins. The problem
with CMD is that it uses a different set of quoting rules. Correctly
quoting for CMD requires using
[Command::raw_arg()](https://doc.rust-lang.org/std/os/windows/process/trait.CommandExt.html#tymethod.raw_arg)
and manually quoting CMD special characters, on top of quoting from the
Nushell side. ~~I decided that this is too complex and chose to reject
special characters in CMD built-ins instead [^3]. Hopefully this will
not affact real-world use cases.~~ I've implemented escaping that works
reasonably well.

## `which-support` feature

The `which` crate is now a hard dependency of `nu-command`, making the
`which-support` feature essentially useless. The `which` crate is
already a hard dependency of `nu-cli`, and we should consider removing
the `which-support` feature entirely.

## Appendix

Here's a list of quoting-related issues and PRs in rough chronological
order.

* #4609
* #4631
* #4601
  * #5846
* #5978
  * #6014
* #6154
  * #6161
* #6399
  * #6420
  * #6426
* #6465
* #6559
  * #6560

[^1]: The idea that backtick-quoted strings act like bare strings was
introduced by Kubouch and briefly mentioned in [the language
reference](https://www.nushell.sh/lang-guide/chapters/strings_and_text.html#backtick-quotes).

[^2]: The documentation also said "running .bat scripts in this way may
be removed in the future and so should not be relied upon", which is
another reason to move away from this. But again, quoting for CMD is
hard.

[^3]: If anyone wants to try, the best resource I found on the topic is
[this](https://daviddeley.com/autohotkey/parameters/parameters.htm).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐛 bug Something isn't working glob-expansion Specific behavior around file-system globbing with regular commands or `glob` quoting/expansion Issues related to string quoting and expansion of variable or glob patterns
Projects
None yet
Development

No branches or pull requests

5 participants