Skip to content

curl 8.12.1 cygwin test 582/583 hang infloop ssh-agent using all CPU #16437

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
BrianInglis opened this issue Feb 22, 2025 · 12 comments
Closed

curl 8.12.1 cygwin test 582/583 hang infloop ssh-agent using all CPU #16437

BrianInglis opened this issue Feb 22, 2025 · 12 comments

Comments

@BrianInglis
Copy link
Contributor

BrianInglis commented Feb 22, 2025

I did this

Running *Cygwin autoconfig/automake, with no Windows dirs in PATH, make check hangs about tests 582/583, not finding icacls, and ssh-agent pegging system until killed.
Hints or pointer to docs about how to dig down into those test inputs and outputs gratefully received: been trying with limited success for years!

I expected the following

Normal tests 582-3 execution and result.
See attached log.
curl-8.12.1-cygwin-test-582-583-hang-infloop-ssh-agent.log

If uname -o == Cygwin, POSIX chmod or setfacl should be used as under any other POSIX, UNIX, or Linux system, as Windows icacls ACLs may be incompatible with Cygwin's emulation of POSIX ACLs using Windows ACLs and vice-versa (especially Cygwin's use of NULL SID first to block everything, which File Explorer whines about disliking, despite being perfectly legal in Windows ACLs, which File Explorer does not properly support).

Cygwin's POSIX network and file I/O emulation using Windows may be incompatible with Windows network and file I/O without changes, although curl, libcurl, and programs built with them are compatible with other POSIX systems, but may not be with Windows builds.

Note the weird line endings on some output: no longer supported in most Cygwin programs, other than some random character in data, to be compatible with scripts and programs under POSIX, UNIX, and Linux.

curl/libcurl version

curl 8.12.1 (x86_64-pc-cygwin) libcurl/8.12.1 OpenSSL/3.0.15 zlib/1.3.1 brotli/1.1.0 zstd/1.5.6 libidn2/2.3.7 libpsl/0.21.5 libssh2/1.11.0 nghttp2/1.61.0 libgsasl/2.2.1 OpenLDAP/2.6.9
Release-Date: 2025-02-13
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns ldap ldaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp ws wss
Features: alt-svc AsynchDNS brotli gsasl GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM PSL SPNEGO SSL threadsafe TLS-SRP UnixSockets zstd

operating system

$ uname -srvmo
CYGWIN_NT-10.0-19045 3.5.7-1.x86_64 2025-01-29 19:46 UTC x86_64 Cygwin

@bagder bagder added SCP/SFTP Windows Windows-specific labels Feb 22, 2025
@bagder
Copy link
Member

bagder commented Feb 22, 2025

not finding icacls, and ssh-agent pegging system until killed.

Can't we just add a check for those tools and fail starting the server if missing?

@bagder bagder added the tests label Feb 22, 2025
@BrianInglis
Copy link
Contributor Author

Can't you just use the normal Unix commands, as nothing is missing, all the ~3K normal Unix tools, servers, and daemons are available, but are not being used for some strange reason in the Cygwin environment, rather than the Windows tools which may not be available, and typically will not DTRT in this case, but could be used for the Mingw64-x86_64 build using Cygwin cross-tools.

@bagder
Copy link
Member

bagder commented Feb 23, 2025

If you say so, sure, then I suppose someone should try doing that change in the script.

@BrianInglis
Copy link
Contributor Author

That's why I asked for hints or pointers to the locations of the relevant test scripts, inputs and outputs, which do not seem to be obviously connected to the specific tests, possibly due to indirection via perl runners, or other bits or pieces?
You may see it as obvious, but I am missing some connections? ;^>

@bagder
Copy link
Member

bagder commented Feb 23, 2025

I just grepped for "icacls" and found its use in tests/sshserver.pl, the script that fires up the ssh server for test purposes:

curl/tests/sshserver.pl

Lines 424 to 430 in 4c50998

if(pathhelp::os_is_win()) {
# https://ss64.com/nt/icacls.html
$ENV{'MSYS2_ARG_CONV_EXCL'} = '/reset';
system("icacls \"" . pathhelp::sys_native_abs_path(pp($hstprvkeyf)) . "\" /reset");
system("icacls \"" . pathhelp::sys_native_abs_path(pp($hstprvkeyf)) . "\" /grant:r \"$username:(R)\"");
system("icacls \"" . pathhelp::sys_native_abs_path(pp($hstprvkeyf)) . "\" /inheritance:r");
}

@vszakats
Copy link
Member

vszakats commented Feb 24, 2025

In this case, icacls is just not found, and thus, not launched, and that's it.
It's unlikely this by itself causes anything fatal, besides the log messages
icacls: not found.

But, since its job wasn't done, it's possible that SSH gets messed up without
those expected ACLs.

Agreed that under Cygwin, it'd be better to use its native tools, and a patch
with that effect would be welcome. (chmod is already used, but it was
insufficient to fix it at the time I made this work in CI. I wasn't aware of
setfacl, and it certainly sounds like something that may fix it. Disclaimer
it's also possible I missed something. These things are notoriously difficult
to address through remote CI machines.)

Ref: e53523f #14859

@BrianInglis
Copy link
Contributor Author

In this case, icacls is just not found, and thus, not launched, and that's it. It's unlikely this by itself causes anything fatal, besides the log messages icacls: not found.

But, since its job wasn't done, it's possible that SSH gets messed up without those expected ACLs.

Agreed that under Cygwin, it'd be better to use its native tools, and a patch with that effect would be welcome. (chmod is already used, but it was insufficient to fix it at the time I made this work in CI. I wasn't aware of setfacl, and it certainly sounds like something that may fix it. Disclaimer it's also possible I missed something. These things are notoriously difficult to address through remote CI machines.)

Ref: e53523f #14859

The ACL utilities are part of the base system package including the DLL and utilities; not often used for directory or file setup: useful if any directory ends up with no DACLs, which means files end up with no ACLs and inaccessible.

The fileutils ch... are part of coreutils which is in install category base - autoinstalled.

My gotos for server and client setup info are the /usr/bin/...config scripts, in this case
/usr/bin/ssh-host-config and /usr/bin/ssh-user-config, which normally use chown and chmod either numeric or symbolic.
Some scripts set the perms expected, then unset undesirable perms, then check only expected perms are left, in case Windows hierarchies, inheritance, policies interfere.

Cygwin CI is Scallywag, home grown?, originally available under Appveyor, and now also GHA, which most use, although some purists will not because of principles.

@vszakats
Copy link
Member

vszakats commented Feb 24, 2025

In this case, icacls is just not found, and thus, not launched, and that's it. It's unlikely this by itself causes anything fatal, besides the log messages icacls: not found.
But, since its job wasn't done, it's possible that SSH gets messed up without those expected ACLs.
Agreed that under Cygwin, it'd be better to use its native tools, and a patch with that effect would be welcome. (chmod is already used, but it was insufficient to fix it at the time I made this work in CI. I wasn't aware of setfacl, and it certainly sounds like something that may fix it. Disclaimer it's also possible I missed something. These things are notoriously difficult to address through remote CI machines.)
Ref: e53523f #14859

The ACL utilities are part of the base system package including the DLL and utilities; not often used for directory or file setup: useful if any directory ends up with no DACLs, which means files end up with no ACLs and inaccessible.

The fileutils ch... are part of coreutils which is in install category base - autoinstalled.

My gotos for server and client setup info are the /usr/bin/...config scripts, in this case /usr/bin/ssh-host-config and /usr/bin/ssh-user-config, which normally use chown and chmod either numeric or symbolic. Some scripts set the perms expected, then unset undesirable perms, then check only expected perms are left, in case Windows hierarchies, inheritance, policies interfere.

Cygwin CI is Scallywag, home grown?, originally available under Appveyor, and now also GHA, which most use, although some purists will not because of principles.

Well, our goal (my goal) was simply to make Cygwin work with good
performance (now 5-10 minutes vs an hour) and to cover as much
tests as possible. It wasn't a small effort.

We're using what's available, accessible and practical, with the official
Cygwin GHA action to install it. That's our baseline (im)purity.

The tools and scripts you mention sound nice, and knowing they exist
is nice, but turning that into a working commit (for me) takes many days
of trial and error on remote machines.

Turning our CI to a purist Cygwin env would probably take adding
PATH=/usr/bin to every job step? If you have a patch to try it and
make the SSH ACL fix work using Cygwin tools, I'm happy to try or
assist you in a PR.

@vszakats
Copy link
Member

vszakats commented Feb 25, 2025

Made an attempt to replicate the icacls effect with setfacl based on
the available documentation and some examples found on GitHub (there
is a total of 35 hits mentioning both tools in the same context.)

https://linux.die.net/man/1/setfacl
https://cygwin.com/cygwin-ug-net/setfacl.html
https://github.com/search?q=icacls+setfacl&type=code

Tried:

curl/tests/sshserver.pl

Lines 421 to 435 in d1440e2

# Make sure that permissions are restricted so openssh doesn't complain
system "chmod 600 " . pp($hstprvkeyf);
system "chmod 600 " . pp($cliprvkeyf);
if($^O eq 'cygwin' || $^O eq 'msys') {
# https://cygwin.com/cygwin-ug-net/setfacl.html
system "setfacl --remove-all " . pp($hstprvkeyf);
system "setfacl --modify u:" . $username . ":r " . pp($hstprvkeyf);
}
elsif($^O eq 'MSWin32') {
# https://ss64.com/nt/icacls.html
$ENV{'MSYS2_ARG_CONV_EXCL'} = '/reset';
system "icacls \"" . pathhelp::sys_native_abs_path(pp($hstprvkeyf)) . "\" /reset";
system "icacls \"" . pathhelp::sys_native_abs_path(pp($hstprvkeyf)) . "\" /grant:r \"$username:(R)\"";
system "icacls \"" . pathhelp::sys_native_abs_path(pp($hstprvkeyf)) . "\" /inheritance:r";
}

07014ba

That hit this:

setfacl: illegal acl entries
setfacl: illegal acl entries
setfacl: illegal acl entries
setfacl: illegal acl entries
setfacl: illegal acl entries
setfacl: illegal acl entries
setfacl: illegal acl entries
[...]
Permissions 0650 for '/cygdrive/d/a/curl/curl/bld/tests/log/6/server/curl_host_rsa_key' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Unable to load host key "/cygdrive/d/a/curl/curl/bld/tests/log/6/server/curl_host_rsa_key": bad permissions
Unable to load host key: /cygdrive/d/a/curl/curl/bld/tests/log/6/server/curl_host_rsa_key

https://github.com/curl/curl/actions/runs/13510332690/job/37749076435?pr=16465#step:12:1181
https://github.com/curl/curl/actions/runs/13510332690/job/37749089183?pr=16465#step:14:1254

What would be the correct call equivalent to icacls?

Ref: #16465

@vszakats
Copy link
Member

vszakats commented Mar 23, 2025

I tried setfacl again on Windows. Using setfacl.exe as shipped in the latest Git for Windows.

setfacl --remove-all curl_client_key, setfacl -x u:<group-id> curl_client_key, setfacl -x u:<me> curl_client_key don't seem to be doing anything. getfacl says "Not supported". Older version (from 2016) does work, but shows no change, just 0644. Just like ls -l and the AccessEnum tool from SysInternals.

Removing inheritance would be key, which is icacls curl_client_key /inheritance:r. There the effect is reflected in AccessEnum.

(Maybe the matching openssh.exe would see some change? Seems doubtful if no other tools do.)

Is there a way to do this with setfacls? Is setfacl supposed to be working at all on Windows ACLs?

The terse output of these tools, like "Not supported", "illegal acl entries" or no output at all (on "success"), makes it extremely difficult to figure out what's going on, even on a local machine.

So far my impression is that this isn't a curl issue, but a setfacl / Cygwin-layer one, unless this works differently in Cygwin. (AFAIU Cygwin, MSYS2 and Git for Windows are the same in this context)

@BrianInglis
Copy link
Contributor Author

BrianInglis commented Mar 23, 2025

You should see something like the below running setfacl, where the version depends on the Cygwin core package version:

$ setfacl --version
setfacl (cygwin) 3.5.7
POSIX ACL modification utility
Copyright (C) 2000 - 2025 Cygwin Authors
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

shown by:

$ uname -srvmo
CYGWIN_NT-10.0-19045 3.5.7-1.x86_64 2025-01-29 19:46 UTC x86_64 Cygwin

or:

$ tail /proc/version
CYGWIN_NT-10.0-19045 version 3.5.7-1.x86_64 (runneradmin@fv-az1564-334) (gcc version 12.4.0 (GCC) ) 2025-01-29 19:46 UTC

I use setfacl -b and chmod/chgrp/chown a lot on non-Cygwin downloaded files and folders to make them usable from Cygwin, and have a series of scripts like:

#!/bin/bash
# setacls.sh - set normal dacls and perms and strip ext acls

perms=a+r,u+w,go-w
dacl=u::rwx,g::r-x,o::r-x,d:u::rwx,d:g::r-x,d:o::r-x

d=()
f=()

for each in "$@"
do
  if [ -n "$each" ]; then
    if [ -f "$each" ]; then
      f+=("$each")
    elif [ -d "$each/" ]; then
      d+=("$each/")
    fi
  fi
done

if [ ! -z "$d" ]; then
    /bin/setfacl -m$dacl	"${d[@]}"
    /bin/setfacl -b		"${d[@]}"
    /bin/setfacl -m$dacl	"${d[@]}"
fi

if [ ! -z "$f" ]; then
    /bin/chmod -c $perms --	"${f[@]}"
    /bin/setfacl -b		"${f[@]}"
    /bin/chmod -c $perms --	"${f[@]}"
fi

used in conjunction with shell function:

# ls permissions
lsp ()
{
    local p w o dir;
    o='-n';
    dir='cmd /c dir /4 /a /-c /l /n /q';
    for p in "$@";
    do
        w=$(cygpath -am "$p");
        echo $o;
        o='';
        if [ -d "$p" ]; then
            lsattr -adl "$p" || attrib "$w";
            echo $o;
        fi;
        /bin/ls --color=auto -adlL "$p";
        echo $o;
        getfacl "$p" && icacls "$w" || $dir "$(echo "$w" | /bin/sed -e 's!/!\\!g')";
    done
}

to modify and show everything related to Cygwin directories and files.

The following patch was recently applied to Github Actions CI, run whenever a package build script or patch is checked into Cygwin git package repos:
CI: Remove inheritable permissions from working directory](https://inbox.sourceware.org/cygwin-patches/20250321140502.2122-2-jon.turney@dronecode.org.uk/)

@vszakats
Copy link
Member

vszakats commented Mar 23, 2025

Thanks @BrianInglis.

I re-tested locally with a 2018 installation of Cygwin (BTW, I wish there was a way to install Cygwin without access to an online Windows machine). Right away, its getfacl output was much more detailed than the forks tested earlier. setfacls -b also worked locally, so rebooted my previous CI attempt by just dropping chgrp (and any additional setfacl) and it seems to work as expected:
https://github.com/curl/curl/actions/runs/14023736699/job/39259043743?pr=16803

This, while using a pure PATH as you mentioned, and recommended in the action's README.
(edit: went a step further and excluded System32)

While there, I also moved Cygwin from C: to D:, saving noticeable time in the install step (sample size: 1 at 1m30s; but the longer install times (~2-3 minutes) were consistent before this patch.)

I wanted to move C:\cygwin-packages to D: too, but that isn't supported, with an open feature request since early 2023:
cygwin/cygwin-install-action#7
Is there a chance to add support for it as a configurable action input? PR → cygwin/cygwin-install-action#24

edit: This seemed unusually smooth, so I deleted all icacls and setfacl commands from runtests and everything continues to work in CI without them. I'm not sure what changed in the last 6 months that makes it work now out-of-the-box (in CI at least). (Can be anything from me overlooking something, runner image fixes, Cygwin/MSYS runtime fixes, openssh fixes, other changes that accidentally got around this issue.)

vszakats added a commit that referenced this issue Mar 24, 2025
Use the `PATH` `/usr/bin` to avoid any Windows system or 3rd-party tool
installed on the runner machine that may interfere with or add undesired
dependencies to the builds and tests.

Follow-up to d838d43 #16465
Ref: #16437

Closes #16814
pps83 pushed a commit to pps83/curl that referenced this issue Apr 26, 2025
To use a native Cygwin tool instead of the Windows `icacls`. It allows
running under Cygwin/MSYS without Windows system folders in the `PATH`.

Also: fix indentation and tidy up syntax of the `icacls` branch.

Note: As of this commit, these `setfacl` and `icacls` calls are not
necessary for a successful CI run. This includes OpenSSH for Windows
tests, that aren't run by default. Keep them anyway, because locally
they may be necessary depending on environment.

Reported-by: Brian Inglis
Fixes curl#16437
Ref: curl#16803
Closes curl#16465
pps83 pushed a commit to pps83/curl that referenced this issue Apr 26, 2025
Use the `PATH` `/usr/bin` to avoid any Windows system or 3rd-party tool
installed on the runner machine that may interfere with or add undesired
dependencies to the builds and tests.

Follow-up to d838d43 curl#16465
Ref: curl#16437

Closes curl#16814
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

3 participants