Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with parsing and reporting SSH endpoint connection strings #5184

Open
chrisd8088 opened this issue Nov 14, 2022 · 1 comment · May be fixed by #4975
Open

Problems with parsing and reporting SSH endpoint connection strings #5184

chrisd8088 opened this issue Nov 14, 2022 · 1 comment · May be fixed by #4975
Assignees
Labels
Projects

Comments

@chrisd8088
Copy link
Contributor

chrisd8088 commented Nov 14, 2022

Describe the bug

Git LFS supports both "bare" SSH connection strings, of the form user@host:path, and SSH URIs of the form ssh://user@host/path, when parsing its configuration fields. However, we do not at present retain any record of this difference, and so we are not able to reconstruct the original input when reporting SSH endpoints in the git lfs env command.

We also do not report a non-default SSH port, if one was configured, in the output from that command, leading to confusion when the input configuration was a URI like ssh://example.com:1337/path and we report it as example.com:/path, which looks like we have dropped the port entirely.

Further, we specifically accept and parse a form of "bare" SSH connection string in which square brackets are used to enclose a host:port pair (and user name, if provided) and delineate it from the colon separator before the path. However, this differs from how Git parses connection strings containing square brackets, because it also allows for a format such as user@[host:port]:path instead of only [host:port]:path or [user@host:port]:path. We should, as far as possible, align our handling of both IPv6 addresses and special characters in SSH connection strings with how Git handles them. This may require that certain types of connection strings (e.g., those with IPv6 addresses) only be provided in the ssh:// URI format.

Lastly, although we accept absolute and relative paths in the "bare" form of an SSH connection string, these are reported by git lfs env using an HTTPS URL that somewhat obscures their significance. We should also ensure our test suite exercises this absolute path functionality with the pure SSH-based protocol and transfer adapter added in PR #4446.

To Reproduce

$ git init test
$ cd test

$ cat <<EOF >.lfsconfig
[lfs]
        url = user@example.com:foo/bar
EOF
$ git lfs env | grep example
Endpoint=https://example.com/foo/bar (auth=none)
  SSH=user@example.com:foo/bar

$ cat <<EOF >.lfsconfig
[lfs]
        url = ssh://user@example.com/foo/bar
EOF
$ git lfs env | grep example
Endpoint=https://example.com/foo/bar (auth=none)
  SSH=user@example.com:/foo/bar

$ cat <<EOF >.lfsconfig
[lfs]
        url = [user@example.com:1337]:foo/bar
EOF
$ git lfs env | grep example
Endpoint=https://example.com/foo/bar (auth=none)
  SSH=user@example.com:foo/bar

$ cat <<EOF >.lfsconfig
[lfs]
        url = ssh://user@example.com:1337/foo/bar
EOF
$ git lfs env | grep example
Endpoint=https://example.com/foo/bar (auth=none)
  SSH=user@example.com:/foo/bar

$ cat <<EOF >.lfsconfig
[lfs]
        url = user@example.com:/foo/bar
EOF
$ git lfs env | grep example
Endpoint=https://example.com//foo/bar (auth=none)
  SSH=user@example.com:/foo/bar

$ cat <<EOF >.lfsconfig
[lfs]
        url = user@example.com:../../foo/bar
EOF
$ git lfs env | grep example
Endpoint=https://example.com/../../foo/bar (auth=none)
  SSH=user@example.com:../../foo/bar

Expected behaviour

The output of the git lfs env command's Endpoint SSH subfield should reflect the initial input from the lfs.url configuration (so long as it is vaild) rather than a reconstructed version which elides specific details, such as port numbers.

When an absolute path is provided in a non-URI "bare" form of SSH connection string, we should possibly not report this using an HTTPS URL with an extra leading / character in the path, which has no effect in the HTTP context, and when a relative path is provided with leading ../ path elements, we should likely also not report this using an HTTPS URL. Instead, we should perhaps always report the Endpoint field using a pure SSH format when our configuration specifies an SSH connection string.

Both IPv4 and all forms of IPv6 addresses should be supported in SSH connection strings, both with and without custom port numbers.

System environment

The examples above were performed on macOS, but should behave the same way on any platform.

Output of git lfs env

git-lfs/3.2.0 (GitHub; darwin amd64; go 1.19.1; git b64865c0)
git version 2.33.0

Additional context

Our parsing of a "bare" SSH connection string allows a port to be specified using the form [user@host:port]:path, but not user@[host:port]:path. It is possibly idiosyncratic of Git LFS to expect square brackets to be used in SSH connection strings specifically to enclose host:port pairs. Git's parsing of square brackets in such string, by contrast, was introduced in commit git/git@356bece in 2005 to support IPv6 addresses, and was then updated in commits git/git@86ceb33 and git/git@6b6c5f7 to support the combination of a user name, IPv6 address, and port number. While retaining backward compatibility, we should endeavour to align our SSH connection string parsing and tests with Git's as far as possible.

We have no explicit Go tests or shell tests of IPv6 addresses, either with or without square brackets, or with IPv4 addresses, for that matter, except for a few tests using the ssh_remote() helper function. We should expand our test suite to cover these possibilities, including their use both with and without port numbers. We should also attempt to cover the same range of test inputs strings as seen in Git's t/t5603-clone-dirname.sh tests.

In commit a0190a6 of PR #4446, the commit description includes the following details regarding "bare" SSH connection strings and absolute paths with a leading / character:

Note that since we process the non-URL form of SSH remotes by converting
them to a URL and then parsing, let's strip off the leading slash when
we process that form, since there we do have the ability to distinguish
between absolute and relative paths.

The test in t/t-batch-transfer.sh which validates the git-lfs-transfer pure SSH protocol transfer adapter support only uses the relative path SSH connection string created by the ssh_remote() helper function (specifically an ssh:// URI). We should create additional tests which also utilize the SSH protocol transfer capabilities but with a range of connection strings including absolute paths, relative paths, port numbers, etc.

Note that adding tests which use a non-default SSH port for the test server requires making the lfs-ssh-echo test server also accept a -o SendEnv=GIT_PROTOCOL parameter because Git adds that option when we supply GIT_SSH_VARIANT=ssh to the git clone command, which we need to do because Git's simple SSH command variant does not support setting a port number with the -p option.

Also note that one challenge when supplying a relative path in such tests may occur on Windows because our tests run in a shell environment where we may need to use cygpath -m to convert the paths returned by pwd into full Windows paths, as we will need to construct a relative paths from the temporary copies of our test repositories which are valid for the executing versions of our test server binaries.

Finally, note that in the lfs-ssh-echo.go test helper program we should probably correct the value interpolated as the port number in the reply to be the argument following the -p option instead of whatever happens to be the third command line argument. And in the case where the -oControlMaster=auto option is passed as a single argument, we should perhaps only increment our argument offset value by 1.

@chrisd8088 chrisd8088 added the bug label Nov 14, 2022
@chrisd8088 chrisd8088 added this to Bugs in Backlog Nov 14, 2022
@chrisd8088 chrisd8088 linked a pull request Nov 14, 2022 that will close this issue
@chrisd8088 chrisd8088 self-assigned this Nov 14, 2022
@chrisd8088
Copy link
Contributor Author

As noted in #5664 (comment), Git LFS is not happy with an ssh:// connection string if it contains a colon followed by empty port number, e.g., ssh://example.com:/path.

$ cat <<EOF >.lfsconfig
> [lfs]
>         url = ssh://user@example.com:/foo/bar
> EOF
$ git lfs env | grep Endpoint
Endpoint=<unknown> (auth=none)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging a pull request may close this issue.

1 participant