git completions: Some alias names don't map to variable names #4147
Comments
FWIW, as I noted in issue #4048, storing each abbreviation as a separate variable requires the same sort of encoding and decoding to and from a valid variable name. I think we can do better than hex-encoding every character. Since most of the characters will be alphanumeric, it would be better to leave those alone for readability. Also, using |
It actually works - I've taken this from |
The problem with that approach is it's encoding the individual bytes of the wide char and is thus ambiguous. Try this:
Notice that they produce the same output. There's a reason Also, what is with the escaped single quote? Without it printf complains, as expected, that each letter is an invalid argument for |
Compare my previous example with this version. Put the following in a file and source it:
|
Huh? I'm afraid I'm missing something here: \u6161 encodes as e685a1 in UTF-8, and no byte of a multi-byte UTF-8 sequence can be confused with a single-byte code point, simply because every such byte has its high bit set. (Of course this would be different with other encodings.) I mean, I'm seeing the same result, I just don't get why. Of course I don't have to understand this precisely to implement it, since setting LC_ALL works, but I'd still like to know.
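The byte-level claim in this comment can be checked with a short sketch. Python is used here only as a neutral illustration outside fish, and `utf8_hex` is a hypothetical helper name, not anything from the fish codebase.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of a string as lowercase hex."""
    return text.encode("utf-8").hex()


encoded = "\u6161".encode("utf-8")

print(utf8_hex("\u6161"))               # e685a1
# Every byte of a multi-byte UTF-8 sequence has its high bit set,
# so none of them can be mistaken for an ASCII byte such as 0x61 ("a").
print(all(b >= 0x80 for b in encoded))  # True
print(utf8_hex("aa"))                   # 6161
```

At the byte level, then, U+6161 and "aa" stay distinct; any collision has to come from encoding something other than the UTF-8 bytes.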
I'll have to check in the PR that introduced it in __fish_urlencode, but it's not a feature specific to our printf - my gnu printf does it as well. It seems to be a feature of the |
Okay, see https://www.gnu.org/software/coreutils/manual/html_node/printf-invocation.html:
|
I shouldn't have said "it's encoding the individual bytes of the wide char". It's encoding each wide char as an int. Since \u6161 is equivalent to \x6161 the
Notice the odd number of hex digits. |
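As a rough illustration of why formatting each character as an integer is ambiguous, here is a small Python sketch. It is not fish code: `codepoint_hex` is a hypothetical helper that mimics applying `%X` to each character's code point and concatenating the results, and the sample inputs are chosen for this sketch rather than taken from the elided example.

```python
def codepoint_hex(text: str) -> str:
    """Format each character's code point with %X and concatenate."""
    return "".join(f"{ord(ch):X}" for ch in text)


print(codepoint_hex("aa"))       # 6161  (two characters, U+0061 twice)
print(codepoint_hex("\u6161"))   # 6161  (one character, U+6161)
print(codepoint_hex("\u0161"))   # 161   (odd number of hex digits)
```

Two different inputs map to the same output, and the digit count per character is not fixed, so the result cannot be decoded reliably without a delimiter or a fixed width.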
That clears that up, thanks! Soo... this should work if we just One more thing:
How exactly do we check that? Currently I'm using |
Try this function:
For example:
|
Doing the inverse of my |
This
What exactly is the purpose of these |
Why is that a problem? Digits are legal in variable names.
Yes, good catch. That's why unit tests that cover all the corner cases are needed for code like this.
It's so that the only place we deal with the individual UTF-8 bytes is that block. Everywhere else fish works with its normal encoding. This minimizes potential problems. It's also a good argument for doing this in C++. |
That just means the "\d" is redundant 😄.
Yes.
A normal Here's what I came up with (after adding your underline bit because that's neat):
function fish_escape_varname -a name
    # Hex-encode any characters that cannot appear in a variable name
    set -l escaped_name
    for c in (string split '' -- $name)
        if string match -qr '\W' -- $c
            set -lx LC_ALL C
            # Hex-encode the character, surround with "_"
            # to improve readability.
            set escaped_name "$escaped_name"(printf '_%X_' "'"$c)
        else
            # Double any literal "_".
            if test "$c" = "_"
                set escaped_name "$escaped_name"__
            else
                set escaped_name "$escaped_name$c"
            end
        end
    end
    echo $escaped_name
end |
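For comparison, the scheme being discussed (keep alphanumerics, double a literal "_", hex-encode everything else between underscores) can be sketched outside fish. This is a minimal Python sketch under stated assumptions, not the fix that went into fish: the names `escape_varname` and `unescape_varname` are hypothetical, only ASCII letters and digits are treated as safe, and each unsafe character is written as the hex of its UTF-8 bytes.

```python
def escape_varname(name: str) -> str:
    """Encode an arbitrary string using only [A-Za-z0-9_]."""
    out = []
    for ch in name:
        if ch == "_":
            out.append("__")  # double any literal underscore
        elif ch.isascii() and ch.isalnum():
            out.append(ch)    # safe characters stay readable
        else:
            # Hex of the character's UTF-8 bytes, wrapped in underscores.
            out.append("_" + ch.encode("utf-8").hex().upper() + "_")
    return "".join(out)


def unescape_varname(escaped: str) -> str:
    """Invert escape_varname by scanning left to right."""
    out = []
    i = 0
    while i < len(escaped):
        if escaped[i] != "_":
            out.append(escaped[i])
            i += 1
        elif escaped.startswith("__", i):
            out.append("_")
            i += 2
        else:
            end = escaped.index("_", i + 1)  # closing underscore
            out.append(bytes.fromhex(escaped[i + 1:end]).decode("utf-8"))
            i = end + 1
    return "".join(out)


print(escape_varname("co-author"))  # co_2D_author
print(escape_varname("a_b"))        # a__b
print(escape_varname("\u6161"))     # _E685A1_
```

Because the bytes are encoded explicitly, the result does not depend on the locale, and "aa" and U+6161 no longer collide.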
That doesn't do the right thing:
This is a good example of why I cringe when I see us trying to deal with UTF-8 in fish script. |
Huh... interesting.
Yeah. The alternative is doing it in C++. I'd suggest an option to |
Agreed. I think that's a better idea than a new string subcommand. And infinitely better than doing it in fish script. Do you want to implement it or shall I? |
Honestly, I hate C++'s string handling, so if I may pass the buck... |
The PR I just created for issue #4150 will make it trivial to fix this issue. |
As reported on gitter by @zx8, git aliases can contain characters that variables can't, which causes the git completions to spew errors.
This requires us to encode the alias name. The easiest thing that I came up with is
set -l escaped_alias (printf '%02X' "'"(string split '' -- $alias))
, which hex-encodes the alias characters.
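What that one-liner computes can be approximated in a few lines of Python. This is a sketch, not the fish implementation: `hex_encode_alias` is a hypothetical name, and the alias `co-author` is made up for illustration.

```python
def hex_encode_alias(alias: str) -> str:
    """Format every character's code point as at least two uppercase hex digits."""
    return "".join(f"{ord(ch):02X}" for ch in alias)


encoded_alias = hex_encode_alias("co-author")
print(encoded_alias)            # 636F2D617574686F72
# Only hex digits remain, so the result is usable inside a variable name.
print(encoded_alias.isalnum())  # True
# For ASCII-only aliases the fixed two-digit width makes this reversible.
print(bytes.fromhex(encoded_alias).decode("ascii"))  # co-author
```

The output is always a legal variable-name fragment, at the cost of readability, and the two-digit width only holds while every character is below U+0100.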