Use 'CommandLineStringSplitter.Instance.Split' parse bug. #1740

Open
Tracked by #1758
treenewlyn opened this issue May 19, 2022 · 5 comments
Labels
bug Something isn't working

Comments

@treenewlyn

treenewlyn commented May 19, 2022

var raw = "\"dotnet publish \\\"xxx.csproj\\\" -c Release -o \\\"./bin/latest/\\\" -r linux-x64 --self-contained false\"";

var array = System.CommandLine.Parsing.CommandLineStringSplitter.Instance.Split(raw).ToArray();

Console.WriteLine(array.Length);

The parsed array is:

  1. "dotnet publish \"
  2. "xxx.csproj\ -c Release -o \./bin/latest\ -r linux-x64 --self-contained false"

But the expected array is:

  1. "dotnet publish \"xxx.csproj\" -c Release -o \"./bin/latest\" -r linux-x64 --self-contained false"
@jonsequitur
Contributor

Can you explain a bit more about what you're trying to do and why this output is your expectation? Also, examples using precise strings without the C# escaping would be a little clearer, I think.

For context, the CommandLineStringSplitter is intended to reproduce the way command line input to a .NET console app is split into the args array that gets passed to Main.

Let's use a Program.cs containing this to verify the behavior this is designed to reproduce:

foreach(var arg in args)
{
    Console.WriteLine(arg);
}

Your raw variable contains the following actual, unescaped string:

"dotnet publish \"xxx.csproj\" -c Release -o \"./bin/latest/\" -r linux-x64 --self-contained false"

Running the above program from the command line in PowerShell (keeping in mind that these examples will differ in other shells) with that string produces this output:

dotnet publish \
xxx.csproj\ -c Release -o \./bin/latest/\ -r linux-x64 --self-contained false

So that looks like it's working as designed, but at least in this example, it's probably not what you're really looking for.
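
To check what the escaped C# literal really contains before reasoning about how it will be split, it can help to print it and compare it with a verbatim string. The following is a minimal sketch that uses only the base class library; the variable names `escaped` and `verbatim` are illustrative and not part of the original report.

```csharp
using System;

// The literal from the original report, written with regular C# escaping.
string escaped = "\"dotnet publish \\\"xxx.csproj\\\" -c Release -o \\\"./bin/latest/\\\" -r linux-x64 --self-contained false\"";

// The same text as a verbatim string: backslashes are literal and "" stands for one double quote.
string verbatim = @"""dotnet publish \""xxx.csproj\"" -c Release -o \""./bin/latest/\"" -r linux-x64 --self-contained false""";

// Prints the actual characters, i.e. the unescaped string shown above.
Console.WriteLine(escaped);

// Prints True: both literals denote the same sequence of characters.
Console.WriteLine(escaped == verbatim);
```

Printing the value first makes it easier to separate C# string escaping from the quoting rules that the splitter applies afterwards.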

@treenewlyn
Author

treenewlyn commented May 20, 2022

using System.CommandLine;
using System.CommandLine.NamingConventionBinder;

var rootCommand = new RootCommand("A set command.");
rootCommand.Name = "SET";
rootCommand.AddArgument(new Argument()
{
    Name = "key",
    ValueType = typeof(string),
    Description = "A string"
});

rootCommand.AddArgument(new Argument()
{
    Name = "value",
    ValueType = typeof(string),
    Description = "A string"
});

rootCommand.Handler = CommandHandler.Create<SetCommand>(cmd =>
{
    return cmd.InvokeAsync();
});

while (true)
{
    Console.Write("> ");
    var line = Console.ReadLine();
    if (line == null || (line = line.Trim()).Length == 0) continue;
    if (line == "exit") break;

    try
    {
        await rootCommand.InvokeAsync(line);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex);
    }
}

class SetCommand
{
    public string Key { get; set; } = null!;

    public string? Value { get; set; } = null!;

    public Task<int> InvokeAsync()
    {
        Console.WriteLine("Key: {0}", this.Key);
        Console.WriteLine("Value: {0}", this.Value);
        return Task.FromResult(0);
    }
}

So, when I call SET text abc, it works. But when I want to set a JSON value, it does not work.

Input

text abc
say Hello\"
json {\"a\":1}
json "{\"a\":1}"
json {"a":1}

Output

> text abc
Key: text
Value: abc
> say Hello\"
Key: say
Value: Hello\
> json {\"a\":1}
Key: json
Value: {\a\:1}
> json "{\"a\":1}"
Unrecognized command or argument 'a\:1}'.

Description:
  A set command.

Usage:
  SET <key> <value> [options]

Arguments:
  <key>    A string
  <value>  A string

Options:
  --version       Show version information
  -?, -h, --help  Show help and usage information



> json {"a":1}
Key: json
Value: {a:1}

How can I include a " character in the value argument?

@jonsequitur
Contributor

This is complicated and hard to talk clearly about. 😅

Going back to the previous example and taking System.CommandLine out of the picture for a moment, the following would work for PowerShell when starting your app (i.e. for the args values passed to Main):

> json '"{\"a\":1}"'

Here's what's happening:

  • PowerShell treats the text inside the single quotes as literal text
  • Main receives the string with the double quotes intact, which is valid JSON, in the args array.

If you were to now pass that args array to e.g. rootCommand.InvokeAsync(args), the CommandLineStringSplitter never even gets called, because the split has already happened before Main. (CommandLineStringSplitter is typically only used in testing or when calculating completions).

But, since you're building more of a REPL-style interaction, this shell escaping won't affect the value you get from Console.ReadLine. You'll get the exact string back, including the double and single quotes. When it's passed to CommandLineStringSplitter.Split, that method assumes this is command line input and tries to treat the quotes as delimiters for the command line, but that's not what they represent inside this JSON block. Since you know it's JSON, you might consider an alternative way to parse it, because otherwise your users will have to escape the quotes inside the JSON, which is not intuitive.
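
One possible shape for such an alternative, as a rough sketch only: split the line at the first run of whitespace and keep everything after it untouched as the value. `SplitKeyAndRawValue` is a hypothetical helper, not part of System.CommandLine, and it assumes the REPL only ever takes a key followed by a single value. The JSON check at the end is optional and uses `System.Text.Json`.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: returns the first whitespace-delimited word as the key
// and the rest of the line, with its quotes left untouched, as the raw value.
static (string Key, string RawValue) SplitKeyAndRawValue(string line)
{
    var trimmed = line.Trim();
    var index = trimmed.IndexOfAny(new[] { ' ', '\t' });
    if (index < 0) return (trimmed, string.Empty);
    return (trimmed[..index], trimmed[(index + 1)..].TrimStart());
}

// The C# literal below is the line: json {"a":1}
var (key, rawValue) = SplitKeyAndRawValue("json {\"a\":1}");
Console.WriteLine($"Key: {key}");        // Key: json
Console.WriteLine($"Value: {rawValue}"); // Value: {"a":1}

// Optional: confirm that the value is valid JSON before using it.
using (var document = JsonDocument.Parse(rawValue))
{
    Console.WriteLine(document.RootElement.GetProperty("a").GetInt32()); // 1
}
```

The resulting pair could then be passed on as an already-split array, for example rootCommand.InvokeAsync(new[] { key, rawValue }), so that the string splitter is not involved, as described above for the args array. The trade-off is that ordinary command-line quoting is no longer available for the value.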

@treenewlyn
Author

treenewlyn commented May 20, 2022

OK, I see. Here is the code I ended up with:

    /// <summary>
    /// Represents a command-line parser.
    /// </summary>
    public static class CommandLineParser
    {
        private static bool TryReadFirstChar(this StringReader reader, out char c)
        {
            var i = reader.Read();
            if (i == -1)
            {
                c = char.MinValue;
                return false;
            }
            else
            {
                c = (char)i;
                if (char.IsWhiteSpace(c)) return reader.TryReadFirstChar(out c);
                return true;
            }
        }

        private static IEnumerable<char> ParseToken(StringReader reader)
        {
            if (!reader.TryReadFirstChar(out var c)) yield break;

            var isQueteString = false;
            var qc = char.MinValue;
            if(c is '=' or ':')
            {
                yield break;
            }
            else if (c is '\"' or '\'')
            {
                isQueteString = true;
                qc = c;
                if (reader.Peek() == -1)
                {
                    throw new InvalidDataException("Invalid quote in the string.");
                }
            }
            else
            {
                yield return c;
            }
            int i;
            while (true)
            {
                i = reader.Read();
                if (i == -1) break;
                c = (char)i;
                if (isQueteString)
                {
                    var pi = reader.Peek();
                    if (pi == -1) throw new InvalidDataException("Invalid quote in the string.");

                    var peek = (char)pi;
                    if (peek == qc)
                    {
                        reader.Read();
                        if (c == '\\')
                        {
                            yield return peek;
                        }
                        else
                        {
                            yield return c;
                            yield break;
                        }
                    }
                    else if (c == '\\' && peek == '\\')
                    {
                        reader.Read();
                        yield return peek;
                    }
                    else
                    {
                        yield return c;
                    }
                }
                else
                {
                    if (char.IsWhiteSpace(c) || (c is ':' or '=')) yield break;
                    yield return c;
                }
            }
        }

        /// <summary>
        /// Parses the specified command line.
        /// </summary>
        /// <param name="commandLine">The command line.</param>
        /// <returns>A list of command-line arguments.</returns>
        public static IEnumerable<string> Parse(string commandLine)
        {
            if (string.IsNullOrWhiteSpace(commandLine)) yield break;
            commandLine = commandLine.Trim();
            using var reader = new StringReader(commandLine);
            do
            {
                var chars = ParseToken(reader).ToArray();
                if (chars.Length == 0) continue;
                if (chars.Length == 1 && (chars[0] is ':' or '=')) continue;
                var arg = new string(chars);
                yield return arg;
            } while (reader.Peek() != -1);
        }
    }

xUnit tests:

        [Fact]
        public void AllTest()
        {
            Assert.Equal(new string[] { "text", "abc" }
            , CommandLineParser.Parse("text abc"));

            Assert.Equal(new string[] { "text", "Hello\"" }
            , CommandLineParser.Parse("text Hello\""));

            Assert.Equal(new string[] { "text", "Hello\"" }
            , CommandLineParser.Parse(" \t text   Hello\" \t  "));
        }

        [Fact]
        public void QueteTest()
        {
            var args1 = "\"{\\\"a\\t\\\":1}\"";
            Assert.Equal(new string[] { "text", "{\"a\\t\":1}" }
            , CommandLineParser.Parse("text " + args1).ToArray());
        }


        [Fact]
        public void Quete2Test()
        {
            var args1 = "'{\"a\\t\":1}'";
            Assert.Equal(new string[] { "text", "{\"a\\t\":1}" }
            , CommandLineParser.Parse("text " + args1).ToArray());
        }

        [Fact]
        public void SetTest()
        {
            Assert.Equal(new string[] { "a", "b", "c", "d", "e", "f", "g", "h" }, CommandLineParser.Parse("a=b c =d e = f g= h").ToArray());
            Assert.Equal(new string[] { "a", "b", "c", "d", "e", "f", "g", "h" }, CommandLineParser.Parse("a:b c :d e : f g: h").ToArray());
            Assert.Equal(new string[] { "a", ":b" }, CommandLineParser.Parse(" a= ':b'").ToArray());
        }

@jonorossi

If you were to now pass that args array to e.g. rootCommand.InvokeAsync(args), the CommandLineStringSplitter never even gets called, because the split has already happened before Main. (CommandLineStringSplitter is typically only used in testing or when calculating completions).

Thanks @jonsequitur for the insight.

I'm not using CommandLineStringSplitter, but I am passing JSON into an application and was running into the same issues. I determined that passing a single-quoted string that contains double quotes results in PowerShell stripping the double quotes before the C# application even gets its args, while cmd does not. I'll just have to use cmd to run this command in my application.
