Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variable values should be escaped #11

Closed
Taitava opened this issue Sep 9, 2021 · 8 comments
Closed

Variable values should be escaped #11

Taitava opened this issue Sep 9, 2021 · 8 comments
Labels
cornerstone Significant features that will offer concrete benefits. Not quick to implement. enhancement New feature or improving change
Milestone

Comments

@Taitava
Copy link
Owner

Taitava commented Sep 9, 2021

The problem

A quote from README.md:

Note that for the sake of simplicity, there is no escaping done for variable values. If you have a command and a quoted string parameter like mycommand "{{clipboard}}", it might break if your clipboard content contains " quote characters, because those are inserted into the command as-is. Your command might end up looking like this: mycommand "Text pasted from clipboard that contains a " character." I am open to discussion how to best implement variable value escaping in the future.

An even worse example is this: mycommand {{clipboard}}, with clipboard happening to contain stuff like some text > C:\path\to\some\important\file. In this case, the command might end up accidentally overwriting some file without a conscious meaning from the user.

Terminology

I mess up with terms. When I say escaping, I mean encoding. These might be different things (I don't know), so I need to clarify these terms later. But for now, you'll see me using both words escaping and encoding for the same thing.

The current state

The current behavior of the plugin is to not do any escaping for variable values, with mainly two reasons:

  • The user needs to know exactly what the plugin is going to substitute a variable with. If a variable value contains e.g. " or ' characters, then the user should be able to expect that the result will be " or ', not e.g. \" or \' unless the user has consciously opted for escaping.
  • I do not yet know all the details that are needed for proper shell command escaping.

Optional escaping

How would the user denote that they want to turn escaping on?

  • A global setting that escapes all variable values in all commands? Maybe I'd like to have more fine grained control over this.
  • A per command setting to escape all variable values in a particular command? No because user might want to use different kind of escaping/encoding in different places of a command.
  • A flag that could be written into a variable name to tell that it should be escaped unescaped? E.g. !, so that the value of {{!clipboard}} would be unescaped, and the value of {{clipboard}} would be escaped. (Edit 2021-10-30: Changed the logic the other way around, so that ! denotes not to escape).

What to escape

How to know which characters need to be escaped? There are multiple ways parameters can be enclosed/quoted (I don't know a correct term) as shell command arguments:

  • Double quotes ""
  • Single quotes ''
  • If doing URL encoding, then at least these: : / ? & = and space
  • Anything else?

One complicated way would be to inspect the particular command, and check that if a variable is inside quotes, check which quotes (single/double) are used and escape those in the variable's value. This might be too complex to accomplish, and too prone for errors. We need to have an escaping solution that is simple by its logic and that users can trust. I don't want to have a "black box" that I don't know if it works or not. Also this technique cannot reliably determine if the user wants to perform URL encoding.

That's why there needs to be some explicit way for the user to define what characters will be escaped and what not.

URL encoding

Sometimes URLs might be used in commands, for example with Obsidian URI or when launching a web browser to open a specific web page or service, such as search from the internet using keywords from a variable.

For URL encoding, there are two different options regarding where the variable value will be actually placed in a URL:

  • URI query parameter values, e.g. https://duckduckgo.com/?q={{selection}}. Here {{selection}} should be encoded according to application/x-www-form-urlencoded (space is encoded with +).
  • Everything else, e.g. https://github.com/Taitava/{{selection}} (to go to a specific repository under my GitHub account). Here {{selection}} should be encoded according to the plain Percent-Encoding (space is encoded with %20).

Major parts of the above text are copied 2021-09-16 from https://stackoverflow.com/a/4744917/2754026 . The source talks about URL encoding in PHP, but the same principles can be applied to development in JavaScript/TypeScript too. I just need to find the corresponding encoding JS functions.

Optionality and "What to escape" combined

In "Optional escaping" I suggested to use ! as a flag which would indicate to escape the variable's value. It's nice and short. But it does need a way to indicate what kind of escaping the user wants. So I want to explore more ideas on how to define escaping. Here are some:

  • ! in front of the variable name, e.g. {{!clipboard}} (the same as I already suggested).
  • Make different encodings to be variables themselves, e.g. {{url_encode:file_path:relative}}. Here {{url_encode}} variable would resolve another varible that is passed to it: {{file_path:relative}}. You could even chain different encodings: {{escape_double_quotes:url_encode:file_path:relative}} or apply the same encoding twice if needed: {{escape_double_quotes:escape_double_quotes:file_path:relative}}. Downsides: 1) these are really long, but I could come up with shorter variable names, 2) No way to pass parameters to the encoding algorithm, e.g. what characters to escape if an algorithm can take an optional list of characters?

I need to look into for example different templating languages, scripting languages etc. and see how they handle escaping definitions conveniently. This is certainly a non unique problem, so it must have countless of solutions already.

Different platforms

There are three operating systems that this plugin supports:

  • Windows
  • Macintosh
  • Linux

I have used commandline on Linux and a little bit on Windows. I don't have experience of Macs, but I need to find out how Mac commandline works, too.

Help wanted

If you are well experienced with command line stuff (Windows, Linux, or Mac), I'd like to hear your tips about what characters need to be escaped and in what situations. You can comment in this issue. Thank you, your support is highly appreciated! 🙂

If something in this issue is unclear, please comment and I'll try to clarify/rewrite things better.

@Taitava Taitava added enhancement New feature or improving change help wanted Extra attention is needed labels Sep 9, 2021
@Taitava Taitava added this to the 1.0.0 milestone Sep 11, 2021
@Taitava
Copy link
Owner Author

Taitava commented Sep 21, 2021

I've been reading a blog article about how to escape shell command parameters in Node.js. The article starts by writing an example Node.js program that takes user input and passes it to a shell command. Then the article tells what kind of security implications this may have.

For the Shell commands plugin, security risks are a bit different than for a web application. However, I'm not a security expert, so I can't possibly know all security risks. But here is some differences:

  • Node.js applications are normally facing towards the internet (= accessible from the internet by everyone, or at least a group of people). Obsidian runs on localhost, and is accessible only by people a user lets access their computer, and programs that they let run on their computer. Due to this difference, a user can be trusted a bit more when they are permitted to define their own shell commands.
  • While intentional security breaches should be quite rare in this situtation, accidental breaches are still a reality. That's when {{variables}} come in, as a user does not always know what the variables will be containing later.

The actual point of my post begins here. I'll write here notes from the blog article that may be beneficial to this problem. The most useful part of the article for this case starts from the heading Input validation or output sanitization?.

Quotes:

  • The most commonly used metacharacters are: & ; ` ' \ " | * ? ~ < > ^ ( ) [ ] { } $ \n \r

    Note that other special characters might exist too!

  • In cases where input data may contain metacharacters, preventing command injection attacks requires proper escaping of those characters before passing them to the shell for execution. Doing so yourself is extremely difficult, and it is best to use a trusted and well-tested library. One such library is shell-quote, which can be installed using the npm package manager and used to format the entire command: [...] const command = quote(["git", "log", "--oneline", file]); [...]

The example has nicely split the command "git log --oneline " + file into an array consisting of the base command and arguments. However, Shell plugins cannot use this kind of approach, as here a user is allowed to define the entire command, not just a tiny part of it. In theory, it would be possible to split the user defined command string into an array, using a whitespace (space, tab, linebreak) as a separator. I don't want to do it, because it's complex (need to take into account quotes around spaces, escaped spaces \ and maybe stuff I don't even know). It would not even solve a case like --parameter={{variable_value}} because there is no space to use as a separator, and I'm not sure if it would work to split after the = equal sign?

Also, it's not possible to handle something like this only bit by bit: mycommand "A quoted parameter that has a {{variable}} in the middle.". I guess that dividing it to an array like this: [mycommand "A quoted parameter that has a , {{variable}}, in the middle."] would not go well, because when the {{variable}} part would be quoted, the process would not know that the part will be surrounded by double quotes, and should therefore escape all "s in {{variable}} to \".

The bottom line is that while I did not yet find a solution just by reading, this has given me some small ideas, and most importantly now I can try the Shell-quote library for Node.js. I will explore it and see if it can provide a solution I haven't seen yet.

@Taitava
Copy link
Owner Author

Taitava commented Oct 16, 2021

I just read an interesting article about escaping strings in Bash, written a decade ago. The article simply suggests the following:

  • Put a backslash in front of every non-alphanumeric character

    That means: every character that is not a number or a letter. Edit 2021-10-30: Underscore _ does not need escaping, at least according to the comments here.

  • Do not wrap the string in single quotes or double quotes.

I'll need to try this. Obviosly, this is only for Bash, and I'll need to find out if the same would work for PowerShell too. I've lost my hope for CMD already, so I'm probably not gonna do anything for it.

My current thoughts about escaping:

  • An exclamation mark ! to flag that a variable should be escaped (e.g. {{!clipboard}}) is IMO problematic. It would make a user responsible to remember to use it. I'd rather use this kind of flag to say: do not escape this value. These kinds of unescaped variable values are often called raw in programming.
  • I'd like to change ! to some other character to indicate that a variable value should be unescaped/raw. But then again, e.g. in PHP's Blade template language, raw values are denoted with {!! $variable !!}](https://stackoverflow.com/a/29254016/2754026). I'm really not sure about the correct character. I want it to be descriptive and logical.
  • I don't want to create any kind of settings that affect whether variable values will be escaped or not. A user should be able to clearly see from the shell command text itself, will variables values be escaped or not, i.e. will the command be safe or dangerous. I'm also considering scenarios where users share their shell commands on the internet. A user needs to be able to see if a shell command that is unkonown to them is designed to use escaped or unescaped variable values.
  • Most shell commands are designed to use escaped variable values. Shell commands that rely on variables to submit dynamic commands, are in margin.

I hope that I will soon be able to bring up something concrete.

@Taitava
Copy link
Owner Author

Taitava commented Oct 16, 2021

To continue my previous post, I want to add that this escaping obviosly needs the knowledge about which shell is used. It seems that I have not yet started a discussion about an ability to choose the shell that is used for executing shell commands. I'll create a discussion for that some time later. There's already a discussion for making it possible to have operating system specific versions of shell commands (#38). I need to combine the two abilities somehow, to be able to have OS specific and shell specific versions of commands. When these are implemented, I think I can start to think about creating escaping rules for different shells, starting from Bash and PowerShell.

@Taitava Taitava moved this from Long term, planning to Actively working on in Roadmap (old) Oct 30, 2021
@Taitava Taitava modified the milestones: 1.0.0, 0.7.0 Oct 30, 2021
Taitava added a commit that referenced this issue Oct 30, 2021
@Taitava
Copy link
Owner Author

Taitava commented Oct 30, 2021

Finally progress in this matter. I think I've managed to implement escaping for Windows PowerShell, at least according to a very quick test I made. I've also implemented escaping for Bash and Zsh (both for Linux and Mac), but I haven't tested them. For Bash I think it should work. For Zsh, I just made it to use the same escaping that bash will use - I hope Bash and Zsh are not too different in this regard. (Edit: they are a bit different. See my next comment).

Nonetheless, I'll still test these things more to be certain, rather than just guess that things work.

  • More testing on Windows and PowerShell.
  • Test Bash escaping on Linux.
  • Test Zsh escaping on Linux (I need to install Zsh).
  • Create tests in the test suite that will cover escaping.

And I'll repeat here just in case: All {{variable}} values will be escaped by default - it may cause your current shell commands to behave differently when you upgrade to 0.7.0. If you wish to use a variable without escaping its value, use it like {{!variable}}, so add ! in front of the variable name. Usually, it's the best option to use escaping = so think carefully if you are about to use !.

And last but not least, another thing to repeat: Windows's CMD will not get any kind of escaping. No matter if you use {{variable}} or {{!variable}}, if you have selected CMD as your shell in the settings, your variables will not be escaped. It's sad, but I haven't found a way to do this kind of escaping.

@Taitava Taitava changed the title Variable values should be escaped (but how?) Variable values should be escaped Oct 30, 2021
Taitava added a commit that referenced this issue Oct 30, 2021
@Taitava
Copy link
Owner Author

Taitava commented Oct 30, 2021

Note to self: Zsh seems to use % as an escape character, instead of \ which is used by Bash. I just need to figure out if there's any other practical differences.

Prompt: bash sets the prompt (mainly) from PS1 which contains backslash escapes. Zsh sets the prompt mainly from PS1 which contains percent escapes.

Source: https://apple.stackexchange.com/a/361957

Edit 2021-10-31: Nope, my quick test told me that Zsh uses \ as the escape character, not %. I think I have misunderstood the context of the above quotation, I guess Prompt means something else than typing commands to a shell.

@Taitava
Copy link
Owner Author

Taitava commented Nov 6, 2021

I hope this starts to be in a good shape. I'm considering this done, at least at the moment. Let's see if beta testing will bring up any issues.

@Taitava
Copy link
Owner Author

Taitava commented Nov 12, 2021

One thing that I think I haven't mentioned here in GitHub, but that will be important about escaping: At least in Bash, due to special characters escaping, it's recommended to place {{variables}} outside of quotes ", because quotes can make the escaping characters (\ in Bash) to become accidentally visible, because shells do not parse most of the escapes if they happen to be in quotes.

Examples:

  • Assume that {{clipboard}} contains word & another.
  • echo "a-{{clipboard}}-b" -> Bash outputs: a-word\ \&\ another-b (\ slashes should not be present in output).
  • echo "a-"{{clipboard}}"-b" -> Bash outputs: a-word & another-b (correct output, no \ slashes because the variable is outside of quotes ").

So this allows a simple usage of variables: If you don't need to add static text before or after a variable, you can just do: echo {{clipboard}} without quotes ", and you don't need to worry about spaces or anything else, the variable's value will appear to the shell in a way that the whole value will be interpreted as a single argument/parameter, even if it contains spaces or other special characters.

@Taitava Taitava removed the help wanted Extra attention is needed label Nov 15, 2021
@Taitava Taitava mentioned this issue Nov 24, 2021
14 tasks
@Taitava
Copy link
Owner Author

Taitava commented Nov 25, 2021

Done & released - at last.

@Taitava Taitava closed this as completed Nov 25, 2021
@Taitava Taitava moved this from Actively working on to Released lately in Roadmap (old) Nov 25, 2021
@Taitava Taitava added the cornerstone Significant features that will offer concrete benefits. Not quick to implement. label Jul 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cornerstone Significant features that will offer concrete benefits. Not quick to implement. enhancement New feature or improving change
Projects
Roadmap (old)
Released lately
Development

No branches or pull requests

1 participant