Skip to content
This repository

Add safe_load #119

Closed
postmodern opened this Issue · 129 comments

22 participants

Postmodern Aaron Patterson Charlie Somerville Jordon Bedwell benmmurphy 7rans Nathan Zook Jakub Egor Homakov Dan Tao Lawrence Pit Bernard Lambeau HD Moore Jordan Thoms Fred Blasdel Rodrigo Rosenfeld Rosas Marc-André Lafortune cabo Kimcha Michal Papis Mike MacDonald Trevor Wennblom
Postmodern

In lieu of the recent Rails YAML RCE vulnerability, Psych should provide a safe_load equivalent method, that only loads Ruby primitives.

Aaron Patterson
Owner

Can you please be more specific about the desired behavior? Is it acceptable to load Symbols? Are subclasses of Ruby primitives considered "primitive"? If someone sets an instance variable on a string, then dumps / loads that string, should the instance variable survive?

Can you please demonstrate how to use YAML dump / load to execute arbitrary code? AFAIK it depends on objects other than YAML.

Postmodern
  • Symbols, sure they are primitives. In order to prevent a DoS via Symbols, perhaps there should be a configuration mechanism to specify exactly what types/tags should be allowed.
  • Subclasses of primitives are no longer primitives. In safe mode the custom class could be ignored and the primitive class used instead.
  • Instance variables in primitives should not survive.

Can you please demonstrate how to use YAML dump / load to execute arbitrary code?

rails_rce.rb

AFAIK it depends on objects other than YAML.

Which is why I believe there should be a safe_load method or a configuration mechanism that restricts YAML to only loading primitives or explicitly allowed Classes.

Aaron Patterson
Owner

@postmodern does the rails_rce.rb script depend on code other than YAML? I'm confused.

Aaron Patterson
Owner
  • Symbols, sure they are primitives. In order to prevent a DoS via Symbols, perhaps there should be a configuration > mechanism to specify exactly what types/tags should be allowed.
  • Subclasses of primitives are no longer primitives. In safe mode the custom class could be ignored and the primitive > class used instead.
  • Instance variables in primitives should not survive.

Given these points, why not just use JSON?

Postmodern

JSON does not support Symbols, does not support non-String-keyed Hashes and does not have special formatting such as:

summary: |
  bla bla bla bla bla bla
  bla bla bla bla

    indented stuff

  bla bla bla bla

Is there any reason YAML should not have a secure mode?

Postmodern

@tenderlove rails_rce.rb uses !ruby/hash:MyClass and ActionController::Routing::RouteSet::NamedRouteCollection.

@charliesome's PoC used ActiveSupport::Deprecation::DeprecatedInstanceVariableProxy, ERB, Gem::Requirement and Rack::Session::Cookie::Base64::Marshal.

These exploits would not work if the YAML parser was configured to only allow primitives (aka safe_load).

Charlie Somerville

+1. Only primitives that YAML can deserialize without using ! directives should be allowed.

No symbols IMO. I can't see a use case where you would want to allow symbols but still restrict other deserialization.

Jordon Bedwell

:+1: on what @postmodern and @charliesome are saying. I really wish it supported a way for it to skip over everything and raise what isn't in the specs by default, and certainly like what @charliesome said about skipping over symbols too... the symbols part is kind of important to me because it would make it easier to keep a consistent configuration file to API without having to run to_s on all the keys myself before I create an indifferent Hash.

Postmodern

Instead of having a big case/when statement of all the tags in to_ruby.rb, Psych could have a table of parser methods/procs. This way Psych could be configured to run in "safe mode". When an unknown tag is encountered, it could skip over it (and it's children) or raise an exception.

Jordon Bedwell

There is no other option other than raise an exception so it :bomb: and somebody knows they did bad.

Aaron Patterson
Owner

@postmodern so YAML didn't execute the code, but Rails did?

Is there any reason YAML should not have a secure mode?

I don't particularly care whether it does or not. We just have to be specific about what it means. Also, we need someone to write the patch. :-)

Postmodern

@tenderlove YAML just happened to call #[]= on an instance of the Class we specified. ActionController::Routing::RouteSet::NamedRouteCollection just so happens to pass the key value from #[]= down to module_eval. We could have used other non-Rails Classes that somehow pass our data into instance_eval or send. CVE-2013-0156 is not really specific to Rails, other applications that accept arbitrary YAML are also vulnerable; such as chef-server which uses extlib.

Please see the Rails PoC write up for a full walk through.

Aaron Patterson
Owner
benmmurphy

Marshal.load has this problem as well and so would any serialization mechanism that does the following with arbitrary Foo classes and arbitrary field values.

obj = Foo.allocate
obj.instance_variable_set(field, value)
Jordon Bedwell

@benmmurphy it's in Marshal's nature to be like this by simple definition of what it is, YAML however, it's not.

7rans

I think safe load can have a very simple definition: It simply loads the YAML as if no ! tags are present. The only exceptions would be the YAML standard types which are not the implicit types (e.g. omap).

Psych can add a yaml_tag method so a program can see what the tag is. This would allow a program to look at the tag and decide what to do with it if need be, or it can just ignore the tag if there is no need. Having the tag available is important though. In fact, a fundamental design principle of YAML is the ability to "round-trip" --a YAML document that is loaded and then re-emitted should be substantively the same. So we don't want to loose that information even if it is being ignored in a safe load.

Jordon Bedwell

:+1: on @trans tag callback idea.

Postmodern

:+1:@trans. The definition is easy to grok.

Nathan Zook

I have been toying with my own ideas for a framework to do rpc/serialize objects for a while. I really don't know if this is the correct approach. Attribute assignment safety is not, and, in a language such as ruby, cannot ever be global. Indeed, some of the high-level classes (drb, tk, and the like) are the sorts of things that would make me nervous unless I examined them. (I count 82 instances of []= in /usr/local/lib/ruby/2.0.0.) On the other hand, there are a huge number of user classes for which []= is perfectly safe.

My approach is to whitelist. You can examine the existing ruby base and stdlib classes and determine if there is a problem with any of them. There are several ways to proceed from there, but most likely, the data should be held in a data structure inside Psyche. All classes not registered as safe then are unsafe. A facility would be provided to register a class as safe. (HashWithIndifferentAccess comes to mind) Presumably, Psyche might probe a previously unknown class to see if it lists itself as safe.

The next issue is attribute assignment generally. Again, if a whitelist approach is taken, (and exact matches required), it become fairly simple to proceed. Indeed, both of these facilities might well specify that some inputs are safe but not others.

With this approach, you have the full ability to do attribute assignment where it is known to be safe. And if a maintainer does not want to bother, then their classes are simply not safe.

A global switch that says, "while processing the following string, consider everything as safe" would bring back the existing behaviour--which is exactly what would be needed in some cases.

Jordon Bedwell

@student Protecting setters is your problem, not Pysch's problem. Psych is not your object-guardian. The goal is to block unserializing attributes. Example: If people don't want symbols then they come out as ":value" rather than :value, as far as []= and "protecting it"... that is your job to protect. The rest of your argument, I'm not even gonna touch, I'll leave that to other folk to debate.

Nathan Zook

The question to me seems to be how much pain a user must endure in order to make use of a tool. If YAML is to compete with JSON, then YAML cannot require the user to audit their entire object library for []= weirdness. What's more, I doubt that I'm the only one that did not know that YAML was calling []= until last week--people won't even know about the threat. You can sit back at say nasty things about people who do weird stuff with []= or you can make a library that doesn't regularly get implicated in security problems.

Again, what I'm saying is that safety requires that unsafe actions be screened out by default. To me, that suggests an audit of Ruby's base classes and a facility for classes to register themselves as safe to whatever degree.

Charlie Somerville

@student: Why does this have to be so complicated? @trans's approach is the way to go - deserializing the YAML as if the ! directives didn't exist ensures that only 'primitive' Ruby types are created.

Jakub

Do we really needs this ? I have a strange feeling that making safe_load will be really hard and in the end nobody will use it because it will be so limited. Lets just not use YAML as something we accept from requests/external sources and just parse. If someone wants to do something like this he should explicitly say he wants to expose himself to problems that it carries with.

In my opinion we should use only/mostly JSON and Ruby. For config files ruby is so expressive that we can have instead of databas.yml like things something that will just load .rb file. aka

MyBaddAssWebAPP::Application.database do
  production do
    host: "localhost" 
     ....
  end
end

Or something similar to this. Do we need YAML at all ?

Jordon Bedwell

@JakubOboza How is it going to be limited? The fact of the matter is it already has an unserializer to move non-primates into their Objects, at that point it's a matter of having that Object tell you what kind of non-primative the Object wants to be and then you deciding if you want that non-primative, or giving you the ability to block non-primatives (like symbols) right off the bat. Since it has to have a decision engine for them to begin with (see lib/psych/visitors/to_ruby.rb) the logic for rejection is not really all that hard to add in. The callbacks might be hard, but only in the sense of how do you do it properly, not how do you implement it.

Jakub

Yeah right. So first of all do symbols get ever garbage collected ? No. So basically someone can makes series of reuqests with dozens of symbols and make your VM grow to insane size and die. Performing this DoS attack is not so hard.

So making load_safe or safe_load don't solve a problem but probably creates new ones :)

7rans

@JakubOboza Using Ruby is the exact opposite of the solution to this issue. It would create even more security issues --insurmountable ones.

@student Is your whitelist solution more complicated than we need? I wonder if whitelisting prior to loading is necessary when we can just safe-load plain YAML and then instantiate any nodes we need proactively. In other words, we can simply apply our own whitelist solution after loading given the plain YAML. That way Psych doesn't need to handle it and all the additional code and api it would entail.

However, I could see a nice general-purpose method for traversing arrays and hashes to do these conversions would be helpful to make that dead simple.

Jakub

Why ? if you use ruby in config files and never autoparse yaml in requests ? I gave the example just to show that we don't need yaml.

Everything you need from serialization format is list of objects thingy [ ], hash/map/object { } and string "" and i know working is not perfect but thats all and thats the stuff that you have in JSON.

Jordon Bedwell

@JakubOboza I think you should take a second, recollect your thoughts and come back when you are more rational and not spewing trash. Psych is what makes the non-primitives. So what you are saying is broken and flawed logic because safe_load would prevent the symbols from ever being created and would keep them as primitive strings.

Nathan Zook

First, an apology. I forgot that in general YAML is creating instances of objects, then making calls on these instances or setting attributes on them directly. Some of my first comments were coloured by that error. Clearly, it is safe to set attributes on a newly created instance, so there is no concern there.

However, if a class uses a singleton object, this is no longer safe, as global state is being overwritten. (I observe no occurrences of "ingleton" in my ruby 1.9.3p327 source.)

[]=, is another matter entirely, however. []= is no more a setter than a method ending in ? is a boolean probe. In fact, I would argue that it is less so. There is nothing at all wrong with having some sort of generic class that implements []= as sending messages to other classes. It just breaks some assumptions.

Now as I mentioned before, someone who wants to use gem X (which depends on gem Y) should not have to do research to figure out if it is safe to do so with YAML parsing external input. And just because I need to use a gem that needs a gem that implements an obscure class that is not safe, why should I be prevented from using YAML in the normal way for my entire application?

I audited the rails lib classes yesterday, and as of 2.0.0.preview2, they are clean. It is easy enough to create a structure to hold this data. When a unknown classes is encountered, it can be queried to determine if it has a yaml_safe method. If it returns false, then not even attributes will be set. If it returns true, then []= will be called with abandon. If it returns a symbol, that method will be called instead of []=. If it returns something that responds to call.... Otherwise, the class is registered as unknown, and attributes will be set but []= will not be called.

Of course, the global settings can also be used, and the end user can explicitly register whatever sort of bizarre rules make sense for them. Let the user decide!

With such a facility, the community has an easy way to register the external-input safety of their classes, and 99.9% of the people can proceed in ignorant bliss.

I REALLY want to be able to use YAML for parsing external input. JSON is entirely too lowest-common-denominator. XML is stilted at best. Marshall practically requires both ends to be running the same everything. DrB is for internal use only.

Jordon Bedwell

This shit is bananas. Seriously though, this thread turned from useful into a bag full of "come again say what?"

7rans

@student You are still making this more confusing then it is. More importantly, having developers add a yaml_safe? method to their classes does nothing to ensure they are actually safe.

Getting back to your whitelist notion however, I've been thinking about this and it might have some merit. Not exactly in the way you described (which again I think is overly complex) but just as an option passed into the safe load method. e.g. something like:

    YAML.load_file(file, :safe=>true, :whitelist=>{'!foo'=>Foo})

It's not strictly necessary, as I mention in my previous post. But it would be very convenient.

7rans

I looked at the code a bit. Am I right in concluding the heart of this matter is Visitors::ToRuby#deserialize (here)? If so the main of the solution is to create a #deserialize_safe method that removes the unwanted tag resolutions of #deserialize (some can remain like !binary and !float).

The hardest part looks like it's just getting the safe flag down to the deserialize call so it can be conditioned.

Jordon Bedwell

Float and Binary are primitives so shouldn't they be excluded by default without prejudice?

Edit: My 2 pence worth is that maybe there should be a SafeDeserializers and UnsafeDeserializers, where we hold ones that are considered always safe and anything else gets put into unsafe, if you run deserialize_safe then it will exclude all UnsafeDeserializers but if you pass in say :method then it will deserialize anything that applies.

module Deserializers
  module Safe
    module_function
    def my_unsafe_deserializer
      # => Do Work
    end
  end
end

YAML.load(data).safe_deserialize # => String
YAML.load(data).safe_deserialize(:my_unsafe_deserializer) # => Object

Or maybe so the logic makes more sense:

YAML.load(data).deserialize_all # => Objects
YAML.load(data) # => String
YAML.load(data).deserialize(:my_unsafe_deserializer) # => Object

I absolutely like the idea of having to be forced to choose to deserialize everything. It makes people think about what they are doing, where we have to be honest, most programmers now days don't really think twice about what they are doing... so I always like situations that make me think twice.

7rans

@envygeeks String, Hash and Array are primitives too, but we can't exclude them. But maybe I am misunderstanding?

As a general rule of thumb, I am thinking it is okay to accept any class for which Ruby supports a literal constructor. Not sure about !range, but we do need to support all YAML standard types and that includes Float. As for Binary, I think (correct me if I am wrong) it's really just a String, so it should be okay. I know Symbol has been mentioned above. Is the only security risk with symbols that of a memory overrun? If so, YAML's probably not the right place to be concerned with that. Rather incoming file size should be limited. (Is there a way to set a size limit on the stream parser?)

Jordon Bedwell

@trans I'm saying that Floats should be allowed to be parsed into their Objects by default.

The problem with the stream size limit is that even if you limit it, if this YAML comes over the public or public but private wire... and don't ask me why it would in the case of non-internal communication but at that point I would probably Marshal the object into a memory storage but we won't even get into that. Anyways, the point is that if it's coming from the internet it's only a matter of changing the symbol each time and a new symbol is created so the stream limit is nearly useless and it will always come down to being able to either white list objects that can be unmarshaled/deserialized or blacklist (the bad option IMO).

Ranges can be easy to safely handle:

str = "1..2"
str = Range.new($1.to_i, $3.to_i, $2.length == 3) if str =~ /^(\d+)(\.{2,3})(\d+)$/

While ghetto built it works.

Nathan Zook

Symbols are a really ugly case. The only way to prevent DOS is to check to see if a symbols is already defined before deserializing it. I have no idea how that might be done. Easier would be to have a list of symbols that are allowed for the call--but very ugly. Furthermore, in practice, if I were wanting to fling arguments for ActiveRecordFind, with its nested conditions, the list of acceptable values changes as we navigate the tree. And if you thought the list idea was ugly before, now you KNOW it is.

There is an option that I have often wanted for Array.flatten, and it strikes me as useful here as well--the ability to specify a depth limit--only deserialize one level, or two, or five.

Jordon Bedwell

@student Occam's razor. You either accept all symbols or no symbols.

Jordon Bedwell envygeeks referenced this issue in jekyll/jekyll
Merged

Safe YAML #777

Postmodern

To highlight the need for this feature, arbitrary YAML deserialization strikes Rails again.

Egor Homakov

turning YAML into JSON? Utopia

Jordon Bedwell

I actually prefer it when peeps turn my JSON into YAML. :wink:

Postmodern

@tenderlove could you possibly look at safe_yaml and maybe talk to @dtao about merging it's behaviour into Psych?

Dan Tao

@tenderlove and @postmodern: FWIW, my aim w/ the safe_yaml gem was/is pretty consistent with what @trans suggested: to provide a method which "loads the YAML as if no ! tags are present." In its current form the gem actually clobbers YAML.load, but I'd be totally fine w/ switching back to having a separate YAML.safe_load method.

Jordon Bedwell

@dtao Having two functions creates a cleaner API with expected results but having load accept a second argument for safe_load as well allows for pure backwards compatibility. /cc @tenderlove @postmodern @trans

Egor Homakov

+1 for separate function. OR load('---..', safe: true)

Lawrence Pit

+1

More vulnerabilities will be reported by various gems the coming days and weeks. Basically anything that uses the YAML gem to load arbitrary user input is at risk. There are several gems out there that do this unfortunately.

Given rails security strategy that it should be secure by default and the multitude of (breaking) changes that were made in the 3.x series to achieve this, I would logically think YAML.load should not be any different and be secure by default (if it followed the same strategy).

But I'm against breaking working apps. Would the following perhaps work? :

  1. YAML.load works default in unsafe mode.
  2. Calling YAML.safe! will make all subsequent YAML.load calls safe by default.
  3. YAML.load can receive an option to override the global safety setting.

The downside is that any gem might call YAML.safe! for no valid reason. They should instead use YAML.load("---..", :safe => true)

Meanwhile I would advise all ruby developers to use the safe_yaml gem from @dtao, run all your tests, and if nothing breaks, keep on using safe_yaml until this issue is resolved.

Charlie Somerville

@lawrencepit I'm -1 on anything like YAML.safe! that introduces global state. I think YAML.safe_load would be best as it makes grepping easier.

Postmodern

@lawrencepit Yeah, changing YAML.load's default behaviour would break pretty much everything (esp RubyGems). I'm against introducing complex state, it should be an explicit method-call or option which indicates that a specific YAML blob should be parsed in safe-mode. Explicit > implicit.

Dan Tao

@envygeeks @postmodern and @charliesome: I'm kind of torn on this one. I agree in general with avoiding global state and maintaining backwards compatibility; but I actually like @lawrencepit's suggestion a lot from an ease of use standpoint. App developers can change YAML.load to YAML.safe_load everywhere in their own application; but what if their apps depend on gems that also use YAML.load? The benefit of a permanent toggle like YAML.safe! (or I was thinking: YAML.disable_arbitrary_object_deserialization!) is that it can be called once—e.g., in a Rails initializer—and render the rest of an app secure against this particular exploit by default.

Yes, it may prove to be a breaking change in certain cases; but I'd argue those cases are likely to be more rare than the default case where user input needs to be deserialized safely. In these cases app developers could explicitly call YAML.orig_load or YAML.load("--- ...", :unsafe => true), or whatever, when they need to deserialize trusted input to arbitrary objects.

I'm thinking I probably will provide this option in safe_yaml in the next version. (This is probably strictly better than the gem's default behavior now, which is to clobber YAML.load no matter what. In retrospect I see that's probably not the best approach.)

Of course, we all know that the long-term solution is not a standalone gem but a safe option provided through Psych itself, hence this thread.

Jordon Bedwell

I would make it so that the initiator of the global state could fix the problem on behalf of the software he or she breaks, in that if Gemcutter absolutely needs to deserialize an Object, we can whitelist it in safe_load!. Something like YAML.safe_load!(Object1, Object2) At the same time I can't disagree with the dislike of it, it introduces the need to create extra unit's and it introduces new code complexity, it also introduces the ability for one lib to break another lib by abusing safe_load! by not knowing that it's for the implementer of the lib, not the creator of the lib. That will happen eventually and it's gonna create some huge friction.

Postmodern

The problem with a global switch, is that other software (RubyGems) may need to deserialize arbitrary YAML from trusted sources after safe_load! is called. Also there is no control on when safe_load! can be called. Developers should first determine whether the YAML input is from a trusted source (ex: not an arbitrary user). I sort of wish we tainted all user-input.

Whitelisting certain Classes is acceptable; perhaps maybe have a global whitelist?

Jordon Bedwell

Pinging @drbrain so he throw in his opinion from the side of RubyGem.

Postmodern

@envygeeks I hadn't thought about Gemcutter, but that is an excellent case for the need for a white-list (Gem::Version, Gem::Dependency, etc). Given that Gemcutter is a Rails 3.2 app and uses RubyGems to unpack/deserialize the gziped YAML metadata, rubygems.org is probably vulnerable right now.

Charlie Somerville

FWIW, Gem::Requirement was a vector in my RCE PoC. You should be very careful when defining a whitelist, especially for classes defining yaml_initialize

Postmodern

@charliesome you're exploit also used a bunch of Rack and ActiveSupport classes.

Charlie Somerville

Point still stands though. Whitelisting introduces too much complexity for my liking. All I really want is a way to load YAML as if the ! parts didn't exist. Nothing more, nothing less.

Bernard Lambeau

@postmodern @envygeeks I told the Gemcutter developers more than a week ago now. I suppose they've deployed a patch as for github's jekyll. Anyway, would you mind proposing a patch here? It's probably time to stop all of this and you seem to have sufficient knowledge for a first proposal.

Postmodern

@blambeau is there an issue open or a CVE registered? It looks like they only upgraded to Rails 3.2.11, which fixes the Rails vulnerabilities. However, rubygems.org is still reading the metadata of pushed gems.

Bernard Lambeau

@postmodern Not that I know, but I haven't be much involved. As far as I know, they were working on a patch.

Jordon Bedwell

@blambeau who is they? If it's Github that's already fixed... the patch you saw for Jekyll was an after the fact upstream pull request of something that was already patched in, if it's RubyGems somebody needs to get on their ass about it because RubyGems already has enough problems trying to stay up, this will only make it worse.

7rans

I was thinking about the way Psych handles user defined tags (which it basically inherited from Syck), and the idea of a whitelist. The two don't really mesh. Psych currently has these methods for globally defining tags.

add_builtin_type(tag_type, &block)
add_domain_type(domain, tag_type, &block)
add_private_type(tag_type, &block)
add_ruby_type(tag_type)

There's also add_tag(tag, klass) though I am not exactly sure how that works compared to the others. The above four all work basically the same, they just adjust the tag_type to meet different naming conventions.

I can understand the original intention for creating global interfaces for defining tag types --after all, they are analogous to classes and classes are defined globally. But, while that might had made sense when _why originally put together Syck, it's become clear, via this thread, that the reality is actually specific to the use case and not safely global at all. It's clear because using a safe_load would make all such type definitions impotent, and a whitelist would use them selectively. When you really get down to it, it's not good to have global type definitions at all. These types are generally application specific and crossing them between various dependencies is nothing but a recipe for an eventual pain in the ass.

So it strikes me that the answer to this whole issues lies in re-factoring the way Psych handles types altogether. By default YAML.load would work as it already does handling YAML core tags, the yaml extra tags (like omap) and the Ruby tags. For refined loading, definitions for the later could all be provided as constants. Then we would do, for example,

    myapp_tags = YAML::RUBY_TAGS.merge(
      :foo => Foo,
      '!tag:hospital.com,2003:patent' => Patent.method(:new),
      '!bar' => ->{ |val| Bar[val] },
      /(x|y)/ => { |type, val| Factory(type, val) }
    )

    YAML.load_file('foo.yml', :tags=>myapp_tags)

In this way, every application that uses YAML has full control over how that app handles type instantiation without any chance of global conflicts. And it is easy to use. In the example I tried to provide each possible way a tag could be defined --using Symbol, String or Regexp for the tag, and a Class, Proc/lambda or Method instance for the instantiation.

Also, this actually should not be too difficult to code. For the most part, it just means that the load method would dynamically alter the internal type tables and restore them to the default after each use. To optimize we could have a YAML::Loader instance that accepts the tags hash, and can be reused, e.g.

    yaml_loader = YAML::Loader.new(:tags=>myapp_tags)
    yaml_loader.load_file('foo.yml')

That would also make it easier for some projects to refactor their code.

Bernard Lambeau

@envygeeks "they" are guys who work for the whole ruby community on a daily basis, mostly for free. I know that the issue here is important, but "getting on their ass" is not necessarily appropriate. That being said, I'm very concerned with this issue too which is the reason why I called them in the first place. But don't get to a new drama, please. I suppose this should make the job. @tenderlove @drbrain @qrush @evan @evanphx, could we have the current status on your side? Any help/free time needed? Please call if it's the case.

Jordon Bedwell

@blambeau You are creating the drama not me. You are reading into something negatively and then telling somebody it's wrong when you should have just kept your fingers away from the keyboard. But I would expect nothing more than everybody to confuse it with "oh hey, must be mad" right?

Also, how about instead of pinging them you file a ticket on their projects? Or should I get off my ass to do it?

Bernard Lambeau

@envygeeks all my apologies if I miss the right way to handle it then. Be my guess, I think you'll probably do better (the ruby ecosystem is sooooo strange. any way you try something, it's always the wrong one; strange).

Jordon Bedwell

I don't think you did it wrong, I just think it might be better to file a ticket or email them (probably better to email since this could be critical) some people (especially people with major libs) get far too many pings via Github so sometimes they don't catch the pings. My ping to them earlier was ok if it got ignored, but this one might turn out to be important. Maybe @postmodern or @charliesome knows how to get ahold of them via email.

Bernard Lambeau

@envygeeks That's what I said. I had a (email) contact about this with them a week ago. As far as I know, they were working on a patch.

Jordon Bedwell

@blambeau ah okay, that's why I asked who "they" were and my "get on their ass" was just a hint to ping them again to see what the status is and see if they need help and where, I'm more than willing to try and help though I don't know if they want it.

HD Moore

I wrote a dirty monkey-patch gem that does this today: https://github.com/rapid7/psych_shield

Jordan Thoms

Also, the load method should probably be renamed unsafe_load or at least put a clear warning somewhere. Right now it's pretty hidden that psych is not for parsing untrusted input.

Jordon Bedwell

@jordan-thoms that's an API breaking behavior. While I agree with you, I cannot agree with you because it breaks the API and it's expected behavior one should never break their spec unless absolutely necessary.

Lawrence Pit

I think he was suggesting to output a deprecation warning when using YAML.load, which behaviour would remain exactly the same otherwise. The warning would tell you to use YAML.safe_load or YAML.unsafe_load.

Charlie Somerville

Can we expedite this issue a little? The recent rubygems.org fiasco shows that this is something that's really necessary.

The rubygems.org team knew about the vulnerability for about a week before the exploit gem was uploaded. If YAML.safe_load existed, it would have been easier for them to fix the issue and prevent the hack from ever happening.

Jordon Bedwell

I would fight acceptance unless the deprecation also mentioned that you could use YAML.load("---", safe: true) and vice verse and only if you do not include the second argument with safe in the hash. Since the safe could be a backwards trigger to YAML.safe and YAML.unsafe

Dan Tao

@charliesome and @envygeeks: I will admit it's an imperfect solution, but for now any site w/ YAML.load-related vulnerabilities can use safe_yaml and protect against deserialization exploits. Where the old behavior is needed they can change YAML.load calls to YAML.orig_load.

I think tonight I'll implement the interface @envygeeks just suggested as part of safe_yaml. Maybe Psych can then go ahead and do the same or something similar.

Fred Blasdel

@envygeeks Yes, it's an API breaking behavior, and the API needs to be broken.

The expectation that parsing a data serialization format won't eval its contents as code is so much more fundamental than the expectation that methods will continue to exist with the same names.

Postmodern

@charliesome Totally agree. Developers need an easy fix.

Now I am starting to think that a safe/allow option would be the least intrusive solution. This would allow enabling safe-mode with YAML.load and YAML.load_file. The option would accept either a Boolean or an Array of whitelisted Classes.

I might take a stab at shoehorning a whitelist into Psych, based on the existing whitelist implementations.

Dan Tao

FWIW, I've implemented several of the suggestions here in safe_yaml. I don't think my design decisions are likely to make everybody happy; but at least the library is there for applications that need a quick security patch.

Bernard Lambeau

@charliesome @dtao @postmodern @envygeeks As far as I understand the safe_yaml/safe_load proposal here, it won't be enough for rubygems.org. Gem metadata use object deserialization.

Jordon Bedwell

Yeah they briefly mentioned that earlier when I suggested we use @dtao's gem for the sake of quickness but there was so much going on I didn't ask why because they also said that Evan and Tenderlove were all over it trying to get a patch up ASAP and I was trying to help in other places too.

Postmodern

It's looks like rubygems.org is going to use this custom whitelist. However, it will take a while for a new version of Psych to be released, so safe_load is a valid option for other apps.

Egor Homakov

+1 for whitelist of basic ruby classes

Rodrigo Rosenfeld Rosas

I'm definitely -1 for safe_load. To me it is clear that Yaml is not mainly supposed to be used as a Marshal library. So it should be safe by default. load and all other related methods should be safe by default. If you want to use an unsafe version (for dump/marshalling purposes) you should be explicit about it like load_unsafe or use dump and marshal since those names are known to be insecure by definition when used with untrusted data.

For those who mentioned using JSON, this format is not nearly as rich as YAML. It doesn't even support date/time types.

Bernard Lambeau

+1 @rosenfeld I can't agree more.

EDIT: -1, therefore

Marc-André Lafortune

+1 @rosenfeld and others for a whitelist by default.
A whitelist is the best way to go, since the problem is deserializing a class whose writer never thought about deserialization.

Initial whitelist = basic core types

Optional parameter specify whitelist, either as an ok Class, or as a filter using ===.

As an extreme example, to load basic types, plus OkClass, descendants of OkModule or any class defined in MyNamescope but disallow Floats:

MY_WHITE_LIST = [
  OkClass,
  OkModule,
  ->(klass){ klass.name.start_with? 'MyNamescope::' },
  ->(klass){ raise "No floats buddy" if klass == Float },
]
YAML.load(str, MY_WHITE_LIST)

to load anything (like currently):

YAML.load(str, Kernel)

Precise list of basic core types is as @dtao's, including symbols:
Hash, Array, String, Numerics, Date, Time, Symbol, true/false/nil
I'd include Set too.

7rans

There is a misconception running through this thread: Classes do not have a one-to-one correspondence with tags.

Classes have a default tag, but you could define any tag you like and have it resolve to any class you like. So any white lists have to be by tag, not by class.

But it's even deeper then that b/c: Your tags might not be my tags; my tags may not be your tags. Different apps can use the same tags for different classes (actually it's instances, not classes, but I think Psych could be simplified to limit it to classes).

7rans

Hey. So, I've gotten pretty far in refactoring Psych's deserialization code, solving this issue in full. You can see my current progress here.

But is there someone familiar with the code that I can rap with? There are some sticking points that I need feedback on. (The first being, what's up with encode_with and init_with?)

Nathan Zook

This still feels like inversion of control to me. It is not YAML's job to try to guess which classes are safe. It is not some poor dev 10 levels down the include chain's. It is the class's responsibility. Certainly, we want the caller to be able to override default behaviour, but default should be up to the class.

By the way, YAML "safeness" is NOT an inheritable attribute of a class! Ruby has dirty inheritance, so just because []= is safe for C does not mean that it is safe for D < C.

Trans indicates that there is actually a level of indirection, which is fine. YAML is still instantiating instances of classes, and it is the class which knows what happens when methods are called on its instances. Not the tag.

Jordon Bedwell

Are we really back on setters?

7rans

@envygeeks Elucidate, please.

Jordon Bedwell

Not worth it.

7rans

So my work is based on the specification on Schemas. To create a "whitelist" you instantiate and flesh out a new Schema instance.

  my_schema = YAML::Schema.new do |s|
    s.tag '!foo', Foo
    s.tag '!bar', Bar
    s.tag /!baz:(.*?)/ do |tag, md|
       Factory(md[1])   # must return a class
    end
  end

Then you can pass that schema in as the white list.

  YAML.load_file('foo.yml', :schema=>my_schema)
Jordon Bedwell

@trans I like that a lot

HD Moore

If this goes the way of a whitelist, keep in mind not all classes in the list may be present at runtime, so stringification (as in psych_shield) is preferable.

Postmodern

@hmoore-r7 In the case of a safe-load method/option, if the Classes are not loaded when YAML.load is called then Psych will raise an error when it tries to resolve the Class. Also, I worry about allowing arbitrary Strings into a whitelist.

HD Moore

@postmodern I was referring to a solution above that provided the schema/whitelist at load time versus YAML.load time. I may have misread the context, but it is an important distinction when creating a whitelist.

EDIT: It looks like @trans's approach wouldn't eval until YAML.load was called, I misread, thanks

Postmodern

Not to add more noise to the thread, but I have added safe-loading support with whitelisting. We override resolve_class and added a revive_symbol method, to control all access to Classes/Modules and interment of Symbols. @tenderlove also requested that I add safe_load, safe_load_file and safe_load_stream methods, instead of shoe-horning in a :whitelist option.

Feedback welcome. The more implementations the better I say.

Jordon Bedwell

:-1: to @tenderlove's suggestion, I really wish the whitelist option was there too :(

7rans

@postmodern I wouldn't characterize it as "shoe-horn". It really wasn't difficult to add the option. I am concerned about adding additional public interface methods b/c then going forward they have to be supported --but when we get down to doing things as they really should be, they aren't necessary.

What do I mean by this? Your WhitelistedToRuby solution is nice. It's straight-forward, it fits cleanly into Psych's design and, of course, addresses the immediate issue at hand. But... this issue brings to light deeper issues in Psych's design. I have tried to make clear these issues in my previous posts. As it turns out, all of the issues derive from one core issue: Psych does not follow the YAML specification with regards to Schemas. For example, from the spec:

The failsafe schema is guaranteed to work with any YAML document. It is therefore the recommended schema
for generic YAML tools. A YAML processor should therefore support this schema, at least as an option.

It is clear from the specification that what schema is used is meant to be an option. By supporting this, not only is the immediate issue at hand addressed, but all the other issues I mention as well.

I know its a more difficult line of development to implement schema support, and requires more extensive changes to the code, but we can also take it as an opportunity to really improve some of the design. And, it seems to me, it's an opportune time to do it with Ruby 2.0 and all.

Jordon Bedwell

It's gonna take a lot of work to get that kinda change into Ruby 2.0 considering it's release date is supposed to be a couple of weeks away, I mean I understand things could be delayed but isn't it already feature frozen?

Charlie Somerville

@envygeeks is correct. There's no way this will make its way into Ruby 2.0.0 (unless @tenderlove pulls some strings, but given it's a few days away from code freeze, that's unlikely)

7rans

No it can't go into Ruby 2.0. People will need to gem install psych regardless. I only meant b/c of Ruby 2.0 the overall time frame is a good one to make larger improvements to Psych.

cabo

While you are looking at the symbol DoS issue, can you please think in terms of something like a "weak intern"? I.e., a method that interns a string if that symbol already exists and otherwise fails/returns nil? I need that for protocol parsing, but Ruby doesn't seem to have it (actually it does, at the C level, where it is obviously needed a lot).

Kimcha

I am very concerned about this issue and how this is being handled.
I have to say I am not a rails developer and I am not very familar with this framework and ruby itself. Therefore I might be wrong.

We are currently considering whether we should use the rails framework in our startup, but the way this new class of vulnerability and issue is being handled casts doubts on whether it is a safe choice.

I am probably coming from a different place than you guys, but the whole idea that user generated input can create objects and execute arbitrary code by default is absolutely insane to me. It's so crazy I don't even have the words to express my confusion about this. How could this NOT cause security problems?

Even more insane is to me is that for more than 1 month you guys are JUST TALKING about this and are concerned about the required changes breaking existing applications. Existing applications NEED to be broken, all rails developers need to be SLAPPED in their face as hard as possible to realise the magnitude of this and really consider whether they need to serialise objects in those places in their software.
Would you rather break a few rare cases where this functionality is actually used or have 10000 of rails apps hacked?

Meanwhile you are discussing this stuff another vulnerability was discovered (CVE-Nummer 2013-0277). If this change was already made, applications would have been already safe from it. No doubt researchers will find new ways of how to inject user supplied yaml and this needs to be correct AT THE ROOT not at the specific places. It needs to be default behaviour that you need to explicitly enable unsafe loading and even then it should show a big fat warning that if user supplied input sneaks in, the application is seriously vulnerable.

And I am certain a ton of bad guys are already all over this and finding new ways to inject user input into load()!

I know that we could just use safe_load in our project and be done with it, but that is not my main concern anymore. My main concern is what if another similar issue comes up and it will take you rails guys again months to provide a solution that effectively prevents and stops it. I think this seriously affects rails image and I cannot be the only one with those doubts.

I am not trying to troll or insult anyone here. I am trying to speed up the creation of a solution and promote a serious discussion about this. I also really, really want us to use rails in our project, since it is otherwise perfect. But this is a huge concern for me.

Jordon Bedwell

Your first two paragraphs need to be ported over to Rails because they are irrelevant here, Psych is not Rails. Also Rails has been patched and there are several gems that already protect against this. While I agree that @tenderlove is slow with pushing out an out-of-band update which makes me just as sad as you, until then you can use safe_load or another gem.

Kimcha

Thanks for your reply. But isn't the root of the problem in Psych and not rails? Isn't this the root, which executes actually YAML code? What can rails do except for try to prevent user input to sneak into Psych?

Please correct me if I am wrong.
As far as I understand it, unless ALL rails developers, ALL gem developers and ALL application developers close every possible way for user input to sneak into here, there will always be ways for code execution in the default configuration. (Unless this default behaviour is changed at the root here in Psych)
That is an absolutely crazy and unrealistic assumption and pushing this issue to rails and saying its their problem and they have already patched it, is not a solution.

Bernard Lambeau

@Kimcha have a look at Aaron's own viewpoint on his blog, especially the conclusion. http://tenderlovemaking.com/2013/02/06/yaml-f7u12.html

I don't disagree with him (the double negation matters here); anyway I think the blog post gives you a good answer from Psych's author and maintainer and that answer makes sense, even if you disagree.

Rodrigo Rosenfeld Rosas

@blambeau, I don't believe Aaron's concludes in his article that the current methods shouldn't be safe by default. He only seems to conclude that there is a need for a load method that is safe to use. Anyway, I've just commented in that article. Please read my opinions on it. I strongly agree with @Kimcha that Psych (default Ruby implementation for YAML) should be safe by default as I've already stated before and I do also agree this is a serious issue that is not being properly handled in a timely fashion. I don't also believe this problem is Rails' specifics either. In fact it affected RubyGems.org and it wouldn't really matter if RubyGems used Rails or not, or if Rails was already up-to-date or not. I really believe there are many other applications affected by such a bug (yes, I do consider this a serious bug).

Bernard Lambeau

@rosenfeld and others. It seems we agree (at least on some points), see the first comment on http://www.reddit.com/r/ruby/comments/18d83h/yaml_f7u12_by_tenderlove/.

Don't know where is the best place to further discuss those, reddit? tenderlove's blog? github issues?

Kimcha

@blambeau thanks it is an interesting read.

For you ruby and rails developers, please answer this:
How often do you actually need to serialise ruby objects from YAML? Would it be really that much effort for developers to change those cases from load to unsafe_load?

I for one am very concerned with this, especially since not just our developer's code will affect the security of our project. We can instruct our developer to use safe_load, but we can hardly check and change all gems that are going to be used. I wish safe_load was the default and all developers would have to update their apps to call unsafe_load where needed. It would be much easier and quicker to audit 3rd party code this way to evaluate whether this or another gem should be used.

Rodrigo Rosenfeld Rosas

@Kimcha, just a note: I don't believe YAML should have a method named "unsafe_load" ever. It should be called "unmarshal" or something like that to make the intent clear of what it should be used for. It doesn't make much sense to call a method that has "unsafe" on it. You don't see methods like "execute_unsafe_sql" to run arbitrary SQL commands to make the risk of SQL injection explicit.

Postmodern

@Kimcha Let's not play the blame game. All of us commentators can only discuss the vulnerability and suggest potential solutions. We do not have commit access to this repository; Aaron does. We are also neither part of ruby-core nor rails-core; Aaron is. Also, Aaron is far more cautious and considerate when making changes to code that almost every Ruby developer uses; we are probably not. If you desire to criticize someone, start with yourself for coming to this discussion so late and without patches. Perhaps you could also blame the YAML specification authors for allowing instantiation of arbitrary Hash-like classes which lead to this vulnerability; Aaron was merely interpreting(ive dancing?) the specification

Regarding your points about fixing the problem at the root. There are only so many places where user-input is passed to YAML.load (git grep "YAML.load"). In most instances the developer usually only wishes to parse YAML primitives (Integers, Strings, Arrays, etc). It is possible to hunt down each and every one of these call-sites and replace it with YAML.safe_load, which is being worked on by Aaron. I tried to use gauntlet to do this, but it appears to be broken.

As for your comments about "bad guys" and wishing to speed up development. Bad guys and exploits are the primary motivation for securing software. Once an exploit is released (either a PoC or a Metasploit exploit), it demonstrates a clear and immediate danger to organizations running the vulnerable code. If the organization does not wish to get "pwned", they merely have to upgrade, audit and test their code against the exploit. Security Darwinism: patch or end up serving Chinese malware.

Kimcha

@rosenfeld I just use the name "unsafe_load" to make it clear what I mean. It does not particularly matter what it is going to be called in the end. Although something that warns of the danger would be preferable in my opinion.

@postmodern I am not playing a blame game and I didn't mean to attack any of you users. I am just raising the question of why this issue is getting so little attention and why it is just being dragged on for a month now. With thousands of applications at risk to be hacked I cannot imagine what could be more important.

Even though you can hunt it down by grepping for YAML.load, you still have to trace the input and figure out reliably if there is a chance for user input to sneak into with a risk of you missing it. If safe_load was the default and applications were breaking because of the change, devs would have to change to "risky_load/unsafe_load/whatever_load" you would only have to audit those calls.

I disagree with you about how you see "bad guys". It is not just kiddies who download a ready-made PoC or Metasploit and start attacking servers. There are many incredibly motivated and talanted hackers who are looking for vulnerabilities and exploit them for their own gain or sell them to others.

Protecting yourself against known vulnerabilities is not a problem, but leaving ways open for 0day exploits by not fixing the root cause of vulnerabilities is. What needs to be clear is that there are hundreds of possible injection vectors and it is not realistic to assume that all of those will be fixed. As long as Psych loads yaml and serialises objects by default, there is a huge risk.

Jordon Bedwell

Any programmer who accepts YAML over the internet is a dumb ass anyways, JSON is there for a reason.

Kimcha

@envygeeks does this mean the developers of the code affected by the recent vulnerabilities are all dumb ass? If those guys who are writing critical parts of the framework are dumb ass, what can you expect from all the other developers of the thousands of gems available?

Developers make mistakes, bugs sneak in and some of them are exploitable. To prevent those things, the default configuration should be as safe as possible discourage those things. Those bugs are not going away, they are here to stay and you have the opportunity here to make it safer by default.

Rodrigo Rosenfeld Rosas

@envygeeks, I never used YAML myself as request params but I wouldn't call someone who opts of it a dumb ass. I always felt the need for support of Timestamp types in JSON specs for instance and it can be a pain in the ass to develop APIs that makes massive use of dates and times and always have to worry how to encode them consistently in JSON throughout your API. You wouldn't have to worry about it if you were using YAML. I considered using YAML once to store my cached results in Redis as a cache mechanism and also to allow data-interchange between my Rails and Grails apps. I just gave up on the idea because YAML was about 10 times slower than JSON and I had a lot to serialize. But I can see people willing to use YAML as a "data" (primitive only) format just the way they use JSON. But since browsers don't support YAML by default (like JSON.stringify/parse) I don't see YAML being used wide-spread any-time soon, specially if they can't perform better. But I can see some advantages of YAML over JSON anyway...

Postmodern

@Kimcha changing the default behaviour of YAML.load would create more bugs than it would fix. Code such as RubyGems relies on YAML to serialize/deserialize non-primitive objects.

you still have to trace the input and figure out reliably if there is a chance for user input to sneak into with a risk of you missing it.

Static analysis (grep / Ripper), dynamic analysis (proxy/probe objects), fuzzing.

It is not just kiddies who download a ready-made PoC or Metasploit and start attacking servers.

The skill level of the attacker does not matter, whether you are vulnerable or not does matter. Verify the gem versions which you are running (bundler-audit) and audit any additional source-code / dependencies (grep -r ... lib/ app/ vendor/bundle/).

What needs to be clear is that there are hundreds of possible injection vectors and it is not realistic to assume that all of those will be fixed.

No one has actually determined how many instances of YAML.load there are, and how many of them are passed user-input. Security <hint>Researchers should get on this</hint>; I tried to do this with gaunlet, but no go.

Jordon Bedwell

@rosenfeld: Time is not hard:

Time.parse(JSON.parse({ time: Time.now }.to_json)["time"]) # => 2013-02-12 07:12:45 -0700
Time.parse(JSON.parse({ time: "2012/2/12" }.to_json)["time"]) # => 2012-02-12 00:00:00 -0700
Time.parse(JSON.parse({ time: "12-2012-2" }.to_json)["time"]) #=> 2012-02-12 00:00:00 -0700

From your API use a Unix timestamp, it's common, it's known, it's easy to convert.

1.) Since when is Redis considered "from the internet"?
2.) JSON is the considered the "internet sub-set" of YAML.

In the database, in Redis, in configs YAML is fine, it's designed for that if you ask me. From the internet, you're a dumb ass, because you are using something that by default really is designed pretty much to serialize arbitrary objects... from the internet, and then wondering what the hell happened.

I considered (and always will consider) Rails allowing YAML from the internet a dumb ass move, I said it everywhere... I considered Rails allowing symbols from the internet dumb ass-ery as well. Does that mean I think they are bad programmers? For sure, I think we are all bad programmers and plenty of people have heard me say we are all bad programmers, including myself. Does that mean I hate them? No. Does it mean I think they are a dumb ass for doing what they did? Sure. We all do dumb ass things sometimes, and we are dumb asses for it.

Kimcha

@postmodern I understand it would break many applications and many developers would have to actually make a conscious decision of whether in an instance they need to serialise objects or not and if they do they should verify no user input gets in and then change the code to call "unsafe_load". That is the whole point of this.

We are discussing convenience vs security and I cannot believe you are in favour of convenience at such a risk.

What you don't seem to understand is that you can be vulnerable to a skilled attacker even with the most up to date gems and framework. Just because nobody has reported to the rails security team that a part is vulnerable, doesnt mean it is not. The best current example is (CVE-2013-0277). Yesterday you were fully updated and yet vulnerable to it to skilled attackers who also knew about this vulnerability, but chose not to report it. Today you are vulnerable to all attackers.

This could have been prevented if YAML wasn't serialised by default in rails. There are many more vulnerabilities like that to come and they shouldn't be exploitable by default!

It is an never ending game to try to fix all the attacking vectors when you can simply fix it at the root of the issue in default behaviour. You will NEVER find all instances in all cases and even when things get patched, they often get messed up again in the future.

Bernard Lambeau

@envygeeks it seems that we don't ask you then ;-)

you:

In the database, in Redis, in configs YAML is fine, it's designed for that if you ask me. From the internet, you're a dumb ass, because you are using something that by default really is designed pretty much to serialize arbitrary objects... from the internet, and then wondering what the hell happened.

official doc:

YAML™ (rhymes with “camel”) is a human-friendly, cross language, Unicode based data serialization language designed around the common native data types of agile programming languages. It is broadly useful for programming needs ranging from configuration files to Internet messaging to object persistence to data auditing.

Bernard Lambeau

@Kimcha

We are discussing convenience vs security and I cannot believe you are in favour of convenience at such a risk.

I'm affraid, my experience in the ruby world seems to suggest that ruby programmers tend to favor convenience over almost everything, including security. Given what you've already told about your project and concerns, I'm not sure the ruby/rails culture will fit very well.

Jordon Bedwell

@blambeau I'll just leave this here:

YAML
serialization language
serialization
broadly useful for programming needs
broadly useful
broadly
configuration files to Internet messaging
configuration files

Just because it says it can be used for the internet doesn't mean it's a good idea and doesn't mean "should".

Postmodern

@Kimcha changing YAML.load would not just break the application, it would break many dependencies. The change would be comparable to upgrading from 1.8 to 1.9.

We are discussing convenience vs security and I cannot believe you are in favour of convenience at such a risk.

I do not agree that this is a binary choice between convenience and security, that we must be forced into choosing one or the other. It is simply the matter of the right tool for the right job:

  • Is the YAML input trusted: YAML.load
  • Is the YAML input from the user: YAML.load_safe

Just because nobody has reported to the rails security team that a part is vulnerable, doesnt mean it is not.

This is where one should take extreme measures, such as auditing their code-base for any YAML.load calls. On a more philosophical note, we can never know every vulnerability before they are reported. We can only respond to vulnerabilities as they are reported.

It is an never ending game to try to fix all the attacking vectors when you can simply fix it at the root of the issue in default behaviour.

You assume that every developer (and user) would immediately upgrade to the latest version of Psych containing safe_load. :) Although, there is a bit of truth to this. Even if we were to audit every rubygem and every Rails app, a clueless (or sleep deprived) developer could still use YAML.load unsafely. The Ruby programming language does not prevent the developer from making bad design decisions. Even if we changed YAML.load to behave like YAML.safe_load, there is still eval, instance_eval, module_eval, send, system, CSV.load, JSON.load. Likewise, SQL injection and XSS are still possible in the Java and Haskell programming languages. A competent developer must be aware of the potential dangers of their languages/frameworks and avoid them. I feel that Secure Coding Guidelines are a part of the solution here.

Bernard Lambeau

@Kimcha something I haven't told you about the ruby world: confusion. A nice example from @postmodern here:

It is simply the matter of the right tool for the right job:

  • Is the YAML input trusted: YAML.load
  • Is the YAML input from the user: YAML.load_safe

Very strong confusion between functional requirements (loading data that might or might not contain only scalars) and non-functional requirements (keeping the operation safe).

There is no such right tool for the right job, especially when your job tells you to load non-scalar data in a safe way. Anyway...

Michal Papis

continuing #119 (comment) - I assume that every developer does not write to_s_safe, you just assume that default is the safe the same default in the feature should be safe, even if it takes 3 years to make it happen - we should make it happen.

Would it be hard to announce an deprecation today, add a warning in a year, allow flag in two years and switch the new default in 3 years - is it enough time to make things safe?

Mike MacDonald

I think YAML.load should remain as it is, .load implies that we are trusting the input, just as in Marshal.load or CSV.load. Perhaps, by analogy, the primitives-only version should be YAML.parse?

Jordon Bedwell

If you want to fall inline with JSON then sure, which I wouldn't mind since it means consistency.

Trevor Wennblom
trevor commented

since it hasn't been mentioned, here's documentation that contains examples of python's / pyyaml's take on safe_load: http://pyyaml.org/wiki/PyYAMLDocumentation

Aaron Patterson
Owner

safe_load is added and released in 2.0.0, so I'm closing this.

Jordon Bedwell

Thank you!

Postmodern

w00t finally!

Garen Torikian gjtorikian referenced this issue in jch/html-pipeline
Closed

Add yaml front-matter filter #83

Jay Fredlund jayfredlund referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
Dan Tao dtao referenced this issue in dtao/safe_yaml
Closed

Is safe_yaml safe from CVE-2014-2525? #56

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.