New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow raw SQL in dangerous AR methods #27947

Closed
wants to merge 42 commits into
base: master
from

Conversation

@mastahyeti
Contributor

mastahyeti commented Feb 8, 2017

Some AR methods allow raw SQL where raw SQL is rarely necessary. This opens up applications to SQL injection vulnerabilities when a developer unwittingly passes params or a model attribute to the AR method.

Code like this looks innocuous, though it actually is an exploitable SQL injection vulnerability:

class PostsController < ApplicationController
  def index
    @posts = Posts.order(params[:order_by]).limit(10)
  end
end

A probably-not-comprehensive list of such dangerous APIs can be found at http://rails-sqli.org/.

These APIs should be changed to not allow raw SQL. In the rare case where raw SQL is needed, the developer should be forced to acknowledge the risk by using an API whose name indicates the danger.

This would be a breaking change though, which I'm assuming would not be acceptable. This PR adds an AR config flag that, when enabled, restricts the arguments that can be passed to previously dangerous AR APIs. The previous functionality can be accessed using unsafe_raw_ prefixed APIs.

These changes take a bit of work, so I'm opening this PR early for discussion. So far, I've made changes to the #pluck, #order and #reorder APIs.

I'll be curious to hear what folks think about the idea. I'm happy to try a different approach if anyone has suggestions, but it would be really good to remove some sharp edges from AR.

mastahyeti added some commits Feb 8, 2017

@rails-bot

This comment has been minimized.

rails-bot commented Feb 8, 2017

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @eileencodes (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@eileencodes

This comment has been minimized.

Member

eileencodes commented Feb 8, 2017

@eileencodes

I am 👍 on improving security to avoid SQL injection. I've left some comments about method names and a couple other things.

Thanks for working on this!

# rename methods like #order to #unsafe_raw_order and will validate
# arguments to #order. A value of false will do nothing.
mattr_accessor :guard_unsafe_raw_sql, instance_writer: false
self.guard_unsafe_raw_sql = false

This comment has been minimized.

@eileencodes

eileencodes Feb 8, 2017

Member

I think this would be better as allow_unsafe_raw_sql so that it's more likely to motivate someone to use it.

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Agree it should be opt-out by using the unsafe_* API.

However, for compatibility with existing code, would expect to be able to turn off checks on a model superclass: SomeEngine::Model.guard_unsafe_raw_sql = false without affecting app models or other engines.

#
# Person.effective_column_names
# # => ["id", "created_at", "updated_at", "name", "age"]
def effective_column_names

This comment has been minimized.

@eileencodes

eileencodes Feb 8, 2017

Member

I think this method name is doesn't really explain what it's doing, but I haven't been able to come up with a better name yet. safe_column_names? attribute_column_names?

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

attribute_names_and_aliases?

if has_include?(column_names.first)
construct_relation_for_association_calculations.pluck(*column_names)
else
if ActiveRecord::Base.guard_unsafe_raw_sql && !@klass.effective_column_name?(column_names)

This comment has been minimized.

@eileencodes

eileencodes Feb 8, 2017

Member

I'd consider moving this into a method of it's own column_valid? or something like that.

end
end
def unsafe_raw_pluck(*column_names)

This comment has been minimized.

@eileencodes

eileencodes Feb 8, 2017

Member

Should this have documentation?

@maclover7 maclover7 added the needs work label Feb 8, 2017

#
# Person.effective_column_names
# # => ["id", "created_at", "updated_at", "name", "age"]
def effective_column_names

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

attribute_names_and_aliases?

def effective_column_names
@effective_column_names ||= begin
(attribute_names + attribute_aliases.keys).uniq
end

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Can drop the begin/end

end
end
def effective_column_name?(*names)

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

"Can I ask this model for attributes with these names?"

Person.respond_to_attributes? :foo, :bar

end
def effective_column_name?(*names)
(names.flatten.map(&:to_s) - effective_column_names).empty?

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

When names is typically small, would expect that iterating through those and checking whether they're attributes would be quicker than subtracting each attr: names.all? { |name| has_attribute?(name) || has_attribute_alias?(name) }

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

(That'd obviate the effective_column_names method, too)

return records.pluck(*column_names)
end
if has_include?(column_names.first)

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Can drop the early return above and make this an elsif

hash = args.last.is_a?(Hash) ? args.pop : {}
directions.concat(hash.values)
columns.concat(hash.keys)
columns.concat(args)

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Can do the assignments above here:

orderings = args.extract_options!
attribute_names = args | orderings.keys
raise ArgumentError, "Invalid order column: #{bad_columns}"
end
bad_directions = directions.map(&:to_s) - %w(asc desc ASC DESC)

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Rather than assign directions above, it's clear to ask for orderings.values here.

@@ -98,6 +98,13 @@ def self.error_on_ignored_order_or_limit=(value)
self.error_on_ignored_order = value
end
# :singleton-method:
# Specify the behavior for unsafe raw query methods. A value of true will

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

I get "unsafe," but how are these "raw?"

This comment has been minimized.

@mastahyeti

mastahyeti Feb 9, 2017

Contributor

I went with unsafe_raw because 1) they are unsafe and 2) they allow raw SQL. I'd be happy to drop the raw bit if you'd like.

ensure
ActiveRecord::Base.guard_unsafe_raw_sql = old_value
end
end

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Nice.

Coverage for the new unsafe_* API, too?

# rename methods like #order to #unsafe_raw_order and will validate
# arguments to #order. A value of false will do nothing.
mattr_accessor :guard_unsafe_raw_sql, instance_writer: false
self.guard_unsafe_raw_sql = false

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Agree it should be opt-out by using the unsafe_* API.

However, for compatibility with existing code, would expect to be able to turn off checks on a model superclass: SomeEngine::Model.guard_unsafe_raw_sql = false without affecting app models or other engines.

incorporate changes from reviews
rename config option. fix tests.

s/effective_column_names/attribute_names_and_aliases/

s/effective_column_name?/attribute_name_or_alias?/

move pluck column checking logic into its own method

document unsafe_raw methods

💄

s/attribute_name_or_alias?/respond_to_attributes?/

optimize respond_to_attributes?

remove attribute_names_and_aliases

💄

typo

dedup pluck/unsafe_raw_pluck logic with private method

only call restrict_order_args unless unsafe-raw enabled

💄

refactor a bit

typo

test unsafe_raw methods too (and fix them)
@@ -184,6 +177,27 @@ def ids
private
def _pluck(column_names, unsafe_raw)

This comment has been minimized.

@mastahyeti

mastahyeti Feb 9, 2017

Contributor

Is the _ prefix okay here, or should I choose something more descriptive like #pluck_internal?:

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

That's cool. We use single-_ elsewhere too.

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Would use a keyword arg for the allow_unsafe_raw: boolean.

@mastahyeti

This comment has been minimized.

Contributor

mastahyeti commented Feb 9, 2017

Breaks ordering by a SQL expression. That's fine, but the only workaround is to disable checks globally. Would expect to selectively opt out of checks.

Similar, need an allowlist of ordering modifiers, probably delegated to Arel or the conn adapter. For example, Oracle ORDER BY foo NULLS FIRST.

Good call. I should have known there would be other crazy syntaxes 😄. I'll see if arel has anything to help with this.

@mastahyeti

This comment has been minimized.

Contributor

mastahyeti commented Feb 9, 2017

I couldn't find any existing list of order modifiers that included NULLS FIRST/LAST. There is a VALID_DIRECTIONS constant, but it only includes ASC, and DESC. Should I just special case NULLS FIRST/LAST, or are you aware of other modifiers that should be included?

@jeremy

This comment has been minimized.

Member

jeremy commented Feb 9, 2017

Paraphrasing via @matthewd: We could use Arel.sql(…) to provide intentional SQL literals for "unsafe" cases. Then we don't split safe/unsafe API, don't need to pass special flags around, and have a clear compatibility path.

Then we could do a few modes for unsafe query checking: strict (new apps, raise on violations), deprecated (default, warn on violations that didn't use SQL literals), false (off, shush).

# # => false
def respond_to_attribute?(name)
name = name.to_s
attribute_names.include?(name) || attribute_aliases.include?(name)

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

attribute_types.key?(name) a tad quicker?

delegate :count, :average, :minimum, :maximum, :sum, :calculate, to: :all
delegate :pluck, :ids, to: :all
delegate :pluck, :unsafe_raw_pluck, :ids, to: :all

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

No biggie, but hard to review diffs when line breaks change too.

@@ -184,6 +177,27 @@ def ids
private
def _pluck(column_names, unsafe_raw)

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

That's cool. We use single-_ elsewhere too.

@@ -184,6 +177,27 @@ def ids
private
def _pluck(column_names, unsafe_raw)

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Would use a keyword arg for the allow_unsafe_raw: boolean.

records.pluck(*column_names)
elsif has_include?(column_names.first)
construct_relation_for_association_calculations.pluck(*column_names)
elsif unsafe_raw || unrecognized.none?

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Feels like this is a conditional checking the guard condition, but it's actually checking whether not to guard, leaving the raise in the else disconnected from the decisionmaking that leads to it. Compare with:

elsif !allow_unsafe_raw && unrecognized.any?
  raise …
else
  # actual pluck impl
end
raise ArgumentError, "Invalid order direction: #{unrecognized}"
end
end

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

👍

# TODO: find a better list of modifiers.
unrecognized = orderings.values.reject { |d| VALID_DIRECTIONS.include?(d.to_s) }
if unrecognized.any?
raise ArgumentError, "Invalid order direction: #{unrecognized}"

This comment has been minimized.

@jeremy

jeremy Feb 9, 2017

Member

Consider unrecognized.inspect so the message reflects the string/symbol syntax from the caller.

@jeremy

This comment has been minimized.

Member

jeremy commented Feb 9, 2017

I couldn't find any existing list of order modifiers that included NULLS FIRST/LAST. There is a VALID_DIRECTIONS constant, but it only includes ASC, and DESC. Should I just special case NULLS FIRST/LAST, or are you aware of other modifiers that should be included?

Think rolling with VALID_DIRECTIONS is the way to go since we already have it. Introducing other ordering modifiers is a separate effort.

@mastahyeti

This comment has been minimized.

Contributor

mastahyeti commented Feb 9, 2017

We could use Arel.sql(…) to provide intentional SQL literals for "unsafe" cases. Then we don't split safe/unsafe API, don't need to pass special flags around, and have a clear compatibility path.

My inexperience working with AR is showing: I don't follow that statement at all 😄.

@jeremy

This comment has been minimized.

Member

jeremy commented Feb 9, 2017

My bad 😊

Boils down to this:

>> Arel.sql('foo')
=> "foo"
>> Arel.sql('foo').class
=> Arel::Nodes::SqlLiteral

If we get an ordering that's an Arel::Nodes::SqlLiteral, then we skip the guard. Otherwise, we enforce the guard.

Example

unrecognized = column_names.reject do |cn|
  cn.is_a?(Arel::Nodes::SqlLiteral) || @klass.respond_to_attribute?(cn)
end

and

unrecognized = orderings.values.reject do |d|
  d.is_a?(Arel::Nodes::SqlLiteral) || VALID_DIRECTIONS.include?(d)
end

To use unsafe raw SQL, you'd do .order(Arel.sql('raw stuff')) and .pluck(Arel.sql(…)) instead of introducing a new set of .unsafe_raw_*('raw stuff') APIs.

Finally, to do a deprecation to encourage moves to Arel.sql style without breaking existing code, we could insert warnings like

unrecognized = orderings.values.reject do |d|
  d.is_a?(Arel::Nodes::SqlLiteral) || VALID_DIRECTIONS.include?(d) ||
    (@klass.allow_unsafe_raw_sql == :deprecated).tap do |warn|
      ActiveSupport::Deprecation.warn "Ordering in the #{d.inspect} direction is unrecognized and considered unsafe, vulnerable to SQL injection. Switch to `Arel.sql(#{d.to_s.inspect})` if this is intentional custom SQL ordering." if warn
    end
end
@matthewd

This comment has been minimized.

Member

matthewd commented Feb 9, 2017

If we get an ordering that's an Arel::Nodes::SqlLiteral, then we skip the guard.

Tiny logic tweak: we should just let any Arel node through

@t27duck

This comment has been minimized.

t27duck commented Feb 12, 2017

Sorry to interrupt, but I feel like this is putting the cart before the horse. It's relative common to pass raw SQL snippets into order as there is no other way the method allows developers to specify other tables in the ORDER BY.

Example use case:

class Game < ActiveRecord::Base
  # id - integer
  # platform_id - integer
  # title - string
  # ...
  belongs_to :platform
end

class Platform < ActiveRecord::Base
  # id - integer
  # name - string
  # ...
  has_many :games
end

If we want to display a table of games sorted by their platform, we have to do this in our code:

@games = Game.joins(:platform).order("#{Platform.quoted_table_name}.name ASC")

While I'm pretty sure most developers would love to instead be able to do this (similar to how where handles joined tables:

@games = Game.joins(:platform).order(platforms: {name: :asc})

Even if the change is to allow all Arel::Nodes::SqlLiteral objects passed into order it would still require some rework of existing code due to the limitation of the current API.

@matthewd

This comment has been minimized.

Member

matthewd commented Feb 12, 2017

Even if the change is to allow all Arel::Nodes::SqlLiteral objects passed into order it would still require some rework of existing code

@t27duck I'm not sure what you're suggesting. Surely any change that addresses the core problem (that the method accepts raw SQL, and that callers might reasonably not account for that) is going to require some change to existing callers?

@t27duck

This comment has been minimized.

t27duck commented Feb 12, 2017

Point taken about any changes to the public API would require developers to adjust. Perhaps I'm not 100% sold on blocking out raw strings to the interface.

What if, instead, ActionController::Parameters aren't allowed in? Is there a way to ask if params[:order_by] came from a request?

To your suggestion on allowing Arel::Nodes::SqlLiteral to be passed ... I don't think it really prevents anything as a developer can do this to bypass this proposed check:

@posts = Post.order(Arel::Notes::SqlLiteral.new(params[:order_by]))

All it would take is one person asking on stackoverflow "Why I can't pass a string into order" and one poorly thought out answer to suggest this "workaround".

@mastahyeti

This comment has been minimized.

Contributor

mastahyeti commented Feb 14, 2017

What if, instead, ActionController::Parameters aren't allowed in? Is there a way to ask if params[:order_by] came from a request?

We need to protect against model attributes in addition to parameters.

All it would take is one person asking on stackoverflow "Why I can't pass a string into order" and one poorly thought out answer to suggest this "workaround".

This is why I'd rather prefix dangerous method names with unsafe_raw_. I'm happy to defer that decision to core folks though.

@jeremy

This comment has been minimized.

Member

jeremy commented Feb 14, 2017

All it would take is one person asking on stackoverflow "Why I can't pass a string into order" and one poorly thought out answer to suggest this "workaround".

Same for bypassing strong params, HTML safety, and any other security measure. "Why can't I pass a string to #order?" → "Oh just change that to #unsafe_raw_order." Of course there is a spectrum from "cuts hand on contact" to "can shoot self in foot" to "impossible to use in error," so we need to keep our heads on straight about where we land on that spectrum and why.

Foremost, provide a fluent, sensible, obvious interface that Rails developers are happy to use and is secure in default, typical usage.

Secondarily, make it awkward for users to inadvertently construct a foot-gun from otherwise-safe tools.

People copy/pasting senseless API usage that bypasses the secure default? These are not the developers we want to hedge against when crafting our API.

In this case, exposing a single, safe set of relation builder methods is a great way to tackle both. Arguments are allowlisted by default, which catches inadvertent injection vulns, but that can be explicitly bypassed with raw SQL when needed. That's right in the sweet spot of the Active Record way: safe abstraction plus a moderately-vinegared bypass.

@mastahyeti

This comment has been minimized.

Contributor

mastahyeti commented Oct 26, 2017

Sorry to take so long with those last updates. I was on vacation. This branch should be good to go now?

@mastahyeti

This comment has been minimized.

Contributor

mastahyeti commented Nov 2, 2017

Bump. Is this ready to merge @matthewd?

@matthewd

This comment has been minimized.

Member

matthewd commented Nov 2, 2017

Yes, thanks 👍🏻

Sorry, I've been sitting on it because I feel reluctant to squash all of this work and history down to a single commit -- but the more I stare, the less I can see historically-relevant divisions to rebase it into a new series. 😕

@matthewd

This comment has been minimized.

Member

matthewd commented Nov 9, 2017

@mastahyeti so I had a go at a semi-squash that I think retains the core narrative of our design process, but cuts out the cross-merges and fixups: rails:master...matthewd:unsafe_raw_sql

Are you okay with that series? The commits still have your name on them, so even though they're just re-stitchings of your originals, I'd rather get your +1 before I merge.

@mastahyeti

This comment has been minimized.

Contributor

mastahyeti commented Nov 13, 2017

That series looks great @matthewd. Thanks for taking care of that.

matthewd added a commit that referenced this pull request Nov 14, 2017

Merge pull request #27947 from mastahyeti/unsafe_raw_sql
Disallow raw SQL in dangerous AR methods
@matthewd

This comment has been minimized.

Member

matthewd commented Nov 14, 2017

🎉 a1ee43d 🎉

Thanks again for all your work on this @mastahyeti -- I'm really excited about how much this will improve everyone's safety!

@matthewd matthewd closed this Nov 14, 2017

@mastahyeti

This comment has been minimized.

Contributor

mastahyeti commented Nov 14, 2017

Thanks for all the help getting this shipped!

@jeremy jeremy added this to the 5.2.0 milestone Nov 14, 2017

@pschambacher

This comment has been minimized.

Contributor

pschambacher commented Nov 14, 2017

Congrats to make our lives easier by having to worry less about sql injection.
Just to be sure, if we have code like this:

MyModel.pluck('distinct(`my_models`.`name`)')

Should I replace it with the following?

MyModel.pluck(Arel.sql('distinct(`my_models`.`name`)'))

I feel like I'm going to miss Squeel again.

@mastahyeti

This comment has been minimized.

Contributor

mastahyeti commented Nov 14, 2017

Should I replace it with the following?

Yes.

@matthewd

This comment has been minimized.

Member

matthewd commented Nov 15, 2017

@pschambacher or MyModel.distinct.pluck(:name)

@mastahyeti mastahyeti deleted the mastahyeti:unsafe_raw_sql branch Jan 5, 2018

@Fernan2

This comment has been minimized.

Fernan2 commented Mar 5, 2018

It's pretty obvious that Post.order("length(title)") is safe, because it contains no user inputs, while Post.order(params[:my_order]) is not... but both are treated as unsafe. I wonder if that's a needed pain because there's no way to distinguish between them, or if treating Post.order("length(title)") as unsafe is really intended.

Also I'd like to know if Arel.sql() is the way to deal with NULLS LAST sorting...

flackou pushed a commit to flackou/geokit-rails that referenced this pull request Apr 17, 2018

JasonBarnabe added a commit to kickbooster/wice_grid that referenced this pull request Jun 28, 2018

Fix 'Dangerous query methods' deprecation in Rails 5.2
Ref: rails/rails#27947

- Use Arel objects instead of Strings whenever we can
- Allow Arel objects with custom_order option
- Add tests for Arel and Proc custom_orders
- Update readme and changelog

kamipo added a commit to kamipo/rails that referenced this pull request Jul 10, 2018

Allow frozen string literal as a safe raw SQL string
rails#27947 was intended to address `Post.order(params[:order])` which
leading to SQL injection.

But the protection caused very annoying warnings, especially for calling
by a string literal which invoke a function like `Post.order("length(title)")`.

Calling by a string literal should be safe since it is not an user
input. So I'd like to allow string literals as a safe raw SQL string.

There is no reliable way to distinguish between`params[:order]` and
`"length(title)"`, but at least `params[:order]` as an user input is
always a mutable string.

So I think that we can add frozen strings/objects to the whitelist to
mitigate the annoying warnings without losing the original intention of
rails#27947.

Resolves rails#32995.

prathamesh-sonpatki added a commit to codetriage/codetriage that referenced this pull request Jul 27, 2018

Disallow Raw SQL by using Arel.sql for the raw part of the query
- Followup of rails/rails#27947
- Silences bunch of deprecation warnings.

prathamesh-sonpatki added a commit to codetriage/codetriage that referenced this pull request Jul 27, 2018

Disallow Raw SQL by using Arel.sql for the raw part of the query (#725)
- Followup of rails/rails#27947
- Silences bunch of deprecation warnings.
@matthewd

This comment has been minimized.

Member

matthewd commented Sep 25, 2018

@harshal0608 this change adds an extra confirmation of intent for reorder, but as discussed above, that doesn't apply for find_by and where: they are operating as intended, and can't be "fixed" any more than eval can.

kamipo added a commit to kamipo/rails that referenced this pull request Sep 28, 2018

Allow frozen string literal as a safe raw SQL string
#27947 was intended to address `Post.order(params[:order])` which
leading to SQL injection.

But the protection caused very annoying warnings, especially for calling
by a string literal which invoke a function like `Post.order("length(title)")`.

Calling by a string literal should be safe since it is not an user
input. So I'd like to allow string literals as a safe raw SQL string.

There is no reliable way to distinguish between`params[:order]` and
`"length(title)"`, but at least `params[:order]` as an user input is
always a mutable string.

So I think that we can add frozen strings/objects to the whitelist to
mitigate the annoying warnings without losing the original intention of
#27947.

Resolves #32995.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment