Coercing conditions with array-like objects into arrays #1143
Conversation
Can one of the admins verify this patch? To accept patch and trigger a build add comment ".ok\W+to\W+test."
Going to move this to a new issue in strong-remoting instead, so it doesn't sidetrack fixing the original issue.
@slnode ok to test
Force-pushed from b948624 to a341f76
Thanks @doublemarked! The changes LGTM. Can you fix the commit title to be under 50 chars? @Amir-61 can you please take a look too? Some lines seem quite long, but I believe juggler has more lenient linting conventions; can you confirm?
@davidcheung thanks for your feedback. Yes, happy to fix the conflicts and improve linting compliance. Let's see if @Amir-61 has anything to add.
throw new Error(g.f('Value is not an {{array}} or {{array-like object}}'));
}

var arrayVal = new Array(Object.keys(val).length);
If I understand this correctly, then this is opening a security vulnerability, where the attacker can send an array-like object { 4000000000: true } and we will create a 4GB array containing a single item at a large index. This is reverting the prevention measures implemented by qs.
strongloop/loopback#2824 (comment)
Also in that issue they reveal that arrayLimit exists to prevent params like key[999999] auto-vivifying an array of that length, which I'll keep in mind to avoid as well.
I think we need to find a different solution here.
@bajtos Are you sure? I tried to implement this in a way that specifically avoids such an issue.
Scallop:Projects heath$ node -v
v6.9.1
Scallop:Projects heath$ node
> keys = Object.keys({ 4000000000: true })
[ '4000000000' ]
> a = new Array(keys.length)
[ ]
$ node
> const arg = { 4000000000: true };
undefined
> var result = new Array(Object.keys(arg).length)
undefined
> result[4000000000]=true;
true
> result
[ , , , , , , , ,
  ... 3999999901 more items ]
Fortunately, V8 seems smart enough to allocate a sparse array where empty items do not take any (or very little) memory.
However, notice that the maximum array size is limited:
> new Array(5000000000)
RangeError: Invalid array length
In general, I am not comfortable overriding qs decisions about security in LoopBack. I mean, if the team working on qs believes that automatically converting indexed query parameters into an array is a security vulnerability, then I trust them and don't want to undo their security-hardening measures in LoopBack. If you believe they are wrong and it's OK to allow arbitrarily large arrays, then please talk to them and convince them to remove this restriction.
Hey @bajtos :) So, there are two aspects here, the PR code's behavior and parity with qs, both of which I'll address.
Regarding the code from this PR - this code specifically aims to avoid the issue you're concerned about, but at the very least, the fact that this is not clear means that something needs to change here. That said, let me decompose the code a bit to explain it.
The code operates on the assumption that a well-formed "array-like object" contains a series of sequential numeric indices starting from zero. It initializes the arrayVal array with the length of the keys from the source content (Object.keys(arg).length, which is not an arbitrarily large number), and is very careful never to perform an assignment like the one from your example (result[4000000000]=true).
To avoid assignments like that, it iterates from 0 up to the keys length and hard-fails on any missing or unexpected key. In addition to protecting against arbitrarily large keys, it also rejects any object cluttered with unexpected non-sequential or non-numeric keys. Does this make sense?
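Put together, the strict coercion described here can be sketched as a standalone function (illustrative only; the actual patch formats its error message through the module's g.f globalization helper):

```javascript
// Sketch of the strict coercion described above (standalone, illustrative;
// the real patch formats its error through the module's g.f helper).
function coerceArray(val) {
  if (Array.isArray(val)) return val;
  if (typeof val !== 'object' || val === null) {
    throw new Error('Value is not an array or array-like object');
  }
  // Allocate using the number of own keys, never an attacker-supplied index.
  var arrayVal = new Array(Object.keys(val).length);
  for (var i = 0; i < arrayVal.length; ++i) {
    // Hard-fail on any gap or unexpected key, so {4000000000: true}
    // and {0: 'a', bogus: 'b'} are both rejected.
    if (!val.hasOwnProperty(i)) {
      throw new Error('Value is not an array or array-like object');
    }
    arrayVal[i] = val[i];
  }
  return arrayVal;
}

console.log(coerceArray({0: '10', 1: '20'})); // [ '10', '20' ]
```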
Now, regarding qs's approach - yeah, generally I agree that it's better to avoid the potential for subtle discrepancies. However, they seem to have decided to entirely disable this array parsing by default in a future release, meaning even single-element array-like parameters would be represented as objects. The premise behind this PR was to anticipate that upcoming change, and my implementation aims to provide a more conservative approach to array processing than qs currently performs.
So it seems like there are several routes to go here regarding qs's arrayLimit:
- Do nothing; however, this will be a decision to (in the future) disable by default support for array-like parameters (filter[where][someprop][inq][0]=...).
- Pick a more reasonable default for arrayLimit as part of the default LoopBack config. Currently it defaults to 20, which is easy to stumble into unexpectedly (potentially disastrously). A substantially larger value (1000?) would create no practical risk and likely avoid the problem in almost all cases.
- A patch like the one from this PR, which seeks to provide support for array-like objects in the future.
Thoughts?
EDIT: Sorry, quick additional comment:
A substantially larger value (1000?) would create no practical risk and likely avoid the problem in almost all cases.
So, 1000-element arrays shouldn't generally be a problem, but this is an assumption that may be wrong in this context. If a larger default seems like a viable solution I can poke into the qs thread on this topic to see if they agree with this risk assessment.
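For reference, raising arrayLimit would amount to overriding Express's query parser with a larger qs limit, along these lines (a sketch; the value 1000 and the wiring into a LoopBack app are assumptions, not settled choices):

```javascript
// Sketch: raising qs's arrayLimit (default 20) via Express's 'query parser'
// application setting. The limit of 1000 is the hypothetical value
// discussed above, not a vetted recommendation.
var qs = require('qs');
var app = require('express')();

app.set('query parser', function(str) {
  return qs.parse(str, {arrayLimit: 1000});
});
```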
(face palm) I misunderstood what your code was doing, thank you for the detailed explanation! Your proposal looks very reasonable now.
I'll be on vacation in the next few days; I'll try to review this in more detail later next week, so that we can move your pull request forward. Sorry for the delay!
var arrayVal = new Array(Object.keys(val).length);
for (var i = 0; i < arrayVal.length; ++i) {
  if (!val.hasOwnProperty(i)) {
    throw new Error(g.f('Value is not an {{array}} or {{array-like object}}'));
If I understand this code (and the test cases) correctly, when processing my example result[4000000000]=true, you will only access result[0] and bail out with an error.
I am wondering whether we should iterate over all keys in the input val, thus converting result[4000000000]=true to a single-item array [true]?
var arrayVal = [];
for (var key in val) {
arrayVal.push(val[key]);
}
return arrayVal;
If we want to reject such requests, then I would like to see a better error message reported back to the user. Right now, coerceArray({ 40000: true }) fails with "Value is not an array or array-like object", which is misleading to me, because the input (result[40000]=true) is an array-like object.
@doublemarked Thoughts?
If I understand this code (and the test cases) correctly, when processing my example result[4000000000]=true, you will only access result[0] and bail out with an error.
Yes, correct!
I am wondering whether we should iterate over all keys in the input val, thus converting result[4000000000]=true to a single-item array [true]?
My objective when writing this was to maintain a strict understanding of an "array-like object" to limit the opportunity for subtle bugs, as I didn't see any circumstance where that structure of input data is not the result of some mistake. In particular I was concerned that the presence of non-index keys was an indication that the object was not intended to be interpreted as an array.
But honestly, looking at this again, the rather few places where arrays are permitted (inq, nin, between, and, or) leave little room for unintended consequences. One ambiguity I see is with between, which depends on correct element order, but a key sort would probably produce the intended behavior.
So anyway, in summary - I'd be happy to refactor the patch to loosen the parsing a bit if you feel that would be an improvement.
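A loosened variant along those lines might look like this (purely illustrative; the numeric key sort is the part that keeps order-sensitive operators like between safe):

```javascript
// Illustrative sketch of the loosened coercion: accept any object whose own
// keys are all numeric, sorting them so that order-sensitive operators such
// as `between` still see elements in index order.
function looseCoerceArray(val) {
  if (Array.isArray(val)) return val;
  if (typeof val !== 'object' || val === null) {
    throw new Error('Value is not an array or array-like object');
  }
  var keys = Object.keys(val);
  if (!keys.every(function(k) { return /^\d+$/.test(k); })) {
    // Non-numeric keys suggest the object was never meant to be an array.
    throw new Error('Value is not an array or array-like object');
  }
  return keys
    .sort(function(a, b) { return a - b; })
    .map(function(k) { return val[k]; });
}

console.log(looseCoerceArray({4000000000: true})); // [ true ]
```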
To be honest, I don't have a strong opinion. Considering that it's usually easier to loosen a strict implementation, than to make a loose implementation more strict, I think we should proceed with your current proposal.
} else {
  try {
    clauses = coerceArray(clauses);
  } catch (e) {
    err = new Error(g.f('The %s operator has invalid clauses %j', p, clauses));
Please log the e error (at least via debug), so that users have a way to investigate what exactly is wrong with their clauses.
case 'between':
  try {
    val = coerceArray(val);
  } catch (e) {
Ditto - please log e. I think it may be best to include it in the error message?
debug('Cannot coerce array clause: ', e);
err = new Error(g.f('The %s property has invalid clause %j: %s', p, where[p], e.message));
it('should be able to coerce where clauses with array-like objects', function() {
  where = model._coerce({scores: {0: '10', 1: '20'}});
  assert.deepEqual(where, {scores: [10, 20]});
One assert per test please. Split this test into multiple smaller tests, each test calling _coerce + assert only once.
(I do realise the existing code does not follow this best practice.)
{scores: {between: {0: '10', 1: '20', 2: '30'}}},
];

invalid.forEach(function(where) {
Ditto, please split this into multiple tests. You can write a parameterised test case, e.g.
context('malformed array-like objects for array-only operators', function() {
var INVALID_WHERES = [
{scores: {inq: {0: '10', 1: '20', 4: '30'}}},
{scores: {inq: {0: '10', 1: '20', bogus: 'true'}}},
{scores: {between: {0: '10', 1: '20', 2: '30'}}},
];
INVALID_WHERES.forEach(function(where) {
it('should reject ' + JSON.stringify(where), function() {
expect(function() { model._coerce(where) })
.to.throw(/*...*/);
});
});
});
@doublemarked what's the status of this pull request? Are you waiting for any information from me? I'll be on vacation from Friday to the end of the year, I would like to resolve any blockers before I leave.
@bajtos yes, I was waiting on your feedback before cracking the code back open to fix the more minor things. I don't think I need anything else. I should be able to wrap things up by EOD tomorrow, so we can try to get it closed out before your leave. Thanks!
Force-pushed from a341f76 to c685763
The query-string parser used by express https://github.com/ljharb/qs#parsing-arrays limits the size of arrays that are created from query strings to 20 items. Arrays larger than that are converted to objects using numeric indices. This commit fixes the coercion algorithm used by queries to treat number-indexed objects as arrays. We still maintain a strict understanding of an "array-like object" to limit the opportunity for subtle bugs. In particular, the presence of non-index keys is an indication that the object was not intended to be interpreted as an array.
Force-pushed from c685763 to 2377792
Hi @doublemarked, sorry for the delay. I have applied a few last improvements to the code, rebased on top of the current master, and git push -f to your feature branch.
Let's wait for CI results before landing.
Landed, thank you for the contribution! 🙇
Fixes strongloop/loopback#2824
Please let me know if this is aligned with your goals for DataAccessObject._coerce.