Skip to content

Conversation

KevinCybura
Copy link
Contributor

No description provided.

@cswanson310 cswanson310 self-requested a review March 23, 2018 19:42
@cswanson310 cswanson310 self-assigned this Mar 23, 2018
Copy link
Contributor

@cswanson310 cswanson310 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Mostly stylistic comments and a couple suggestions that might help simplify the code a bit.

* it in the license file.
*/


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid stray whitespace changes. It makes it harder to review and complicates the git blame unnecessarily. Not generally a big deal, but please avoid it going forward.

: ExpressionRangedArity<ExpressionIndexOfArray, 2, 4>(expCtx) {}

Value evaluate(const Document& root) const final;
Value evaluate(const Document& root) const ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra space after 'const'.


Value evaluate(const Document& root) const final;
Value evaluate(const Document& root) const ;
std::vector<Value> parseDeps(const Document& root,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment here describing what this does. Also - parseDeps doesn't really give me a clue what this method does, so a more descriptive name like 'evaluateAndValidateArguments' might be more appropriate. Finally, if this is only used within IndexOfArray and IndexOfArray::Optimized, can you move it to the private or protected section?

});
}

TEST(ExpressionArrayTest, ExpressionArrayWithALlConstantValuesShouldOptimizeToExpressionConstant) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra capital in WithALlConstant

* This class handles the case where IndexOfArray is given an ExpressionConstant
* instead of using a vector and searching through it we can use a unordered_map
* for O(1) lookup time.
**/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra * in close, just one here.

};

intrusive_ptr<Expression> ExpressionIndexOfArray::optimize() {
ExpressionNary::optimize();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment here indicating that this will optimize the arguments to this expression. In particular, if all arguments are constant, this should immediately evaluate to a constant. Can you add tests for that like you did for ExpressionArray?

intrusive_ptr<Expression> ExpressionIndexOfArray::optimize() {
ExpressionNary::optimize();

// If the input arr is an ExpressionConstant we can optimize using a unordered_map instead of an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't abbreviate - it just makes it harder to understand, which is counter to the point of the comment. arr -> array.

// If the input arr is an ExpressionConstant we can optimize using a unordered_map instead of an
// Array
if (dynamic_cast<ExpressionConstant*>(vpOperand[0].get())) {
ExpressionConstant* ec = dynamic_cast<ExpressionConstant*>(vpOperand[0].get());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid short variable names like ec. Also, you can combine this with the if statement above:
if (auto constantArray = dynamic_cast<ExpressionConstant*>(vpOperand[0].get())) {

ExpressionConstant* ec = dynamic_cast<ExpressionConstant*>(vpOperand[0].get());
const Value valueArray = ec->getValue();
if (valueArray.nullish()) {
return this;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We know what the answer is in this case, just return ExpressionConstant::create(getExpressionContext(), Value(BSONNULL)) to save time. Also add a test case for this please :)

getExpressionContext()->getValueComparator().makeUnorderedValueMap<int>();

for (int i = 0; i < int(arr.size()); i++) {
auto pair = std::make_pair(arr[i], i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional, re-write this as
indexMap.emplace(arr[i], i); or if that doesn't work, indexMap[arr[i]] = i;? I'm curious why you made a pair here.

Copy link
Contributor

@cswanson310 cswanson310 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more points of feedback.

for (int i = 0; i < int(arr.size()); i++) {
auto pair = std::make_pair(arr[i], i);
indexMap.insert(pair);
// auto pair = std::make_pair(arr[i], i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove dead code.

ExpressionConstant* ec = dynamic_cast<ExpressionConstant*>(vpOperand[0].get());
const Value valueArray = ec->getValue();
// If the input array is an ExpressionConstant we can optimize using a unordered_map instead of an
// array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

End comments in periods - also lines 2801 and in the header.

protected:
// Evaluates and validates the value that is being searched for and if given the start index and
// the end index passed to ExpressionIndexOfArray returns arguments in the form of a
// vector<Value>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/**
 * Use this syntax for comments documenting the behavior of a function or method.
 */

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also - can we re-word this comment to make it more clear what the vector of values will be? How about something like
When given 'operands' which correspond to the arguments to $indexOfArray, evaluates and validates the target value, starting index, and ending index arguments and returns their values in that order. The starting index and ending index are optional, so the returned vector will have a length between 1 and 3. Throws a UserException if the values are found to be invalid in some way, e.g. if the indexes are not numbers.

vector<Value> ExpressionIndexOfArray::parseDeps(const Document& root,
vector<Value> ExpressionIndexOfArray::evaluateAndValidateArguments(const Document& root,
const ExpressionVector& operands,
size_t arrayLength) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arrayLength is unused?

vector<Value> ExpressionIndexOfArray::evaluateAndValidateArguments(const Document& root,
const ExpressionVector& operands,
size_t arrayLength) const {
std::vector<Value> deps;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deps is a strange name for this - how about validatedArguments?

auto expIndexOfArray = Expression::parseExpression(
expCtx,
BSON("$indexOfArray" << BSON_ARRAY(BSON_ARRAY(BSON("$add" << BSON_ARRAY(1 << 1)) << 1 << 1)
// Start index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the start index, I think this is the value you're searching for?

ASSERT_TRUE(shouldBeNary);

auto optimizedArgs = shouldBeNary->getOperandList();
// All arguments should be ExpressionConstants.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, interesting - I would have expected it to fully optimize to a constant in this case... I wonder why it didn't? Can you make sure it does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is because we are optimizing an ExpresssionIndexOfArray not evaluating so it should still be an ExpressionIndexOfArray.

Im casting it to an ExpressionNary because I need to be able to call getOperandList() to determine if the arguments optimized. I think this is where the confusion is to clearify I will cast it to an ExpressionIndexOfArray

// Nullish input array should become a Value(BSONNULL).
ASSERT_VALUE_EQ(Value(BSONNULL), shouldBeExpressionConstant->getValue());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we aren't adding any coverage of actually evaluating $indexOfArray when it can be optimized, can you either add some tests here which exercise the new code (create an IndexOfArray with a constant array and various other arguments, making sure to call optimize() before evaluate()), or find some that are already written to prove that it will be exercised?

};

intrusive_ptr<Expression> ExpressionIndexOfArray::optimize() {
// This is optimize all arguments to this expression.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will optimize ...

getExpressionContext()->getValueComparator().makeUnorderedValueMap<int>();

for (int i = 0; i < int(arr.size()); i++) {
indexMap.emplace(arr[i], i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, we should not add a value if it already occurred so that we only ever return the first occurrence.

auto comparator = getExpressionContext()->getValueComparator();
Value vIndex = Value(index->second);
if (index != _indexMap.end() && comparator.evaluate(Value(index->second) >= operands[1]) &&
comparator.evaluate(Value(index->second) < operands[2])) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this is pretty awkward. You have to keep using comparator.evaluate(...) because operands[1] and operands[2] are of type Value. In reality, we know they will be of type int, right? Or even size_t?

Can we change the signature of evaluateAndValidateArguments to return a struct, something like

struct Arguments {
    Arguments(Value targetOfSearch, int startIndex, boost::optional<int> endIndex) : 
        targetOfSearch(targetOfSearch), startIndex(startIndex), endIndex(endIndex) {}

    Value targetOfSearch;
    int startIndex;
    boost::optional<int> endIndex;
};

It might be more complicated in the parsing code, but I think it would make this much easier to read.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, how do you know that operands[2] is valid? Couldn't the user have left off the end index?

intrusive_ptr<ExpressionContextForTest> expCtx(new ExpressionContextForTest());
VariablesParseState vps = expCtx->variablesParseState;


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete extra newline.

ASSERT_TRUE(shouldBeIndexOfArray);

auto optimizedArgs = shouldBeIndexOfArray->getOperandList();
// All arguments should be ExpressionConstants.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you change the optimize implementation to return what ExpressionNary::optimize() returns then the whole expression should evaluate to a constant.

}

TEST(ExpressionIndexOfArray,
OptimizedExpressionIndexOfArrayWithConstantArgumentsShouldEvaluateProperly) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good test, but because you're using all constant values, I think you will hit the case where all arguments are constant instead of when just the array is constant. If you change the search values (e.g. 2 and 10) to be something like $x, then use Document{{"x", 2}} as the argument to evaluate(), you'll be testing what you want to test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing for the test below.

// Should evaluate to -1 if not found in range.
ASSERT_VALUE_EQ(Value(-1), optimizedIndexNotFoundInRange->evaluate(Document()));
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test where there are duplicate values in the array? Such a test should have caught the bug you mentioned with inserting into the unordered_map twice.

index->second < args.endIndex) {
return Value(index->second);
} else {
// If the item we are searching is not found or if not found in given range return -1.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I actually think this won't work. Consider the following:

{$indexOfArray: [[0, 1, "a", 0, 1, 1, "a"], "a", 3, 8]}

I think that the map here will be populated with 0-> 0, 1-> 1, "a" -> 2, but then it has lost the information about where the second or third occurrences of those values are. I think we need to map a value to a vector<int> of all the positions so that we can rule out those which are out of range.

… unorderedmap<vector<int> and added additional comments
Copy link
Contributor

@cswanson310 cswanson310 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good! I think all the changes I proposed are more cosmetic, they should do the same thing, but with less code :)

Value startIndexArg = operands[2]->evaluate(root);
uassertIfNotIntegralAndNonNegative(startIndexArg, getOpName(), "starting index");
startIndex = static_cast<size_t>(startIndexArg.coerceToInt());
startIndex = static_cast<int>(startIndexArg.coerceToInt());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need a static_cast here.

startIndex = static_cast<size_t>(startIndexArg.coerceToInt());
startIndex = static_cast<int>(startIndexArg.coerceToInt());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra newline.

uassertIfNotIntegralAndNonNegative(endIndexArg, getOpName(), "ending index");
// Don't let 'endIndex' exceed the length of the array.
endIndex = std::min(array.size(), static_cast<size_t>(endIndexArg.coerceToInt()));
endIndex = std::min(arrayLength, static_cast<size_t>(endIndexArg.coerceToInt()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need a static_cast here.

for (size_t i = startIndex; i < endIndex; i++) {
if (getExpressionContext()->getValueComparator().evaluate(array[i] == searchItem)) {
return Value(static_cast<int>(i));
return ExpressionIndexOfArray::Arguments(vpOperand[1]->evaluate(root), startIndex, endIndex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the complier knows the return types, so you could just say
return {vpOperand[1]->evaluate(root), startIndex, endIndex}

Or, if you prefer to keep the name, this method is defined on ExpressionIndexOfArray, so you don't need to include the ExpressionIndexOfArray:: prefix here.

virtual Value evaluate(const Document& root) const {
auto args = evaluateAndValidateArguments(root, vpOperand, _indexMap.size());
auto indexVec = _indexMap.find(args.targetOfSearch);
if (indexVec != _indexMap.end()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be easier to read as

if (indexVec == _indexMap.end()) {
    return Value(-1);
}

// Search through the vector of indices for the first index in our range.
...  // Un-indented code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, you could just leave off the else, and move the return Value(-1); to always be the last thing that happens, whether you get into the if statement or not.

fromjson("{ $indexOfArray : [ [0, 1, 2, 3, 4, 5] , '$x'] }"),
expCtx->variablesParseState);
auto optimizedIndexOfArray = expIndexOfArray->optimize();
ASSERT_VALUE_EQ(Value(1), optimizedIndexOfArray->evaluate(Document{{"x", 1}}));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well add a bunch of these:

ASSERT_VALUE_EQ(Value(0), optimizedIndexOfArray->evaluate(Document{{"x", 0}}));
ASSERT_VALUE_EQ(Value(1), optimizedIndexOfArray->evaluate(Document{{"x", 1}}));
ASSERT_VALUE_EQ(Value(2), optimizedIndexOfArray->evaluate(Document{{"x", 2}}));
ASSERT_VALUE_EQ(Value(3), optimizedIndexOfArray->evaluate(Document{{"x", 3}}));
ASSERT_VALUE_EQ(Value(4), optimizedIndexOfArray->evaluate(Document{{"x", 4}}));
ASSERT_VALUE_EQ(Value(5), optimizedIndexOfArray->evaluate(Document{{"x", 5}}));

expCtx->variablesParseState);
auto optimizedIndexOfArray = expIndexOfArray->optimize();
ASSERT_VALUE_EQ(Value(1), optimizedIndexOfArray->evaluate(Document{{"x", 1}}));
auto IndexNotFound =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to re-parse or optimize, you can use the same one.

auto optimizedIndexOfArray = expIndexOfArray->optimize();
ASSERT_VALUE_EQ(Value(4), optimizedIndexOfArray->evaluate(Document{{"x", 4}}));

auto IndexNotFoundInRange = Expression::parseExpression(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can re-use optimizedIndexOfArray

ASSERT_VALUE_EQ(Value(2),
optimizedIndexOfArrayWithDuplicateValues->evaluate(Document{{"x", 2}}));

auto expIndexInRangeWithhDuplicateValues = Expression::parseExpression(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I think you can re-use the expression from above.

optimizedIndexInRangeWithDuplcateValues->evaluate(Document{{"x", 2}}));
}


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra newline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants