Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect behavior when doing BatchWriteItem() of GetUnprocessedItems() #358

Closed
halayli opened this issue Dec 4, 2016 · 9 comments
Closed

Comments

@halayli
Copy link

halayli commented Dec 4, 2016

When I do a BatchWriteItem and the result has unprocessed items, I try to resend them but the ones that succeeded show up as nulls in the next request.

Here's how the data shows up on the wire:

initial request

   "RequestItems" : {
      "dev_values_0min_1480860274" : [
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAAAoAAAAAAAAAAAAAAAAAAAAHAAAAAAAAAFWpbAAAAAAA"
                  },
                  "id_location" : {
                     "S" : "9-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863254"
                  }
               }
            }
         },
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAAAoAAAAAAAAAAQAAAAAAAAAGAAAAAAAAAH7KcgAAAAAA"
                  },
                  "id_location" : {
                     "S" : "3-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863255"
                  }
               }
            }
         },
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAABIAAAAAAAAAAQAAAAAAAAAJAAAAAAAAANFNawAAAAAA"
                  },
                  "id_location" : {
                     "S" : "13-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863255"
                  }
               }
            }
         },
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAAAAAAAAAAAAAFgAAAAAAAAAwAQAAAAAAAAAAAAAAAAAA"
                  },
                  "id_location" : {
                     "S" : "7-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863254"
                  }
               }
            }
         },
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAAC0AAAAAAAAAOQEAAAAAAACxAQAAAAAAAJQ3AAAAAAAA"
                  },
                  "id_location" : {
                     "S" : "18-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863254"
                  }
               }
            }
         },
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAAAAAAAAAAAAAAQAAAAAAAAANAAAAAAAAAFLUxAAAAAAA"
                  },
                  "id_location" : {
                     "S" : "14-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863255"
                  }
               }
            }
         }
      ]
   }
}

response

{"UnprocessedItems":{"dev_values_0min_1480860274":[{"PutRequest":{"Item":{"timestamp":{"N":"1480863255"},"id_location":{"S":"3-us-west-1"},"data":{"B":"BAAAAAoAAAAAAAAAAQAAAAAAAAAGAAAAAAAAAH7KcgAAAAAA"}}}},{"PutRequest":{"Item":{"timestamp":{"N":"1480863255"},"id_location":{"S":"13-us-west-1"},"data":{"B":"BAAAABIAAAAAAAAAAQAAAAAAAAAJAAAAAAAAANFNawAAAAAA"}}}},{"PutRequest":{"Item":{"timestamp":{"N":"1480863254"},"id_location":{"S":"7-us-west-1"},"data":{"B":"BAAAAAAAAAAAAAAAFgAAAAAAAAAwAQAAAAAAAAAAAAAAAAAA"}}}},{"PutRequest":{"Item":{"timestamp":{"N":"1480863254"},"id_location":{"S":"18-us-west-1"},"data":{"B":"BAAAAC0AAAAAAAAAOQEAAAAAAACxAQAAAAAAAJQ3AAAAAAAA"}}}},{"PutRequest":{"Item":{"timestamp":{"N":"1480863255"},"id_location":{"S":"14-us-west-1"},"data":{"B":"BAAAAAAAAAAAAAAAAQAAAAAAAAANAAAAAAAAAFLUxAAAAAAA"}}}}]}}

Second request to send the unprocessed items (notice the nulls in the beginning)

  "RequestItems" : {
      "dev_values_0min_1480860274" : [
         null,
         null,
         null,
         null,
         null,
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAAAoAAAAAAAAAAQAAAAAAAAAGAAAAAAAAAH7KcgAAAAAA"
                  },
                  "id_location" : {
                     "S" : "3-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863255"
                  }
               }
            }
         },
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAABIAAAAAAAAAAQAAAAAAAAAJAAAAAAAAANFNawAAAAAA"
                  },
                  "id_location" : {
                     "S" : "13-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863255"
                  }
               }
            }
         },
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAAAAAAAAAAAAAFgAAAAAAAAAwAQAAAAAAAAAAAAAAAAAA"
                  },
                  "id_location" : {
                     "S" : "7-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863254"
                  }
               }
            }
         },
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAAC0AAAAAAAAAOQEAAAAAAACxAQAAAAAAAJQ3AAAAAAAA"
                  },
                  "id_location" : {
                     "S" : "18-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863254"
                  }
               }
            }
         },
         {
            "PutRequest" : {
               "Item" : {
                  "data" : {
                     "B" : "BAAAAAAAAAAAAAAAAQAAAAAAAAANAAAAAAAAAFLUxAAAAAAA"
                  },
                  "id_location" : {
                     "S" : "14-us-west-1"
                  },
                  "timestamp" : {
                     "N" : "1480863255"
                  }
               }
            }
         }
      ]
   }
}

The code behind this:

...
std::map<std::string, std::vector<WriteRequest>> results;
// results get populated

...
  BatchWriteItemRequest req;

  req.SetRequestItems(results);
  req.SetReturnConsumedCapacity(ReturnConsumedCapacity::TOTAL);


  while (1) {

    auto outcome = client_->BatchWriteItem(req);

    if (!outcome.IsSuccess()) {
      auto error = outcome.GetError();
      auto error_type = error.GetErrorType();
      LOG(ERROR) << "Failed to batch write. data lost. ("
                 << outcome.GetError().GetMessage() << ")";
      if (error_type != DynamoDBErrors::PROVISIONED_THROUGHPUT_EXCEEDED &&
          error_type != DynamoDBErrors::THROTTLING &&
          error_type != DynamoDBErrors::SLOW_DOWN &&
          error_type != DynamoDBErrors::NETWORK_CONNECTION)
        break;
    } else {
      const auto& unprocessed = outcome.GetResult().GetUnprocessedItems();
      if (unprocessed.empty())
        break;
      req.SetRequestItems(unprocessed);
    }

    // back off exponentionally
    usleep((pow(2, num_throttled++) * 50) * 1000);
  }
@halayli
Copy link
Author

halayli commented Dec 4, 2016

The issue turns out that the vector allocated to hold the results is begin initialized with empty values followed by pushing the new values. I believe what's meant was to reserve the vector or use the [] operator to insert at the specified indexes. but the code is doing both resulting in twice the # of elements to be inserted where the first half are nulls because of the initialization.

The line below will create a vector of size = GetLength()

https://github.com/aws/aws-sdk-cpp/blob/master/aws-cpp-sdk-dynamodb/source/model/BatchWriteItemResult.cpp#L45

Following this, is https://github.com/aws/aws-sdk-cpp/blob/master/aws-cpp-sdk-dynamodb/source/model/BatchWriteItemResult.cpp#L48

which will push_back the new entries on top of the allocated empty entries.

@halayli
Copy link
Author

halayli commented Dec 4, 2016

I did some grepping and found other potential bugs waiting to happen because of the same pattern.

$grep  --include \*.cpp -riEn 'vector<.+> .+\(' . | grep GetLength
./aws-cpp-sdk-apigateway/source/model/GetUsageResult.cpp:69:      Aws::Vector<Aws::Vector<long long>> listOfUsageList((size_t)listOfUsageJsonList.GetLength());
./aws-cpp-sdk-apigateway/source/model/GetUsageResult.cpp:73:        Aws::Vector<long long> listOfLongList((size_t)listOfLongJsonList.GetLength());
./aws-cpp-sdk-apigateway/source/model/TestInvokeAuthorizerResult.cpp:79:      Aws::Vector<Aws::String> listOfStringList((size_t)listOfStringJsonList.GetLength());
./aws-cpp-sdk-apigateway/source/model/UpdateUsageResult.cpp:69:      Aws::Vector<Aws::Vector<long long>> listOfUsageList((size_t)listOfUsageJsonList.GetLength());
./aws-cpp-sdk-apigateway/source/model/UpdateUsageResult.cpp:73:        Aws::Vector<long long> listOfLongList((size_t)listOfLongJsonList.GetLength());
./aws-cpp-sdk-budgets/source/model/Budget.cpp:81:      Aws::Vector<Aws::String> dimensionValuesList((size_t)dimensionValuesJsonList.GetLength());
./aws-cpp-sdk-cloudsearchdomain/source/model/Hit.cpp:62:      Aws::Vector<Aws::String> fieldValueList((size_t)fieldValueJsonList.GetLength());
./aws-cpp-sdk-devicefarm/source/model/ListUniqueProblemsResult.cpp:45:      Aws::Vector<UniqueProblem> uniqueProblemsList((size_t)uniqueProblemsJsonList.GetLength());
./aws-cpp-sdk-dynamodb/source/model/BatchGetItemResult.cpp:45:      Aws::Vector<Aws::Map<Aws::String, AttributeValue>> itemListList((size_t)itemListJsonList.GetLength());
./aws-cpp-sdk-dynamodb/source/model/BatchWriteItemResult.cpp:61:      Aws::Vector<ItemCollectionMetrics> itemCollectionMetricsMultipleList((size_t)itemCollectionMetricsMultipleJsonList.GetLength());
./aws-cpp-sdk-kinesisanalytics/source/model/DiscoverInputSchemaResult.cpp:51:      Aws::Vector<Aws::String> parsedInputRecordList((size_t)parsedInputRecordJsonList.GetLength());
./aws-cpp-sdk-opsworks/source/model/DeploymentCommand.cpp:60:      Aws::Vector<Aws::String> stringsList((size_t)stringsJsonList.GetLength());
./aws-cpp-sdk-servicecatalog/source/model/SearchProductsResult.cpp:54:      Aws::Vector<ProductViewAggregationValue> productViewAggregationValuesList((size_t)productViewAggregationValuesJsonList.GetLength());
./aws-cpp-sdk-ssm/source/model/AssociationDescription.cpp:124:      Aws::Vector<Aws::String> parameterValueListList((size_t)parameterValueListJsonList.GetLength());
./aws-cpp-sdk-ssm/source/model/AutomationExecution.cpp:121:      Aws::Vector<Aws::String> automationParameterValueListList((size_t)automationParameterValueListJsonList.GetLength());
./aws-cpp-sdk-ssm/source/model/AutomationExecution.cpp:137:      Aws::Vector<Aws::String> automationParameterValueListList((size_t)automationParameterValueListJsonList.GetLength());
./aws-cpp-sdk-ssm/source/model/AutomationExecutionMetadata.cpp:123:      Aws::Vector<Aws::String> automationParameterValueListList((size_t)automationParameterValueListJsonList.GetLength());
./aws-cpp-sdk-ssm/source/model/Command.cpp:123:      Aws::Vector<Aws::String> parameterValueListList((size_t)parameterValueListJsonList.GetLength());
./aws-cpp-sdk-ssm/source/model/CreateAssociationBatchRequestEntry.cpp:75:      Aws::Vector<Aws::String> parameterValueListList((size_t)parameterValueListJsonList.GetLength());
./aws-cpp-sdk-ssm/source/model/StepExecution.cpp:121:      Aws::Vector<Aws::String> automationParameterValueListList((size_t)automationParameterValueListJsonList.GetLength());
./aws-cpp-sdk-xray/source/model/TraceSummary.cpp:137:      Aws::Vector<ValueWithServiceIds> valuesWithServiceIdsList((size_t)valuesWithServiceIdsJsonList.GetLength());

@bretambrose
Copy link
Contributor

We'll get the generator fixed today and a release out tomorrow.

@halayli
Copy link
Author

halayli commented Dec 5, 2016

Cool. I submitted a pr that fixes it by calling reserve() explicitly or if you want to fix it by replacing push_back with [] that also works.

@bretambrose
Copy link
Contributor

We applied your change to the code generator and it should be fixed in yesterday's release. Please reopen if we missed something.

@halayli
Copy link
Author

halayli commented Dec 8, 2016

Would you like me to submit a PR for the other ones mentioned as well?

@bretambrose
Copy link
Contributor

No need, all of that code is actually generated, so applying your fix to the codegen template (see lines 64-65 of https://github.com/aws/aws-sdk-cpp/blob/master/code-generation/generator/src/main/resources/com/amazonaws/util/awsclientgenerator/velocity/cpp/json/ModelInternalMapOrListJsonDeserializer.vm) should fix everything.

@halayli
Copy link
Author

halayli commented Dec 8, 2016

Oh got it.

@henriquef-cit
Copy link

Hi,
I'm searching for code sample using AWS XRay cpp, without success.
Does anyone has something to share (despites de poor documentation).

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants