Skip to content

Conversation

@danieljbruce
Copy link
Contributor

@danieljbruce danieljbruce commented Oct 23, 2024

Summary:

When calling datastore.save an error will be thrown for input arguments that have nested arrays with objects containing name and value properties when the excludeLargeProperties flag is set to true. Such input arguments are valid input arguments for the save function and the right properties should be excluded from indexes so that an error doesn't occur. This PR is also just onto a feature branch so after Next Steps are completed for this feature branch there will be an opportunity to review everything together.

Background:

Previously a fix was applied that would allow calling save to not throw an error when providing large values for nested fields and setting excludeLargeProperties to true. However, unique edge cases remain where save throws an error for requests that should be ok and excludeFromIndexes: true gets applied in the wrong places.

When a user calls save, the following steps are followed and this PR changes the findLargeProperties_ function used in step 4 below:

  1. The user passes entities into the save function
  2. For each entity the steps 3-6 happen:
  3. Some preprocessing is done
  4. If excludeLargeProperties: true is provided then the excludeFromIndexes list is extended using the findLargeProperties_ function. This step uses the findLargeProperties_ function differently depending on whether the entity is an array or a non-array.
  5. An entityProto is built from the entity. This is done differently depending on whether the entity is an array or the entity is a non-array.
  6. A mutation is built from the entityProto
  7. All mutations are sent to the server

Changes:

  • For system-test/datastore.ts an integration test is added that calls save with long name/value properties inside a nested array. Without the source code changes from this PR, this test fails because the save call throws an error. This test ensures that the save call doesn't throw an error. A second test is also added just to be sure that long strings will be excluded from indexes when they are in the top level of the array.
  • For test/gapic-mocks/commit.ts a unit test is added that makes the same save call as the new integration test in system-test/datastore.ts. It ensures that excludeFromIndexes: true is added in all the right places for the mutations proto.
  • For test/entity/findLargeProperties_.ts we write some unit tests for the findLargeProperties_ function because that function is where the source code changes are in this PR. Also, there currently are no unit tests for the findLargeProperties_ function and the function is complex so the unit tests are helpful for documenting what the function actually does.
  • For src/entity.ts some code is removed that pertains to objects with name/value properties in arrays. The removed code adds excludeFromIndexes: true to various locations in the entityObject which isn't necessary anymore on the top level because that is done in the save function. It also isn't something we want to do in levels other than the top level because the name/value encoding for array entity protos is in the top level. This code is also preventing a recursive search for large values in child objects when an object in an array happens to have name and value properties, but we want this recursive search for large values so the code should be removed.

Next Steps:

There are so many different combinations of input for the save function that it can be difficult to have complete test coverage. I suggest we break the save function down into smaller parts?

Right now two of the steps in the save function are:
4. If excludeLargeProperties: true is provided then the excludeFromIndexes list is extended using the findLargeProperties_ function. This step uses the findLargeProperties_ function differently depending on whether entities are arrays or non-arrays.
5. An entityProto is built from an entity. This is done differently depending on whether the entity is an array or the entity is a non-array.

We should do the following on exclude-from-indexes-cleanup branch:

  1. Extract step 4 out into a function so that it can be unit tested.
  2. Extract step 5 out into a function so that it can be unit tested.
  3. Ensure that every input to step 4 and then step 5 results in an entity proto where excludeFromIndexes: true is only in all places where a long string is present.

@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: datastore Issues related to the googleapis/nodejs-datastore API. labels Oct 23, 2024
@danieljbruce danieljbruce changed the base branch from main to exclude-from-indexes-cleanup October 23, 2024 16:00
) {
entry.excludeFromIndexes = true;
} else {
continue;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want this continue because then the findLargeProperties_ method call below won't happen and we won't search nested properties of entry for large properties and add them to the list of properties to exclude.

const MAX_DATASTORE_VALUE_LENGTH = 1500;
if (Array.isArray(entities)) {
for (const entry of entities) {
if (entry && entry.name && entry.value) {
Copy link
Contributor Author

@danieljbruce danieljbruce Oct 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original intent of this if statement was to deal with arrays passed into the save function. When arrays are passed into the save function, they are normally expected to have name and value properties defined at the top level and then their proto is encoded using the name as the key so this code adds excludeFromIndexes:true in the right places for this special encoding.

The problem with this if statement is that this function is recursive so this logic will be applied recursively and excludeFromIndexes: true will be added recursively, but we only want this logic to be applied at the top level for arrays because the special encoding is only done for arrays at the top level. As it turns out excludeFromIndexes is applied in the right places of the top level here instead so this if block is no longer required and a new test should save a nested name/value pair with value as a long string proves this.

@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Oct 23, 2024
is.string(entry.value) &&
Buffer.from(entry.value).length > MAX_DATASTORE_VALUE_LENGTH
) {
entry.excludeFromIndexes = true;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this assignment in will apply excludeFromIndexes: true in more places than is necessary. As mentioned in the comment above, we already apply excludeFromIndexes in the right places elsewhere in the top level for arrays and in lower levels this code will actually apply excludeFromIndexes: true to the wrong places so it should be removed.

await datastore.delete(postKey);
});

it('should save a nested name/value pair with name as a long string', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test contains the code that will throw the error without the source code changes from this PR.

properties: {
name: {
stringValue: longString,
excludeFromIndexes: true,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source code changes from this PR make sure that excludeFromIndexes: true will be added here.

excludeLargeProperties: true,
});
});
it('should save a nested name/value pair with value as a long string', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test shows that even though we are omitting the code that appends entry.excludeFromIndexes = true; to the entity that entry.excludeFromIndexes = true; is still applied in the right places for entities that are arrays.

@danieljbruce danieljbruce marked this pull request as ready for review October 29, 2024 17:12
Copy link

@daniel-sanche daniel-sanche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@danieljbruce danieljbruce merged commit 033e1db into exclude-from-indexes-cleanup Oct 31, 2024
12 checks passed
@danieljbruce danieljbruce deleted the 1242-write-tests-for-findLargeProperties-revisit branch October 31, 2024 21:18
danieljbruce added a commit that referenced this pull request Nov 6, 2024
#1356)

* fix: The save function should not throw an error for name/value properties in arrays (#1336)

* Add some test cases for findLargeProperties_

* Add a test for arrays and skip it.

* Add a test for the complex case wrapped in an arra

* Add a test case for an array without name/value.

* Add another test for name/value pair

* Change the test case involving a name/value pair

* Add two simple test cases for excludeFromIndexes

* Add another unit test for findLargeProperties_

* Remove if block that stops recursion

* Add a test for save with a `name/value` pair

* Add some test code that ensures name/value encoded

Against the current source code, this test fails because the code throws an error, but it should pass.

* Add header

* Remove only

* Add a test that saves a long string value property

* Add a unit test checking longString for value

* Remove only

* TODO is done

* refactor: Add separate function for adding large properties to the excludeFromIndexes list (#1349)

* Add some test cases for findLargeProperties_

* Add a test for arrays and skip it.

* Add a test for the complex case wrapped in an arra

* Add a test case for an array without name/value.

* Add another test for name/value pair

* Change the test case involving a name/value pair

* Add two simple test cases for excludeFromIndexes

* Add another unit test for findLargeProperties_

* Remove if block that stops recursion

* Add a test for save with a `name/value` pair

* Add some test code that ensures name/value encoded

Against the current source code, this test fails because the code throws an error, but it should pass.

* Add header

* Remove only

* Add a test that saves a long string value property

* Add a unit test checking longString for value

* Remove only

* TODO is done

* Refactor the save function

Pull out the code that extends the excludeFromIndexes list

* Add TODO

* Write unit tests for the new excludeFromIndexes fn

* Add the header

* Document parameter for extendExcludeFromIndexes

* Remove redundant check

* refactor: Add a separate function for building the entity proto (#1350)

* Add some test cases for findLargeProperties_

* Add a test for arrays and skip it.

* Add a test for the complex case wrapped in an arra

* Add a test case for an array without name/value.

* Add another test for name/value pair

* Change the test case involving a name/value pair

* Add two simple test cases for excludeFromIndexes

* Add another unit test for findLargeProperties_

* Remove if block that stops recursion

* Add a test for save with a `name/value` pair

* Add some test code that ensures name/value encoded

Against the current source code, this test fails because the code throws an error, but it should pass.

* Add header

* Remove only

* Add a test that saves a long string value property

* Add a unit test checking longString for value

* Remove only

* TODO is done

* Refactor the save function

Pull out the code that extends the excludeFromIndexes list

* Add TODO

* Write unit tests for the new excludeFromIndexes fn

* Add the header

* Refactor the code for building the entity proto

* Rename it buildEntityProto

* The tests need another mock due to the structure

The structure change requires another mock.

* First two tests for buildEntityProto

* Reintroduce extendExcludeFromIndexes.ts tests

* Adjust the test to match expected proto

* Add another test for an entity in an array

* Add headers

* Add another header

* Eliminate space. Simplify diff

* test: Ensure that using extendExcludeFromIndexes and then buildEntityProto always excludes large properties from indexing (#1355)

* Add some test cases for findLargeProperties_

* Add a test for arrays and skip it.

* Add a test for the complex case wrapped in an arra

* Add a test case for an array without name/value.

* Add another test for name/value pair

* Change the test case involving a name/value pair

* Add two simple test cases for excludeFromIndexes

* Add another unit test for findLargeProperties_

* Remove if block that stops recursion

* Add a test for save with a `name/value` pair

* Add some test code that ensures name/value encoded

Against the current source code, this test fails because the code throws an error, but it should pass.

* Add header

* Remove only

* Add a test that saves a long string value property

* Add a unit test checking longString for value

* Remove only

* TODO is done

* Refactor the save function

Pull out the code that extends the excludeFromIndexes list

* Add TODO

* Write unit tests for the new excludeFromIndexes fn

* Add the header

* Refactor the code for building the entity proto

* Rename it buildEntityProto

* The tests need another mock due to the structure

The structure change requires another mock.

* First two tests for buildEntityProto

* Reintroduce extendExcludeFromIndexes.ts tests

* Adjust the test to match expected proto

* Add another test for an entity in an array

* Add headers

* Add another header

* Add some tests for checkEntityProto

Make sure that checkEntityProto is working correctly.

* Empty commit

* Pull out complexCaseEntities

* Add a few more tests to excludeIndexesAndBuildProto

* Add generated test cases ensuring completeness

* Add a few comments that explain behaviour of block

* Add comments explaining code blocks

* Add a bunch of other generated tests

* Make this test more meaningful for name/value chek

* Fix the generator

* Remove console logs

* Add better test name descriptions

* Add TODO

* Add a comment about the array of arrays test case

* Increase the max depth

* Add headers to the codebase

* Update parameter descriptions

* Add a bunch of parameter documentation

Correct only as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: datastore Issues related to the googleapis/nodejs-datastore API. size: l Pull request size is large.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants