Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: fix object metadata and bucket label change tracking to be fine-grained #1830

Closed
wants to merge 1 commit into from

Conversation

BenWhitehead
Copy link
Collaborator

@BenWhitehead BenWhitehead commented Dec 28, 2022

Context

Historically, when calling storage#update(BlobInfo) or storage#update(BucketInfo) the whole info would be packed and sent to the backend as a PATCH request. The backend would then perform some server side diffing against what it already had and make modifications. Of note here, for Object#metadata and Bucket#labels the PATCH behavior meant that any value provided in the request was always treated as an append.

Code sample

Specifically, the following call sequence will actually append key2: value2 to the object metadata despite nothing in the method names communicating this.

try (Storage storage = StorageOptions.http().build().getService()) {
  BlobInfo info = BlobInfo.newBuilder("bucket", "object")
      .setMetadata(ImmutableMap.of("key1", "value1"))
      .build();
  Blob gen1 = storage.create(info);

  BlobInfo modified = gen1.toBuilder().setMetadata(ImmutableMap.of("key2", "value2")).build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());

  // gen2.metadata = {"key1": "value1", "key2": "value2"}
}

Additionally, this now leaves the internal state of modified such that if you were to call modified.getMetadata() it would only contain the new key2: value2. However, after calling storage.update gen2.getMetadata() both keys are present.

This is confusing, and can be remedied now that we have the capability of tracking field level modification.

Another confusing thing about this: if you want to clear the metadata on an object, the intuitive .setMetadata(Collections.emptyMap()) does not work. Instead, you would need to do the following:

try (Storage storage = StorageOptions.http().build().getService()) {
  Blob gen1 = storage.get("bucket", "object");

  // Can't use ImmutableMap here, as we need null values
  HashMap<String, String> clearedKeys = new HashMap<>();
  for (String key : gen1.getMetadata().keySet()) {
    clearedKeys.put(key, null);
  }

  BlobInfo modified = gen1.toBuilder().setMetadata(clearedKeys).build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());
}

The Change

Update the behavior of BlobInfo#setMetadata and BucketInfo#setLabels to always explicitly set to the specified value, whether adding, removing or updating a key.

Add new methods BlobInfo#addAllMetadata(Map) and BucketInfo#addAllLabels(Map) to allow performing an upsert without the need to first query and reconcile.

Clearing all object metadata
try (Storage storage = StorageOptions.http().build().getService()) {
  Blob gen1 = storage.get("bucket", "object");
  BlobInfo modified = gen1.toBuilder().setMetadata(Collections.emptyMap()).build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());
}
Adding/modifying object metadata
try (Storage storage = StorageOptions.http().build().getService()) {
  Blob gen1 = storage.get("bucket", "object");
  BlobInfo modified = gen1.toBuilder().addAllMetadata(ImmutableMap.of("key1", "value2", "key3", "value3").build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());
}
Removing an individual metadata key
try (Storage storage = StorageOptions.http().build().getService()) {
  Blob gen1 = storage.get("bucket", "object");
  HashMap<String, String> map = new HashMap<>();
  map.put("key1", null);
  BlobInfo modified = gen1.toBuilder().addAllMetadata(map).build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());
}

Implementation Details

  1. Field modification now tracks nested metadata/label keys
    • New UnifiedOpts.NestedNamedField has been added to track nested keys
    • UpdateMaskTest has been updated to expect nested field names
  2. StorageImpl#update(BlobInfo) and StorageImpl#update(BucketInfo) have both been updated to take field modification into consideration.
  3. New test class ITNestedUpdateMaskTest has been added to validate many permutations of modifying metadata/labels
  4. Each of the metadata/label related methods have had @NonNull/@Nullable annotations to their signatures to communicate acceptable arguments, and to set expectations.
  5. All Data.isNull/Data.nullOf has been moved away from Blob metadata and Bucket labels into ApiaryConversions as they only apply to Apiary usage not gRPC/Protobuf.
  6. Several easymock test in StorageImplTest have been wholly deleted as they weren't testing behavior, only wiring mocks to dance to the current implementation.
  7. Update BucketInfoTest deep equals assertions to use TestUtils.assertAll

Samples

Update samples to use new addAll* methods for their upsert use cases rather than set*.

…grained

### Context
Historically, when calling storage#update(BlobInfo) or storage#update(BucketInfo)
the whole info would be packed and sent to the backend as a PATCH request. The backend
would then perform some server side diffing against what it already had and make
modifications. Of note here, for Object#metadata and Bucket#labels the PATCH behavior
meant that any value provided in the request was always treated as an append.

#### Code sample
Specifically, the following call sequence will actually append `key2: value2` to
the object metadata despite nothing in the method names communicating this.

```java
try (Storage storage = StorageOptions.http().build().getService()) {
  BlobInfo info = BlobInfo.newBuilder("bucket", "object")
      .setMetadata(ImmutableMap.of("key1", "value1"))
      .build();
  Blob gen1 = storage.create(info);

  BlobInfo modified = gen1.toBuilder().setMetadata(ImmutableMap.of("key2", "value2")).build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());

  // gen2.metadata = {"key1": "value1", "key2": "value2"}
}
```

Additionally, this now leaves the internal state of `modified` such that if you were
to call `modified.getMetadata()` it would only contain the new `key2: value2`. However,
after calling `storage.update` `gen2.getMetadata()` both keys are present.

This is confusing, and can be remedied now that we have the capability of tracking field
level modification.

Another confusing thing about this: if you want to clear the metadata on an object, the
intuitive `.setMetadata(Collections.emptyMap()` does not work. Instead, you would need
to do the following:

```java
try (Storage storage = StorageOptions.http().build().getService()) {
  Blob gen1 = storage.get("bucket", "object");

  // Can't use ImmutableMap here, as we need null values
  HashMap<String, String> clearedKeys = new HashMap<>();
  for (String key : gen1.getMetadata().keySet()) {
    clearedKeys.put(key, null);
  }

  BlobInfo modified = gen1.toBuilder().setMetadata(clearedKeys).build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());
}
```

### The Change
Update the behavior of `BlobInfo#setMetadata` and `BucketInfo#setLabels` to always
explicitly set to the specified value, whether adding, removing or updating a key.

Add new methods `BlobInfo#addAllMetadata(Map)` and `BucketInfo#addAllLabels(Map)`
to allow performing an upsert without the need to first query and reconcile.

##### Clearing all object metadata
```java
try (Storage storage = StorageOptions.http().build().getService()) {
  Blob gen1 = storage.get("bucket", "object");
  BlobInfo modified = gen1.toBuilder().setMetadata(Collections.emptyMap()).build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());
}
```

##### Adding/modifying object metadata
```java
try (Storage storage = StorageOptions.http().build().getService()) {
  Blob gen1 = storage.get("bucket", "object");
  BlobInfo modified = gen1.toBuilder().addAllMetadata(ImmutableMap.of("key1", "value2", "key3", "value3").build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());
}
```

##### Removing an individual metadata key
```java
try (Storage storage = StorageOptions.http().build().getService()) {
  Blob gen1 = storage.get("bucket", "object");
  HashMap<String, String> map = new HashMap<>();
  map.put("key1", null);
  BlobInfo modified = gen1.toBuilder().addAllMetadata(map).build();
  Blob gen2 = storage.update(modified, BlobTargetOption.metagenerationMatch());
}
```

### Implementation Details

1. Field modification now tracks nested metadata/label keys
   * New UnifiedOpts.NestedNamedField has been added to track nested keys
   * UpdateMaskTest has been updated to expect nested field names
2. `StorageImpl#update(BlobInfo)` and `StorageImpl#update(BucketInfo)` have both
   been updated to take field modification into consideration.
3. New test class `ITNestedUpdateMaskTest` has been added to validate many
   permutations of modifying metadata/labels
4. Each of the metadata/label related methods have had `@NonNull`/`@Nullable`
   annotations to their signatures to communicate acceptable arguments, and to
   set expectations.
5. All `Data.isNull`/`Data.nullOf` has been moved away from Blob metadata and
   Bucket labels into `ApiaryConversions` as they only apply to Apiary usage
   not gRPC/Protobuf.
6. Several easymock test in StorageImplTest have been wholly deleted as they weren't
   testing behavior, only locking implementation.
7. Update BucketInfoTest deep equals assertions to use `TestUtils.assertAll`

### Samples

Update samples to use new `addAll*` methods for their upsert use cases rather than `set*`.
@BenWhitehead BenWhitehead added the owlbot:ignore instruct owl-bot to ignore a PR label Dec 28, 2022
@BenWhitehead BenWhitehead requested review from a team as code owners December 28, 2022 23:42
@product-auto-label product-auto-label bot added size: l Pull request size is large. api: storage Issues related to the googleapis/java-storage API. labels Dec 28, 2022
@@ -360,7 +362,14 @@ Builder setMediaLink(String mediaLink) {
}

@Override
public Builder setMetadata(Map<String, String> metadata) {
@BetaApi
public Builder addAllMetadata(@Nonnull Map<@NonNull String, @Nullable String> newMetadata) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not tied to this name, would upsertMetadataWith be better?

* @see #setLabels(Map)
*/
@BetaApi
public abstract Builder addAllLabels(@NonNull Map<@NonNull String, @Nullable String> newLabels);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not tied to this name, would upsertLabelsWith be better?

}

private static final Map<String, String> BUCKET_LABELS_TARGET =
ImmutableMap.of("label1", "value1", "label2", "");
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was forcing Data.nullOf into the BucketInfo internals leaking apiary specifics. Those apiary specific have been pushed down to ApiaryConversions leaving BucketInfo able to do java standard things.

assertTrue(LOCATION_TYPES.contains(BUCKET_INFO.getLocationType()));
assertEquals(LOGGING, BUCKET_INFO.getLogging());
public void testBuilder() throws Exception {
assertAll(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure all assertions run

assertEquals(expected.getLogging(), value.getLogging());
assertEquals(expected, value);
private void compareBuckets(BucketInfo expected, BucketInfo value) throws Exception {
assertAll(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure all assertions run

transports = {Transport.HTTP, Transport.GRPC})
@Parameterized(NestedUpdateMaskParametersProvider.class)
@ParallelFriendly
public final class ITNestedUpdateMaskTest {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before adding this test class, only 1 case was being explicitly tested in the existing suite - Object add new key. Now we're testing permutations of add/update/remove for both objects and buckets.

@frankyn
Copy link
Member

frankyn commented Jan 4, 2023

Hey @BenWhitehead thanks for the summary.

Changing the behavior of setMetadata is a behavior breaking change, this can impact anyone that upgrades based on minor and patch versions. I agree that this is confusing and could be seen as a bug but it has been in the library for years.

Are you planning on lumping this into the next major version release and just didn't call it out in the PR?

@@ -44,7 +44,9 @@
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.TimeUnit;
import javax.annotation.Nonnull;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional use of both javax.annotation.Nonnull and org.checkerframework.checker.nullness.qual.NonNull?

@BenWhitehead
Copy link
Collaborator Author

Closing in favor of maintaining the existing setMetadata behavior.

@BenWhitehead BenWhitehead deleted the fix/update/diff-tracking branch January 11, 2023 19:18
BenWhitehead added a commit that referenced this pull request Jan 11, 2023
…BucketInfo.labels and BlobInfo.metadata

Successor to #1830, whereas this preserves existing behavior rather than introducing a new method.

A new set of tests have been added to validate all permutations of modifying metadata/labels.
BenWhitehead added a commit that referenced this pull request Jan 12, 2023
…BucketInfo.labels and BlobInfo.metadata (#1843)

Successor to #1830, whereas this preserves existing behavior rather than introducing a new method.

A new set of tests have been added to validate all permutations of modifying metadata/labels.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the googleapis/java-storage API. owlbot:ignore instruct owl-bot to ignore a PR size: l Pull request size is large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants