refactor(gce): convert GoogleZonalServerGroupCachingAgent to Java #4002

plumpy · 2019-09-04T21:10:50Z

No description provided.

ezimanyi

💻 💻 💻 💰 🕵 ☕️ 🎉 🏅

ezimanyi · 2019-09-05T00:43:06Z

.../netflix/spinnaker/clouddriver/google/provider/agent/GoogleZonalServerGroupCachingAgent.java

+
+      List<GoogleServerGroup> serverGroups = getServerGroups(providerCache);
+
+      // If an entry in ON_DEMAND was generated _after_ we started our caching run, add it to the


I believe the behavior here is because clouddriver has an endpoint to check the status of on-demand cache refreshes.

When orca requests a cache refresh, it generally polls until the request has been processed. If a pending request disappears before orca can poll for it, it will assume that it was never received and will schedule it again.

So in general we want to leave these on-demand refreshes around for at least one caching cycle before deleting them so that someone (orca) polling for the status can see that the request has been processed. This is even if the way it was "processed" was actually to ignore it because we have a more recent full cache update.

It feels like there's probably a more robust way of doing this, but that's the current way this works, so deleting a pending update too soon would likely introduce a race condition (or rather more of a race condition than there already is).

I also have a vague recollection that you asked me why we did this at some point and I didn't have a good answer, but now that I see this comment right next to the code I remember.

Huh, yeah, that definitely seems like a crazy way of doing things. I can think of other adjectives too! But thanks for the backstory. I updated the comment; let me know if it appears inaccurate or incomplete:

// We don't evict things unless they've been processed because Orca expects to see these keys
// when calling pendingOnDemandRequests, and then knows the on-demand cache request has
// finished when they disappear. If they go away before Orca has a chance to see them, Orca
// decides the on-demand cache request was ignored completely and sends it again. So we have
// to leave them around a while so Orca can observe them. Is this weird? Yes. Yes, it is.

My only slight comment is that orca doesn't know they've succeeded "when they disappear" but rather when it sees they have a processedCount of >0. Other than that, looks great!

Ah, great, thanks! Updated the comment (again).
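For readers following the thread, the eviction rule described above can be sketched roughly as follows. This is a minimal illustration, not clouddriver's actual code: `OnDemandEntry`, `OnDemandEvictionSketch`, `processPending`, and the `cacheTime` field are hypothetical names, and only `processedCount` comes from the discussion. The assumption is that an entry written before the current caching run is marked as processed the first time a run sees it, and is evicted only on a later run, so that a poller such as Orca has a chance to observe a `processedCount` greater than zero.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an ON_DEMAND cache entry; not a clouddriver type. */
final class OnDemandEntry {
  final String key;
  final long cacheTime; // when the on-demand request was written (epoch millis)
  int processedCount; // how many caching runs have already handled this entry

  OnDemandEntry(String key, long cacheTime, int processedCount) {
    this.key = key;
    this.cacheTime = cacheTime;
    this.processedCount = processedCount;
  }
}

final class OnDemandEvictionSketch {
  /**
   * Returns the keys that are safe to evict. Entries older than this run that have not been
   * seen before are marked as processed and kept for one more cycle.
   */
  static List<String> processPending(List<OnDemandEntry> pending, long cachingRunStart) {
    List<String> keysToEvict = new ArrayList<>();
    for (OnDemandEntry entry : pending) {
      if (entry.cacheTime >= cachingRunStart) {
        // Written after this run started: newer than our data, so leave it untouched.
        continue;
      }
      if (entry.processedCount > 0) {
        // A previous run already marked it processed, so pollers had a cycle to observe it.
        keysToEvict.add(entry.key);
      } else {
        // First time we see it: mark it processed but keep it around for pollers.
        entry.processedCount++;
      }
    }
    return keysToEvict;
  }
}
```

With a run that started at time 100, an entry written at 200 is left alone, an unprocessed entry written at 50 gets its count bumped to 1, and an entry written at 50 whose count is already 1 is returned for eviction.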

ezimanyi · 2019-09-05T00:47:37Z

.../netflix/spinnaker/clouddriver/google/provider/agent/GoogleZonalServerGroupCachingAgent.java

+  }
+
+  @Override
+  public OnDemandResult handle(ProviderCache providerCache, Map<String, ?> data) {


The super is annotated @Nullable; not sure if it's good practice to also annotate this as such.

Yeah, good point. I think that because @Nullable is not marked @Inherited, you do have to add it here too.
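A small reflection check illustrates why the override needs its own annotation. This is a sketch under stated assumptions: `MaybeNull` is a hypothetical local stand-in for `@Nullable` (given runtime retention so reflection can see it), and `BaseAgent`, `BareOverride`, and `AnnotatedOverride` are made-up classes. As far as the `java.lang.annotation.Inherited` Javadoc goes, `@Inherited` only affects annotations placed on classes, so an annotation on a method is not carried over to an overriding method either way.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical stand-in for @Nullable, retained at runtime so reflection can inspect it. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface MaybeNull {}

abstract class BaseAgent {
  @MaybeNull
  abstract String handle(String data);
}

/** Overrides handle() without repeating the annotation. */
class BareOverride extends BaseAgent {
  @Override
  String handle(String data) {
    return null;
  }
}

/** Overrides handle() and repeats the annotation. */
class AnnotatedOverride extends BaseAgent {
  @Override
  @MaybeNull
  String handle(String data) {
    return null;
  }
}

final class NullableInheritanceSketch {
  /** Reports whether the handle(String) method declared on the given type carries @MaybeNull. */
  static boolean isMarked(Class<?> type) {
    try {
      return type.getDeclaredMethod("handle", String.class).isAnnotationPresent(MaybeNull.class);
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Here `isMarked(BaseAgent.class)` and `isMarked(AnnotatedOverride.class)` report the annotation, while `isMarked(BareOverride.class)` does not, which matches the conclusion that the override has to be annotated explicitly.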

ezimanyi · 2019-09-05T00:49:00Z

.../netflix/spinnaker/clouddriver/google/provider/agent/GoogleZonalServerGroupCachingAgent.java

+  private static final MapSplitter IMAGE_DESCRIPTION_SPLITTER =
+      Splitter.on(',').withKeyValueSeparator(": ");
+
+  private final GoogleNamedAccountCredentials credentials;


Not sure how much you want to annotate everything @Nonnull but most of these private final variables are at some point in the class used without null checks.

Added @ParametersAreNonnullByDefault to the class. (Which then caused IntelliJ to highlight a missing @Nullable annotation.)

ezimanyi · 2019-09-05T00:55:22Z

.../netflix/spinnaker/clouddriver/google/provider/agent/GoogleZonalServerGroupCachingAgent.java

+                    .get("cacheResults"),
+            new TypeReference<Map<String, List<DefaultCacheData>>>() {});
+    onDemandData.forEach(
+        (namespace, cacheDatas) -> {


cacheDatas always confuses me, but I really can't think of a better name for a collection of things called cacheData

I think I briefly had this as cacheDataObjects but decided to go with cacheDatas. 😕

ezimanyi · 2019-09-05T01:10:33Z

.../netflix/spinnaker/clouddriver/google/provider/agent/GoogleZonalServerGroupCachingAgent.java

+      return ImmutableList.of();
+    }
+    return distributionPolicy.getZones().stream()
+        .map(z -> Utils.getLocalName(z.getZone()))


Maybe more stream-y as:

.map(DistributionPolicyZoneConfiguration::getZone)
.map(Utils::getLocalName)

🤷‍♂

I dunno, I'm generally of the mind that merging consecutive map calls aids readability rather than hindering it? (Obviously up to some limit, where the lambda is still relatively readable.)

IOW: I don't think that's necessary.
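The two styles under discussion produce the same result, which a short comparison makes concrete. This is only a sketch: `ZoneConfig` is a hypothetical stand-in for `DistributionPolicyZoneConfiguration`, and `getLocalName` here is assumed to return the text after the last `/`, which may not match the real `Utils.getLocalName` exactly.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical stand-in for DistributionPolicyZoneConfiguration. */
final class ZoneConfig {
  private final String zone;

  ZoneConfig(String zone) {
    this.zone = zone;
  }

  String getZone() {
    return zone;
  }
}

final class ZoneNameSketch {
  /** Assumed behavior of Utils.getLocalName: everything after the last '/'. */
  static String getLocalName(String url) {
    return url.substring(url.lastIndexOf('/') + 1);
  }

  /** One map call with a small lambda, as in the diff. */
  static List<String> merged(List<ZoneConfig> zones) {
    return zones.stream()
        .map(z -> getLocalName(z.getZone()))
        .collect(Collectors.toList());
  }

  /** Two chained map calls with method references, as suggested in the review. */
  static List<String> chained(List<ZoneConfig> zones) {
    return zones.stream()
        .map(ZoneConfig::getZone)
        .map(ZoneNameSketch::getLocalName)
        .collect(Collectors.toList());
  }
}
```

Both methods return the same list for the same input, so the choice is purely about readability.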

ezimanyi · 2019-09-05T01:12:10Z

.../netflix/spinnaker/clouddriver/google/provider/agent/GoogleZonalServerGroupCachingAgent.java

+      launchConfig.put("minCpuPlatform", instanceTemplate.getProperties().getMinCpuPlatform());
+      setSourceImage(serverGroup, launchConfig, disks, credentials, providerCache);
+    }
+    serverGroup.setLaunchConfig(launchConfig);


This looks like one of the few places we set a mutable collection on the serverGroup.

Okay, fixed this and the other place (serverGroup.setAsg).

Sadly a GoogleServerGroup is going to be very difficult to make fully immutable because it has several fields that are mutable objects we don't control (from the Google Compute APIs) :/

ezimanyi · 2019-09-05T01:15:26Z

.../netflix/spinnaker/clouddriver/google/provider/agent/GoogleZonalServerGroupCachingAgent.java

+    template.getProperties().getDisks().stream()
+        .filter(disk -> !disk.getBoot())
+        .forEach(sortedDisks::add);
+    return sortedDisks;


Could be immutable (particularly since the first return does return an immutable list).

Absolutely, thanks.
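One way to make both return paths immutable is sketched below. It uses the JDK's `Collections.unmodifiableList` in place of Guava's `ImmutableList` to stay self-contained; `DiskSketch`, `SortedDisksSketch`, and `bootDisksFirst` are hypothetical names, and both the early return for an empty input and the boot-disks-first ordering are assumptions about the surrounding method, which the hunk does not show in full.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for an attached disk from the Compute API. */
final class DiskSketch {
  final String name;
  final boolean boot;

  DiskSketch(String name, boolean boot) {
    this.name = name;
    this.boot = boot;
  }
}

final class SortedDisksSketch {
  /** Returns boot disks first, then the rest, as an unmodifiable list on every path. */
  static List<DiskSketch> bootDisksFirst(List<DiskSketch> disks) {
    if (disks.isEmpty()) {
      // Assumed early return; already immutable.
      return Collections.emptyList();
    }
    List<DiskSketch> sortedDisks = new ArrayList<>();
    disks.stream().filter(disk -> disk.boot).forEach(sortedDisks::add);
    disks.stream().filter(disk -> !disk.boot).forEach(sortedDisks::add);
    // Wrap before returning so callers cannot mutate the result.
    return Collections.unmodifiableList(sortedDisks);
  }
}
```

Callers that try to add to the returned list get an `UnsupportedOperationException` regardless of which branch produced it.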

ezimanyi · 2019-09-05T01:16:29Z

.../netflix/spinnaker/clouddriver/google/provider/agent/GoogleZonalServerGroupCachingAgent.java

+    if (autoscaler.getStatusDetails() != null) {
+      serverGroup.setAutoscalingMessages(
+          autoscaler.getStatusDetails().stream()
+              .filter(details -> details.getMessage() != null)


nit: I might invert the filter and map since we're just filtering on the result of the map

Good call, looks much nicer too:

autoscaler.getStatusDetails().stream()
    .map(AutoscalerStatusDetails::getMessage)
    .filter(Objects::nonNull)
    .collect(toImmutableList())
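A self-contained version of the map-then-filter pattern might look like the following. It is a sketch only: `StatusDetail` is a hypothetical stand-in for `AutoscalerStatusDetails`, and the JDK's `collectingAndThen` with `Collections::unmodifiableList` is swapped in for Guava's `toImmutableList()` so the example needs no extra dependencies.

```java
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical stand-in for AutoscalerStatusDetails. */
final class StatusDetail {
  private final String message;

  StatusDetail(String message) {
    this.message = message;
  }

  String getMessage() {
    return message;
  }
}

final class AutoscalingMessagesSketch {
  /** Extracts the message from each detail first, then drops the null messages. */
  static List<String> messages(List<StatusDetail> details) {
    return details.stream()
        .map(StatusDetail::getMessage)
        .filter(Objects::nonNull)
        .collect(
            Collectors.collectingAndThen(Collectors.toList(), Collections::unmodifiableList));
  }
}
```

Mapping before filtering lets both steps use method references, since the filter only ever looked at the mapped value.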

This tests the behavior discussed on the review of spinnaker#4002. I should have added it there, but I didn't think about it until now.

…4007)

* refactor(gce): remove an unnecessary checkState()

  We can survive this condition just fine.

* test(gce): Add a test about on-demand caching behavior

  This tests the behavior discussed on the review of #4002. I should have added it there, but I didn't think about it until now.

* refactor(gce): fix an import

* refactor(gce): remove `static` from a method

* refactor(gce): use getAccountName() consistently

* docs(gce): remove a TODO

  It actually does seem to be true that an autoscaler's name always equals that of its associated instance group. This _might_ even be a GCE rule, but it's definitely something we always do in Spinnaker (see UpsertGoogleAutoscalingPolicyAtomicOperation and GCEUtil#buildAutoscaler).

* refactor(gce): Prepare zonal caching agent for an abstract superclass

  About 90% of the code from GoogleZonalServerGroupCachingAgent was just copied into GoogleRegionalServerGroupCachingAgent. I'm going to extract all the common code into an abstract base class. Pretty much all that will be left after I do that are the methods marked `// @Override`. Those methods will be the abstract ones in the superclass.

* refactor(gce): Create an AbstractGoogleServerGroupCachingAgent

  The code is just copy-pasted directly from GoogleZonalServerGroupCachingAgent except that I replaced the methods marked `// @Override` with abstract methods. No other changes were made.

* test(gce): Add tests for GoogleRegionalServerGroupCachingAgent

* refactor(gce): Convert GoogleRegionalServerGroupCachingAgent to java

* refactor(gce): Convert GoogleRegionalServerGroupCachingAgent to java

* test(gce): Move server property tests into a test of the abstract class

  It seems somewhat pointless to have two copies of these. They're essentially just data transformation tests, not really doing anything fancy depending on the subclass. A lot of the other tests could potentially also be merged but I think that would lead to some pretty weird tests that are less useful, so I'd rather keep those separate.

* fix(gce): fix a small bug in GoogleRegionalServerGroupCachingAgent

  This bug is likely pretty innocuous since callers are looking for a specific key amongst the pendingOnDemandRequests and returning an extra one won't hurt anything, but I might as well fix it. It looks like this bug likely exists in other caching agents, too, but since it's very long-standing and the impact is negligible, I'm going to ignore that.

* refactor(gce): convert anonymous test class to inner class

* refactor(gce): clean up some of the test code

plumpy added 2 commits September 4, 2019 17:03

refactor(naming): add type boundaries on NamerRegistry

3ed3efb

refactor(gce): convert GoogleZonalServerGroupCachingAgent to Java

ef3ef6c

plumpy requested a review from ezimanyi September 4, 2019 21:10

ezimanyi approved these changes Sep 5, 2019


plumpy added 4 commits September 5, 2019 13:13

refactor(gce): address review comments

686ffb5

refactor(gce): remove static from a few methods so we can use fields

0f2991e

refactor(gce): Add @ParametersAreNonnullByDefault

ace7423

Merge branch 'master' into gzsgca_java

dacf87c

plumpy merged commit b6bd26b into spinnaker:master Sep 5, 2019

plumpy deleted the gzsgca_java branch September 5, 2019 18:04

spinnakerbot added the target-release/1.17 label Sep 5, 2019

plumpy added a commit to plumpy/clouddriver that referenced this pull request Sep 6, 2019

test(gce): Add a test about on-demand caching behavior

ce37d1c

This tests the behavior discussed on the review of spinnaker#4002. I should have added it there, but I didn't think about it until now.

plumpy added a commit to plumpy/clouddriver that referenced this pull request Sep 6, 2019

test(gce): Add a test about on-demand caching behavior

18be760

This tests the behavior discussed on the review of spinnaker#4002. I should have added it there, but I didn't think about it until now.
