Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(gce): convert GoogleZonalServerGroupCachingAgent to Java #4002

Merged
merged 6 commits into from
Sep 5, 2019

Conversation

plumpy
Copy link
Member

@plumpy plumpy commented Sep 4, 2019

No description provided.

@plumpy plumpy requested a review from ezimanyi September 4, 2019 21:10
Copy link
Contributor

@ezimanyi ezimanyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💻 💻 💻 💰 🕵 ☕️ 🎉 🏅


List<GoogleServerGroup> serverGroups = getServerGroups(providerCache);

// If an entry in ON_DEMAND was generated _after_ we started our caching run, add it to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the behavior here is because clouddriver has an endpoint to check the status of on-demand cache refreshes.

When orca requests a cache refresh, it generally polls waiting for the request has been processed. If a pending request disappears before orca can poll for it, it will assume that it was never received and will schedule it again.

So in general we want to leave these on-demand refreshes around for at least one caching cycle before deleting them so that someone (orca) polling for the status can see that the request has been processed. This is even if the way it was "processed" was actually to ignore it because we have a more recent full cache update.

It feels like there's probably a more robust way of doing this, but that's the current way this works, so deleting a pending update too soon would likely introduce a race condition (or rather more of a race condition than there already is).

I also have a vague recollection that you asked my why we did this at some point and I didn't have a good answer, but now that I see this comment right next to the code I remember.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, yeah, that definitely seems like a crazy way of doing things. I can think of other adjectives too! But thanks for the backstory. I updated the comment; let me know if it appears inaccurate or incomplete:

// We don't evict things unless they've been processed because Orca expects to see these keys
// when calling pendingOnDemandRequests, and then knows the on-demand cache request has
// finished when they disappear. If they go away before Orca has a chance to see them, Orca
// decides the on-demand cache request was ignored completely and sends it again. So we have
// to leave them around a while so Orca can observe them. Is this weird? Yes. Yes, it is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only slight comment is that orca doesn't know they've succeded "when they disappear" but rather when it sees they have a processedCount of >0, but other than that looks great!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, great, thanks! Updated the comment (again).

}

@Override
public OnDemandResult handle(ProviderCache providerCache, Map<String, ?> data) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The super is annotated @Nullable; not sure if it's good practice to also annotate this as such.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point. I think that because @Nullable is not marked @Inherited, you do have to add it here too.

private static final MapSplitter IMAGE_DESCRIPTION_SPLITTER =
Splitter.on(',').withKeyValueSeparator(": ");

private final GoogleNamedAccountCredentials credentials;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how much you want to annotate everything @Nonnull but most of these private final variables are at some point in the class used without null checks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added @ParametersAreNonnullByDefault to the class. (Which then caused IntelliJ to highlight a missing @Nullable annotation.)

.get("cacheResults"),
new TypeReference<Map<String, List<DefaultCacheData>>>() {});
onDemandData.forEach(
(namespace, cacheDatas) -> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cacheDatas always confuses me, but I really can't think of a better name for a collection of things called cacheData

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I briefly had this as cacheDataObjects but decided to go with cacheDatas. 😕

return ImmutableList.of();
}
return distributionPolicy.getZones().stream()
.map(z -> Utils.getLocalName(z.getZone()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe more stream-y as:

      .map(DistributionPolicyZoneConfiguration::getZone)
      .map(Utils::getLocalName)

🤷‍♂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno, I'm generally of the mind that merging consecutive map calls aids readability instead of hinders it? (Obviously up to some limit, where the lambda is still relatively readable.)

IOW: I don't think that's necessary.

launchConfig.put("minCpuPlatform", instanceTemplate.getProperties().getMinCpuPlatform());
setSourceImage(serverGroup, launchConfig, disks, credentials, providerCache);
}
serverGroup.setLaunchConfig(launchConfig);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like one of the few places we set a mutable collection on the serverGroup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, fixed this and the other place (serverGroup.setAsg).

Sadly a GoogleServerGroup is going to be very difficult to make fully immutable because it has several fields that are mutable objects we don't control (from the Google Compute APIs) :/

template.getProperties().getDisks().stream()
.filter(disk -> !disk.getBoot())
.forEach(sortedDisks::add);
return sortedDisks;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be immutable (particularly since the first return does return an immutable list).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, thanks.

if (autoscaler.getStatusDetails() != null) {
serverGroup.setAutoscalingMessages(
autoscaler.getStatusDetails().stream()
.filter(details -> details.getMessage() != null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I might invert the filter and map since we're just filtering on the result of the map

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, looks much nicer too:

          autoscaler.getStatusDetails().stream()
              .map(AutoscalerStatusDetails::getMessage)
              .filter(Objects::nonNull)
              .collect(toImmutableList())

@plumpy plumpy merged commit b6bd26b into spinnaker:master Sep 5, 2019
@plumpy plumpy deleted the gzsgca_java branch September 5, 2019 18:04
plumpy added a commit to plumpy/clouddriver that referenced this pull request Sep 6, 2019
This tests the behavior discussed on the review of spinnaker#4002. I should have
added it there, but I didn't think about it until now.
plumpy added a commit to plumpy/clouddriver that referenced this pull request Sep 6, 2019
This tests the behavior discussed on the review of spinnaker#4002. I should have
added it there, but I didn't think about it until now.
plumpy added a commit that referenced this pull request Sep 9, 2019
…4007)

* refactor(gce): remove an unnecessary checkState()

We can survive this condition just fine.

* test(gce): Add a test about on-demand caching behavior

This tests the behavior discussed on the review of #4002. I should have
added it there, but I didn't think about it until now.

* refactor(gce): fix an import

* refactor(gce): remove `static` from a method

* refactor(gce): use getAccountName() consistently

* docs(gce): remove a TODO

It actually does seem to be true that an autoscaler's name always equals
that of its associated instance group. This _might_ even be a GCE rule,
but it's definitely something we always do in Spinnaker (see
UpsertGoogleAutoscalingPolicyAtomicOperation and
GCEUtil#buildAutoscaler).

* refactor(gce): Prepare zonal caching agent for an abstract superclass

About  90% of the code from GoogleZonalServerGroupCachingAgent was just
copied into GoogleRegionalServerGroupCachingAgent. I'm going to extract
all the common code into an abstract base class.

Pretty much all that will be left after I do that are the methods
marked `// @Override`. Those methods will be the abstract ones in the
superclass.

* refactor(gce): Create an AbstractGoogleServerGroupCachingAgent

The code is just copy-pasted directly from
GoogleZonalServerGroupCachingAgent except that I replaced the methods
marked `// @Override` with abstract methods. No other changes were made.

* test(gce): Add tests for GoogleRegionalServerGroupCachingAgent

* refactor(gce): Convert GoogleRegionalServerGroupCachingAgent to java

* refactor(gce): Convert GoogleRegionalServerGroupCachingAgent to java

* test(gce): Move server property tests into a test of the abstract class

It seems somewhat pointless to have two copies of these. They're
essentially just data transformation tests, not really doing anything
fancy depending on the subclass.

A lot of the other tests could potentially also be merged but I think
that would lead to some pretty weird tests that are less useful, so I'd
rather keep those separate.

* fix(gce): fix a small bug in GoogleRegionalServerGroupCachingAgent

This bug is likely pretty innocuous since callers are looking for a
specific key amongst the pendingOnDemandRequests and returning an extra
one won't hurt anything, but I might as well fix it.

It looks like this bug likely exists in other caching agents, too, but
since it's very long-standing and the impact is negligible, I'm going to
ignore that.

* refactor(gce): convert anonymous test class to inner class

* refactor(gce): clean up some of the test code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants