-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(gce): convert GoogleZonalServerGroupCachingAgent to Java #4002
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💻 💻 💻 💰 🕵 ☕️ 🎉 🏅
|
||
List<GoogleServerGroup> serverGroups = getServerGroups(providerCache); | ||
|
||
// If an entry in ON_DEMAND was generated _after_ we started our caching run, add it to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the behavior here is because clouddriver has an endpoint to check the status of on-demand cache refreshes.
When orca
requests a cache refresh, it generally polls waiting for the request has been processed. If a pending request disappears before orca can poll for it, it will assume that it was never received and will schedule it again.
So in general we want to leave these on-demand refreshes around for at least one caching cycle before deleting them so that someone (orca) polling for the status can see that the request has been processed. This is even if the way it was "processed" was actually to ignore it because we have a more recent full cache update.
It feels like there's probably a more robust way of doing this, but that's the current way this works, so deleting a pending update too soon would likely introduce a race condition (or rather more of a race condition than there already is).
I also have a vague recollection that you asked my why we did this at some point and I didn't have a good answer, but now that I see this comment right next to the code I remember.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, yeah, that definitely seems like a crazy way of doing things. I can think of other adjectives too! But thanks for the backstory. I updated the comment; let me know if it appears inaccurate or incomplete:
// We don't evict things unless they've been processed because Orca expects to see these keys
// when calling pendingOnDemandRequests, and then knows the on-demand cache request has
// finished when they disappear. If they go away before Orca has a chance to see them, Orca
// decides the on-demand cache request was ignored completely and sends it again. So we have
// to leave them around a while so Orca can observe them. Is this weird? Yes. Yes, it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only slight comment is that orca doesn't know they've succeded "when they disappear" but rather when it sees they have a processedCount
of >0, but other than that looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, great, thanks! Updated the comment (again).
} | ||
|
||
@Override | ||
public OnDemandResult handle(ProviderCache providerCache, Map<String, ?> data) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The super
is annotated @Nullable
; not sure if it's good practice to also annotate this as such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good point. I think that because @Nullable
is not marked @Inherited
, you do have to add it here too.
private static final MapSplitter IMAGE_DESCRIPTION_SPLITTER = | ||
Splitter.on(',').withKeyValueSeparator(": "); | ||
|
||
private final GoogleNamedAccountCredentials credentials; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how much you want to annotate everything @Nonnull
but most of these private final
variables are at some point in the class used without null checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added @ParametersAreNonnullByDefault
to the class. (Which then caused IntelliJ to highlight a missing @Nullable
annotation.)
.get("cacheResults"), | ||
new TypeReference<Map<String, List<DefaultCacheData>>>() {}); | ||
onDemandData.forEach( | ||
(namespace, cacheDatas) -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cacheDatas
always confuses me, but I really can't think of a better name for a collection of things called cacheData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I briefly had this as cacheDataObjects
but decided to go with cacheDatas
. 😕
return ImmutableList.of(); | ||
} | ||
return distributionPolicy.getZones().stream() | ||
.map(z -> Utils.getLocalName(z.getZone())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe more stream-y as:
.map(DistributionPolicyZoneConfiguration::getZone)
.map(Utils::getLocalName)
🤷♂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dunno, I'm generally of the mind that merging consecutive map calls aids readability instead of hinders it? (Obviously up to some limit, where the lambda is still relatively readable.)
IOW: I don't think that's necessary.
launchConfig.put("minCpuPlatform", instanceTemplate.getProperties().getMinCpuPlatform()); | ||
setSourceImage(serverGroup, launchConfig, disks, credentials, providerCache); | ||
} | ||
serverGroup.setLaunchConfig(launchConfig); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like one of the few places we set a mutable collection on the serverGroup
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, fixed this and the other place (serverGroup.setAsg).
Sadly a GoogleServerGroup is going to be very difficult to make fully immutable because it has several fields that are mutable objects we don't control (from the Google Compute APIs) :/
template.getProperties().getDisks().stream() | ||
.filter(disk -> !disk.getBoot()) | ||
.forEach(sortedDisks::add); | ||
return sortedDisks; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be immutable (particularly since the first return
does return an immutable list).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, thanks.
if (autoscaler.getStatusDetails() != null) { | ||
serverGroup.setAutoscalingMessages( | ||
autoscaler.getStatusDetails().stream() | ||
.filter(details -> details.getMessage() != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I might invert the filter
and map
since we're just filtering on the result of the map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, looks much nicer too:
autoscaler.getStatusDetails().stream()
.map(AutoscalerStatusDetails::getMessage)
.filter(Objects::nonNull)
.collect(toImmutableList())
This tests the behavior discussed on the review of spinnaker#4002. I should have added it there, but I didn't think about it until now.
This tests the behavior discussed on the review of spinnaker#4002. I should have added it there, but I didn't think about it until now.
…4007) * refactor(gce): remove an unnecessary checkState() We can survive this condition just fine. * test(gce): Add a test about on-demand caching behavior This tests the behavior discussed on the review of #4002. I should have added it there, but I didn't think about it until now. * refactor(gce): fix an import * refactor(gce): remove `static` from a method * refactor(gce): use getAccountName() consistently * docs(gce): remove a TODO It actually does seem to be true that an autoscaler's name always equals that of its associated instance group. This _might_ even be a GCE rule, but it's definitely something we always do in Spinnaker (see UpsertGoogleAutoscalingPolicyAtomicOperation and GCEUtil#buildAutoscaler). * refactor(gce): Prepare zonal caching agent for an abstract superclass About 90% of the code from GoogleZonalServerGroupCachingAgent was just copied into GoogleRegionalServerGroupCachingAgent. I'm going to extract all the common code into an abstract base class. Pretty much all that will be left after I do that are the methods marked `// @Override`. Those methods will be the abstract ones in the superclass. * refactor(gce): Create an AbstractGoogleServerGroupCachingAgent The code is just copy-pasted directly from GoogleZonalServerGroupCachingAgent except that I replaced the methods marked `// @Override` with abstract methods. No other changes were made. * test(gce): Add tests for GoogleRegionalServerGroupCachingAgent * refactor(gce): Convert GoogleRegionalServerGroupCachingAgent to java * refactor(gce): Convert GoogleRegionalServerGroupCachingAgent to java * test(gce): Move server property tests into a test of the abstract class It seems somewhat pointless to have two copies of these. They're essentially just data transformation tests, not really doing anything fancy depending on the subclass. A lot of the other tests could potentially also be merged but I think that would lead to some pretty weird tests that are less useful, so I'd rather keep those separate. * fix(gce): fix a small bug in GoogleRegionalServerGroupCachingAgent This bug is likely pretty innocuous since callers are looking for a specific key amongst the pendingOnDemandRequests and returning an extra one won't hurt anything, but I might as well fix it. It looks like this bug likely exists in other caching agents, too, but since it's very long-standing and the impact is negligible, I'm going to ignore that. * refactor(gce): convert anonymous test class to inner class * refactor(gce): clean up some of the test code
No description provided.