feat(export): Export deployment strategy #1055

luispollo · 2020-04-21T23:17:32Z

Adds support for exporting the deployment strategy associated with a cluster based on information retrieved from pipelines (for not-yet-migrated apps) and tasks (for already-managed apps or clusters created manually in the UI).

Limitations: currently, the export supports createServerGroup tasks and deploy pipeline stages only. Support for clone operations, if necessary, will be addressed in a separate PR.

Closes #753. As I was working in this area of the code anyway, I've taken the opportunity to also fix #625.

luispollo · 2020-04-21T23:19:36Z

keel-core/src/main/kotlin/com/netflix/spinnaker/keel/core/api/ClusterDeployStrategy.kt

+  val rollbackOnFailure: Boolean? = true,
+  val resizePreviousToZero: Boolean? = false,
+  val maxServerGroups: Int? = 2,
+  val delayBeforeDisable: Duration? = ZERO,
+  val delayBeforeScaleDown: Duration? = ZERO,


I had to make these nullable in order to fix #625.

luispollo · 2020-04-21T23:24:14Z

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt

@@ -310,7 +316,7 @@ class ClusterHandler(
    }

  override suspend fun export(exportable: Exportable): ClusterSpec {
-    // Get existing infrastructure
+    // Get existing infrastructure -- this is a very costly call


I added this note because it routinely takes more than 30 seconds to retrieve the details for a single server group from clouddriver, which is insane. I'm not sure we can do much on our end, since it's not really information we would want to cache. Or maybe we do?

Do you think it's worth putting a metric and alert to make sure we don't end up calling this too often?

Good question... One of my goals for the week is to take some time to teach myself metrics and Atlas (for real this time, unlike during onboarding 😝), so I could definitely add one here.
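For illustration only, here is a minimal sketch of the kind of instrumentation being discussed: count the calls, time them, and flag the slow ones so an alert could key off that number. `CallRecorder`, its parameters, and the 30-second threshold are hypothetical names for this sketch, not anything from keel; a real change would presumably publish to the team's metrics backend rather than keep counters in memory.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Hypothetical in-memory recorder, for illustration only.
// A real implementation would publish to a metrics backend instead.
class CallRecorder(
  private val slowThresholdMillis: Long,
  // The clock is injectable so the recorder can be exercised without real waiting.
  private val nanoClock: () -> Long = System::nanoTime
) {
  private val count = AtomicLong()
  private val slow = AtomicLong()

  val calls: Long get() = count.get()
  val slowCalls: Long get() = slow.get()

  // Runs [block], counting the call and flagging it as slow if it took
  // at least [slowThresholdMillis]. Failures are counted as well.
  fun <T> timed(block: () -> T): T {
    val start = nanoClock()
    try {
      return block()
    } finally {
      val elapsedMillis = (nanoClock() - start) / 1_000_000
      count.incrementAndGet()
      if (elapsedMillis >= slowThresholdMillis) slow.incrementAndGet()
    }
  }
}
```

Wrapping the costly lookup in `recorder.timed { ... }` would then give one number for "how often are we calling this" and another for "how often is it slow", which is what an alert would need.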

it routinely takes more than 30 seconds to retrieve the details for a single server group from clouddriver

wut? Unless we're talking about something like p99 latency, that seems unbelievably wrong and bad. Do you have a sense of what percentile 'routinely' maps to? And if it's not something crazy high like p99, have you considered reaching out to our platform team folks to figure out why we're seeing such unreasonable latency on a regular basis?

Routinely as in > 90% of the calls. I believe I did reach out in chat about this at one point, and meant to do it again when I was testing this, but ended up forgetting. Will do.

Update: collected some data from the logs and it turns out it's not a single call, but a combination of dozens of calls with near-second response times. But it looks like we can improve on our end, so digging into that now.

@luispollo would you mind creating a separate issue to look into why this is taking 30+ seconds? You can assign it to me - I'm happy to look into it. I also feel it shouldn't take that long, maybe I can make it better.

Here it is: #1060
I'm already on it: spinnaker/clouddriver#4540

Worth removing this comment if you've fixed the problem? Or do we still need to confirm that the fix works?

Still testing. Will remove the comment anyway as it's not actionable.

luispollo · 2020-04-21T23:25:17Z

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt

+   * server group.
+   */
+  private fun ServerGroup.discoverDeploymentStrategy(): ClusterDeployStrategy? {
+    val entityTags = runBlocking {


First step is to look for the spinnaker:metadata entity tag.

luispollo · 2020-04-21T23:25:48Z

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt

+
+    val executionType = spinnakerMetadata["executionType"]!!.toString()
+    val executionId = spinnakerMetadata["executionId"]!!.toString()
+    val execution = runBlocking {


Once found, we look up the corresponding execution in orca.

luispollo · 2020-04-21T23:27:20Z

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt

+    }
+
+    // get context from the appropriate execution stage for the execution type, drilling down into the data as needed
+    val context = if (executionType == "pipeline") {


This set of statements finds the right spot in the data structure (which is different between a task execution and a pipeline execution) to extract the context data we need.
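To make the difference between the two shapes concrete, here is a rough sketch. The `Stage` and `Execution` types, the `"clusters"` key, and the `"orchestration"` type name used in the usage note are assumptions made for this example only; the actual orca response types used in the PR are not shown here and may differ.

```kotlin
typealias Context = Map<String, Any?>

// Hypothetical, simplified stand-ins for an orca execution and its stages.
data class Stage(val type: String, val context: Context)
data class Execution(val stages: List<Stage>)

// Picks the context that carries the deployment settings.
// Assumption: a pipeline "deploy" stage nests per-cluster settings under a
// "clusters" list, while a "createServerGroup" task stage holds them directly.
@Suppress("UNCHECKED_CAST")
fun findDeployContext(executionType: String, execution: Execution): Context? =
  if (executionType == "pipeline") {
    val deploy = execution.stages.firstOrNull { it.type == "deploy" }
    val clusters = deploy?.context?.get("clusters") as? List<*>
    clusters?.firstOrNull() as? Context
  } else {
    execution.stages.firstOrNull { it.type == "createServerGroup" }?.context
  }
```

With these stand-ins, `findDeployContext("pipeline", execution)` drills into the first cluster of the deploy stage, while any other execution type (for example `"orchestration"`) reads the `createServerGroup` stage context as-is. Both return `null` when no matching stage exists.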

luispollo · 2020-04-21T23:28:30Z

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt

+      }
+
+    return when (val strategy = context?.get("strategy")) {
+      "redblack" -> RedBlack.fromOrcaStageContext(context)


Finally, once we've found the strategy + other context bits, convert those into an instance of the corresponding ClusterDeployStrategy sub-type.
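A simplified sketch of that last conversion step follows. It is not the PR's code: `DeployStrategy`, `fromContext`, the `Highlander` case, and the context keys `maxRemainingAsgs` and `delayBeforeDisableSec` are assumptions chosen for the example, and only two of the RedBlack fields are modeled.

```kotlin
import java.time.Duration

// Hypothetical, trimmed-down strategy hierarchy for illustration.
sealed class DeployStrategy

data class RedBlack(
  val maxServerGroups: Int? = 2,
  val delayBeforeDisable: Duration? = Duration.ZERO
) : DeployStrategy() {
  companion object {
    // Assumed context keys; values that are missing or of the wrong type
    // fall back to the defaults.
    fun fromContext(context: Map<String, Any?>) = RedBlack(
      maxServerGroups = (context["maxRemainingAsgs"] as? Number)?.toInt() ?: 2,
      delayBeforeDisable = (context["delayBeforeDisableSec"] as? Number)
        ?.let { Duration.ofSeconds(it.toLong()) } ?: Duration.ZERO
    )
  }
}

object Highlander : DeployStrategy()

// Maps the "strategy" entry of a stage context to a strategy instance,
// returning null when there is no context or the strategy is not recognized.
fun toStrategy(context: Map<String, Any?>?): DeployStrategy? {
  if (context == null) return null
  return when (context["strategy"]) {
    "redblack" -> RedBlack.fromContext(context)
    "highlander" -> Highlander
    else -> null
  }
}
```

Returning `null` for an unknown strategy mirrors the nullable return type of `discoverDeploymentStrategy()` in the diff above, so the caller can decide what to export when nothing was discovered.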

luispollo · 2020-04-21T23:29:03Z

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt

@@ -400,18 +502,6 @@ class ClusterHandler(
      }
    }

-  override suspend fun actuationInProgress(resource: Resource<ClusterSpec>) =


Just moved this up so it comes before the private extension functions.

lorin

I had some questions, but nothing that would block merge.

lorin · 2020-04-22T16:46:35Z

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt

@@ -310,7 +316,7 @@ class ClusterHandler(
    }

  override suspend fun export(exportable: Exportable): ClusterSpec {
-    // Get existing infrastructure
+    // Get existing infrastructure -- this is a very costly call


Do you think it's worth putting a metric and alert to make sure we don't end up calling this too often?

emjburns · 2020-04-22T21:42:38Z

It looks like you're only doing this for ec2 - do you have any plans to implement this in a more generic way that would work for both ec2 and titus?

luispollo · 2020-04-22T23:40:20Z

It looks like you're only doing this for ec2 - do you have any plans to implement this in a more generic way that would work for both ec2 and titus?

You know, I completely forgot that Titus clusters had a deployment strategy too. 🤦‍♂️

luispollo · 2020-04-23T02:48:57Z

keel-orca/src/main/kotlin/com/netflix/spinnaker/keel/orca/ClusterExportHelper.kt

+ * Provides common logic for multiple cloud plugins to export aspects of compute clusters.
+ */
+@Component
+class ClusterExportHelper(


I wasn't sure in which module to put this class since it has a dependency on keel-orca and keel-clouddriver. It turned out that keel-orca already had a dependency on keel-clouddriver, so I decided to put it here. 🤷‍♂️

Sounds reasonable to me!

emjburns · 2020-04-23T18:53:57Z

keel-orca/src/main/kotlin/com/netflix/spinnaker/keel/orca/ClusterExportHelper.kt

+    application: String,
+    serverGroupName: String
+  ): ClusterDeployStrategy? {
+    return kotlinx.coroutines.coroutineScope {


nit: does this import need to be so fully qualified?

luispollo · 2020-04-23T18:55:01Z

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt

-    any {
-      !it.isCapacityOrAutoScalingOnly()
-    }
+  override suspend fun actuationInProgress(resource: Resource<ClusterSpec>) =


Just moved this up so it comes before the private extension functions.

emjburns

LGTM, just two small nits around making an import less fully qualified, and removing the comment around "this call is expensive" if you're actively addressing it

luispollo requested review from robfletcher, lorin, emjburns and gal-yardeni April 21, 2020 23:17

luispollo commented Apr 21, 2020

keel-core/src/main/kotlin/com/netflix/spinnaker/keel/core/api/ClusterDeployStrategy.kt

luispollo commented Apr 21, 2020

keel-core/src/main/kotlin/com/netflix/spinnaker/keel/plugin/ResourceSpecExportHelper.kt Outdated

luispollo commented Apr 21, 2020

robfletcher reviewed Apr 21, 2020

keel-core/src/main/kotlin/com/netflix/spinnaker/keel/core/api/ClusterDeployStrategy.kt

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt Outdated

luispollo commented Apr 21, 2020

luispollo added API export labels Apr 21, 2020

lorin approved these changes Apr 22, 2020

luispollo force-pushed the export-deployment-strategy branch from c914248 to c412dd0 April 22, 2020 18:32

lorin mentioned this pull request Apr 22, 2020

REQUEST: New Approver status for lorin spinnaker/governance#121

Closed

emjburns reviewed Apr 22, 2020

keel-ec2-plugin/src/main/kotlin/com/netflix/spinnaker/keel/ec2/resource/ClusterHandler.kt Outdated

luispollo commented Apr 23, 2020

luispollo self-assigned this Apr 23, 2020

emjburns reviewed Apr 23, 2020

keel-orca/src/main/kotlin/com/netflix/spinnaker/keel/orca/ClusterExportHelper.kt Outdated

luispollo commented Apr 23, 2020

luispollo force-pushed the export-deployment-strategy branch from eab0954 to ec8bea1 April 23, 2020 19:00

emjburns reviewed Apr 23, 2020

...s-plugin/src/main/kotlin/com/netflix/spinnaker/keel/api/titus/cluster/TitusClusterHandler.kt

emjburns reviewed Apr 23, 2020

keel-orca/src/main/kotlin/com/netflix/spinnaker/keel/orca/ClusterExportHelper.kt Outdated

emjburns reviewed Apr 23, 2020

keel-orca/src/main/kotlin/com/netflix/spinnaker/keel/orca/ClusterExportHelper.kt Outdated

emjburns reviewed Apr 23, 2020

keel-orca/src/main/kotlin/com/netflix/spinnaker/keel/orca/ClusterExportHelper.kt Outdated

emjburns approved these changes Apr 23, 2020

luispollo added 12 commits April 23, 2020 17:12

feat(export): First stab at exporting deployment strategy

4b63f90

feat(export): Add details to RedBlack export

7d48b5f

feat(export): Omit defaults for deployment strategy

7878d47

tests(export): Fix/add tests for deployment strategy export

9f34059

fix(pr): Make discoverDeploymentStrategy suspend

f3044eb

fix(pr): Remove unused function

21edbcc

fix(pr): Address review feedback

abf634c

feat(export): Export deployment strategy for Titus clusters

05b34ee

fix(pr): Fix defaults for deploy strategy

a897a7d

fix(pr): Cosmetics

ce45cec

fix(pr): Address review feedback

0cb0db0

fix(pr): Remove comment about issue tracked separately

5ed5870

luispollo force-pushed the export-deployment-strategy branch from 66b15b6 to 5ed5870 April 24, 2020 00:15

luispollo merged commit a4cdc68 into spinnaker:master Apr 24, 2020

luispollo deleted the export-deployment-strategy branch April 24, 2020 00:44

luispollo mentioned this pull request Apr 28, 2020

feat(export): export artifacts from clusters #1071

Merged
