docs: upgrade + cluster-autoscaler notes #381

Merged · 5 commits · Jan 31, 2019
35 changes: 33 additions & 2 deletions docs/topics/upgrade.md
@@ -76,6 +76,37 @@ For example,
--client-secret xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```

## Known Limitations

### Manual reconciliation

The upgrade operation is long running and, for large clusters, more susceptible to single operational failures. This follows from the design principle of upgrade enumerating through each node in the cluster, one at a time. A transient Azure resource allocation error could thus interrupt the successful progression of the overall transaction. At present, the upgrade operation is implemented to "fail fast"; if a well-formed upgrade operation fails before completing, it can be manually retried by invoking the exact same command line arguments as were sent originally. The upgrade operation will enumerate through the cluster nodes, skipping any node that has already been upgraded to the desired Kubernetes version. Nodes that still match the *original* Kubernetes version will then, one at a time, be cordoned, drained, and upgraded to the desired version. Put another way, an upgrade command is designed to be idempotent across retry scenarios.
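
For illustration, a retry is simply the original invocation re-run verbatim. A minimal sketch, assuming the `client_secret` auth flow shown above (all values are placeholders; reuse whatever arguments your first attempt used):

```
# Hypothetical retry: re-run the failed upgrade with the exact same arguments.
# Nodes already at the desired version are skipped; the remaining nodes are
# cordoned, drained, and upgraded one at a time.
$ aks-engine upgrade \
  --subscription-id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
  --api-model _output/mycluster/apimodel.json \
  --location westus2 \
  --resource-group mycluster-rg \
  --upgrade-version 1.11.5 \
  --auth-method client_secret \
  --client-id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
  --client-secret xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```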

### Cluster-autoscaler + VMSS
> **Contributor:** Let's add a link/reference to this doc in the cluster-autoscaler doc examples/addons/cluster-autoscaler/README.md

> **Member:** Yep, we can add one after this is merged.

> **Contributor:** Why not add it in the same PR?

> **Member:** Sorry, I meant add a link in the cluster-autoscaler repo back to here. And yep, in examples, it could be added in the same PR.


At present, the Azure cloudprovider cluster-autoscaler implementation for VMSS relies upon the original ARM template deployment specification to inform the Azure IaaS configuration (VM, NIC, CustomScriptExtension, etc.) used to scale out new nodes.
> **Member:** aks-engine upgrade would also update the model of the VMSS itself, right? If so, then newly scaled nodes should also get this new model.

> **Member (Author):** In practice, this is not the case. The kubectl get nodes output I pasted below is from a VMSS upgrade followed by a cluster-autoscaler event. So apparently the answer is "no, aks-engine upgrade does not update the model of the VMSS itself".

> **Member:** OK, then scale up for VMSS would still use the original old model. Do you have any ideas to make CA also work for upgraded clusters?

> **Member (Author):** O.K., so I have evidence that what I said is not true. Perhaps I was testing a VMAS agent pool. I will upgrade my VMSS test cluster a few more times and validate that cluster-autoscaler continues to respect the versions as they move forward.

> **Member:** Thanks. I'm wondering whether that's only true if the model of the VMSS itself is also upgraded.

Because `aks-engine upgrade` also employs ARM template deployments to evolve the cluster state going forward, as soon as you upgrade your cluster, cluster-autoscaler will no longer scale out nodes at the upgraded Kubernetes version; new nodes come up at the original version instead. You will see cluster-autoscaler scale out scenarios that look like this one:

```
$ kubectl get nodes
NAME                                STATUS    ROLES     AGE       VERSION
k8s-agentpool-27988949-vmss00000a   Ready     agent     1h        v1.11.5
k8s-agentpool-27988949-vmss00000m   Ready     agent     1m        v1.10.12
k8s-agentpool-27988949-vmss00000n   Ready     agent     1m        v1.10.12
k8s-agentpool-27988949-vmss00000o   Ready     agent     1m        v1.10.12
k8s-agentpool-27988949-vmss00000p   Ready     agent     1m        v1.10.12
k8s-agentpool-27988949-vmss00000q   Ready     agent     1m        v1.10.12
k8s-agentpool-27988949-vmss00000r   Ready     agent     1m        v1.10.12
k8s-master-27988949-0               Ready     master    2h        v1.11.5
```

For this reason, we do not recommend incorporating `aks-engine upgrade` into your operational workflow if you also rely upon cluster-autoscaler + VMSS.
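
To confirm the configuration that newly scaled nodes will receive, one hypothetical check is to inspect the VMSS model directly with the Azure CLI (the resource group and VMSS names below are illustrative, derived from the node names above):

```
# Hypothetical check: the VMSS model still reflects the original ARM template
# deployment, so this is the configuration cluster-autoscaler scales out with.
$ az vmss show \
    --resource-group mycluster-rg \
    --name k8s-agentpool-27988949-vmss \
    --query "virtualMachineProfile.storageProfile.imageReference"
```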

### Cluster-autoscaler + VMAS

A similar scenario exists for VMAS, but because the cluster-autoscaler spec includes a configurable ARM template deployment reference, you may manually maintain that reference over time so that it stays current with the ARM template deployment that `aks-engine upgrade` creates during an upgrade operation.
> **Member:** Add another paragraph on how to get the upgraded ARM templates and parameters?


// TODO describe the above
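
Until that paragraph is written, here is a minimal sketch of what maintaining the reference could look like, assuming the cluster-autoscaler Azure cloudprovider reads an `ARM_DEPLOYMENT` value for the standard (VMAS) VM type; all resource names are illustrative:

```
# Hypothetical sketch: locate the ARM deployment created by the latest
# `aks-engine upgrade` run, then point cluster-autoscaler at it.
$ az group deployment list \
    --resource-group mycluster-rg \
    --query "[].{name:name, timestamp:properties.timestamp}" \
    --output table

# Update the deployment reference; the pod restarts with the new value.
$ kubectl -n kube-system set env deployment/cluster-autoscaler \
    ARM_DEPLOYMENT=<latest-deployment-name>
```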


By its nature, the upgrade operation is long running and could fail for various reasons, such as a temporary lack of resources. In such cases, rerun the command; the `upgrade` command is idempotent and will pick up execution from the point where it failed.