Skip to content
This repository has been archived by the owner on Oct 24, 2023. It is now read-only.

fix: VM and VMSS do not have user assigned identity in its dependsOn #2152

Closed
wants to merge 2 commits into from

Conversation

norshtein
Copy link
Member

Reason for Change:

Background is described in #2082 .

Issue Fixed:

Fixes #2082

Requirements:

Notes:

@acs-bot acs-bot added the size/M label Oct 14, 2019
@jackfrancis
Copy link
Member

/azp run pr-e2e

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jackfrancis
Copy link
Member

@yangl900 @serbrech FYI

@jackfrancis
Copy link
Member

@xuto2 FYI

@codecov
Copy link

codecov bot commented Oct 14, 2019

Codecov Report

Merging #2152 into master will decrease coverage by 3.03%.
The diff coverage is 75%.

@@            Coverage Diff             @@
##           master    #2152      +/-   ##
==========================================
- Coverage   72.47%   69.43%   -3.04%     
==========================================
  Files         140      144       +4     
  Lines       25589    32762    +7173     
==========================================
+ Hits        18545    22748    +4203     
- Misses       5976     8689    +2713     
- Partials     1068     1325     +257

@@ -34,6 +34,9 @@ func CreateMasterVM(cs *api.ContainerService) VirtualMachineARM {
if isStorageAccount {
dependencies = append(dependencies, "[variables('masterStorageAccountName')]")
}
if userAssignedIDEnabled {
dependencies = append(dependencies, "[concat('Microsoft.ManagedIdentity/userAssignedIdentities/', variables('userAssignedID'))]")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we filter the user assigned identity resource and dependency for scalingup/upgrading template? There is some normalization method in transform.go that filter those infra resources/dependencies like nsg/vnet/rt, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not familiar with that. What's the purpose of filtering resources and dependencies? To prevent the redeployment swiping existing configuration? If so, for user assigned identity, that will be ok. Even after redeployment, the identity itself, all role assignment on it and all association relationship will keep unchanged; and keep identity in VM's dependency is also fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the identity have the same principal ID after being redeployed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Contributor

@xuto2 xuto2 Oct 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally when we scale/upgrade a cluster, we should only deploy new VM/VMSS without redeploying the cluster infrastructures, such as VNET/RouteTable/NSG and this UserAssignedIdentity. The consequence of touching these infrastructure is nondeterministic and is tied to how these infra are built/maintained.

If I were you I would filter them out(Look for these below methods in transform.go)

NormalizeMasterResourcesForScaling
NormalizeForK8sVMASScalingUp
NormalizeResourcesForK8sMasterUpgrade
NormalizeResourcesForK8sAgentUpgrade

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think user assigned identity is different from VNET/RouteTable/NSG:
As you said "the consequence of touching these infrastructure is nondeterministic and is tied to how these infra are built/maintained", the biggest difference is it is guaranteed by identity team that a redeployment on a user assigned identity will not change current user assigned identity -- it's a deterministic behavior. And in current AKS-Engine's implementation, the definition of user assigned identity and role assignment on it(at the scope of cluster resource group, that is MC_ resource group in AKS) are not removed from the template when upgrading and scaling, as far as I can see, it brings following benefit:

  • When the identity or the role assignment on it is deleted by mistake, a redeployment can bring them back, that is similar as a "reconciliation". And we don't need to worry about a redeployment on user assigned identity will wipe out anything so a redeployment will never harm.

The main purpose of this PR is to add the user assigned identity to VM's dependsOn, I think we may want to keep current AKS-Engine's transformer implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @xuto2

@jackfrancis jackfrancis added this to In progress in backlog Oct 16, 2019
xuto2
xuto2 previously approved these changes Oct 28, 2019
@xuto2
Copy link
Contributor

xuto2 commented Oct 28, 2019

/lgtm

@acs-bot
Copy link

acs-bot commented Oct 28, 2019

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: norshtein, xuto2
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: devigned

If they are not already assigned, you can assign the PR to them by writing /assign @devigned in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@CecileRobertMichon
Copy link
Contributor

/azp run pr-e2e

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jackfrancis jackfrancis moved this from In progress to Under Review in backlog Oct 31, 2019
@jackfrancis jackfrancis added this to the v0.44.0 milestone Oct 31, 2019
@jackfrancis jackfrancis modified the milestones: v0.44.0, v0.45.0 Nov 20, 2019
@mboersma
Copy link
Member

/azp run pr-e2e

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mboersma
Copy link
Member

mboersma commented Dec 4, 2019

/azp run pr-e2e

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jackfrancis jackfrancis modified the milestones: v0.45.0, v0.46.0 Dec 16, 2019
@jackfrancis
Copy link
Member

Is this stale?

@jackfrancis jackfrancis modified the milestones: v0.46.0, Next Dec 18, 2019
@norshtein
Copy link
Member Author

@jackfrancis not stale. Any thing I need to do to get this PR merged into master branch?

@acs-bot
Copy link

acs-bot commented Dec 19, 2019

New changes are detected. LGTM label has been removed.

@acs-bot acs-bot removed the lgtm label Dec 19, 2019
@jackfrancis
Copy link
Member

@norshtein the reason it wasn't merged is the E2E test failures. I've rebased against master, let's run E2E tests again.

@jackfrancis
Copy link
Member

/azp run pr-e2e

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@stale
Copy link

stale bot commented Jan 19, 2020

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 19, 2020
@stale stale bot removed the stale label Jan 24, 2020
@jackfrancis jackfrancis moved this from Under Review to Blocked in backlog Jan 29, 2020
@timja
Copy link

timja commented Mar 4, 2020

@norshtein the reason it wasn't merged is the E2E test failures. I've rebased against master, let's run E2E tests again.

just glancing through the PRs on this repo, it's green now, is there anything else blocking it?

@jackfrancis
Copy link
Member

I'm running a back-compat test to ensure this change works w/ aks-engine upgrade flows of pre-existing clusters (i.e., clusters created before this change)

@craiglpeters craiglpeters moved this from Blocked to Under Review in backlog Mar 4, 2020
@mboersma
Copy link
Member

mboersma commented Apr 3, 2020

/azp run pr-e2e

I'm running a back-compat test...

@jackfrancis were you able to run this and get any good signal?

@azure-pipelines
Copy link

No pipelines are associated with this pull request.

@jackfrancis
Copy link
Member

We've done some work in this surface area recently, so I'd prefer we start any net new changes from scratch at this point.

@jackfrancis jackfrancis closed this Apr 3, 2020
backlog automation moved this from Under Review to Done Apr 3, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
No open projects
backlog
  
Done
Development

Successfully merging this pull request may close these issues.

Deployment will fail for cluster using user assigned identity when identity service is throttled
7 participants