Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Modules] Fix/Add purge during removal of machine learning workspace #3597

Merged

Conversation

MariusStorhaug
Copy link
Contributor

@MariusStorhaug MariusStorhaug commented Aug 20, 2023

Fixes #3596

Description

  • ML Removal
    • The ML Removal uses a REST API call that returns HTTP 202 (Accepted) which kicks of a deletion job and returns to the caller. This seems to leave the workspace with a chance of being removed during resource group deletion, causing the resource to not be purged while deleted, aka just soft-deleted. To handle this I added a wait'n'check loop that runs for upto 1 hour (240 checks x 15s break).
  • Out of original scope:
    • Resource locks
      • Resource Locks are now handled before a resource is attempted to be removed/purged.
        • During some of my testing for fixing the ML workspace deletion, I found that when the deployment failed the locks were not detected. Reason unknown, but it never showed up on the list so I added this to detect such cases and remove them regardless.
      • Resource Lock removal code has a new home, a helper script Invoke-ResourceLockRemoval.
        • Does a wait'n'check (10 x 10) for regular resource lock deletion, increasing the chance of a successful deletion.
      • New script to get resource locks, as the default Remove-AzResourceLock does not handle LockId.
    • Script cleanup
      • Parameter casing updated in whole script
      • Remove unused parameters
      • Align wording in logs
      • Remove DevOpsLabs section as new default flow does lock removal before each special case is handled.
Before log: Before log
This PR This PR
Reran a deployment, where I went in to add some 'rogue' locks, result is as follows: Reran a deployment, where I went in to add some 'rogue' locks, result is as follows:

Can be seen on the pipeline linked below on the enc tests.

Pipeline references

Pipeline
MachineLearningServices - Workspaces

Type of Change

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Update to documentation

Checklist

  • I'm sure there are no other open Pull Requests for the same update/change
  • My corresponding pipelines / checks run clean and green without any errors or warnings
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (readme)
  • I did format my code

@MariusStorhaug MariusStorhaug requested a review from a team as a code owner August 20, 2023 18:47
@MariusStorhaug MariusStorhaug marked this pull request as draft August 20, 2023 19:06
@MariusStorhaug MariusStorhaug changed the title [Modules] Add purge during removal of machine learning workspace [Modules] Fix/Add purge during removal of machine learning workspace Aug 20, 2023
@MariusStorhaug MariusStorhaug force-pushed the feat/purgeMachineLearningWorkspace branch from 4436e37 to fd73e7c Compare August 20, 2023 19:14
@MariusStorhaug
Copy link
Contributor Author

The temp commit is to fix the privateEndpoint API version issues. This is just to show the functionality of the removal script changes and will be reverted as it is part of #3585.

@MariusStorhaug MariusStorhaug force-pushed the feat/purgeMachineLearningWorkspace branch from febb9e5 to 6ea01aa Compare August 21, 2023 19:38
@MariusStorhaug MariusStorhaug force-pushed the feat/purgeMachineLearningWorkspace branch from f55d543 to 91c3f53 Compare August 21, 2023 22:04
@MariusStorhaug MariusStorhaug marked this pull request as ready for review August 21, 2023 22:05
@AlexanderSehr AlexanderSehr added enhancement New feature or request [cat] pipelines category: pipelines [cat] utilities category: utilities labels Aug 22, 2023
@AlexanderSehr AlexanderSehr added this to the Release v0.11.0 milestone Aug 30, 2023
Co-authored-by: Alexander Sehr <ASehr@hotmail.de>
MariusStorhaug and others added 4 commits September 4, 2023 10:46
…val.ps1

Co-authored-by: Alexander Sehr <ASehr@hotmail.de>
…val.ps1

Co-authored-by: Alexander Sehr <ASehr@hotmail.de>
…val.ps1

Co-authored-by: Alexander Sehr <ASehr@hotmail.de>
…val.ps1

Co-authored-by: Alexander Sehr <ASehr@hotmail.de>
AlexanderSehr
AlexanderSehr previously approved these changes Sep 4, 2023
@MariusStorhaug
Copy link
Contributor Author

…Removal.ps1

Co-authored-by: Alexander Sehr <ASehr@hotmail.de>
…Removal.ps1

Co-authored-by: Alexander Sehr <ASehr@hotmail.de>
@MariusStorhaug
Copy link
Contributor Author

Pipelines are not green, but failing for other reasons. Feel free to merge.

@AlexanderSehr AlexanderSehr merged commit 2bd0601 into Azure:main Sep 4, 2023
1 check passed
@MariusStorhaug MariusStorhaug deleted the feat/purgeMachineLearningWorkspace branch September 4, 2023 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[cat] pipelines category: pipelines [cat] utilities category: utilities enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Purge Azure Machine Learning workspaces when removing sometimes fails or is incomplete
3 participants