mounts now handle deleted cluster correctly #87

Merged
merged 13 commits into master on Jun 11, 2020

Conversation

storey247

  • Reworked the mount logic to check for the error state indicating the cluster is invalid
  • When that error is detected, the id of the resource is cleared to indicate it must be re-created (see the sketch below)
  • Tests added to cover the updated logic

Ideally there would be no need to re-create the mount, but as we cannot check that the mount really does exist, this feels like the safest way to solve the issue.

This addresses issue #86
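
For illustration, a minimal Go sketch (not the provider's actual code; the attribute name, client interface and error-string check are assumptions) of how a mount resource's read function can clear the resource ID when the backing cluster is reported as invalid, so the plan schedules re-creation instead of failing:

package example

import (
	"strings"

	"github.com/hashicorp/terraform-plugin-sdk/helper/schema"
)

// clusterClient stands in for the provider's API client.
type clusterClient interface {
	GetCluster(clusterID string) error
}

func resourceMountRead(d *schema.ResourceData, m interface{}) error {
	client := m.(clusterClient)

	clusterID := d.Get("cluster_id").(string)
	if err := client.GetCluster(clusterID); err != nil {
		// Assumed error check: the API reports the cluster as missing or invalid.
		if strings.Contains(err.Error(), "does not exist") {
			// Clearing the ID tells Terraform the resource is gone, so the next
			// plan marks it for re-creation rather than aborting with an error.
			d.SetId("")
			return nil
		}
		return err
	}
	return nil
}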

@@ -39,7 +67,7 @@ func TestAccAzureAdlsGen2Mount_capture_error(t *testing.T) {
 }

 func testAccAzureAdlsGen2Mount_correctly_mounts() string {
-	clientID := os.Getenv("ARM_CLIENT_ID")
+	clientID := os.Getenv("DATABRICKS_AZURE_CLIENT_ID")
Contributor

What was the reason for changing this?

Author

Hmm, I think this was when I was struggling to get the tests running: the .env wasn't applying and clientID was being set to null, causing tests to fail.

I've since figured out the issue there, so I can probably roll this back. The actual values of ARM_CLIENT_ID and DATABRICKS_AZURE_CLIENT_ID are the same, though.

Will revert this change :)

@codecov-commenter

codecov-commenter commented Jun 9, 2020

Codecov Report

Merging #87 into master will decrease coverage by 0.11%.
The diff coverage is 6.12%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #87      +/-   ##
==========================================
- Coverage   47.27%   47.16%   -0.12%     
==========================================
  Files          53       53              
  Lines        6815     6838      +23     
==========================================
+ Hits         3222     3225       +3     
- Misses       3539     3559      +20     
  Partials       54       54              
Flag Coverage Δ
#client 76.45% <ø> (ø)
#provider 38.63% <6.12%> (-0.12%) ⬇️
Impacted Files Coverage Δ
databricks/mounts.go 0.00% <0.00%> (ø)
...ricks/resource_databricks_azure_adls_gen1_mount.go 21.87% <0.00%> (-0.71%) ⬇️
...ricks/resource_databricks_azure_adls_gen2_mount.go 36.70% <0.00%> (-0.96%) ⬇️
databricks/resource_databricks_azure_blob_mount.go 21.66% <0.00%> (-0.75%) ⬇️
databricks/resource_databricks_cluster.go 42.61% <ø> (+0.11%) ⬆️
databricks/utils.go 5.88% <100.00%> (+5.88%) ⬆️

@stuartleeks
Contributor

@storey247 - Nice! I know that we've bumped into this issue quite a few times. Will be good to have it fixed 😄

@stikkireddy
Contributor

stikkireddy commented Jun 9, 2020

Hey, quick question on the implementation: if we can't find the cluster, we set the id of the resource to null, but it would go through the process of mounting again if the user re-applies the resource, correct? Then wouldn't the mount fail because it already exists?

@storey247 @stuartleeks

@storey247
Author

storey247 commented Jun 9, 2020

@stikkireddy So I tested this out: because we set the id to "", the tf plan marks this resource as needing to be destroyed. As the destroy will remove the mount from the workspace, it should be quite safe to create it again.

As I said, this isn't ideal, as the mount isn't exactly tightly coupled to the cluster id, but it removes the issue where you can no longer plan when the cluster id is invalid because the provider throws an error.

I also added acceptance tests to check this scenario too 😄

@stikkireddy
Contributor

@storey247 awesome, thanks for the explanation. Just a quick question: it will get past the plan phase because the read works during the state refresh (it sets the id to ""), but during the destroy the cluster is invalid, so will it throw a runtime error during the apply?

@storey247
Author

That’s actually a good point; my acceptance tests didn’t fail on this, so I’m not sure if the tests are missing something 🤔

Will test the scenario manually using a local build of the provider and see what happens

@storey247
Author

storey247 commented Jun 9, 2020

@stikkireddy OK, so I double-checked all this: setting the ID to be empty doesn't actually cause the old mount to be destroyed. Interestingly, it just seems to make Terraform forget about it during the plan.

However, an issue arose because it would attempt to re-create the mount with the same configuration, and this would throw an error inside the mount create logic due to the directory already being mounted. I have now added a check for an existing mount prior to creating it; if it already exists, we skip trying to re-create it (see the sketch after this comment).

The reason my tests didn't pick this up is that the mount name is generated by including the cluster name in the mount resource; that way it always had a unique name and never hit the conflict I noticed running locally 😄
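
To illustrate the idea, a minimal Go sketch (illustrative names only, not the provider's actual API) of the pre-create check described above: list the existing mount points and skip the create call when the mount is already present, so re-applying the resource does not fail because the directory is already mounted:

package example

import "fmt"

// mountClient stands in for whatever lists and creates mounts on a cluster.
type mountClient interface {
	ListMountPoints(clusterID string) ([]string, error)
	CreateMount(clusterID, mountName string) error
}

func ensureMount(c mountClient, clusterID, mountName string) error {
	existing, err := c.ListMountPoints(clusterID)
	if err != nil {
		return fmt.Errorf("listing mounts: %w", err)
	}
	for _, name := range existing {
		if name == mountName {
			// The mount already exists; re-creating it would fail, so treat
			// this as success and keep the existing mount.
			return nil
		}
	}
	return c.CreateMount(clusterID, mountName)
}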

@stikkireddy
Contributor

@storey247 can you also please update the AzureBlobMount to add this capability, and add a warning/note to the documentation so that the user is aware that if the cluster is deleted, this resource will also be marked for deletion, but when re-created it will not attempt to create the mount again.

@storey247
Author

@stikkireddy thanks for the great feedback. I have made the following updates as you requested:

  • Updated the ADLS gen 1 mount read to fix the issue with the cluster not being available. Gen 1, gen 2 and blob mounts now all work as expected during plan.
  • Added a fix to AzureBlobMount to prevent mount creation failing when the mount already exists
  • Updated docs to flag that the mount will be tainted for re-creation by tf if the cluster is deleted, but that it will not actually get re-created in the workspace unless the configuration has changed

I have tested all these changes locally, everything works correctly, and I have added functional tests where necessary.

I hope this fix is now satisfactory and can be merged? 😃

@stikkireddy
Contributor

@storey247 thank you very much for contributing this fix 👍

Development

Successfully merging this pull request may close these issues.

[ISSUE] ADLS Mount with cluster that has been deleted causes error in tf plan
4 participants