ResourceNotFound error during Terraform Apply in a functional test #7060

Closed · 1 task
ytimocin opened this issue Jan 21, 2024 · 9 comments
Assignees
Labels
bug Something is broken or not working as expected · flaky-test Flaky functional/unit tests. · maintenance Issue is a non-user-facing task like updating tests, improving automation, etc. · triaged This issue has been reviewed and triaged

Comments

@ytimocin
Contributor

ytimocin commented Jan 21, 2024

Steps to reproduce

I have seen this error twice in two days during long-running tests: #7040, #7050.

Observed behavior

The resource on Azure is not yet created when we are executing a recipe. I am not sure how this works exactly, but I believe that we should wait for the resource to be up and running.

Desired behavior

The resource should be available when we are executing a recipe.

Workaround

Rerunning the test works.

rad Version

RELEASE   VERSION   BICEP    COMMIT
0.29.0    v0.29.0   0.29.0   6abd7bf

Operating system

macOS Sonoma 14.2.1, i386

Additional context

#7040
#7050

AB#10962

@ytimocin ytimocin added the bug Something is broken or not working as expected label Jan 21, 2024
@radius-triage-bot

👋 @ytimocin Thanks for filing this bug report.

A project maintainer will review this report and get back to you soon. If you'd like immediate help troubleshooting, please visit our Discord server.

For more information on our triage process please visit our triage overview

@kachawla
Contributor

kachawla commented Jan 22, 2024

@ytimocin are we seeing this happen consistently in long running tests, or is it passing sometimes?

I believe that we should wait for the resource to be up and running.

The error is happening during terraform apply, so managing dependencies between resources is entirely handled by terraform at this point and isn't something that we can control. Do we re-use the same resource group for concurrent executions of functional test runs in the long running cluster?

"message": "terraform apply failure: exit status 1\n\nError: failed creating container: failed creating container: containers.Client#Create: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code=\"ResourceNotFound\" Message=\"The specified resource does not exist.\\nRequestId:3aeacbfe-601e-0115-1956-4bae7f000000\\nTime:2024-01-20T04:10:25.4948804Z\"\n\n  with module.default.azurerm_storage_container.test_container,\n  on .terraform/modules/default/main.tf line 18, in resource \"azurerm_storage_container\" \"test_container\":\n  18: resource \"azurerm_storage_container\" \"test_container\" {\n\n"
    cli.go:418: [rad]         }
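
For reference, the failing address in the error is module.default.azurerm_storage_container.test_container. A minimal sketch of the module shape that implies is below, assuming the recipe module creates a storage account and a container inside it; every name and argument other than that resource address is hypothetical:

```hcl
# Hypothetical reconstruction for illustration only; not the actual recipe module.
resource "azurerm_storage_account" "test_account" {
  name                     = "radiustestaccount" # assumed name
  resource_group_name      = var.resource_group_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

resource "azurerm_storage_container" "test_container" {
  name                  = "testcontainer" # assumed name
  storage_account_name  = azurerm_storage_account.test_account.name
  container_access_type = "private"
}
```

Because the container references the storage account by name, Terraform already has an implicit dependency and orders the two creates within a single apply. So a 404 ResourceNotFound on container creation would point at the parent account being missing or not yet visible, e.g. a name collision with a concurrent run, rather than at ordering inside the apply itself.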

@ytimocin
Contributor Author

> @ytimocin are we seeing this happen consistently in long running tests, or is it passing sometimes?
>
> Do we re-use the same resource group for concurrent executions of functional test runs in the long running cluster?

I don't think we are. Resource groups are generated per run and they are unique. Not exactly sure why this happens. It probably needs a deeper investigation because I have seen this happen a few times in the last few days.

@shalabhms shalabhms added the triaged This issue has been reviewed and triaged label Jan 22, 2024
@radius-triage-bot

👍 We've reviewed this issue and have agreed to add it to our backlog. Please subscribe to this issue for notifications; we'll provide updates when we pick it up.

We also welcome community contributions! If you would like to pick this item up sooner and submit a pull request, please visit our contribution guidelines and assign this to yourself by commenting "/assign" on this issue.

For more information on our triage process please visit our triage overview

@shalabhms shalabhms added flaky-test Flaky functional/unit tests. maintenance Issue is a non-user-facing task like updating tests, improving automation, etc. labels Jan 22, 2024
@radius-triage-bot

👋 @ytimocin Thanks for filing this issue.

A project maintainer will review this issue and get back to you soon.

We also welcome community contributions! If you would like to pick this item up sooner and submit a pull request, please visit our contribution guidelines and assign this to yourself by commenting "/assign" on this issue.

For more information on our triage process please visit our triage overview

@vinayada1
Contributor

Another instance of failure:
logs_104396.zip

@kachawla
Contributor

kachawla commented Feb 5, 2024

Looks like this was automatically closed due to the reference in my PR #7108. Re-opening until we can confirm that the resource name uniqueness change actually fixed the issue. I'll keep an eye on this.

@kachawla kachawla reopened this Feb 5, 2024
@kachawla kachawla reopened this Feb 8, 2024
@ytimocin
Contributor Author

ytimocin commented Feb 8, 2024

I think I haven't seen this error since @kachawla's changes went into the main branch. We can keep an eye on it for a few more days and then close it.

@kachawla kachawla self-assigned this Feb 8, 2024
@kachawla
Contributor

kachawla commented Feb 8, 2024

> I think I haven't seen this error since @kachawla's changes went into the main branch. We can keep an eye on it for a few more days and then close it.

Yeah, I have been tracking this. I've assigned it to myself and will close it early next week if we don't see it again.

willdavsmith pushed a commit to willdavsmith/radius that referenced this issue Mar 4, 2024
# Description

Trying a fix for radius-project#7060.
Resource name uniqueness is currently tied to the resource group, potentially causing concurrency issues.
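
A hedged sketch of one way such a fix can look in Terraform, generating a per-run suffix so names no longer depend on the resource group for uniqueness; the random_string approach and all names here are assumptions, not the actual contents of this PR:

```hcl
# Sketch only: decouple resource-name uniqueness from the resource group so
# that concurrent functional-test runs cannot collide on globally unique names.
resource "random_string" "suffix" {
  length  = 8
  upper   = false
  special = false
}

resource "azurerm_storage_account" "test_account" {
  # Storage account names are globally unique; the random suffix keeps
  # concurrent runs from racing on the same name.
  name                     = "radtest${random_string.suffix.result}"
  resource_group_name      = var.resource_group_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}
```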

## Type of change

- This pull request fixes a bug in Radius and has an approved issue (issue link required).
- This pull request is a minor refactor, code cleanup, test improvement, or other maintenance task and doesn't change the functionality of Radius (issue link optional).

Fixes: radius-project#7060

Signed-off-by: karishma-chawla <74574173+karishma-chawla@users.noreply.github.com>
Co-authored-by: karishma-chawla <74574173+karishma-chawla@users.noreply.github.com>