aws/resource_aws_emr_cluster.go: Filter completed EMR steps from ListSteps API #20871

Merged

Conversation


@dsc133 dsc133 commented Sep 10, 2021

Community Note

  • Please vote on this pull request by adding a 👍 reaction to the original pull request comment to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for pull request followers and do not help prioritize the request

On long-running EMR clusters, terraform plan can fail with a grpc: Resource Exhaustion error once the number of steps, or the size of each step, grows large enough. The cause is gRPC's default 4 MB message size limit.

This PR filters out completed EMR steps returned by the AWS ListSteps API, while still returning every step that has not completed successfully. On long-running EMR clusters, this should keep completed steps from being sent over gRPC and using up the available message space.
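
As a rough illustration of the filtering described above (not the actual diff in this PR), the idea can be sketched in self-contained Go. The `step` type and the `dropCompleted` function are hypothetical names made up for this example; the real provider works with the AWS SDK's EMR types instead.

```go
package main

import "fmt"

// step is a hypothetical, trimmed-down stand-in for an EMR step summary.
// It is not the AWS SDK type.
type step struct {
	Name  string
	State string // e.g. "PENDING", "RUNNING", "COMPLETED", "FAILED"
}

// dropCompleted returns the steps whose state is not "COMPLETED",
// preserving their original order.
func dropCompleted(steps []step) []step {
	kept := make([]step, 0, len(steps))
	for _, s := range steps {
		if s.State != "COMPLETED" {
			kept = append(kept, s)
		}
	}
	return kept
}

func main() {
	steps := []step{
		{Name: "ingest", State: "COMPLETED"},
		{Name: "transform", State: "RUNNING"},
		{Name: "export", State: "FAILED"},
	}
	// Only the RUNNING and FAILED steps remain after filtering.
	for _, s := range dropCompleted(steps) {
		fmt.Println(s.Name, s.State)
	}
}
```

On a cluster with thousands of historical steps, dropping the completed ones this way is what would shrink the payload that ends up in state.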

Closes #9888
Closes #14976
Closes #17015

Output from acceptance testing:
Unable to run acceptance testing due to resource constraints.

$ make testacc TESTARGS='-run=TestAccXXX'

...

@github-actions github-actions bot added size/XS Managed by automation to categorize the size of a PR. needs-triage Waiting for first response or review from a maintainer. service/emr Issues and PRs that pertain to the emr service. labels Sep 10, 2021

@github-actions github-actions bot left a comment


Welcome @davidCarroll1421 👋

It looks like this is your first Pull Request submission to the Terraform AWS Provider! If you haven’t already done so please make sure you have checked out our CONTRIBUTING guide and FAQ to make sure your contribution is adhering to best practice and has all the necessary elements in place for a successful approval.

Also take a look at our FAQ which details how we prioritize Pull Requests for inclusion.

Thanks again, and welcome to the community! 😃

@justinretzolk justinretzolk added bug Addresses a defect in current functionality. and removed needs-triage Waiting for first response or review from a maintainer. labels Sep 13, 2021
@zhelding

Pull request #21306 has significantly refactored the AWS Provider codebase. As a result, most PRs opened prior to the refactor now have merge conflicts that must be resolved before proceeding.

Specifically, PR #21306 relocated the code for all AWS resources and data sources from a single aws directory to a large number of separate directories in internal/service, each corresponding to a particular AWS service. This separation has also allowed us to simplify the names of underlying functions while still avoiding namespace collisions.

We recognize that many pull requests have been open for some time without yet being addressed by our maintainers. Therefore, we want to make it clear that resolving these conflicts in no way affects the prioritization of a particular pull request. Once a pull request has been prioritized for review, the necessary changes will be made by a maintainer -- either directly or in collaboration with the pull request author.

For a more complete description of this refactor, including examples of how old filepaths and function names correspond to their new counterparts, please refer to issue #20000.

For a quick guide on how to amend your pull request to resolve the merge conflicts resulting from this refactor and bring it in line with our new code patterns, please refer to our Service Package Refactor Pull Request Guide.

@ewbankkit ewbankkit added the pre-service-packages Includes pre-Service Packages aspects. label Dec 16, 2021
@Thiago-Dantas

I wonder whether this will cause Terraform to re-submit steps that are described in the Terraform configuration?

@ewbankkit

The maximum gRPC buffer size between Terraform CLI and the AWS Provider was increased to 256M in v4.7.0 via

hashicorp/terraform-plugin-go#139 -> hashicorp/terraform-plugin-sdk#856 -> #23742

but there are still extreme scenarios where this new limit may be exceeded.

@ewbankkit

ewbankkit commented Apr 27, 2022

To maintain backwards compatibility here (and prevent the potential drift that @Thiago-Dantas mentions), I think I will add a new optional argument, list_steps_states (the name can be fine-tuned), which will be a list of step states (see https://docs.aws.amazon.com/emr/latest/APIReference/API_StepStatus.html) used to filter the steps that are returned.
If the list is empty, all steps are returned.

For example

list_steps_states = ["PENDING", "RUNNING"]
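
The intended semantics (an allow-list of states, where an empty list means "return everything") can be sketched in Go as follows. This is only an illustration under assumptions: `step` and `filterByStates` are hypothetical names, and the filtering is done client-side to keep the example self-contained. The EMR ListSteps API also accepts a StepStates filter, so the provider may well filter server-side instead.

```go
package main

import "fmt"

// step is a hypothetical, trimmed-down stand-in for an EMR step summary.
type step struct {
	Name  string
	State string
}

// filterByStates keeps only the steps whose state appears in states.
// An empty (or nil) states list means no filtering: all steps are returned.
func filterByStates(steps []step, states []string) []step {
	if len(states) == 0 {
		return steps
	}
	// Build a set of allowed states for constant-time lookups.
	allowed := make(map[string]struct{}, len(states))
	for _, st := range states {
		allowed[st] = struct{}{}
	}
	kept := make([]step, 0, len(steps))
	for _, s := range steps {
		if _, ok := allowed[s.State]; ok {
			kept = append(kept, s)
		}
	}
	return kept
}

func main() {
	steps := []step{
		{Name: "ingest", State: "COMPLETED"},
		{Name: "transform", State: "RUNNING"},
		{Name: "export", State: "PENDING"},
	}
	// Mirrors list_steps_states = ["PENDING", "RUNNING"].
	fmt.Println(len(filterByStates(steps, []string{"PENDING", "RUNNING"})))
	// An empty list keeps the existing behaviour of returning all steps.
	fmt.Println(len(filterByStates(steps, nil)))
}
```

Treating the empty list as "no filter" is what keeps existing configurations unchanged: users who never set the argument keep seeing every step.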

@github-actions github-actions bot added documentation Introduces or discusses updates to documentation. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure. size/XL Managed by automation to categorize the size of a PR. and removed documentation Introduces or discusses updates to documentation. size/XS Managed by automation to categorize the size of a PR. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure. labels Apr 28, 2022
@ewbankkit ewbankkit removed bug Addresses a defect in current functionality. size/XL Managed by automation to categorize the size of a PR. pre-service-packages Includes pre-Service Packages aspects. labels Apr 28, 2022
@github-actions github-actions bot removed the pre-service-packages Includes pre-Service Packages aspects. label Apr 28, 2022
@ewbankkit ewbankkit added enhancement Requests to existing resources that expand the functionality or scope. size/XS Managed by automation to categorize the size of a PR. documentation Introduces or discusses updates to documentation. size/XL Managed by automation to categorize the size of a PR. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure. and removed size/XS Managed by automation to categorize the size of a PR. labels Apr 28, 2022

@ewbankkit ewbankkit left a comment


LGTM 🚀.

% make testacc TESTS=TestAccEMRCluster_Step_ PKG=emr ACCTEST_PARALLELISM=3
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./internal/service/emr/... -v -count 1 -parallel 3 -run='TestAccEMRCluster_Step_'  -timeout 180m
=== RUN   TestAccEMRCluster_Step_basic
=== PAUSE TestAccEMRCluster_Step_basic
=== RUN   TestAccEMRCluster_Step_mode
=== PAUSE TestAccEMRCluster_Step_mode
=== RUN   TestAccEMRCluster_Step_multiple
=== PAUSE TestAccEMRCluster_Step_multiple
=== RUN   TestAccEMRCluster_Step_multiple_listStates
=== PAUSE TestAccEMRCluster_Step_multiple_listStates
=== CONT  TestAccEMRCluster_Step_basic
=== CONT  TestAccEMRCluster_Step_multiple
=== CONT  TestAccEMRCluster_Step_multiple_listStates
--- PASS: TestAccEMRCluster_Step_multiple_listStates (402.72s)
=== CONT  TestAccEMRCluster_Step_mode
--- PASS: TestAccEMRCluster_Step_multiple (396.35s)
--- PASS: TestAccEMRCluster_Step_basic (442.79s)
--- PASS: TestAccEMRCluster_Step_mode (777.90s)
PASS
ok  	github.com/hashicorp/terraform-provider-aws/internal/service/emr	784.537s

@ewbankkit

Ran full set of EMR Cluster tests:

% make testacc TESTS=TestAccEMRCluster_ PKG=emr ACCTEST_PARALLELISM=3
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./internal/service/emr/... -v -count 1 -parallel 3 -run='TestAccEMRCluster_'  -timeout 180m
=== RUN   TestAccEMRCluster_basic
=== PAUSE TestAccEMRCluster_basic
=== RUN   TestAccEMRCluster_autoTerminationPolicy
=== PAUSE TestAccEMRCluster_autoTerminationPolicy
=== RUN   TestAccEMRCluster_additionalInfo
=== PAUSE TestAccEMRCluster_additionalInfo
=== RUN   TestAccEMRCluster_disappears
=== PAUSE TestAccEMRCluster_disappears
=== RUN   TestAccEMRCluster_sJSON
=== PAUSE TestAccEMRCluster_sJSON
=== RUN   TestAccEMRCluster_CoreInstanceGroup_autoScalingPolicy
=== PAUSE TestAccEMRCluster_CoreInstanceGroup_autoScalingPolicy
=== RUN   TestAccEMRCluster_CoreInstanceGroup_bidPrice
=== PAUSE TestAccEMRCluster_CoreInstanceGroup_bidPrice
=== RUN   TestAccEMRCluster_CoreInstanceGroup_instanceCount
=== PAUSE TestAccEMRCluster_CoreInstanceGroup_instanceCount
=== RUN   TestAccEMRCluster_CoreInstanceGroup_instanceType
=== PAUSE TestAccEMRCluster_CoreInstanceGroup_instanceType
=== RUN   TestAccEMRCluster_CoreInstanceGroup_name
=== PAUSE TestAccEMRCluster_CoreInstanceGroup_name
=== RUN   TestAccEMRCluster_EC2Attributes_defaultManagedSecurityGroups
=== PAUSE TestAccEMRCluster_EC2Attributes_defaultManagedSecurityGroups
=== RUN   TestAccEMRCluster_Kerberos_clusterDedicatedKdc
=== PAUSE TestAccEMRCluster_Kerberos_clusterDedicatedKdc
=== RUN   TestAccEMRCluster_MasterInstanceGroup_bidPrice
=== PAUSE TestAccEMRCluster_MasterInstanceGroup_bidPrice
=== RUN   TestAccEMRCluster_MasterInstanceGroup_instanceCount
=== PAUSE TestAccEMRCluster_MasterInstanceGroup_instanceCount
=== RUN   TestAccEMRCluster_MasterInstanceGroup_instanceType
=== PAUSE TestAccEMRCluster_MasterInstanceGroup_instanceType
=== RUN   TestAccEMRCluster_MasterInstanceGroup_name
=== PAUSE TestAccEMRCluster_MasterInstanceGroup_name
=== RUN   TestAccEMRCluster_security
=== PAUSE TestAccEMRCluster_security
=== RUN   TestAccEMRCluster_Step_basic
=== PAUSE TestAccEMRCluster_Step_basic
=== RUN   TestAccEMRCluster_Step_mode
=== PAUSE TestAccEMRCluster_Step_mode
=== RUN   TestAccEMRCluster_Step_multiple
=== PAUSE TestAccEMRCluster_Step_multiple
=== RUN   TestAccEMRCluster_Step_multiple_listStates
=== PAUSE TestAccEMRCluster_Step_multiple_listStates
=== RUN   TestAccEMRCluster_Bootstrap_ordering
=== PAUSE TestAccEMRCluster_Bootstrap_ordering
=== RUN   TestAccEMRCluster_terminationProtected
=== PAUSE TestAccEMRCluster_terminationProtected
=== RUN   TestAccEMRCluster_keepJob
=== PAUSE TestAccEMRCluster_keepJob
=== RUN   TestAccEMRCluster_visibleToAllUsers
=== PAUSE TestAccEMRCluster_visibleToAllUsers
=== RUN   TestAccEMRCluster_s3Logging
=== PAUSE TestAccEMRCluster_s3Logging
=== RUN   TestAccEMRCluster_s3LogEncryption
=== PAUSE TestAccEMRCluster_s3LogEncryption
=== RUN   TestAccEMRCluster_tags
=== PAUSE TestAccEMRCluster_tags
=== RUN   TestAccEMRCluster_RootVolume_size
=== PAUSE TestAccEMRCluster_RootVolume_size
=== RUN   TestAccEMRCluster_StepConcurrency_level
=== PAUSE TestAccEMRCluster_StepConcurrency_level
=== RUN   TestAccEMRCluster_ebs
=== PAUSE TestAccEMRCluster_ebs
=== RUN   TestAccEMRCluster_CustomAMI_id
=== PAUSE TestAccEMRCluster_CustomAMI_id
=== RUN   TestAccEMRCluster_InstanceFleet_basic
=== PAUSE TestAccEMRCluster_InstanceFleet_basic
=== RUN   TestAccEMRCluster_InstanceFleetMaster_only
=== PAUSE TestAccEMRCluster_InstanceFleetMaster_only
=== CONT  TestAccEMRCluster_basic
=== CONT  TestAccEMRCluster_Step_basic
=== CONT  TestAccEMRCluster_s3LogEncryption
--- PASS: TestAccEMRCluster_basic (382.40s)
=== CONT  TestAccEMRCluster_InstanceFleetMaster_only
--- PASS: TestAccEMRCluster_Step_basic (391.79s)
=== CONT  TestAccEMRCluster_InstanceFleet_basic
--- PASS: TestAccEMRCluster_s3LogEncryption (724.10s)
=== CONT  TestAccEMRCluster_CustomAMI_id
=== CONT  TestAccEMRCluster_InstanceFleetMaster_only
    cluster_test.go:1557: Step 1/2 error: Error running apply: exit status 1
        
        Error: failed creating IAM Role (tf-acc-test-8568204340894705206_default_role): EntityAlreadyExists: Role with name tf-acc-test-8568204340894705206_default_role already exists.
        	status code: 409, request id: c4cdf828-62f1-4aa4-a91e-f2aeafed99cf
        
          with aws_iam_role.emr_service,
          on terraform_plugin_test.tf line 81, in resource "aws_iam_role" "emr_service":
          81: resource "aws_iam_role" "emr_service" {
        
        
        Error: failed creating IAM Role (tf-acc-test-8568204340894705206_profile_role): EntityAlreadyExists: Role with name tf-acc-test-8568204340894705206_profile_role already exists.
        	status code: 409, request id: b3c0e28b-b99e-46e6-8534-891d2d940d32
        
          with aws_iam_role.emr_instance_profile,
          on terraform_plugin_test.tf line 112, in resource "aws_iam_role" "emr_instance_profile":
         112: resource "aws_iam_role" "emr_instance_profile" {
        
--- FAIL: TestAccEMRCluster_InstanceFleetMaster_only (960.79s)
=== CONT  TestAccEMRCluster_ebs
--- PASS: TestAccEMRCluster_CustomAMI_id (656.74s)
=== CONT  TestAccEMRCluster_tags
--- PASS: TestAccEMRCluster_ebs (463.41s)
=== CONT  TestAccEMRCluster_StepConcurrency_level
--- PASS: TestAccEMRCluster_InstanceFleet_basic (1612.75s)
=== CONT  TestAccEMRCluster_CoreInstanceGroup_name
--- PASS: TestAccEMRCluster_tags (769.90s)
=== CONT  TestAccEMRCluster_security
--- PASS: TestAccEMRCluster_StepConcurrency_level (445.57s)
=== CONT  TestAccEMRCluster_MasterInstanceGroup_name
--- PASS: TestAccEMRCluster_security (410.71s)
=== CONT  TestAccEMRCluster_MasterInstanceGroup_instanceType
--- PASS: TestAccEMRCluster_CoreInstanceGroup_name (865.65s)
=== CONT  TestAccEMRCluster_RootVolume_size
--- PASS: TestAccEMRCluster_MasterInstanceGroup_name (807.40s)
=== CONT  TestAccEMRCluster_terminationProtected
--- PASS: TestAccEMRCluster_MasterInstanceGroup_instanceType (748.02s)
=== CONT  TestAccEMRCluster_MasterInstanceGroup_instanceCount
--- PASS: TestAccEMRCluster_terminationProtected (440.12s)
=== CONT  TestAccEMRCluster_s3Logging
--- PASS: TestAccEMRCluster_RootVolume_size (778.08s)
=== CONT  TestAccEMRCluster_MasterInstanceGroup_bidPrice
--- PASS: TestAccEMRCluster_s3Logging (532.84s)
=== CONT  TestAccEMRCluster_Kerberos_clusterDedicatedKdc
--- PASS: TestAccEMRCluster_MasterInstanceGroup_instanceCount (1044.20s)
=== CONT  TestAccEMRCluster_EC2Attributes_defaultManagedSecurityGroups
--- PASS: TestAccEMRCluster_Kerberos_clusterDedicatedKdc (397.74s)
=== CONT  TestAccEMRCluster_visibleToAllUsers
--- PASS: TestAccEMRCluster_MasterInstanceGroup_bidPrice (874.28s)
=== CONT  TestAccEMRCluster_Step_multiple_listStates
--- PASS: TestAccEMRCluster_Step_multiple_listStates (393.26s)
=== CONT  TestAccEMRCluster_Bootstrap_ordering
=== CONT  TestAccEMRCluster_visibleToAllUsers
    cluster_test.go:1188: Step 3/4 error: Check failed: Check 2/2 error: aws_emr_cluster.test: Attribute 'visible_to_all_users' expected "false", got "true"
--- FAIL: TestAccEMRCluster_visibleToAllUsers (738.03s)
=== CONT  TestAccEMRCluster_keepJob
--- PASS: TestAccEMRCluster_EC2Attributes_defaultManagedSecurityGroups (879.15s)
=== CONT  TestAccEMRCluster_CoreInstanceGroup_instanceCount
--- PASS: TestAccEMRCluster_keepJob (391.82s)
=== CONT  TestAccEMRCluster_CoreInstanceGroup_bidPrice
--- PASS: TestAccEMRCluster_Bootstrap_ordering (1086.60s)
=== CONT  TestAccEMRCluster_CoreInstanceGroup_instanceType
--- PASS: TestAccEMRCluster_CoreInstanceGroup_instanceCount (1268.79s)
=== CONT  TestAccEMRCluster_disappears
--- PASS: TestAccEMRCluster_CoreInstanceGroup_bidPrice (982.09s)
=== CONT  TestAccEMRCluster_sJSON
--- PASS: TestAccEMRCluster_CoreInstanceGroup_instanceType (788.28s)
=== CONT  TestAccEMRCluster_Step_multiple
--- PASS: TestAccEMRCluster_disappears (421.32s)
=== CONT  TestAccEMRCluster_additionalInfo
--- PASS: TestAccEMRCluster_sJSON (390.62s)
=== CONT  TestAccEMRCluster_CoreInstanceGroup_autoScalingPolicy
--- PASS: TestAccEMRCluster_Step_multiple (405.16s)
=== CONT  TestAccEMRCluster_autoTerminationPolicy
--- PASS: TestAccEMRCluster_additionalInfo (396.80s)
=== CONT  TestAccEMRCluster_Step_mode
--- PASS: TestAccEMRCluster_CoreInstanceGroup_autoScalingPolicy (507.06s)
--- PASS: TestAccEMRCluster_autoTerminationPolicy (657.61s)
--- PASS: TestAccEMRCluster_Step_mode (821.54s)
FAIL
FAIL	github.com/hashicorp/terraform-provider-aws/internal/service/emr	8146.278s
FAIL
make: *** [testacc] Error 1

Failures are unrelated to this change and occur in nightly CI.

@ewbankkit ewbankkit merged commit 297832d into hashicorp:main Apr 28, 2022
@ewbankkit

@dsc133 Thanks for the contribution 🎉 👏.

@github-actions github-actions bot added this to the v4.12.0 milestone Apr 28, 2022
@github-actions

This functionality has been released in v4.12.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 29, 2022
Labels
documentation Introduces or discusses updates to documentation. enhancement Requests to existing resources that expand the functionality or scope. service/emr Issues and PRs that pertain to the emr service. size/XL Managed by automation to categorize the size of a PR. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure.
Projects
None yet
6 participants