parallelism flag on specific resources #24433

Closed
gtmtech opened this issue Mar 23, 2020 · 8 comments

Comments

@gtmtech

gtmtech commented Mar 23, 2020

Current Terraform Version

Terraform 0.11 (but probably Terraform 0.12 as well)

Use-cases

Some actions in the AWS CLI and other utilities are not thread-safe, so setting parallelism=1 is necessary in those cases.

Unfortunately, this then applies to the whole Terraform run, making it extremely slow. It would be much better to be able to set parallelism on a specific resource and keep the rest quick, with that resource blocking all other resources until it completes.

Is there any way to do that? This would allow defining different parallelism levels for different resources and would be a major speed improvement in such cases.

Attempted Solutions

None, as none are supported. The only solution found is to set parallelism=1 and suffer the huge slowdown in terraforming.

@jbardin
Member

jbardin commented Mar 24, 2020

Hi @gtmtech

Unfortunately the mechanism which traverses the graph and limits concurrency is quite separate from the internals of the resources themselves, and it would be best to not couple those concepts for some resources that are behaving incorrectly. Resources that can't be accessed concurrently should be serialized by the provider itself, or documented as requiring a dependency between the resources. Failure to follow these constraints should be filed as a bug in the provider itself.

Do you have any specific examples of problematic resources that we can use to verify their behavior?

@jbardin jbardin added the waiting-response An issue/pull request is waiting for a response from the community label Mar 24, 2020
@gtmtech
Author

gtmtech commented Apr 26, 2020

@jbardin an example is the aws organizations list-accounts operation, which you can only run 4 of at any one time; otherwise you get errors. So anything that does this, whether a resource or a null resource, needs to have serialisation set to <5, but it's annoying that this then slows down the terraforming of the rest of the project.

@ghost ghost removed waiting-response An issue/pull request is waiting for a response from the community labels Apr 26, 2020
@jbardin
Member

jbardin commented Apr 28, 2020

@gtmtech, in that case, the provider itself should use a semaphore to limit concurrent requests for the same resource if it's required. In general, providers are responsible for handling any API limitations imposed by the services with which they interact.
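
As a rough illustration of the semaphore idea (not the AWS provider's actual code; providers are written in Go, and Python is used here only for brevity), a minimal sketch might look like the following. The limit of 4 is the figure quoted earlier in this thread, and `limited` and `MAX_CONCURRENT` are hypothetical names.

```python
import threading

# Assumption: at most 4 concurrent calls, the figure quoted in this thread.
MAX_CONCURRENT = 4

# Each caller takes one slot before calling the API and returns it afterwards.
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)


def limited(call, *args, **kwargs):
    """Run call(*args, **kwargs) while holding one of the MAX_CONCURRENT slots."""
    with _slots:
        return call(*args, **kwargs)
```

Any number of threads can call `limited(...)`, but only `MAX_CONCURRENT` of them are inside the wrapped call at once; the rest wait for a slot instead of failing.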

@gtmtech
Author

gtmtech commented Apr 29, 2020

@jbardin The aws provider does; in this case it's a null resource, so the null provider. Should I raise a bug in the null resource provider instead?

@jbardin
Member

jbardin commented Apr 29, 2020

Sorry, I'm not sure I follow here. The null resource doesn't do anything at all, so there's no way for concurrent creation of instances to conflict. Can you give an example where null resource is failing?

@gtmtech
Author

gtmtech commented May 13, 2020

Sorry, it wasn't the null_resource that was at fault (but it could have been).

So suppose I have a bunch of different external data sources, each of which runs a script that jsonifies some information from aws organizations list-accounts in different ways.

At the moment these all get run concurrently, but that breaches my limit of 4 on this API call.

I could change parallelism to 4 or to 1, but that then affects all resources.

Are you saying the solution here is to put some kind of semaphore in the script which does the aws organizations list-accounts?

@jbardin
Member

jbardin commented May 13, 2020

Yes, if you have external data sources which require some sort of coordination, that coordination needs to be handled in those data sources. Control of concurrency is only one of many things that could be required when running external code, and Terraform cannot be tasked with handling all possibilities. If this is a common enough case with the external provider (since its task is running arbitrary code), perhaps an enhancement proposal could be made on that provider.
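
One way such coordination could live inside the script itself is a small pool of lock files, so that at most four copies of the script call the API at once even though Terraform starts them as separate processes. The sketch below is only an example under stated assumptions: a POSIX system (it relies on `fcntl.flock`), the limit of 4 mentioned in this thread, and a hypothetical lock directory. `acquire_slot`, `list_accounts`, `LOCK_DIR`, and `MAX_SLOTS` are illustrative names, not part of Terraform or the external provider.

```python
import contextlib
import fcntl
import json
import os
import subprocess
import sys
import tempfile
import time

MAX_SLOTS = 4  # assumption: the limit of 4 concurrent calls quoted in this thread
# Hypothetical lock directory shared by every copy of the script.
LOCK_DIR = os.path.join(tempfile.gettempdir(), "list-accounts-slots")


@contextlib.contextmanager
def acquire_slot(lock_dir=LOCK_DIR, slots=MAX_SLOTS, poll_seconds=0.1):
    """Wait until one of `slots` lock files is free, then hold it for the block."""
    os.makedirs(lock_dir, exist_ok=True)
    while True:
        for index in range(slots):
            handle = open(os.path.join(lock_dir, f"slot-{index}.lock"), "w")
            try:
                # Non-blocking exclusive lock; raises if this slot is already held.
                fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except BlockingIOError:
                handle.close()
                continue
            try:
                yield index
            finally:
                handle.close()  # closing the file releases the lock
            return
        time.sleep(poll_seconds)  # every slot is busy, so retry shortly


def list_accounts():
    """Call the AWS CLI while holding a slot; returns account ID mapped to name."""
    with acquire_slot():
        result = subprocess.run(
            ["aws", "organizations", "list-accounts", "--output", "json"],
            check=True, capture_output=True, text=True,
        )
    accounts = json.loads(result.stdout)["Accounts"]
    return {account["Id"]: account["Name"] for account in accounts}


def main():
    json.load(sys.stdin)  # the external data source sends its query as JSON on stdin
    json.dump(list_accounts(), sys.stdout)  # result: a flat JSON object of strings
```

Calling `main()` at the bottom of the file would turn this into a program usable from an external data source. Lock files are used rather than an in-process semaphore because each data source runs the script as its own process, so the slots have to be visible across processes.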

@gtmtech gtmtech closed this as completed Jun 3, 2020
@ghost

ghost commented Jul 4, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Jul 4, 2020