
Memory continuously increasing after 6 workspaces #180

Open · balu-ce opened this issue Jul 17, 2023 · 7 comments
Labels: bug, community, needs:triage

Comments

balu-ce commented Jul 17, 2023

What happened?

We have noticed a constant increase in the memory of the terraform provider pod.
The pod's memory has reached nearly 1.38 GiB.

How can we reproduce it?

We are running the terraform provider to create EKS clusters across multiple accounts, so we currently have 12 workspaces.
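
For context, a minimal sketch of one such Workspace (the name, module source, and ProviderConfig below are illustrative rather than our real values, and the apiVersion may differ between provider versions):

kubectl apply -f - <<'EOF'
apiVersion: tf.upbound.io/v1beta1
kind: Workspace
metadata:
  name: eks-account-a                # illustrative name
spec:
  providerConfigRef:
    name: account-a                  # illustrative per-account credentials
  forProvider:
    source: Remote
    module: git::https://example.com/org/eks-module.git   # illustrative module
EOF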

[Screenshot, Jul 17 2023: graph of provider pod memory usage climbing steadily]

What environment did it happen in?

  • Crossplane Version: 10.2
  • Provider Version: v0.7.0
  • Kubernetes Version: 1.25
  • Kubernetes Distribution: EKS
ytsarev (Member) commented Jul 17, 2023

Thanks for your report! What terraform providers are involved in the workspace operations? Is it terraform-provider-aws only? Could you please share the version?

ytsarev (Member) commented Jul 17, 2023

At a high level, without actually reproducing it, hashicorp/terraform-provider-aws#31722 looks related.

bobh66 (Collaborator) commented Jul 18, 2023

We have seen this problem with terraform-provider-aws >= 4.67.0 - we had to pin the provider to < 4.67.0 in order to prevent the memory issues.
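
A sketch of what that pin can look like, assuming the Terraform module run by each Workspace declares its own provider requirements via a standard required_providers block (the file name is illustrative):

# In the module that the Workspace runs, e.g. versions.tf:
cat > versions.tf <<'EOF'
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "< 4.67.0"  # stay below the releases with the memory regression
    }
  }
}
EOF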

balu-ce (Author) commented Aug 4, 2023

@bobh66 @ytsarev Since the terraform provider is a kube operator, only one pod runs the workspaces through leader election. Can we run the workspaces horizontally across pods by any chance?

bobh66 (Collaborator) commented Aug 4, 2023

Unfortunately no - Kubernetes controllers can only work as single processes; there is no way to share a single Kubernetes resource type across multiple controller instances. Both instances would receive and process events, and there would be no way to ensure that a single resource is handled by only a single controller.
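
For reference, assuming the provider uses the usual controller-runtime leader election over a Lease object (the namespace below is an assumption), you can confirm that only one replica is active:

kubectl get lease -n crossplane-system
# The HOLDER column names the single active replica; any extra replicas sit idle.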

Have you identified what is using the memory? The provider cache will usually solve most memory usage issues, except for the known problem with the latest AWS provider using excessive amounts of memory.
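
For reference, the cache in question is the standard Terraform plugin cache controlled by the TF_PLUGIN_CACHE_DIR environment variable. A sketch of setting it explicitly through a ControllerConfig, in case your install does not already do so (the resource name is hypothetical):

kubectl apply -f - <<'EOF'
apiVersion: pkg.crossplane.io/v1alpha1
kind: ControllerConfig
metadata:
  name: provider-terraform-config    # hypothetical name
spec:
  env:
    - name: TF_PLUGIN_CACHE_DIR      # standard Terraform env var
      value: /tf/plugin-cache
EOF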

balu-ce (Author) commented Aug 14, 2023

@bobh66 @ytsarev I can see that the Go program which runs the terraform commands is itself taking 1.29 GiB of memory, even though I set the workspaces' reconciliation pause annotation to true.
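
For reference, the annotation I set is the standard crossplane-runtime pause annotation, roughly like this (the workspace name below is illustrative):

kubectl annotate workspace eks-account-a crossplane.io/paused=true --overwrite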

bobh66 (Collaborator) commented Aug 16, 2023

I would exec into the pod and run "du -s /tf/*" to see what is using all of the memory (which, for this provider, equates to disk usage under /tf).

For example:

$ kubectl exec -it -n crossplane-system provider-terraform-official-5f29c294f0da-66c7476ddb-jrn79 -- bash
bash-5.1$ du -s /tf/*
40	/tf/a639f793-e2ca-41cc-993b-628ef529429e
24	/tf/bd05c9b5-0655-4374-8d8b-f987abf15296
24	/tf/c88627d4-48f0-4043-88c0-6ff4774f0f2b
24	/tf/d0428460-eafd-4833-bd08-32a29bc37ab5
24	/tf/d2642baf-c8a7-47d4-8642-a19168aa08ec
24	/tf/e163d21d-3d6f-43ec-8f06-4168b27fcfc0
28	/tf/e4ed18af-729b-488f-b026-b9d22f9e10b6
24	/tf/edfb5678-adb2-40cb-be9e-df724c7a107f
24	/tf/f4c5a091-0729-4586-970f-b3a10e415a4e
24	/tf/ff282ba0-2a33-49a5-826c-1574f5cdc378
2137260	/tf/plugin-cache

shows the bulk of the usage is in the provider cache, which I would expect, and:

bash-5.1$ du -s /tf/plugin-cache/registry.terraform.io/hashicorp/aws/*
326044	/tf/plugin-cache/registry.terraform.io/hashicorp/aws/5.11.0
359176	/tf/plugin-cache/registry.terraform.io/hashicorp/aws/5.3.0
360212	/tf/plugin-cache/registry.terraform.io/hashicorp/aws/5.4.0
354640	/tf/plugin-cache/registry.terraform.io/hashicorp/aws/5.5.0
355756	/tf/plugin-cache/registry.terraform.io/hashicorp/aws/5.6.1
355936	/tf/plugin-cache/registry.terraform.io/hashicorp/aws/5.6.2

shows that there are 6 versions of the AWS provider cached, which is what is using all of the space. If we pinned the AWS provider to a specific version it would use a lot less space.
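
Once the module is pinned, reclaiming the space could look roughly like this (using the pod name from the session above; remove only versions that are no longer referenced by any Workspace):

kubectl exec -n crossplane-system provider-terraform-official-5f29c294f0da-66c7476ddb-jrn79 -- \
  rm -rf /tf/plugin-cache/registry.terraform.io/hashicorp/aws/5.3.0
# repeat for each cached version that is no longer in use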
