Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

duplicative env var ARGO_TEMPLATE result in etcd size 1MB limit #8790

Closed
MagicFrogSJTU opened this issue May 18, 2022 · 16 comments · Fixed by #13742 or #13761
Closed

duplicative env var ARGO_TEMPLATE result in etcd size 1MB limit #8790

MagicFrogSJTU opened this issue May 18, 2022 · 16 comments · Fixed by #13742 or #13761
Labels
area/controller Controller issues, panics area/executor type/feature Feature request

Comments

@MagicFrogSJTU
Copy link

Summary

I am running a containerSet with lots of steps. And I have large parameter input for the step.
Because there is an env variable called ARGO_TEMPLATE each container, the overall pod description is too large to fit in etcd, exceeding the 1MB limit.
(10 + init + wait) = 12 containers
12 * 100k = 1200k > 1MB
What is the usage of ARGO_TEMPLATE? Can we delete it?

I am using the v3.1.10


Message from the maintainers:

Love this enhancement proposal? Give it a 👍. We prioritise the proposals with the most 👍.

@MagicFrogSJTU MagicFrogSJTU added the type/feature Feature request label May 18, 2022
@alexec alexec added the area/controller Controller issues, panics label May 18, 2022
@alexec
Copy link
Contributor

alexec commented May 18, 2022

What is the usage of ARGO_TEMPLATE? Can we delete it?

No.

However, we could replace it with just the information needed by the emissary. The emissary only needs to know which containers it must wait on. It does not need the whole template.

@MagicFrogSJTU
Copy link
Author

That would be nice. Look forward to the coming improve!

@terrytangyuan
Copy link
Member

The emissary only needs to know which containers it must wait on.

@alexec Looks like tests are breaking after I removed the parameters. #9698 Any hints on what I may have missed?

@juliev0
Copy link
Contributor

juliev0 commented Oct 2, 2023

At Intuit we are actually relying on the $ARGO_TEMPLATE environment variable inputs because we had an issue in which we were originally trying to do this in our Step:

image: "docker.intuit.com/dev/patterns/kubernetes/dev/kubectl-awscli:v1.22.15"
command:
  - sh
  - -c
args:
  - |
    echo "{{`{{inputs.parameters.manifest}}`}}" | ....

but we found that we exceeded the maximum argument size limit allowed by Linux. So, we changed the Step to use the ARGO_TEMPLATE environment variable:

image: "docker.intuit.com/dev/patterns/kubernetes/dev/kubectl-awscli:v1.22.15"
command:
  - sh
  - -c
args: ["
  getconf ARG_MAX;
  echo $ARGO_TEMPLATE | yq '.inputs.parameters[0].value' --unwrapScalar > /tmp/manifest-64;
  ....
"]

@juliev0
Copy link
Contributor

juliev0 commented Oct 2, 2023

(We had also tried an alternative solution of mounting the Input parameter as a RawArtifact but that faced an issue and didn't work either)

@juliev0
Copy link
Contributor

juliev0 commented Oct 2, 2023

now perhaps there could be an alternative solution in which the value of the environment variable is compressed or something...

@sebagarayco
Copy link

sebagarayco commented May 15, 2024

but we found that we exceeded the maximum argument size limit allowed by Linux. So, we changed the Step to use the ARGO_TEMPLATE environment variable:

Did this work for you? I'm having exact same issue.

@jswxstw
Copy link
Member

jswxstw commented Sep 30, 2024

* `/var/run/argo/template` A JSON encoding of the template.

I think ARGO_TEMPLATE is needed only for init container, since it writes a JSON encoding of the template to /var/run/argo/template.
wait and other main containers can obtain the template from /var/run/argo/template as below:

data, err := os.ReadFile(varRunArgo + "/template")

cc @tooptoop4

@tooptoop4
Copy link
Contributor

tooptoop4 commented Sep 30, 2024

what about

func substitutePodParams(pod *apiv1.Pod, globalParams common.Parameters, tmpl *wfv1.Template) (*apiv1.Pod, error) {
podParams := globalParams.DeepCopy()
for _, inParam := range tmpl.Inputs.Parameters {
using Inputs? @jswxstw

@jswxstw
Copy link
Member

jswxstw commented Oct 8, 2024

I don't understand, what do you mean? @tooptoop4

@tooptoop4
Copy link
Contributor

@jswxstw the code i linked seems to take inputs from the template, are u saying have another variable for that purpose not ARGO_TEMPLATE?

@jswxstw
Copy link
Member

jswxstw commented Oct 8, 2024

@tooptoop4 No, I don't quite understand your question and how it is related to this issue.
I actually meant that I think env variable ARGO_TEMPLATE should only be set in init container, and other containers should read it from the file /var/run/argo/template if needed.
Additionally, inputs.parameters in ARGO TEMPLATE seems useless, maybe we can just keep inputs.artifacts, but I'm not sure with that.

@agilgur5 agilgur5 changed the title duplicate env variable ARGO_TEMPLATE result in etcd size 1MB limit duplicative env var ARGO_TEMPLATE result in etcd size 1MB limit Oct 12, 2024
@jswxstw
Copy link
Member

jswxstw commented Oct 12, 2024

#12325 will offload the env ARGO_TEMPLATE to configmap when ARGO_TEMPLATE is larger than 128KB, which can solve this issue to some extent. @tooptoop4

But I think #12325 still has some areas for improvement:

  • Removing inputs.parameters from ARGO_TEMPLATE can significantly reduce its size.
  • ARGO_TEMPLATE in configmap will be mount to /argo/config if offloadEnvVarTemplate is enabled and init container will also write ARGO_TEMPLATE to /var/run/argo/template, which is duplicated.

I don't have any permissions to operate issues. Is there something wrong? @agilgur5

@agilgur5
Copy link
Contributor

agilgur5 commented Oct 14, 2024

I don't have any permissions to operate issues. Is there something wrong? @agilgur5

You pop up correctly as a "Member". Screenshot:
Screenshot 2024-10-14 at 2 54 34 PM

So you should have "triage" permissions.

But looking at the Argoproj teams, it looks like you're not part of the "members" team for some reason 👀 I'm guessing that's the team assigned to triage per repo or something.
@terrytangyuan could you add @jswxstw to the "members" team? (I don't have permissions to do so, org owners only). Might want to double-check other members too as there is a diff between that team and the "people" list (some of those are seemingly correct diffs, like bots, at least a few others seem to incorrectly have 0 teams)
EDIT: I also asked in the Approvers Slack channel (private link) if some org admins could take a look at this diff

@terrytangyuan
Copy link
Member

Added

@agilgur5
Copy link
Contributor

agilgur5 commented Oct 14, 2024

I actually meant that I think env variable ARGO_TEMPLATE should only be set in init container, and other containers should read it from the file /var/run/argo/template if needed.

This part wasn't solved by #13742, so it sounds like this issue should still be open

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment