Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The argo agent will request two plugin interfaces at the same time for the same task #10304

Open
GhangZh opened this issue Jan 3, 2023 · 17 comments · May be fixed by #12705
Open

The argo agent will request two plugin interfaces at the same time for the same task #10304

GhangZh opened this issue Jan 3, 2023 · 17 comments · May be fixed by #12705
Labels
area/plugins type/support User support issue - likely not a bug

Comments

@GhangZh
Copy link
Contributor

GhangZh commented Jan 3, 2023

If I have two plugins set up at the same time, on ports 4355 and 5678, then the argo agent pod will request the interface of each of these two containers for each task. Is there any way to circumvent this?

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: test-wf-argo-plugin
  namespace: argo
spec:
  arguments: {}
  entrypoint: hello
  podGC:
    strategy: OnPodCompletion
  serviceAccountName: argo
  templates:
  - dag:
      tasks:
      - arguments:
          parameters:
          - name: taskName
            value: "aaa"
        name: task-packing
        template: test-plugin-one
      - arguments:
          parameters:
          - name: args
            value: "aaa"
        name: async-job-a
        template: test-plugin-two
    inputs: {}
    metadata: {}
    name: hello
    outputs: {}
  - inputs:
      parameters:
      - name: taskName
    metadata: {}
    name: test-plugin-one
    outputs: {}
    plugin:
      test-plugin-one:
        taskName: '{{inputs.parameters.taskName}}'
  - inputs:
      parameters:
      - name: args
    metadata: {}
    name: test-plugin-two
    outputs: {}
    plugin:
      test-plugin-two:
        args: '{{inputs.parameters.args}}'

test-plugin-two was listening on port 6789, but instead he requested 4355
image

@sarabala1979
Copy link
Member

@GhangZh Did you configure your plugin port on ExecutorPlugin 6789?
https://github.com/argoproj/argo-workflows/blob/master/docs/executor_plugins.md

@GhangZh
Copy link
Contributor Author

GhangZh commented Jan 4, 2023

@GhangZh Did you configure your plugin port on ExecutorPlugin 6789? https://github.com/argoproj/argo-workflows/blob/master/docs/executor_plugins.md
@sarabala1979
Yes, I'm sure, and the plugin works.

apiVersion: v1
data:
  sidecar.automountServiceAccountToken: "false"
  sidecar.container: |
    args:
    - test_plugin_two.py
    image: docker.xxx.com/argo-plugins:2023-01-03-16-18
    name: test-plugin-two-plugin
    ports:
    - containerPort: 6789
    resources:
      limits:
        cpu: 500m
        memory: 128Mi
      requests:
        cpu: 250m
        memory: 64Mi
    securityContext:
      runAsNonRoot: true
      runAsUser: 65534
kind: ConfigMap
metadata:
  labels:
    workflows.argoproj.io/configmap-type: ExecutorPlugin
  name: test-plugin-two-plugin
  namespace: argo

argo-agent pod
image

@GhangZh
Copy link
Contributor Author

GhangZh commented Jan 4, 2023

Look at the code here it will be executed for each plugin
image

@alexec
Copy link
Contributor

alexec commented Jan 5, 2023

If a plugin cannot execute a task, it should return nil node.

@GhangZh
Copy link
Contributor Author

GhangZh commented Jan 5, 2023

If a plugin cannot execute a task, it should return nil node.

I don't think this is very friendly either, it should be executed according to the matching plugin

@stale

This comment was marked as resolved.

@stale stale bot added the problem/stale This has not had a response in some time label Jan 21, 2023
@terrytangyuan terrytangyuan removed the problem/stale This has not had a response in some time label Sep 20, 2023
@agilgur5 agilgur5 added the type/support User support issue - likely not a bug label Oct 17, 2023
@whybeyoung
Copy link
Contributor

If a plugin cannot execute a task, it should return nil node.
agree with you .. why we conisder the chain case , it's the pulugin of argo.. we should matching the plugin according to the peoples' defination...

If a plugin cannot execute a task, it should return nil node.

and this is very not friendly too. plugin is developed by different team, i dont' think they will consider that whether they should deal with different case ...

@luyang93
Copy link
Contributor

Is it possible that only execute the plugin whose name matches the plugin name in the template?

@luyang93
Copy link
Contributor

how about not creating unnecessary sidecar containers for unused plugins?

for _, plug := range woc.controller.executorPlugins[namespace] {
s := plug.Spec.Sidecar
c := s.Container.DeepCopy()
c.VolumeMounts = append(c.VolumeMounts, apiv1.VolumeMount{
Name: volumeMountVarArgo.Name,
MountPath: volumeMountVarArgo.MountPath,
ReadOnly: true,
// only mount the token for this plugin, not others
SubPath: c.Name,
})
if s.AutomountServiceAccountToken {
volume, volumeMount, err := woc.getServiceAccountTokenVolume(ctx, plug.Name+"-executor-plugin")
if err != nil {
return nil, nil, err
}
volumes = append(volumes, *volume)
c.VolumeMounts = append(c.VolumeMounts, *volumeMount)
}
sidecars = append(sidecars, *c)
}

@luyang93
Copy link
Contributor

luyang93 commented Dec 5, 2023

@jswxstw
Copy link
Member

jswxstw commented Jan 24, 2024

how about not creating unnecessary sidecar containers for unused plugins?

for _, plug := range woc.controller.executorPlugins[namespace] {
s := plug.Spec.Sidecar
c := s.Container.DeepCopy()
c.VolumeMounts = append(c.VolumeMounts, apiv1.VolumeMount{
Name: volumeMountVarArgo.Name,
MountPath: volumeMountVarArgo.MountPath,
ReadOnly: true,
// only mount the token for this plugin, not others
SubPath: c.Name,
})
if s.AutomountServiceAccountToken {
volume, volumeMount, err := woc.getServiceAccountTokenVolume(ctx, plug.Name+"-executor-plugin")
if err != nil {
return nil, nil, err
}
volumes = append(volumes, *volume)
c.VolumeMounts = append(c.VolumeMounts, *volumeMount)
}
sidecars = append(sidecars, *c)
}

Yes, this will be really helpful.
All executor plugins in both workflow and controller namespace will be loaded into the agent pod as sidecars now which contains many unused plugins.

@jswxstw
Copy link
Member

jswxstw commented Feb 2, 2024

https://github.com/luyang93/argo-workflows/blob/51154bfed2110525f4046c2bcfc938d8ccc051f4/workflow/controller/agent.go#L148-L153

Some executor plugins may be lost, if the workflow spec uses templateRef to import plugin template.

@luyang93
Copy link
Contributor

luyang93 commented Feb 2, 2024

https://github.com/luyang93/argo-workflows/blob/51154bfed2110525f4046c2bcfc938d8ccc051f4/workflow/controller/agent.go#L148-L153

Some executor plugins may be lost, if the workflow spec uses templateRef to import plugin template.

Yep, I'm stuck on how to recursively list plugins that use ref in the workflow.

@alexec
Copy link
Contributor

alexec commented Feb 5, 2024

This seems reasonable. It does not make sense for two plugins to execute the same task.

@jswxstw
Copy link
Member

jswxstw commented Feb 6, 2024

https://github.com/luyang93/argo-workflows/blob/51154bfed2110525f4046c2bcfc938d8ccc051f4/workflow/controller/agent.go#L148-L153

Some executor plugins may be lost, if the workflow spec uses templateRef to import plugin template.

Yep, I'm stuck on how to recursively list plugins that use ref in the workflow.

Although the agent pod loads many unused plugins, during execution, it is easy to select the corresponding plugin based on the plugin name specified in the template, instead of iterating through all of them. as a workaround?
But it would be best to avoid loading unnecessary plugins to save on scheduling and resource overhead.

jswxstw pushed a commit to jswxstw/argo-workflows that referenced this issue Feb 28, 2024
Signed-off-by: oninowang <oninowang@tencent.com>
jswxstw added a commit to jswxstw/argo-workflows that referenced this issue Feb 28, 2024
Signed-off-by: oninowang <oninowang@tencent.com>
@jswxstw jswxstw linked a pull request Feb 28, 2024 that will close this issue
jswxstw added a commit to jswxstw/argo-workflows that referenced this issue Feb 28, 2024
Signed-off-by: oninowang <oninowang@tencent.com>
jswxstw added a commit to jswxstw/argo-workflows that referenced this issue Feb 28, 2024
Signed-off-by: oninowang <oninowang@tencent.com>
@dgolja
Copy link

dgolja commented May 27, 2024

I came across the same issue while working on a solution for #13026 which makes this problem even more relevant, because you may end up with a lot of unnecessary volumes.

My worry is that some users may already rely on this undocumented behaviour. For example with the current implementation one custom plugin can serve multiple plugin RPC calls. If we start filtering which plugin will be loaded in the agent pod based on the name alone some workflows may break.

@jswxstw
Copy link
Member

jswxstw commented May 27, 2024

My worry is that some users may already rely on this undocumented behaviour. For example with the current implementation one custom plugin can serve multiple plugin RPC calls. If we start filtering which plugin will be loaded in the agent pod based on the name alone some workflows may break.

What's worse is that the plugin actually has two names: one is the name of the executor plugin configmap (without the '-executor-plugin' suffix), and the other is the name of the sidecar container.
Moreover, they are both in use: the former is used to retrieve the service account, while the latter is written into agent pod env EnvVarPluginNames, which is extremely confusing.

jswxstw added a commit to jswxstw/argo-workflows that referenced this issue Jun 28, 2024
Signed-off-by: oninowang <oninowang@tencent.com>
jswxstw added a commit to jswxstw/argo-workflows that referenced this issue Sep 9, 2024
Signed-off-by: jswxstw <jswxstw@gmail.com>
jswxstw added a commit to jswxstw/argo-workflows that referenced this issue Sep 9, 2024
Signed-off-by: oninowang <oninowang@tencent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/plugins type/support User support issue - likely not a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants