Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] resource can not be passes as parameter #703

Closed
dmonakhov opened this issue Jan 26, 2018 · 17 comments · Fixed by #1687
Closed

[BUG] resource can not be passes as parameter #703

dmonakhov opened this issue Jan 26, 2018 · 17 comments · Fixed by #1687
Assignees
Labels

Comments

@dmonakhov
Copy link
Contributor

dmonakhov commented Jan 26, 2018

I try to pass container resources (cpu in my case) as workflow parameter( see workflow below)
But argo cli validation failed like follows:

$ argo  submit resource-param.yaml 
2018/01/26 19:00:54 resource-param.yaml failed to parse: error unmarshaling JSON: quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'

AFAIU this is happens because expression was not actually substituted.

WORKFLOW

# Try to pass cpu resource as an argument                                                                                                   
#                                                                                                                                           
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: resource-param-
spec:
  arguments:
    parameters:
    - name: nrcpu
      value: 4
  entrypoint: stress-ng
  templates:
  - name: stress-ng
    inputs:
      parameters:
      - name: nrcpu
    container:
      image: lorel/docker-stress-ng
      args: [ "--cpu", "{{workflow.parameters.nrcpu}}", "--timeout", "15s", "--metrics-brief" ]
      resources:
        requests:
          memory: 1Gi
          #cpu: 4                                                                                                                           
          cpu: "{{workflow.parameters.nrcpu}}"

@jessesuen jessesuen added the bug label Jan 26, 2018
@jessesuen
Copy link
Member

jessesuen commented Jan 26, 2018

K8s has custom unmarshal/marshal functions for the container.resources field. Need to find a way to workaround this.

@jessesuen jessesuen added this to the v2.0.0-beta2 milestone Jan 27, 2018
@jessesuen
Copy link
Member

jessesuen commented Jan 27, 2018

This is going to be a tricky one to fix. Since we reuse the Container data structure from the k8s API types (among others), we inherit all of logic that comes with marshaling and unmarshaling of the various types. This includes the custom unmarshalling functions deeply nested any of their sub fields (in this case resource/quantity.go Quantity). We definitely do not want to move away from reusing Container data type since we need to keep in-line with k8s. Our only option it appears is to implement custom unmarshalling at the template level. I'm also very weary of doing that since that is always tricky to get right.

@jessesuen jessesuen removed this from the v2.0.0-beta2 milestone Jan 27, 2018
@dmonakhov
Copy link
Contributor Author

dmonakhov commented Jan 28, 2018

@jessesuen Thank you for explanation.
May be there is workaround for this I'm not an exert in k8s API.
My scenario:
a CI job which want to compile. So I want complete it ASAP, for that reason I use
'make -j $NRCPU', where NRCPU is $(numcpu-1) on kube-nodes. It is obvious that hardcoding NRCPU is bad for portability because it NRCPU varies from cluster to cluster

Is it possible to define resource via generic kubernetes API? Via configMap or other config?

@dmonakhov
Copy link
Contributor Author

dmonakhov commented Jan 28, 2018

Yeah. It seems that 'kind: LimitRange' resource does exactly what I need. https://kubernetes.io/docs/tasks/administer-cluster/cpu-constraint-namespace .
Since workaround exists this issue can be moved to enhancement section.

@jessesuen
Copy link
Member

jessesuen commented Jan 28, 2018

Thanks for the legwork on the workaround. Yes, container.resources obtained from parameters is a use case we should support, so eventually this is something I'd like fixed. The right way to go about doing it, escapes me at the moment.

@discordianfish
Copy link
Contributor

discordianfish commented Oct 25, 2018

Is there a workaround for this? This is almost a show-stopper for the usecase here. We have several tasks that have vastely different resource requirements, which forces us to create many templates.

@epa095
Copy link
Contributor

epa095 commented May 20, 2019

A bit of brainstorming on this.
Would it maybe make sense to be able to define a "patch" to a workflow step, which could override parts of the yaml definition of the step. I am thinking something similar to overlays in kustomize.
This could take in exactly the same parameters as the step itself, and be applied to the workflow step right before execution. This would allow you to still use the k8s API types for the workflow step, but people would be "on their own" in regard to what they write in the patch/overlay step.

@igor47
Copy link

igor47 commented May 20, 2019

also got bitten by this issue. as @discordianfish alludes to, the only known workaround so far is to have multiple copies of the same template but with different resource requests. for example:

spec:
  entrypoint: test
  templates:
    - name: test
      dag:
        tasks:
          - name: task1
             template: pod-lowmem
          - name: task2
             template: pod-highmem
      - name: pod-lowmem
         container:
           image: myimage:latest
           resources:
             requests:
               memory: 500Mi
       - name: pod-highmem
          container:
            image: myimage:latest
            resources:
               requests:
                 memory: 2Gi

some way to DRY this manifest would be greatly appreciated. i guess it's possible to use YAML anchors but the syntax there is quite confusing

@elsonrodriguez
Copy link

elsonrodriguez commented Aug 14, 2019

As a workaround for this I ended up using a resource to create a Job. Messy but workable.

Food for thought: A killer use case for this would be in combination with GKE's node auto-provisioner, especially for compute-heavy jobs. This way memory/cpu/gpu could be defined as a parameter, and eventually it will just run without needing to define/provision nodes.

@Snapple49
Copy link

Snapple49 commented Sep 4, 2019

@elsonrodriguez Hi, would you mind explaining your workaround please? Not sure if I follow, but I'm interested in solutions to this problem as well

@elsonrodriguez
Copy link

elsonrodriguez commented Sep 5, 2019

@Snapple49 Basically I just use this pattern:

https://github.com/argoproj/argo/blob/baf37052976458401a6c0e44d06f30dc8d819680/examples/k8s-jobs.yaml

The manifest portion is fully template compatible, so you can swap out anything for a variable, including resource allocation.

For example:

                resources:
                  limits:
                    memory: "{{inputs.parameters.memory}}"
                    cpu: "{{inputs.parameters.cpu}}"

The downside is that logs won't show up directly associated with the workflow, it's less readable, and more complex.

@Snapple49
Copy link

Snapple49 commented Sep 6, 2019

I see, thanks a lot! That's neat, I'll have a play with that, messy or not it's a workaround 😁

@Oskoss
Copy link

Oskoss commented Oct 2, 2019

+1 Just running into this issue for us as well.

@jessesuen
Copy link
Member

jessesuen commented Oct 4, 2019

Would it maybe make sense to be able to define a "patch" to a workflow step, which could override parts of the yaml definition of the step. I am thinking something similar to overlays in kustomize. This could take in exactly the same parameters as the step itself, and be applied to the workflow step right before execution. This would allow you to still use the k8s API types for the workflow step, but people would be "on their own" in regard to what they write in the patch/overlay step.

So far, this is the the most promising proposal I've seen to address this problem. I think a first class patch spec, which supports parameter substitution, and then applied on top of the container before submission, might be able to handle this in a generic way (not just for resources, but for other non-string fields like bools and ints).

@jgbaum
Copy link

jgbaum commented Oct 25, 2019

Hi @sarabala1979.

Is this in release 2.4.2? If so, can you please provide an example on how to pass the cpu or memory resource requirements?

Thanks!

-J

@jgbaum
Copy link

jgbaum commented Oct 26, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants