
job submission: allow custom resource requests #32


@lukasheinrich

#31 would introduce default resources, but of course custom ones could be submitted at job submission time. I'm thinking we could just reuse the k8s syntax straight away:

https://kubernetes.io/docs/user-guide/compute-resources/
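
As a rough sketch, reusing that syntax directly inside a step could look like the following (the placement of the resources block here is purely illustrative and not an existing REANA field):

workflow:
  type: serial
  specification:
    steps:
      - environment: 'reanahub/reana-env-root6'
        # hypothetical: Kubernetes-style requests/limits reused verbatim
        resources:
          requests:
            memory: "2Gi"
            cpu: "1"
          limits:
            memory: "4Gi"
            cpu: "2"
        commands:
          - echo hello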

@alintulu commented Sep 1, 2020

Continuing the discussion on resource requests: the reana.yaml already has a resources clause, which is used when the workflow needs to access CVMFS.

workflow:
  resources:
    cvmfs:
      - fcc.cern.ch

This clause could maybe be used for requesting CPU/memory or setting an accounting group for HTCondor jobs. One suggestion:

workflow:
  resources:
    htcondor:
      accounting_group: 'physics_group'
      cpu: 4

However, a problem arises when it comes to HTCondor submission parameters such as max runtime. Since the resources clause is workflow-wide, there is only one value for the whole workflow, whereas the max runtime of HTCondor jobs would need to be set for every step independently. The value could even vary from step to step, depending on the sample.

@tiborsimko any thoughts on how to best implement this? Maybe max_runtime could follow the same structure as kerberos: true

workflow:
      type: serial
      specification:
        steps:
          - environment: 'reanahub/reana-env-root6'
            kerberos: true
            max_runtime: 3600
            commands:
              - echo hello

@tiborsimko

Yes, I fully agree we should specify them alongside each workflow step and not globally as for the CVMFS volumes. (That one is for volume mounting, so it could be defined globally more easily.)

(1) For a Yadage example, see kubernetes_uid in the resources clause:

eventselection:
  process:
    process_type: interpolated-script-cmd
    interpreter: bash
    script: |
      source /home/atlas/release_setup.sh
      source /analysis/build/x86*/setup.sh

      cat << 'EOF' > recast_xsecs.txt
      id/I:name/C:xsec/F:kfac/F:eff/F:relunc/F
      {did} {name} {xsec_in_pb} 1.0 1.0 1.0
      EOF

      echo {dxaod_file} > recast_inputs.txt
      myEventSelection {submitDir} recast_inputs.txt recast_xsecs.txt {lumi_in_ifb}
  publisher:
    publisher_type: interpolated-pub
    publish:
      histfile: '{submitDir}/hist-sample.root'
  environment:
    environment_type: 'docker-encapsulated'
    image: reanahub/reana-demo-atlas-recast-eventselection
    imagetag: '1.0'
    resources:
      - kubernetes_uid: 500

We could have there:

resources:
  - kubernetes_uid: 500
  - htcondor_accounting_group: 'physics_group'
  - htcondor_cpu: 4

Perhaps we can use a simple flat structure (i.e. htcondor_foo) since we already have kubernetes_uid there.

Something to cross-check with Yadage's native use of resources? CC @lukasheinrich @danikam: are you also interested in Kubernetes or HTCondor memory/processor settings?

(2) For CWL it is similar; we are using "hints" there:

steps:
  first:
    hints:
      reana:
        compute_backend: htcondorcern
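
A sketch of how resource requests might ride along in the same reana hints block (the fields other than compute_backend are hypothetical additions, for illustration only):

steps:
  first:
    hints:
      reana:
        compute_backend: htcondorcern
        # hypothetical resource fields, not currently supported:
        htcondor_accounting_group: 'physics_group'
        cpu: 4
        memory: '2Gi'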

(3) For Serial, we can do as you suggest and let them live alongside the kerberos: true clause. However, it might also be nice to have a special "resources" sub-clause there, like Yadage's "resources" or CWL's "hints", for consistency. That would mean altering the current behaviour of kerberos, though, which is perhaps not a good moment to address right now... Something to think about for later? (A sketch of what it could look like follows below.)
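
For illustration, such a Serial "resources" sub-clause could look roughly like this (a hypothetical structure; today kerberos: true lives directly on the step):

workflow:
  type: serial
  specification:
    steps:
      - environment: 'reanahub/reana-env-root6'
        resources:
          kerberos: true
          htcondor_accounting_group: 'physics_group'
          max_runtime: 3600
        commands:
          - echo hello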

@alintulu commented Sep 2, 2020

@clelange

@alintulu commented Sep 2, 2020

Concerning the max_runtime parameter, it could be implemented more generically (not as htcondor_maxruntime) and mapped to the HTCondor parameter +MaxRuntime when HTCondor is the chosen backend, and to the pods when the backend is Kubernetes.
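
For illustration, the step specification could stay backend-agnostic while the job controller translates the value per backend (the translation shown in the comments is an assumption about how it could be implemented):

workflow:
  type: serial
  specification:
    steps:
      - environment: 'reanahub/reana-env-root6'
        # generic value; per backend this could become:
        #   HTCondor:   +MaxRuntime = 3600
        #   Kubernetes: activeDeadlineSeconds: 3600 on the job/pod spec
        max_runtime: 3600
        commands:
          - echo hello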

@clelange commented Sep 2, 2020

I guess at some point things could get a bit complicated if, depending on the compute backend, one always needs to prepend the backend name to the resource. That would make it somewhat annoying to switch from one backend to the other, which is something I would usually want to do: I expect Kubernetes jobs to start much faster, so I would use Kubernetes for validation and HTCondor for the full processing.

There are of course certain resources that only make sense for a given platform, e.g. the accounting group on HTCondor, so those should probably keep the prefix. On the other hand, +MaxRuntime on HTCondor is largely equivalent to activeDeadlineSeconds of a Kubernetes Job spec, see https://kubernetes.io/docs/concepts/workloads/controllers/job/#job-termination-and-cleanup. This does not translate 1-to-1 to a Pod, but is probably pretty close in the HEP case.
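
Put together, the resources entries on a step could then mix backend-agnostic and backend-prefixed items, e.g. (field names here are only a sketch):

resources:
  - max_runtime: 3600                            # generic: maps to +MaxRuntime or activeDeadlineSeconds
  - htcondor_accounting_group: 'physics_group'   # HTCondor-specific, hence the prefix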

For CPU and memory requests, I would think that plain cpu and memory fields can be used, as for Kubernetes Pod resources, where there is a distinction between requests and limits:

resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"

Not sure if REANA should reflect the defaults on HTCondor, i.e. request cpu: 1000m and memory: 2Gi, since that would make scheduling more difficult due to the limited number of resources, but at the same time it might make jobs more stable.
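
If REANA did mirror those HTCondor defaults for Kubernetes jobs, the equivalent per-step request would be roughly (a sketch using the values quoted above):

resources:
  requests:
    cpu: "1000m"
    memory: "2Gi"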
