
Conversation

Contributor @sean-smith commented Jul 20, 2018:

New script, scaling.py, scales up based on the number of nodes requested in the scheduler's queue. This gets rid of slow scale-up times.

I merged in the boto3 branch; the only files that need review are scaling/*.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@@ -0,0 +1,5 @@
[scaling]
Contributor:

Is it a sample file? Could you add a comment at the top?

Contributor Author:

comment added. This file is used for local testing.

cfn_region=us-east-1
cfn_sqs_queue=https://sqs.us-east-1.amazonaws.com/822857487308/AS-SQS
cfn_instance_slots=3
instance_id=i-04fcafee570cd7073
Contributor:

What is this instance_id? Is this file a sample?

Contributor Author:

Removed this file. It was just used for testing.

args = shlex.split(command)
pending = 0
try:
_output = subprocess.check_output(args)
Contributor @mauri-melato, Jul 23, 2018:

Alas, check_output was introduced in Python 2.7, so this is not going to work on CentOS 6 with Python 2.6.6.
We could make a helper module to cope with this. See also PR #32.
(The same comment applies to all the subprocess.check_output calls below.)

Contributor Author:

ah yes, I can change this function to use subprocess.Popen
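
For reference, a minimal sketch of a Popen-based replacement that keeps check_output-like behaviour on Python 2.6 (the helper name and error handling are illustrative, not the final implementation):

import shlex
import subprocess

def run_command(command):
    # Python 2.6-compatible substitute for subprocess.check_output
    args = shlex.split(command)
    process = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = process.communicate()[0]
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, args)
    return output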

slots = run_command(command)
return nodes(slots, instance_type)

# get nodes requested from pending jobs
Contributor:

The comment needs to be changed (the same for the running() functions of the other modules).

Contributor Author:

changed

Contributor @mauri-melato left a comment:

As a general comment on the current implementation, I'd suggest better decoupling the scheduler plugin modules from other implementation details and choices.
In particular, I wouldn't pass the instance_type down to the sge/slurm/torque modules, but rather a dict of properties (today just a 'cpus' or 'processors' property).
The scaling module should then handle the details of retrieving the AWS instance details (today, the JSON file put somewhere it knows, etc.) and pass the properties downstream.
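
A rough sketch of the plugin side of that decoupling, with sge/slurm/torque receiving only a generic properties dict (the function names and the 'slots' key are illustrative assumptions, not the final API):

def pending(instance_properties, slots_requested):
    # scheduler plugin: sees only generic properties, no AWS instance type
    slots_per_node = instance_properties["slots"]
    return -(-slots_requested // slots_per_node)  # ceiling division

# hypothetical usage: the scaling module builds the dict and passes it down
print(pending({"slots": 4}, 10))  # -> 3 nodes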

line_arr = line.split()
print(line_arr)
if len(line_arr) >= 10 and line_arr[9] in status:
pending += int(line_arr[5])
Contributor:

the variable should be renamed to something like 'njobs' since they are not necessarily pending jobs

Contributor Author:

changed.

for line in output:
line_arr = line.split()
if len(line_arr) >= 8:
pending += int(line_arr[7])
Contributor:

Rename to 'njobs' or similar; see the similar comment below.

Contributor Author:

changed.

for line in output:
line_arr = line.split()
if len(line_arr) == 2 and line_arr[0] == status:
pending += int(line_arr[1])
Contributor:

Please rename it, see comment for 'torque'.

Contributor Author:

changed.

Contributor @fnubalaj left a comment:

Are we no longer supporting openlava.py? I don't see the changes to that file.

On reviewing the cfncluster.cfn.json, I think maybe we should delete openlava.py from the plugins in a separate task?

@sean-smith sean-smith changed the base branch from develop to scale July 25, 2018 17:38
try:
vcpus = int(instances[instance_type]["vcpus"])
log.info("Instance %s has %s slots." % (instance_type, vcpus))
return {'compute_instance': vcpus}
Contributor:

This should be the name of the property of the instance. Tomorrow we may need to pass memory or gpus, etc. Instead of calling it vcpu, we could call it processors, slots...?

Contributor Author:

changed to slots.
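
For illustration, the scaling-module side of that lookup could then look roughly like this (the instances.json path is a hypothetical example; the PR reads the instance details from wherever the node stores them):

import json
import logging

log = logging.getLogger(__name__)

def get_instance_properties(instance_type, instances_json="/opt/cfncluster/instances.json"):
    # instances_json is a hypothetical path to the instance-details file
    with open(instances_json) as f:
        instances = json.load(f)
    vcpus = int(instances[instance_type]["vcpus"])
    log.info("Instance %s has %s slots." % (instance_type, vcpus))
    return {"slots": vcpus}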

instance_properties = get_instance_properties(instance_type)

# Get number of nodes requested
pending = s.pending(instance_properties)
Contributor:

I'd name the methods for what they actually return, get_required_nodes() and get_busy_nodes() or something similar.

Contributor Author:

done

# check to make sure it's in limits
desired = running + pending
if desired > max:
log.info("%d requested nodes is greater than max %d. Requesting max %d." % (desired, max, max))
Contributor:

This is worth being a warning-level message, since we are hitting the max limit and some of the pending jobs are not going to be served in the next scale-up round.

Contributor Author:

This is normal behaviour. I wouldn't expect it to be a warning.
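
For context, a sketch of the clamping step under discussion (the update_asg helper, parameter names, and the boto3 wiring are illustrative; the actual code keeps the log at info level, as the author prefers):

import logging

import boto3

log = logging.getLogger(__name__)

def update_asg(asg_name, region, running, pending, max_size):
    # hypothetical helper: clamp the new desired capacity to the ASG max before setting it
    desired = running + pending
    if desired > max_size:
        log.info("%d requested nodes is greater than max %d. Requesting max %d."
                 % (desired, max_size, max_size))
        desired = max_size
    client = boto3.client("autoscaling", region_name=region)
    client.set_desired_capacity(AutoScalingGroupName=asg_name, DesiredCapacity=desired)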

log = logging.getLogger(__name__)

# get nodes requested from pending jobs
def pending(instance_properties):
Contributor @mauri-melato, Aug 13, 2018:

Should we consider "raising an exception" (logging an error, having some sort of user notification...) when a pending job requires a single node with more processors than the instance type provides?
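
For illustration, such a guard could look roughly like this in the plugin's node-counting function (a sketch only; whether to raise, log an error, or notify the user is exactly the open question, and the function and parameter names below are assumptions):

import logging

log = logging.getLogger(__name__)

def get_required_nodes(instance_properties, slots_per_job):
    # slots_per_job: hypothetical list of per-job slot requests parsed from the queue
    slots_per_node = instance_properties["slots"]
    total = 0
    for slots in slots_per_job:
        if slots > slots_per_node:
            # a single-node job that can never fit on this instance type
            log.error("Job requests %d slots but the instance type only provides %d."
                      % (slots, slots_per_node))
            continue
        total += slots
    return -(-total // slots_per_node)  # ceiling division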

return -(-slots // vcpus)

# get nodes reserved by running jobs
def running(instance_properties):
Contributor:

Here we're returning the theoretical number of busy hosts computed from the busy slots.
Consider the case where we have 2 instances of 2 slots each and a 1-slot job running on each of them. Current ASG desired = 2.
Then assume a 2-slot job is submitted:
#busy_hosts = 2/2 = 1
#requested_hosts = 2/2 = 1
==> #new_desired = 1 + 1 = 2 === ASG_current_desired
==> no scaling

We should discuss the possible scenarios and the best-effort approach we want to follow.

Contributor Author:

Changed. New behaviour counts a whole node as busy if any of the slots are used/reserved.

For example, in the following output there are 8 slots filled, which previously would report 2 busy nodes. With the change it'll report 3 nodes as busy.

$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@ip-172-31-16-37.ec2.inte BIP   0/2/4          0.00     lx-amd64
---------------------------------------------------------------------------------
all.q@ip-172-31-19-96.ec2.inte BIP   0/4/4          0.00     lx-amd64
---------------------------------------------------------------------------------
all.q@ip-172-31-20-40.ec2.inte BIP   0/0/4          0.27     lx-amd64
---------------------------------------------------------------------------------
all.q@ip-172-31-92-215.ec2.int BIP   0/2/4          0.00     lx-amd64
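
A sketch of how the new busy-node counting could parse the resv/used/tot. column of qstat -f (the column layout is inferred from the output above; this is illustrative, not the exact PR code):

def count_busy_nodes(qstat_f_output):
    busy = 0
    for line in qstat_f_output.splitlines():
        fields = line.split()
        # queue lines look like: all.q@ip-... BIP 0/2/4 0.00 lx-amd64
        if len(fields) >= 3 and fields[2].count("/") == 2:
            try:
                reserved, used, _total = [int(x) for x in fields[2].split("/")]
            except ValueError:
                continue  # skip the header line ("resv/used/tot.")
            if reserved > 0 or used > 0:
                busy += 1  # any used or reserved slot marks the whole node as busy
    return busy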

_command = shlex.split(command)
try:
DEV_NULL = open(os.devnull, "rb")
process = sub.Popen(_command, env=dict(os.environ), stdout=sub.PIPE, stderr=sub.STDOUT, stdin=DEV_NULL)
Contributor:

Should you add /usr/bin and the like to this environment? I see that qstat is not being used with the full path; how will qstat be resolved?

Contributor Author:

fixed
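
For illustration, one way to make the resolution explicit is to extend PATH in the environment handed to Popen (the directories below are assumptions; the actual fix may instead use full paths to the scheduler binaries):

import os
import shlex
import subprocess as sub

def run_command(command):
    env = dict(os.environ)
    # ensure scheduler binaries such as qstat resolve even from a minimal cron environment
    env["PATH"] = "/bin:/usr/bin:/usr/local/bin:" + env.get("PATH", "")  # example locations only
    devnull = open(os.devnull, "rb")
    try:
        process = sub.Popen(shlex.split(command), env=env,
                            stdout=sub.PIPE, stderr=sub.STDOUT, stdin=devnull)
        return process.communicate()[0]
    finally:
        devnull.close()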

process = sub.Popen(_command, env=dict(os.environ), stdout=sub.PIPE, stderr=sub.STDOUT, stdin=DEV_NULL)
_output = process.communicate()[0]
exitcode = process.poll()
if exitcode != 0:
Contributor:

This could be an issue if there is nothing written to stdout. I think in such cases exitcode is almost always 1. Could you please verify that?

Contributor Author:

that's just for grep. For qstat, it returns a 0 exit code even with no output.

fnubalaj previously approved these changes Aug 15, 2018
Contributor @fnubalaj left a comment:

Please take a look at the comments.

Sean Smith and others added 2 commits August 15, 2018 17:41
Changes the scaling logic from a cloudwatch metric produced
every two minutes to a cronjob on the node that directly sets the
asg values.

Signed-off-by: Sean Smith <seaam@amazon.com>