pman: long process

Rudolph Pienaar edited this page Mar 21, 2017 · 12 revisions

Abstract

This page describes how to use pman to start a process that runs for an extended period (here, a 30-second sleep), how to query the status of that process, and how pman reports failure conditions.

Preconditions

  • This page assumes that pman is listening on: 172.17.0.2:5010.
  • Make sure that pman has been started (see here for more info):
pman --rawmode 1 --http  --port 5010 --listeners 12
  • This page assumes that a run is managed with the following parameters:
{  "action": "run",
        "meta": {
                "cmd":      "sleep 30",
                "auid":     "rudolphpienaar",
                "jid":      "sleep-1234",
                "threaded": true
        }
}

This spawns a simple sleep process that does nothing for 30 seconds. This example explores what pman returns while the process is still running, what it returns once the process has ended, and what happens if the process dies unexpectedly.

start sleeping...

Start a new purl command with the following prefix (copy and paste it into a terminal):

purl --content-type application/vnd.collection+json --verb POST --raw --http 172.17.0.2:5010/api/v1/cmd --jsonwrapper 'payload' --msg \

and finish with the relevant msg payload:

'{  "action": "run",
        "meta": {
                "cmd":      "sleep 30",
                "auid":     "rudolphpienaar",
                "jid":      "sleep-1234",
                "threaded": true
        }
}' --quiet --jsonpprintindent 4 

to start the sleep process.
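For readers who want to see what goes over the wire, the following is a minimal Python sketch of the request that purl appears to send, reconstructed from the receivedByServer output further down this page. The helper name build_request is hypothetical, and the request is only built, not sent. Whether pman answers a non-purl client in the same way is an assumption that this page does not confirm.

```python
import json
import urllib.request

# Address taken from the Preconditions on this page.
PMAN_URL = "http://172.17.0.2:5010/api/v1/cmd"


def build_request(msg, wrapper="payload"):
    """Return an unsent POST request whose JSON body nests msg under `wrapper`.

    Hypothetical helper: it mirrors what --jsonwrapper 'payload' seems to do,
    judging from the body echoed back in receivedByServer.
    """
    body = json.dumps({wrapper: msg}).encode("utf-8")
    return urllib.request.Request(
        PMAN_URL,
        data=body,
        headers={"Content-type": "application/vnd.collection+json"},
        method="POST",
    )


run_msg = {
    "action": "run",
    "meta": {
        "cmd": "sleep 30",
        "auid": "rudolphpienaar",
        "jid": "sleep-1234",
        "threaded": True,
    },
}

request = build_request(run_msg)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Sending is deliberately left out; with a reachable pman it would be
# urllib.request.urlopen(request).
```

Built this way, the body for the status message used later on this page is 80 bytes long, which agrees with the Content-Length: 80 header echoed back in receivedByServer.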

check on process while it runs...

Now copy and paste the purl prefix from above and use the following as the msg payload:

start status

'{  "action": "status",
        "meta": {
                "key":          "jid",
                "value":        "sleep-1234"
        }
}' --quiet --jsonpprintindent 4  

which should return

{
    "payloadsize": 80,
    "action": "status",
    "meta": {
        "value": "sleep-1234",
        "key": "jid"
    },
    "d_ret": {
        "0.end": {
            "jobRoot": "20170308164211.942222_5f540393-9ed4-489e-8a65-1ae636d16424",
            "returncode": []
        },
        "l_status": [
            "started"
        ],
        "0.start": {
            "jobRoot": "20170308164211.942222_5f540393-9ed4-489e-8a65-1ae636d16424",
            "startTrigger": [
                true
            ]
        }
    },
    "status": true,
    "RESTverb": "POST",
    "RESTheader": "POST /api/v1/cmd HTTP/1.1\r",
    "receivedByServer": [
        "POST /api/v1/cmd HTTP/1.1\r",
        "Host: 172.17.0.2:5010\r",
        "User-Agent: PycURL/7.43.0 libcurl/7.47.0 GnuTLS/3.4.10 zlib/1.2.8 libidn/1.32 librtmp/2.3\r",
        "Accept: */*\r",
        "Content-type: application/vnd.collection+json\r",
        "Content-Length: 80\r",
        "\r",
        "{\"payload\": {\"action\": \"status\", \"meta\": {\"key\": \"jid\", \"value\": \"sleep-1234\"}}}"
    ],
    "path": "/api/v1/cmd"
}

Note that the l_status is returned as started. Note also that the 0.end->returncode list is empty, denoting that the process has not ended.
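The two observations above can be checked programmatically. The sketch below parses a trimmed copy of the response and reports whether the job looks like it is still running. The function name summarize and the "still running" rule (l_status is exactly started and the returncode list is empty) are an interpretation of this page, not a documented pman contract.

```python
import json

# Trimmed copy of the response shown above; only the fields used here are kept.
response_text = """
{
    "d_ret": {
        "0.end": {"returncode": []},
        "l_status": ["started"],
        "0.start": {"startTrigger": [true]}
    },
    "status": true
}
"""


def summarize(response):
    """Pull the status list and return codes out of a status response."""
    d_ret = response["d_ret"]
    statuses = d_ret["l_status"]
    returncodes = d_ret["0.end"]["returncode"]
    return {
        "statuses": statuses,
        "returncodes": returncodes,
        # Assumed rule: started with no return code yet means still running.
        "still_running": statuses == ["started"] and not returncodes,
    }


summary = summarize(json.loads(response_text))
print(summary)
```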

end status

Once the job has finished, we can ask for its status again. As before, copy and paste the purl prefix and use the same msg payload:

'{  "action": "status",
        "meta": {
                "key":          "jid",
                "value":        "sleep-1234"
        }
}' --quiet --jsonpprintindent 4  

This time the return JSON contains:

    "d_ret": {
      ...
      "l_status": [
            "finishedSuccessfully"
        ]
      ...
    }

Interrupted process

Let us now simulate an interrupted/failed process. As before we will start a sleep process, but before it terminates normally, we will explicitly kill it.

First, start a new sleep process with the following msg payload:

'{  "action": "run",
        "meta": {
                "cmd":      "sleep 300",
                "auid":     "rudolphpienaar",
                "jid":      "sleep-died",
                "threaded": true
        }
}' --quiet --jsonpprintindent 4 

Note that this payload uses a different jid and a sleep of 300 seconds (i.e. 5 minutes). Once it has been transmitted, check for the sleep process in the process table on the machine that is running pman:

ps -Af | grep sleep

which should return something like:

rudolph+  1202   826  0 10:11 ?        00:00:00 /bin/sh -c sleep 300
rudolph+  1203  1202  0 10:11 ?        00:00:00 sleep 300
rudolph+  1205   867  0 10:11 ?        00:00:00 grep sleep

Your PIDs will of course differ. Now kill the sleep process from the command line, using the correct PID for your system (in this example, PID 1203):

kill -9 1203

Now, let's query the status of this process from pman. Send the following msg payload:

'{  "action": "status",
        "meta": {
                "key":          "jid",
                "value":        "sleep-died"
        }
}' --quiet --jsonpprintindent 4  

which should return

{
    "RESTverb": "POST",
    "action": "status",
    "path": "/api/v1/cmd",
    "RESTheader": "POST /api/v1/cmd HTTP/1.1\r",
    "payloadsize": 80,
    "d_ret": {
        "l_status": [
            "finishedWithError"
        ],
        "0.start": {
            "startTrigger": [
                true
            ],
            "jobRoot": "20170309101124.645898_fc21bad8-6044-4a4f-9e60-3176912e141f"
        },
        "0.end": {
            "returncode": [
                -9
            ],
            "jobRoot": "20170309101124.645898_fc21bad8-6044-4a4f-9e60-3176912e141f"
        }
    },
    "receivedByServer": [
        "POST /api/v1/cmd HTTP/1.1\r",
        "Host: 172.17.0.2:5010\r",
        "User-Agent: PycURL/7.43.0 libcurl/7.47.0 GnuTLS/3.4.10 zlib/1.2.8 libidn/1.32 librtmp/2.3\r",
        "Accept: */*\r",
        "Content-type: application/vnd.collection+json\r",
        "Content-Length: 80\r",
        "\r",
        "{\"payload\": {\"action\": \"status\", \"meta\": {\"value\": \"sleep-died\", \"key\": \"jid\"}}}"
    ],
    "status": true,
    "meta": {
        "value": "sleep-died",
        "key": "jid"
    }
}

Note that the l_status list contains finishedWithError and the actual returncode is -9.
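The value -9 is consistent with the convention used by Python's subprocess module, where a negative return code -N means the child was terminated by signal N. Here that would be signal 9 (SIGKILL), the signal sent by kill -9. The sketch below decodes a return code under that assumption. This page does not confirm that pman follows this convention in every case, and describe_returncode is a hypothetical helper.

```python
import signal


def describe_returncode(returncode):
    """Describe a return code, assuming negative values mean 'killed by signal'."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        try:
            # Map the signal number to its symbolic name, e.g. 9 -> SIGKILL.
            name = signal.Signals(-returncode).name
        except ValueError:
            # Unknown signal number on this platform; fall back to the number.
            name = f"signal {-returncode}"
        return f"terminated by {name}"
    return f"exited with status {returncode}"


print(describe_returncode(-9))
```

On a POSIX system this reports that the process was terminated by SIGKILL, which matches the kill -9 issued earlier.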

--30--