<a href="https://colab.research.google.com/github/jchen6727/batchtk/blob/development/examples/colab_driveless/batchtk1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Tutorial 1**

**Note 0** This tutorial uses `batchtk` as in the previous tutorial (`batchtk0.ipynb`), now with its socket communication features.


In [None]:
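#jupyter 0
!pip install batchtk
import site, sys
# make the freshly installed package importable for the running Python version
site.addsitedir('/usr/local/lib/python{}.{}/dist-packages'.format(*sys.version_info[:2]))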

**Note 1** This time we'll focus on the **INETDispatcher**, which uses `INET` (TCP) sockets to communicate with the host. Let's import the `INETDispatcher` and an example `Submit`, `SHSubmitSOCK`, which handles sockets in the `SH` environment.

In [None]:
#jupyter 1
from batchtk.runtk import INETDispatcher, Submit, SHSubmitSOCK, Template
print(SHSubmitSOCK())


**Note 2** Notice the important line:
```
export SOCNAME="{sockname}"
```
This line sets the socket address through which the `Dispatcher` and `Runner` communicate.

Let's extend `Submit` with a new `class` that adds `socket` communication on `Google Colab`.

For reference, here is our old code:
```
class GCSubmit(Submit):
  def __init__(self):
    # creates a Submit with the templates we define
    super().__init__(
        submit_template = Template("sh {output_path}/{label}.sh"),
        script_template = Template("""\
#!/bin/bash
cd {project_path}
export FOO={foovalue}
export BAR={barvalue}
export BAZ={bazvalue}
{env}
nohup python /content/runner.py > {output_path}/{label}.run 2>&1 &
pid=$!
echo $pid >&1
"""
        )
    )
  def submit_job(self):
    # using this submit_job, we can add some handling of stdout, job failure (i.e. if stdout does not return an integer value as expected),
    # extending the functionality of Submit with this exception handling.
    proc = super().submit_job()
    try:
      self.job_id = int(proc.stdout)
    except Exception as e:
      raise(Exception("{}\nJob submission failed:\n{}\n{}\n{}\n{}".format(e, self.submit, self.script, proc.stdout, proc.stderr)))
    return self.job_id

gcs = GCSubmit()
print(gcs)
```
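The `submit_job` override above depends on the script echoing the PID of the background process. The following is a minimal sketch of that pattern using only the standard library, not `batchtk` itself; the `script` string and the `parse_job_id` helper are hypothetical names introduced for illustration, and it assumes an `sh` shell is available.

```python
import subprocess

# Hypothetical stand-in for the generated script: start a background process,
# then echo its PID. Redirecting the child's output keeps it from holding the
# captured pipes open after the shell exits.
script = "sleep 1 > /dev/null 2>&1 &\npid=$!\necho $pid\n"

def parse_job_id(stdout):
    """Return the PID printed by the script, or raise if stdout is not an integer."""
    try:
        return int(stdout)  # int() tolerates the trailing newline
    except ValueError as e:
        raise RuntimeError("Job submission failed, stdout was: {!r}".format(stdout)) from e

proc = subprocess.run(["sh", "-c", script], capture_output=True, text=True)
job_id = parse_job_id(proc.stdout)
print(job_id)
```

If the script prints anything other than an integer, `parse_job_id` raises, which is the same signal the `GCSubmit.submit_job` override uses to report a failed submission.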

In [None]:
#jupyter 2
class GCSubmitSOCK(Submit):
  def __init__(self):
    # creates a Submit with the templates we define
    super().__init__(
        submit_template = Template("sh {output_path}/{label}.sh"),
        script_template = Template("""\
#!/bin/bash
cd {project_path}
export SOCNAME="{sockname}"
{env}
nohup python /content/runner.py > {output_path}/{label}.run 2>&1 &
pid=$!
echo $pid >&1
"""
        )
    )
  def submit_job(self):
    # Extend Submit.submit_job with error handling: the script echoes the PID of the
    # background process, so a stdout that is not an integer means the submission failed.
    proc = super().submit_job()
    try:
      self.job_id = int(proc.stdout)
    except Exception as e:
      raise Exception("{}\nJob submission failed:\n{}\n{}\n{}\n{}".format(e, self.submit, self.script, proc.stdout, proc.stderr)) from e
    return self.job_id

gcs = GCSubmitSOCK()

**Note 3.0** N.B. the `export SOCNAME="{sockname}"` line provides a field that the **INETDispatcher** fills in when it requests a TCP port for communication. This happens at job creation.
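To make the idea concrete, here is a minimal standard-library sketch of requesting a free TCP port and writing it into a `{sockname}` field. It is not the `INETDispatcher` implementation: the `host:port` encoding of `sockname` and the one-line `script_template` are assumptions made for illustration.

```python
import socket

# Hypothetical one-line template with the same placeholder as the script above
script_template = 'export SOCNAME="{sockname}"\n'

# Binding to port 0 asks the operating system to pick any free TCP port
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
host, port = listener.getsockname()

# Assumed encoding of the address; batchtk may serialize it differently
sockname = "{}:{}".format(host, port)
script = script_template.format(sockname=sockname)
print(script)

listener.close()
```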

**Note 3.1** Now we can pass the custom submission to our **INETDispatcher** which extends the base **SHDispatcher** with support for communication.

In [None]:
#jupyter 3
dispatcher = INETDispatcher(project_path='/content', output_path='./batch', submit=gcs, gid='sock_example')
print(dispatcher.submit) # prints the dispatcher.submit

**Note 6** To pass arguments to the **Runner** script, we call `update_env` on the dispatcher. The argument is a dictionary of `key: value` pairs. The arbitrary `FOO`, `BAR` and `BAZ` fields from the old `GCSubmit` are not part of `GCSubmitSOCK`, so there is nothing to update for them here.

In [None]:
#jupyter 6
dispatcher.update_env({'strvalue': '1',
                       'intvalue': 2,
                       'fltvalue': 3.0})
print(dispatcher.submit)

**Note 7** Upon job creation, the `{env}`, `{project_path}`, `{output_path}`, `{label}` and `{sockname}` fields are filled.

The `{env}` field is replaced with a custom `serialization` (in this case, exported string values) that the **runner** can then deserialize in the `runner.py` script.
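As a rough illustration of this round trip, the sketch below serializes a dictionary into exported string values and recovers the typed values on the other side. The naming scheme (`STRDEMO0`, `INTDEMO1`, ...) and the helpers `serialize_env`, `render_exports` and `deserialize_env` are hypothetical; the actual format used by `batchtk` may differ.

```python
def serialize_env(env):
    """Map each entry to {VARNAME: "key=value"}, encoding the type in VARNAME."""
    variables = {}
    for i, (key, value) in enumerate(env.items()):
        tag = type(value).__name__.upper()  # STR, INT or FLOAT
        variables["{}DEMO{}".format(tag, i)] = "{}={}".format(key, value)
    return variables

def render_exports(variables):
    """Render the variables as shell export lines, as they would replace {env}."""
    return "\n".join('export {}="{}"'.format(n, p) for n, p in variables.items())

def deserialize_env(environ):
    """Recover the typed dictionary from an os.environ-like mapping."""
    casts = {"STR": str, "INT": int, "FLOAT": float}
    result = {}
    for name, payload in environ.items():
        for tag, cast in casts.items():
            if name.startswith(tag + "DEMO"):
                key, _, raw = payload.partition("=")
                result[key] = cast(raw)
    return result

env = {"strvalue": "1", "intvalue": 2, "fltvalue": 3.0}
variables = serialize_env(env)
print(render_exports(variables))
print(deserialize_env(variables))
```

Encoding the type in the variable name lets the receiving side cast each string back to `str`, `int` or `float`, since environment variables themselves only carry strings.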

In [None]:
#jupyter 7
dispatcher.create_job()
print(dispatcher.submit) # see the new submit

**Note 8** Let's download and check a more advanced `socket_runner.py` using the `Runner` class.

In [None]:
#jupyter 8
!curl https://raw.githubusercontent.com/jchen6727/batchtk/development/examples/colab/socket_runner.py > /content/runner.py
!cat /content/runner.py

**Note 9** This new runner is similar to the one in `batchtk0.ipynb`. Of note, it also connects to the `SOCNAME` TCP port to establish communication, and finally calls `runner.close()` to deallocate the opened socket.

On the other end, the `dispatcher` calls `dispatcher.clean` to deallocate its opened socket.

N.B. sockets are generally deallocated when the function holding them completes, which we can check with `lsof`. It is still good programming practice to explicitly `deallocate` and `free` resources once they are no longer in use.
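The exchange described here can be sketched with the standard `socket` module alone. This is not the `batchtk` code: the `runner` function, the message text and the use of a thread in place of a separate `runner.py` process are assumptions for illustration. Context managers close each socket explicitly instead of relying on garbage collection.

```python
import socket
import threading

def runner(address, message):
    # Runner side: connect to the dispatcher, send one message, then close
    with socket.create_connection(address) as sock:
        sock.sendall(message.encode())

# Dispatcher side: listen on an OS-assigned port
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    address = listener.getsockname()

    # A thread stands in for the separately launched runner process
    thread = threading.Thread(target=runner, args=(address, "hello from runner"))
    thread.start()

    connection, runner_address = listener.accept()
    with connection:
        chunks = []
        while True:
            data = connection.recv(1024)
            if not data:  # empty bytes: the runner closed its end
                break
            chunks.append(data)
    recv_message = b"".join(chunks).decode()
    thread.join()

print(recv_message)
```

Once the `with` blocks exit, both the listening socket and the accepted connection are closed, so they no longer appear as open descriptors for the process.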

In [None]:
#jupyter 9
dispatcher.submit_job()
connection, runner_address = dispatcher.accept()
recv_message = dispatcher.recv()
print(dispatcher.job_id) # prints the job_id, should match the printed pid from the runner.py script
print(connection)
print(recv_message)
dispatcher.clean()


In [None]:
!lsof
