# Integrating third-party tools in *pyrpipe*
Executing any shell command with pyrpipe is straightforward.

The `Runnable` class can be used to import any Unix command into Python in an object-oriented manner. The `Runnable` class executes all commands via the `pyrpipe_engine` module, which provides helper functions to execute and log shell commands.
Users can also call the `execute_command()` function from `pyrpipe_engine` to run Unix commands directly.

## The Runnable class

To import a Unix command, create a `Runnable` object and specify the command name. The following example imports the [orfipy](https://github.com/urmi-21/orfipy) command into Python.

In [1]:
from pyrpipe.runnable import Runnable
orfipy=Runnable(command='orfipy')
#specify orfipy options; these can also be specified in orfipy.yaml
param={'--outdir':'orfipy_out','--procs':'3','--dna':'orfs.fa'}
#infile must hold the path to the input sequence file; it is not defined in this example
orfipy.run(infile,**param)


### Targets and dependencies
The required dependencies and the expected target files can be specified in the `run()` method.
Replacing the call to `run()` with the following makes pyrpipe verify the required files and the target files.
If the command is interrupted, pyrpipe will scan for `Locked` target files and resume from where the pipeline was interrupted.

In [None]:
orfipy.run(infile,requires=infile,target='orfipy_out/orfs.fa',**param)
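
The idea behind `requires` and `target` can be illustrated without pyrpipe. The sketch below is a conceptual stand-in only, not pyrpipe's implementation: the helper `run_if_needed` and the file names are hypothetical, and the handling of `Locked` files is not modelled. It refuses to run when a required file is missing and skips the command when the target already exists.

```python
import os
import subprocess
import sys
import tempfile

def run_if_needed(cmd, requires=(), target=None):
    """Hypothetical helper: run cmd only if inputs exist and the target is missing."""
    missing = [f for f in requires if not os.path.exists(f)]
    if missing:
        raise FileNotFoundError("missing required files: " + ", ".join(missing))
    if target is not None and os.path.exists(target):
        return "skipped"
    subprocess.run(cmd, check=True)
    return "ran"

# create a small input file in a temporary directory
workdir = tempfile.mkdtemp()
infile = os.path.join(workdir, "in.txt")
outfile = os.path.join(workdir, "out.txt")
with open(infile, "w") as fh:
    fh.write("ACGT\n")

# a portable "copy" command built from the current Python interpreter
copy_cmd = [sys.executable, "-c",
            "import shutil, sys; shutil.copy(sys.argv[1], sys.argv[2])",
            infile, outfile]

first = run_if_needed(copy_cmd, requires=[infile], target=outfile)
second = run_if_needed(copy_cmd, requires=[infile], target=outfile)
print(first, second)
```

The second call returns without running anything because the target file was produced by the first call.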

## Building APIs
One can extend the `Runnable` class to provide custom APIs to Unix tools. The RNA-Seq API provided by pyrpipe uses this framework. A small example is provided in the [tutorial](https://pyrpipe.readthedocs.io/en/latest/?badge=latest).
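
As a rough illustration of the pattern, a subclass can fix the command name and expose a task-specific method. The block below uses a small stand-in base class so that it runs without pyrpipe installed; `ToolBase`, `Orfipy.find_orfs` and the file name `genome.fa` are hypothetical and do not mirror the signature of `Runnable`. It only builds the command list and does not execute anything.

```python
class ToolBase:
    """Stand-in for a generic command wrapper (hypothetical, not pyrpipe's Runnable)."""

    def __init__(self, command):
        self.command = command

    def build(self, *args, **options):
        # options become "key value" pairs, positional arguments go last
        cmd = [self.command]
        for key, value in options.items():
            cmd.extend([key, str(value)])
        cmd.extend(args)
        return cmd


class Orfipy(ToolBase):
    """Tool-specific API: callers pass Python arguments instead of raw flags."""

    def __init__(self):
        super().__init__("orfipy")

    def find_orfs(self, infile, outdir, procs=1):
        options = {"--outdir": outdir, "--procs": procs, "--dna": "orfs.fa"}
        return self.build(infile, **options)


cmd = Orfipy().find_orfs("genome.fa", "orfipy_out", procs=3)
print(cmd)
```

With the real `Runnable` class, the subclass would delegate to `run()` instead of returning a list; consult the tutorial for the actual interface.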

## The pyrpipe_engine module

The `pyrpipe_engine` module contains the functions needed to execute commands. Users can call these functions directly to run commands. All of these functions are decorated with the `dryable` decorator and are automatically turned off when pyrpipe scripts are run with the `--dry-run` option.
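
How a decorator can turn functions off during a dry run is sketched below. This is an illustration of the general technique, not pyrpipe's `dryable` code: the `CONFIG` dictionary stands in for the `--dry-run` option, and returning `True` when a call is skipped is an assumption made for this example.

```python
import functools

# assumption: a simple dict stands in for the --dry-run command-line option
CONFIG = {"dry_run": True}

def dryable(func):
    """Skip the wrapped function when CONFIG['dry_run'] is set."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if CONFIG["dry_run"]:
            print("dry run, skipping:", func.__name__)
            return True
        return func(*args, **kwargs)
    return wrapper

calls = []

@dryable
def run_step(cmd):
    # record the command instead of executing it, to keep the example harmless
    calls.append(cmd)
    return True

dry_status = run_step(["ls", "-l"])    # skipped, nothing recorded
CONFIG["dry_run"] = False
real_status = run_step(["ls", "-l"])   # executed, command recorded
print(calls)
```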

A list of these functions is provided here. For details, refer to the [API docs](https://pyrpipe.readthedocs.io/en/latest/?badge=latest).

| Function | Description |
| --- | --- |
| execute_command | Runs a command, logs the status, and returns the status (True or False) |
| get_shell_output | Runs a command and returns a tuple (returncode, stdout, stderr) |
| get_return_status | Runs a command and returns True if the command succeeded, False otherwise |
| execute_commandRealtime | Runs a command and prints its output in real time |


### The execute_command() function

Execute a command, log the details and return the status (True or False).

The following example executes a simple `ls -l` command. The command is not logged (`logs=False`) and the stdout is printed to the screen (`verbose=True`). See the API docs for [`execute_command()`](https://pyrpipe.readthedocs.io/en/latest/pyrpipe.html#pyrpipe.pyrpipe_engine.execute_command) for more information.



In [3]:
#Import necessary modules
from pyrpipe import pyrpipe_engine as pe

#run a shell command
pe.execute_command(['ls', '-l'],logs=False,verbose=True)


## Commands in a `string`
A command in `string` format can be easily converted to a list.

In [2]:
cmd="blastx -query sample_data/test.fa -db sample_data/pldb/mydb -qcov_hsp_perc 30 -num_threads 2 -out sample_data/blast_out"
cmdList=cmd.split()
pe.execute_command(cmdList,verbose=True,logs=False,objectid="",command_name="")

#head the output
pe.execute_command(['head','-20','sample_data/blast_out'],verbose=True,logs=False,objectid="",command_name="")

$ blastx -query sample_data/test.fa -db sample_data/pldb/mydb -qcov_hsp_perc 30 -num_threads 2 -out sample_data/blast_out
Time taken:0:00:04
$ head -20 sample_data/blast_out
STDOUT:
BLASTX 2.7.1+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.



Database: mydb
           250 sequences; 128,483 total letters



Query= CNT0043697

Length=699


Time taken:0:00:00


True
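
Note that `str.split()` breaks on every whitespace character, so it is only safe when no argument contains spaces or quotes. For commands with quoted arguments, the standard-library `shlex.split()` is one option. The short comparison below uses a made-up `grep` command and file name purely for illustration.

```python
import shlex

# hypothetical command with quoted arguments that contain spaces
cmd = 'grep -c "open reading frame" "my results/orfs.txt"'

naive = cmd.split()        # splits inside the quotes
safe = shlex.split(cmd)    # honours shell-style quoting

print(naive)
print(safe)
```

The list produced by `shlex.split()` keeps each quoted argument as one element and can be passed to `execute_command()` in the same way as `cmdList` above.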

## Commands in a `dict`
The `pyrpipe_utils` module contains helper functions [`parse_unix_args()`](https://pyrpipe.readthedocs.io/en/latest/pyrpipe.html#pyrpipe.pyrpipe_utils.parse_unix_args) and [`parse_java_args()`](https://pyrpipe.readthedocs.io/en/latest/pyrpipe.html#pyrpipe.pyrpipe_utils.parse_java_args) to convert commands in a `dict` to a list. This option can be useful to read commands or rules stored in .json format and execute them with pyrpipe.

In [3]:
from pyrpipe import pyrpipe_utils as pu
#run blast
"""NOTE: Python 3.7 and higher guarantees that a dict keeps the order in which elements are inserted
(CPython 3.6 already does so as an implementation detail).
To provide positional arguments, use "--" as the key followed by a tuple. For example,
params={'-threads':'10','--':('file1','file2')} will be parsed as

-threads 10 file1 file2

"""

blast_parameters={'-query':'sample_data/test.fa',
                  '-db': 'sample_data/pldb/mydb',
                  '-qcov_hsp_perc': '30',
                  '-num_threads': '2',
                  '-out': 'sample_data/blast_out2'
}

blast_cmd=['blastx']

param_list=pu.parse_unix_args([],blast_parameters) 
#Note: the first argument, valid_args_list, can be provided to ignore invalid arguments

#add parameters
blast_cmd.extend(param_list)
pe.execute_command(blast_cmd,verbose=True,logs=False,objectid="",command_name="")

#head the output
pe.execute_command(['head','-20','sample_data/blast_out2'],verbose=True,logs=False,objectid="",command_name="")


$ blastx -query sample_data/test.fa -db sample_data/pldb/mydb -qcov_hsp_perc 30 -num_threads 2 -out sample_data/blast_out2
Time taken:0:00:03
$ head -20 sample_data/blast_out2
STDOUT:
BLASTX 2.7.1+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.



Database: mydb
           250 sequences; 128,483 total letters



Query= CNT0043697

Length=699


Time taken:0:00:00


True
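
To make the dict-to-list conversion concrete, the sketch below flattens a rule loaded from JSON into an argument list. It is not the code of `parse_unix_args()`: the helper `dict_to_args` is hypothetical, and placing the positional arguments at the end is an assumption that matches the `-threads 10 file1 file2` example in the note above.

```python
import json

def dict_to_args(params):
    """Hypothetical helper: flatten an options dict into an argument list."""
    args = []
    positional = []
    for key, value in params.items():
        if key == "--":
            # positional arguments are collected and appended at the end
            positional.extend(value)
        else:
            args.extend([key, str(value)])
    return args + positional

# a rule as it might be stored in a .json file (JSON has lists, not tuples)
rule = json.loads('{"-threads": "10", "--": ["file1", "file2"]}')
arg_list = dict_to_args(rule)
print(arg_list)
```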

## Getting stdout from command
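The [`getShellOutput()`](https://pyrpipe.readthedocs.io/en/latest/pyrpipe.html#pyrpipe.pyrpipe_engine.getShellOutput) function returns the return code, stdout, and stderr as a tuple. This function is listed as `get_shell_output` in the table above; the name may depend on the installed pyrpipe version.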

In [5]:
result=pe.getShellOutput(['du', '-sh','sample_data/blast_out2'])
#result contains return code, stdout, stderr
print(result)

#check if command was successful
if result[0] == 0:
    #get the stdout as string
    print(result[1].decode("utf-8"))
    

(0, b'364K\tsample_data/blast_out2\n', None)
364K	sample_data/blast_out2
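
For comparison, the same (returncode, stdout, stderr) pattern can be written with the standard-library `subprocess` module. This is a sketch under the assumption that capturing both streams as bytes is sufficient; the helper `shell_output` is hypothetical and is not how pyrpipe implements its function.

```python
import subprocess
import sys

def shell_output(cmd):
    """Hypothetical helper returning (returncode, stdout, stderr) with bytes output."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.returncode, proc.stdout, proc.stderr

# use the current Python interpreter as a portable test command
result = shell_output([sys.executable, "-c", "print('hello')"])

#check if command was successful
if result[0] == 0:
    #stdout is bytes and must be decoded to get a string
    text = result[1].decode("utf-8").strip()
    print(text)
```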



## Get realtime output from shell
The `execute_commandRealtime()` function yields the output of a command as it is produced, so it can be printed to the screen in real time.

In [5]:
cmd=['ping','-c','4','google.com']

for output in pe.execute_commandRealtime(cmd):
    print(output)

PING google.com (172.217.8.206) 56(84) bytes of data.

64 bytes from ord37s09-in-f14.1e100.net (172.217.8.206): icmp_seq=1 ttl=53 time=20.1 ms

64 bytes from ord37s09-in-f14.1e100.net (172.217.8.206): icmp_seq=2 ttl=53 time=20.1 ms

64 bytes from ord37s09-in-f14.1e100.net (172.217.8.206): icmp_seq=3 ttl=53 time=20.1 ms

64 bytes from ord37s09-in-f14.1e100.net (172.217.8.206): icmp_seq=4 ttl=53 time=20.0 ms



--- google.com ping statistics ---

4 packets transmitted, 4 received, 0% packet loss, time 3003ms

rtt min/avg/max/mdev = 20.087/20.140/20.193/0.146 ms
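
A generator of this kind can be sketched with `subprocess.Popen`. The function `stream_command` below is hypothetical and only illustrates the technique of yielding lines while the child process is still running; it is not pyrpipe's implementation. A short Python child process replaces `ping` so that the example does not need network access.

```python
import subprocess
import sys

def stream_command(cmd):
    """Hypothetical generator yielding each output line as soon as it is written."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    for line in proc.stdout:
        yield line
    proc.stdout.close()
    returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, cmd)

# child process that prints three lines with a short pause between them
child = "import time\nfor i in range(3):\n    print('tick', i, flush=True)\n    time.sleep(0.1)"

lines = []
for output in stream_command([sys.executable, "-c", child]):
    lines.append(output.rstrip())
    print(output, end="")
```

Each yielded line already ends with a newline, so printing with `end=""` avoids an extra blank line between lines.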

