# Python Subprocesses

### Running System Commands in Our Scripts

What if we needed to run a system program from a Python script? Say, for example, that as part of a Python script, we needed to send ICMP packets to a host to check if it's responding.

We could try to look for an external module that provides this functionality. Or we can just run the ping command, which will send packets for us.

Sometimes it's easier or faster to use a system command as part of our Python script to accomplish a task, or to use some functionality that doesn't exist in any Python module, either built-in or external.

For these cases, Python provides a way to execute system commands in scripts, using functions provided by the subprocess module.

Let's check out an example. First, we'll import the subprocess module, and then we'll call the date command, which shows the current date, using the subprocess.run function.

The run function returns an object of the CompletedProcess type. This object includes information related to the execution of the command.


In [1]:
import subprocess
print(subprocess.run(["date"]))

CompletedProcess(args=['date'], returncode=0)


To run the external command, a secondary environment is created for the child process, or subprocess, where the command is executed.

While the parent process, which is our script, is waiting for the subprocess to finish, it's blocked, which means that the parent can't do any work until the child finishes.

After the external command completes its work, the child process exits and the flow of control returns to the parent. Then the script can continue with normal execution.

In [2]:
print(subprocess.run(["sleep", "2"]))

CompletedProcess(args=['sleep', '2'], returncode=0)
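
The blocking behavior can be observed by timing the call. The sketch below is an illustration, not part of the original notebook. As an assumption made only for portability, it launches a short Python child process through `sys.executable` instead of calling `sleep`.

```python
import subprocess
import sys
import time

# Child process: a separate Python interpreter that sleeps for one second.
child_cmd = [sys.executable, "-c", "import time; time.sleep(1)"]

start = time.monotonic()
result = subprocess.run(child_cmd)   # the parent blocks here until the child exits
elapsed = time.monotonic() - start

print(f"child exited with {result.returncode} after {elapsed:.1f} seconds")
```

The elapsed time is never shorter than the child's sleep, because `run` only returns once the child has exited.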


In [3]:
result= subprocess.run(["ls", "area.py"])  # the file exists, so the return code will be 0

In [4]:
print(result.returncode)

0


In [5]:
result= subprocess.run(["ls", "file_doesnot_exists.txt"])  # the file does not exist

In [6]:
print(result.returncode)

2


In [7]:
print(result)

CompletedProcess(args=['ls', 'file_doesnot_exists.txt'], returncode=2)


ls prints an error and returns an exit status other than 0.

This value is stored in the returncode attribute of the CompletedProcess instance, and
we can access it in our code. We can see that the command failed and
the returncode stored was 2, letting us know that there was an error.

We could use this information in the script to do something different in case of failure.
Using the run function like this is useful if we just want to run a command and only care about whether or
not it was successful.
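
A minimal sketch of acting on the return code follows. The child commands are stand-ins (a Python one-liner that exits with a chosen status), and the helper name `run_and_report` is hypothetical. The second half shows `check=True`, an option of `subprocess.run` that raises `CalledProcessError` on a non-zero exit status instead of leaving the comparison to us.

```python
import subprocess
import sys

def run_and_report(exit_code):
    """Run a stand-in child that exits with exit_code; return a status string."""
    cmd = [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]
    result = subprocess.run(cmd)
    if result.returncode == 0:
        return "success"
    return f"failed with return code {result.returncode}"

ok_status = run_and_report(0)
bad_status = run_and_report(2)
print(ok_status)
print(bad_status)

# Alternative: let run() raise an exception on a non-zero exit status.
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(2)"], check=True)
    raised = False
except subprocess.CalledProcessError as err:
    raised = True
    print(f"command failed with return code {err.returncode}")
```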

The output of the command will be printed to the screen, which means that our script
has no control over it. This can be handy for system commands that don't produce useful output, like cp,
chmod, sleep, and many others, or when we don't care about processing the output any further.

In other words, it works when it's fine for the output to simply be printed to the screen.
For example, if we're writing a script that's changing the permissions of a bunch of files in a tree of directories,
we don't care about the output of the chmod command.
We only want to know if it was successful or not.

If instead, we want to capture the output of an external command and then operate with the results,
we need a different strategy.

### Obtaining the Output of a System Command

If we want our Python scripts to manipulate the output of the system commands that we're executing, we need to tell the run function to capture it for us.

This might be helpful when we need to extract information from a command and then use it for something
else in our script. 

For example, say you wanted to create some stats on which users are logging into a server throughout the day. You could do this with a script that calls the **who** command, which prints the users currently logged into a computer. 

The script could parse the output of the command, storing the list of
logged-in users once per hour, and at the end of the day generate
a daily report.

To be able to process the output of commands, we'll set a parameter
called **capture_output** to **True** when calling the run function.
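
As a rough sketch of the **who** idea described above, the code below separates parsing from running the command. The function names `parse_who` and `logged_in_users` are hypothetical, and the sketch assumes that `who` prints one session per line with the user name in the first column. The sample text is invented for illustration and stands in for a live call. The `decode()` call is explained later in this section.

```python
import subprocess

def parse_who(output):
    """Return the sorted, de-duplicated user names found in who-style output."""
    users = set()
    for line in output.splitlines():
        fields = line.split()
        if fields:                 # skip blank lines
            users.add(fields[0])   # the user name is the first column
    return sorted(users)

def logged_in_users():
    """Call who and parse its output (needs a system that provides who)."""
    result = subprocess.run(["who"], capture_output=True)
    return parse_who(result.stdout.decode())

# Invented sample in the usual who layout, used here instead of a live call.
sample = (
    "alice    tty1         2024-01-15 09:02\n"
    "bob      pts/0        2024-01-15 09:30 (192.0.2.10)\n"
    "alice    pts/1        2024-01-15 10:11 (192.0.2.11)\n"
)
users = parse_who(sample)
print(users)
```

A script could call `logged_in_users()` once per hour and accumulate the results for the daily report.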

For our next example, we'll call the **host** command, which can convert a host name to an IP address and vice versa.


In [8]:
res= subprocess.run(["host", "8.8.8.8"], capture_output= True)
print(res.returncode)

0


In [9]:
print(res.stdout)

b'8.8.8.8.in-addr.arpa domain name pointer dns.google.\n'



That **b** prefix tells us that this is not a regular Python string. It's actually a sequence of bytes (a bytes object).

Data in computers is stored and transmitted in bytes, and a single byte can take only 256 different values. But there are thousands of
possible characters out there used to write in various languages. 

Chinese, for example, requires over 10,000 different characters. To be able to write in those languages, several specifications called encodings have been created over time to indicate which sequences of bytes represent which characters.

Nowadays, most people use UTF-8 encoding, which is part of the
Unicode standard that lists all the possible characters that can be represented. 

So going back to our example when we execute the command using run, Python doesn't know which encoding to use to process the output of the command. 

So it simply represents it as a series of bytes. If we want this to
become a proper string, we can call the decode method. This method applies an encoding to transform the bytes into a string. 

By default, it uses a UTF-8 encoding which is what we want. So with all that said, let's transform our array of bytes into a string and then split it into several pieces. 
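
The relationship between strings and bytes can be seen without running any external command. This small sketch uses an arbitrary example word to show that one non-ASCII character may take more than one byte in UTF-8, and that `decode()` reverses `encode()`.

```python
text = "héllo"                      # five characters, one of them non-ASCII
encoded = text.encode("utf-8")      # str -> bytes
print(encoded)                      # b'h\xc3\xa9llo'
print(len(text), len(encoded))      # 5 6

decoded = encoded.decode()          # bytes -> str, UTF-8 by default
print(decoded == text)              # True
```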

In [10]:
print(res.stdout.decode().split())

['8.8.8.8.in-addr.arpa', 'domain', 'name', 'pointer', 'dns.google.']



In this way, we're operating with the output of the command that we ran, and we can do whatever we need to do with it.

For example, we can choose to keep the last element of the list, which is the name that corresponds to the IP that we're looking for.
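
Keeping the last element could look like the sketch below. To avoid depending on the network or on the `host` command being installed, it uses a stand-in child process that prints the same line shown earlier; that substitution is an assumption of this example. It also passes `text=True`, which asks `run` to decode the output into a string so that no explicit `decode()` call is needed.

```python
import subprocess
import sys

# Stand-in for `host 8.8.8.8`: a child process that prints the line shown earlier.
line = "8.8.8.8.in-addr.arpa domain name pointer dns.google."
cmd = [sys.executable, "-c", f"print({line!r})"]

# text=True makes run() return stdout and stderr as str instead of bytes.
res = subprocess.run(cmd, capture_output=True, text=True)

hostname = res.stdout.split()[-1]   # last whitespace-separated field
print(hostname)
```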


But what about standard error?
If we use the capture_output parameter and the command writes any
output to standard error, it will be stored in the **stderr** attribute of the CompletedProcess instance.



In [11]:
result= subprocess.run(["rm", "doesnot_exists"], capture_output= True)
print(result.returncode)

1


In [12]:
print(result.stdout)

b''


In [13]:
print(result.stderr)

b"rm: cannot remove 'doesnot_exists': No such file or directory\n"
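
A script would typically combine the return code and the captured standard error to report what went wrong. The following is a minimal sketch under the assumption of a stand-in failing command: a Python child that writes a made-up message to standard error and exits with status 1.

```python
import subprocess
import sys

# Stand-in for a failing command: writes a message to stderr and exits with 1.
child_code = "import sys; sys.stderr.write('something went wrong\\n'); sys.exit(1)"
result = subprocess.run([sys.executable, "-c", child_code], capture_output=True)

if result.returncode != 0:
    # stderr is bytes, just like stdout, so decode it before using it as text.
    error_message = result.stderr.decode().strip()
    print(f"command failed: {error_message}")
```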


### Advanced Subprocess Management

One way of providing information to our processes is to modify the
environment variables. Using this mechanism, we can change where the process looks for executable files, which commands it uses to interact with some parts of the system, the kind of output it'll generate, and a bunch more things.

The usual strategy for modifying the environment of a child process is to first copy the environment seen by our process, do any necessary changes, and then pass that as the environment that the
child process will see. 


In [None]:
import os
import subprocess

my_env= os.environ.copy()
#my_env["PATH"] = os.pathsep.join(["/home/sapan/anaconda3/", my_env["PATH"]])
my_env["PATH"]= os.pathsep.join(["/home/sapan/anaconda3/myapp", my_env["PATH"]])

result= subprocess.run(["myapp"], env= my_env)

So in this code, we start by calling the **copy** method of the os.environ dictionary, which contains the current environment variables.

This creates a new dictionary that we can change as needed without modifying the original environment.

The change that we're making in this script is adding one extra directory to the PATH variable.

Remember, the PATH variable indicates where the operating system will look for executable programs. By adding one entry to PATH, we're telling the OS to look for programs in an
additional location.

To create the new value, we're calling the join method on os.pathsep. This joins the elements of the list that we're passing, using the path separator for the current operating system.

So here, we're joining **/home/sapan/anaconda3/myapp** and the old value of the PATH variable with the path separator.

Finally, we call the **myapp** command, setting the env parameter to the new environment that we've just prepared.

So to recap, this script modifies the contents of the
PATH environment variable by adding a directory to it.

We then call the **myapp** command with that modified variable. Doing it this way, the command will run in the modified environment with the updated value of PATH.
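
One way to convince ourselves that the child really sees the modified environment is to ask a child process to print its own PATH. This is only a sketch: the directory `/opt/myapp` is a hypothetical placeholder that does not need to exist, and a Python one-liner stands in for a real application.

```python
import os
import subprocess
import sys

my_env = os.environ.copy()
extra_dir = "/opt/myapp"   # hypothetical directory, used only for illustration
my_env["PATH"] = os.pathsep.join([extra_dir, my_env.get("PATH", "")])

# Ask a child process to report the PATH it actually received.
child_code = "import os; print(os.environ['PATH'])"
result = subprocess.run([sys.executable, "-c", child_code],
                        env=my_env, capture_output=True)
child_path = result.stdout.decode().strip()

print(child_path.split(os.pathsep)[0])        # the directory we prepended
print(os.environ.get("PATH") == child_path)   # False: the parent's PATH is unchanged
```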





There are a bunch more options that we can use with the run function. 

For example, we can use the **cwd** parameter to change the current
working directory where the command will be executed. This can be really helpful when working with a set of directories where you need to run a command on each of them.
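
The effect of the working directory parameter can be sketched with a child process that prints its own working directory. The temporary directory and the Python one-liner are stand-ins chosen for this example.

```python
import os
import subprocess
import sys
import tempfile

child_code = "import os; print(os.getcwd())"

with tempfile.TemporaryDirectory() as workdir:
    # The child starts in workdir; the parent's working directory does not change.
    result = subprocess.run([sys.executable, "-c", child_code],
                            cwd=workdir, capture_output=True)
    child_cwd = result.stdout.decode().strip()
    # realpath() resolves symlinks so that both paths can be compared reliably.
    same_dir = os.path.realpath(child_cwd) == os.path.realpath(workdir)

print(same_dir)
```

To run a command in each of several directories, the same call can be placed in a loop that passes a different `cwd` value on each iteration.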

We could also set the **timeout** parameter. This will cause the run
function to kill the process if it takes longer than a given number of seconds to finish, and then raise a TimeoutExpired exception.
This might be useful if you're running a command that you
know might get stuck, for example, if it's trying to connect to a network and your computer is offline.
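
A short sketch of handling a timeout follows. The slow command is a stand-in (a Python child that sleeps for 30 seconds), and the one-second limit is an arbitrary value chosen to keep the example quick.

```python
import subprocess
import sys

# Stand-in for a command that hangs: a child that sleeps for 30 seconds.
slow_cmd = [sys.executable, "-c", "import time; time.sleep(30)"]

try:
    subprocess.run(slow_cmd, timeout=1)   # give up after one second
    timed_out = False
except subprocess.TimeoutExpired as err:
    # run() has already killed the child before raising this exception.
    timed_out = True
    print(f"command did not finish within {err.timeout} seconds")
```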

We can also set the **shell** parameter. If we set this to True, Python will first execute an instance of the default system shell and then run the given command inside of it.

This means our command line could include variable expansions and
other shell operations. Without the **shell** parameter, this would not be possible. 



For now, just keep in mind that if you need to expand variables or globs, you'll need to set this parameter. But using this can be a security risk. So make sure you actually need it and be careful when
using it if you do. 
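
The difference can be sketched as follows, assuming a POSIX system where the default shell expands `$NAME` variables. The variable `GREETING` is hypothetical and is set only for this demonstration.

```python
import os
import subprocess
import sys

env = os.environ.copy()
env["GREETING"] = "hello"   # hypothetical variable, set only for this demonstration

# Without a shell, "$GREETING" reaches the program as a literal argument.
no_shell = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "$GREETING"],
    env=env, capture_output=True, text=True)

# With shell=True the command is one string, and the shell expands the variable.
with_shell = subprocess.run("echo $GREETING", shell=True,
                            env=env, capture_output=True, text=True)

print(no_shell.stdout.strip())     # $GREETING
print(with_shell.stdout.strip())   # hello
```

The security risk comes from the shell interpreting everything in that string. If a value from outside the script has to be placed in it, `shlex.quote` from the standard library can escape it, although passing a list without the shell avoids the problem entirely.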




A word of caution: interfacing with the underlying system directly in your Python scripts via subprocesses and system commands can be useful, especially if you need to do a specific task quickly.

But it comes with some drawbacks. Using these system-level commands builds assumptions into our scripts about the infrastructure our automation will run on. If those assumptions change, it can lead to unexpected effects or failures. These kinds of assumptions
can change in multiple ways.

What would happen to our automation if the flags of a terminal command change and our script continues to use the old flags?
What happens if we switch operating systems from Linux to Windows? Will our scripts fail outright or will they succeed in unintended and
possibly harmful ways?

Any change to the system or external commands our scripts use increases the chances of something breaking. Sometimes that break might be obvious and other times it might be difficult to detect.

If we're automating a one-off, well-defined task, where developing a solution quickly is the biggest requirement, then using system commands and subprocesses can help a lot.

But if we're doing something more complex or long-running, it's usually a good idea to use the built-in or external modules
that Python provides.

So before deciding to use subprocesses, it's a good idea to check the standard library or the **PyPI** repository to see if we can do the task with native Python and to check if someone has already created the automation that we wanted to write.
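
As an illustration, several of the commands used earlier in this notebook have standard library counterparts. The sketch below shows possible replacements for `date`, for checking a file with `ls`, and for `chmod`. The file names are placeholders, and the permission check assumes a POSIX system.

```python
import datetime
import os
import stat
import tempfile
from pathlib import Path

# Instead of `date`: ask the standard library for the current time.
now = datetime.datetime.now()
print(now.strftime("%Y-%m-%d %H:%M:%S"))

with tempfile.TemporaryDirectory() as tmp:
    # Instead of `ls some_file` plus a return code check: test for existence.
    found = (Path(tmp) / "missing.txt").exists()
    print(found)

    # Instead of `chmod`: change permissions with os.chmod.
    target = Path(tmp) / "example.txt"
    target.write_text("demo\n")
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)   # owner read and write (0o600)
    mode = stat.S_IMODE(target.stat().st_mode)
    print(oct(mode))
```

These calls behave the same way regardless of which external commands happen to be installed, which removes some of the assumptions discussed above.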

Remember that we never want to reinvent the wheel. 



