## Running OS Commands

Let us understand how to run OS commands from Python using libraries such as `os` and `subprocess`.

* Python provides several libraries that can be used to run OS commands; `os` and `subprocess` are the most popular ones.
* We can import libraries such as `os` and `subprocess` to start using them.
* There are a number of functions to create directories, change ownership, change permissions, run general system commands, etc.
* The `os` library is used extensively to read environment variables at application run time. Environment variables are commonly used to pass keys and credentials for working with databases, external applications, etc.
* Typically, keys and credentials should not be part of the source code.
* `subprocess` can be used to run commands and also to process their output.
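
The directory and permission operations mentioned in the list above can be sketched with the `os` module as follows. This is a minimal illustration, assuming a POSIX-style system; the `demo/reports` path is a hypothetical name, and everything happens inside a temporary directory so that no real files are touched.

```python
import os
import stat
import tempfile

# Work inside a throwaway directory so nothing on the real file system changes.
with tempfile.TemporaryDirectory() as base_dir:
    # 'demo/reports' is a hypothetical path used only for illustration.
    target_dir = os.path.join(base_dir, 'demo', 'reports')

    # Create the directory and any missing parents (similar to `mkdir -p`).
    os.makedirs(target_dir, exist_ok=True)
    created = os.path.isdir(target_dir)

    # Restrict permissions to rwxr-x--- (similar to `chmod 750`).
    os.chmod(target_dir, 0o750)
    mode = stat.S_IMODE(os.stat(target_dir).st_mode)

    print(created, oct(mode))
```

Changing ownership with `os.chown` usually needs elevated privileges, so it is left out of this sketch.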

In [1]:
import os

* Get the current working directory.

In [2]:
os.getcwd()

'/home/nghiaht7/data-engineer/data-engineering-essentials/05_programming_essentials_using_python/06_basic_programming_constructs'

* Read environment variables.

In [3]:
os.environ.get('PATH')

'/home/nghiaht7/data-engineer/.venv/bin:/home/nghiaht7/.poetry/bin:/home/nghiaht7/.pyenv/shims:/home/nghiaht7/.pyenv/bin:/home/nghiaht7/.pyenv/versions/3.8.10/bin:/home/nghiaht7/.pyenv/libexec:/home/nghiaht7/.pyenv/plugins/python-build/bin:/home/nghiaht7/.pyenv/plugins/pyenv-virtualenv/bin:/home/nghiaht7/.pyenv/plugins/pyenv-update/bin:/home/nghiaht7/.pyenv/plugins/pyenv-installer/bin:/home/nghiaht7/.pyenv/plugins/pyenv-doctor/bin:/home/nghiaht7/.pyenv/plugins/python-build/bin:/home/nghiaht7/.pyenv/plugins/pyenv-virtualenv/bin:/home/nghiaht7/.pyenv/plugins/pyenv-update/bin:/home/nghiaht7/.pyenv/plugins/pyenv-installer/bin:/home/nghiaht7/.pyenv/plugins/pyenv-doctor/bin:/home/nghiaht7/.poetry/bin:/home/nghiaht7/.sdkman/candidates/spark/current/bin:/home/nghiaht7/.sdkman/candidates/scala/current/bin:/home/nghiaht7/.sdkman/candidates/sbt/current/bin:/home/nghiaht7/.sdkman/candidates/java/current/bin:/home/nghiaht7/.sdkman/candidates/hadoop/current/bin:/home/nghiaht7/.pyenv/shims:/home/nghi

In [4]:
os.environ.get('USER')

'nghiaht7'

In [5]:
os.environ.get('HOME')

'/home/nghiaht7'

In [6]:
%%sh

env

SDKMAN_VERSION=5.12.2
POETRY_ACTIVE=1
LANGUAGE=en_US:en
USER=nghiaht7
LC_TIME=vi_VN
SBT_HOME=/home/nghiaht7/.sdkman/candidates/sbt/current
MPLBACKEND=module://matplotlib_inline.backend_inline
SSH_AGENT_PID=4215
XDG_SESSION_TYPE=x11
SHLVL=2
LESS=-R
HOME=/home/nghiaht7
OLDPWD=/home/nghiaht7/data-engineer
DESKTOP_SESSION=ubuntu
LSCOLORS=Gxfxcxdxbxegedabagacad
ZSH=/home/nghiaht7/.oh-my-zsh
GNOME_SHELL_SESSION_MODE=ubuntu
GTK_MODULES=gail:atk-bridge
SCALA_HOME=/home/nghiaht7/.sdkman/candidates/scala/current
PAGER=cat

In [7]:
os.environ.get?

Signature: os.environ.get(key, default=None)
Docstring: D.get(k[,d]) -> D[k] if k in D, else d.  d defaults to None.
File:      ~/.pyenv/versions/3.8.10/lib/python3.8/_collections_abc.py
Type:      method


In [8]:
os.environ.get('PASSWORD', 'Passwords should be confidential')

'Passwords should be confidential'
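
`os.environ.get` returns the default (or `None`) when a variable is missing, which can hide a misconfiguration when the value is a required credential. One possible pattern, sketched below, is to fail early instead. The helper name `get_required_env` and the variable `DEMO_DB_PASSWORD` are hypothetical and used only for illustration.

```python
import os


def get_required_env(name):
    """Return the value of the environment variable `name`, or raise if it is not set."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f'Required environment variable {name} is not set')
    return value


# Set a value only for this demonstration; normally it comes from the environment.
os.environ['DEMO_DB_PASSWORD'] = 'not-a-real-password'
password = get_required_env('DEMO_DB_PASSWORD')

# Remove it again to show the failure path.
del os.environ['DEMO_DB_PASSWORD']
try:
    get_required_env('DEMO_DB_PASSWORD')
    missing_detected = False
except RuntimeError as error:
    missing_detected = True
    print(error)
```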

* Run the `ls -ltr` command to list the files in the current directory.

In [9]:
import subprocess

In [10]:
%%sh

ls -ltr

total 64
-rw-rw-r-- 1 nghiaht7 nghiaht7  9929 Thg 8  12 20:35 08_all_about_for_loops.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  6339 Thg 8  12 20:35 07_conditionals.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  8006 Thg 8  12 20:35 05_operators_in_python.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  1081 Thg 8  12 20:35 04_data_types_commonly_used.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  2990 Thg 8  12 20:35 03_variables_and_objects.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  1462 Thg 8  12 20:35 02_getting_help.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7 22947 Thg 8  13 16:31 09_running_os_commands.ipynb


In [11]:
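# check_call runs the command and returns its exit code (0 on success); it does not capture the output
output = subprocess.check_call(['ls', '-ltr'])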

total 64
-rw-rw-r-- 1 nghiaht7 nghiaht7  9929 Thg 8  12 20:35 08_all_about_for_loops.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  6339 Thg 8  12 20:35 07_conditionals.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  8006 Thg 8  12 20:35 05_operators_in_python.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  1081 Thg 8  12 20:35 04_data_types_commonly_used.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  2990 Thg 8  12 20:35 03_variables_and_objects.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  1462 Thg 8  12 20:35 02_getting_help.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7 22947 Thg 8  13 16:31 09_running_os_commands.ipynb


In [12]:
output

0
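
`check_call` returns `0` only when the command succeeds; when the command exits with a non-zero status, it raises `subprocess.CalledProcessError`. A minimal sketch of handling that case, assuming a POSIX system with `ls` available and using a deliberately non-existent, hypothetical path:

```python
import subprocess

# A path that is assumed not to exist, so that `ls` fails.
missing_path = '/this/path/should/not/exist'

try:
    # stderr is discarded only to keep the notebook output clean.
    subprocess.check_call(['ls', '-ltr', missing_path], stderr=subprocess.DEVNULL)
    return_code = 0
except subprocess.CalledProcessError as error:
    # The exception carries the non-zero exit code of the command.
    return_code = error.returncode

print(return_code)
```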

In [13]:
output = subprocess.check_output(['ls', '-ltr'])

In [14]:
output # check_output returns the captured output as bytes

b'total 64\n-rw-rw-r-- 1 nghiaht7 nghiaht7  9929 Thg 8  12 20:35 08_all_about_for_loops.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  6339 Thg 8  12 20:35 07_conditionals.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  8006 Thg 8  12 20:35 05_operators_in_python.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  1081 Thg 8  12 20:35 04_data_types_commonly_used.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  2990 Thg 8  12 20:35 03_variables_and_objects.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  1462 Thg 8  12 20:35 02_getting_help.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7 22947 Thg 8  13 16:31 09_running_os_commands.ipynb\n'

In [15]:
type(output)

bytes

In [16]:
output.decode('utf-8') # decodes the bytes into a str using UTF-8

'total 64\n-rw-rw-r-- 1 nghiaht7 nghiaht7  9929 Thg 8  12 20:35 08_all_about_for_loops.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  6339 Thg 8  12 20:35 07_conditionals.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  8006 Thg 8  12 20:35 05_operators_in_python.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  1081 Thg 8  12 20:35 04_data_types_commonly_used.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  2990 Thg 8  12 20:35 03_variables_and_objects.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7  1462 Thg 8  12 20:35 02_getting_help.ipynb\n-rw-rw-r-- 1 nghiaht7 nghiaht7 22947 Thg 8  13 16:31 09_running_os_commands.ipynb\n'

In [17]:
type(output.decode('utf-8'))

str
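
As an alternative to calling `check_output` and decoding the bytes by hand, `subprocess.run` (with `capture_output` and `text`, available from Python 3.7) can capture the output and return it as a string in one step. A small sketch, assuming a POSIX system with `ls` available; the temporary directory and the file name `sample.txt` are hypothetical and exist only to make the listing predictable.

```python
import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create one empty file so that the listing is not empty.
    open(os.path.join(tmp_dir, 'sample.txt'), 'w').close()

    result = subprocess.run(
        ['ls', '-ltr', tmp_dir],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # return str (decoded with the default encoding) instead of bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
    )

print(type(result.stdout))
for line in result.stdout.splitlines():
    print(line)
```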

```{note}
Let us convert the string into a list of strings. Once it is split into a list of strings, we can process the data as required, using either Map Reduce libraries or Pandas-based libraries.
```

In [18]:
output.decode('utf-8').splitlines()

['total 64',
 '-rw-rw-r-- 1 nghiaht7 nghiaht7  9929 Thg 8  12 20:35 08_all_about_for_loops.ipynb',
 '-rw-rw-r-- 1 nghiaht7 nghiaht7  6339 Thg 8  12 20:35 07_conditionals.ipynb',
 '-rw-rw-r-- 1 nghiaht7 nghiaht7  8006 Thg 8  12 20:35 05_operators_in_python.ipynb',
 '-rw-rw-r-- 1 nghiaht7 nghiaht7  1081 Thg 8  12 20:35 04_data_types_commonly_used.ipynb',
 '-rw-rw-r-- 1 nghiaht7 nghiaht7  2990 Thg 8  12 20:35 03_variables_and_objects.ipynb',
 '-rw-rw-r-- 1 nghiaht7 nghiaht7  1462 Thg 8  12 20:35 02_getting_help.ipynb',
 '-rw-rw-r-- 1 nghiaht7 nghiaht7 22947 Thg 8  13 16:31 09_running_os_commands.ipynb']

In [19]:
type(output.decode('utf-8').splitlines())

list

In [20]:
# splitlines is a method available on the str type
# It converts a string with line breaks into a list of strings
for rec in output.decode('utf-8').splitlines():
    print(rec)

total 64
-rw-rw-r-- 1 nghiaht7 nghiaht7  9929 Thg 8  12 20:35 08_all_about_for_loops.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  6339 Thg 8  12 20:35 07_conditionals.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  8006 Thg 8  12 20:35 05_operators_in_python.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  1081 Thg 8  12 20:35 04_data_types_commonly_used.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  2990 Thg 8  12 20:35 03_variables_and_objects.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7  1462 Thg 8  12 20:35 02_getting_help.ipynb
-rw-rw-r-- 1 nghiaht7 nghiaht7 22947 Thg 8  13 16:31 09_running_os_commands.ipynb
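
Once the output is a list of strings, each line can be split into fields for further processing. The sketch below is one possible way to do this, using lines copied from the listing above as sample data. It assumes that file names contain no spaces and that the size is the fifth whitespace-separated field; the helper name `parse_ls_line` is hypothetical.

```python
# Sample lines copied from the `ls -ltr` output shown earlier.
sample_lines = [
    'total 64',
    '-rw-rw-r-- 1 nghiaht7 nghiaht7  9929 Thg 8  12 20:35 08_all_about_for_loops.ipynb',
    '-rw-rw-r-- 1 nghiaht7 nghiaht7  6339 Thg 8  12 20:35 07_conditionals.ipynb',
]


def parse_ls_line(line):
    """Split one long-format `ls` line into a small dict of fields."""
    fields = line.split()
    return {
        'permissions': fields[0],
        'owner': fields[2],
        'size': int(fields[4]),  # size in bytes
        'name': fields[-1],      # assumes the file name has no spaces
    }


# Skip the leading 'total ...' summary line.
records = [parse_ls_line(line) for line in sample_lines if not line.startswith('total')]
total_size = sum(rec['size'] for rec in records)

for rec in records:
    print(rec['name'], rec['size'])
print(total_size)
```

Parsing `ls` output is fragile because its layout depends on the locale and platform; when only file metadata is needed, `os.scandir` or `os.stat` may be a more robust choice.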
