# Operating System Module

The ```os``` (operating system) module is used to navigate around the operating system. It behaves similarly to the native shell languages (PowerShell on Windows and bash on Linux) and is primarily function based.

## Categorize_Identifiers Module

This notebook uses the functions ```dir2```, ```variables``` and ```view``` from the custom module ```categorize_identifiers```, which is found in the same directory as this notebook file. ```dir2``` is a variant of ```dir``` that groups identifiers into a ```dict``` under categories, ```variables``` is an IPython based variable inspector and ```view``` is used to view a ```Collection``` in more detail:

In [1]:
from categorize_identifiers import dir2, variables, view

## OS Module

The module can be imported:

In [2]:
import os

The identifiers for the ```os``` module can be examined. 

There are a number of uppercase constants, which are listed separately from the lowercase constants shown as attributes. The uppercase constants represent various flags and options used in file operations, process management and file permissions, and are typically used internally by developers.

The lowercase constants are typically used by the end user and are sometimes OS specific.

The functions allow navigation around the operating system, changing of file permissions, creating and deleting folders:

In [3]:
dir2(os)

{'attribute': ['altsep',
               'confstr_names',
               'curdir',
               'defpath',
               'devnull',
               'environ',
               'environb',
               'extsep',
               'linesep',
               'name',
               'pardir',
               'pathconf_names',
               'pathsep',
               'sep',
               'supports_bytes_environ',
               'supports_dir_fd',
               'supports_effective_ids',
               'supports_fd',
               'supports_follow_symlinks',
               'sysconf_names'],
 'constant': ['CLD_CONTINUED',
              'CLD_DUMPED',
              'CLD_EXITED',
              'CLD_KILLED',
              'CLD_STOPPED',
              'CLD_TRAPPED',
              'CLONE_FILES',
              'CLONE_FS',
              'CLONE_NEWIPC',
              'CLONE_NEWNET',
              'CLONE_NEWNS',
              'CLONE_NEWPID',
              'CLONE_NEWUSER',
              'CLONE_NEWUTS',
   

Most of the functions will operate in the current working directory by default. There is the submodule ```os.path``` which groups together path based identifiers, which are used to construct a file path. This file path can be used to change the directory used by ```os``` functions:

In [4]:
dir2(os.path)

{'attribute': ['altsep',
               'curdir',
               'defpath',
               'devnull',
               'extsep',
               'pardir',
               'pathsep',
               'sep',
               'supports_unicode_filenames'],
 'module': ['genericpath', 'os', 'stat', 'sys'],
 'method': ['abspath',
            'basename',
            'commonpath',
            'commonprefix',
            'dirname',
            'exists',
            'expanduser',
            'expandvars',
            'getatime',
            'getctime',
            'getmtime',
            'getsize',
            'isabs',
            'isdir',
            'isfile',
            'isjunction',
            'islink',
            'ismount',
            'join',
            'lexists',
            'normcase',
            'normpath',
            'realpath',
            'relpath',
            'samefile',
            'sameopenfile',
            'samestat',
            'split',
            'splitdrive',
            'spl

Note that the most commonly used lowercase constants are also imported directly into the ```os``` namespace for convenience:

In [5]:
os.sep == os.path.sep

True

A summary of the module can be displayed using ```?```:

In [6]:
os?

Type:        module
String form: <module 'os' (frozen)>
File:        ~/anaconda3/envs/jupyter-env/lib/python3.12/os.py
Docstring:
OS routines for NT or Posix depending on what system we're on.

This exports:
  - all functions from posix or nt, e.g. unlink, stat, etc.
  - os.path is either posixpath or ntpath
  - os.name is either 'posix' or 'nt'
  - os.curdir is a string representing the current directory (always '.')
  - os.pardir is a string representing the parent directory (always '..')
  - os.sep is the (or a most common) pathname separator ('/' or '\\')
  - os.extsep is the extension separator (always '.')
  - os.altsep is the alternate pathname separator (None or '/')
  - os.pathsep is the component separator used in $PATH etc
  - os.linesep is the line separator in text files ('\r' or '\n' or '\r\n')
  - os.defpath is the default search path for executables
  - os.devnull is the file path of the null device ('/dev/null', etc.)

Prog

More details can be seen using ```help```:

In [7]:
help(os)

Help on module os:

NAME
    os - OS routines for NT or Posix depending on what system we're on.

MODULE REFERENCE
    https://docs.python.org/3.12/library/os.html

    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This exports:
      - all functions from posix or nt, e.g. unlink, stat, etc.
      - os.path is either posixpath or ntpath
      - os.name is either 'posix' or 'nt'
      - os.curdir is a string representing the current directory (always '.')
      - os.pardir is a string representing the parent directory (always '..')
      - os.sep is the (or a most common) pathname separator ('/' or '\\')
      - os.extsep is the extension separator (always '.')
      - os.altsep is the alternate pathname 

## Attributes

The ```os.name``` attribute gives the name of the operating system family, which is ```'nt'``` for Windows and ```'posix'``` for Linux. On this Linux machine:

In [8]:
os.name

'posix'

```os.path.curdir```, also available as ```os.curdir```, is the current directory:

In [9]:
os.curdir

'.'

```os.path.pardir```, also available as ```os.pardir```, is the parent directory:

In [10]:
os.pardir

'..'

These have the same values on Windows and Linux so ```.``` and ```..``` are commonly used directly.

```os.path.extsep```, also available as ```os.extsep```, is the extension separator which splits the file name from the file extension:

In [11]:
os.extsep

'.'

```os.path.sep```, also available as ```os.sep```, is the separator placed between directory names in a path:

In [12]:
os.sep

'/'

```os.sep``` is OS specific. Linux uses the forward slash ```/``` while Windows uses the backslash ```\```.

In Python ```\``` is used to insert an escape character into a ```str``` instance and therefore this is shown as ```\\```.

On Linux this notebook file is found in:

In [13]:
'./notebook.ipynb'

'./notebook.ipynb'

On Windows this notebook file is found in:

In [14]:
'.\\notebook.ipynb'

'.\\notebook.ipynb'

When the path is built from the constants instead, the representation appropriate for each operating system is produced:

In [15]:
os.curdir + os.sep + 'notebook' + os.extsep +'ipynb'

'./notebook.ipynb'
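
As a rough illustration of how the same components render on each platform, the sketch below calls the ```posixpath``` and ```ntpath``` modules directly. These are the two implementations that ```os.path``` resolves to, and both can be imported on any operating system. The variable names are only illustrative:

```python
import ntpath
import posixpath

# The same path components joined under each platform's rules
parts = ('.', 'notebook.ipynb')

linux_style = posixpath.join(*parts)
windows_style = ntpath.join(*parts)

print(linux_style)    # ./notebook.ipynb
print(windows_style)  # .\notebook.ipynb
```

In normal code ```os.path``` is used rather than either module by name, so that the platform is selected automatically.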

```os.linesep``` is an attribute for the line separator in a file:

In [17]:
os.linesep

'\n'

There is again a slight difference on Linux and Windows. Linux uses ```'\n'``` and Windows uses ```'\r\n'```.

My home folder ```~``` on this Linux computer is found in:

In [None]:
'/home/philip'

Alternatively it could be constructed using:

In [18]:
os.sep + 'home' + os.sep + 'philip' 

'/home/philip'

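It is error prone to place separators manually, as it is easy to miss one or include an extra one. The ```os.path.join``` function inserts a separator between each component automatically. Note that it does not add a leading separator, so the result below is a relative path: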

In [19]:
os.path.join('home', 'philip')

'home/philip'
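
A minimal sketch of how the absolute form might be recovered with ```os.path.join```, assuming the root is supplied as the first component using ```os.sep```. The folder names are the ones from the example above:

```python
import os

# os.path.join only places separators between components,
# so os.sep is passed first to anchor the path at the root
relative = os.path.join('home', 'philip')
absolute = os.path.join(os.sep, 'home', 'philip')

print(relative)
print(absolute)
```

On this Linux machine the second result matches ```'/home/philip'```.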

## Environmental Variables

**Hardcoding an absolute path like the above is bad practice.** If a file is being searched for using the absolute path above, the code will work on my computer but it won't work on yours because your user profile will be different. ```os.environ``` is a ```dict```-like mapping which is used to access environmental variables. An environmental variable is a named string set by the operating system or shell, and several of them hold user-specific locations, for example your user profile:

In [20]:
os.environ

environ{'SHELL': '/bin/bash',
        'WSL2_GUI_APPS_ENABLED': '1',
        'CONDA_EXE': '/home/philip/anaconda3/bin/conda',
        '_CE_M': '',
        'WSL_DISTRO_NAME': 'Ubuntu',
        'XML_CATALOG_FILES': 'file:///home/philip/anaconda3/envs/jupyter-env/etc/xml/catalog file:///etc/xml/catalog',
        'NAME': 'pc',
        'PWD': '/home/philip',
        'GSETTINGS_SCHEMA_DIR': '/home/philip/anaconda3/envs/jupyter-env/share/glib-2.0/schemas',
        'LOGNAME': 'philip',
        'CONDA_PREFIX': '/home/philip/anaconda3/envs/jupyter-env',
        'GSETTINGS_SCHEMA_DIR_CONDA_BACKUP': '',
        'HOME': '/home/philip',
        'LANG': 'C.UTF-8',
        'WSL_INTEROP': '/run/WSL/971_interop',
        'LS_COLORS': 'rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tl

Although the mapping can be indexed into directly, the function ```os.getenv``` is generally preferred because it returns ```None``` for an unrecognised key instead of raising a ```KeyError```. This is important as the ```'USERPROFILE'``` environmental variable doesn't exist on Linux, so nothing is displayed here:

In [21]:
os.getenv('USERPROFILE')
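
The difference can be sketched as follows. The variable name ```HYPOTHETICAL_UNSET_VARIABLE``` is made up for illustration and is assumed not to be set on the machine running the code:

```python
import os

# os.getenv returns None, or a supplied default, when the variable is missing
missing = os.getenv('HYPOTHETICAL_UNSET_VARIABLE')
fallback = os.getenv('HYPOTHETICAL_UNSET_VARIABLE', 'not set')

# Indexing os.environ directly raises KeyError instead
try:
    os.environ['HYPOTHETICAL_UNSET_VARIABLE']
    raised = False
except KeyError:
    raised = True

print(missing)   # None
print(fallback)  # not set
print(raised)    # True
```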

The ```os.path.expanduser``` function works on both Linux and Windows, expanding a leading ```'~'``` to the home directory of the current user:

In [22]:
os.path.expanduser('~') 

'/home/philip'

Care needs to be taken with separators with this function:

In [23]:
os.path.expanduser('~' + os.sep + 'Documents') 

'/home/philip/Documents'

For this reason it is commonly used with ```os.path.join```:

In [24]:
userprofile = os.path.expanduser('~')

In [25]:
os.path.join(userprofile, 'Documents')

'/home/philip/Documents'

The current working directory can be found using the ```os.getcwd``` function:

In [26]:
os.getcwd()

'/home/philip/Documents/python-notebooks/os_module'

## File Operations

The files and subdirectories in this directory can be listed using the ```os.listdir``` function:

In [27]:
os.listdir()

['notebook_linux.ipynb',
 'notebook.ipynb',
 '__pycache__',
 'categorize_identifiers.py',
 'images',
 '.ipynb_checkpoints',
 'text.txt']

This will by default be the folder containing the Interactive Python Notebook file. The directory can be changed using the ```os``` function ```chdir``` for example to the parent directory using ```..``` or ```os.pardir```:

In [28]:
os.chdir(os.pardir)

In [29]:
os.getcwd()

'/home/philip/Documents/python-notebooks'

This parent directory can be assigned to a variable using the ```os``` function ```getcwd```:

In [30]:
parent = os.getcwd()

In [31]:
variables(['parent'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| parent | str | 39 | /home/philip/Documents/python-notebooks |


And the ```os.path``` function ```join``` can be used to join this with the folder and the name of the notebook itself:

In [32]:
notebook_path = os.path.join(parent, 'os_module', 'notebook' + os.extsep + 'ipynb')

In [33]:
variables(['notebook_path'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| notebook_path | str | 64 | /home/philip/Documents/python-notebooks/os_module/notebook.ipynb |


The ```os.path``` function ```exists``` can be used to check whether a file exists returning a boolean:

In [34]:
os.path.exists(notebook_path)

True

The ```os.path``` function ```split``` returns a ```tuple``` where the first element is the directory of the file and the second element is the file including the file extension:

In [35]:
os.path.split(notebook_path)

('/home/philip/Documents/python-notebooks/os_module', 'notebook.ipynb')

In [36]:
file_path, file = os.path.split(notebook_path)

In [37]:
variables(['file_path', 'file'])

| Instance Name | Type | Size/Shape | Value |
|---|---|---|---|
| file_path | str | 49 | /home/philip/Documents/python-notebooks/os_module |
| file | str | 14 | notebook.ipynb |


The ```os.path.splitext``` function splits a file path from its file extension, once again returning a two element ```tuple```. The first element is the file path including the file name and the second element is the extension, including the leading extension separator:

In [38]:
os.path.splitext(file)

('notebook', '.ipynb')

In [39]:
os.path.splitext(notebook_path)

('/home/philip/Documents/python-notebooks/os_module/notebook', '.ipynb')
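
The pieces returned by ```split``` and ```splitext``` can be recombined with ```join```. The sketch below uses a short relative path rather than the full one above, and the target extension ```py``` is only an example:

```python
import os

path = os.path.join('os_module', 'notebook.ipynb')

folder = os.path.dirname(path)      # same as os.path.split(path)[0]
filename = os.path.basename(path)   # same as os.path.split(path)[1]
stem, extension = os.path.splitext(filename)

# Swap the extension while keeping the folder and the file name stem
script_path = os.path.join(folder, stem + os.extsep + 'py')

print(folder, filename, stem, extension)
print(script_path)
```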

The current working directory can be changed to the folder of this notebook file:

In [40]:
os.chdir(file_path)

The contents can be listed using:

In [41]:
os.listdir()

['notebook_linux.ipynb',
 'notebook.ipynb',
 '__pycache__',
 'categorize_identifiers.py',
 'images',
 '.ipynb_checkpoints',
 'text.txt']

A directory can be made using the ```os``` function make directory ```mkdir```. Here a check is made to see whether the directory exists, and it is created only if it does not:

In [42]:
if not os.path.exists('directory1'):
    os.mkdir('directory1')

In [43]:
if not os.path.exists('directory2'):
    os.mkdir('directory2')
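
As an alternative to the explicit check, ```os.makedirs``` (covered later in this notebook) accepts an ```exist_ok``` argument. The sketch below is one possible way to use it, and it works inside a temporary folder created with ```tempfile``` so that the notebook folder is left untouched:

```python
import os
import tempfile

# Work inside a throwaway folder so nothing in the notebook folder changes
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, 'directory1')

    # exist_ok=True suppresses FileExistsError if the folder is already there
    os.makedirs(target, exist_ok=True)
    os.makedirs(target, exist_ok=True)  # the second call changes nothing

    created = os.path.isdir(target)

print(created)  # True
```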

In [44]:
os.listdir()

['notebook_linux.ipynb',
 'notebook.ipynb',
 '__pycache__',
 'directory1',
 'categorize_identifiers.py',
 'images',
 '.ipynb_checkpoints',
 'text.txt',
 'directory2']

```'directory2'``` can be removed using the ```os``` command remove directory ```rmdir```:

In [45]:
os.rmdir('directory2')

In [46]:
os.listdir()

['notebook_linux.ipynb',
 'notebook.ipynb',
 '__pycache__',
 'directory1',
 'categorize_identifiers.py',
 'images',
 '.ipynb_checkpoints',
 'text.txt']

The current working directory can be changed to ```directory1```:

In [47]:
os.chdir('directory1')

In [48]:
os.listdir()

[]

A file can now be created. Passing ```newline=os.linesep``` explicitly selects the line ending used by the operating system:

In [49]:
with open('text.txt', mode='w', encoding='utf-8', errors='strict', newline=os.linesep) as file:
    file.write('Hello World!\nBye World!')

A Python file can be created in the same manner:

In [50]:
with open('script.py', mode='w', encoding='utf-8', errors='strict', newline=os.linesep) as file:
    file.write("print('Hello World!')\n")

These files can be seen:

In [51]:
os.listdir()

['script.py', 'text.txt']
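
To see what the ```newline``` argument does to the stored file, the sketch below forces the Windows line ending and reads the file back in binary mode. It writes to a temporary folder, and the choice of ```'\r\n'``` is only for demonstration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    file_path = os.path.join(scratch, 'text.txt')

    # Force Windows-style line endings regardless of the current platform
    with open(file_path, mode='w', encoding='utf-8', newline='\r\n') as file:
        file.write('Hello World!\nBye World!')

    # Binary mode shows the bytes exactly as stored on disk
    with open(file_path, mode='rb') as file:
        raw = file.read()

print(raw)  # b'Hello World!\r\nBye World!'
```

Each ```'\n'``` written is translated to the requested line ending as the file is saved.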

The parent folder can be selected again using:

In [52]:
os.chdir(os.pardir)

If an attempt is made to delete ```directory1``` using ```rmdir``` while it still contains files, an ```OSError``` reporting that the directory is not empty is raised:

```python
os.rmdir('directory1')
```

This acts as a safeguard so that files aren't accidentally deleted along with the folder. Individual files can be deleted using the ```os.remove``` function:

In [53]:
os.remove('directory1' + os.sep + 'text.txt')
os.remove('directory1' + os.sep + 'script.py')

And now because it is empty it can be deleted:

In [54]:
os.rmdir('directory1')
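
The sequence above can be sketched in one cell with the error handled rather than left to stop the notebook. The example rebuilds a small folder inside a temporary directory, and the flag names are illustrative:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    folder = os.path.join(scratch, 'directory1')
    os.mkdir(folder)
    file_path = os.path.join(folder, 'text.txt')
    with open(file_path, mode='w', encoding='utf-8') as file:
        file.write('Hello World!\n')

    try:
        os.rmdir(folder)          # fails because the folder still holds text.txt
        removed_first_try = True
    except OSError as error:
        removed_first_try = False
        print(type(error).__name__, error.strerror)

    os.remove(file_path)          # empty the folder first
    os.rmdir(folder)              # now rmdir succeeds
    still_exists = os.path.exists(folder)

print(removed_first_try, still_exists)  # False False
```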

The ```os.makedirs``` function is more powerful and can be used to create a folder together with any missing parent folders in a single call:

In [55]:
os.makedirs('directory1' + os.sep + 'subdirectory1')

In [56]:
os.listdir()

['notebook_linux.ipynb',
 'notebook.ipynb',
 '__pycache__',
 'directory1',
 'categorize_identifiers.py',
 'images',
 '.ipynb_checkpoints',
 'text.txt']

In [57]:
os.listdir('directory1')

['subdirectory1']

In [58]:
os.listdir('directory1' + os.sep + 'subdirectory1')

[]

The ```os.removedirs``` function removes the innermost directory and then each parent directory in the supplied path that is left empty:

In [59]:
os.removedirs('directory1' + os.sep + 'subdirectory1')
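
The upward pruning can be seen in the sketch below. It runs in a temporary folder, and a placeholder file named ```keep.txt``` (a made-up name) is added so that the pruning stops at the temporary folder itself:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    # A placeholder file keeps the scratch folder non-empty,
    # which is where the upward pruning stops
    with open(os.path.join(scratch, 'keep.txt'), mode='w', encoding='utf-8') as file:
        file.write('placeholder\n')

    leaf = os.path.join(scratch, 'directory1', 'subdirectory1')
    os.makedirs(leaf)

    # Removes subdirectory1, then directory1 because it is left empty
    os.removedirs(leaf)

    remaining = os.listdir(scratch)

print(remaining)  # ['keep.txt']
```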

The ```os.replace``` function renames a source to a destination, silently overwriting the destination if it is an existing file. It can therefore be used to rename or move a directory or file:

In [60]:
if not os.path.exists('directory1'):
    os.makedirs('directory1' + os.sep + 'subdirectory1')

In [61]:
file_path = os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'script.py')

with open(file_path, mode='w', encoding='utf-8', errors='strict', newline=os.linesep) as file:
    file.write("print('Hello World!')\n")

In [62]:
os.listdir('directory1')

['subdirectory1']

In [63]:
os.listdir('directory1' + os.sep + 'subdirectory1')

['script.py']

This script file can be renamed using:

In [64]:
source =  os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'script.py')
destination = os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'pscript.py')
os.replace(source, destination)

In [65]:
os.listdir('directory1')

['subdirectory1']

In [66]:
os.listdir('directory1' + os.sep + 'subdirectory1')

['pscript.py']

It can also be renamed and moved:

In [67]:
source = os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'pscript.py')
destination = os.path.join(os.getcwd(), 'directory1', 'script.py')
os.replace(source, destination)

In [68]:
os.listdir('directory1')

['script.py', 'subdirectory1']

In [69]:
os.listdir('directory1' + os.sep + 'subdirectory1')

[]
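
Because ```os.replace``` overwrites an existing destination file, some care is needed. A minimal sketch of that behaviour, using two made-up file names in a temporary folder:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    source = os.path.join(scratch, 'new.txt')
    destination = os.path.join(scratch, 'old.txt')

    with open(source, mode='w', encoding='utf-8') as file:
        file.write('new contents')
    with open(destination, mode='w', encoding='utf-8') as file:
        file.write('old contents')

    # The existing destination is overwritten without any warning
    os.replace(source, destination)

    with open(destination, encoding='utf-8') as file:
        contents = file.read()
    remaining = os.listdir(scratch)

print(contents)   # new contents
print(remaining)  # ['old.txt']
```

If overwriting is not wanted, ```os.path.exists``` can be used to check the destination beforehand.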

Another Python script file can be created:

In [70]:
file_path = os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'script1.py')

with open(file_path, mode='w', encoding='utf-8', errors='strict', newline=os.linesep) as file:
    file.write("print('Hello World!')\n")

The contents of both folders can be seen using ```os.listdir```:

In [71]:
os.listdir('directory1')

['script.py', 'subdirectory1']

In [72]:
os.listdir('directory1' + os.sep + 'subdirectory1')

['script1.py']

## Walk

The ```os.walk``` function can be used to create a generator:

In [73]:
forward = os.walk('directory1')
forward

<generator object walk at 0x7fc6db8218b0>

When ```next``` is used, a three element ```tuple``` is generated containing the current folder, a list of its subfolders and a list of its files:

In [74]:
next(forward)

('directory1', ['subdirectory1'], ['script.py'])

In [75]:
next(forward)

('directory1/subdirectory1', [], ['script1.py'])

This is typically used in a loop:

In [76]:
top = os.walk('directory1')
for root, dirs, files in top:
    print(root, end='\n')
    print('\t', dirs)
    print('\t', files)

directory1
	 ['subdirectory1']
	 ['script.py']
directory1/subdirectory1
	 []
	 ['script1.py']


The ```topdown``` input argument can be assigned to ```False``` so that subdirectories are visited before their parent directories:

In [77]:
top = os.walk('directory1', topdown=False)
for root, dirs, files in top:
    print(root, end='\n')
    print('\t', dirs)
    print('\t', files)

directory1/subdirectory1
	 []
	 ['script1.py']
directory1
	 ['subdirectory1']
	 ['script.py']
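
A common use of the loop is to collect files of one type. The sketch below rebuilds a similar layout in a temporary folder (the extra file ```notes.txt``` is made up) and gathers the paths of the Python files:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    # Rebuild a layout similar to the one used above
    sub = os.path.join(scratch, 'directory1', 'subdirectory1')
    os.makedirs(sub)
    for path in (os.path.join(scratch, 'directory1', 'script.py'),
                 os.path.join(sub, 'script1.py'),
                 os.path.join(sub, 'notes.txt')):
        with open(path, mode='w', encoding='utf-8') as file:
            file.write('')

    python_files = []
    for root, dirs, files in os.walk(scratch):
        for name in files:
            if os.path.splitext(name)[1] == '.py':
                # Store paths relative to the scratch folder for readability
                full_path = os.path.join(root, name)
                python_files.append(os.path.relpath(full_path, scratch))

python_files.sort()
print(python_files)
```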


This can be used with nested ```for``` loops, for example to recursively delete all files and subdirectories within a directory:

In [78]:
for root, dirs, files in os.walk('directory1', topdown=False):
    for name in files:
        os.remove(os.path.join(root, name))
    for name in dirs:
        os.rmdir(os.path.join(root, name))
        
os.rmdir('directory1')
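
The same loop can be wrapped in a function so it can be reused. ```remove_tree``` is a hypothetical helper name, and the sketch tests it on a folder built inside a temporary directory:

```python
import os
import tempfile


def remove_tree(top):
    """Delete top and everything beneath it (hypothetical helper)."""
    # topdown=False visits the deepest folders first, so each folder
    # is already empty by the time os.rmdir is called on it
    for root, dirs, files in os.walk(top, topdown=False):
        for name in files:
            os.remove(os.path.join(root, name))
        for name in dirs:
            os.rmdir(os.path.join(root, name))
    os.rmdir(top)


with tempfile.TemporaryDirectory() as scratch:
    top = os.path.join(scratch, 'directory1')
    os.makedirs(os.path.join(top, 'subdirectory1'))
    script = os.path.join(top, 'subdirectory1', 'script1.py')
    with open(script, mode='w', encoding='utf-8') as file:
        file.write("print('Hello World!')\n")

    remove_tree(top)
    gone = not os.path.exists(top)

print(gone)  # True
```

The standard library also provides ```shutil.rmtree``` for this task, which is usually the more robust choice outside of a tutorial.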

[Return to Python Tutorials](../readme.md)