# os modules

The ```os``` (operating system) module is used to navigate the file system and interact with the operating system. It is mostly function based: it contains string attributes that are file path related and functions that return a file path as a Python string. It also has functions equivalent to common commands in shell scripting languages such as PowerShell on Windows or bash on Linux, for example changing directory or listing the contents of a directory.

## os module

To import the module use:

In [1]:
import os

A summary of the module can be viewed using ```?```:

In [2]:
os?

Type:        module
String form: <module 'os' (frozen)>
File:        c:\users\philip\miniconda3\envs\vscode\lib\os.py
Docstring:
OS routines for NT or Posix depending on what system we're on.

This exports:
  - all functions from posix or nt, e.g. unlink, stat, etc.
  - os.path is either posixpath or ntpath
  - os.name is either 'posix' or 'nt'
  - os.curdir is a string representing the current directory (always '.')
  - os.pardir is a string representing the parent directory (always '..')
  - os.sep is the (or a most common) pathname separator ('/' or '\\')
  - os.extsep is the extension separator (always '.')
  - os.altsep is the alternate pathname separator (None or '/')
  - os.pathsep is the component separator used in $PATH etc
  - os.linesep is the line separator in text files ('\r' or '\n' or '\r\n')
  - os.defpath is the default search path for executables
  - os.devnull is the file path of the null device ('/dev/null', etc.)

Progr

More details can be seen using ```help```:

In [3]:
help(os)

Help on module os:

NAME
    os - OS routines for NT or Posix depending on what system we're on.

MODULE REFERENCE
    https://docs.python.org/3.11/library/os.html
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This exports:
      - all functions from posix or nt, e.g. unlink, stat, etc.
      - os.path is either posixpath or ntpath
      - os.name is either 'posix' or 'nt'
      - os.curdir is a string representing the current directory (always '.')
      - os.pardir is a string representing the parent directory (always '..')
      - os.sep is the (or a most common) pathname separator ('/' or '\\')
      - os.extsep is the extension separator (always '.')
      - os.altsep is the alternate pathn

The ```print_identifier_group``` function from the custom ```helper_module``` can be imported to view the identifiers in more detail:

In [4]:
from helper_module import print_identifier_group

In [5]:
print_identifier_group(os, kind='attribute')

['abc', 'altsep', 'curdir', 'defpath', 'devnull', 'environ', 'extsep', 'linesep', 'name', 'pardir', 'path', 'pathsep', 'sep', 'st', 'supports_bytes_environ', 'supports_dir_fd', 'supports_effective_ids', 'supports_fd', 'supports_follow_symlinks', 'sys']


In [6]:
print_identifier_group(os, kind='method')

['_Environ', '_check_methods', '_execvpe', '_exists', '_exit', '_fspath', '_get_exports_list', '_walk', 'abort', 'access', 'add_dll_directory', 'chdir', 'chmod', 'close', 'closerange', 'cpu_count', 'device_encoding', 'dup', 'dup2', 'execl', 'execle', 'execlp', 'execlpe', 'execv', 'execve', 'execvp', 'execvpe', 'fdopen', 'fsdecode', 'fsencode', 'fspath', 'fstat', 'fsync', 'ftruncate', 'get_exec_path', 'get_handle_inheritable', 'get_inheritable', 'get_terminal_size', 'getcwd', 'getcwdb', 'getenv', 'getlogin', 'getpid', 'getppid', 'isatty', 'kill', 'link', 'listdir', 'lseek', 'lstat', 'makedirs', 'mkdir', 'open', 'pipe', 'popen', 'putenv', 'read', 'readlink', 'remove', 'removedirs', 'rename', 'renames', 'replace', 'rmdir', 'scandir', 'set_handle_inheritable', 'set_inheritable', 'spawnl', 'spawnle', 'spawnv', 'spawnve', 'startfile', 'stat', 'strerror', 'symlink', 'system', 'times', 'truncate', 'umask', 'unlink', 'unsetenv', 'urandom', 'utime', 'waitpid', 'waitstatus_to_exitcode', 'walk', '

## os.path submodule

The main purpose of the ```os``` module is to navigate around the Operating System and as a consequence many of its identifiers are grouped under the ```path``` submodule:

In [7]:
os.path?

Type:        module
String form: <module 'ntpath' (frozen)>
File:        c:\users\philip\miniconda3\envs\vscode\lib\ntpath.py
Docstring:
Common pathname manipulations, WindowsNT/95 version.

Instead of importing this module directly, import os and refer to this
module as os.path.

In [8]:
print_identifier_group(os.path, kind='attribute')

['_LCMAP_LOWERCASE', '_LOCALE_NAME_INVARIANT', 'altsep', 'curdir', 'defpath', 'devnull', 'extsep', 'genericpath', 'os', 'pardir', 'pathsep', 'sep', 'stat', 'supports_unicode_filenames', 'sys']


In [9]:
print_identifier_group(os.path, kind='method')

['_LCMapStringEx', '_abspath_fallback', '_get_bothseps', '_getfinalpathname', '_getfinalpathname_nonstrict', '_getfullpathname', '_getvolumepathname', '_nt_readlink', '_path_normpath', '_readlink_deep', 'abspath', 'basename', 'commonpath', 'commonprefix', 'dirname', 'exists', 'expanduser', 'expandvars', 'getatime', 'getctime', 'getmtime', 'getsize', 'isabs', 'isdir', 'isfile', 'islink', 'ismount', 'join', 'lexists', 'normcase', 'normpath', 'realpath', 'relpath', 'samefile', 'sameopenfile', 'samestat', 'split', 'splitdrive', 'splitext']


## Attributes

The ```os``` attribute ```name``` identifies the family of the operating system: ```'nt'``` on Windows and ```'posix'``` on Linux/Mac. On this Windows machine:

In [10]:
os.name

'nt'

Path related identifiers are grouped in the ```path``` submodule, which is accessed as an attribute of ```os```. It contains the path related attributes. ```curdir``` is the current directory:

In [11]:
os.path.curdir

'.'

```pardir``` is the parent directory:

In [12]:
os.path.pardir

'..'

These have the same values on Windows and Linux so ```.``` and ```..``` are commonly used directly.

```extsep``` is the extension separator which splits the file name from the file extension:

In [13]:
os.path.extsep

'.'

This has the same value as ```curdir``` on Windows and Linux so ```.``` is commonly used for this also.

The main difference is in the separator, as Windows prefers the backslash ```\``` while Linux uses the forward slash ```/```.

In a Python string ```\``` begins an escape sequence, so a literal backslash has to be written as ```\\```:

In [14]:
os.path.sep

'\\'

Windows also recognises the forward slash ```/``` as an alternative separator:

In [15]:
os.path.altsep

'/'

These are commonly used and so are also accessible from ```os``` directly:

In [16]:
os.curdir

'.'

In [17]:
os.pardir

'..'

In [18]:
os.extsep

'.'

In [19]:
os.sep

'\\'

In [20]:
os.altsep

'/'

```linesep``` is the line separator used in text files. Recall this is ```'\r\n'``` on Windows and ```'\n'``` on Linux/Mac:

In [21]:
os.linesep

'\r\n'

On this Windows computer my ```%USERPROFILE%``` is:

In [22]:
'C:\\Users\\Philip'

'C:\\Users\\Philip'

This can be simplified using a raw string:

In [23]:
r'C:\Users\Philip'

'C:\\Users\\Philip'

Alternatively it could be constructed using:

In [24]:
'C:' + os.sep + 'Users' + os.sep + 'Philip' 

'C:\\Users\\Philip'

Using ```os.sep``` is slightly more reliable than manually typing backslashes, as it is easy to miss one or add an extra one. The ```join``` function from the ```os.path``` module will automatically include separators:

In [25]:
os.path.join('C:\\', 'Users', 'Philip')

'C:\\Users\\Philip'

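```join``` also copes with the stray leading separator below. A component that begins with a separator is treated as rooted, so it replaces any directory components before it (on Windows the drive letter is kept) rather than being appended with a doubled separator: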

In [26]:
os.path.join('C:\\', '\\Users', 'Philip')

'C:\\Users\\Philip'
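The way `join` treats a component that begins with a separator can be checked on any platform by calling the `ntpath` and `posixpath` modules directly, since `os.path` is one or the other depending on the operating system. The following is a minimal sketch; the example paths are illustrative only and do not need to exist:

```python
import ntpath      # the Windows flavour of os.path
import posixpath   # the Linux/Mac flavour of os.path

# Plain components are joined with the platform separator
win_plain = ntpath.join('C:\\', 'Users', 'Philip')

# A component starting with a separator is rooted: the earlier directory
# components are discarded, but the drive letter is kept on Windows
win_rooted = ntpath.join('C:\\Users', '\\Temp', 'file.txt')

# On Linux/Mac a rooted component discards everything before it
posix_rooted = posixpath.join('/home', '/etc', 'hosts')

print(win_plain)
print(win_rooted)
print(posix_rooted)
```

This is worth keeping in mind when a path component comes from user input, because a leading separator silently changes where the joined path points.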

## Environmental Variables

Hardcoding an absolute path like the one above is bad practice. Code that looks for a file in this absolute path will work on my computer, but it won't work when copied to your computer because your home directory ```~``` (```%USERPROFILE%``` on Windows or ```$HOME``` on Linux/Mac) will be different. The ```os``` module has a dictionary-like attribute ```environ``` which is used to access environmental variables. These are named strings set by the operating system and the user, many of which hold locations such as the user profile:

In [27]:
os.environ

environ{'ALLUSERSPROFILE': 'C:\\ProgramData',
        'APPDATA': 'C:\\Users\\Philip\\AppData\\Roaming',
        'CHROME_CRASHPAD_PIPE_NAME': '\\\\.\\pipe\\crashpad_5432_ORIIKWFUXKPDSOCB',
        'COMMONPROGRAMFILES': 'C:\\Program Files\\Common Files',
        'COMMONPROGRAMFILES(X86)': 'C:\\Program Files (x86)\\Common Files',
        'COMMONPROGRAMW6432': 'C:\\Program Files\\Common Files',
        'COMPUTERNAME': 'XPS-9305',
        'COMSPEC': 'C:\\WINDOWS\\system32\\cmd.exe',
        'CONDA_DEFAULT_ENV': 'vscode',
        'CONDA_EXE': 'C:\\Users\\Philip\\anaconda3\\Scripts\\conda.exe',
        'CONDA_PREFIX': 'c:\\Users\\Philip\\Miniconda3\\envs\\vscode',
        'CONDA_PROMPT_MODIFIER': '(vscode) ',
        'CONDA_PYTHON_EXE': 'C:\\Users\\Philip\\anaconda3\\python.exe',
        'CONDA_ROOT': 'C:\\Users\\Philip\\anaconda3',
        'CONDA_SHLVL': '1',
        'DRIVERDATA': 'C:\\Windows\\System32\\Drivers\\DriverData',
        'EFC_9860': '1',
        'ELECTRON_RUN_AS_NODE': '1',
    

The function ```getenv``` reads an environmental variable from this dictionary although it is more common to index into the dictionary using the key.
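The practical difference between the two approaches shows up when a variable is missing. The sketch below uses a hypothetical variable name, `HELPER_DEMO_UNSET_VARIABLE`, which is assumed not to be set on any machine:

```python
import os

# Hypothetical variable name used only for this demonstration
missing = 'HELPER_DEMO_UNSET_VARIABLE'
os.environ.pop(missing, None)  # make sure it really is absent

# getenv returns None, or a supplied default, for a missing variable
value = os.getenv(missing)
fallback = os.getenv(missing, 'not set')

# Indexing into the dictionary raises KeyError for a missing variable
try:
    os.environ[missing]
    raised = False
except KeyError:
    raised = True

print(value, fallback, raised)
```

Indexing is a reasonable choice when the variable must exist, and `getenv` when a fallback value is acceptable.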

Windows and Linux use different names for their environmental variables, so the keys of the ```os.environ``` dictionary differ between platforms. On Windows the user name is stored under the key ```'USERNAME'```, while on Linux/Mac it is stored under the key ```'USER'```.

A check can be made on ```os.name``` and the appropriate environmental variable read:

In [28]:
if os.name == 'nt':
    # if Windows
    name = os.environ['USERNAME']
else:
    # else Linux/Mac
    name = os.environ['USER']
    
name

'Philip'

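The path to the user profile can then be constructed, assuming the default locations ```C:\Users``` on Windows and ```/home``` on Linux (macOS uses ```/Users``` instead, so this construction is fragile):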

In [29]:
if os.name == 'nt':
    # if Windows
    home = os.path.join('C:\\', 'Users', os.environ['USERNAME'])
else:
    # else Linux/Mac
    home = os.path.join(os.sep + 'home', os.environ['USER'])
    
home

'C:\\Users\\Philip'

More reliably, the user profile can be read directly using the key ```'USERPROFILE'``` on Windows or ```'HOME'``` on Linux/Mac:

In [30]:
if os.name == 'nt':
    # if Windows
    home = os.environ['USERPROFILE']
else:
    # else Linux/Mac
    home = os.environ['HOME']
    
home

'C:\\Users\\Philip'

This can be used in the ```join``` function from ```os.path``` to get to Documents:

In [31]:
if os.name == 'nt':
    # if Windows
    home = os.environ['USERPROFILE']
else:
    # else Linux/Mac
    home = os.environ['HOME']
    
documents = os.path.join(home, 'Documents')
documents

'C:\\Users\\Philip\\Documents'

Alternatively the ```os.path``` function ```expanduser``` can be used. It expands a leading ```'~'``` to ```USERPROFILE``` on Windows and ```HOME``` on Linux/Mac:

In [32]:
os.path.expanduser('~') 

'C:\\Users\\Philip'

Care needs to be taken with separators when building the rest of the path:

In [33]:
os.path.expanduser('~' + os.sep + 'Documents') 

'C:\\Users\\Philip\\Documents'
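One way to avoid handling separators by hand is to expand `'~'` on its own and leave the separators to `join`. As a sketch, `normpath` can also be applied afterwards to tidy a path that was written with forward slashes; the `Documents` folder is only an example and is not checked for existence:

```python
import os

# Expand '~' on its own, then let join supply the separator
home = os.path.expanduser('~')
documents = os.path.join(home, 'Documents')

# A path written with '/' can be normalised to the platform separator
normalised = os.path.normpath(os.path.expanduser('~/Documents'))

print(documents)
print(normalised)
```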

In Windows, environmental variables are normally written in upper case and enclosed in ```%```. The ```os.path``` function ```expandvars``` can be used to expand these locations:

In [34]:
os.path.expandvars('%USERPROFILE%' + os.sep + 'Documents')

'C:\\Users\\Philip\\Documents'
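The `%NAME%` form is only expanded on Windows. The `$NAME` and `${NAME}` forms are expanded by `expandvars` on both Windows and Linux/Mac, which makes them the more portable choice. The sketch below sets a hypothetical variable, `DEMO_FOLDER`, purely for the demonstration:

```python
import os

# Hypothetical variable, set only for this demonstration
os.environ['DEMO_FOLDER'] = 'demo'

# The $NAME and ${NAME} forms work on Windows and on Linux/Mac
expanded = os.path.expandvars(os.path.join('$DEMO_FOLDER', 'Documents'))
braced = os.path.expandvars('${DEMO_FOLDER}_backup')

# A variable that is not set is left unchanged instead of raising an error
os.environ.pop('DEMO_UNSET', None)
unchanged = os.path.expandvars('$DEMO_UNSET')

print(expanded)
print(braced)
print(unchanged)
```

Because an unset variable is left in place silently, it can be worth checking the result for a leftover `$` or `%` before using it as a path.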

The current working directory can be found using the ```os``` function ```getcwd```:

In [35]:
os.getcwd()

'c:\\Users\\Philip\\Documents\\GitHub\\python-notebooks\\os_module'

## File Operations

The files and subdirectories in this directory can be listed using the ```os``` function ```listdir```:

In [36]:
os.listdir()

['helper_module.py', 'images', 'notebook.ipynb', 'text.txt', '__pycache__']

This will by default be the folder containing the Interactive Python Notebook file. The directory can be changed using the ```os``` function ```chdir``` for example to the parent directory using ```..``` or ```os.pardir```:

In [37]:
os.chdir(os.pardir)

In [38]:
os.getcwd()

'c:\\Users\\Philip\\Documents\\GitHub\\python-notebooks'

This parent directory can be assigned to a variable using the ```os``` function ```getcwd```:

In [39]:
parent = os.getcwd()
parent

'c:\\Users\\Philip\\Documents\\GitHub\\python-notebooks'

And the ```os.path``` function ```join``` can be used to join this with the folder and the name of the notebook itself:

In [40]:
notebook_path = os.path.join(parent, 'os_module', 'notebook' + os.extsep + 'ipynb')
notebook_path

'c:\\Users\\Philip\\Documents\\GitHub\\python-notebooks\\os_module\\notebook.ipynb'

The ```os.path``` function ```exists``` can be used to check whether a file exists returning a boolean:

In [41]:
os.path.exists(notebook_path)

True
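`exists` returns `True` for both files and directories. When the distinction matters, the `os.path` functions `isfile` and `isdir` can be used instead. A minimal sketch, using a temporary directory from the standard `tempfile` module so that nothing is left behind:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, 'text.txt')
    with open(file_path, mode='w', encoding='utf-8') as file:
        file.write('Hello World!')

    checks = {
        'folder exists': os.path.exists(folder),
        'folder is a directory': os.path.isdir(folder),
        'folder is a file': os.path.isfile(folder),
        'file exists': os.path.exists(file_path),
        'file is a file': os.path.isfile(file_path),
    }

# The temporary directory and its contents are deleted on leaving the block
file_exists_after = os.path.exists(file_path)

for description, result in checks.items():
    print(description, result)
print('file exists afterwards', file_exists_after)
```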

The ```os.path``` function ```split``` returns a ```tuple``` where the first element is the directory of the file and the second element is the file including the file extension:

In [42]:
os.path.split(notebook_path)

('c:\\Users\\Philip\\Documents\\GitHub\\python-notebooks\\os_module',
 'notebook.ipynb')

In [43]:
file_path, file = os.path.split(notebook_path)

In [44]:
file_path

'c:\\Users\\Philip\\Documents\\GitHub\\python-notebooks\\os_module'

In [45]:
file

'notebook.ipynb'

The ```os.path``` function ```splitext``` splits a file path from its file extension, once again returning a 2 element ```tuple``` of the file path including the file name and the extension respectively:

In [46]:
os.path.splitext(file)

('notebook', '.ipynb')

In [47]:
os.path.splitext(notebook_path)

('c:\\Users\\Philip\\Documents\\GitHub\\python-notebooks\\os_module\\notebook',
 '.ipynb')
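Two details of `splitext` are easy to overlook: only the last extension is split off, and a leading dot is treated as part of the file name. The sketch below also shows `dirname` and `basename`, which return the two halves of `split` individually. The file names are illustrative only:

```python
import os

path = os.path.join('folder', 'archive.tar.gz')

# Only the last extension is split off
root, ext = os.path.splitext(path)

# A leading dot is treated as part of the name, not as an extension
hidden = os.path.splitext('.gitignore')

# dirname and basename return the two halves of split individually
halves = (os.path.dirname(path), os.path.basename(path))

print(root, ext)
print(hidden)
print(halves)
```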

The current working directory can be changed to the folder of this notebook file:

In [48]:
os.chdir(file_path)

The contents can be listed using:

In [49]:
os.listdir()

['helper_module.py', 'images', 'notebook.ipynb', 'text.txt', '__pycache__']

A directory can be made using the ```os``` function ```mkdir``` (make directory). Here a check is made first and the directory is only created if it doesn't already exist:

In [50]:
if not os.path.exists('directory1'):
    os.mkdir('directory1')

In [51]:
if not os.path.exists('directory2'):
    os.mkdir('directory2')

In [52]:
os.listdir()

['directory1',
 'directory2',
 'helper_module.py',
 'images',
 'notebook.ipynb',
 'text.txt',
 '__pycache__']

```'directory2'``` can be removed using the ```os``` function ```rmdir``` (remove directory):

In [53]:
os.rmdir('directory2')

In [54]:
os.listdir()

['directory1',
 'helper_module.py',
 'images',
 'notebook.ipynb',
 'text.txt',
 '__pycache__']

The current working directory can be changed to ```directory1```:

In [55]:
os.chdir('directory1')

In [56]:
os.listdir()

[]

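And a file can be created, this time using ```newline=os.linesep``` to state the line ending of the operating system explicitly. When writing, this has the same effect as the default ```newline=None```, where each ```'\n'``` written is translated to ```os.linesep```: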

In [57]:
with open('text.txt', mode='w', encoding='utf-8', errors='strict', newline=os.linesep) as file:
    file.write('Hello World!\nBye World!')
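The effect of the `newline` argument can be seen by reading the file back in binary mode. The sketch below passes the two line endings explicitly instead of using `os.linesep`, so that the result is the same on every platform; the file is written to a temporary directory:

```python
import os
import tempfile

text = 'Hello World!\nBye World!'

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'text.txt')

    # newline='\r\n' translates each '\n' written into '\r\n'
    with open(path, mode='w', encoding='utf-8', newline='\r\n') as file:
        file.write(text)
    with open(path, mode='rb') as file:
        windows_style = file.read()

    # newline='\n' writes the text without any translation
    with open(path, mode='w', encoding='utf-8', newline='\n') as file:
        file.write(text)
    with open(path, mode='rb') as file:
        linux_style = file.read()

print(windows_style)
print(linux_style)
```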

A Python file can be created in the same manner:

In [58]:
with open('script.py', mode='w', encoding='utf-8', errors='strict', newline=os.linesep) as file:
    file.write("print('Hello World!')\n")

These files can be seen:

In [59]:
os.listdir()

['script.py', 'text.txt']

If the parent folder is selected using:

In [60]:
os.chdir(os.pardir)

If ```rmdir``` is now used to try to delete ```directory1```, an ```OSError: The directory is not empty``` will be raised.
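The error can be reproduced safely in a temporary directory. This is a minimal sketch; the exact wording of the error message depends on the operating system, so only the exception type is relied upon:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as parent:
    folder = os.path.join(parent, 'directory1')
    os.mkdir(folder)
    with open(os.path.join(folder, 'text.txt'), mode='w', encoding='utf-8') as file:
        file.write('Hello World!')

    # rmdir refuses to delete a directory that still contains a file
    try:
        os.rmdir(folder)
        error = None
    except OSError as exception:
        error = exception

    still_there = os.path.isdir(folder)

print(type(error).__name__, still_there)
```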

This is a safeguard to make sure files aren't accidentally deleted along with the directory. Individual files can be deleted using the ```os``` function ```remove```:

In [61]:
os.remove('directory1' + os.sep + 'text.txt')
os.remove('directory1' + os.sep + 'script.py')

And now because it is empty it can be deleted:

In [62]:
os.rmdir('directory1')

The ```os``` module also has the more powerful ```makedirs``` which creates a directory along with any missing intermediate directories in its path:

In [63]:
os.makedirs('directory1' + os.sep + 'subdirectory1')

In [64]:
os.listdir()

['directory1',
 'helper_module.py',
 'images',
 'notebook.ipynb',
 'text.txt',
 '__pycache__']

In [65]:
os.listdir('directory1')

['subdirectory1']

In [66]:
os.listdir('directory1' + os.sep + 'subdirectory1')

[]
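`makedirs` also accepts an `exist_ok` argument, which can replace a separate check with `os.path.exists`. The following sketch works in a temporary directory and shows that a repeated call only raises an error when `exist_ok` is left at its default of `False`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as parent:
    nested = os.path.join(parent, 'directory1', 'subdirectory1')

    # Creates directory1 and subdirectory1 in one call
    os.makedirs(nested, exist_ok=True)
    # Calling again is harmless because exist_ok=True
    os.makedirs(nested, exist_ok=True)

    # Without exist_ok an existing directory raises FileExistsError
    try:
        os.makedirs(nested)
        raised = False
    except FileExistsError:
        raised = True

    created = os.path.isdir(nested)

print(created, raised)
```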

The ```os``` function ```removedirs``` removes the directory at the end of a path and then each parent directory in turn, stopping at the first one that is not empty:

In [67]:
os.removedirs('directory1' + os.sep + 'subdirectory1')
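Because `removedirs` keeps removing empty parent directories, it is worth knowing where it will stop. In the sketch below a hypothetical file, `keep.txt`, is placed in the top folder so that the pruning stops there:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as parent:
    # A file in the top folder keeps it non-empty, so pruning stops here
    with open(os.path.join(parent, 'keep.txt'), mode='w', encoding='utf-8') as file:
        file.write('keep')

    nested = os.path.join(parent, 'directory1', 'subdirectory1')
    os.makedirs(nested)

    # Removes subdirectory1, then directory1, then stops at the non-empty parent
    os.removedirs(nested)

    directory1_exists = os.path.exists(os.path.join(parent, 'directory1'))
    parent_exists = os.path.isdir(parent)

print(directory1_exists, parent_exists)
```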

The ```os``` function ```replace``` moves a source to a destination, silently overwriting the destination if it is an existing file. It can therefore be used to rename a file or directory, or to move it to another location:

In [68]:
if not os.path.exists('directory1'):
    os.makedirs('directory1' + os.sep + 'subdirectory1')

In [69]:
file_path = os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'script.py')

with open(file_path, mode='w', encoding='utf-8', errors='strict', newline=os.linesep) as file:
    file.write("print('Hello World!')\n")

In [70]:
os.listdir('directory1')

['subdirectory1']

In [71]:
os.listdir('directory1' + os.sep + 'subdirectory1')

['script.py']

This script file can be renamed using:

In [72]:
source =  os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'script.py')
destination = os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'pscript.py')
os.replace(source, destination)

In [73]:
os.listdir('directory1')

['subdirectory1']

In [74]:
os.listdir('directory1' + os.sep + 'subdirectory1')

['pscript.py']

In [75]:
source = os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'pscript.py')
destination = os.path.join(os.getcwd(), 'directory1', 'script.py')
os.replace(source, destination)

In [76]:
os.listdir('directory1')

['script.py', 'subdirectory1']

In [77]:
os.listdir('directory1' + os.sep + 'subdirectory1')

[]
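Unlike the examples above, the destination may already exist. The sketch below, which uses two illustrative files in a temporary directory, shows that `replace` overwrites an existing destination file without raising an error:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, 'new.txt')
    destination = os.path.join(folder, 'old.txt')

    with open(source, mode='w', encoding='utf-8') as file:
        file.write('new')
    with open(destination, mode='w', encoding='utf-8') as file:
        file.write('old')

    # The existing destination is overwritten and the source disappears
    os.replace(source, destination)

    source_exists = os.path.exists(source)
    with open(destination, mode='r', encoding='utf-8') as file:
        content = file.read()

print(source_exists, content)
```

If overwriting is not wanted, a check with `os.path.exists` on the destination can be made before calling `replace`.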

Another Python script file can be created:

In [78]:
file_path = os.path.join(os.getcwd(), 'directory1', 'subdirectory1', 'script1.py')

with open(file_path, mode='w', encoding='utf-8', errors='strict', newline=os.linesep) as file:
    file.write("print('Hello World!')\n")

The contents can be checked one directory at a time using the ```os``` function ```listdir```:

In [79]:
os.listdir('directory1')

['script.py', 'subdirectory1']

In [80]:
os.listdir('directory1' + os.sep + 'subdirectory1')

['script1.py']

## walk function

The ```os``` function ```walk``` can be used to create a generator:

In [81]:
forward = os.walk('directory1')
forward

<generator object _walk at 0x0000020A3F3036E0>

Each call to ```next``` yields a three element ```tuple``` containing the path of a directory, a list of its subdirectories and a list of its files:

In [82]:
next(forward)

('directory1', ['subdirectory1'], ['script.py'])

In [83]:
next(forward)

('directory1\\subdirectory1', [], ['script1.py'])

This is typically used in a loop:

In [84]:
top = os.walk('directory1')
for root, dirs, files in top:
    print(root, end='\n')
    print('\t', dirs)
    print('\t', files)

directory1
	 ['subdirectory1']
	 ['script.py']
directory1\subdirectory1
	 []
	 ['script1.py']
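A common use of this loop is to collect every file of a given type in a directory tree. The following is a sketch only: `find_files` is a hypothetical helper written for this example, and the tree it searches is built in a temporary directory to mirror the layout above:

```python
import os
import tempfile


def find_files(top, extension):
    """Return the sorted paths, relative to top, of files with the extension."""
    matches = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if os.path.splitext(name)[1] == extension:
                full_path = os.path.join(root, name)
                matches.append(os.path.relpath(full_path, top))
    return sorted(matches)


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'subdirectory1'))
    names = ('script.py', 'text.txt', os.path.join('subdirectory1', 'script1.py'))
    for relative in names:
        with open(os.path.join(top, relative), mode='w', encoding='utf-8') as file:
            file.write("print('Hello World!')\n")

    scripts = find_files(top, '.py')

print(scripts)
```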


The ```topdown``` input argument can be assigned to ```False``` so that each directory is yielded after its subdirectories, showing the deepest paths first:

In [85]:
top = os.walk('directory1', topdown=False)
for root, dirs, files in top:
    print(root, end='\n')
    print('\t', dirs)
    print('\t', files)

directory1\subdirectory1
	 []
	 ['script1.py']
directory1
	 ['subdirectory1']
	 ['script.py']


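This can be combined with nested ```for``` loops, for example to recursively delete all files and subdirectories in a directory. Using ```topdown=False``` means each subdirectory is emptied before ```rmdir``` is called on it: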

In [86]:
for root, dirs, files in os.walk('directory1', topdown=False):
    for name in files:
        os.remove(os.path.join(root, name))
    for name in dirs:
        os.rmdir(os.path.join(root, name))
        
os.rmdir('directory1')
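The same loop can be wrapped in a function and tried out on a throwaway tree. In the sketch below `remove_tree` and `make_tree` are hypothetical helpers written for this example. The standard library function `shutil.rmtree`, from a separate module not otherwise covered here, is shown alongside as a ready-made alternative:

```python
import os
import shutil
import tempfile


def remove_tree(top):
    """Delete top and everything inside it, deepest directories first."""
    for root, dirs, files in os.walk(top, topdown=False):
        for name in files:
            os.remove(os.path.join(root, name))
        for name in dirs:
            os.rmdir(os.path.join(root, name))
    os.rmdir(top)


def make_tree(parent, name):
    """Create name/subdirectory1 with one script in each and return its path."""
    top = os.path.join(parent, name)
    os.makedirs(os.path.join(top, 'subdirectory1'))
    for relative in ('script.py', os.path.join('subdirectory1', 'script1.py')):
        with open(os.path.join(top, relative), mode='w', encoding='utf-8') as file:
            file.write("print('Hello World!')\n")
    return top


with tempfile.TemporaryDirectory() as parent:
    first = make_tree(parent, 'directory1')
    remove_tree(first)
    walk_removed = not os.path.exists(first)

    second = make_tree(parent, 'directory2')
    shutil.rmtree(second)
    rmtree_removed = not os.path.exists(second)

print(walk_removed, rmtree_removed)
```

Both approaches delete without confirmation, so the path passed in deserves a careful check first.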