# The OS Module

In [1]:
%%html
<iframe width="840" height="473" src="https://www.youtube.com/embed/STXUU1yB30w" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Import the module

In [1]:
import os

Find the current working directory. This is the folder that Python refers to as "root", i.e. the starting point for relative paths.<br/> In Jupyter it is, by default, also the physical location of the notebook

In [7]:
os.getcwd()

'd:\\Python\\MyProject'

We can change the working directory with the `chdir` (change directory) function.<br/>
Unless it is changed again, the new setting applies for the rest of the session<br/>
Use the `r` prefix to create a "raw" string, so that backslashes in Windows paths are not interpreted as escape sequences (such as `\n` for a newline)

In [9]:
os.chdir(r'D:\Python\MyOtherProject')
os.getcwd()

'D:\\Python\\MyOtherProject'
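
Because `chdir` affects the rest of the session, it can be useful to switch folders only temporarily and then go back. The following is a minimal sketch of that idea; the temporary folder is created with the standard `tempfile` module purely for the demonstration and is not part of the original notebook.

```python
import os
import tempfile

original_dir = os.getcwd()               # remember where we started

with tempfile.TemporaryDirectory() as tmp_dir:   # throwaway folder for the demo
    try:
        os.chdir(tmp_dir)
        inside_dir = os.getcwd()         # now points at the temporary folder
    finally:
        os.chdir(original_dir)           # always go back, even if an error occurred

restored_dir = os.getcwd()
print(restored_dir == original_dir)
```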

A non-existent path raises an error

In [10]:
os.chdir(r'D:\Python\NewProject')

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'D:\\Python\\NewProject'
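
If we prefer the program to keep running, the error can be caught with `try`/`except`. This is only a sketch: the folder name `no_such_folder_for_demo` is a hypothetical name assumed not to exist.

```python
import os

# Hypothetical folder name, assumed not to exist on this machine
missing_folder = os.path.join(os.getcwd(), 'no_such_folder_for_demo')

try:
    os.chdir(missing_folder)
    changed = True
except FileNotFoundError as error:
    changed = False                      # the working directory stays as it was
    print(f'Could not change directory: {error}')
```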

We can check whether a directory exists with the `path.exists` function.

In [12]:
print(os.path.exists(r'D:\Python\MyProject'))
print(os.path.exists(r'D:\Python\NewProject'))

True
False


The `mkdir` function creates a new directory.<br/>
The following code checks whether the path exists. If it does, the code makes it the root; if not, it creates the folder first and then makes it the root

In [4]:
folder = r'D:\Python\NewProject'
if os.path.exists(folder):
    os.chdir(folder)
else:
    os.mkdir(folder)
    os.chdir(folder)
    
os.getcwd()

'D:\\Python\\NewProject'

Use the `listdir` function to get the contents of a folder as a list.<br/>
The default path is `'.'`, which refers to the current location (the current root).
In this case the folder is empty because we just created it.

In [16]:
os.listdir()

[]

But we can give the function any other location, as a relative or absolute path. Let's go up one folder and then down into the original folder that was used in this notebook

In [18]:
os.listdir(r'..\MyProject')

['file1.txt', 'file2.txt', 'file3.txt']

`mkdir` can create a single folder inside an already existing folder, but if we try to create a deeper path whose parent folders do not exist yet, we get an error

In [21]:
os.mkdir(r'NewFolder\NewSubFolder')

FileNotFoundError: [WinError 3] The system cannot find the path specified: 'NewFolder/NewSubFolder'

This is where we can use `makedirs` to create our nested folder together with all the folders above it that do not exist yet

In [35]:
os.makedirs(r'NewFolder\NewSubFolder')
os.listdir()

['NewFolder']
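
`makedirs` also accepts an `exist_ok` argument. With `exist_ok=True` the call does not fail when the folder is already there, which can replace a separate "check, then create" step. A minimal sketch, using a temporary folder from the `tempfile` module as a stand-in for the project folder:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    nested = os.path.join(base_dir, 'NewFolder', 'NewSubFolder')
    os.makedirs(nested, exist_ok=True)   # creates both levels
    os.makedirs(nested, exist_ok=True)   # second call is a no-op, not an error
    created = os.path.isdir(nested)
    top_level = os.listdir(base_dir)

print(created, top_level)
```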

We can remove a single, empty folder with `rmdir` (we get an error if it contains files or nested folders)

In [36]:
os.rmdir(r'NewFolder\NewSubFolder')
os.listdir()

['NewFolder']
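
The sketch below illustrates the "empty folders only" rule: removing a folder that still has content raises an `OSError`, so the inner folder has to go first. The folder names mirror the ones above, but they are created inside a temporary folder so the example does not touch real data.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    folder = os.path.join(base_dir, 'NewFolder')
    os.makedirs(os.path.join(folder, 'NewSubFolder'))

    try:
        os.rmdir(folder)                 # fails: folder still contains NewSubFolder
        removed_first_try = True
    except OSError:
        removed_first_try = False

    os.rmdir(os.path.join(folder, 'NewSubFolder'))  # remove the inner folder first
    os.rmdir(folder)                                # now the outer one is empty
    still_there = os.path.exists(folder)

print(removed_first_try, still_there)
```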

NewFolder still exists (only the last level was deleted)<br/><br/>

We can rename files or folders with the `rename` function. If it's a file we have to remember to include the file extension as well

In [42]:
os.rename('NewFolder', 'OldFolder')
os.listdir()

['OldFolder']

We can get more information about specific files using `stat`

In [54]:
print(os.listdir(r'..\MyProject'))
file_info = os.stat(r'..\MyProject\file1.txt')
print(type(file_info))

['file1.txt', 'file2.txt', 'file3.txt']
<class 'os.stat_result'>


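Now we can view details such as the file's size, creation date or modification date. Note that `st_ctime` is the creation time on Windows, while on Unix systems it is the time of the last metadata change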

In [55]:
print(f'size in bytes:\t{file_info.st_size}')
print(f'file creation time:\t{file_info.st_ctime}')
print(f'file\'s last modification time:\t{file_info.st_mtime}')

size in bytes:	35428
file creation time:	1573588101.0728343
file's last modification time:	1573599480.3227842


These timestamps are not very readable. We can fix that by converting them into readable dates

In [58]:
import datetime as dt

creation_time = dt.datetime.fromtimestamp(file_info.st_ctime)
print(f'file creation time:\t{creation_time}')

mod_time = dt.datetime.fromtimestamp(file_info.st_mtime)
print(f'file\'s last modification time:\t{mod_time}')

file creation time:	2019-11-12 21:48:21.072834
file's last modification time:	2019-11-13 00:58:00.322784
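
The same values can be tidied further, for example by showing the size in kilobytes and formatting the date with `strftime`. This is a sketch on a small file created just for the demonstration; the format string is one possible choice, not the only one.

```python
import datetime as dt
import os
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    file_path = os.path.join(base_dir, 'file1.txt')
    with open(file_path, 'wb') as handle:
        handle.write(b'x' * 2048)        # write exactly 2048 bytes

    info = os.stat(file_path)
    size_kb = info.st_size / 1024        # bytes -> kilobytes
    modified = dt.datetime.fromtimestamp(info.st_mtime)
    modified_text = modified.strftime('%Y-%m-%d %H:%M')  # drop seconds and microseconds

print(f'size: {size_kb:.1f} KB')
print(f'last modified: {modified_text}')
```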


Get all the names and values of the system's environment variables with `environ` (it behaves like a dictionary, so looping over it yields the variable names)

In [6]:
for var_name in os.environ:
    print(var_name)

ALLUSERSPROFILE
APPDATA
ASL.LOG
COMMONPROGRAMFILES
COMMONPROGRAMFILES(X86)
COMMONPROGRAMW6432
COMPUTERNAME
COMSPEC
DASHLANE_DLL_DIR
DRIVERDATA
FPS_BROWSER_APP_PROFILE_STRING
FPS_BROWSER_USER_PROFILE_STRING
HOMEDRIVE
HOMEPATH
JD2_HOME
LOCALAPPDATA
LOGONSERVER
MSMPI_BIN
NUMBER_OF_PROCESSORS
ONEDRIVE
ONEDRIVECONSUMER
OS
PATH
PATHEXT
PROCESSOR_ARCHITECTURE
PROCESSOR_ARCHITEW6432
PROCESSOR_IDENTIFIER
PROCESSOR_LEVEL
PROCESSOR_REVISION
PROGRAMDATA
PROGRAMFILES
PROGRAMFILES(X86)
PROGRAMW6432
PROMPT
PSMODULEPATH
PUBLIC
SESSIONNAME
SYSTEMDRIVE
SYSTEMROOT
TEMP
TMP
USERDOMAIN
USERDOMAIN_ROAMINGPROFILE
USERNAME
USERPROFILE
VBOX_MSI_INSTALL_PATH
WINDIR
JPY_INTERRUPT_EVENT
IPY_INTERRUPT_EVENT
JPY_PARENT_PID
TERM
CLICOLOR
PAGER
GIT_PAGER
MPLBACKEND


Like any dictionary, we can use the `get` method to get the value of a specific variable

In [8]:
os.environ.get('HOMEPATH')

'\\Users\\eladp'
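
The advantage of `get` over square brackets is that a missing variable does not raise an error; we can also supply a fallback value. A short sketch, where `SURELY_NOT_DEFINED_VARIABLE_123` is a hypothetical name assumed not to be set:

```python
import os

# Hypothetical variable name, assumed not to be defined
name = 'SURELY_NOT_DEFINED_VARIABLE_123'

fallback = os.environ.get(name, 'not set')   # returns the default instead of failing

try:
    os.environ[name]                     # square brackets raise KeyError when missing
    raised = False
except KeyError:
    raised = True

print(fallback, raised)
```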

## walk
`walk` takes a folder path and traverses the entire directory tree under it. It returns an iterable of 3-value tuples, one for each directory it finds, recursively: the directory's path, the names of the folders directly inside it and the names of the files directly inside it

In [28]:
path = r'..\MyProject'
for current_dir, nested_dirs, file_names in os.walk(path):
    print(f'Current Directory Name: {current_dir}')
    print(f'Nested Directories: {nested_dirs}')
    print(f'Files: {file_names}\n\n')

Current Directory Name: ..\MyProject
Nested Directories: ['Project Files']
Files: ['file1.txt', 'file2.txt', 'file3.txt']


Current Directory Name: ..\MyProject\Project Files
Nested Directories: ['Folder1', 'Folder2', 'Folder3']
Files: []


Current Directory Name: ..\MyProject\Project Files\Folder1
Nested Directories: []
Files: ['file1.1.txt', 'file1.2.txt', 'file1.3.txt']


Current Directory Name: ..\MyProject\Project Files\Folder2
Nested Directories: []
Files: ['file2.1.txt', 'file2.2.txt', 'file2.3.txt']


Current Directory Name: ..\MyProject\Project Files\Folder3
Nested Directories: []
Files: ['file3.1.txt', 'file3.2.txt', 'file3.3.txt']




## path
The `path` submodule contains a lot of useful functions

Use `path.join` to easily generate valid path strings

In [8]:
os.path.join(os.getcwd(), 'file_name.txt')

'D:\\Python\\NewProject\\file_name.txt'
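
`path.join` accepts any number of parts and inserts the separator that is correct for the operating system the code runs on. A small sketch comparing it with joining the parts manually using `os.sep`:

```python
import os

parts = ['MyProject', 'Project Files', 'Folder2', 'file2.1.txt']

joined = os.path.join(*parts)            # separator chosen by the OS
manual = os.sep.join(parts)              # same result, built by hand

print(joined)
print(joined == manual)
```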

Print the full path of every file in the directory tree under `path`

In [61]:
for current_folder, nested_folders, files in os.walk(path):
    for file in files:
        print(os.path.join(current_folder, file))

..\MyProject\file1.txt
..\MyProject\file2.txt
..\MyProject\file3.txt
..\MyProject\Project Files\Folder1\file1.1.txt
..\MyProject\Project Files\Folder1\file1.2.txt
..\MyProject\Project Files\Folder1\file1.3.txt
..\MyProject\Project Files\Folder2\file2.1.txt
..\MyProject\Project Files\Folder2\file2.2.txt
..\MyProject\Project Files\Folder2\file2.3.txt
..\MyProject\Project Files\Folder3\file3.1.txt
..\MyProject\Project Files\Folder3\file3.2.txt
..\MyProject\Project Files\Folder3\file3.3.txt
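
Combining `walk` with `path.join` also makes it possible to compute something over the whole tree, such as the number of files and their total size. The sketch below builds a small tree in a temporary folder (the names only imitate the ones above) and uses `os.path.getsize`, which returns a file's size in bytes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    # Build a small tree: 1 file at the top level, 2 files in a nested folder
    nested_dir = os.path.join(base_dir, 'Project Files', 'Folder1')
    os.makedirs(nested_dir)
    for folder, name in [(base_dir, 'file1.txt'),
                         (nested_dir, 'file1.1.txt'),
                         (nested_dir, 'file1.2.txt')]:
        with open(os.path.join(folder, name), 'w') as handle:
            handle.write('hello')        # 5 bytes per file

    file_count = 0
    total_bytes = 0
    for current_dir, nested_dirs, file_names in os.walk(base_dir):
        for name in file_names:
            file_count += 1
            total_bytes += os.path.getsize(os.path.join(current_dir, name))

print(file_count, total_bytes)
```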


`basename` returns only the last component of a path

In [57]:
print(os.path.basename(r'..\MyProject\Project Files\Folder2'))
print(os.path.basename(r'..\MyProject\Project Files\Folder2\file2.1.txt'))

Folder2
file2.1.txt


`dirname` does the opposite and returns everything up to, but not including, the last component

In [55]:
print(os.path.dirname(r'..\MyProject\Project Files\Folder2'))
print(os.path.dirname(r'..\MyProject\Project Files\Folder2\file2.1.txt'))

..\MyProject\Project Files
..\MyProject\Project Files\Folder2


`split` returns both parts as a tuple

In [56]:
print(os.path.split(r'..\MyProject\Project Files\Folder2'))
print(os.path.split(r'..\MyProject\Project Files\Folder2\file2.1.txt'))

('..\\MyProject\\Project Files', 'Folder2')
('..\\MyProject\\Project Files\\Folder2', 'file2.1.txt')
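
In other words, `split` gives the same two values as `dirname` and `basename`, and `join` puts them back together. A brief sketch; the path is built with `join` so that it uses the separator of whatever system runs the code:

```python
import os

full_path = os.path.join('MyProject', 'Project Files', 'Folder2', 'file2.1.txt')

head, tail = os.path.split(full_path)
same_parts = (head, tail) == (os.path.dirname(full_path), os.path.basename(full_path))
rebuilt = os.path.join(head, tail)       # joining the two parts restores the path

print(tail)
print(same_parts, rebuilt == full_path)
```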


`exists` works for both folders and files

In [54]:
print(os.path.exists(r'..\MyProject\Project Files\Folder2'))
print(os.path.exists(r'..\MyProject\Project Files\Folder2\file2.1.txt'))
print(os.path.exists(r'..\MyProject\Project Files\Folder2\file999.txt'))

True
True
False


We can check whether a given path points to a folder or a file

In [59]:
print(os.path.isdir(r'..\MyProject\Project Files\Folder2'))
print(os.path.isfile(r'..\MyProject\Project Files\Folder2\file2.1.txt'))

True
True
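
A common use of these checks is to separate the output of `listdir` into folders and files. Since `listdir` returns names only, each name has to be joined with the folder path before testing it. A minimal sketch on a temporary folder with one subfolder and one file:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    os.mkdir(os.path.join(base_dir, 'Folder1'))
    with open(os.path.join(base_dir, 'file1.txt'), 'w') as handle:
        handle.write('demo')

    folders, files = [], []
    for name in os.listdir(base_dir):
        full_path = os.path.join(base_dir, name)   # listdir returns names only
        if os.path.isdir(full_path):
            folders.append(name)
        elif os.path.isfile(full_path):
            files.append(name)

print(folders, files)
```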


`splitext` splits a file path into a tuple: everything before the extension, and the extension itself

In [63]:
print(os.path.splitext(r'..\MyProject\Project Files\Folder2\file2.1.txt'))
print(os.path.splitext(r'..\MyProject\Project Files\Folder2\file2.1.csv'))

('..\\MyProject\\Project Files\\Folder2\\file2.1', '.txt')
('..\\MyProject\\Project Files\\Folder2\\file2.1', '.csv')
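
Note that only the last dot counts as the start of the extension, and a name without a dot gives an empty extension. The sketch below uses a made-up list of file names to show this and to pick out the `.txt` files:

```python
import os

# Made-up file names for illustration
names = ['file2.1.txt', 'report.csv', 'archive.tar.gz', 'README']

# Index 1 of the tuple is the extension (including the dot)
extensions = [os.path.splitext(name)[1] for name in names]

# Keep only text files, ignoring upper/lower case in the extension
txt_files = [name for name in names if os.path.splitext(name)[1].lower() == '.txt']

print(extensions)
print(txt_files)
```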
