# Navigating inside the os



Nice explanations:
- https://www.pythonlearn.com/html-008/cfbook017.html

#### Relevant functions in `os`

- **`os.getcwd()`** returns a string with the path of the current working directory.


- **`os.listdir()`** returns a list containing all files and folders in the current directory.


- **`os.mkdir(dirname)`** creates a directory with name `dirname`.


- **`os.makedirs(dirname)`** creates a directory with name `dirname`. If `dirname` contains several levels of directories, any missing intermediate directories are also created (this does not happen with `os.mkdir`).


- **`os.rmdir(dirname)`** deletes the directory `dirname`, which must be empty.


- **`os.removedirs(dirname)`** deletes the directory `dirname` and then tries to delete each parent directory mentioned in the path, stopping at the first one that is not empty.


- **`os.rename(filename1, filename2)`** renames `filename1` to `filename2`.


- **`os.stat(filename)`** returns information about `filename` such as the file size.


- **`os.walk(path)`** returns a `generator` used to navigate the filesystem tree rooted at `path`. For each directory it visits, the generator yields a tuple of 3 values: `dirpath`, `dirnames` (the subdirectories inside `dirpath`), and `filenames` (the files inside `dirpath`).


- **`os.environ`** is a mapping object (of type `os._Environ`) containing the environment variables. For example, `os.environ.get('HOME')` returns the value of the `HOME` variable, which is typically the home directory.
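
The cells that follow do not exercise `os.rename` and `os.rmdir`, so here is a minimal sketch of both. It runs inside a temporary directory created with `tempfile`, and the names `draft.txt`, `final.txt` and `empty_dir` are made up for illustration.

```python
import os
import tempfile

# Work inside a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    old_name = os.path.join(tmp, "draft.txt")   # hypothetical file name
    new_name = os.path.join(tmp, "final.txt")   # hypothetical file name
    with open(old_name, "w") as fh:
        fh.write("hello")

    # Rename the file and list the directory to confirm the change.
    os.rename(old_name, new_name)
    names_after_rename = os.listdir(tmp)
    print(names_after_rename)

    # os.rmdir only succeeds on an empty directory.
    empty_dir = os.path.join(tmp, "empty_dir")  # hypothetical directory name
    os.mkdir(empty_dir)
    os.rmdir(empty_dir)
    dir_still_exists = os.path.exists(empty_dir)
    print(dir_still_exists)
```

Calling `os.rmdir` on a directory that still contains files raises `OSError`, which is why the directory here is removed right after it is created.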

In [1]:
import os

In [2]:
os.getcwd()

'/Users/david/Documents/git_stuff/python_tutorials/os_communication'

In [3]:
os.listdir()

['.DS_Store', '.ipynb_checkpoints', 'folder_for_tests', 'os_navigation.ipynb']

In [4]:
# List everything inside the given path
path = "./"
print(os.listdir(path))

['.DS_Store', '.ipynb_checkpoints', 'folder_for_tests', 'os_navigation.ipynb']


In [5]:
os.mkdir('created_by_me')

In [6]:
os.listdir()

['.DS_Store',
 '.ipynb_checkpoints',
 'created_by_me',
 'folder_for_tests',
 'os_navigation.ipynb']

In [7]:
# This will not work: B has to be created inside A,
# and A does not exist
os.mkdir('created_by_me/A/B')

FileNotFoundError: [Errno 2] No such file or directory: 'created_by_me/A/B'

In [8]:
os.makedirs('created_by_me/A/B')

In [9]:
os.listdir()

['.DS_Store',
 '.ipynb_checkpoints',
 'created_by_me',
 'folder_for_tests',
 'os_navigation.ipynb']

In [10]:
os.removedirs('created_by_me/A/B')

In [11]:
os.listdir()

['.DS_Store', '.ipynb_checkpoints', 'folder_for_tests', 'os_navigation.ipynb']

In [12]:
filename = './folder_for_tests/A_txt_files/f1.txt'
os.stat(filename)

os.stat_result(st_mode=33188, st_ino=19316662, st_dev=16777220, st_nlink=1, st_uid=502, st_gid=20, st_size=16, st_atime=1514394462, st_mtime=1514394462, st_ctime=1514394462)

In [13]:
print("Size of the file in bytes:", os.stat(filename).st_size)

Size of the file in bytes: 16


In [14]:
from datetime import datetime
modification_time = os.stat(filename).st_mtime
print("The file was modified on:",
      datetime.fromtimestamp(modification_time))

The file was modified on: 2017-12-27 18:07:42


In [15]:
os.environ.get('HOME')

'/Users/david'
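
`HOME` is normally set on macOS and Linux, but a variable is not guaranteed to exist on every system. The sketch below shows how `os.environ.get` behaves for a missing variable, and uses `os.path.expanduser('~')` as one alternative way to look up the home directory. The variable name `SURELY_NOT_DEFINED_12345` is made up and assumed not to be set.

```python
import os

# .get() returns None (or the given default) when the variable is not set,
# whereas os.environ['NAME'] would raise KeyError.
missing = os.environ.get("SURELY_NOT_DEFINED_12345")          # hypothetical name
fallback = os.environ.get("SURELY_NOT_DEFINED_12345", "n/a")
print(missing, fallback)

# expanduser replaces a leading '~' with the current user's home directory.
home = os.path.expanduser("~")
print(home)
```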

#### About os.path

In [16]:
filepath = os.path.join(os.environ.get('HOME'), 'some_file_.txt')
filepath

'/Users/david/some_file_.txt'

In [17]:
os.path.basename('inventedpath/another_invented/f.txt')

'f.txt'

In [18]:
os.path.dirname('inventedpath/another_invented/f.txt')

'inventedpath/another_invented'

In [19]:
os.path.split('inventedpath/another_invented/f.txt')

('inventedpath/another_invented', 'f.txt')

In [20]:
os.path.exists('inventedpath/another_invented/f.txt')

False

In [21]:
os.path.isdir('inventedpath/another_invented/f.txt')

False

In [22]:
os.path.isfile('inventedpath/another_invented/f.txt')

False

In [23]:
# Split the path into root and extension
os.path.splitext('inventedpath/another_invented/f.txt')

('inventedpath/another_invented/f', '.txt')
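
The checks above all return `False` because the invented path does not exist. As a complement, this sketch runs the same three checks on a real file and a real directory, both created in a temporary location (the file name `f.txt` is only illustrative).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "f.txt")   # hypothetical file name
    with open(file_path, "w") as fh:
        fh.write("some text")

    # Run exists / isdir / isfile on the directory and on the file.
    checks = {
        "dir exists": os.path.exists(tmp),
        "dir isdir": os.path.isdir(tmp),
        "dir isfile": os.path.isfile(tmp),
        "file exists": os.path.exists(file_path),
        "file isdir": os.path.isdir(file_path),
        "file isfile": os.path.isfile(file_path),
    }
    for label, value in checks.items():
        print(label, value)
```

`os.path.exists` is true for both, while `os.path.isdir` and `os.path.isfile` tell the two kinds of path apart.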


## Navigating the filesystem

We can walk over all the folders and subfolders of a given `path` using **`os.walk(path)`**. The `os.walk` function returns a `generator`.

Let us use this function to print all subfolders of `folder_for_tests`, a folder containing several subfolders that we will use to test the different functions in `os`.

In [24]:
type(os.walk(path))

generator
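
Since `os.walk` returns a generator, a single step can be inspected with `next`. The sketch below builds a tiny temporary tree (the names `sub` and `a.txt` are made up) and unpacks the first 3-tuple the generator yields.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A small tree: one subfolder and one file at the top level.
    os.mkdir(os.path.join(tmp, "sub"))           # hypothetical folder name
    with open(os.path.join(tmp, "a.txt"), "w") as fh:   # hypothetical file name
        fh.write("x")

    walker = os.walk(tmp)
    first = next(walker)            # first item yielded by the generator
    top, subdirs, files = first     # dirpath, dirnames, filenames
    print(len(first))
    print(top == tmp, subdirs, files)
```

By default the first tuple describes the starting directory itself; later calls to `next` would descend into `sub`.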

In [25]:
path = './folder_for_tests/'
for dirpath, dirnames, filenames in os.walk(path):
    print('Current path:', dirpath)
    print('Directories:', dirnames)
    print('Files:', filenames)
    print()

Current path: ./folder_for_tests/
Directories: ['A_txt_files', 'B_txt_files']
Files: ['.DS_Store']

Current path: ./folder_for_tests/A_txt_files
Directories: []
Files: ['f1.txt', 'f2.txt']

Current path: ./folder_for_tests/B_txt_files
Directories: ['A_2_txt_files']
Files: ['.DS_Store', 'f3.txt', 'f4.txt', 'f5.txt']

Current path: ./folder_for_tests/B_txt_files/A_2_txt_files
Directories: []
Files: ['f6.txt', 'f7.txt']



#### Gathering filenames and folders

Print the names of all folders inside `path`, including those nested inside its subfolders.

In [26]:
path = './folder_for_tests/'
for dirpath, dirnames, filenames in os.walk(path):
    for directory in dirnames:
        print('folder name:', directory)

folder name: A_txt_files
folder name: B_txt_files
folder name: A_2_txt_files


Print the full path of every folder visited by `os.walk`, including `path` itself. Each path starts with `path`.

In [27]:
for dirpath, dirnames, filenames in os.walk(path):
    print("folder path:", dirpath)

folder path: ./folder_for_tests/
folder path: ./folder_for_tests/A_txt_files
folder path: ./folder_for_tests/B_txt_files
folder path: ./folder_for_tests/B_txt_files/A_2_txt_files


Print all `.txt` files inside `path` and all of its subfolders.

In [28]:
path = './folder_for_tests/'
for dirpath, dirnames, filenames in os.walk(path):
    for f in filenames:
        if f.endswith('.txt'):
            print('File:', f)

File: f1.txt
File: f2.txt
File: f3.txt
File: f4.txt
File: f5.txt
File: f6.txt
File: f7.txt
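
Printing only the file name loses track of which folder each file lives in. A possible variant joins `dirpath` and the file name to get the full path. The helper `find_by_extension` and the temporary tree used to try it out are assumptions made for this sketch, not part of the original notebook.

```python
import os
import tempfile

def find_by_extension(root, extension):
    """Return the sorted full paths of all files under root ending in extension."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extension):
                # dirpath + name gives a path that can be opened directly
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: two .txt files at different depths and one .md file.
    os.makedirs(os.path.join(tmp, "A", "B"))
    for rel in ["A/f1.txt", "A/B/f2.txt", "A/notes.md"]:
        with open(os.path.join(tmp, *rel.split("/")), "w") as fh:
            fh.write("x")

    # Show the matches relative to the temporary root for readability.
    found = [os.path.relpath(p, tmp) for p in find_by_extension(tmp, ".txt")]
    print(found)
```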


#### Find all `.mkv`, `.mp4` or `.avi` files in the home directory

In [29]:
def find_movie_files():
    """Return the names of all .mkv, .avi and .mp4 files under the home directory."""
    path = os.environ.get('HOME')
    files = []
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            # str.endswith accepts a tuple of suffixes
            if f.endswith(('.mkv', '.avi', '.mp4')):
                files.append(f)

    return files

In [None]:
movie_files = find_movie_files()

In [None]:
len(movie_files)

In [56]:
def retrieve_movie_data():
    """Return the full paths and sizes (in gigabytes) of all movie files under the home directory."""
    path = os.environ.get('HOME')
    files = []
    sizes = []
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fpath = os.path.join(dirpath, f)

            if fpath.endswith(('.mkv', '.avi', '.mp4')):
                files.append(fpath)
                # st_size is in bytes; 1 GB = 10**9 bytes
                sizes.append(os.stat(fpath).st_size * 10**(-9))

    return files, sizes

In [36]:
filenames, sizes_GB = retrieve_movie_data()

In [40]:
movie_files = [os.path.basename(f) for f in filenames]

In [58]:
# Print the movie file names sorted by size, largest first
import numpy as np
[movie_files[x] for x in np.argsort(sizes_GB)[::-1]]

In [57]:
#np.sort(sizes_GB)[::-1]
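
The same ranking can be done without NumPy by keeping each path together with its size and sorting the pairs. This is a minimal sketch under stated assumptions: the helper `files_by_size` is hypothetical, sizes are kept in bytes, and it is tried on small dummy files in a temporary directory rather than on the home directory.

```python
import os
import tempfile

def files_by_size(root, extensions):
    """Return (path, size_in_bytes) pairs under root, largest first."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):  # extensions is a tuple of suffixes
                fpath = os.path.join(dirpath, name)
                results.append((fpath, os.stat(fpath).st_size))
    # Sort on the size (second element of each pair), in decreasing order.
    return sorted(results, key=lambda pair: pair[1], reverse=True)

with tempfile.TemporaryDirectory() as tmp:
    # Dummy files with known sizes; skip.txt should be filtered out.
    for name, n_bytes in [("small.mp4", 10), ("big.mkv", 300), ("skip.txt", 50)]:
        with open(os.path.join(tmp, name), "wb") as fh:
            fh.write(b"\0" * n_bytes)

    ranked = [(os.path.basename(p), size)
              for p, size in files_by_size(tmp, (".mkv", ".avi", ".mp4"))]
    print(ranked)
```

Keeping each path and size in one tuple avoids having to index two parallel lists with the result of `np.argsort`.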