# Managing Files and Directories in Python

Python has several built-in modules and functions for handling files.

These functions are spread out over several modules such as os, os.path, shutil, and pathlib, to name a few.

## Setup

Importing libraries

In [28]:
import os
import pathlib
from pathlib import Path
import fnmatch
import shutil
from tempfile import TemporaryFile

Getting full current path

In [2]:
current_path = os.getcwd()
print('Current Path:', current_path)

Current Path: C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files


Adding the working directory name to the path

In [3]:
full_directory_path = os.path.join(current_path, 'Files_General')
print(f'\nFull path of working directory:')
print(f'{full_directory_path}\n')


Full path of working directory:
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General
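
The same path can also be built with "pathlib", which overloads the "/" operator for joining path components. The lines below are a minimal sketch, assuming the "Files_General" folder name used above; nothing here checks that the folder actually exists.

```python
from pathlib import Path

# Path.cwd() is the pathlib counterpart of os.getcwd()
current_path = Path.cwd()

# The / operator joins components using the separator of the current OS
full_directory_path = current_path / 'Files_General'

print(full_directory_path.name)                    # last path component
print(full_directory_path.parent == current_path)  # parent is the starting directory
```

A Path object can be passed directly to "os.chdir()", "open()", and most other functions that accept a path string.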



Changing current working directory

In [4]:
os.chdir(full_directory_path)
current_path = os.getcwd()
print('current_path = ', current_path)

current_path =  C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General


## Writing to and reading from a file

### Calling "open()" directly

In [7]:
file = 'Dummy_1.txt'
file_object = open(file, 'r')
data = file_object.read()
print(data)
file_object.close()

Hello World!
What's up?


### Python’s “with open(…) as …” Pattern

In [8]:
with open('Dummy_1.txt', 'r') as f:
    data = f.read()

print(data)

Hello World!
What's up?


In [16]:
with open('Dummy_1.txt', 'a') as f:
    data = '\nI am alive!'
    f.write(data)
    
with open('Dummy_1.txt', 'r') as f:
    data = f.read()

print(data)


Hello World!
What's up?
I am alive!
I am alive!
I am alive!


In [17]:
with open('Dummy_1.txt', 'w') as f:
    data = "Hello World!\nWhat's up?"
    f.write(data)
    
with open('Dummy_1.txt', 'r') as f:
    data = f.read()

print(data)

Hello World!
What's up?
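
The cells above rely on three open modes: 'r' reads, 'a' appends to the end (so each run of an append cell adds another line), and 'w' truncates the file before writing. The sketch below shows the three side by side in a throwaway directory; the file name "demo.txt" is a hypothetical placeholder.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is touched
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_file = os.path.join(tmp_dir, 'demo.txt')   # hypothetical file name

    # 'w' creates the file, or truncates it if it already exists
    with open(demo_file, 'w') as f:
        f.write('Hello World!')

    # 'a' keeps the existing content and writes at the end
    with open(demo_file, 'a') as f:
        f.write("\nWhat's up?")

    # 'r' (the default mode) opens the file for reading only
    with open(demo_file, 'r') as f:
        lines = f.read().splitlines()

print(lines)
```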


## Listing sub-directories and files in current working directory

### Using "os.listdir" (legacy Python versions)

"os.listdir()" returns a Python list containing the names of the files and subdirectories
in the directory given by the path argument:

In [18]:
print(f'\nList of directories and files of \n{current_path}\n')
print(os.listdir(full_directory_path))


List of directories and files of 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General

['Directory 01 - txt', 'Directory 02 - pickle', 'Directory 03 - json', 'Directory 04 - csv', 'Directory 05 - xlsx', 'Dummy_1.txt', 'Dummy_1a.txt', 'Dummy_1b.txt', 'Dummy_2.docx', 'Dummy_3.pptx', 'New Directory', 'Temp', 'teste.doc']


A directory listing like that isn’t easy to read.

Printing each entry returned by os.listdir() in a loop cleans things up:

In [19]:
print(f'\nList of directories and files of \n{current_path}\n')
entries = os.listdir(full_directory_path)
for entry in entries:
    print(entry)


List of directories and files of 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General

Directory 01 - txt
Directory 02 - pickle
Directory 03 - json
Directory 04 - csv
Directory 05 - xlsx
Dummy_1.txt
Dummy_1a.txt
Dummy_1b.txt
Dummy_2.docx
Dummy_3.pptx
New Directory
Temp
teste.doc


### Using "os.scandir"

"os.scandir()" returns an iterator of os.DirEntry objects, one for each entry in the directory given by the path argument.
You can loop over the iterator and print the name of each entry:

In [20]:
print(f'\nList of directories and files of \n{current_path}\n')
with os.scandir(full_directory_path) as entries:
    for entry in entries:
        print(entry.name)


List of directories and files of 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General

Directory 01 - txt
Directory 02 - pickle
Directory 03 - json
Directory 04 - csv
Directory 05 - xlsx
Dummy_1.txt
Dummy_1a.txt
Dummy_1b.txt
Dummy_2.docx
Dummy_3.pptx
New Directory
Temp
teste.doc
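
Besides ".name", each entry yielded by "os.scandir()" offers ".is_file()", ".is_dir()", and ".stat()". The following is a small sketch on a scratch directory; the names "notes.txt" and "archive" are made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample content: one file and one subdirectory
    with open(os.path.join(tmp_dir, 'notes.txt'), 'w') as f:
        f.write('12345')
    os.mkdir(os.path.join(tmp_dir, 'archive'))

    summary = {}
    with os.scandir(tmp_dir) as entries:
        for entry in entries:
            kind = 'dir' if entry.is_dir() else 'file'
            # entry.stat() returns an os.stat_result; st_size is in bytes
            size = entry.stat().st_size if entry.is_file() else None
            summary[entry.name] = (kind, size)

print(summary)
```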


### Using "pathlib"

"pathlib.Path()" objects have an .iterdir() method for creating an iterator of all files and folders in a directory.

Each entry yielded by .iterdir() is itself a Path object, which exposes the entry's name and offers methods such as .is_file(), .is_dir(), and .stat() for querying its attributes.
pathlib was introduced in Python 3.4 and provides an object-oriented interface to the filesystem.

In [22]:
print(f'\nList of directories and files of \n{current_path}\n')
entries = pathlib.Path(full_directory_path)
for entry in entries.iterdir():
    print(entry.name)


List of directories and files of 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General

Directory 01 - txt
Directory 02 - pickle
Directory 03 - json
Directory 04 - csv
Directory 05 - xlsx
Dummy_1.txt
Dummy_1a.txt
Dummy_1b.txt
Dummy_2.docx
Dummy_3.pptx
New Directory
Temp
teste.doc
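
Since ".iterdir()" yields Path objects, attributes such as ".name" and ".suffix" are available on every entry. The sketch below runs on a scratch directory with two hypothetical entries ("report.txt" and "data") and sorts the result, because ".iterdir()" makes no promise about order.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    base = Path(tmp_dir)
    (base / 'report.txt').write_text('abc')   # hypothetical file
    (base / 'data').mkdir()                   # hypothetical folder

    rows = []
    # iterdir() order is arbitrary, so sort for a stable listing
    for entry in sorted(base.iterdir()):
        rows.append((entry.name, entry.suffix, entry.is_dir()))

print(rows)
```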


## Listing only the files in a directory

### Using "os.listdir" (legacy Python versions)

In [23]:
print(f'\nList of files in \n{current_path}\n')
basepath = full_directory_path
print('\nList: ', os.listdir(basepath), '\n')
print('Items:')
for entry in os.listdir(basepath):
    if os.path.isfile(os.path.join(basepath, entry)):
        print(entry)


List of files in 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General


List:  ['Directory 01 - txt', 'Directory 02 - pickle', 'Directory 03 - json', 'Directory 04 - csv', 'Directory 05 - xlsx', 'Dummy_1.txt', 'Dummy_1a.txt', 'Dummy_1b.txt', 'Dummy_2.docx', 'Dummy_3.pptx', 'New Directory', 'Temp', 'teste.doc'] 

Items:
Dummy_1.txt
Dummy_1a.txt
Dummy_1b.txt
Dummy_2.docx
Dummy_3.pptx
teste.doc


### Using "os.scandir"

In [24]:
print(f'\nList of files in \n{current_path}\n')
basepath = full_directory_path
print('\nObject: ', os.scandir(basepath), '\n')
print('Items:')
with os.scandir(basepath) as entries:
    for entry in entries:
        if entry.is_file():
            print(entry.name)


List of files in 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General


Object:  <nt.ScandirIterator object at 0x000001D8416BD210> 

Items:
Dummy_1.txt
Dummy_1a.txt
Dummy_1b.txt
Dummy_2.docx
Dummy_3.pptx
teste.doc


### Using "pathlib"

In [29]:
print(f'\nList of files in \n{current_path}\n')
basepath = Path(full_directory_path)
print('\nObject: ', basepath.iterdir(), '\n')

print('Items:')
files_in_basepath = basepath.iterdir()
for item in files_in_basepath:
    if item.is_file():
        print(item.name)


List of files in 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General


Object:  <generator object Path.iterdir at 0x000001D8437B4EB0> 

Items:
Dummy_1.txt
Dummy_1a.txt
Dummy_1b.txt
Dummy_2.docx
Dummy_3.pptx
teste.doc


## Listing only the subdirectories in a directory

### Using "os.listdir" (legacy Python versions)

In [30]:
print(f'\nList of directories in \n{current_path}\n')
path = full_directory_path
print('\nList: ', os.listdir(path), '\n')

print('Items:')
for entry in os.listdir(path):
    if os.path.isdir(os.path.join(path, entry)):
        print(entry)


List of directories in 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General


List:  ['Directory 01 - txt', 'Directory 02 - pickle', 'Directory 03 - json', 'Directory 04 - csv', 'Directory 05 - xlsx', 'Dummy_1.txt', 'Dummy_1a.txt', 'Dummy_1b.txt', 'Dummy_2.docx', 'Dummy_3.pptx', 'New Directory', 'Temp', 'teste.doc'] 

Items:
Directory 01 - txt
Directory 02 - pickle
Directory 03 - json
Directory 04 - csv
Directory 05 - xlsx
New Directory
Temp


### Using "os.scandir"

In [31]:
print(f'\nList of directories in \n{current_path}\n')
path = full_directory_path
print('\nObject: ', os.scandir(path), '\n')

print('Items:')
with os.scandir(path) as entries:
    for entry in entries:
        if entry.is_dir():
            print(entry.name)


List of directories in 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General


Object:  <nt.ScandirIterator object at 0x000001D8416BA350> 

Items:
Directory 01 - txt
Directory 02 - pickle
Directory 03 - json
Directory 04 - csv
Directory 05 - xlsx
New Directory
Temp


### Using "pathlib"

In [32]:
print(f'\nList of directories in \n{current_path}\n')
path = Path(full_directory_path)
print('\nObject: ', path.iterdir(), '\n')

print('Items:')
for entry in path.iterdir():
    if entry.is_dir():
        print(entry.name)


List of directories in 
C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General


Object:  <generator object Path.iterdir at 0x000001D84374C200> 

Items:
Directory 01 - txt
Directory 02 - pickle
Directory 03 - json
Directory 04 - csv
Directory 05 - xlsx
New Directory
Temp


## Creating a file in Python

In [37]:
file = input('Type file name with extension: ')
os.chdir(full_directory_path)
f = open(file, 'w')

input(f'\nFile "{file}" was created and is open for editing. Press "Enter" to close file.')
f.close()
print(f'\n"{file}" was closed.')

Type file name with extension: teste.txt

File "teste.txt" was created and is open for editing. Press "Enter" to close file.

"teste.txt" was closed.

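Note that "open(file, 'w')" silently empties a file that already exists. Two alternatives that avoid this are sketched below on a scratch directory: "Path.touch()" leaves existing content alone, and open mode 'x' raises FileExistsError instead of overwriting. The file name is only an example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    new_file = Path(tmp_dir) / 'teste.txt'

    # touch() creates an empty file (or only updates the timestamp if it exists)
    new_file.touch()
    created = new_file.exists()

    # Mode 'x' refuses to overwrite: it raises FileExistsError instead
    try:
        with open(new_file, 'x') as f:
            f.write('never written')
        overwritten = True
    except FileExistsError:
        overwritten = False

    size_after = new_file.stat().st_size   # still 0 bytes
```
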

## Deleting Files in Python

In [38]:
file = input('Type file name with extension: ')
os.remove(file)
print(f'\nFile "{file}" was deleted.')

Type file name with extension: teste.txt

File "teste.txt" was deleted.
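
"os.remove()" raises FileNotFoundError when the file does not exist, which can easily happen with a name typed by hand. One possible way to handle that is sketched below; "remove_if_present" is a hypothetical helper, not part of the standard library.

```python
import os
import tempfile

def remove_if_present(path):
    """Delete a file; return True if removed, False if it was not there."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, 'teste.txt')
    with open(target, 'w') as f:
        f.write('temporary')

    first = remove_if_present(target)    # file exists, so it is deleted
    second = remove_if_present(target)   # already gone, no exception raised

print(first, second)
```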


## Create and delete a directory

### Using "os"

In [43]:
os.chdir(full_directory_path)                # changes current directory
subdirectory_name = 'New Directory'
os.mkdir(subdirectory_name)

input(f'Subdirectory "{subdirectory_name}" was created.\n\nPress Enter to delete it.')
os.rmdir('New Directory')
print(f'\nSubdirectory "{subdirectory_name}" was deleted.')

Subdirectory "New Directory" was created.

Press Enter to delete it.

Subdirectory "New Directory" was deleted.


### Using "pathlib"

In [46]:
os.chdir(full_directory_path)                # changes current directory
subdirectory_name = 'New Directory'
p = Path(subdirectory_name)
p.mkdir()                                    # raises FileExistsError if the directory already exists

input(f'Subdirectory "{subdirectory_name}" was created.\n\nPress Enter to delete it.')
p.rmdir()
print(f'\nSubdirectory "{subdirectory_name}" was deleted.')

FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'New Directory'

If the directory already exists, you can keep the program from stopping on this error by wrapping the call in try/except.

In [47]:
os.chdir(full_directory_path)                # changes current directory
new_subdir = 'New Directory'
p = Path(new_subdir)
try:
    p.mkdir()
    print(f'Subdirectory "{new_subdir}" created successfully.')

except FileExistsError:
    print(f'Subdirectory "{new_subdir}" already exists.')


Subdirectory "New Directory" already exists.
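
When no message is needed for the "already exists" case, "Path.mkdir()" also accepts "exist_ok=True", which makes the call a no-op for an existing directory. A minimal sketch on a scratch directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    p = Path(tmp_dir) / 'New Directory'

    p.mkdir(exist_ok=True)   # creates the directory
    p.mkdir(exist_ok=True)   # second call does nothing instead of raising
    is_directory = p.is_dir()

print(is_directory)
```

The error is still raised if the path exists but is a file rather than a directory.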


## Creating and deleting multiple directories

### Using "os"

In [48]:
new_branch = '2018/10/05'

os.makedirs(new_branch)

os.chdir(os.path.join(full_directory_path, new_branch))
new_directory = os.getcwd()
print(new_directory)

input('Subdirectories were created. Press Enter to delete them.')
os.chdir(full_directory_path)
os.rmdir('2018/10/05')
os.rmdir('2018/10')
os.rmdir('2018')
print('Subdirectories were deleted.')

C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General\2018\10\05
Subdirectories were created. Press Enter to delete them.
Subdirectories were deleted.


### Using "pathlib"

In [57]:
p = pathlib.Path('2018/10/05')
p.mkdir(parents=True)

os.chdir(os.path.join(full_directory_path, p))   # enter the new branch before reading the cwd
new_directory = os.getcwd()
print(new_directory)

input('Subdirectories were created. Press Enter to delete them.')
os.chdir(full_directory_path)
pathlib.Path('2018/10/05').rmdir()
pathlib.Path('2018/10').rmdir()
pathlib.Path('2018').rmdir()
print('Subdirectories were deleted.')

C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General\2018\10\05
Subdirectories were created. Press Enter to delete them.
Subdirectories were deleted.


Note: "os.rmdir()" and "Path.rmdir()" can only delete empty directories.

To delete a directory that may contain files or subdirectories, use "shutil.rmtree()".

#### Warning! 
"shutil.rmtree()" also deletes every file and subdirectory under the directory passed to it.

In [59]:
new_branch = '2018/10/05'

os.makedirs(new_branch)

os.chdir(os.path.join(full_directory_path, new_branch))
new_directory = os.getcwd()
print(new_directory)

input('Subdirectories were created. Press Enter to delete them.')
os.chdir(full_directory_path)
shutil.rmtree('2018')
print('Subdirectories and files were deleted.')

C:\Users\csp1\PycharmProjects\Python-Classes\2. Python Intermediate\3. Managing Files\Files_General\2018\10\05
Subdirectories were created. Press Enter to delete them.
Subdirectories and files were deleted.
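
The difference between the two calls shows up as soon as the tree contains a file. The sketch below builds the same "2018/10/05" branch in a scratch location (created with "tempfile.mkdtemp()"), adds a hypothetical "log.txt", and tries both approaches.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                      # scratch directory for the demo
branch = os.path.join(base, '2018', '10', '05')
os.makedirs(branch)
with open(os.path.join(branch, 'log.txt'), 'w') as f:   # hypothetical file
    f.write('data')

# os.rmdir() refuses to remove a directory that still has content
try:
    os.rmdir(os.path.join(base, '2018'))
    rmdir_succeeded = True
except OSError:
    rmdir_succeeded = False

# shutil.rmtree() removes the directory and everything below it
shutil.rmtree(os.path.join(base, '2018'))
tree_exists = os.path.exists(os.path.join(base, '2018'))

os.rmdir(base)                                 # base is empty again, so this works
print(rmdir_succeeded, tree_exists)
```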


## Filename pattern matching

### Using string methods

In [60]:
# Get .txt files
print('The following .txt files were found:')
for f_name in os.listdir(full_directory_path):
    if f_name.endswith('.txt'):
        print('->',f_name)

The following .txt files were found:
-> Dummy_1.txt
-> Dummy_1a.txt
-> Dummy_1b.txt


### Using "fnmatch"

In [61]:
print(f'\nThe following .txt files were found:')
for file_name in os.listdir(full_directory_path):
    if fnmatch.fnmatch(file_name, '*.txt'):
        print('->',file_name)


The following .txt files were found:
-> Dummy_1.txt
-> Dummy_1a.txt
-> Dummy_1b.txt


#### Advanced "fnmatch"

In [65]:
name = input('Type search criteria: ' )
name_split = name.split('.')

print(f'\nThe following files corresponding to "{name}" were found:')
for filename in os.listdir(full_directory_path):
    if fnmatch.fnmatch(filename, f'{name_split[0]}.{name_split[1]}'):
        print('->',filename)

Type search criteria: *1*

The following files corresponding to "*1*" were found:


IndexError: list index out of range
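
The IndexError above arises because "'*1*'.split('.')" returns a list with a single element, so "name_split[1]" does not exist. Splitting is not needed at all, since "fnmatch" accepts the whole pattern, with or without a dot. Below is a sketch of that idea using "fnmatch.filter()"; "find_matches" is a hypothetical helper and the names are a sample taken from the listing shown earlier.

```python
import fnmatch

def find_matches(names, pattern):
    """Return the names that match a shell-style pattern such as '*1*'."""
    return fnmatch.filter(names, pattern)

# Sample names taken from the directory listing shown earlier
names = ['Dummy_1.txt', 'Dummy_1a.txt', 'Dummy_2.docx', 'teste.doc']

print(find_matches(names, '*1*'))        # any name containing a 1
print(find_matches(names, '*.doc'))      # names ending exactly in .doc
print(find_matches(names, 'Dummy_?.*'))  # ? stands for exactly one character
```

To search a real folder, pass "os.listdir(full_directory_path)" as the list of names.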

## Traversing Directories and Processing Files

### Walking a directory tree and printing the names of the directories and files

In [66]:
for dirpath, dirnames, files in os.walk('.'):
    print(f'Found directory: {dirpath}')
    for file_name in files:
        print(file_name)

Found directory: .
Dummy_1.txt
Dummy_1a.txt
Dummy_1b.txt
Dummy_2.docx
Dummy_3.pptx
teste.doc
Found directory: .\Directory 01 - txt
sample1.txt
sample2.txt
sample3.txt
Found directory: .\Directory 02 - pickle
Found directory: .\Directory 02 - pickle\Directory 01 - txt - Copy
sample1.txt
sample2.txt
sample3.txt
Found directory: .\Directory 03 - json
example_1.json
example_2.json
Found directory: .\Directory 04 - csv
addresses.csv
cities.csv
hurricanes.csv
Found directory: .\Directory 05 - xlsx
Financial Sample.xlsx
Found directory: .\New Directory
Found directory: .\Temp
Found directory: .\Temp\Directory 01 - txt - Copy
sample1.txt
sample2.txt
sample3.txt


### Traverse the directory tree in a bottom-up manner

In [67]:
for dirpath, dirnames, files in os.walk('.', topdown=False):
    print(f'Found directory: {dirpath}')
    for file_name in files:
        print(file_name)

Found directory: .\Directory 01 - txt
sample1.txt
sample2.txt
sample3.txt
Found directory: .\Directory 02 - pickle\Directory 01 - txt - Copy
sample1.txt
sample2.txt
sample3.txt
Found directory: .\Directory 02 - pickle
Found directory: .\Directory 03 - json
example_1.json
example_2.json
Found directory: .\Directory 04 - csv
addresses.csv
cities.csv
hurricanes.csv
Found directory: .\Directory 05 - xlsx
Financial Sample.xlsx
Found directory: .\New Directory
Found directory: .\Temp\Directory 01 - txt - Copy
sample1.txt
sample2.txt
sample3.txt
Found directory: .\Temp
Found directory: .
Dummy_1.txt
Dummy_1a.txt
Dummy_1b.txt
Dummy_2.docx
Dummy_3.pptx
teste.doc


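Bottom-up traversal visits every subdirectory before its parent. This is very useful when you want to recursively delete files and directories, because the contents of a directory have already been processed by the time the directory itself is reached.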
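
A recursive delete built on a bottom-up walk might look like the sketch below. It runs on a small scratch tree whose folder and file names are made up for the example, and it does not handle symbolic links or read-only files; "shutil.rmtree()" is the ready-made alternative.

```python
import os
import tempfile

root = tempfile.mkdtemp()                       # scratch tree for the demo
os.makedirs(os.path.join(root, 'Temp', 'Copy'))
with open(os.path.join(root, 'Temp', 'Copy', 'sample1.txt'), 'w') as f:
    f.write('x')

removed = []
# topdown=False yields the deepest directories first
for dirpath, dirnames, files in os.walk(root, topdown=False):
    for file_name in files:
        os.remove(os.path.join(dirpath, file_name))
    os.rmdir(dirpath)            # empty by now: its children were handled earlier
    removed.append(os.path.relpath(dirpath, root))

print(removed)
```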

## Making Temporary Files and Directories 

### Using "TemporaryFile" from the "tempfile" library

In [68]:
# Create a temporary file and write some data to it (creates in C:\Users\csp1\AppData\Local\Temp)
fp = TemporaryFile('w+t')
fp.write('Hello universe!')
print('Temporary file created in: "C:/Users/csp1/AppData/Local/Temp"')


# Go back to the beginning and read data from file
fp.seek(0)
data = fp.read()
print('\n',data,'\n')

input('Press "Enter" to delete temporary file and continue.')

# Close the file, after which it will be removed
fp.close()

Temporary file created in: "C:/Users/csp1/AppData/Local/Temp"

 Hello universe! 

Press "Enter" to delete temporary file and continue.
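
The same module covers temporary directories through "tempfile.TemporaryDirectory()", and "tempfile.gettempdir()" reports where temporary items are created on the current machine, so the location does not have to be hard-coded. A short sketch follows; "scratch.txt" is a hypothetical file name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # tmp_dir is a str path inside the system temp location
    inside_temp = os.path.isdir(tmp_dir)
    scratch = os.path.join(tmp_dir, 'scratch.txt')   # hypothetical file
    with open(scratch, 'w') as f:
        f.write('Hello universe!')
    existed_inside = os.path.exists(scratch)

# Leaving the with block deletes the directory and everything in it
exists_after = os.path.exists(tmp_dir)

print(tempfile.gettempdir())    # where temporary items are created on this machine
print(inside_temp, existed_inside, exists_after)
```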


## Copying, Moving, and Renaming Files and Directories - "shutil" Library

### Copying Files in Python

In [70]:
file = input('Input file name (with extension) to copy: ')
stem, ext = os.path.splitext(file)               # 'teste.doc' -> ('teste', '.doc')
src = os.path.join(full_directory_path, file)
dst = os.path.join(full_directory_path, f'{stem} - Copy{ext}')
shutil.copy(src, dst)

Input file name (with extension) to copy: teste.doc


'C:\\Users\\csp1\\PycharmProjects\\Python-Classes\\2. Python Intermediate\\3. Managing Files\\Files_General\\teste - Copy.doc'

### Copying Directories

In [71]:
shutil.copytree('Directory 01 - txt', 'Directory 01 - txt - Copy')

'Directory 01 - txt - Copy'

### Moving Files and Directories

In [73]:
os.mkdir('Temp')
shutil.move('Directory 01 - txt - Copy/', 'Temp/')

'Temp/Directory 01 - txt - Copy'

## Renaming Files and Directories

### Using "os"

In [74]:
os.rename('Dummy_1.txt', 'Dummy_1 - RENAMED.txt')
input('File renamed - Press Enter to revert.')
os.rename('Dummy_1 - RENAMED.txt', 'Dummy_1.txt')

File renamed - Press Enter to revert.


### Using "pathlib"

In [75]:
data_file = Path('Dummy_1.txt')
data_file.rename('Dummy_1 - RENAMED.txt')
input('File renamed - Press Enter to revert.')
data_file = Path('Dummy_1 - RENAMED.txt')
data_file.rename('Dummy_1.txt')

File renamed - Press Enter to revert.


WindowsPath('Dummy_1.txt')
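
As the output above shows, "Path.rename()" returns a Path pointing to the new name (this return value exists since Python 3.8), so the renamed file does not have to be looked up again. Combined with ".with_name()", the round trip can be sketched as follows on a scratch copy of the file:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    data_file = Path(tmp_dir) / 'Dummy_1.txt'
    data_file.write_text('Hello World!')

    # with_name() builds the sibling path; rename() returns the new Path (3.8+)
    renamed = data_file.rename(data_file.with_name('Dummy_1 - RENAMED.txt'))
    old_exists = data_file.exists()
    new_name = renamed.name

    # Rename back using the returned object
    restored = renamed.rename(data_file)
    restored_text = restored.read_text()

print(new_name, old_exists, restored_text)
```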