# Chapter 10: ORGANIZING FILES

## The `shutil` Module

### Copying Files and Folders

In [1]:
import os
import shutil
from pathlib import Path

In [6]:
ls

[0m[01;32mREADME.md[0m*                                     [01;32mdata.txt[0m*
[01;32mchapter-06-manipulating-strings.ipynb[0m*         [34;42mfiles[0m/
[01;32mchapter-09-paths-reading-writing-files.ipynb[0m*  [34;42mmy_folder[0m/
[01;32mchapter-10-organizing-files.ipynb[0m*


`shutil.copy(`**`file, destination`**`)`

Copy a single file to the folder at the path *destination* and return the path of the new copy.

In [3]:
shutil.copy("data.txt", "my_folder")

'my_folder/data.txt'

`shutil.copytree(`**`source, destination`**`)`

Copy the folder at the path *source*, along with all of its files and subfolders, to the folder at the path *destination*, and return the destination path.

In [11]:
shutil.copytree("files", "backup_folder")

'backup_folder'
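
The two calls above can be tried safely in a throwaway directory. The sketch below is only an illustration: the temporary folder and the name `data_backup.txt` are made up for the example. It shows that `shutil.copy()` keeps the filename when the destination is an existing folder, uses a new name when the destination ends in a filename, and that `shutil.copytree()` creates the destination folder itself.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a temporary directory so no real files are touched.
workdir = Path(tempfile.mkdtemp())
(workdir / "files").mkdir()
(workdir / "files" / "data.txt").write_text("hello")
(workdir / "my_folder").mkdir()

source_file = workdir / "files" / "data.txt"

# Destination is an existing folder: the copy keeps the name data.txt.
copied = Path(shutil.copy(source_file, workdir / "my_folder"))

# Destination ends in a filename: the copy is stored under that name
# (data_backup.txt is a hypothetical name chosen for this example).
renamed = Path(shutil.copy(source_file, workdir / "my_folder" / "data_backup.txt"))

# copytree() creates backup_folder; by default it raises FileExistsError
# if that folder already exists.
backup = Path(shutil.copytree(workdir / "files", workdir / "backup_folder"))

print(copied.name, renamed.name, backup.name)
```

The original file is still in place after all three calls; copying never removes the source.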

### Moving and Renaming Files and Folders

`shutil.move(`**`source, destination`**`)`

Recursively move a file or directory to another location. This is similar to the Unix "mv" command. Return the file or directory's destination.

In [25]:
shutil.move("data.txt", "new_data.txt")

'new_data.txt'

In [12]:
shutil.move("data.txt", "files")

'files/data.txt'

In [20]:
shutil.move("files", "new_folder")

'new_folder'
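
Whether `shutil.move()` moves a file into a folder or renames it depends on whether the destination is an existing folder. The following is a minimal sketch of that difference, run in a temporary directory created just for the example (the folder layout is an assumption, not part of the notebook's real working directory).

```python
import shutil
import tempfile
from pathlib import Path

# Temporary sandbox with one file and one existing folder.
workdir = Path(tempfile.mkdtemp())
(workdir / "data.txt").write_text("hello")
(workdir / "files").mkdir()

# "files" is an existing folder, so data.txt is moved INTO it.
into_folder = shutil.move(str(workdir / "data.txt"), str(workdir / "files"))

# "new_data.txt" does not exist, so the file is moved AND renamed.
renamed = shutil.move(into_folder, str(workdir / "new_data.txt"))

print(Path(into_folder).relative_to(workdir))
print(Path(renamed).relative_to(workdir))
```

One consequence worth keeping in mind: if the destination folder name is misspelled and no such folder exists, the file is silently renamed to that misspelled name instead of being moved into a folder.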

### Permanently Deleting Files and Folders

In [32]:
os.remove("temp_data.txt")  # Remove a file (same as unlink()).

In [35]:
os.rmdir("this_empty_folder")  # Remove a directory. Directory must be empty.

In [36]:
# Recursively delete a directory tree.
shutil.rmtree("my_folder")
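
Because these calls delete permanently, one common precaution is to run the program first in a "dry run" that only prints what it would delete. The sketch below assumes a hypothetical helper `delete_by_suffix()` and made-up filenames; it is one possible way to structure such a check, not a standard library feature.

```python
import os
import tempfile
from pathlib import Path

# Temporary sandbox with a few empty files (names are made up).
workdir = Path(tempfile.mkdtemp())
for name in ("notes.txt", "old.rxt", "draft.rxt"):
    (workdir / name).write_text("")


def delete_by_suffix(folder, suffix, dry_run=True):
    """Return the matching filenames; delete them only if dry_run is False."""
    folder = Path(folder)
    matched = sorted(
        p.name for p in folder.iterdir() if p.is_file() and p.suffix == suffix
    )
    for name in matched:
        if dry_run:
            print(f"Would delete: {name}")
        else:
            os.unlink(folder / name)
    return matched


# Dry run: prints the candidates but removes nothing.
would_delete = delete_by_suffix(workdir, ".rxt")
```

Once the printed list looks right, calling `delete_by_suffix(workdir, ".rxt", dry_run=False)` performs the actual deletion.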

### Safe Deletes with the `send2trash` Module

**`send2trash`** sends *folders and files* to your computer’s trash or recycle bin instead of permanently deleting them.

In [37]:
!pip install send2trash



In [2]:
from send2trash import send2trash

In [22]:
send2trash("my_data")

**Note** that the `send2trash()` function can only send files to the recycle bin; it cannot pull files out of it.

## Walking a Directory Tree

**`os.walk`** - Directory tree generator.

##### This figure shows an example **Calibre Library** folder with its contents:

**Calibre Library**\
| &emsp; metadata.db\
| &emsp; metadata_db_prefs_backup.json\
|\
\\---**John Schember**\
&emsp; \\---**Quick Start Guide**\
&emsp;&emsp;&emsp; cover.jpg\
&emsp;&emsp;&emsp; metadata.opf\
&emsp;&emsp;&emsp; Quick Start Guide - John Schember.epub

---
The figure was generated with the `tree /a /f` command in Windows.

In [44]:
for folderName, subfolders, filenames in os.walk('/mnt/d/Calibre Library/'):
    print(f"Current folder: {folderName}")
    
    for subfolder in subfolders:
        print(f"SUBFOLDER OF {folderName}: {subfolder}")
        
    for filename in filenames:
        print(f"FILE INSIDE {folderName}: {filename}")
    
    print()

Current folder: /mnt/d/Calibre Library/
SUBFOLDER OF /mnt/d/Calibre Library/: John Schember
FILE INSIDE /mnt/d/Calibre Library/: metadata.db
FILE INSIDE /mnt/d/Calibre Library/: metadata_db_prefs_backup.json

Current folder: /mnt/d/Calibre Library/John Schember
SUBFOLDER OF /mnt/d/Calibre Library/John Schember: Quick Start Guide

Current folder: /mnt/d/Calibre Library/John Schember/Quick Start Guide
FILE INSIDE /mnt/d/Calibre Library/John Schember/Quick Start Guide: cover.jpg
FILE INSIDE /mnt/d/Calibre Library/John Schember/Quick Start Guide: metadata.opf
FILE INSIDE /mnt/d/Calibre Library/John Schember/Quick Start Guide: Quick Start Guide - John Schember.epub
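
`os.walk()` yields bare filenames, so a file's full path has to be built by joining it with the current folder name. A minimal sketch, assuming a small temporary tree that loosely imitates the Calibre layout above (the tree is created only for this example):

```python
import os
import tempfile

# Build a small throwaway tree to walk.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "John Schember", "Quick Start Guide"))
sample_files = [
    "metadata.db",
    os.path.join("John Schember", "Quick Start Guide", "cover.jpg"),
]
for rel_path in sample_files:
    open(os.path.join(root, rel_path), "w").close()

# Collect the full path of every file in the tree.
full_paths = []
for folder_name, subfolders, filenames in os.walk(root):
    for filename in filenames:
        # filename alone is not enough to open the file; join it with
        # the folder that os.walk() is currently visiting.
        full_paths.append(os.path.join(folder_name, filename))

for path in sorted(full_paths):
    print(os.path.relpath(path, root))
```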



## Compressing Files with the `zipfile` Module

### Reading ZIP Files

In [46]:
import zipfile

# reading a zip file
exampleZip = zipfile.ZipFile("automate-online-materials/example.zip", 'r')
exampleZip.namelist()

['spam.txt', 'cats/', 'cats/catnames.txt', 'cats/zophie.jpg']

In [52]:
spamInfo = exampleZip.getinfo("spam.txt")
spamInfo

<ZipInfo filename='spam.txt' compress_type=deflate external_attr=0x2020 file_size=13908 compress_size=3828>

`file_size` and `compress_size` are given in *bytes*.

In [41]:
spamInfo.file_size

13908

In [22]:
spamInfo.compress_size

3828

In [32]:
f"Compressed size is {round(spamInfo.file_size / spamInfo.compress_size, 2)}x smaller!"

'Compressed size is 3.63x smaller!'

In [34]:
exampleZip.close()
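
`ZipFile` objects can also be used in a `with` statement, which closes the archive automatically even if an error occurs. The sketch below builds its own small archive in a temporary directory (the archive and its contents are assumptions made for the example, not the book's `example.zip`) and then reads it back.

```python
import tempfile
import zipfile
from pathlib import Path

# Create a small archive to read; the contents are made up.
workdir = Path(tempfile.mkdtemp())
zip_path = workdir / "example.zip"
with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("spam.txt", "spam " * 1000)

# Read it back; the archive is closed when the with block ends.
with zipfile.ZipFile(zip_path) as zf:
    names = zf.namelist()
    info = zf.getinfo("spam.txt")
    ratio = info.file_size / info.compress_size

print(names)
print(f"Compressed file is {round(ratio, 2)}x smaller")
```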

### Extracting from ZIP Files

In [53]:
import zipfile

# reading a zip file
exampleZip = zipfile.ZipFile("automate-online-materials/example.zip", 'r')
exampleZip.namelist()

['spam.txt', 'cats/', 'cats/catnames.txt', 'cats/zophie.jpg']

In [48]:
# Extract one member from the archive into the given folder (the default is the current working directory).
exampleZip.extract("spam.txt", "/mnt/d/spam_folder")

'/mnt/d/spam_folder/spam.txt'

In [57]:
# Extract all members from the archive into the given folder (the default is the current working directory).
exampleZip.extractall("/mnt/d/my_files")

In [58]:
# Close the file
exampleZip.close()

### Creating and Adding to ZIP Files

In [2]:
import zipfile

# open a zip file in write mode
newZip = zipfile.ZipFile("new.zip", 'w')
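newZip.write("data.txt", compress_type=zipfile.ZIP_DEFLATED)  # compress with DEFLATE
# Without compress_type the archive's default (ZIP_STORED, no compression) is used.
# That is faster, but writing the same name twice would add a duplicate entry:
# newZip.write("data.txt")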
newZip.close()

***!!! Keep in mind*** that, just as with writing to files, `write mode` will erase all existing contents of a ZIP file. If you want to simply add files to an existing ZIP file, pass `'a'` as the second argument to `zipfile.ZipFile()` to open the ZIP file in `append mode`.
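
The difference between the two modes can be checked with a small experiment. This is a sketch in a temporary directory; the archive name and member names are made up for the example.

```python
import tempfile
import zipfile
from pathlib import Path

zip_path = Path(tempfile.mkdtemp()) / "new.zip"

# Write mode creates the archive.
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("first.txt", "1")

# Append mode keeps first.txt and adds second.txt.
with zipfile.ZipFile(zip_path, "a") as zf:
    zf.writestr("second.txt", "2")
with zipfile.ZipFile(zip_path) as zf:
    names_after_append = zf.namelist()

# Write mode again: the previous contents are erased.
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("third.txt", "3")
with zipfile.ZipFile(zip_path) as zf:
    names_after_write = zf.namelist()

print(names_after_append)  # ['first.txt', 'second.txt']
print(names_after_write)   # ['third.txt']
```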

---

## Project: Renaming Files with American-Style Dates to European-Style Dates

In [215]:
import os
import re
import shutil

datePattern = re.compile(r"""^(.*?)     # all text before the date
                        ((0|1)?\d)-     # one or two digits for the month
                        ((0|1|2|3)?\d)- # one or two digits for the day
                        ((19|20)\d\d)   # four digits for the year
                        (.*?)$          # all text after the date
                        """, re.VERBOSE)

Passing `re.VERBOSE` as the second argument to `re.compile()` allows whitespace and comments inside the regex string, which makes it more readable.
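
The pattern contains nested parentheses, so the useful parts are not in groups 1, 2, 3, 4. Groups are numbered by the position of their opening parenthesis, which is why the month, day, year and trailing text end up in groups 2, 4, 6 and 8. The sketch below repeats the same pattern with the group numbers written as comments and applies it to one of the sample filenames (the variable names here are chosen for the example).

```python
import re

date_pattern = re.compile(r"""^(.*?)  # group 1: all text before the date
    ((0|1)?\d)-                       # group 2: month (group 3 is the optional 0 or 1)
    ((0|1|2|3)?\d)-                   # group 4: day (group 5 is the optional 0 to 3)
    ((19|20)\d\d)                     # group 6: year (group 7 is the century)
    (.*?)$                            # group 8: all text after the date
    """, re.VERBOSE)

match = date_pattern.search("mouseNow_5-20-1976.py")
groups = match.groups()

print(groups)
# ('mouseNow_', '5', None, '20', '2', '1976', '19', '.py')
print(match.group(2), match.group(4), match.group(6))
# 5 20 1976
```

An optional inner group that did not take part in the match, such as group 3 here, is reported as `None`.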

In [231]:
# Loop over the files in the working directory
for amer_file in os.listdir():
    match = re.search(datePattern, amer_file)

    # Skip files without a date.
    if match is None:
        continue

    # Get the different parts of the filename
    before_date = match.group(1)
    month       = match.group(2)
    day         = match.group(4)
    year        = match.group(6)
    after_date  = match.group(8)

    # Form the European-style filename
    euro_file = f'{before_date}{day}-{month}-{year}{after_date}'
    print(f'Renaming "{amer_file}" to "{euro_file}"...')

    # Get the full, absolute file paths
    current_path = os.getcwd()
    amer_file = os.path.join(current_path, amer_file)
    euro_file = os.path.join(current_path, euro_file)

    # Rename the files
    shutil.move(amer_file, euro_file)

Renaming "0-15-2014.py" to "15-0-2014.py"...
Renaming "1-1-1999-buggyAddingProgram.py" to "1-1-1999-buggyAddingProgram.py"...
Renaming "2-29-2013.zip" to "29-2-2013.zip"...
Renaming "automate-01-09-2001-requirements.txt" to "automate-09-01-2001-requirements.txt"...
Renaming "bir12-31-2022thdays.py" to "bir31-12-2022thdays.py"...
Renaming "mouseNow_5-20-1976.py" to "mouseNow_20-5-1976.py"...
Renaming "spam4-4-1984.txt" to "spam4-4-1984.txt"...
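
The regex only checks the shape of the digits, not whether they form a real calendar date, which is why names such as `0-15-2014.py` (month 0) and `2-29-2013.zip` (29 February in a non-leap year) were renamed above. One possible extra check, sketched here with a hypothetical helper `is_real_date()`, is to let `datetime.date` validate the three parts before renaming.

```python
from datetime import date


def is_real_date(month, day, year):
    """Return True if the month/day/year strings form a valid calendar date."""
    try:
        date(int(year), int(month), int(day))
    except ValueError:
        # date() raises ValueError for values such as month 0 or 29 Feb 2013.
        return False
    return True


print(is_real_date("0", "15", "2014"))   # False
print(is_real_date("2", "29", "2013"))   # False
print(is_real_date("12", "31", "2022"))  # True
```

In the renaming loop, such a check could sit next to the "skip files without a date" test so that filenames with impossible dates are left untouched.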


### Ideas for Similar Programs

- To add a prefix to the start of the filename, such as adding *spam_* to rename *eggs.txt* to *spam_eggs.txt*
- To change filenames with European-style dates to American-style dates
- To remove the zeros from files such as *spam0042.txt*
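
As an illustration of the last idea, here is a minimal sketch that strips leading zeros from numbers in a filename. The helper name `strip_leading_zeros()` and the regex are assumptions made for this example. The function only computes the new name; the actual renaming would still be done with `shutil.move()` as in the project above.

```python
import re

# Zeros that are not preceded by a digit and are followed by another digit,
# i.e. the leading zeros of a number.
LEADING_ZEROS = re.compile(r"(?<!\d)0+(?=\d)")


def strip_leading_zeros(filename):
    """Return filename with leading zeros removed from every number in it."""
    return LEADING_ZEROS.sub("", filename)


print(strip_leading_zeros("spam0042.txt"))  # spam42.txt
print(strip_leading_zeros("spam000.txt"))   # spam0.txt
print(strip_leading_zeros("spam100.txt"))   # spam100.txt
```

Note that this treats every run of digits the same way, so a name like `v1.05.txt` would become `v1.5.txt`; whether that is desirable depends on the files being renamed.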

---

## Project: Backing Up a Folder into a ZIP File

In [91]:
!pwd

/mnt/d/GitHub/automate-the-boring-stuff/My_Project


In [92]:
!ls

 1		   'New folder1.zip'   dir1		    spam01.txt
 My_Project_1.zip  'New folder2.zip'   folder1		    temp
'New folder'	   'New folder3.zip'   hello.app
'New folder1'	   'New folder4.zip'   messages879352.TXT


In [93]:
import os
import zipfile

# STEP 1: Figure Out the ZIP File’s Name
folder = os.path.basename(os.getcwd())
zip_num = 1

while True:
    new_zip_name = f'{folder}_{zip_num}.zip'
    if not os.path.exists(new_zip_name):
        break
    zip_num += 1

print("New ZIP file's name:", new_zip_name)

New ZIP file's name: My_Project_2.zip


In [95]:
# STEP 2: Create the New ZIP File
new_zip = zipfile.ZipFile(new_zip_name, mode='w')

# STEP 3: Walk the Directory Tree and Add to the ZIP File
# Walk the entire folder tree and compress the files in each folder.
for folder, subfolders, files in os.walk("."):
    for file in files:
        if not file.endswith('.zip'):  # don't back up the backup ZIP files
            print(f'Adding {folder}/{file}...')
            new_zip.write(os.path.join(folder, file), compress_type=zipfile.ZIP_DEFLATED)
print("Done.")
new_zip.close()

Adding ./hello.app...
Adding ./messages879352.TXT...
Adding ./spam01.txt...
Adding ./1/2/3/LAST!/important.txt...
Adding ./folder1/app.txt...
Adding ./folder1/hello.app...
Adding ./New folder/spam012 - Copy.txt...
Adding ./New folder/spam012.txt...
Adding ./New folder/dir1 - Copy/spam012 - Copy (2).txt...
Adding ./New folder/dir1 - Copy/spam012 - Copy (3).txt...
Adding ./New folder/dir1 - Copy (2)/spam012 - Copy (2).txt...
Adding ./New folder/dir1 - Copy (2)/spam012 - Copy - Copy.txt...
Adding ./New folder1/spam012 - Copy.txt...
Adding ./New folder1/spam012.txt...
Adding ./New folder1/dir1 - Copy/spam012 - Copy (2).txt...
Adding ./New folder1/dir1 - Copy/spam012 - Copy (3).txt...
Adding ./New folder1/dir1 - Copy (2)/spam012 - Copy (2).txt...
Adding ./New folder1/dir1 - Copy (2)/spam012 - Copy - Copy.txt...
Done.


### Ideas for Similar Programs

- Walk a directory tree and archive just files with certain extensions, such as *.txt* or *.py*, and nothing else.
- Walk a directory tree and archive every file except the *.txt* and *.py* ones.
- Find the folder in a directory tree that has the greatest number of files or the folder that uses the most disk space.
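
The last idea can be sketched with the same `os.walk()` loop used in the backup project. The helper `folder_with_most_files()` and the temporary tree below are assumptions made for the example. It counts only the files directly inside each folder, not those in its subfolders.

```python
import os
import tempfile


def folder_with_most_files(root):
    """Return (folder, count) for the folder that directly holds the most files."""
    best_folder, best_count = None, -1
    for folder, subfolders, files in os.walk(root):
        if len(files) > best_count:
            best_folder, best_count = folder, len(files)
    return best_folder, best_count


# Throwaway tree: 1 file in the root, 3 files in "a", 2 files in "b".
root = tempfile.mkdtemp()
layout = {".": 1, "a": 3, "b": 2}
for subfolder, file_count in layout.items():
    target = os.path.join(root, subfolder)
    os.makedirs(target, exist_ok=True)
    for i in range(file_count):
        open(os.path.join(target, f"file{i}.txt"), "w").close()

busiest_folder, busiest_count = folder_with_most_files(root)
print(os.path.basename(busiest_folder), busiest_count)  # a 3
```

Finding the folder that uses the most disk space would follow the same pattern, summing `os.path.getsize()` over the files instead of counting them.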


---

## Practice Projects

### Selective Copy