<a href="https://colab.research.google.com/github/carloslme/automating-boring-stuff/blob/main/Chapter_9_Organizing_Files.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# The shutil Module (Copying Files and Folders)
The `shutil` module provides functions for copying files, as well as entire folders. Calling `shutil.copy(source, destination)` copies the file at the path `source` to the folder at the path `destination`. (Both `source` and `destination` are strings.) If `destination` is a filename, it is used as the new name of the copied file. The function returns a string of the path of the copied file.

In [None]:
import shutil, os

In [None]:
os.chdir('/content')

In [None]:
shutil.copy('/content/sample_data/anscombe.json','/content')

'/content/anscombe.json'

In [None]:
shutil.copy('/content/sample_data/california_housing_test.csv','/content')

'/content/california_housing_test.csv'

In [None]:
shutil.copy('/content/sample_data/california_housing_test.csv','/content/customized_text.csv')

'/content/customized_text.csv'
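
The two behaviors of `shutil.copy()` described above can be checked without touching real data. The following is a minimal sketch that works in a temporary folder; the names `notes.txt`, `copies`, and `renamed.txt` are made up for illustration and are not part of the notebook's sample data.

```python
import os
import shutil
import tempfile

# Work in a throwaway directory so the sketch does not touch real files.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'notes.txt')      # hypothetical file name
with open(source, 'w') as f:
    f.write('hello')

dest_folder = os.path.join(workdir, 'copies')    # hypothetical folder name
os.mkdir(dest_folder)

# Destination is an existing folder: the copy keeps the original filename.
kept_name = shutil.copy(source, dest_folder)

# Destination ends in a filename: the copy is stored under that new name.
new_name = shutil.copy(source, os.path.join(dest_folder, 'renamed.txt'))

print(kept_name)
print(new_name)
```

In both calls the original file stays where it was; only the name of the copy differs.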

While `shutil.copy()` will copy a single file, `shutil.copytree()` will copy an entire folder and every folder and file contained in it. Calling `shutil.copytree(source, destination)` will copy the folder at the path `source`, along with all of its files and subfolders, to the folder at the path `destination`. The `source` and `destination` parameters are both strings. The function returns a string of the path of the copied folder.

In [None]:
import shutil, os
os.chdir('/content')
shutil.copytree('/content/sample_data','/content/backup')

'/content/backup'
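
One detail worth knowing when rerunning a cell like the one above: by default `shutil.copytree()` expects the destination folder not to exist yet, and raises `FileExistsError` if it does. The sketch below shows this on a temporary folder tree; the folder and file names (`project`, `data`, `values.csv`, `backup`) are assumptions chosen for the example.

```python
import os
import shutil
import tempfile

# Build a small folder tree in a throwaway location.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'project')           # hypothetical folder name
os.makedirs(os.path.join(src, 'data'))
with open(os.path.join(src, 'data', 'values.csv'), 'w') as f:
    f.write('1,2,3\n')

backup = os.path.join(workdir, 'backup')

# First copy: the destination does not exist yet, so copytree() creates it.
result = shutil.copytree(src, backup)
print(result)

# Second copy to the same destination fails because the folder now exists.
try:
    shutil.copytree(src, backup)
    second_copy_failed = False
except FileExistsError:
    second_copy_failed = True
print('Second copy failed:', second_copy_failed)
```

On Python 3.8 and later, passing `dirs_exist_ok=True` allows copying into an existing folder.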

# Moving and Renaming Files and Folders
Calling `shutil.move(source, destination)` will move the file or folder at the path source to the path destination and will return a string of the absolute path of the new location.

If `destination` points to a folder, the `source` file gets moved into destination and keeps its current filename.

In [None]:
import shutil
shutil.move('/content/customized_text.csv','/content/backup/')

'/content/backup/customized_text.csv'

The `destination` path can also specify a filename. In the following example, the `source` file is moved and renamed.

In [None]:
shutil.move('/content/anscombe.json', '/content/backup/new_anscombe.json')

'/content/backup/new_anscombe.json'

In [None]:
shutil.move('/content/california_housing_test.csv','/content/folder')

'/content/folder'

Because no folder named `folder` exists in `/content`, `move()` treats `destination` as a filename. The `california_housing_test.csv` file is therefore renamed to `folder` (a CSV file without the .csv file extension), which is probably not what you wanted! This can be a tough-to-spot bug in your programs, since the `move()` call can happily do something quite different from what you were expecting. This is yet another reason to be careful when using `move()`.

Finally, the folders that make up the destination must already exist, or else Python will raise an exception.

In [None]:
shutil.move('/content/backup/README.md','/content/does_not_exist/other')

FileNotFoundError: ignored
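
Both pitfalls above (an accidental rename and a missing destination folder) can be avoided by creating the destination folder before moving. The helper below is a sketch, not part of `shutil`; the name `move_into_folder` and the sample paths are hypothetical.

```python
import os
import shutil
import tempfile

def move_into_folder(source, dest_folder):
    """Move source into dest_folder, creating the folder first if needed."""
    # exist_ok=True makes this a no-op when the folder is already there.
    os.makedirs(dest_folder, exist_ok=True)
    # dest_folder is now guaranteed to be a folder, so the file keeps its name.
    return shutil.move(source, dest_folder)

# Demonstrate in a throwaway directory.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'report.csv')     # hypothetical file name
with open(source, 'w') as f:
    f.write('a,b\n')

target_folder = os.path.join(workdir, 'archive', '2021')
moved_to = move_into_folder(source, target_folder)
print(moved_to)
```

Because the folder is created first, `move()` never has the chance to interpret the last path component as a new filename.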

# Permanently Deleting Files and Folders
You can delete a single file or a single empty folder with functions in the `os` module, whereas to delete a folder and all of its contents, you use the `shutil` module.

* Calling `os.unlink(path)` will delete the file at `path`.
* Calling `os.rmdir(path)` will delete the folder at `path`. This folder must be empty of any files or folders.
* Calling `shutil.rmtree(path)` will remove the folder at `path`, and all files and folders it contains will also be deleted.

Be careful when using these functions in your programs! It’s often a good idea to first run your program with these calls commented out and with `print()` calls added to show the files that would be deleted.

In [None]:
%cd backup/

In [None]:
import os

for filename in os.listdir():
  #print(filename)
  if filename.endswith('.csv'):
    os.unlink(filename)
    print(filename + ' file deleted')

california_housing_test.csv file deleted
customized_text.csv file deleted
mnist_test.csv file deleted
california_housing_train.csv file deleted
mnist_train_small.csv file deleted
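
The advice to print before deleting can be built into the code itself. The following sketch wraps the loop above in a function with a `dry_run` switch; the function name `delete_by_extension` and the demo file names are assumptions made for this example.

```python
import os
import tempfile

def delete_by_extension(folder, extension, dry_run=True):
    """Delete files in folder whose names end with extension.

    With dry_run=True nothing is removed; the matches are only printed.
    Returns the list of matching filenames.
    """
    matches = []
    for filename in sorted(os.listdir(folder)):
        path = os.path.join(folder, filename)
        if filename.endswith(extension) and os.path.isfile(path):
            matches.append(filename)
            if dry_run:
                print('Would delete ' + path)
            else:
                os.unlink(path)
                print('Deleted ' + path)
    return matches

# Demonstrate on a throwaway folder.
demo_folder = tempfile.mkdtemp()
for name in ('a.csv', 'b.csv', 'keep.txt'):
    with open(os.path.join(demo_folder, name), 'w') as f:
        f.write('x')

preview = delete_by_extension(demo_folder, '.csv')             # dry run only
remaining_after_preview = sorted(os.listdir(demo_folder))
deleted = delete_by_extension(demo_folder, '.csv', dry_run=False)
remaining_after_delete = sorted(os.listdir(demo_folder))
```

Making the dry run the default means a deletion only happens when the caller asks for it explicitly.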


# Safe Deletes with the send2trash Module
Using `send2trash` is much safer than Python’s regular delete functions, because it will send folders and files to your computer’s trash or recycle bin instead of permanently deleting them. If a bug in your program deletes something with `send2trash` you didn’t intend to delete, you can later restore it from the recycle bin.

In [None]:
!pip install send2trash



In [None]:
import send2trash
baconFile = open('bacon.txt','a') # Create the file
baconFile.write('Bacon is not a vegetable.')
baconFile.close()

In [None]:
send2trash.send2trash('bacon.txt')

Note that the `send2trash()` function can only send files to the recycle bin; it cannot pull files out of it.

# Walking a Directory Tree
The `os.walk()` function is passed a single string value: the path of a folder. You can use `os.walk()` in a for loop statement to walk a directory tree, much like how you can use the `range()` function to walk over a range of numbers. Unlike `range()`, the `os.walk()` function will return three values on each iteration through the loop:
1. A string of the current folder’s name 
2. A list of strings of the folders in the current folder 
3. A list of strings of the files in the current folder

In [None]:
import os
for foldername, subfolders, filenames in os.walk('/content/'):
  print('The current folder is ' + foldername)
  # The two loops are siblings, so each file is printed once per folder,
  # even when the folder has no subfolders.
  for subfolder in subfolders:
    print('\t Subfolder of ' + foldername + ': ' + subfolder)
  for filename in filenames:
    print('\t\t File inside ' + foldername + ': ' + filename)
  print('')

The current folder is /content/
	 Subfolder of /content/: .config
	 Subfolder of /content/: sample_data

The current folder is /content/.config
	 Subfolder of /content/.config: configurations
	 Subfolder of /content/.config: logs
		 File inside /content/.config: .last_update_check.json
		 File inside /content/.config: config_sentinel
		 File inside /content/.config: .last_survey_prompt.yaml
		 File inside /content/.config: .last_opt_in_prompt.yaml
		 File inside /content/.config: active_config
		 File inside /content/.config: gce
		 File inside /content/.config: .metricsUUID

The current folder is /content/.config/configurations
...
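
A common use of `os.walk()` is collecting every file of a given type anywhere below a folder. The sketch below does this on a temporary tree; the helper name `find_files` and the sample file names are hypothetical.

```python
import os
import tempfile

def find_files(root, extension):
    """Return the sorted paths of all files under root ending with extension."""
    matches = []
    for foldername, subfolders, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(extension):
                # foldername is the path of the folder currently being visited.
                matches.append(os.path.join(foldername, filename))
    return sorted(matches)

# Build a small tree: root/a.csv, root/sub/b.csv, root/sub/c.txt
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, 'sub'))
for relative in ('a.csv', os.path.join('sub', 'b.csv'), os.path.join('sub', 'c.txt')):
    with open(os.path.join(root, relative), 'w') as f:
        f.write('x')

found = find_files(root, '.csv')
for path in found:
    print(path)
```

Joining `foldername` with `filename` is what turns the bare names in the third value into usable paths.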

# Compressing Files with the zipfile Module
Compressing a file reduces its size, which is useful when transferring it over the Internet. And since a ZIP file can also contain multiple files and subfolders, it’s a handy way to package several files into one. This single file, called an archive file, can then be, say, attached to an email.

Your Python programs can both create and open (or extract) ZIP files using functions in the `zipfile` module.


In [None]:
!wget 'https://nostarch.com/download/Automate_the_Boring_Stuff_onlinematerials_v.2.zip' -P '/content/'

--2021-02-11 06:16:04--  https://nostarch.com/download/Automate_the_Boring_Stuff_onlinematerials_v.2.zip
Resolving nostarch.com (nostarch.com)... 104.20.208.3, 172.67.17.195, 104.20.209.3, ...
Connecting to nostarch.com (nostarch.com)|104.20.208.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8802488 (8.4M) [application/zip]
Saving to: ‘/content/Automate_the_Boring_Stuff_onlinematerials_v.2.zip’


2021-02-11 06:16:05 (6.39 MB/s) - ‘/content/Automate_the_Boring_Stuff_onlinematerials_v.2.zip’ saved [8802488/8802488]



# Reading ZIP Files
To read the contents of a ZIP file, first you must create a ZipFile object (note the capital letters Z and F).

To create a ZipFile object, call the `zipfile.ZipFile()` function, passing it a string of the .zip file’s filename. Note that `zipfile` is the name of the Python module, and `ZipFile()` is the name of the function. (Strictly speaking, `ZipFile` is a class, and calling it creates the object.)


In [None]:
import zipfile, os
os.chdir('/content/') # move to directory with .zip
exampleZip = zipfile.ZipFile('Automate_the_Boring_Stuff_onlinematerials_v.2.zip')

In [None]:
exampleZip.namelist()

['automate_online-materials/',
 'automate_online-materials/alarm.wav',
 'automate_online-materials/allMyCats1.py',
 'automate_online-materials/allMyCats2.py',
 'automate_online-materials/backupToZip.py',
 'automate_online-materials/birthdays.py',
 'automate_online-materials/boxPrint.py',
 'automate_online-materials/buggyAddingProgram.py',
 'automate_online-materials/bulletPointAdder.py',
 'automate_online-materials/calcProd.py',
 'automate_online-materials/catlogo.png',
 'automate_online-materials/catnapping.py',
 'automate_online-materials/census2010.py',
 'automate_online-materials/censuspopdata.xlsx',
 'automate_online-materials/characterCount.py',
 'automate_online-materials/coinFlip.py',
 'automate_online-materials/combinedminutes.pdf',
 'automate_online-materials/combinePdfs.py',
 'automate_online-materials/countdown.py',
 'automate_online-materials/demo.docx',
 'automate_online-materials/dictionary.txt',
 'automate_online-materials/dimensions.xlsx',
 'automate_online-materials/d

In [None]:
docInfo = exampleZip.getinfo('automate_online-materials/readDocx.py')

In [None]:
docInfo.file_size

210

In [None]:
docInfo.compress_size

141

In [None]:
'Compressed file is %sx smaller!' % (round(docInfo.file_size / docInfo.compress_size, 2))

'Compressed file is 1.49x smaller!'

In [None]:
exampleZip.close()
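
The same inspection steps can be tried without downloading anything. This sketch builds a small ZIP file in a temporary folder and then reads its metadata; the file names `repeated.txt` and `example.zip` are made up, and the `with` statement is used here so the archive is closed automatically.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()

# A very repetitive text file compresses well, which makes the ratio visible.
text_path = os.path.join(workdir, 'repeated.txt')    # hypothetical file name
with open(text_path, 'w') as f:
    f.write('spam ' * 1000)                          # 5000 characters

zip_path = os.path.join(workdir, 'example.zip')      # hypothetical archive name
with zipfile.ZipFile(zip_path, 'w', compression=zipfile.ZIP_DEFLATED) as zf:
    zf.write(text_path, arcname='repeated.txt')

# Reopen in the default read mode and inspect the entry.
with zipfile.ZipFile(zip_path) as zf:
    names = zf.namelist()
    info = zf.getinfo('repeated.txt')
    ratio = round(info.file_size / info.compress_size, 2)

print(names)
print('Original size:', info.file_size)
print('Compressed size:', info.compress_size)
print('Compressed file is %sx smaller!' % ratio)
```

The exact compressed size depends on the zlib version in use, so only the original size is predictable here.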

# Extracting from ZIP Files 
The `extractall()` method for ZipFile objects extracts all the files and folders from a ZIP file into the current working directory.

In [1]:
!wget 'https://nostarch.com/download/Automate_the_Boring_Stuff_onlinematerials_v.2.zip' -P '/content/'

--2021-02-12 04:45:35--  https://nostarch.com/download/Automate_the_Boring_Stuff_onlinematerials_v.2.zip
Resolving nostarch.com (nostarch.com)... 104.20.209.3, 104.20.208.3, 172.67.17.195, ...
Connecting to nostarch.com (nostarch.com)|104.20.209.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8802488 (8.4M) [application/zip]
Saving to: ‘/content/Automate_the_Boring_Stuff_onlinematerials_v.2.zip’


2021-02-12 04:45:36 (6.68 MB/s) - ‘/content/Automate_the_Boring_Stuff_onlinematerials_v.2.zip’ saved [8802488/8802488]



In [4]:
import zipfile, os
os.chdir('/content/') # move to folder with zip file
exampleZip = zipfile.ZipFile('Automate_the_Boring_Stuff_onlinematerials_v.2.zip')
exampleZip.extractall()
exampleZip.close()

The `extract()` method for ZipFile objects will extract a single file from the ZIP file.

In [6]:
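exampleZip = zipfile.ZipFile('Automate_the_Boring_Stuff_onlinematerials_v.2.zip') # reopen: the archive was closed in the previous cell
exampleZip.extract('automate_online-materials/myPets.py', '/content/new_folder/')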

'/content/new_folder/automate_online-materials/myPets.py'

If this second argument is a folder that doesn’t yet exist, Python will create the folder. The value that `extract()` returns is the absolute path to which the file was extracted.
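
As a self-contained illustration of `extract()` and `extractall()`, the sketch below creates a tiny archive in a temporary folder and extracts from it. The entry names under `docs/` and the output folder names are assumptions for the example; `writestr()` is used only to build the test archive without needing files on disk.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, 'example.zip')      # hypothetical archive name

# Build a small archive with two entries inside a 'docs' folder.
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('docs/readme.txt', 'read me')
    zf.writestr('docs/notes.txt', 'notes')

single_dir = os.path.join(workdir, 'single')         # does not exist yet
all_dir = os.path.join(workdir, 'everything')        # does not exist yet

with zipfile.ZipFile(zip_path) as zf:
    # Extract one entry; the target folder is created automatically.
    extracted = zf.extract('docs/readme.txt', single_dir)
    # Extract everything into a chosen folder instead of the working directory.
    zf.extractall(all_dir)

print(extracted)
print(sorted(os.listdir(os.path.join(all_dir, 'docs'))))
```

Note that the folder structure stored inside the archive (`docs/`) is recreated below the target folder in both cases.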

# Creating and Adding to ZIP Files
To create your own compressed ZIP files, you must open the ZipFile object in write mode by passing 'w' as the second argument. (This is similar to opening a text file in write mode by passing 'w' to the `open()` function.)

When you pass a path to the `write()` method of a ZipFile object, Python will compress the file at that path and add it to the ZIP file. The `write()` method’s first argument is a string of the filename to add. The `compress_type` keyword argument tells the computer which algorithm it should use to compress the file; you can always just set this value to `zipfile.ZIP_DEFLATED`. (This specifies the deflate compression algorithm, which works well on all types of data.)

In [8]:
import zipfile
newZip = zipfile.ZipFile('new.zip', 'w')

In [9]:
newZip.write('/content/new_folder/automate_online-materials/myPets.py', compress_type=zipfile.ZIP_DEFLATED)
newZip.close()

Keep in mind that, just as with writing to files, write mode will erase all existing contents of a ZIP file. If you want to simply add files to an existing ZIP file, pass 'a' as the second argument to `zipfile.ZipFile()` to open the ZIP file in append mode.

In [10]:
# Adding to ZIP file
newZip = zipfile.ZipFile('new.zip','a')

In [12]:
newZip.write('/content/sample_data/anscombe.json', compress_type=zipfile.ZIP_DEFLATED)
newZip.close()

To check that the new file was added, reopen the ZIP file in read mode and list its contents:

In [13]:
newZip = zipfile.ZipFile('new.zip')

In [14]:
newZip.infolist()

[<ZipInfo filename='content/new_folder/automate_online-materials/myPets.py' compress_type=deflate filemode='-rw-r--r--' file_size=198 compress_size=137>,
 <ZipInfo filename='content/sample_data/anscombe.json' compress_type=deflate filemode='-rwxr-xr-x' file_size=1697 compress_size=295>]
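
The listing above shows that `write()` stores each file under its full source path. The sketch below contrasts write mode and append mode on a temporary archive and uses the optional `arcname` argument of `write()` to store shorter names; all file names in it are hypothetical.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
first = os.path.join(workdir, 'first.txt')
second = os.path.join(workdir, 'second.txt')
for path in (first, second):
    with open(path, 'w') as f:
        f.write('some text\n')

zip_path = os.path.join(workdir, 'new.zip')

# Write mode creates the archive (or erases an existing one).
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.write(first, arcname='first.txt', compress_type=zipfile.ZIP_DEFLATED)

# Append mode keeps what is already there and adds to it.
with zipfile.ZipFile(zip_path, 'a') as zf:
    zf.write(second, arcname='second.txt', compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(zip_path) as zf:
    after_append = zf.namelist()

# Opening in write mode again discards the earlier contents.
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.write(second, arcname='second.txt', compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(zip_path) as zf:
    after_rewrite = zf.namelist()

print(after_append)
print(after_rewrite)
```

Using `with` here closes each archive automatically, which matters for write and append modes because the ZIP directory is only written out when the file is closed.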