<a href="https://colab.research.google.com/github/pbeens/Zip-File-Tutorial/blob/main/Zip_File_Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This Colab notebook and its supporting files can be found at:

https://github.com/pbeens/Zip-File-Tutorial

## Download the files we'll play with

In [None]:
!wget -q --show-progress 'https://raw.githubusercontent.com/pbeens/Zip-File-Tutorial/main/files/Canada-Populations-by-Province-eng.csv'
!wget -q --show-progress 'https://github.com/pbeens/Zip-File-Tutorial/raw/main/files/Canada-Populations-by-Province-eng.xlsx'
!wget -q --show-progress 'https://github.com/pbeens/Zip-File-Tutorial/raw/main/files/Lorem-Ipsum.docx'
!wget -q --show-progress 'https://raw.githubusercontent.com/pbeens/Zip-File-Tutorial/main/files/Lorem-Ipsum.rtf'
!wget -q --show-progress 'https://raw.githubusercontent.com/pbeens/Zip-File-Tutorial/main/files/Lorem-Ipsum.txt'
!wget -q --show-progress 'https://github.com/pbeens/Zip-File-Tutorial/raw/main/files/snow-scene.jpg'

## Declare the file variables (as a list)

In [3]:
files = ['/content/Canada-Populations-by-Province-eng.csv',
         '/content/Canada-Populations-by-Province-eng.xlsx',
         '/content/Lorem-Ipsum.docx',
         '/content/Lorem-Ipsum.rtf',
         '/content/Lorem-Ipsum.txt',
         '/content/snow-scene.jpg']

## Let's zip our files

In [37]:
from zipfile import ZipFile

zip_filename = '/content/files.zip'

with ZipFile(zip_filename, mode="w") as archive:
    for file in files:
        archive.write(file)

## What about file compression?

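If we compare the combined size of our files with the size of the zip file, we'll see there is no compression (yet). By default, `ZipFile` stores files uncompressed (the `ZIP_STORED` method). In fact, the zip file is slightly bigger than the files combined, because the archive adds a header and a directory entry for each file.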
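
To see where that overhead comes from, here is a minimal sketch that builds a small archive in memory rather than touching the tutorial's files. The payload and the names `a.txt` and `b.txt` are made up for illustration; the point is only that each entry is stored at its full size and the archive adds bookkeeping on top.

```python
import io
from zipfile import ZipFile, ZIP_STORED

# Hypothetical stand-in data; not one of the tutorial's files.
payload = b"x" * 1000

# Build the archive in memory so the example runs anywhere.
buffer = io.BytesIO()
with ZipFile(buffer, mode="w") as archive:  # no method given, so ZIP_STORED is used
    archive.writestr("a.txt", payload)
    archive.writestr("b.txt", payload)

# Inspect each entry: stored size equals original size when nothing is compressed.
with ZipFile(buffer) as archive:
    entries = archive.infolist()
    for info in entries:
        stored = info.compress_type == ZIP_STORED
        print(f"{info.filename}: stored={stored}, "
              f"original={info.file_size}, in archive={info.compress_size}")

# Whatever is left over is the archive's own headers and directory.
overhead = len(buffer.getvalue()) - 2 * len(payload)
print(f"Overhead: {overhead} bytes")
```

The overhead grows with the number of files and the length of their names, not with their contents, which is why it is barely noticeable on large files.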

### No file compression

In [38]:
import os # needed for getsize()

zip_filename = '/content/files.zip'

def get_and_print_filesizes():
    total_filesize = 0
    for file in files:
        total_filesize += os.path.getsize(file)

    size_of_zip_file = os.path.getsize(zip_filename)

    print(f'Size of all files: {total_filesize}')
    print(f'Size of zip file: {size_of_zip_file}')
    print(f'Compression amount: {(1-size_of_zip_file/total_filesize)*100:.1f}%')

get_and_print_filesizes()

Size of all files: 215985
Size of zip file: 216833
Compression amount: -0.4%


### With file compression

To compress the files, we need to specify the compression method and the compression level we want to use.

For this tutorial, we will use the ZIP_DEFLATED method of compression. 

With ZIP_DEFLATED, compression levels range from 0 to 9, with 0 being no compression and 9 being the highest. Note that the higher the compression level, the longer it will take to compress the files; decompression time is largely unaffected by the level.
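
As a rough illustration of how the level affects size, the sketch below compresses the same data at a few levels and compares the results. It uses an in-memory archive and repetitive sample text invented for this example (the helper `zipped_size` and the name `sample.txt` are hypothetical), so the numbers will not match those for the tutorial's files.

```python
import io
from zipfile import ZipFile, ZIP_DEFLATED

def zipped_size(data: bytes, level: int) -> int:
    """Return the size in bytes of an in-memory zip holding `data` at `level`."""
    buffer = io.BytesIO()
    with ZipFile(buffer, "w", ZIP_DEFLATED, compresslevel=level) as archive:
        archive.writestr("sample.txt", data)  # hypothetical entry name
    return len(buffer.getvalue())

# Made-up, highly repetitive text, so it compresses far better than real files.
sample = b"Lorem ipsum dolor sit amet, consectetur adipiscing elit. " * 2000

sizes = {level: zipped_size(sample, level) for level in (0, 1, 6, 9)}

print(f"Original data: {len(sample)} bytes")
for level, size in sizes.items():
    print(f"compresslevel={level}: {size} bytes")
```

Real files usually show a much smaller spread between levels than this sample does, and already-compressed formats such as JPEG barely shrink at any level.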

In [35]:
from zipfile import ZipFile, ZIP_DEFLATED

zip_filename = '/content/files.zip'

with ZipFile(zip_filename, "w", ZIP_DEFLATED, compresslevel=9) as archive:
    for file in files:
        archive.write(file)

get_and_print_filesizes()

Size of all files: 215985
Size of zip file: 159179
Compression amount: 26.3%
