This mini program in Python will help data scientists back up data and folders in a zip file.

Suppose you’re working on a project whose files you keep in a folder. You’re worried about losing your work, so you’d like to create zip file snapshots of the entire folder. You’d like to keep different versions, so you want the zip file’s filename to increment each time it is made; for example, AlsPythonBook_1.zip, AlsPythonBook_2.zip, AlsPythonBook_3.zip, and so on. You could do this by hand, but it is rather annoying, and you might accidentally misnumber the zip files’ names. It would be much simpler to run a program that does this boring task for you.
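The incrementing scheme described above can be sketched in a few lines. This is only an illustration, not the approach the program below takes: the helper name `next_numbered_name` and its parameters are made up for this example, and it assumes the numbered zip files all live in one folder.

```python
import os

def next_numbered_name(folder, prefix):
    """Return the first '<prefix>_<n>.zip' name that does not yet exist in folder."""
    number = 1
    # Count upward until we reach a filename that is still free
    while os.path.exists(os.path.join(folder, f"{prefix}_{number}.zip")):
        number += 1
    return f"{prefix}_{number}.zip"
```

If the folder already holds AlsPythonBook_1.zip and AlsPythonBook_2.zip, calling `next_numbered_name(folder, "AlsPythonBook")` gives the name for the third snapshot.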

The mini program works as follows. We first identify the folder whose files need to be zipped. Then we check whether the destination folder for the zip file exists, and create it if it does not. Along the way we also implement simple version control so that you can keep as many copies as you like. There are many ways to do version control. One simple way is to use a timestamp as a suffix for the filename: each time we back up a set of files, the resulting zip file’s name contains the time the backup was made.
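As a minimal sketch of the timestamp idea, the snippet below builds a versioned name from a prefix and a point in time. The function name `timestamped_name` and its `when` parameter are assumptions made for illustration; the format string is the one used in the program that follows.

```python
import datetime

def timestamped_name(prefix, when=None, fmt="%Y%m%d_%H%M%S"):
    """Return '<prefix>_<timestamp>' using the given time (default: now)."""
    when = when or datetime.datetime.now()
    return f"{prefix}_{when.strftime(fmt)}"

# A fixed time is passed here so the result is reproducible
print(timestamped_name("backup_example", datetime.datetime(2021, 9, 3, 8, 53, 59)))
# backup_example_20210903_085359
```

Because the timestamp runs from year down to second, the names also sort chronologically when listed alphabetically.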

In [10]:
import os
import zipfile
import shutil
import datetime 
import time

In [11]:
fsource='C:\\Users\\gao\\Gao_Jupyter_Notebook\\Datasets\\test folder' # original location for files to be zipped
fzip='backup_example' # zip file name
fdest='C:\\Users\\gao\\Gao_Jupyter_Notebook\\Datasets\\backup test folder' # destination folder for the zip file
sleep_time=15 # sleep for 15 seconds

In [12]:
timestamp_format="%Y%m%d_%H%M%S"
now=datetime.datetime.now()
timestamp = now.strftime(timestamp_format)
print('Program Start Time: {:%Y-%m-%d %H:%M:%S}'.format(now), '\n') # reuse now so the printed time matches the filename
fname=fzip+'_'+timestamp # zip file name plus timestamp suffix

# Check if the folders for file sources and destinations exist
if os.path.isdir(fsource):
    print('The path for the source files exists at the following locations:\n', fsource, '\n', '---You may proceed', '\n')
else:
    # stop here: there is nothing to zip if the source folder is missing
    raise FileNotFoundError('The path for the source files does not exist, go back and double check: ' + fsource)
if os.path.isdir(fdest):
    print('The path for the destination zip file exists at the following locations:\n', fdest, '\n', '---You may proceed', '\n')
else:
    os.mkdir(fdest) # make a directory if the destination does not exist
    print('The destination folder did not exist, so we made a directory!', '\n')
    
# Zip up the files 
os.chdir(fdest)
print('\n-----------------------------------------------------------------------\n')
print('The zip file is being created now for your backup, please wait... \n')
shutil.make_archive(fname, 'zip', fsource) # this function can create either zip or tar file
time.sleep(sleep_time) # pause for sleep_time seconds; make_archive only returns once the archive is written, so this is just a safety margin
print('\n-----------------------------------------------------------------------\n')
print('The zip file was successfully created at the destination folder:\n', fdest)

# Summary information for zip file
zf=zipfile.ZipFile(fname+'.zip')
flist = zf.namelist()
print('\n-----------------------------------------------------------------------\n')
print('\nThese are the individual files and folders within the current zip file: \n')
for i in flist:
    fileinzip = zf.getinfo(i) # getting information within the zip file
    print(fileinzip.filename)
    print('Original Size of ' + str(fileinzip.file_size) + ' bytes') # ZipInfo sizes are in bytes
    print('After compression, it has ' + str(fileinzip.compress_size) + ' bytes')
    try:
        print('Compressed file is %sx smaller!' % (round(fileinzip.file_size / fileinzip.compress_size, 2)))
    except ZeroDivisionError:
        print('The original file/folder is 0 bytes already, nothing to be compressed')
    print('')
zf.close() # close the archive once, after the loop has finished
print('\nBack-up Complete!')

Program Start Time: 2021-09-03 08:53:59 

The path for the source files exists at the following locations:
 C:\Users\gao\Gao_Jupyter_Notebook\Datasets\test folder 
 ---You may proceed 

The path for the destination zip file exists at the following locations:
 C:\Users\gao\Gao_Jupyter_Notebook\Datasets\backup test folder 
 ---You may proceed 


-----------------------------------------------------------------------

The zip file is being created now for your backup, please wait... 



AttributeError: module 'datetime' has no attribute 'sleep'
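The same steps can also be wrapped into two small functions, which avoids changing the working directory and closes the zip file automatically. This is a sketch under stated assumptions, not a replacement for the cell above: the names `backup_folder` and `summarize_zip` are made up for this example, and the destination folder is assumed to sit outside the source folder so that the archive does not end up inside itself.

```python
import datetime
import os
import shutil
import zipfile

def backup_folder(source, dest, prefix="backup_example"):
    """Zip source into dest under a timestamped name and return the zip path."""
    if not os.path.isdir(source):
        raise FileNotFoundError(f"Source folder not found: {source}")
    os.makedirs(dest, exist_ok=True)  # create the destination if it is missing
    stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    base_name = os.path.join(dest, f"{prefix}_{stamp}")
    # make_archive appends '.zip' and returns the path of the archive it wrote
    return shutil.make_archive(base_name, "zip", source)

def summarize_zip(zip_path):
    """Return (name, original_bytes, compressed_bytes) for each archive member."""
    with zipfile.ZipFile(zip_path) as zf:  # closed automatically on exit
        return [(info.filename, info.file_size, info.compress_size)
                for info in zf.infolist()]
```

A call such as `summarize_zip(backup_folder(fsource, fdest))` would then produce the same per-file size information that the loop above prints.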