<a href="https://colab.research.google.com/github/yuliiabosher/Cyber_Resilience_Course/blob/main/03_File_management.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Keep it clean, keep it safe
---

To manage digital files effectively:
*  Avoid saving unnecessary documents.
*  Follow a consistent method for naming your files and folders.
*  Store related documents together, whatever their type.
*  Separate ongoing work from completed work.
*  Avoid overfilling folders.
*  Organize documents by date.
*  Keep backups.

### Upload some files to work with
---

This notebook has its own file system, which you can work with.  Any files used during a session are deleted when the notebook's runtime session stops (e.g. after a period of inactivity or when you close it).

A set of files has been prepared for this exercise.  The zipped folder containing these can be downloaded from [here](https://drive.google.com/file/d/16YXn5XdbIA4rSoQhZQ4c20__z0C_lD3f/view?usp=sharing)  

Watch this [video](https://vimeo.com/891330968/b7c8408947) to see how to upload the prepared set of files to the notebook.

### Find help for file handling with Python
---

You will need to be able to do the following:
*  list the contents of a directory
*  walk through directory contents (subdirectories)
*  rename a file
*  create a folder
*  delete a file
*  move a file
*  change access settings for a file or folder

This site has some guidance:  https://pynative.com/python/file-handling/

The path to access files in the notebook file system is:

`/content/folder_name/file_name`
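
The operations listed above can be tried safely on a throwaway directory first. The following is a minimal sketch, assuming only the standard library; the scratch directory and the names `drafts`, `note.txt` and `note_v2.txt` are made up for illustration and are not part of the prepared file set.

```python
import os
import shutil
import stat
import tempfile

# Work in a throwaway directory so nothing in /content is touched.
scratch = tempfile.mkdtemp()

# Create a folder and an empty file inside the scratch directory.
os.mkdir(os.path.join(scratch, 'drafts'))
note_path = os.path.join(scratch, 'note.txt')
open(note_path, 'w').close()

# Rename the file, then move it into the new folder.
renamed_path = os.path.join(scratch, 'note_v2.txt')
os.rename(note_path, renamed_path)
moved_path = shutil.move(renamed_path, os.path.join(scratch, 'drafts', 'note_v2.txt'))

# Change access settings: make the file read-only for its owner.
os.chmod(moved_path, stat.S_IRUSR)

# List the top level, then walk through every sub-directory.
top_level = os.listdir(scratch)
walked = [(os.path.relpath(root, scratch), sorted(files))
          for root, dirs, files in os.walk(scratch)]
print(top_level)
print(walked)

# Restore write access, delete the file, then remove the scratch directory.
os.chmod(moved_path, stat.S_IRUSR | stat.S_IWUSR)
os.remove(moved_path)
shutil.rmtree(scratch)
```

Each call here maps onto one item in the list: `os.listdir`, `os.walk`, `os.rename`, `os.mkdir`, `os.remove`, `shutil.move` and `os.chmod`.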

### Exercise 1
---

Print a list of the files in the Acme File System directory.

Method 1

In [1]:
import os

# List every entry (files and folders) directly inside the directory
directory_name = r'/content/Acme File System'
file_list = os.listdir(directory_name)
for file_name in file_list:
    print(file_name)


Company Accounts - Copy.txt
June 2023 report.txt
June 2023 minutes - Copy.txt
February 2023 minutes.txt
June 2023 minutes.txt
September 2022 report.txt
Mar 2023 report.txt
May 2023 report.txt
September 2023 minutes.txt
Mar 2023 minutes.txt
Bank details.txt
July 2022 report.txt
May 2023 minutes.txt
August 2023 minutes.txt
November 2022 report.txt
February 2023 report - Copy.txt
Mar 2023 report - Copy.txt
Customer payments.txt
October 2022 report.txt
Bobs Diary.txt
Staff addresses.txt
May 2023 minutes - Copy.txt
July 2023 minutes.txt
April 2023 minutes.txt
February 2023 report.txt
October 2023 minutes.txt
Security Project.txt
Temp staff.txt
December  2023 minutes.txt
Company Accounts.txt
November 2022 report - Copy.txt
January 2023 report.txt
January 2023 minutes.txt
August 2023 minutes - Copy.txt
April 2023 report.txt
December  2022 report.txt
August 2022 report.txt
November 2023 minutes.txt


Method 2

In [2]:
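import glob
import os

# '*' matches every entry directly inside the directory;
# recursive=True only matters when the pattern contains '**', so it is omitted
path_name = r'/content/Acme File System/*'
files = glob.glob(path_name)
for file_path in files:
    print(os.path.basename(file_path))  # strip the directory part of the path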


Company Accounts - Copy.txt
June 2023 report.txt
June 2023 minutes - Copy.txt
February 2023 minutes.txt
June 2023 minutes.txt
September 2022 report.txt
Mar 2023 report.txt
May 2023 report.txt
September 2023 minutes.txt
Mar 2023 minutes.txt
Bank details.txt
July 2022 report.txt
May 2023 minutes.txt
August 2023 minutes.txt
November 2022 report.txt
February 2023 report - Copy.txt
Mar 2023 report - Copy.txt
Customer payments.txt
October 2022 report.txt
Bobs Diary.txt
Staff addresses.txt
May 2023 minutes - Copy.txt
July 2023 minutes.txt
April 2023 minutes.txt
February 2023 report.txt
October 2023 minutes.txt
Security Project.txt
Temp staff.txt
December  2023 minutes.txt
Company Accounts.txt
November 2022 report - Copy.txt
January 2023 report.txt
January 2023 minutes.txt
August 2023 minutes - Copy.txt
April 2023 report.txt
December  2022 report.txt
August 2022 report.txt
November 2023 minutes.txt


Method 3

In [3]:
import os

# os.walk yields (root, dirs, files) for the directory and each sub-directory
new_directory_name = r'/content/Acme File System'
directory_contents = os.walk(new_directory_name)
for root, dirs, files in directory_contents:
    for file_name in files:
        print(file_name)

Company Accounts - Copy.txt
June 2023 report.txt
June 2023 minutes - Copy.txt
February 2023 minutes.txt
June 2023 minutes.txt
September 2022 report.txt
Mar 2023 report.txt
May 2023 report.txt
September 2023 minutes.txt
Mar 2023 minutes.txt
Bank details.txt
July 2022 report.txt
May 2023 minutes.txt
August 2023 minutes.txt
November 2022 report.txt
February 2023 report - Copy.txt
Mar 2023 report - Copy.txt
Customer payments.txt
October 2022 report.txt
Bobs Diary.txt
Staff addresses.txt
May 2023 minutes - Copy.txt
July 2023 minutes.txt
April 2023 minutes.txt
February 2023 report.txt
October 2023 minutes.txt
Security Project.txt
Temp staff.txt
December  2023 minutes.txt
Company Accounts.txt
November 2022 report - Copy.txt
January 2023 report.txt
January 2023 minutes.txt
August 2023 minutes - Copy.txt
April 2023 report.txt
December  2022 report.txt
August 2022 report.txt
November 2023 minutes.txt


### Exercise 2
---
Create TWO new folders:  
*  Minutes
*  Reports

Move all files with names containing 'minutes' to the Minutes folder.  

Move all files with names containing 'report' to the Reports folder.

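Create the folders in a separate code cell so that this code does not have to be re-run (`os.mkdir` raises `FileExistsError` if the folder already exists).

In [4]:
# Run once: os.mkdir fails if the folder is already there
os.mkdir(r'/content/Acme File System/Minutes')
os.mkdir(r'/content/Acme File System/Reports')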

In [6]:
import glob
import os
import shutil

base_dir = r'/content/Acme File System'

# Move every file whose name contains 'minutes' into the Minutes folder.
# The match is case-sensitive on Linux (which Colab uses), so the 'Minutes' folder itself is not matched.
for file_path in glob.glob(os.path.join(base_dir, '*minutes*')):
    destination = os.path.join(base_dir, 'Minutes', os.path.basename(file_path))
    shutil.move(file_path, destination)

# Do the same for 'report' files and the Reports folder.
for file_path in glob.glob(os.path.join(base_dir, '*report*')):
    destination = os.path.join(base_dir, 'Reports', os.path.basename(file_path))
    shutil.move(file_path, destination)



### Exercise 3
---
In the Reports folder, create TWO new folders:
*  2022 reports
*  2023 reports

Move all files with 2022 in the name to the 2022 folder.  
Move all files with 2023 in the name to the 2023 folder.
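
One possible approach is sketched below, not the only solution. The function name `sort_reports_by_year` and its parameters are hypothetical. Note that a pattern such as `*2022*` also matches the new `2022 reports` folder, so the sketch checks that each match is a file before moving it.

```python
import glob
import os
import shutil

def sort_reports_by_year(reports_dir, years=('2022', '2023')):
    """Move each file in reports_dir into a '<year> reports' sub-folder."""
    moved = {}
    for year in years:
        year_dir = os.path.join(reports_dir, f'{year} reports')
        os.makedirs(year_dir, exist_ok=True)  # safe to re-run
        for file_path in glob.glob(os.path.join(reports_dir, f'*{year}*')):
            # Skip folders: the pattern also matches '<year> reports' itself.
            if os.path.isfile(file_path):
                file_name = os.path.basename(file_path)
                shutil.move(file_path, os.path.join(year_dir, file_name))
                moved.setdefault(year, []).append(file_name)
    return moved
```

In this notebook it could be called as `sort_reports_by_year('/content/Acme File System/Reports')`, assuming the Reports folder from Exercise 2 exists.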

### Exercise 4
---
List the directory structure for Acme File System using os.walk()

Help here: https://www.geeksforgeeks.org/os-walk-python/
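
As a starting point, here is a minimal sketch that turns the `(root, dirs, files)` tuples from `os.walk()` into an indented listing. The function name `directory_tree` and the two-space indent are illustrative choices.

```python
import os

def directory_tree(top):
    """Return the directory structure under top as a list of indented lines."""
    lines = []
    for root, dirs, files in os.walk(top):
        dirs.sort()  # sorting in place also fixes the order os.walk visits sub-folders
        rel = os.path.relpath(root, top)
        depth = 0 if rel == '.' else rel.count(os.sep) + 1
        # Folder line, indented by its depth below the top directory.
        lines.append('  ' * depth + os.path.basename(os.path.normpath(root)) + '/')
        # Files sit one level deeper than the folder that holds them.
        for name in sorted(files):
            lines.append('  ' * (depth + 1) + name)
    return lines
```

To display the result, print each line, for example `print('\n'.join(directory_tree('/content/Acme File System')))`. Files in a folder are listed before its sub-folders, because `os.walk()` reports a folder's files before descending.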

### Exercise 5
---
There are some accidental copies of files in the directory.  

Walk through the directory, find all files with 'copy' in the name and delete them.

List all the files and folders in the directory and its sub-directories.
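
A possible sketch of the deletion step follows; the function name `delete_copies` is hypothetical. The copies in the prepared set are named with ' - Copy', so the comparison is made case-insensitive. Be aware that this simple test would also delete a legitimate file such as 'Copyright.txt', so check the returned list.

```python
import os

def delete_copies(top):
    """Delete every file under top whose name contains 'copy' (any case)."""
    deleted = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if 'copy' in name.lower():
                os.remove(os.path.join(root, name))
                deleted.append(name)
    return sorted(deleted)
```

Deletion with `os.remove` is permanent, so it may be worth printing the matching names first and deleting only once the list looks right.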

### Exercise 6
---
The files in the minutes and the reports directories are labelled by month.  Consequently, they are not stored in date order.  

Rename all files with a month in the name to use the month number rather than the name (e.g. replace 'August' with '08', 'December' with '12', etc.).

List all the files and folders in the directory and its sub-directories.
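
One way to approach the renaming is sketched below. The names `MONTH_NUMBERS`, `month_to_number` and `rename_months` are hypothetical. The sketch assumes month names appear either in full or as a three-letter abbreviation (the prepared set contains 'Mar'), and it skips a rename if the new name is already taken.

```python
import os
import re

MONTHS = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
          'August', 'September', 'October', 'November', 'December']

# Map full names and three-letter abbreviations (e.g. 'mar') to '01'..'12'.
MONTH_NUMBERS = {}
for number, month in enumerate(MONTHS, start=1):
    MONTH_NUMBERS[month.lower()] = f'{number:02d}'
    MONTH_NUMBERS[month[:3].lower()] = f'{number:02d}'

# \b keeps the match to whole words, so 'Marketing' is left alone.
MONTH_PATTERN = re.compile(r'\b(' + '|'.join(MONTH_NUMBERS) + r')\b', re.IGNORECASE)

def month_to_number(name):
    """Replace the first month name in a file name with its two-digit number."""
    return MONTH_PATTERN.sub(lambda m: MONTH_NUMBERS[m.group(1).lower()], name, count=1)

def rename_months(top):
    """Rename files under top and return (old_name, new_name) pairs."""
    renamed = []
    for root, dirs, files in os.walk(top):
        for name in files:
            new_name = month_to_number(name)
            target = os.path.join(root, new_name)
            # Skip unchanged names and avoid overwriting an existing file.
            if new_name != name and not os.path.exists(target):
                os.rename(os.path.join(root, name), target)
                renamed.append((name, new_name))
    return renamed
```

Renaming files while walking is safe here because `os.walk()` has already collected the file names for the current folder before the loop body runs.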

### Exercise 7
---
Some tools and command-line programs do not handle file names containing spaces well.  

Rename all files to replace spaces with underscores ( _ )  

List all the files and folders in the directory and its sub-directories.
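
A minimal sketch of this step is shown below; the function name `replace_spaces` is hypothetical. As a design choice, it collapses runs of spaces (such as the double space in 'December  2023 minutes.txt') into a single underscore. Use `name.replace(' ', '_')` instead if every space should become its own underscore. Only files are renamed, as the exercise asks; folder names are left unchanged.

```python
import os

def replace_spaces(top):
    """Rename files under top so that spaces become underscores."""
    renamed = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # split() breaks on whitespace runs, so repeated spaces give one underscore.
            new_name = '_'.join(name.split())
            if new_name != name:
                os.rename(os.path.join(root, name), os.path.join(root, new_name))
                renamed.append(new_name)
    return sorted(renamed)
```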

### Exercise 8
---
Make a backup copy of the whole folder.
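
One option is `shutil.copytree`, which copies a folder and everything inside it. The sketch below is illustrative: the function name `back_up_folder` and the default ' backup' suffix are assumptions, not part of the exercise.

```python
import os
import shutil

def back_up_folder(source_dir, backup_dir=None):
    """Copy source_dir and all its contents to backup_dir; return the backup path."""
    if backup_dir is None:
        # Assumed naming convention: the original folder name plus ' backup'.
        backup_dir = os.path.normpath(source_dir) + ' backup'
    # copytree raises FileExistsError if backup_dir already exists.
    return shutil.copytree(source_dir, backup_dir)
```

Because `copytree` refuses to overwrite an existing destination, an earlier backup has to be removed or a different `backup_dir` chosen before running it again.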

### Exercise 9
---
Zip the backup folder (use the shutil or zipfile library; help here: https://www.guru99.com/python-zip-file.html).

Download the zipped file.  You can use:   
```
from google.colab import files
files.download('example.txt')
```
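
For the zipping step, a minimal sketch using `shutil.make_archive` is shown below. The function name `zip_folder` is hypothetical. The archive is written next to the folder, not inside it, so the zip file does not end up trying to include itself.

```python
import os
import shutil

def zip_folder(folder):
    """Create '<folder>.zip' alongside folder and return the archive's path."""
    folder = os.path.abspath(folder)
    return shutil.make_archive(
        folder,                              # archive path without the '.zip' extension
        'zip',
        root_dir=os.path.dirname(folder),    # paths in the archive are relative to this
        base_dir=os.path.basename(folder),   # only this folder is added
    )
```

The returned path can then be passed to `files.download(...)` in Colab.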



### Final challenge
---

Create a new Colab notebook. Write a clean-up program that will:

*  Allow the user to specify the name of a zipped folder to upload
*  Allow the zipped folder to be uploaded
*  Unzip the folder
*  Clean up the folder in a similar way to all the things done in the exercises
*  Back up the folder
*  Zip and download the backup
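
Unzipping is the one step the exercises above do not practise, so here is a minimal sketch of it using the standard `zipfile` module. The function name `unzip_folder` is hypothetical. In Colab the upload itself is typically done with `files.upload()` from `google.colab`; that part is left out of the sketch because it only runs inside Colab.

```python
import zipfile

def unzip_folder(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the names of the archive members."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

The returned member names show what the archive contained, which is a useful check before the clean-up steps start.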

