<a href="https://colab.research.google.com/github/yuliiabosher/Cyber_Resilience_Course/blob/main/03_File_management.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Keep it clean, keep it safe
---

To manage digital files effectively:
*  Avoid saving unnecessary documents.
*  Follow a consistent method for naming your files and folders.
*  Store related documents together, whatever their type.
*  Separate ongoing work from completed work.
*  Avoid overfilling folders.
*  Organize documents by date.
*  Keep backups.

### Upload some files to work with
---

This notebook has its own filing system, which you can work with.  Any files used during a session are deleted when the notebook's runtime session stops (e.g. after a period of inactivity or when you close it).

A set of files has been prepared for this exercise.  The zipped folder containing these can be downloaded from [here](https://drive.google.com/file/d/16YXn5XdbIA4rSoQhZQ4c20__z0C_lD3f/view?usp=sharing)  

Watch this [video](https://vimeo.com/891330968/b7c8408947) to see how to upload the prepared set of files to the notebook.

### Find help for file handling with Python
---

You will need to be able to do the following:
*  list the contents of a directory
*  walk through directory contents (subdirectories)
*  rename a file
*  create a folder
*  delete a file
*  move a file
*  change access settings for a file or folder

This site has some guidance:  https://pynative.com/python/file-handling/

The path to a file in the notebook file system has the form:

`/content/folder_name/file_name`

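As a starting point, the operations listed above can be sketched with the standard `os` and `shutil` modules. This is a minimal sketch rather than the only approach: it works in a temporary folder instead of `/content`, and the file and folder names (`notes.txt`, `archive`, and so on) are made up for illustration.

```python
import os
import shutil
import stat
import tempfile

# Work in a throwaway folder so nothing real is touched (illustrative names only)
base = tempfile.mkdtemp()
with open(os.path.join(base, "notes.txt"), "w") as f:
    f.write("example")

# Create a folder
archive = os.path.join(base, "archive")
os.mkdir(archive)

# List the contents of a directory
top_level = sorted(os.listdir(base))

# Rename a file
os.rename(os.path.join(base, "notes.txt"), os.path.join(base, "notes_2023.txt"))

# Move a file into the new folder
moved_path = os.path.join(archive, "notes_2023.txt")
shutil.move(os.path.join(base, "notes_2023.txt"), moved_path)

# Walk through the directory and its subdirectories
walked = []
for root, dirs, file_names in os.walk(base):
    for name in file_names:
        walked.append(os.path.relpath(os.path.join(root, name), base))

# Change access settings: make the file read-only for its owner
os.chmod(moved_path, stat.S_IRUSR)
is_read_only = not (os.stat(moved_path).st_mode & stat.S_IWUSR)

# Restore write access, delete the file, then remove the temporary folder
os.chmod(moved_path, stat.S_IRUSR | stat.S_IWUSR)
os.remove(moved_path)
shutil.rmtree(base)
```

Each step maps to one item in the list, so individual lines can be copied into the exercises and adapted to the `/content/Acme File System` paths.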
### Exercise 1
---

Print a list of the files in the Acme File System directory.

In [None]:
import os
import shutil
import glob
from google.colab import files
acme_directory_path = '/content/Acme File System'

In [None]:
def list_files_in_folder_and_subfolders(folder):
  # os.walk yields (root, dirs, files) for the folder and each of its subfolders
  file_name_list = []
  for root, dirs, files in os.walk(folder):
    for file in files:
      file_name_list.append(file)
  return file_name_list

acme_files_list = list_files_in_folder_and_subfolders(acme_directory_path)
for file in acme_files_list:
  print(file)

Company Accounts - Copy.txt
June 2023 report.txt
June 2023 minutes - Copy.txt
February 2023 minutes.txt
June 2023 minutes.txt
September 2022 report.txt
Mar 2023 report.txt
May 2023 report.txt
September 2023 minutes.txt
Mar 2023 minutes.txt
Bank details.txt
July 2022 report.txt
May 2023 minutes.txt
August 2023 minutes.txt
November 2022 report.txt
February 2023 report - Copy.txt
Mar 2023 report - Copy.txt
Customer payments.txt
October 2022 report.txt
Bobs Diary.txt
Staff addresses.txt
May 2023 minutes - Copy.txt
July 2023 minutes.txt
April 2023 minutes.txt
February 2023 report.txt
October 2023 minutes.txt
Security Project.txt
Temp staff.txt
December  2023 minutes.txt
Company Accounts.txt
November 2022 report - Copy.txt
January 2023 report.txt
January 2023 minutes.txt
August 2023 minutes - Copy.txt
April 2023 report.txt
December  2022 report.txt
August 2022 report.txt
November 2023 minutes.txt


### Exercise 2
---
Create TWO new folders:  
*  Minutes
*  Reports

Move all files with names containing 'minutes' to the Minutes folder.  

Move all files with names containing 'report' to the Reports folder.

The folders are created in a separate code cell so that this code does not have to be re-run.

In [None]:
create_minutes_subfolder_in_acme_folder = os.mkdir(r'/content/Acme File System/Minutes')
create_reports_subfolder_in_acme_folder = os.mkdir(r'/content/Acme File System/Reports')


In [None]:
files_minutes_pattern = glob.glob('/content/Acme File System/*minutes*')
def move_files_to_inside_folder(destination_folder_name, pathnames_matching_pattern):
  # Move each matching file into a subfolder of the folder it currently sits in
  for file_path in pathnames_matching_pattern:
    if os.path.isfile(file_path):  # skip any folders that match the pattern
      parent_folder, file_name = os.path.split(file_path)
      destination = os.path.join(parent_folder, destination_folder_name, file_name)
      shutil.move(file_path, destination)
move_files_with_minutes_in_name_to_minutes_subfolder  = move_files_to_inside_folder("Minutes",files_minutes_pattern)
files_report_pattern = glob.glob('/content/Acme File System/*report*')
move_files_with_reports_in_name_to_reports_subfolder = move_files_to_inside_folder("Reports",files_report_pattern)

### Exercise 3
---
In the Reports folder, create TWO new folders:
*  2022 reports
*  2023 reports

Move all files with 2022 in the name to the 2022 folder.  
Move all files with 2023 in the name to the 2023 folder.

Create the new folders in a separate code cell to avoid errors when re-running the other code.

In [None]:
create_2022_reports_subfolder_in_reports_folder = os.mkdir(r'/content/Acme File System/Reports/2022 reports')
create_2023_reports_subfolder_in_reports_folder = os.mkdir(r'/content/Acme File System/Reports/2023 reports')

In [None]:
files_reports_2022_pattern = glob.glob('/content/Acme File System/Reports/*2022*')
files_reports_2023_pattern = glob.glob('/content/Acme File System/Reports/*2023*')
move_files_with_2022_in_name_to_2022_reports_subfolder = move_files_to_inside_folder("2022 reports", files_reports_2022_pattern)
move_files_with_2023_in_name_to_2023_reports_subfolder = move_files_to_inside_folder("2023 reports", files_reports_2023_pattern)

### Exercise 4
---
List the directory structure for Acme File System using `os.walk()`.

Help here: https://www.geeksforgeeks.org/os-walk-python/

In [None]:
def list_directory_structure(target_directory_path):
  # i counts the directories visited by os.walk, in visit order
  i = 0
  for root, dirs, files in os.walk(target_directory_path):
    i += 1
    folder_name = os.path.basename(root)
    print(f"\nLevel {i}\n\nThe root folder of level {i} is {folder_name} \n")
    print(f"The folders in {folder_name} are:\n")
    if not dirs:
      print('No folders')
    for dir_name in dirs:
      print(dir_name)
    print(f"\nThe files in {folder_name} are:\n")
    if not files:
      print('No files')
    for file in files:
      print(file)
show_acme_directory_tree_moving_files = list_directory_structure(acme_directory_path)


Level 1

The root folder of level 1 is Acme File System 

The folders in Acme File System are:

Reports
Minutes

The files in Acme File System are:

Company Accounts - Copy.txt
Bank details.txt
Customer payments.txt
Bobs Diary.txt
Staff addresses.txt
Security Project.txt
Temp staff.txt
Company Accounts.txt

Level 2

The root folder of level 2 is Reports 

The folders in Reports are:

2023 reports
2022 reports

The files in Reports are:

No files

Level 3

The root folder of level 3 is 2023 reports 

The folders in 2023 reports are:

No folders

The files in 2023 reports are:

June 2023 report.txt
Mar 2023 report.txt
May 2023 report.txt
February 2023 report - Copy.txt
Mar 2023 report - Copy.txt
February 2023 report.txt
January 2023 report.txt
April 2023 report.txt

Level 4

The root folder of level 4 is 2022 reports 

The folders in 2022 reports are:

No folders

The files in 2022 reports are:

September 2022 report.txt
July 2022 report.txt
November 2022 report.txt
October 2022 report.

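In the listing above, the counter increases once for every folder that `os.walk()` visits, so `2023 reports` and `2022 reports` get different level numbers even though both sit directly inside `Reports`. If the nesting depth is wanted instead, one option is to count the path separators in each folder's path relative to the starting folder. The sketch below assumes a small made-up tree in a temporary folder; `folder_depths` is a hypothetical helper, not part of the exercise.

```python
import os
import tempfile

def folder_depths(start):
    """Return {relative folder path: depth}, where the start folder has depth 0."""
    depths = {}
    for root, dirs, file_names in os.walk(start):
        relative = os.path.relpath(root, start)
        if relative == ".":
            depths["."] = 0
        else:
            depths[relative] = relative.count(os.sep) + 1
    return depths

# Build a small example tree (names are illustrative)
start = tempfile.mkdtemp()
os.makedirs(os.path.join(start, "Reports", "2022 reports"))
os.makedirs(os.path.join(start, "Reports", "2023 reports"))
os.makedirs(os.path.join(start, "Minutes"))

depths = folder_depths(start)
for folder, depth in sorted(depths.items()):
    print("  " * depth + folder)
```

With this approach both year folders report the same depth, which makes the printed tree easier to read as a hierarchy.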
### Exercise 5
---
There are some accidental copies of files in the directory.  

Walk through the directory, find all files with 'copy' in the name and delete them.

List all the files and folders in the directory and its sub-directories.

In [None]:
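def delete_files_with_pattern_in_name(folder_path, pattern):
  # '**' with recursive=True matches the folder itself and all of its subfolders
  file_paths_to_delete = f'{folder_path}/**/*{pattern}*'
  for file_path in glob.iglob(file_paths_to_delete, recursive=True):
    if os.path.isfile(file_path):  # os.remove cannot delete folders
      os.remove(file_path)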
delete_file_copies = delete_files_with_pattern_in_name(acme_directory_path,'Copy')
show_acme_directory_tree_after_deleting_files = list_directory_structure(acme_directory_path)


Level 1

The root folder of level 1 is Acme File System 

The folders in Acme File System are:

Reports
Minutes

The files in Acme File System are:

Bank details.txt
Customer payments.txt
Bobs Diary.txt
Staff addresses.txt
Security Project.txt
Temp staff.txt
Company Accounts.txt

Level 2

The root folder of level 2 is Reports 

The folders in Reports are:

2023 reports
2022 reports

The files in Reports are:

No files

Level 3

The root folder of level 3 is 2023 reports 

The folders in 2023 reports are:

No folders

The files in 2023 reports are:

June 2023 report.txt
Mar 2023 report.txt
May 2023 report.txt
February 2023 report.txt
January 2023 report.txt
April 2023 report.txt

Level 4

The root folder of level 4 is 2022 reports 

The folders in 2022 reports are:

No folders

The files in 2022 reports are:

September 2022 report.txt
July 2022 report.txt
November 2022 report.txt
October 2022 report.txt
December  2022 report.txt
August 2022 report.txt

Level 5

The root folder of level

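The exercise asks for files with 'copy' in the name, while the code matches `Copy` exactly; `glob` patterns are case-sensitive on Linux. Because deletion cannot be undone, it can also help to preview what will be removed first. A minimal sketch of both ideas using `os.walk()` follows; `find_files_containing` and the sample file names are hypothetical.

```python
import os
import tempfile

def find_files_containing(folder, text):
    """Return paths of files whose name contains text, ignoring case."""
    matches = []
    for root, dirs, file_names in os.walk(folder):
        for name in file_names:
            if text.lower() in name.lower():
                matches.append(os.path.join(root, name))
    return sorted(matches)

# Small example tree with illustrative names
folder = tempfile.mkdtemp()
os.mkdir(os.path.join(folder, "Reports"))
sample_files = [
    "Accounts.txt",
    "Accounts - Copy.txt",
    os.path.join("Reports", "May report - copy.txt"),
]
for relative in sample_files:
    with open(os.path.join(folder, relative), "w") as f:
        f.write("example")

to_delete = find_files_containing(folder, "copy")
for path in to_delete:      # preview first
    print("Would delete:", path)
for path in to_delete:      # then delete
    os.remove(path)

remaining = find_files_containing(folder, "")  # empty text matches every file
```

Matching on 'copy' alone would also catch a name such as `Photocopier manual.txt`, so a stricter search text such as `' - Copy'` may be safer in practice.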
### Exercise 6
---
The files in the minutes and the reports directories are labelled by month.  Consequently, they are not stored in date order.  

Rename all files with a month in the name to use the month number rather than the name (e.g. replace 'August' with '08', 'December' with '12', etc.).

List all the files and folders in the directory and its sub-directories.

In [None]:
def rename_files_based_on_pattern(folder, pattern, substitute):
  file_paths_to_replace = f'{folder}/**/*{pattern}*'
  for path in glob.iglob(file_paths_to_replace,recursive=True):
    if os.path.isfile(path):
      file_current_directory = os.path.split(path)[0]
      file_name = os.path.split(path)[1]
      new_file_name = file_name.replace(pattern, substitute)
      new_path = os.path.join(file_current_directory, new_file_name)
      os.rename(path, new_path)
def replace_months_with_numbers_in_file_names(folder):
  months_numbers = {"January":'01',"February":'02',"March":'03',"April":'04',"May":'05',"June":'06',"July":'07',"August":'08',"September":'09',"October":'10',"November":'11',"December":'12'}
  for month, number in months_numbers.items():
    rename_files_based_on_pattern(folder,month,number)
replace_months_with_numbers_in_acme = replace_months_with_numbers_in_file_names(acme_directory_path)
show_acme_directory_tree_after_replacing_months = list_directory_structure(acme_directory_path)


Level 1

The root folder of level 1 is Acme File System 

The folders in Acme File System are:

Reports
Minutes

The files in Acme File System are:

Bank details.txt
Customer payments.txt
Bobs Diary.txt
Staff addresses.txt
Security Project.txt
Temp staff.txt
Company Accounts.txt

Level 2

The root folder of level 2 is Reports 

The folders in Reports are:

2023 reports
2022 reports

The files in Reports are:

No files

Level 3

The root folder of level 3 is 2023 reports 

The folders in 2023 reports are:

No folders

The files in 2023 reports are:

05 2023 report.txt
02 2023 report.txt
Mar 2023 report.txt
06 2023 report.txt
04 2023 report.txt
01 2023 report.txt

Level 4

The root folder of level 4 is 2022 reports 

The folders in 2022 reports are:

No folders

The files in 2022 reports are:

10 2022 report.txt
07 2022 report.txt
12  2022 report.txt
09 2022 report.txt
08 2022 report.txt
11 2022 report.txt

Level 5

The root folder of level 5 is Minutes 

The folders in Minutes are:

No

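In the listing above, `Mar 2023 report.txt` keeps its name because the dictionary looks for `March`, while the file uses the abbreviation `Mar`. One possible way to handle both forms is a regular expression that matches the first three letters of a month plus any letters that follow. This is a sketch on plain file names only (no files are renamed), and `month_name_to_number` is a hypothetical helper.

```python
import re

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

# Map 'jan' -> '01', 'feb' -> '02', and so on
MONTH_NUMBERS = {abbr: f"{n:02d}" for n, abbr in enumerate(MONTHS, start=1)}

# A month abbreviation at the start of a word, optionally followed by more letters
MONTH_PATTERN = re.compile(r"\b(" + "|".join(MONTHS) + r")[a-z]*\b", re.IGNORECASE)

def month_name_to_number(file_name):
    """Replace a full or abbreviated English month name with its two-digit number."""
    return MONTH_PATTERN.sub(lambda m: MONTH_NUMBERS[m.group(1).lower()], file_name)

for name in ["Mar 2023 report.txt", "March 2023 minutes.txt", "December  2022 report.txt"]:
    print(name, "->", month_name_to_number(name))
```

The pattern is deliberately loose: it would also rewrite a word such as `Marketing`, so for real use it may be better to list the accepted spellings explicitly.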
### Exercise 7
---
File names that contain spaces can cause problems in some operating systems and command-line tools, where they have to be quoted or escaped.  

Rename all files to replace spaces with underscores ( _ )  

List all the files and folders in the directory and its sub-directories.

In [None]:
rename_acme_files_by_replacing_blanks_with_underscores = rename_files_based_on_pattern(acme_directory_path," ","_")
show_acme_directory_tree_after_replacing_blanks_with_underscores = list_directory_structure(acme_directory_path)


Level 1

The root folder of level 1 is Acme File System 

The folders in Acme File System are:

Reports
Minutes

The files in Acme File System are:

Staff_addresses.txt
Customer_payments.txt
Bank_details.txt
Company_Accounts.txt
Security_Project.txt
Bobs_Diary.txt
Temp_staff.txt

Level 2

The root folder of level 2 is Reports 

The folders in Reports are:

2023 reports
2022 reports

The files in Reports are:

No files

Level 3

The root folder of level 3 is 2023 reports 

The folders in 2023 reports are:

No folders

The files in 2023 reports are:

04_2023_report.txt
01_2023_report.txt
05_2023_report.txt
06_2023_report.txt
02_2023_report.txt
Mar_2023_report.txt

Level 4

The root folder of level 4 is 2022 reports 

The folders in 2022 reports are:

No folders

The files in 2022 reports are:

12__2022_report.txt
10_2022_report.txt
08_2022_report.txt
11_2022_report.txt
07_2022_report.txt
09_2022_report.txt

Level 5

The root folder of level 5 is Minutes 

The folders in Minutes are:

No

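Replacing each space separately turns the double space in `December  2022 report.txt` into a double underscore, as `12__2022_report.txt` in the listing shows. If a single underscore is preferred, a regular expression can collapse each run of whitespace in one step. A minimal sketch on plain strings follows; `spaces_to_underscores` is a hypothetical helper.

```python
import re

def spaces_to_underscores(file_name):
    """Replace every run of one or more whitespace characters with one underscore."""
    return re.sub(r"\s+", "_", file_name.strip())

print(spaces_to_underscores("12  2022 report.txt"))   # 12_2022_report.txt
print(spaces_to_underscores("Bank details.txt"))      # Bank_details.txt
```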
### Exercise 8
---
Make a backup copy of the whole folder.

In [None]:
destination_directory_path = '/content/Acme File System Backup'
create_backup_acme_folder = shutil.copytree(acme_directory_path, destination_directory_path, ignore_dangling_symlinks=False, dirs_exist_ok=True)

### Exercise 9
---
Zip the backup folder (use the `shutil` or `zipfile` library; help here: https://www.guru99.com/python-zip-file.html).

Download the zipped file.  You can use:   
```
from google.colab import files
files.download('example.txt')
```



In [None]:
create_archive_of_acme_backup_folder = shutil.make_archive('Acme File System Backup',"zip", '/content/Acme File System Backup')
download_archive_of_acme_folder = files.download('Acme File System Backup.zip')



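Before relying on a backup, it is worth checking that the archive contains what is expected. The sketch below repeats the copy-and-zip steps on a small temporary folder (the names are made up for illustration) and then lists the archive contents with the standard `zipfile` module.

```python
import os
import shutil
import tempfile
import zipfile

# Build a small example folder (illustrative names only)
work = tempfile.mkdtemp()
source = os.path.join(work, "Example Folder")
os.makedirs(os.path.join(source, "Reports"))
with open(os.path.join(source, "Reports", "01_2023_report.txt"), "w") as f:
    f.write("example")

# Copy the folder, then zip the copy; make_archive returns the path of the zip file
backup = shutil.copytree(source, os.path.join(work, "Example Folder Backup"))
archive_path = shutil.make_archive(backup, "zip", root_dir=backup)

# List the files stored in the archive (paths inside a zip always use '/')
with zipfile.ZipFile(archive_path) as archive:
    archived_files = [n for n in archive.namelist() if not n.endswith("/")]
print(archived_files)
```

Comparing `archived_files` with a listing of the original folder is a simple way to confirm that nothing was missed.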
### Final challenge
---

Create a new Colab notebook. Write a clean-up program that will:

*  Allow the user to specify the name of a zipped folder to upload.
*  Allow the zipped folder to be uploaded.
*  Unzip the folder.
*  Clean up the folder in a similar way to the exercises above.
*  Back up the cleaned folder.
*  Zip the backup and download it.

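The one step not covered by the exercises is unzipping. A minimal sketch using `shutil.unpack_archive()` follows; it builds its own small zip file in a temporary folder so that it runs anywhere, whereas in Colab the zip would come from an upload. All file and folder names here are illustrative.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()

# Create a small zip file to stand in for the uploaded one
original = os.path.join(work, "original")
os.mkdir(original)
with open(os.path.join(original, "May 2023 report.txt"), "w") as f:
    f.write("example")
zip_path = shutil.make_archive(os.path.join(work, "upload"), "zip", root_dir=original)

# Unzip it into a new folder; the format is inferred from the .zip extension
extracted = os.path.join(work, "extracted")
shutil.unpack_archive(zip_path, extracted)
extracted_files = os.listdir(extracted)
print(extracted_files)
```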
