# <center> **Project Report** </center>
<center><span style="color: #999999;">By Fe, George, and Husam</span></center>


**Welcome to the project report for our Python group project!**

Before reading this report, please make sure you have read the `README.md` file, which gives the relevant background information and the instructions for running the code. For any further questions, please use one of the contact channels listed in the `README.md` file. Thank you for your cooperation!

## Motivation

The motivation for this project arose when we realized that the Python code submitted by students was corrected manually, sheet by sheet, by the teachers of the course. This was especially laborious because it was a beginners' course, which meant considerably more effort than in other courses. It therefore seemed sensible to at least spare the teachers the additional step of going through content that could already be confirmed to be correct. The goal of this project was to lighten the correction process.

## Summary of Working

The code functions in three main steps and one cleanup step:

1. **Organisation**: This step identifies the folder structure, looks for the relevant exercise sheets as zip files, unzips them, identifies the exercises inside them, and sets them up for correction/grading.

2. **Grading/Correcting**: This is the actual work on the exercise sheets, mainly specific checks based on the details of each task. Every check boils down to a `statement`. When it is `True`, the exercise passes, and when it is `False`, the exercise fails.

3. **Grading and sorting**: This is the final step before cleanup. Depending on whether the exercise passed or failed (whether `statement` is `True` or `False`), the points are entered into `Points_Log.txt` and the exercise moves to a folder named according to the outcome. If it was unsuccessful, it moves to a folder for manual correction.

4. **Cleanup**: This step deletes commonly left-over folders such as `__MACOSX` (a remnant of extracting zip files created on macOS) and `__pycache__` (which holds the bytecode that Python caches when it imports modules; it is not needed for the correction). It then scans for files that the system could not recognize as part of an exercise sheet and moves them to a separate folder called `Unrecognized Sheets`.

## Ground code

### Function: select_parent_folder()

The first step of the entire process is to select a parent directory. This is the directory that contains the child folders with the exercise sheets in them. The code for this is simple and just uses the `askdirectory` function from tkinter's `filedialog` module, which is built exactly for this purpose:


In [None]:
from tkinter import filedialog


def select_parent_folder():
    parent_folder = filedialog.askdirectory(title="Select Parent Folder")
    return parent_folder

This opens the system file explorer, which lets the user navigate the files and folders on the PC. The path of the selected folder is stored in a variable:

In [None]:
parent_folder = select_parent_folder()

The user may well close the file explorer without selecting a folder, in which case `askdirectory` returns an empty value. For that reason, we carry out the extraction process only if a folder has been selected (by using an `if` statement):

In [None]:
if parent_folder:
    extract_sheets(parent_folder)

### Function: extract_sheets(parent_folder)

The `extract_sheets` function does what its name suggests: it extracts the sheets. It also handles some smaller tasks that are part of a clean sheet-extraction process. To start, it needs to perform the same tasks individually for each folder within the parent folder, so we need some way of checking which folders lie within it. For this, the `os` module has a function called `listdir` that returns a list of the items inside a specified folder. We run a `for` loop over that list and keep only the entries that are directories:

In [None]:
for folder_name in os.listdir(parent_folder):
    folder_path = os.path.join(parent_folder, folder_name)

    if os.path.isdir(folder_path):
        ...  # the per-folder steps described below run here
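
As an alternative sketch, `os.scandir` yields entries that already know whether they are directories, so the path does not have to be joined and tested by hand. The helper name `list_student_folders` and the demo folder names are hypothetical and not part of the project code.

```python
import os
import tempfile


def list_student_folders(parent_folder):
    """Return the paths of all direct subfolders of parent_folder, sorted."""
    with os.scandir(parent_folder) as entries:
        return sorted(entry.path for entry in entries if entry.is_dir())


# Small demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as demo_parent:
    os.mkdir(os.path.join(demo_parent, "student_a"))
    os.mkdir(os.path.join(demo_parent, "student_b"))
    # A loose file in the parent folder is ignored
    with open(os.path.join(demo_parent, "notes.txt"), "w") as handle:
        handle.write("not a folder")
    found_names = [
        os.path.basename(path) for path in list_student_folders(demo_parent)
    ]
    print(found_names)
```

Sorting the result keeps the processing order stable between runs, which `os.listdir` does not guarantee.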

After looking in each folder, it performs a multitude of tasks, namely:

### 1. Creating required folders:

A simple `for` loop runs through the names of all the necessary folders and then creates them if they don't exist already:

In [None]:
for folder in [
    "Already Extracted Sheets", "Manual Correction Needed",
    "Successful Sheets", "TXT Files", "Unrecognized Sheets"
]:
    temp_folder = os.path.join(
        folder_path,
        folder)
    os.makedirs(temp_folder, exist_ok=True)
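
Because the same five folder names are needed again in later steps, one option is to define them once and reuse them. The following is a minimal sketch of that idea; the names `MANAGED_FOLDERS` and `ensure_managed_folders` are hypothetical and do not appear in the project code.

```python
import os
import tempfile

# Hypothetical constant; the folder names are the ones used in this report
MANAGED_FOLDERS = (
    "Already Extracted Sheets",
    "Manual Correction Needed",
    "Successful Sheets",
    "TXT Files",
    "Unrecognized Sheets",
)


def ensure_managed_folders(folder_path):
    """Create every managed folder inside folder_path if it is missing."""
    for name in MANAGED_FOLDERS:
        os.makedirs(os.path.join(folder_path, name), exist_ok=True)


with tempfile.TemporaryDirectory() as demo_folder:
    ensure_managed_folders(demo_folder)
    # exist_ok=True makes a second call harmless
    ensure_managed_folders(demo_folder)
    created = sorted(os.listdir(demo_folder))
    print(created)
```

With a single constant, adding or renaming a folder would only require a change in one place.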

### 2. Creation of "Points Log" and addition of first exercise points:

A `txt` file is created that stores the point balance and a log of all points obtained. The points for the first exercise are then entered automatically, because the first exercise awards free points to all members of the course.

In [None]:
# Creates points log
points_log_path = os.path.join(folder_path, "Points_Log.txt")
# Adds points for Sheet 01 Task 01
if not os.path.exists(points_log_path):
    with open(points_log_path, 'w') as points_log:
        point_balance = 5
        ex01_log = "Sheet 01 Task 01 IDE Installation: +5 Points\n"
        points_log.write(f"File name: {folder_name}\n")
        points_log.write(f"Point balance: {point_balance}\n")
        points_log.write("\nLogs:\n")
        points_log.write(ex01_log)
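
Later tasks have to raise the balance and append to the log. The report does not show how the project does this, so the following is only a sketch that assumes the file layout written above (a `Point balance:` line followed by the log entries). The helper name `add_points` and the example task description are hypothetical.

```python
import os
import tempfile


def add_points(points_log_path, description, points):
    """Raise the balance line by `points` and append a log entry."""
    with open(points_log_path, "r") as points_log:
        lines = points_log.readlines()
    new_balance = None
    for index, line in enumerate(lines):
        if line.startswith("Point balance:"):
            new_balance = int(line.split(":", 1)[1]) + points
            lines[index] = f"Point balance: {new_balance}\n"
            break
    if new_balance is None:
        raise ValueError("No 'Point balance:' line found in the log")
    lines.append(f"{description}: +{points} Points\n")
    with open(points_log_path, "w") as points_log:
        points_log.writelines(lines)
    return new_balance


with tempfile.TemporaryDirectory() as demo_folder:
    demo_log_path = os.path.join(demo_folder, "Points_Log.txt")
    # Recreate the initial log in the format shown above
    with open(demo_log_path, "w") as points_log:
        points_log.write("File name: demo_student\n")
        points_log.write("Point balance: 5\n")
        points_log.write("\nLogs:\n")
        points_log.write("Sheet 01 Task 01 IDE Installation: +5 Points\n")
    # Example values only; the real task names and points may differ
    updated_balance = add_points(
        demo_log_path, "Sheet 01 Task 02 Hello World", 5)
    with open(demo_log_path) as points_log:
        updated_log = points_log.read()
```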

### 3. Extraction of sheets

Now comes the part where the function lives up to its name: the extraction. Before extracting, it first checks whether the exercise sheet in question has already been extracted. If so, it shows a notice, marks the zip file with a `(Not extracted)` suffix, and moves on. Another problem we faced was that some students named their zip files with ".zip" inside the name, for example `sheet01.zip.zip`. Extracting such a file produces a folder called `sheet01.zip` (which is not a zip file but simply a directory with .zip in its name). A zip file called `sheet01.zip` and a directory called `sheet01.zip` cannot exist side by side, because two entries in the same directory cannot share a name, so the extraction throws an error. For that reason, we put the extraction in a `try` block, and in the event of an error (`FileNotFoundError` on Windows, `NotADirectoryError` on macOS), we added a few extra steps that fix this naming problem:

In [None]:
# Extracts all 5 exercise sheets
sheet_names = ["sheet01.zip", "sheet02.zip", "sheet03.zip",
               "sheet04.zip", "sheet05.zip"]
for sheet_name in sheet_names:
    sheet_zip_path = os.path.join(folder_path, sheet_name)
    # Warning message if it's already been extracted
    if os.path.exists(sheet_zip_path):
        if os.path.exists(
            os.path.join(
                folder_path,
                "Already Extracted Sheets",
                sheet_name)
        ):
            message = (
                f"{sheet_name} has already been extracted for "
                f"{folder_name}. Extraction for this file "
                "will be skipped."
            )
            messagebox.showinfo("Sheet Already Extracted", message)
            # Adds "not extracted" suffix
            zip_files = sheet_name.replace(
                '.zip', ' (Not extracted).zip')
            not_extracted_path = os.path.join(
                folder_path, f"{zip_files}")
            os.rename(sheet_zip_path, not_extracted_path)
            continue
        with zipfile.ZipFile(sheet_zip_path, 'r') as zip_ref:
            try:
                zip_ref.extractall(folder_path)
            except (FileNotFoundError, NotADirectoryError):
                folder_to_delete = os.path.join(
                    folder_path,
                    "%temp%")
                os.makedirs(folder_to_delete, exist_ok=True)
                zip_ref.extractall(folder_to_delete)
                for item in os.listdir(folder_to_delete):
                    item_path = os.path.join(folder_to_delete, item)
                    # Strips a stray ".zip" from extracted directory names
                    if os.path.isdir(item_path) and item.endswith(".zip"):
                        renamed_path = os.path.join(
                            folder_to_delete, item[:-4])
                        os.rename(item_path, renamed_path)
                        item_path = renamed_path
                    # Moves the (possibly renamed) item up one level
                    shutil.move(item_path, folder_path)
                os.rmdir(folder_to_delete)
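
The name collision can also be avoided without relying on an exception by always extracting into a temporary staging folder first. This is a sketch of that alternative, not the project's approach; the helper name `extract_via_staging` is hypothetical, and the demo archive is built on the fly to reproduce the `sheet01.zip` folder case.

```python
import os
import shutil
import tempfile
import zipfile


def extract_via_staging(zip_path, destination):
    """Extract zip_path into destination by way of a staging folder.

    Top-level directories whose names end in ".zip" lose that ending
    before they are moved, so they cannot collide with the archive.
    """
    with tempfile.TemporaryDirectory(dir=destination) as staging:
        with zipfile.ZipFile(zip_path, "r") as zip_ref:
            zip_ref.extractall(staging)
        for item in os.listdir(staging):
            item_path = os.path.join(staging, item)
            target_name = item
            if os.path.isdir(item_path) and item.endswith(".zip"):
                target_name = item[:-4]
            shutil.move(item_path, os.path.join(destination, target_name))


with tempfile.TemporaryDirectory() as demo_folder:
    demo_zip_path = os.path.join(demo_folder, "sheet01.zip")
    # Archive whose only top-level folder is itself called "sheet01.zip"
    with zipfile.ZipFile(demo_zip_path, "w") as demo_zip:
        demo_zip.writestr(
            "sheet01.zip/helloworld.py", "print('Hello World!')\n")
    extract_via_staging(demo_zip_path, demo_folder)
    extracted_ok = os.path.isfile(
        os.path.join(demo_folder, "sheet01", "helloworld.py"))
    leftovers = sorted(os.listdir(demo_folder))
```

The staging folder is removed automatically when the `with` block ends, so no manual `rmdir` call is needed.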

Another problem was that most of the zip files contained an outer folder, which added an extra level before the exercise files. For example, a zip file named `exercise01.zip` contained a folder called `exercise01`, which in turn held the tasks of the exercise. For this reason, our code unpacks every folder that contains Python files and is not one of the folders we created initially. We then use `rmdir` to delete the now-empty folders:

In [None]:
for item in os.listdir(folder_path):
    item_path = os.path.join(folder_path, item)
    if os.path.isdir(item_path) and item not in [
            "__MACOSX", "__pycache__",
            "Already Extracted Sheets",
            "Manual Correction Needed",
            "Successful Sheets", "TXT Files",
            "Unrecognized Sheets"]:
        py_file_found = any(
            sub_item.endswith(".py") and os.path.isfile(
                os.path.join(item_path, sub_item))
            for sub_item in os.listdir(item_path)
        )
        if py_file_found:
            for sub_item in os.listdir(item_path):
                sub_item_path = os.path.join(
                    item_path, sub_item)
                new_item_path = os.path.join(
                    folder_path, sub_item)
                os.rename(sub_item_path, new_item_path)
            # Remove the now empty additional folder
            os.rmdir(item_path)

### 4. TXT Files

We then moved all extracted TXT files to a folder called `TXT Files`, prefixing each file name with the name of the exercise sheet it came from:

In [None]:
already_extracted_path = os.path.join(
    folder_path,
    "Already Extracted Sheets",
    sheet_name)
os.rename(sheet_zip_path, already_extracted_path)
for extracted_file in os.listdir(folder_path):
    extracted_file_path = os.path.join(
        folder_path, extracted_file)
    # Looks for TXT files
    if extracted_file.lower().endswith(
            '.txt') and extracted_file != "Points_Log.txt":
        # Prefixes the file name with the sheet name
        new_txt_path = os.path.join(
            folder_path, "TXT Files",
            f"{sheet_name.replace('.zip', '')}"
            f"_{extracted_file}")
        # Puts them into the TXT files folder
        os.rename(extracted_file_path, new_txt_path)

### 5. Exercise correction

We then ran functions responsible for correcting each task in the exercise sheet:

In [None]:
# Exercise sheet correction functions
helloworld(folder_path)
username(folder_path)
crosssum(folder_path)
lifeinweeks(folder_path)
leapyear(folder_path)
million(folder_path)
caesar_cipher(folder_path)
books(folder_path)
anagrams(folder_path)
data(folder_path)
graph(folder_path)
zen(folder_path)
shapes(folder_path)
zen_word_frequency(folder_path)
quotes(folder_path)  # Remove if needed
names(folder_path)
tictactoe(folder_path)
README(folder_path)
project(folder_path)
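
The checker functions themselves are not shown in this report. Purely as an illustration, a checker of this kind might run the submitted script, compare its output with an expected string to obtain `statement`, and sort the file accordingly. Everything in this sketch is an assumption: the helper name `check_script_output`, the file names, the expected output, and the 10-second timeout are hypothetical and not taken from the project code.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def check_script_output(folder_path, script_name, expected_output):
    """Run one submitted script and sort it by whether its output matches."""
    script_path = os.path.join(folder_path, script_name)
    if not os.path.isfile(script_path):
        return None  # nothing was submitted under this name
    try:
        result = subprocess.run(
            [sys.executable, script_path],
            capture_output=True, text=True, timeout=10,
        )
        statement = result.stdout.strip() == expected_output
    except subprocess.TimeoutExpired:
        statement = False  # a script that hangs counts as failed
    target = "Successful Sheets" if statement else "Manual Correction Needed"
    target_folder = os.path.join(folder_path, target)
    os.makedirs(target_folder, exist_ok=True)
    shutil.move(script_path, os.path.join(target_folder, script_name))
    return statement


with tempfile.TemporaryDirectory() as demo_folder:
    with open(os.path.join(demo_folder, "helloworld.py"), "w") as handle:
        handle.write("print('Hello World!')\n")
    with open(os.path.join(demo_folder, "broken.py"), "w") as handle:
        handle.write("print('something else')\n")
    passed = check_script_output(demo_folder, "helloworld.py", "Hello World!")
    failed = check_script_output(demo_folder, "broken.py", "Hello World!")
    passed_moved = os.path.isfile(
        os.path.join(demo_folder, "Successful Sheets", "helloworld.py"))
    failed_moved = os.path.isfile(
        os.path.join(demo_folder, "Manual Correction Needed", "broken.py"))
```

Returning `None` for a missing file lets the caller distinguish "not submitted" from "failed".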

### 6. Deletion and Final Cleanup

The final step was to delete `__MACOSX` and `__pycache__`, and to put everything else into a folder for unrecognized sheets:

In [None]:
# Removes commonly found folders
folders_to_delete = ["__MACOSX", "__pycache__"]
# A separate loop variable avoids shadowing the outer folder_name
for leftover_name in folders_to_delete:
    folder_to_delete_path = os.path.join(folder_path, leftover_name)
    # isdir() is False for missing paths, so no exists() check is needed
    if os.path.isdir(folder_to_delete_path):
        shutil.rmtree(folder_to_delete_path)
# Checks for files with wrong names
for remaining_item in os.listdir(folder_path):
    remaining_item_path = os.path.join(folder_path, remaining_item)
    if remaining_item not in ["Already Extracted Sheets",
                              "Manual Correction Needed",
                              "Successful Sheets", "TXT Files",
                              "Unrecognized Sheets", "Points_Log.txt"]:
        # Move the item to the "Unrecognized Sheets" folder
        new_path = os.path.join(
            folder_path, "Unrecognized Sheets", remaining_item)
        shutil.move(remaining_item_path, new_path)