# Import All Necessary Modules And Set Up Project

If any of the imports below fail, install the required modules by running the following command from inside the project directory:
```bash
python -m pip install -r requirements.txt
```

A virtual environment is recommended for this project, so that its packages do not conflict with versions already installed on your system.

To get the project running, activate the virtual environment (if you use one), run the pip install command, and then launch Jupyter Lab from the project directory.

In [None]:
# Uncomment the following line to execute the pip install
# %pip install -r requirements.txt

In [None]:
import pandas as pd
import numpy as np

# Visualization
from matplotlib import pyplot as plt
import seaborn as sns

from measure_incremental_development.compute import calculate_mid, classify_snapshots


## Get DF Representing Single Student Submission And File

In [None]:
from getSubmissionDataframes import *

`getSubmissionDataframes` contains the following functions:

*   `getFileInStudentSubmission`
*   `getStudentSubmission`
*   `filterDownToRunAndEdits`
*   `filterDownToRunAndEditsAndPastes`
*   `getStudentSubmissionRunsAndEdits`
*   `getFileInStudentSubmissionRunsAndEdits`

## Reconstruct Submissions

In [None]:
from reconstructSubmissions import *

`reconstructSubmissions` has the following functions:

*   `reconstructSingleFileDebugger`
*   `reconstructFinalFile`
*   `reconstructFileAtRunEvents`
*   `reconstructProjectAtRunEvents`

#### View Reconstructed Project

In [None]:
from viewReconstructions import *

`viewReconstructions` has the following functions:

*   `viewFinalReconstructedProject`
*   `viewReconstructedProjectStates`

#### Get Student Project Info

In [None]:
from getStudentProjectInfo import *

`getStudentProjectInfo` has the following function:

*   `getStudentProjectList`

## Load Datasets

In [None]:
keystroke_df_unedited = pd.read_csv("data/keystrokes.csv")
student_df_unedited = pd.read_csv("data/students.csv")

#### Copy Datasets For Modification

Working on copies preserves the datasets as loaded, in case an unedited column or row is needed again later.

In [None]:
keystroke_df = keystroke_df_unedited.copy()
student_df = student_df_unedited.copy()
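As a minimal sketch of why the copies matter: `DataFrame.copy()` makes a deep copy by default, so changes to the working frame do not reach the original. The frame and its column names (`EventType`, `Value`) are hypothetical and are not taken from `keystrokes.csv`.

```python
import pandas as pd

# Hypothetical stand-in for one of the loaded datasets
original = pd.DataFrame({"EventType": ["edit", "run"], "Value": [1, 2]})

# copy() is deep by default, so the working frame owns its own data
working = original.copy()

# Modify and trim the working copy only
working["Value"] = working["Value"] * 10
working = working.drop(columns=["EventType"])

# The original still has both columns and its initial values
original_columns = list(original.columns)
original_values = original["Value"].tolist()
working_values = working["Value"].tolist()
```

A dropped or overwritten column can then be restored from the unedited frame without rereading the CSV file.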

#### Reconstructing All Projects

Note: of the three values returned here, `final_data` is the one we mainly care about.

In [None]:
projects_df, run_events_df, final_data = getStudentProjectList(student_df, keystroke_df)

#### Create Directories For Projects, If Necessary

In [None]:
PROJECT_RECONSTRUCTION_DIRECTORY = 'reconstructed_submissions'

In [None]:
import os
import pathlib

directoryPath = pathlib.Path(PROJECT_RECONSTRUCTION_DIRECTORY)

directoryPath.mkdir(exist_ok=True)

In [None]:
for student, assign, projStates in final_data:
    # Skip projects that have no files to save
    if len(projStates) > 0:
        # Create the directory for this student/assignment pair
        directoryName = PROJECT_RECONSTRUCTION_DIRECTORY + "/" + student + "/" + assign + "/"
        studentProjectPath = pathlib.Path(directoryName)
        studentProjectPath.mkdir(parents=True, exist_ok=True)

        # Save the final reconstructed state of each file
        for fileName in projStates:
            # Skip files whose final state is empty
            if len(projStates[fileName][-1]) > 0:
                # The with block closes the file even if the write fails
                with open(directoryName + fileName, 'w') as newFile:
                    print(projStates[fileName][-1], file=newFile)
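
The loop above implies that `final_data` is a sequence of `(student, assignment, states)` entries, where `states` maps each file name to a list of reconstructed file contents and the last item is the final state. The sketch below runs the same save logic against a temporary directory so the resulting layout can be inspected safely. The sample data and the helper name `save_final_states` are hypothetical; they are not part of the project's modules.

```python
import pathlib
import tempfile

# Hypothetical data in the shape the loop above expects:
# (student, assignment, {file name: [state at each run event, ..., final state]})
sample_final_data = [
    ("student_a", "assign_1", {"main.py": ["print('hi')", "print('hello')"]}),
    ("student_b", "assign_1", {"main.py": ["x = 1", ""]}),  # empty final state
    ("student_c", "assign_1", {}),  # no files at all
]


def save_final_states(data, root):
    """Write the final state of each non-empty file to root/student/assignment/."""
    written = []
    for student, assign, proj_states in data:
        if not proj_states:
            continue  # nothing to save, so no directory is created
        project_dir = pathlib.Path(root) / student / assign
        project_dir.mkdir(parents=True, exist_ok=True)
        for file_name, states in proj_states.items():
            if states and states[-1]:
                target = project_dir / file_name
                # Trailing newline mirrors what print(..., file=...) writes
                target.write_text(states[-1] + "\n", encoding="utf-8")
                written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    saved = save_final_states(sample_final_data, tmp)
    saved_names = [p.relative_to(tmp).as_posix() for p in saved]
    contents = saved[0].read_text(encoding="utf-8")
    empty_dir_exists = (pathlib.Path(tmp) / "student_b" / "assign_1").is_dir()
    no_files_dir_exists = (pathlib.Path(tmp) / "student_c").exists()
```

As in the notebook's loop, a project whose only file ends up empty still gets its directory, while a project with no files gets none. Unlike the notebook's loop, the sketch also skips a file whose list of states is empty instead of raising an `IndexError`.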
