# **Utilising Code Cleansing Methods**
- In this section, I applied linting, logging, docstrings and refactoring to the code.
- The link to the code can be found below:
- 


### **Unit Testing for Data Preprocessing**
- In the unit tests, we aim to obtain at least 30% test coverage for each part.
- To cover at least 50% of the `DataPreprocessor` class, I will create test cases for its key functionalities.
- The unit tests will be written using the unittest framework in Python.
- This coverage will include:
1. Testing database connection.
2. Reading data from the database into a DataFrame.
3. Cleaning and preprocessing DataFrames.
4. Normalizing DataFrames.
5. Handling data insertion into the database.

### **Setup for Unit Testing**

- For unit testing database interactions, we will mock the database connections and operations to avoid actual database dependencies.
- I will use the `unittest.mock` library to mock the connections and data fetch operations.
- The Python test file being run is linked below:

- I will run it using the following command: `python -m unittest discover -s . -p "test_preprocessor.py"`
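
The mocking approach described above can be sketched as follows. This is a minimal illustration, not the project's actual test file: `fetch_rows`, the query text and the sample rows are hypothetical names chosen for the example, and the connection is assumed to follow the usual DB-API `cursor()` / `execute()` / `fetchall()` pattern.

```python
import unittest
from unittest.mock import MagicMock


def fetch_rows(connection, query):
    """Hypothetical helper: run a query and return every row as a list."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        # Always release the cursor, even if the query fails.
        cursor.close()


class TestFetchRows(unittest.TestCase):
    def setUp(self):
        # Stand-in for a real connection; no database is contacted.
        self.connection = MagicMock()
        self.cursor = self.connection.cursor.return_value
        self.cursor.fetchall.return_value = [(1, "minor"), (2, "severe")]

    def test_returns_rows_from_cursor(self):
        rows = fetch_rows(self.connection, "SELECT id, severity FROM incidents")
        self.assertEqual(rows, [(1, "minor"), (2, "severe")])
        self.cursor.execute.assert_called_once_with(
            "SELECT id, severity FROM incidents"
        )

    def test_cursor_is_closed_on_error(self):
        # Simulate a failing query and check that cleanup still happens.
        self.cursor.execute.side_effect = RuntimeError("connection lost")
        with self.assertRaises(RuntimeError):
            fetch_rows(self.connection, "SELECT 1")
        self.cursor.close.assert_called_once()
```

Because the connection is a `MagicMock`, the tests run without a database and can also exercise failure paths that would be hard to trigger against a live server.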

- After running the unit tests with the `coverage` library, I obtained 92% coverage of the code in this section.
- The evidence image can be found here:
- ![image.png](attachment:40c86b37-f749-403c-8bbf-19ab906e1bdc.png)

### **Unit Testing for Model Preprocessing**
To ensure robust and reliable testing of the data preprocessing pipeline built around the `Model2BPreprocessor` class, we will implement unit tests covering its key functionalities. The aim is to achieve at least 30% test coverage for each function and at least 50% overall coverage of the key functionalities in the data processing script.

### **Coverage Goals**
1. **Data Loading**:
   - Test the loading of data into a DataFrame from a simulated source.

2. **Mapping and Encoding Severity Levels**:
   - Test the mapping of incident severity to reduced categories.
   - Verify the correct encoding of the mapped severity levels.

3. **Calculating Accident Count**:
   - Ensure that the accident count per event is correctly calculated and added to the DataFrame.

4. **Handling and Imputing Missing Values**:
   - Test the imputation of missing values in the feature columns.

5. **Standardizing Features**:
   - Validate the standardization of feature columns using `StandardScaler`.
   - Ensure that the mean of the standardized features is approximately 0 and the standard deviation is approximately 1.

6. **Balancing the Dataset**:
   - Test the balancing of the dataset to ensure equal representation of each target class using SMOTE.

7. **Splitting the Dataset**:
   - Verify the splitting of the dataset into training, validation, and test sets.
   - Ensure that the splits keep the expected proportions and that the labels are correctly one-hot encoded.
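
The standardization check in goal 5 can be illustrated with a small, dependency-free sketch. The `standardize` function below is a hypothetical stand-in for what `StandardScaler` does to one column (subtract the mean, divide by the population standard deviation); it is not the project's code.

```python
import statistics
import unittest


def standardize(values):
    """Hypothetical z-score helper for a single feature column."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std, i.e. ddof=0
    if std == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


class TestStandardize(unittest.TestCase):
    def test_mean_is_zero_and_std_is_one(self):
        scaled = standardize([10.0, 20.0, 30.0, 40.0])
        self.assertAlmostEqual(statistics.fmean(scaled), 0.0, places=7)
        self.assertAlmostEqual(statistics.pstdev(scaled), 1.0, places=7)

    def test_constant_column_maps_to_zeros(self):
        self.assertEqual(standardize([5.0, 5.0, 5.0]), [0.0, 0.0, 0.0])
```

One detail worth keeping in mind when writing the real test: `StandardScaler` uses the population standard deviation, while pandas `.std()` defaults to the sample standard deviation (`ddof=1`). On a small mock DataFrame the two differ noticeably, so the assertion should either pass `ddof=0` or use a tolerance.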

### **Setup for Unit Testing**
- The `unittest.mock` library will be used to simulate database connections and data fetch operations to avoid actual dependencies.
- Mocking is applied for operations that involve database interactions.
- Tests will be organized and executed using the `unittest` framework.

### **Command to Run the Unit Tests**
To execute the unit tests, run the following command from your terminal or command line:

```bash
python -m unittest discover -s . -p "test_modelprocessor.py"
```

This command will discover and run all tests in files matching the pattern `test_modelprocessor.py`.

### **Code Coverage**
Using the `coverage` library, you can measure the code coverage of the tests. After running the tests, generate a coverage report using:

```bash
coverage run -m unittest discover -s . -p "test_modelprocessor.py"
coverage report -m
```

### **Explanation of Unit Tests**

1. **`setUp` Method**:
   - Creates a mock DataFrame with relevant columns to be used in each test.
   - Initializes the `Model2BPreprocessor` with the mock DataFrame and specified features.

2. **`test_map_severity`**:
   - Tests if the `map_severity` method correctly maps and encodes the severity levels.
   - Validates the resulting `reduced_severity` and `encoded_severity` columns.

3. **`test_create_accident_count`**:
   - Tests if the `create_accident_count` method correctly calculates and adds the accident count per event to the DataFrame.

4. **`test_handle_missing_values`**:
   - Tests if the `handle_missing_values` method correctly imputes missing values by replacing them with zeros.

5. **`test_standardize_features`**:
   - Tests if the `standardize_features` method correctly standardizes the feature columns.
   - Ensures that the mean and standard deviation of standardized features are approximately 0 and 1, respectively.

6. **`test_balance_dataset`**:
   - Tests if the `balance_dataset` method correctly balances the dataset using SMOTE.
   - Verifies that each class is represented equally after resampling.

7. **`test_split_dataset`**:
   - Tests if the `split_dataset` method correctly splits the data into training, validation, and test sets.
   - Ensures that the labels are one-hot encoded and that the sizes of the splits match expected proportions.

8. **`test_preprocess`**:
   - Tests the full preprocessing pipeline by running the `preprocess` method.
   - Checks if the outputs are as expected and validates the shapes and one-hot encoding of the splits.
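
The split and one-hot checks described in items 7 and 8 might look like the sketch below. `one_hot`, `split_sizes` and the 70/15/15 proportions are assumptions made for illustration; the real `split_dataset` method may use different helpers and fractions.

```python
import unittest


def one_hot(labels, num_classes):
    """Hypothetical encoder: one row per label with a single 1."""
    return [[1 if i == label else 0 for i in range(num_classes)] for label in labels]


def split_sizes(n_samples, val_fraction, test_fraction):
    """Hypothetical helper: sizes of the train, validation and test partitions."""
    n_val = round(n_samples * val_fraction)
    n_test = round(n_samples * test_fraction)
    return n_samples - n_val - n_test, n_val, n_test


class TestSplitAndEncoding(unittest.TestCase):
    def test_split_sizes_cover_every_sample(self):
        train, val, test = split_sizes(100, val_fraction=0.15, test_fraction=0.15)
        self.assertEqual((train, val, test), (70, 15, 15))
        # No sample may be lost or duplicated across partitions.
        self.assertEqual(train + val + test, 100)

    def test_one_hot_rows_have_single_active_class(self):
        encoded = one_hot([0, 2, 1], num_classes=3)
        self.assertEqual(encoded, [[1, 0, 0], [0, 0, 1], [0, 1, 0]])
        for row in encoded:
            self.assertEqual(sum(row), 1)
```

Checking that every encoded row sums to 1 is a cheap way to confirm one-hot encoding without hard-coding the full expected matrix for larger mock datasets.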

- After running the unit tests with the `coverage` library, I obtained 54% coverage of the code in this section.
- The evidence image can be found here:
![image.png](attachment:961f9e35-78ac-410e-aa9e-517d2b4dfe80.png)

### **Unit Testing for Model Training**
- To ensure robust and reliable testing of the deep learning model training, we will implement unit tests covering key functionalities of the `Model2BTrainer` class.

### **Coverage Goals**
1. **Model Initialization**:
   - Test the creation of the model with specified input dimensions.
   - Verify if the model layers are correctly configured.

2. **Model Compilation**:
   - Ensure the model is compiled with the specified optimizer and loss function.
   - Check if the metrics include accuracy.

3. **Model Training**:
   - Test the training process with dummy data.
   - Validate the usage of callbacks such as `EarlyStopping` and `ReduceLROnPlateau`.

4. **Model Evaluation**:
   - Assess the model's evaluation capability on a dummy dataset.
   - Confirm if it returns loss and accuracy.

5. **Prediction and Evaluation**:
   - Test the prediction generation on the test set.
   - Verify the confusion matrix and classification report outputs.

6. **Plotting Learning Curves**:
   - Ensure the method can plot training and validation accuracy/loss over epochs.

7. **Saving the Model**:
   - Check if the model saving functionality works correctly.
   - Validate that the saved file exists.

### **Setup for Unit Testing**
- Mocking libraries will be used to simulate the behavior of the Keras/TensorFlow model and training process.
- The `unittest.mock` library will help in avoiding the need to train on actual data, which saves time and resources during testing.
- Tests will be organized and executed using the `unittest` framework.

### **Command to Run the Unit Tests**
To execute the unit tests, run the following command from your terminal or command line:

```bash
python -m unittest discover -s . -p "test_model_training.py"
```

- This command will discover and run all tests in files matching the pattern `test_model_training.py`.

### **Code Coverage**
Using the `coverage` library, you can measure the code coverage of the tests. After running the tests, you can generate a coverage report using:

```bash
coverage run -m unittest discover -s . -p "test_model_training.py"
coverage report -m
```

### **Explanation of Unit Tests**

1. **`setUp` Method**:
   - Sets up mock datasets to be used in each test.
   - Initializes the `Model2BTrainer` with the mock datasets.

2. **`test_compute_class_weights`**:
   - Tests if the `compute_class_weights` method correctly computes and returns the class weights as a dictionary.

3. **`test_build_model`**:
   - Tests if the `build_model` method correctly builds a Keras Sequential model with the expected number of layers.

4. **`test_train_model`**:
   - Mocks the `build_model` method to avoid actual model building.
   - Mocks the `fit` method of the model to avoid actual training.
   - Tests if the `train_model` method calls the `fit` method with the correct parameters.

5. **`test_evaluate_model`**:
   - Mocks the training history and tests if the `evaluate_model` method correctly evaluates and prints the maximum validation accuracy.

6. **`test_save_model`**:
   - Mocks the `save` method of the Keras model to avoid actual file writing.
   - Tests if the `save_model` method calls the `save` method with the correct file path.

7. **`test_plot_learning_curves`**:
   - Mocks the training history and `matplotlib.pyplot` methods.
   - Tests if the `plot_learning_curves` method generates the correct plots for accuracy and loss.

8. **`test_plot_confusion_matrix`**:
   - Mocks the `predict` method of the model to avoid actual prediction.
   - Tests if the `plot_confusion_matrix` method correctly computes and displays the confusion matrix.
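
The pattern behind `test_train_model` and `test_save_model` can be sketched as follows. `TinyTrainer` is a hypothetical stand-in written for this example, not the project's `Model2BTrainer`; its constructor arguments, the `epochs` parameter and the file name are assumptions.

```python
import unittest
from unittest.mock import MagicMock, patch


class TinyTrainer:
    """Hypothetical trainer used only to demonstrate the mocking pattern."""

    def __init__(self, X_train, y_train, epochs=5):
        self.X_train = X_train
        self.y_train = y_train
        self.epochs = epochs
        self.model = None

    def build_model(self):
        raise RuntimeError("real model building should not run in unit tests")

    def train_model(self):
        self.model = self.build_model()
        return self.model.fit(self.X_train, self.y_train, epochs=self.epochs)

    def save_model(self, path):
        self.model.save(path)


class TestTinyTrainer(unittest.TestCase):
    def setUp(self):
        self.trainer = TinyTrainer([[0.1], [0.9]], [0, 1], epochs=3)

    def test_train_model_calls_fit(self):
        fake_model = MagicMock()
        # Replace build_model so no real network is constructed.
        with patch.object(self.trainer, "build_model", return_value=fake_model):
            history = self.trainer.train_model()
        fake_model.fit.assert_called_once_with([[0.1], [0.9]], [0, 1], epochs=3)
        self.assertIs(history, fake_model.fit.return_value)

    def test_save_model_delegates_to_model(self):
        self.trainer.model = MagicMock()
        self.trainer.save_model("model.h5")
        # Nothing is written to disk; only the call is verified.
        self.trainer.model.save.assert_called_once_with("model.h5")
```

These tests verify the wiring between the trainer and the model rather than the quality of the trained model, which is why they run in milliseconds.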


- After running the unit tests with the `coverage` library, I obtained 46% coverage of the code in this section.
- The evidence image can be found here:
![image.png](attachment:5379b693-485f-4e99-8a0c-34af08d1ac41.png)


### **Unit Testing for Model Evaluation**
- To ensure robust and reliable testing of the model evaluation process using the `Model2BEvaluator` class, we will implement unit tests covering key functionalities.
- The aim is to achieve at least 30% test coverage.

### **Coverage Goals**
1. **Prediction**:
   - Test the prediction generation on the test set.
   - Ensure the predictions have the correct shape and type.

2. **Calculate Metrics**:
   - Verify precision, recall, and F1 score calculations.
   - Ensure these metrics are calculated correctly for a given set of predictions.

3. **Plotting Confusion Matrix**:
   - Test the generation of the confusion matrix.
   - Verify that the plot is created without errors.

4. **Model Evaluation**:
   - Assess the model's evaluation capability on the training and test datasets.
   - Confirm that it returns and prints the correct loss and accuracy.

5. **Generate Classification Report**:
   - Validate the creation of the classification report.
   - Ensure it accurately reflects the model’s performance.

6. **Plotting Learning Curves**:
   - Test the plotting of training and validation accuracy/loss over epochs.
   - Ensure the method can plot the learning curves correctly.

7. **Run Complete Evaluation**:
   - Verify that running the full evaluation process calls all the relevant methods and completes without errors.
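
For the metric checks in goal 2, it helps to have a hand-computable reference. The sketch below computes precision, recall and F1 for one class from first principles; `precision_recall_f1` and the sample labels are hypothetical and only serve to produce expected values that a test of the real evaluator could be compared against.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Hypothetical reference implementation for a single positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive=1)
```

With these labels there are two true positives, one false positive and one false negative, so precision, recall and F1 all equal 2/3. A small case like this makes it easy to spot an averaging or label-ordering mistake in the evaluator.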

### **Setup for Unit Testing**
- Use mocking to simulate the behavior of the Keras/TensorFlow model and evaluation process.
- The `unittest.mock` library will help avoid the need to evaluate on actual data, saving time and resources during testing.
- Tests will be organized and executed using the `unittest` framework.

### **Command to Run the Unit Tests**
To execute the unit tests, run the following command from your terminal or command line:

```bash
python -m unittest discover -s . -p "test_model_evaluator.py"
```

- This command will discover and run all tests in files matching the pattern `test_model_evaluator.py`.

### **Code Coverage**
Using the `coverage` library, you can measure the code coverage of the tests. After running the tests, generate a coverage report using:

```bash
coverage run -m unittest discover -s . -p "test_model_evaluator.py"
coverage report -m
```

### **Explanation of Unit Tests**

1. **Test Setup (`setUp` method)**:
   - Creates mock objects and dummy data for testing.
   - Initializes the `Model2BEvaluator` with these mocks.

2. **`test_evaluate_model`**:
   - Mocks the model's `evaluate` method.
   - Ensures that the evaluation metrics for both training and test datasets are retrieved and stored correctly.

3. **`test_plot_learning_curves`**:
   - Mocks `matplotlib.pyplot.show` to avoid displaying the plot during tests.
   - Verifies that the learning curves for accuracy and loss are plotted with the correct number of plot calls.

4. **`test_plot_confusion_matrix`**:
   - Mocks `matplotlib.pyplot.show` and `matplotlib.pyplot.gca`.
   - Ensures that the confusion matrix is computed and plotted correctly.

5. **`test_print_classification_report`**:
   - Mocks the `print` function to capture the printed output.
   - Verifies that the classification report is generated and printed with the expected content.

6. **`test_run_evaluation`**:
   - Mocks all key methods of `Model2BEvaluator` to ensure they are called during the complete evaluation run.
   - Checks if the complete evaluation process calls all necessary methods.
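
The two mocking techniques used in `test_print_classification_report` and `test_run_evaluation` can be sketched as below. `TinyEvaluator` is a hypothetical stand-in, not the project's `Model2BEvaluator`, and the printed text is a placeholder.

```python
import unittest
from unittest.mock import patch


class TinyEvaluator:
    """Hypothetical evaluator used only to demonstrate the mocking pattern."""

    def evaluate_model(self):
        raise RuntimeError("should be mocked in the orchestration test")

    def plot_confusion_matrix(self):
        raise RuntimeError("should be mocked in the orchestration test")

    def print_classification_report(self):
        print("precision recall f1-score")

    def run_evaluation(self):
        self.evaluate_model()
        self.plot_confusion_matrix()
        self.print_classification_report()


class TestTinyEvaluator(unittest.TestCase):
    def test_report_is_printed(self):
        # Patch the built-in print to capture what would reach the console.
        with patch("builtins.print") as fake_print:
            TinyEvaluator().print_classification_report()
        fake_print.assert_called_once_with("precision recall f1-score")

    def test_run_evaluation_calls_each_step(self):
        evaluator = TinyEvaluator()
        with patch.object(evaluator, "evaluate_model") as evaluate, \
                patch.object(evaluator, "plot_confusion_matrix") as plot, \
                patch.object(evaluator, "print_classification_report") as report:
            evaluator.run_evaluation()
        evaluate.assert_called_once_with()
        plot.assert_called_once_with()
        report.assert_called_once_with()
```

Mocking every step in the orchestration test keeps it focused on one question: does `run_evaluation` call each stage exactly once? The behaviour of each stage is covered by its own test.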

Together, these tests exercise the critical functionality of the `Model2BEvaluator` class.

- After running the unit tests with the `coverage` library, I obtained 46% coverage of the code in this section.
- The evidence image can be found here:
![image.png](attachment:bc1c9401-9ff9-4ec1-8c6f-cf4feb92c739.png)

### **Unit Testing for Driving Risk Prediction Application**
- To ensure robust and reliable testing of the `DrivingRiskApp` class, we will implement unit tests covering key functionalities.
- The aim is to achieve at least 30% test coverage.

### **Coverage Goals**
1. **Setting Background and Styles**:
   - Test the background image and CSS styling setup.
   - Ensure the Streamlit markdown is called to apply styles correctly.

2. **Loading Pretrained Model**:
   - Verify that the model is loaded with the correct architecture and weights.
   - Ensure the optimizer and learning rate are set correctly.

3. **Generating Dummy Data**:
   - Test the generation of the provided dummy data.
   - Ensure the DataFrame has the expected structure and contains relevant features.

4. **Predicting Risk Level**:
   - Mock the scaler and model to simulate predictions.
   - Verify that predictions are generated correctly and the risk level is mapped appropriately.

5. **Login Page**:
   - Test the login page UI components.
   - Ensure that the username and password inputs are correctly handled.
   - Simulate a successful login and verify state changes.

6. **Main Application Page**:
   - Test the navigation between different pages (home, analyzing, and results).
   - Ensure the correct components are displayed based on the current page.
   - Validate that settings can be toggled and intervals are set correctly.

7. **Analyzing Page**:
   - Verify the display of the loading spinner and transition to the results page.
   - Ensure the correct waiting period is simulated.

8. **Results Page**:
   - Mock the model and data transformations to simulate the prediction and display process.
   - Verify the display of predicted risk levels and probabilities.
   - Ensure the input data used for prediction and the scaled data are shown for debugging purposes.
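
Goal 4 (mocking the scaler and model, then mapping the prediction to a risk level) can be sketched as follows. The `predict_risk` function, the `RISK_LABELS` mapping and the feature values are assumptions for illustration; the real `DrivingRiskApp` may map classes differently.

```python
from unittest.mock import MagicMock

# Hypothetical mapping from class index to a human-readable risk level.
RISK_LABELS = {0: "Low", 1: "Medium", 2: "High"}


def predict_risk(scaler, model, features):
    """Hypothetical helper: scale one sample, predict, and map to a label."""
    scaled = scaler.transform([features])
    probabilities = model.predict(scaled)[0]
    # Pick the index of the highest probability (an argmax over the classes).
    best_class = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return RISK_LABELS[best_class], probabilities


# Mocks return fixed values, so the test result does not depend on trained weights.
scaler = MagicMock()
scaler.transform.return_value = [[0.5, -1.2]]
model = MagicMock()
model.predict.return_value = [[0.1, 0.2, 0.7]]

label, probabilities = predict_risk(scaler, model, [55.0, 3.0])
assert label == "High"
scaler.transform.assert_called_once_with([[55.0, 3.0]])
model.predict.assert_called_once_with([[0.5, -1.2]])
```

Fixing the mocked probabilities makes the expected label deterministic, and the two `assert_called_once_with` checks confirm that the scaled data, not the raw input, is what reaches the model.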

### **Setup for Unit Testing**
- Use mocking to simulate the behavior of the Streamlit UI elements and TensorFlow model.
- The `unittest.mock` library will help avoid dependencies on actual user interactions and model predictions.
- Tests will be organized and executed using the `unittest` framework.


### **Explanation of Unit Tests**

1. **Test Setup (`setUp` method)**:
   - Initializes the `DrivingRiskApp` instance before each test.
   - Sets up any necessary mock objects and data.

2. **`test_set_bg_hack_url`**:
   - Mocks the Streamlit `markdown` method.
   - Verifies that the background image and CSS are set correctly.

3. **`test_load_pretrained_model`**:
   - Mocks the model's `load_weights` method.
   - Ensures the model is loaded with the correct weights and compiled with the specified optimizer and learning rate.

4. **`test_get_provided_dummy_data`**:
   - Tests the generation of dummy data.
   - Ensures the DataFrame returned has the expected columns and structure.

5. **`test_predict_risk`**:
   - Mocks the scaler's `transform` method and the model's `predict` method.
   - Verifies that the prediction is generated and the risk level is correctly mapped.

6. **`test_login_page`**:
   - Mocks the Streamlit `text_input` and `button` methods.
   - Ensures the username and password inputs are displayed and handled correctly.
   - Verifies the state change upon a simulated successful login.

7. **`test_successful_login`**:
   - Mocks the Streamlit `text_input`, `button`, and `rerun` methods.
   - Simulates a successful login and checks if the login state is updated.

8. **`test_home_page`**:
   - Mocks the Streamlit `header`, `write`, and `button` methods.
   - Verifies that the home page components are displayed correctly and that navigation to the analyzing page works.

9. **`test_analyzing_page`**:
   - Mocks `time.sleep` and the Streamlit `rerun` method.
   - Tests the display of the analyzing page and the simulated wait before transitioning to the results page.

10. **`test_results_page`**:
    - Mocks the Streamlit `success`, `write`, `columns`, `empty`, and `rerun` methods.
    - Verifies the display of predicted risk levels and probabilities.
    - Ensures the correct display of input data and scaled data for debugging.
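
The idea behind `test_analyzing_page` (mocking `time.sleep` so the simulated wait costs no real time) can be sketched without Streamlit. Here `analyzing_page` is a hypothetical handler and the plain `state` dictionary stands in for Streamlit's session state; the three-second wait is an assumed value.

```python
import time
import unittest
from unittest.mock import patch


def analyzing_page(state, wait_seconds=3):
    """Hypothetical page handler: wait, then move on to the results page."""
    time.sleep(wait_seconds)
    state["page"] = "results"


class TestAnalyzingPage(unittest.TestCase):
    def test_transitions_without_real_delay(self):
        state = {"page": "analyzing"}
        # Patch sleep so the test checks the requested delay without waiting.
        with patch("time.sleep") as fake_sleep:
            analyzing_page(state, wait_seconds=3)
        fake_sleep.assert_called_once_with(3)
        self.assertEqual(state["page"], "results")
```

The same approach applies to the real app: patch the Streamlit calls and `time.sleep`, run the page method, then assert on the recorded calls and on the resulting page state.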
      
- After running the unit tests with the `coverage` library, I obtained 46% coverage of the code in this section.
- The evidence image can be found here:
![image.png](attachment:3ff085ec-b5ee-48a1-8442-e572b89a3209.png)