# Compilation of Senior Project Activities - Electrical Engineering Department

## Introduction
This document summarizes the activities and findings of our senior project focused on deep unfolding for matrix completion. Below is a comprehensive compilation of our coursework, research, experiments, and key insights.

## Coursework and Preliminary Studies
1. **Advanced Digital Signal Processing (ADSP):** Completed Dr. Tahir's online course.
2. **Deep Learning Fundamentals:** Completed an online course by Daniel Bourke: [PyTorch for Deep Learning](https://www.learnpytorch.io/).
3. **Literature Review:** Engaged with key texts for foundational knowledge:
   - Read Chapter 4 and the Appendix of [*High-Dimensional Data Analysis with Low-Dimensional Models*](https://book-wright-ma.github.io/Book-WM-20210422.pdf) by John Wright and Yi Ma, covering optimization algorithms and duality theory.

## Research and Experimentation
4. **Paper Implementations:**
   - Replicated findings from Shoaib Bhai's papers on [DUPA-RPCA](https://ieeexplore.ieee.org/document/9906418) and [DUST-RPCA](https://arxiv.org/abs/2211.03184).
5. **Development of ConvMC-Net:**
   - Developed, refined, and documented ConvMC-Net.
   - Ran experiments on synthetic data and compared the results with those from ADMM-Net.
   - Results indicated superior inference capabilities of ConvMC-Net and comparable performance in various test conditions.
   - Detailed documentation is available on [GitHub](https://github.com/Talha-Nehal-Undegrad-Study/convmc-net).
6. **Further Comparisons and Insights:**
   - Compared ConvMC-Net with [$\ell_0$-BCD](https://ieeexplore.ieee.org/abstract/document/9970585), a state-of-the-art method for matrix completion involving Gaussian Mixture Model (GMM) noise, using MATLAB.
   - Traced the gap between the expected and the observed comparison results to the different noise assumptions built into the two objective functions.
7. **Algorithm Exploration:**
   - Switched focus to other algorithms that address GMM noise, notably [M-Estimation](https://ieeexplore.ieee.org/document/8682657), which uses the Huber regression algorithm (hubreg) derived via the majorization-minimization (MM) algorithm.
   - Reviewed and unfolded M-estimation based on the [work](https://ieeexplore.ieee.org/document/9231538) of Ollila and Mian concerning sparse learning applications.
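
As a rough illustration of the Huber regression idea mentioned in item 7, the sketch below fits a linear model by iteratively reweighted least squares, which is an MM iteration for the Huber loss when the noise scale is held fixed. It is not the hubreg algorithm from the cited papers (which also estimates the scale jointly); the function name `huber_irls`, the fixed MAD-based scale, and the toy data are assumptions made only for this example. The threshold `c = 1.345` is the usual choice for Huber's loss.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber regression by iteratively reweighted least squares (illustrative only)."""
    # Start from the ordinary least-squares solution
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    # Assumption: fix the noise scale once, using the normalized MAD of the initial residuals
    r0 = y - X @ beta
    sigma = 1.4826 * np.median(np.abs(r0 - np.median(r0))) + 1e-12
    for _ in range(n_iter):
        r = y - X @ beta
        # Huber weights: 1 for small residuals, c * sigma / |r| for large ones
        w = np.minimum(1.0, c * sigma / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)
        # Weighted least-squares update (minimizes the quadratic majorizer)
        beta_new = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
        converged = np.linalg.norm(beta_new - beta) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Toy problem: 10% of the observations carry a gross outlier
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.standard_normal(200)
y[:20] += 20.0

beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
beta_huber = huber_irls(X, y)
err_ls = np.linalg.norm(beta_ls - beta_true)
err_huber = np.linalg.norm(beta_huber - beta_true)
print(f'least squares error: {err_ls:.4f}, Huber error: {err_huber:.4f}')
```

The outliers receive small weights after the first pass, so their pull on the estimate is bounded, which is the property that makes this family of estimators attractive under GMM noise.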

## Advanced Experimentation and Unfolding
6. **Unfolding and Optimization:**
   - Initially chose to unfold M-Estimation using the hubreg algorithm described by Ollila and Mian in their paper on sparse learning applications.
   - After unfolding this variant and conducting experiments on synthetic data, we observed unexpected behavior of loss curves, which led us to further investigation of the underlying theory.
   - A closer study of [*Robust Statistics for Signal Processing*](https://www.cambridge.org/core/books/robust-statistics-for-signal-processing/0C1419475504BC0E6C26376185813B6D) showed that the original hubreg algorithm was better suited to our needs: unlike the variant we had unfolded, it has no hyperparameters.
   - Despite initial reluctance due to its seemingly unlearnable structure, peer discussions convinced us to proceed with unfolding the original hubreg algorithm.
   - Implemented convolution mappings onto the pseudo-inverse matrix as a learnable parameter, which significantly improved our results over the initial approach.
7. **Performance Analysis and Comparisons:**
   - Noticed fluctuations in loss curves for certain combinations of SNR and sampling rate, where validation loss increased significantly for several epochs before decreasing, yet ended higher than it started.
   - Tested different matrix sizes not used in $\ell_0$-BCD experiments and found that M-Estimation and [$\ell_p$-reg](https://ieeexplore.ieee.org/document/8219728) performed better in several combinations.
   - Conducted comparative tests using the $\ell_0$-BCD matrix sizes (400x500 and 150x300) for image inpainting. $\ell_0$-BCD outperformed the other methods at 400x500, but at 150x300 it was only comparable to M-estimation and was outperformed by $\ell_p$-reg.
   - Xiao Peng Li suggested to us that tuning the $\epsilon$ hyperparameter in $\ell_0$-BCD might enhance its performance.
8. **Further Developments and Reflections:**
   - Refined, updated, and implemented MATLAB scripts for various algorithms including $\ell_0$-BCD, M-Estimation, [ORMC](https://ieeexplore.ieee.org/abstract/document/9457222), $\ell_p$-reg (p = 1), and $\ell_p$-ADMM (p = 1), considering these as strong competitors.
   - Aimed to outperform the iterative methods, particularly at higher sampling rates; initial findings showed similar performance at lower rates.
   - Explored enhancements to the unfolding method by learning the pseudo-inverse matrix through a product of two matrices and experimenting with initialization methods (e.g., Xavier) and non-linear activations (e.g., tanh) to improve performance.
   - Observed nearly constant training and validation loss curves, ranging from about 3e-4 to 4e-4 for training and from 7 to 8 for validation. This is very suspicious, as it seems to imply that learning the pseudo-inverse has somehow made the algorithm indifferent to the noise and sampling rate combinations (note that different loss metrics were used for training and validation).
   - More details can be found on GitHub regarding our experiments with [M-Estimation](https://github.com/Talha-Nehal-Undegrad-Study/M-estimation-RMC) and its unfolded version, [ConvHuberMC-Net](https://github.com/Talha-Nehal-Undegrad-Study/ConvHuberMC-Net).
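
The idea in item 8 of learning the pseudo-inverse as a product of two matrices, with Xavier initialization and a tanh non-linearity, could look roughly like the sketch below. The class name `FactoredPseudoInverse`, the inner dimension, and the choice to apply tanh to the first factor are all assumptions for illustration; the report does not specify these details, and the actual layer in ConvHuberMC-Net may differ.

```python
import torch
import torch.nn as nn

class FactoredPseudoInverse(nn.Module):
    """Hypothetical layer: a learnable (r x m) stand-in for a pseudo-inverse, built as tanh(A) @ B."""

    def __init__(self, m, r, inner):
        super().__init__()
        # Two learnable factors whose product has the shape of the pseudo-inverse of an (m x r) matrix
        self.A = nn.Parameter(torch.empty(r, inner))
        self.B = nn.Parameter(torch.empty(inner, m))
        # Xavier (Glorot) initialization for both factors
        nn.init.xavier_uniform_(self.A)
        nn.init.xavier_uniform_(self.B)

    def forward(self, y):
        # y has shape (m,) or (m, k); the result has shape (r,) or (r, k)
        P = torch.tanh(self.A) @ self.B
        return P @ y

layer = FactoredPseudoInverse(m=150, r=10, inner=10)
coeffs = layer(torch.randn(150, 4))
print(coeffs.shape)  # torch.Size([10, 4])

# Both factors receive gradients, so both are trained
coeffs.sum().backward()
print(layer.A.grad is not None, layer.B.grad is not None)
```

A factorization like this adds no expressive power over a single (r x m) matrix when the inner dimension is at least r, but it changes the optimization landscape, which may be one reason the loss curves behaved differently.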

## Conclusions and Future Work
9. **Current Findings:**
    - Achieved consistent performance with training and validation loss curves stabilizing within expected ranges.
    - Early results suggest potential for achieving better performance than traditional iterative methods, particularly at higher sampling rates.
10. **Next Steps:**
    - Continue to refine and test ConvMC-Net on image inpainting and additional synthetic data sets.
    - Aim to surpass the performance of iterative versions by addressing identified issues and optimizing algorithm parameters.

## Reflections
- The project provided profound insights into the challenges and complexities of unfolding algorithms for matrix completion in noisy environments.
- In hindsight, although this was an entirely new area for us, a few missteps cost us valuable time and slowed our progress:
    - Upon realizing that ConvMC-Net could not be fairly compared with ADMM-Net because of their different model assumptions, we should have asked our peers whether this was really the case and, if so, whether ConvMC-Net could be modified instead of jumping to a new algorithm.
    - When we decided to unfold M-estimation, we could have asked whether some other algorithm would be more friendly to deep unfolding (we primarily chose M-estimation based on the closeness of its results to those of $\ell_0$-BCD). These lessons will hopefully guide us in our future projects.
- Future projects could begin with a detailed review of ConvMC-Net and then build on our exploratory work on adapting unfolding techniques to GMM noise problems.

In [1]:
# Note: Original saved in Tahir Sproj folder

# Deep Learning Libraries
import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.utils.data as data

# Data Manipulation and Analysis
import numpy as np
import pandas as pd

# Data Visualization
import matplotlib
import matplotlib.pyplot as plt

# File and System Interaction
import os
from pathlib import Path
import torch.optim as optim


# Date and Time Handling
import time
import datetime

# Linear Algebra
from torch import linalg as LA

# Neural Architecture
try:
    from torchinfo import summary
except ImportError:
    # torchinfo is not installed: uncomment the next line to install it, then re-run this cell
    # %pip install torchinfo
    raise

In [2]:
%load_ext autoreload
%autoreload 2

from python_scripts import dataset_processing
from python_scripts import architecture
from python_scripts import revised_architecture
from python_scripts import training
from python_scripts import logs_and_results


In [3]:
# Setting up some global variables

ROOT = os.getcwd().replace('\\', '/') + '/HuberMC_Data'
# ROOT = 'C:/Users/Talha/OneDrive - Higher Education Commission/Documents/GitHub/ConvHuberMC/HuberMC_Data'
# ROOT = 'C:/Users/HP/GitHub Workspace/ConvHuberMC-Net/HuberMC_Data'
TRY = 'Try 1'
SESSION = 'Session 1'
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device, ROOT

('cpu',
 'c:/Users/Talha/OneDrive - Higher Education Commission/Documents/GitHub/ConvHuberMC/HuberMC_Data')

Testing Training Loop

In [4]:
# Get parameters --> for convhubermc:
def get_default_param(gpu = True, model = 'HuberMC-Net'):
    params_net = {}
    params_net['size1'] = 150
    params_net['size2'] = 300
    params_net['rank'] = 10
    
    params_net['device'] = device

    if model == 'HuberMC-Net':
        params_net['hubreg_iters'] = 2
    elif model == 'LP1':
        params_net['inner_iters'] = 2
        
    params_net['layers'] = 3
    params_net['kernel'] = (3, 3)
    
    params_net['CalInGPU'] = gpu
    
    return params_net

In [5]:
class Conv2dC(nn.Module):
    def __init__(self, kernel):
        # kernel: (height, width) of the filter; odd sizes with stride 1 keep the input shape unchanged
        super(Conv2dC, self).__init__()

        # "Same" padding for an odd kernel size k is (k - 1) / 2 along each dimension
        pad0 = int((kernel[0] - 1) / 2)
        pad1 = int((kernel[1] - 1) / 2)
        conv_device = 'cuda' if torch.cuda.is_available() else 'cpu'
        # Use kernel[1] for the width so that non-square kernels match the padding computed above
        self.convR = nn.Conv2d(1, 1, (kernel[0], kernel[1]), (1, 1), (pad0, pad1), groups = 1).to(conv_device)

        # Initialize weights to zero
        self.convR.weight.data.zero_()

        # For a 3x3 kernel, set the center value to 1 so that the layer starts out as an identity map
        if kernel[0] == 3 and kernel[1] == 3:
            self.convR.weight.data[0, 0, 1, 1] = 1

        # Set bias to zero
        self.convR.bias.data.zero_()
        # Note: groups = 1 here. With groups = in_channels, each input channel would be convolved with its own set of filters (of size out_channels/in_channels)

    def forward(self, x):
        # Number of columns (width) of the input matrix
        n = x.shape[-1]
        # Add batch and channel dimensions so that the (H, W) matrix becomes (1, 1, H, W), the (N, C, H, W) layout nn.Conv2d expects.
        # The slice 0:n keeps every column, and clone() leaves the caller's tensor untouched
        xR = x[None, None, :, 0:n].clone()
        xR = self.convR(xR)
        # Remove the batch and channel dimensions again
        xR = xR.squeeze()
        x = xR
        return x

In [6]:
conv_op = Conv2dC(kernel = (3, 3))

In [7]:
test = torch.randn(150, 10)
test2 = test * 0.01

test[:10, :10], test2[:10, :10]

(tensor([[ 6.8881e-01,  1.1912e-01,  3.6523e+00,  1.0410e-01, -2.3979e-01,
          -1.5193e-01, -5.8620e-01,  7.5602e-01, -1.1917e+00, -5.3552e-01],
         [ 3.8613e-01, -3.3701e-01, -2.4367e+00,  8.0928e-01,  2.0553e+00,
          -7.5597e-01, -9.8536e-01, -3.1243e-01, -2.0315e+00, -1.0597e+00],
         [ 1.4733e-01, -5.2234e-01,  4.1729e-01,  8.2231e-01, -1.6066e-01,
           8.6506e-02,  7.8791e-01, -3.0938e-01,  1.8582e+00,  1.8808e+00],
         [-9.4668e-01,  2.7926e-03,  1.0760e+00, -1.8669e+00, -5.7272e-01,
           2.0427e+00,  6.1051e-03,  6.9947e-01,  4.8721e-01,  1.0490e+00],
         [ 4.6475e-01,  1.9995e+00,  7.7303e-03, -7.7050e-01,  2.2461e-01,
          -4.0902e-01,  4.0365e-01, -3.2824e-01, -2.1672e-01, -5.8281e-01],
         [-1.1063e-01,  5.0243e-01, -9.1922e-02, -2.8413e-01,  2.0263e+00,
           9.9791e-01,  5.1758e-01, -1.7228e+00, -1.5064e+00, -1.5869e+00],
         [-3.3347e-01,  3.2403e-01,  4.9551e-01,  1.1229e-02, -1.1195e+00,
           1.1633e+

In [8]:
test3, test4 = conv_op(test), conv_op(test2)

test3[:10, :10], test4[:10, :10]

(tensor([[ 6.8881e-01,  1.1912e-01,  3.6523e+00,  1.0410e-01, -2.3979e-01,
          -1.5193e-01, -5.8620e-01,  7.5602e-01, -1.1917e+00, -5.3552e-01],
         [ 3.8613e-01, -3.3701e-01, -2.4367e+00,  8.0928e-01,  2.0553e+00,
          -7.5597e-01, -9.8536e-01, -3.1243e-01, -2.0315e+00, -1.0597e+00],
         [ 1.4733e-01, -5.2234e-01,  4.1729e-01,  8.2231e-01, -1.6066e-01,
           8.6506e-02,  7.8791e-01, -3.0938e-01,  1.8582e+00,  1.8808e+00],
         [-9.4668e-01,  2.7926e-03,  1.0760e+00, -1.8669e+00, -5.7272e-01,
           2.0427e+00,  6.1051e-03,  6.9947e-01,  4.8721e-01,  1.0490e+00],
         [ 4.6475e-01,  1.9995e+00,  7.7303e-03, -7.7050e-01,  2.2461e-01,
          -4.0902e-01,  4.0365e-01, -3.2824e-01, -2.1672e-01, -5.8281e-01],
         [-1.1063e-01,  5.0243e-01, -9.1922e-02, -2.8413e-01,  2.0263e+00,
           9.9791e-01,  5.1758e-01, -1.7228e+00, -1.5064e+00, -1.5869e+00],
         [-3.3347e-01,  3.2403e-01,  4.9551e-01,  1.1229e-02, -1.1195e+00,
           1.1633e+
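
The printed tensors above suggest that the identity-initialized convolution returns its input unchanged. Comparing printouts by eye is error-prone, so a programmatic check may be more convenient. The sketch below rebuilds an equivalent 3x3 layer directly with `nn.Conv2d` (an assumption made to keep the example self-contained; it does not import the notebook's `Conv2dC`) and compares output and input with `torch.allclose`.

```python
import torch
import torch.nn as nn

# A 3x3 convolution with padding 1 keeps the (H, W) shape of its input
conv = nn.Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
with torch.no_grad():
    conv.weight.zero_()
    conv.weight[0, 0, 1, 1] = 1.0  # only the center tap is non-zero
    conv.bias.zero_()

x = torch.randn(150, 10)
with torch.no_grad():
    # Add batch and channel dimensions, convolve, then remove them again
    y = conv(x[None, None]).squeeze(0).squeeze(0)

same_shape = y.shape == x.shape
same_values = torch.allclose(y, x, atol=1e-6)
print(same_shape, same_values)
```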

In [9]:
# seed = 123
# torch.manual_seed(seed)
# target = (torch.randn(30, 50))
# input_tensor = target * torch.bernoulli(torch.full((30, 50), 0.2))
# model = architecture.UnfoldedNet_Huber(params = get_default_param(False))

# criterion = nn.MSELoss()
# optimizer = torch.optim.Adam(model.parameters())

# model.train()
# output = model(input_tensor)

# loss = (criterion(output, target))/torch.square(torch.norm(target, p = 'fro'))
# optimizer.zero_grad()
# print(f'loss before backward: {loss}, loss.grad: {loss.requires_grad}')
# loss.backward()
# print(f'loss: {loss}')
# print("\nGradients after one epoch:")
# for name, param in model.named_parameters():
#     print(f'name: {name}\t\tgradient: {param.grad}')
# optimizer.step()
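
The commented-out loss above divides `nn.MSELoss` by the squared Frobenius norm of the target. Because `nn.MSELoss` averages over all entries by default, that quantity is the relative squared error scaled by one over the number of entries. The sketch below makes the relationship explicit; the helper name `relative_sq_error` and the toy tensors are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def relative_sq_error(output, target):
    # ||output - target||_F^2 / ||target||_F^2
    return torch.sum((output - target) ** 2) / torch.sum(target ** 2)

torch.manual_seed(0)
target = torch.randn(30, 50)
output = target + 0.1 * torch.randn(30, 50)

# Loss in the style of the commented-out cell: mean squared error over squared Frobenius norm
mse_over_norm = F.mse_loss(output, target) / torch.linalg.norm(target, 'fro') ** 2
rel = relative_sq_error(output, target)

# The two differ exactly by the number of entries in the matrix
matches = torch.allclose(rel, mse_over_norm * target.numel())
print(rel.item(), mse_over_norm.item(), matches)
```

This scaling matters when comparing loss values across matrix sizes or against metrics reported by other methods.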

In [10]:
# # Create a input_tensor of random indices
# shuffled_indices = torch.randperm(input_tensor.nelement())

# # Index the original input_tensor with these shuffled indices
# shuffled_input_tensor = input_tensor.view(-1)[shuffled_indices].view(input_tensor.size())

# output = model(shuffled_input_tensor)

# loss = (criterion(output, target))/torch.square(torch.norm(target, p = 'fro'))
# optimizer.zero_grad()
# print(f'loss before backward: {loss}, loss.grad: {loss.requires_grad}')
# loss.backward()
# print(f'loss: {loss}')
# print("\nGradients after one epoch:")
# for name, param in model.named_parameters():
#     print(f'name: {name}\t\tgradient: {param.grad}')
# optimizer.step()

In [12]:
model = revised_architecture.UnfoldedNet_Huber(params = get_default_param(False, model = 'HuberMC-Net'))
model

UnfoldedNet_Huber(
  (conv_layers): ModuleList(
    (0-449): 450 x Conv2dC(
      (convR): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
  )
  (matrix_layers): ModuleList(
    (0-1349): 1350 x PseudoInverse()
  )
  (huber_obj): Sequential(
    (0): Huber(
      (conv_layers): ModuleList(
        (0-449): 450 x Conv2dC(
          (convR): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        )
      )
      (matrix_layers): ModuleList(
        (0-1349): 1350 x PseudoInverse()
      )
    )
    (1): Huber(
      (conv_layers): ModuleList(
        (0-449): 450 x Conv2dC(
          (convR): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        )
      )
      (matrix_layers): ModuleList(
        (0-1349): 1350 x PseudoInverse()
      )
    )
    (2): Huber(
      (conv_layers): ModuleList(
        (0-449): 450 x Conv2dC(
          (convR): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        )
      )
      (mat

In [13]:
model(torch.randn(150, 300) * torch.bernoulli(torch.full((150, 300), 0.2)))
# model.forward(torch.randn(30, 50) * torch.bernoulli(torch.full((30, 50), 0.2)))

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tens

tensor([[ 0.0795, -0.8543, -0.7214,  ..., -3.3066, -0.6238, -2.6727],
        [ 0.9199, -2.9457, -2.8913,  ..., -1.8228, -1.9789, -2.6390],
        [ 2.1899, -1.2537, -4.0716,  ..., -4.0067, -2.1586, -2.3299],
        ...,
        [-2.1128,  3.8805,  0.4659,  ..., -3.6600,  2.6213,  2.3013],
        [ 3.2551, -2.4691, -5.1118,  ..., -3.8627, -1.2764,  0.5355],
        [-0.5400,  2.6887,  2.7664,  ...,  2.4427,  0.5165, -1.3203]],
       grad_fn=<MmBackward0>)
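
The smoke test above feeds the network a masked full-rank Gaussian matrix. For experiments closer to the setting described in the report (low-rank ground truth, sampling rate q, GMM noise at a given SNR in dB), a synthetic instance could be generated along the lines of the sketch below. The function name `make_instance`, the two-component mixture parameters `outlier_prob` and `outlier_scale`, and the SNR convention are assumptions; the project's `dataset_processing` module may generate data differently.

```python
import torch

def make_instance(m=150, n=300, rank=10, q=0.2, snr_db=3.0,
                  outlier_prob=0.1, outlier_scale=10.0, seed=0):
    """Hypothetical generator for one noisy, partially observed low-rank matrix."""
    g = torch.Generator().manual_seed(seed)
    # Rank-`rank` ground truth as a product of two Gaussian factors
    L = torch.randn(m, rank, generator=g) @ torch.randn(rank, n, generator=g)
    # Two-component Gaussian mixture: mostly unit standard deviation, occasionally much wider
    is_outlier = torch.bernoulli(torch.full((m, n), outlier_prob), generator=g)
    std = 1.0 + is_outlier * (outlier_scale - 1.0)
    noise = std * torch.randn(m, n, generator=g)
    # Rescale the noise so that signal power / noise power matches the requested SNR
    target_noise_power = L.pow(2).mean() / (10 ** (snr_db / 10))
    noise = noise * torch.sqrt(target_noise_power / noise.pow(2).mean())
    # Bernoulli mask keeps each entry with probability q
    mask = torch.bernoulli(torch.full((m, n), q), generator=g)
    observed = mask * (L + noise)
    return L, observed, mask

L, observed, mask = make_instance()
print(observed.shape, mask.mean().item(), torch.linalg.matrix_rank(L).item())
```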

In [14]:
summary(model, input_size = [150, 300])

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tens

Layer (type:depth-idx)                   Output Shape              Param #
UnfoldedNet_Huber                        [150, 300]                --
├─Sequential: 1-1                        [150, 300]                --
│    └─Huber: 2-1                        [150, 300]                5,040
│    └─Huber: 2-4                        --                        (recursive)
│    │    └─ModuleList: 3-3              --                        (recursive)
│    └─Huber: 2-3                        [150, 300]                7,610
│    └─Huber: 2-4                        --                        (recursive)
│    │    └─ModuleList: 3-3              --                        (recursive)
│    └─Huber: 2-5                        [150, 300]                7,620
│    │    └─ModuleList: 3-3              --                        (recursive)
Total params: 24,760
Trainable params: 24,760
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 27
Input size (MB): 0.18
Forward/backward pass size (MB): 21.60
Pa

In [15]:
# Some settings for visualisation
matplotlib.use('Agg')
%matplotlib inline

seed = 123
torch.manual_seed(seed)

# Set parameters (including hyperparameters) and setting for saving/logging data
hyper_param_net = training.get_hyperparameter_grid('HuberMC-Net', TrainInstances = 20, ValInstances = 10, BatchSize = 5, ValBatchSize = 2, num_epochs = 2, learning_rate = 0.001)
params_net = get_default_param(gpu = False, model = 'HuberMC-Net')
CalInGPU = params_net['CalInGPU']

q_list = [0.2]
db_list = [3.0]

for q in q_list:
    for db in db_list:
        # ProjectName = TRY + ' ' + logs_and_results.get_current_time() + ' ' + hyper_param_net['Model'] + ' ' + 'Sampling Rate: ' + logs_and_results.get_q_str(q) + ' and DB ' + logs_and_results.get_noise_str(db)

        ProjectName = TRY + ' ' + hyper_param_net['Model'] + ' Q ' + logs_and_results.get_q_str(q) + ' DB ' + logs_and_results.get_noise_str(db)
        # Note: the time stamp is left out of the log file name because ':' is not allowed in Windows file names (this was not a problem on Linux)

        # Get log file
        logfile = logs_and_results.get_modularized_record(ProjectName, q, db, 'Logs', hyper_param_net, params_net, ROOT, SESSION)
        with open(logfile, 'w', 1) as log:
            print('Project Name: %s\n'%ProjectName)
            log.write('Project Name: %s\n\n'%ProjectName)

            # Get Model
            net = training.get_model(params_net, hyper_param_net, log)
            print('\nParameters = \n%s\n'%str(params_net))
            log.write('\nParameters = \n%s\n\n'%str(params_net))

            # Loading data and creating dataloaders for both training and validation
            # print('Loading Data phase...')
            log.write('Loading phase...\n')
            shape_dset = (params_net['size1'], params_net['size2'])
            
            train_loader, val_loader = dataset_processing.get_dataloaders(params_net = params_net, hyper_param_net = hyper_param_net, sampling_rate = q, db = db, ROOT = ROOT)

            # print('Finished loading.\n')
            log.write('Finished loading.\n\n')

            # Some additional settings for training including loss, optimizer,
            # floss = nn.functional.mse_loss(reduction = 'sum')
            floss = nn.MSELoss()
            optimizer = torch.optim.Adam(net.parameters(), lr = hyper_param_net['Lr'])
            # scheduler2 =  torch.optim.lr_scheduler.StepLR(optimizer, step_size= 1, gamma = 0.97, verbose = True)

            # Array for recording parameter values after each layer for each epoch etc
            outputs_L = revised_architecture.to_var(torch.zeros([shape_dset[0], shape_dset[1]]), CalInGPU) 
            lossmean_vec = np.zeros((hyper_param_net['Epochs'], ))
            lossmean_val_vec = np.zeros((hyper_param_net['Epochs'], ))


            # dummy variable to monitor and record progress for loss
            minloss = np.inf

            for epoch in range(hyper_param_net['Epochs']):
                print(f'Epoch: {epoch + 1}, {logs_and_results.get_current_time()}, \n')
                log.write(f'Epoch: {epoch + 1} ')
                log.write(logs_and_results.get_current_time() + '\n\n')

                # Train and Test Steps. (Record every 5 epochs)
                if (epoch + 1) % 5 == 0:
                    # print('Loading and calculating training batches...')
                    log.write('Loading and calculating training batches...\n')
                    startime = time.time()
                    loss_mean = training.train_step(net, train_loader, floss, optimizer, CalInGPU, hyper_param_net['TrainInstances'], hyper_param_net['BatchSize']) # remove alpha from train func
                    endtime = time.time()
                    # print('Training time is %f'%(endtime - startime))
                    log.write('Training time is %f\n'%(endtime - startime))

                    # print('Loading and calculating validation batches...')
                    log.write('Loading and calculating validation batches...\n')
                    startime = time.time()
                    loss_val_mean = training.test_step(net, val_loader, floss, CalInGPU, hyper_param_net['ValInstances'], hyper_param_net['ValBatchSize'])
                    endtime = time.time()
                    # print('Test time is %f'%(endtime - startime))
                    log.write('Test time is %f\n'%(endtime - startime))

                else:
                    loss_mean = training.train_step(net, train_loader, floss, optimizer, CalInGPU, hyper_param_net['TrainInstances'], hyper_param_net['BatchSize'])
                    loss_val_mean = training.test_step(net, val_loader, floss, CalInGPU, hyper_param_net['ValInstances'], hyper_param_net['ValBatchSize'])

                # Update Record and Parameters
                lossmean_vec[epoch] = loss_mean
                lossmean_val_vec[epoch] = loss_val_mean


                print('Epoch [%d/%d], Mean Training Loss:%.5e, Mean Validation Loss:%.5e'
                      %(epoch + 1, hyper_param_net['Epochs'], loss_mean, loss_val_mean))

                # Update Log after every 5 epochs. Make a plot of MSE against epochs every 5 epochs. Save Model in whole/dict form every five epochs.
                if (epoch + 1) % 5 == 0:
                    print(f"Saving Whole Model at Epochs: [{epoch + 1}/{hyper_param_net['Epochs']}]")
                    model_whole_path = logs_and_results.get_modularized_record(ProjectName, q, db, 'Saved Models - Whole', hyper_param_net, params_net, ROOT, SESSION, current_epoch = epoch + 1)
                    # torch.save(net, model_whole_path)
                    print(f"Saving Model Dict at Epochs: [{epoch + 1}/{hyper_param_net['Epochs']}]")
                    model_state_dict_path = logs_and_results.get_modularized_record(ProjectName, q, db, 'Saved Models - Dict', hyper_param_net, params_net, ROOT, SESSION, current_epoch = epoch + 1)
                    # torch.save(net.state_dict(), model_state_dict_path)

                    # print('Epoch [%d/%d], Mean Training Loss:%.5e, Mean Validation Loss:%.5e'
                    # %(epoch + 1, hyper_param_net['Epochs'], loss_mean, loss_val_mean))
                    # print('loss_lowrank_mean', loss_lowrank_mean)
                    # print('loss_val_lowrank_mean', loss_val_lowrank_mean)
                    # print(f'c: {c_list}, lamda: {lamda_list}, mu: {mu_list}')

                    # log.write('loss_lowrank_mean %.5e\n' %(loss_lowrank_mean))
                    # log.write('loss_val_lowrank_mean %.5e\n' %(loss_val_lowrank_mean))
                    log.write('Epoch [%d/%d], Mean Training Loss:%.5e, Mean Validation Loss:%.5e\n'
                              %(epoch + 1, hyper_param_net['Epochs'], loss_mean, loss_val_mean))
                    np.set_printoptions(precision = 3)

                    if True or loss_val_mean < minloss:
                        # print('saved at [epoch%d/%d]'%(epoch + 1, hyper_param_net['Epochs']))
                        log.write('saved at [epoch%d/%d]\n' %(epoch + 1, hyper_param_net['Epochs']))
                        minloss = min(loss_val_mean, minloss)

            # Finish off by observing the minimum loss on validation set

            #Print min loss
            # print('\nMin Loss = %.4e'%np.min(lossmean_val_vec))
            log.write('\nMin Loss = %.4e'%np.min(lossmean_val_vec))

            # Plotting MSE vs Epoch and Saving it

            # Get Directory where we have to save the plot
            plot_dir = logs_and_results.get_modularized_record(ProjectName, q, db, 'Plots', hyper_param_net, params_net, ROOT, SESSION, current_epoch = epoch + 1)
            logs_and_results.plot_and_save_mse_vs_epoch(lossmean_vec, lossmean_val_vec, plot_dir)

Project Name: Try 1 HuberMC-Net Q 20.0% DB 3.0

Configuring Network...
Instantiating Model...
Model Instantiated...

Parameters = 
{'size1': 150, 'size2': 300, 'rank': 10, 'device': 'cpu', 'hubreg_iters': 2, 'layers': 3, 'kernel': (3, 3), 'CalInGPU': False}

Epoch: 1, 2024-04-14 02:24:52, 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) Has nans: False 

Reconstructed Conv Matrix V tensor([[[[0., 0., 0.],
          [0., 1., 0.],
          [0., 0., 0.]]]]) 

KeyboardInterrupt: 
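
The training cell drops the time stamp from the log file name because of the ':' character. If a time stamp in the name is still wanted, one option is to format it without colons, as in the sketch below. The helper name `filename_safe_timestamp` is hypothetical and is not part of `logs_and_results`.

```python
from datetime import datetime

def filename_safe_timestamp(now=None):
    # Use '-' instead of ':' between hours, minutes and seconds so the
    # string is a valid file name on both Windows and Linux
    now = now or datetime.now()
    return now.strftime('%Y-%m-%d %H-%M-%S')

stamp = filename_safe_timestamp(datetime(2024, 4, 14, 2, 24, 52))
print(stamp)  # 2024-04-14 02-24-52
```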