In [1]:
import torch
from torch import nn
torch.manual_seed(2022)

<torch._C.Generator at 0x7f9b69759c30>

In [2]:
if torch.cuda.is_available():
    device = torch.device("cuda")
    print("Running on the GPU")
else:
    device = torch.device("cpu")
    print("Running on the CPU")

Running on the GPU


**BCEWithLogitsLoss**

This loss function combines a Sigmoid layer and the Binary Cross-Entropy (BCE) loss in a single class, so it takes raw output logits rather than probabilities as input. It is suited to binary classification tasks and computes the binary cross-entropy between the target and the Sigmoid of the logits.

**Formula**: 
$$BCE(x, y) = -\frac{1}{N} \sum_{i=1}^{N} \left[y_i \cdot \log(\sigma(x_i)) + (1 - y_i) \cdot \log(1 - \sigma(x_i))\right] $$

Where:
- $x_i$ is the output logit for the $i$-th sample.
- $y_i$ is the target label (0 or 1) for the $i$-th sample.
- $\sigma(x_i) = \frac{1}{1 + e^{-x_i}}$ is the Sigmoid function applied to the output logit.
- $N$ is the number of samples.
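
As a quick sanity check of the formula above, the sketch below evaluates it by hand and compares the result with `nn.BCEWithLogitsLoss` (default `reduction="mean"`). The logits and targets are hypothetical values chosen only for illustration; they are not taken from the notebook's data.

```python
import torch
from torch import nn

# Hypothetical logits and binary targets, chosen only for illustration.
logits = torch.tensor([2.0, -1.0, 0.5, -3.0])
targets = torch.tensor([1.0, 0.0, 0.0, 1.0])

# Manual evaluation of the formula: Sigmoid first, then binary cross-entropy.
probs = torch.sigmoid(logits)
manual_bce = -(targets * torch.log(probs)
               + (1 - targets) * torch.log(1 - probs)).mean()

# Built-in loss, which applies the Sigmoid internally.
builtin_bce = nn.BCEWithLogitsLoss()(logits, targets)

print(manual_bce.item(), builtin_bce.item())
```

The two printed values should agree up to floating-point rounding.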

**L1Loss**

This loss function computes the Mean Absolute Error (MAE) between the predicted output and the target. It is commonly used in regression tasks, where minimizing the absolute difference between predictions and actual values is important.

**Formula**: 
$$L_1Loss(x, y) = \frac{1}{N} \sum_{i=1}^{N} |x_i - y_i|$$

Where:
- $x_i$ is the predicted value for the $i$-th sample.
- $y_i$ is the target value for the $i$-th sample.
- $N$ is the number of samples.
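
The same kind of check works for the MAE formula. The predictions and targets below are hypothetical numbers picked so that the mean is easy to verify by hand.

```python
import torch
from torch import nn

# Hypothetical predictions and targets, chosen only for illustration.
pred = torch.tensor([1.0, 2.5, -0.5, 4.0])
target = torch.tensor([1.5, 2.0, 0.5, 3.0])

# Manual evaluation: mean of the absolute differences.
# |1.0-1.5| + |2.5-2.0| + |-0.5-0.5| + |4.0-3.0| = 0.5 + 0.5 + 1.0 + 1.0 = 3.0
manual_l1 = (pred - target).abs().mean()

# Built-in loss with its default reduction="mean".
builtin_l1 = nn.L1Loss()(pred, target)

print(manual_l1.item(), builtin_l1.item())  # both 0.75
```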


In [3]:
adv_criterion = nn.BCEWithLogitsLoss()
recon_criterion = nn.L1Loss()

In [5]:
import numpy as np
import scipy.io as sio
import os 

num_channels = 250
prefix = '../DCRM/Data_BC_250'
u3val = np.array([sio.loadmat(os.path.join(prefix, "BCMultiPoissonCalc_" + str(i) + ".mat"))['u'] for i in range(1,num_channels+1)]) # solution obtained with finite differences (FD)
poisson_f = np.array([sio.loadmat(os.path.join(prefix, "BCMultiPoissonCalc_" + str(i) + ".mat"))['gf'] for i in range(1,num_channels+1)]) # source term

inputs = torch.tensor(np.expand_dims(poisson_f, axis=1)) # from (250, 128, 128) to (250, 1, 128, 128)
true_sol = torch.tensor(np.expand_dims(u3val, axis=1))
BCval = torch.zeros_like(true_sol) # boundary conditions (BCs) are zero

In [6]:
num_channelsTest = 1000
prefixTest = '../DCRM/Data_BC_1000'
u3valTest = np.array([sio.loadmat(os.path.join(prefixTest, "BCMultiPoissonCalc_" + str(i) + ".mat"))['u'] for i in range(1,num_channelsTest+1)])
poisson_fTest = np.array([sio.loadmat(os.path.join(prefixTest, "BCMultiPoissonCalc_" + str(i) + ".mat"))['gf'] for i in range(1,num_channelsTest+1)])

inputsTest = torch.tensor(np.expand_dims(poisson_fTest, axis=1))
true_solTest = torch.tensor(np.expand_dims(u3valTest, axis=1))
BCvalTest = torch.zeros_like(true_solTest)

# Keep the solution values on the boundary and zero out the 126x126 interior
for i in range(true_solTest.shape[0]):
    BCvalTest[i, 0, :, :] = true_solTest[i, 0, :, :]
    BCvalTest[i, 0, 1:127, 1:127] = torch.zeros((126, 126))


In [7]:
x = torch.linspace(0, 1, 128 )
y = torch.linspace(0, 1, 128 )
rx, ry = torch.meshgrid(x, y, indexing="ij") # rx holds the x coordinates, ry the y coordinates; "ij" is the existing default, stated explicitly
rx = rx.to(device)
ry = ry.to(device)

# If the tensors are on the GPU, they have to be moved to the CPU to be converted to numpy arrays
rxd = rx.cpu().detach().numpy()
ryd = ry.cpu().detach().numpy()



#### 1 - Adam Optimizer

The Adam optimizer adapts the step size of each parameter using exponential moving averages of two moments of the gradient:

1. **First Moment (Gradient Mean)**:
   - **Effect**: Acts as momentum. A $\beta_1$ close to 1 gives more weight to past gradients, so the optimizer retains a longer-term memory of them.

2. **Second Moment (Uncentered Gradient Variance)**:
   - **Effect**: The moving average of the squared gradients estimates their magnitude. Dividing the update by its square root scales the step per parameter: smaller steps where gradients are large, larger steps where they are small.

Parameters:
- **$\beta_1$**: Decay rate of the first-moment average (gradient mean).
- **$\beta_2$**: Decay rate of the second-moment average (squared gradient mean).
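
To make the role of the two moments concrete, here is a minimal sketch of the standard Adam update written by hand and compared with `torch.optim.Adam` on a toy quadratic objective. The objective, the starting point, and the helper name `adam_step` are assumptions made for illustration; the hyperparameters mirror the `lr`, `beta_1`, and `beta_2` values used in the next cell.

```python
import torch

lr, beta_1, beta_2, eps = 1e-4, 0.5, 0.999, 1e-8

def adam_step(theta, grad, m, v, t):
    """One Adam update (hypothetical helper, standard bias-corrected form)."""
    m = beta_1 * m + (1 - beta_1) * grad          # first moment: gradient mean
    v = beta_2 * v + (1 - beta_2) * grad ** 2     # second moment: squared-gradient mean
    m_hat = m / (1 - beta_1 ** t)                 # bias correction for the zero init
    v_hat = v / (1 - beta_2 ** t)
    theta = theta - lr * m_hat / (v_hat.sqrt() + eps)
    return theta, m, v

# Toy objective f(theta) = sum(theta^2), whose gradient is 2 * theta.
theta_ref = torch.tensor([1.0, -2.0, 3.0], dtype=torch.float64, requires_grad=True)
opt = torch.optim.Adam([theta_ref], lr=lr, betas=(beta_1, beta_2), eps=eps)

theta = theta_ref.detach().clone()
m = torch.zeros_like(theta)
v = torch.zeros_like(theta)

for t in range(1, 6):
    # Manual update
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
    # Reference update with the built-in optimizer
    opt.zero_grad()
    (theta_ref ** 2).sum().backward()
    opt.step()

print(torch.allclose(theta, theta_ref.detach()))
```

With $\beta_1 = 0.5$ the first-moment average forgets old gradients quickly, while $\beta_2 = 0.999$ keeps a long memory of gradient magnitudes.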


In [8]:
from network_w import *

def weights_init(m):
    if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
        torch.nn.init.normal_(m.weight, 0.0, 0.02) # initialize the weights with a normal distribution
    if isinstance(m, nn.BatchNorm2d):
        torch.nn.init.normal_(m.weight, 0.0, 0.02) 
        torch.nn.init.constant_(m.bias, 0) # initialize the bias of the batch normalization to zero

input_dim = 2
real_dim = 1

lr = 0.0001
beta_1 = 0.5
beta_2 = 0.999

gen = UNet(input_dim, real_dim).to(device)
gen = gen.apply(weights_init)
gen_opt = torch.optim.Adam(gen.parameters(), lr=lr, betas=(beta_1, beta_2))

#### 2 - Cosine Annealing

1. **Why we are using it**:
    - It reduces the learning rate of the optimizer during training, **following a cosine curve**. The larger early learning rate helps the optimizer move past saddle points and shallow local minima, and the smaller late learning rate lets it settle into a minimum.
2. **Saddle points**:
    - A point where the gradient vanishes but the function has different curvatures in different directions: convex in some directions and concave in others.
    - Gradients near a saddle point are weak or zero, which makes it hard to determine a good descent direction and can slow convergence.
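
The cosine curve can be written in closed form as $\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{\pi t}{T_{max}}\right)\right)$. The sketch below compares this expression with the learning rates produced by `CosineAnnealingLR`. It uses a deliberately small, hypothetical `T_max = 10` and a dummy parameter so that the schedule is easy to inspect; the notebook itself uses a much longer horizon.

```python
import math
import torch

base_lr, eta_min, T_max = 1e-4, 0.0, 10  # T_max is a small illustrative value

def cosine_lr(t):
    """Closed-form cosine annealing schedule."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t / T_max))

# Dummy parameter: only needed so that an optimizer can be constructed.
param = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([param], lr=base_lr)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=T_max, eta_min=eta_min)

scheduler_lrs = []
for t in range(T_max + 1):
    scheduler_lrs.append(opt.param_groups[0]["lr"])
    opt.step()    # no gradients here, so the parameter is unchanged
    sched.step()  # advance the schedule by one step

closed_form_lrs = [cosine_lr(t) for t in range(T_max + 1)]
for t, (a, b) in enumerate(zip(scheduler_lrs, closed_form_lrs)):
    print(t, f"{a:.3e}", f"{b:.3e}")
```

The learning rate starts at `base_lr`, reaches half of it at `t = T_max / 2`, and decays to `eta_min` at `t = T_max`.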


In [9]:
gen_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(gen_opt, 300 * 2500)

#### 3 - Convolution Kernels

1. **Output dimensions of an image of size $H \times W$ after a convolution with a kernel of size $k_h \times k_w$**:
    - Padding: $P$, stride: $S$

    - $H' = \left\lfloor \frac{H + 2P - k_h}{S} \right\rfloor + 1$

    - $W' = \left\lfloor \frac{W + 2P - k_w}{S} \right\rfloor + 1$

2. **Paper explanations**:
    - The Laplacian operator is discretized in space with central difference schemes:
$$
\begin{align*}
\Delta u(x, y) & = u_{xx}(x, y) + u_{yy}(x, y) \\
& \approx \frac{u(x -h, y) + u(x + h, y) - 4u(x, y) + u(x, y - h) + u(x,y + h)}{h^2} \\
& := \frac{1}{h^2} \begin{bmatrix}
                    0 & 1 & 0 \\
                    1 & -4 & 1 \\
                    0 & 1 & 0
                    \end{bmatrix}
\end{align*}
$$
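
The stencil above can be applied as a fixed convolution kernel. The following sketch is an illustration under stated assumptions, not the paper's implementation: it applies the kernel with `F.conv2d` to a hypothetical test function $u(x, y) = \sin(\pi x)\sin(\pi y)$, whose exact Laplacian is $-2\pi^2 u$, on the same $128 \times 128$ grid used earlier. The helper `conv_output_size` is a hypothetical name for the output-size formula from item 1.

```python
import math
import torch
import torch.nn.functional as F

def conv_output_size(size, kernel, padding=0, stride=1):
    """Output size along one dimension: floor((size + 2P - k) / S) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

n = 128
h = 1.0 / (n - 1)  # grid spacing on [0, 1]
x = torch.linspace(0, 1, n, dtype=torch.float64)
rx, ry = torch.meshgrid(x, x, indexing="ij")

# Hypothetical test function with a known Laplacian: -2 * pi^2 * u.
u = torch.sin(math.pi * rx) * torch.sin(math.pi * ry)
exact = -2 * math.pi ** 2 * u

# Five-point stencil scaled by 1 / h^2. F.conv2d computes a cross-correlation,
# which coincides with a convolution here because the kernel is symmetric.
kernel = torch.tensor([[0.0, 1.0, 0.0],
                       [1.0, -4.0, 1.0],
                       [0.0, 1.0, 0.0]], dtype=torch.float64) / h ** 2

# Add batch and channel dimensions; no padding, so only interior points remain.
lap = F.conv2d(u[None, None], kernel[None, None])

print(lap.shape, conv_output_size(n, 3))  # spatial size 126 in both cases
max_err = (lap[0, 0] - exact[1:-1, 1:-1]).abs().max().item()
print(max_err)  # small: the stencil is second-order accurate in h
```

Without padding, the $3 \times 3$ kernel shrinks the $128 \times 128$ grid to $126 \times 126$, which matches the `1:127` interior slice used when building the boundary-condition tensors.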