# Introduction

**The &mu;-SEMNAN Solver** finds the fittest parameters in a linear Gaussian Acyclic Marginal Ancestral Structural Equation Model (AMASEM).
As we will see, there is more than one method to compute the fittest parameters of the
AMASEM structure.

We start by loading the libraries. The SEMNAN Solver uses PyTorch&reg; and depends on CUDA&reg; as the backend.
We need to load PyTorch and make sure that the backend used is CUDA. We do this by
introducing a `device` variable that is always set to `cuda` and that can be passed to any tensor we create.
We also import our library, `semnan_cuda`.

In [1]:
import torch
import semnan_cuda as sc

device = torch.device("cuda")

Consider the following AMASEM.

![Health graph](img/health-graph.svg "Health Graph")

It is composed of four visible variables and six latent variables.
We will encode this graph as an adjacency matrix. The encoding is simple: the rows of the latent variables go on top
and the rows of the remaining variables (the visible ones) form a strictly upper-triangular matrix at the bottom.

In [9]:
struct = torch.tensor([
        [1, 1, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
        [0, 1, 1, 1],  # X
        [0, 0, 1, 0],  # V_BP
        [0, 0, 0, 1],  # V_BMI
        [0, 0, 0, 0],  # Y
    ], dtype=torch.bool)
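
To make the encoding concrete, the following sketch splits the matrix into its latent and visible blocks and checks the triangular shape. It uses only plain PyTorch on the CPU. The names `num_visible`, `latent_block`, and `visible_block` are illustrative and not part of `semnan_cuda`, and reading entry `[i, j]` as an edge from row variable `i` to column variable `j` is an interpretation of the description above.

```python
import torch

struct = torch.tensor([
        [1, 1, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
        [0, 1, 1, 1],  # X
        [0, 0, 1, 0],  # V_BP
        [0, 0, 0, 1],  # V_BMI
        [0, 0, 0, 0],  # Y
    ], dtype=torch.bool)

num_visible = struct.shape[1]            # one column per visible variable
latent_block = struct[:-num_visible]     # top rows: latent -> visible edges
visible_block = struct[-num_visible:]    # bottom rows: visible -> visible edges

# A strictly upper-triangular visible block has nothing on or below the diagonal.
is_strictly_upper = not torch.tril(visible_block.int()).any().item()

print("latent variables:", latent_block.shape[0])
print("visible variables:", visible_block.shape[0])
print("strictly upper-triangular:", is_strictly_upper)
print("number of edges:", int(struct.sum()))
```

A check like this catches a misplaced `1` below the diagonal, which would introduce a cycle among the visible variables.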

To obtain the fittest parameters, we pass this structure to a `SEMNANSolver`.


In [10]:
semnan = sc.SEMNANSolver(struct)

We will obtain the fittest parameters (that is, the optimal weights between the variables)
with respect to the sample covariance matrix, which is the observed covariance matrix induced by the causal system.

In [11]:
sample_covariance = torch.tensor([
        [2,  3,  6,  8],
        [3,  7, 12, 16],
        [6, 12, 23, 30],
        [8, 16, 30, 41],
    ])

semnan.sample_covariance = sample_covariance
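
A covariance matrix must be symmetric and positive semi-definite. Whether `SEMNANSolver` validates its input is not stated here, so a quick check beforehand may be worthwhile. The sketch below runs on the CPU with plain PyTorch and is independent of `semnan_cuda`; the variable names `is_symmetric` and `is_positive_definite` are illustrative.

```python
import torch

sample_covariance = torch.tensor([
        [2,  3,  6,  8],
        [3,  7, 12, 16],
        [6, 12, 23, 30],
        [8, 16, 30, 41],
    ], dtype=torch.float64)

# Symmetry: the matrix must equal its transpose.
is_symmetric = torch.equal(sample_covariance, sample_covariance.T)

# Positive definiteness: all eigenvalues of the symmetric matrix are positive.
eigenvalues = torch.linalg.eigvalsh(sample_covariance)
is_positive_definite = bool((eigenvalues > 0).all())

print("symmetric:", is_symmetric)
print("positive definite:", is_positive_definite)
```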

The newly created `SEMNANSolver` object relies on gradient descent to compute the optimal weights.
However, it only computes the partial derivatives of the objective function with respect to the weights; it does not update them.
Therefore, we need an optimizer of our choice to update the weights at each step.

In [12]:
optim = torch.optim.Adamax([semnan.weights], lr=0.001)

That's it! All that remains is to train the `SEMNANSolver`. Training it is pretty much
like training a neural network: we take a `forward()` and a `backward()` step and then call the `step()` method
of the optimizer to do the rest. For this, we first set the stopping conditions:

In [13]:
max_iterations = 10000
min_error = 1.0e-7

The following trains the AMASEM. To track progress, we also print the loss ten times over the course of the run.


In [14]:
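for i in range(max_iterations):
    semnan.forward()                 # forward pass
    error = semnan.loss().item()

    # Stop as soon as the loss falls below the threshold
    if error < min_error:
        break

    semnan.backward()                # partial derivatives with respect to the weights
    optim.step()                     # let the optimizer update the weights

    # Report progress ten times over the run
    if i % (max_iterations // 10) == 0:
        print(f"iteration={i:<10} loss={error:<15.5}")
else:
    # Reached only if the loop finished without hitting `break`
    print("Did not converge in the maximum number of iterations!")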

iteration=0          loss=12.965         
iteration=1000       loss=1.7852         
iteration=2000       loss=1.0448         
iteration=3000       loss=0.39159        
iteration=4000       loss=0.094944       
iteration=5000       loss=0.036318       
iteration=6000       loss=0.012206       
iteration=7000       loss=0.0025386      
iteration=8000       loss=8.3685e-05     


Now that the AMASEM has been parametrized using the `SEMNANSolver`, we can print out the induced visible covariance matrix...

In [15]:
print(semnan.visible_covariance_)

tensor([[ 1.9993,  2.9969,  5.9954,  7.9940],
        [ 2.9969,  6.9904, 11.9849, 15.9800],
        [ 5.9954, 11.9849, 22.9767, 29.9691],
        [ 7.9940, 15.9800, 29.9691, 40.9590]], device='cuda:0')


... and the weights matrix of the AMASEM.

In [16]:
print(semnan.weights)

tensor([[ 2.7763e-01,  1.5870e+00,  6.8930e+00, -0.0000e+00],
        [ 0.0000e+00,  1.1762e-01, -0.0000e+00,  1.2119e+00],
        [-1.3864e+00, -0.0000e+00,  0.0000e+00,  0.0000e+00],
        [ 0.0000e+00, -2.5046e-01, -0.0000e+00,  0.0000e+00],
        [ 0.0000e+00, -0.0000e+00, -1.9917e-06, -0.0000e+00],
        [-0.0000e+00,  0.0000e+00,  0.0000e+00, -6.0673e-01],
        [-0.0000e+00,  1.2786e+00,  6.5536e+00,  1.4268e-01],
        [-0.0000e+00,  0.0000e+00, -3.0101e+00, -0.0000e+00],
        [ 0.0000e+00, -0.0000e+00, -0.0000e+00,  1.2858e+00],
        [-0.0000e+00, -0.0000e+00, -0.0000e+00,  0.0000e+00]], device='cuda:0')
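
As a sanity check, the printed weights can be mapped back to a covariance matrix by hand. The sketch below assumes the usual linear SEM reading: entry `[i, j]` is the weight of the edge from row variable `i` to column variable `j`, and the latent variables are independent with unit variance. These are assumptions about the model rather than documented behavior of `semnan_cuda`, and the names `latent_weights`, `visible_weights`, and `total_effects` are illustrative. Under these assumptions, the rounded weights reproduce the printed visible covariance to within about 0.01.

```python
import torch

# Weights as printed above (rounded to five significant digits)
weights = torch.tensor([
    [ 2.7763e-01,  1.5870e+00,  6.8930e+00,  0.0],
    [ 0.0,         1.1762e-01,  0.0,         1.2119e+00],
    [-1.3864e+00,  0.0,         0.0,         0.0],
    [ 0.0,        -2.5046e-01,  0.0,         0.0],
    [ 0.0,         0.0,        -1.9917e-06,  0.0],
    [ 0.0,         0.0,         0.0,        -6.0673e-01],
    [ 0.0,         1.2786e+00,  6.5536e+00,  1.4268e-01],  # X
    [ 0.0,         0.0,        -3.0101e+00,  0.0],         # V_BP
    [ 0.0,         0.0,         0.0,         1.2858e+00],  # V_BMI
    [ 0.0,         0.0,         0.0,         0.0],         # Y
], dtype=torch.float64)

num_visible = weights.shape[1]
latent_weights = weights[:-num_visible]     # 6 x 4: latent -> visible
visible_weights = weights[-num_visible:]    # 4 x 4: visible -> visible

# Total effect of each latent on each visible variable, summed over all directed paths
identity = torch.eye(num_visible, dtype=torch.float64)
total_effects = latent_weights @ torch.linalg.inv(identity - visible_weights)

# With unit-variance independent latents, the induced covariance is T^T T
covariance = total_effects.T @ total_effects
print(covariance)
```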
