Training with the regret loss requires the derivative of the fairness problem's optimal solution $d^*(\hat{r})$ with respect to its input parameters $\hat{r}$.

We obtain these derivatives by differentiating the KKT conditions.

By the chain rule, the gradient of the regret loss with respect to the model parameters $\boldsymbol\theta$ is
\begin{align}
    \frac{\partial{l_{Regret}(\hat{r},\cdot)}}{\partial{\boldsymbol\theta}} = \frac{\partial{l_{Regret}(\hat{r},\cdot)}}{\partial{d^*(\hat{r})}} \cdot \frac{\partial{d^*(\hat{r})}}{\partial{\hat{r}}} \cdot \frac{\partial{\hat{r}}}{\partial{\boldsymbol{\theta}}}
\end{align}

The middle factor, $\partial d^*(\hat{r}) / \partial \hat{r}$, is computed by solving the differentiated KKT system. This is a linear system $A x = B$ with solution $x = A^{-1} B$, where $A$ is the invertible KKT matrix.
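
As a rough illustration of differentiating the KKT conditions, the sketch below uses a small equality-constrained QP, $\min_d \tfrac{1}{2} d^\top Q d - \hat{r}^\top d$ subject to $A d = b$, as a stand-in for the fairness problem. The QP, the names `Q`, `A`, `b`, `K`, `solve_qp`, and the random data are all assumptions made for the example, not the actual problem. Differentiating the KKT system with respect to $\hat{r}$ gives a linear system with the same KKT matrix and right-hand side $[I; 0]$, and the result is checked against PyTorch autograd.

```python
import torch

torch.manual_seed(0)
dtype = torch.float64
n, m = 4, 2  # number of decision variables and equality constraints (illustrative)

# Hypothetical problem data: Q is positive definite, A has full row rank.
M = torch.randn(n, n, dtype=dtype)
Q = M @ M.T + n * torch.eye(n, dtype=dtype)
A = torch.randn(m, n, dtype=dtype)
b = torch.randn(m, dtype=dtype)

# KKT matrix K = [[Q, A^T], [A, 0]] of the stand-in QP.
K = torch.cat([
    torch.cat([Q, A.T], dim=1),
    torch.cat([A, torch.zeros(m, m, dtype=dtype)], dim=1),
], dim=0)

def solve_qp(r):
    """Solve the KKT system K [d; lam] = [r; b] and return the primal part d*(r)."""
    return torch.linalg.solve(K, torch.cat([r, b]))[:n]

r_hat = torch.randn(n, dtype=dtype)

# Differentiated KKT system: K [dd/dr; dlam/dr] = [I; 0].
rhs = torch.cat([torch.eye(n, dtype=dtype), torch.zeros(m, n, dtype=dtype)], dim=0)
jac_kkt = torch.linalg.solve(K, rhs)[:n]

# Reference Jacobian obtained by autograd through the linear solve.
jac_autograd = torch.autograd.functional.jacobian(solve_qp, r_hat)

print(torch.allclose(jac_kkt, jac_autograd))
```

The sketch calls `torch.linalg.solve` rather than forming $A^{-1}$ explicitly; both give the same Jacobian here, and the direct solve avoids an extra matrix product.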

What I need to code:
- Replace the gradient calculation in the backward pass of regret-loss training, switching from the perturbed approach to autograd.
- For problems with closed-form solutions, use the analytical gradient formula.
- Compare with the baselines on speed and on loss performance:
    - with two-stage
    - for problems with closed forms, compare with autograd on speed and regret performance.
    - for other regular problems, compare the matrix approach with autograd on speed and regret performance.
        - Here, expect slower speed but slightly better regret performance.
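
One possible way to plug an analytical gradient into the backward pass is a custom `torch.autograd.Function`. The sketch below is only an illustration under stated assumptions: the closed-form problem (Euclidean projection of $\hat{r}$ onto the hyperplane $\sum_i d_i = 1$), the class name `HyperplaneProjection`, the predictor $\hat{r} = \sin(\boldsymbol\theta)$, and the linear stand-in for the regret loss are all hypothetical, not the actual fairness problem or loss.

```python
import torch

class HyperplaneProjection(torch.autograd.Function):
    """Hypothetical closed-form layer: d*(r) = argmin_d 0.5*||d - r||^2 s.t. sum(d) = 1."""

    @staticmethod
    def forward(ctx, r):
        n = r.shape[-1]
        # Closed-form solution: shift r so that its entries sum to 1.
        return r - (r.sum(dim=-1, keepdim=True) - 1.0) / n

    @staticmethod
    def backward(ctx, grad_d):
        # Analytical Jacobian J = I - (1/n) 1 1^T is symmetric, so J^T g = g - mean(g).
        return grad_d - grad_d.mean(dim=-1, keepdim=True)

# Chain rule: theta -> r_hat -> d*(r_hat) -> loss.
theta = torch.tensor([0.3, -1.2, 0.8], dtype=torch.float64, requires_grad=True)
r_hat = torch.sin(theta)                      # stand-in predictor
d_star = HyperplaneProjection.apply(r_hat)    # closed-form solve with analytical backward
r_true = torch.tensor([1.0, 0.5, -0.5], dtype=torch.float64)
loss = -(r_true * d_star).sum()               # stand-in for the regret loss
loss.backward()

print("d*:", d_star)
print("dL/dtheta:", theta.grad)
```

The same pattern should carry over to problems without a closed form: `forward` would call the solver, and `backward` would solve the differentiated KKT system instead of applying a formula.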

In [2]:
import torch

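# Build a backward hook that stores twice the incoming gradient on var.grad.
# The hook returns None, so the gradient passed further back is unchanged.
def set_grad(var):
    def hook(grad):
        var.grad = grad * 2
    return hook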

X = torch.tensor([[0.5, 0.3, 2.1], [0.2, 0.1, 1.1]], requires_grad=True)
W = torch.tensor([[2.1, 1.5], [-1.4, 0.5], [0.2, 1.1]])
B = torch.tensor([1.1, -0.3])

# Z = XW + B (F.linear computes X @ weight.T + bias, and here weight = W.T)
Z = torch.nn.functional.linear(X, weight=W.t(), bias=B)

# register_hook for Z
Z.register_hook(set_grad(Z))

S = torch.sum(Z)
S.backward()
print("Z:\n", Z)
print("gZ:\n", Z.grad)
print("gX:\n",X, X.grad)

Z:
 tensor([[2.1500, 2.9100],
        [1.6000, 1.2600]], grad_fn=<AddmmBackward0>)
gZ:
 tensor([[2., 2.],
        [2., 2.]])
gX:
 tensor([[0.5000, 0.3000, 2.1000],
        [0.2000, 0.1000, 1.1000]], requires_grad=True) tensor([[ 3.6000, -0.9000,  1.3000],
        [ 3.6000, -0.9000,  1.3000]])


In [3]:
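# Same hook as in the previous cell: it stores 2 * (incoming gradient) on var.grad
# and returns None, so the gradient that keeps flowing backward is not modified.
def set_grad(var):
    def hook(grad):
        var.grad = 2 * grad
    return hook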

theta = torch.tensor([3.0], requires_grad=True)  # The initial variable theta
r = torch.sin(theta)  # r = sin(theta)
g = r ** 2  # g = r^2
L = g.sum()  # L = g

g.register_hook(set_grad(g))

L.backward()

print("theta:\n", theta)
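# r is a non-leaf tensor, so r.grad is None unless r.retain_grad() is called;
# if retained, it would hold dL/dr rather than dr/dtheta.
print("dr/dtheta (should be cos(theta)):\n", r.grad)
print("g (r^2):\n", g)
# g.grad is set by the hook to 2 * dL/dg = 2; it is not dg/dr.
print("Manually modified dg/dr:\n", g.grad)
# theta.grad is the unmodified dL/dtheta = 2 * sin(theta) * cos(theta).
print("Final gradient dL/dtheta:\n", theta.grad)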


theta:
 tensor([3.], requires_grad=True)
dr/dtheta (should be cos(theta)):
 None
g (r^2):
 tensor([0.0199], grad_fn=<PowBackward0>)
Manually modified dg/dr:
 tensor([2.])
Final gradient dL/dtheta:
 tensor([-0.2794])
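
The output above shows two things: `r.grad` is `None` because `r` is a non-leaf tensor, and the final gradient is unchanged because the hook only stores a value and returns nothing. A minimal variant that does change the backward pass is sketched here. It assumes the goal is to scale dL/dg by 2; the hook returns the new gradient, and `retain_grad()` keeps dL/dr on `r`.

```python
import torch

theta = torch.tensor([3.0], requires_grad=True)
r = torch.sin(theta)   # r = sin(theta)
r.retain_grad()        # keep dL/dr on the non-leaf tensor r
g = r ** 2             # g = r^2
L = g.sum()

# Returning a tensor from the hook replaces the gradient flowing back through g.
handle = g.register_hook(lambda grad: 2 * grad)
L.backward()
handle.remove()        # remove the hook once it is no longer needed

print("dL/dr with the modified dL/dg:\n", r.grad)
print("Final gradient dL/dtheta:\n", theta.grad)
```

With the hook returning `2 * grad`, `theta.grad` comes out as twice the unmodified value, since every downstream factor of the chain rule is multiplied by the replaced gradient.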




In [4]:
import torch

# Create input tensors
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

# Define the computation
z = x**2 + y**2

# Compute gradients using autograd
z.backward()

# Access the gradients
grad_x = x.grad
grad_y = y.grad

# Print the gradients
print("Gradient of z with respect to x:", grad_x)
print("Gradient of z with respect to y:", grad_y)

Gradient of z with respect to x: tensor(4.)
Gradient of z with respect to y: tensor(6.)


In [5]:
import numpy as np

def generate_matrix(n):
    """
    Generates a random nxn matrix.
    
    Parameters:
    n (int): The dimension of the matrix to be generated.
    
    Returns:
    np.ndarray: A random nxn matrix.
    """
    return np.random.rand(n, n)

def solve_system_by_inversion(A, B):
    """
    Solves the system Ax = B by matrix inversion.
    
    Parameters:
    A (np.ndarray): The coefficient matrix.
    B (np.ndarray): The right-hand side matrix or vector.
    
    Returns:
    np.ndarray: The solution vector or matrix x.
    """
    # np.linalg.inv raises LinAlgError for an exactly singular matrix; comparing
    # the determinant with 0 is unreliable in floating point.
    try:
        A_inv = np.linalg.inv(A)
    except np.linalg.LinAlgError as exc:
        raise ValueError("Matrix A is not invertible.") from exc

    # Calculate the solution x = A^{-1} B
    x = A_inv @ B

    return x



In [6]:
# Example usage:
n = 3
A = generate_matrix(n)
B = np.random.rand(n, 1)

print("Matrix A:")
print(A)

print("\nMatrix B:")
print(B)

try:
    x = solve_system_by_inversion(A, B)
    print("\nSolution x:")
    print(x)
except ValueError as e:
    print(e)

Matrix A:
[[0.37817492 0.63857121 0.92564704]
 [0.48543803 0.52120504 0.78655682]
 [0.89952628 0.40314126 0.04170855]]

Matrix B:
[[0.98917964]
 [0.96889453]
 [0.69563087]]

Solution x:
[[ 0.77175175]
 [-0.08013672]
 [ 0.80861868]]
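
For the speed and accuracy comparison, it may be worth checking the explicit inverse against a direct solve. The sketch below is a minimal comparison under assumptions: the matrix size, the seed, and the diagonally dominant test matrix are arbitrary choices for illustration. `np.linalg.solve` factorizes `A` instead of inverting it, which is generally the preferred way to solve $Ax = B$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Strictly diagonally dominant test matrix, hence invertible and well conditioned.
A = rng.random((n, n)) + n * np.eye(n)
B = rng.random((n, 1))

x_inv = np.linalg.inv(A) @ B      # explicit inverse, as in solve_system_by_inversion
x_solve = np.linalg.solve(A, B)   # direct solve without forming the inverse

print("Solutions agree:", np.allclose(x_inv, x_solve))
print("Residual (inverse):", np.linalg.norm(A @ x_inv - B))
print("Residual (solve):  ", np.linalg.norm(A @ x_solve - B))
```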
