## [TORCH.AUTOGRAD.FUNCTIONAL.JACOBIAN](https://pytorch.org/docs/master/generated/torch.autograd.functional.jacobian.html#torch-autograd-functional-jacobian)

>### **torch.autograd.functional.jacobian** ( ***func, inputs, create_graph=False, strict=False, vectorize=False, strategy='reverse-mode'*** ) [SOURCE](https://pytorch.org/docs/master/_modules/torch/autograd/functional.html#jacobian)  

&emsp;&emsp;Function that computes the Jacobian of a given function.

#### **Parameters**
 - **func** (function) – a Python function that takes Tensor inputs and returns a tuple of Tensors or a Tensor.
 
 - **inputs** (tuple of Tensors or [<font color="red">Tensor</font>](https://pytorch.org/docs/master/tensors.html#torch.Tensor)) – inputs to the function <font color="DarkBlue">func</font>.
 
 - **create_graph** ([<font color="red">bool</font>](https://docs.python.org/3/library/functions.html#bool), optional) – If <font color="DarkBlue">True</font>, the Jacobian will be computed in a differentiable manner. Note that when <font color="DarkBlue">strict</font> is <font color="DarkBlue">False</font>, the result may not require gradients or may be disconnected from the inputs. Defaults to <font color="DarkBlue">False</font>.  
 
 - **strict** ([<font color="red">bool</font>](https://docs.python.org/3/library/functions.html#bool), optional) – If <font color="DarkBlue">True</font>, an error will be raised when we detect that there exists an input such that all the outputs are independent of it. If <font color="DarkBlue">False</font>, we return a Tensor of zeros as the jacobian for said inputs, which is the expected mathematical value. Defaults to <font color="DarkBlue">False</font>.
 
 - **vectorize** ([<font color="red">bool</font>](https://docs.python.org/3/library/functions.html#bool), optional) – This feature is experimental. Please consider using [<font color="red">functorch’s jacrev or jacfwd</font>](https://github.com/pytorch/functorch#what-are-the-transforms) instead if you are looking for something less experimental and more performant. When computing the jacobian, usually we invoke <font color="DarkBlue">autograd.grad</font> once per row of the jacobian. If this flag is <font color="DarkBlue">True</font>, we perform only a single <font color="DarkBlue">autograd.grad</font> call with <font color="DarkBlue">batched_grad=True</font> which uses the vmap prototype feature. Though this should lead to performance improvements in many cases, because this feature is still experimental, there may be performance cliffs. See [torch.autograd.grad()](https://pytorch.org/docs/master/generated/torch.autograd.grad.html#torch.autograd.grad)’s <font color="DarkBlue">batched_grad</font> parameter for more information.
 
 - **strategy** ([<font color="red">str</font>](https://docs.python.org/3/library/stdtypes.html#str), optional) – Set to <font color="DarkBlue">"forward-mode"</font> or <font color="DarkBlue">"reverse-mode"</font> to determine whether the Jacobian will be computed with forward or reverse mode AD. Currently, <font color="DarkBlue">"forward-mode"</font> requires <font color="DarkBlue">vectorize=True</font>. Defaults to <font color="DarkBlue">"reverse-mode"</font>. If <font color="DarkBlue">func</font> has more outputs than inputs, <font color="DarkBlue">"forward-mode"</font> tends to be more performant. Otherwise, prefer <font color="DarkBlue">"reverse-mode"</font>.
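
The `strategy` and `vectorize` options above can be compared on a small function. The following is a minimal sketch, assuming a PyTorch build in which forward-mode AD is available; `W` and `linear_map` are hypothetical names chosen for illustration. Because the function is linear, both strategies should return `W` itself.

```python
import torch
from torch.autograd.functional import jacobian

torch.manual_seed(0)
W = torch.randn(4, 2)  # hypothetical weight: 2 inputs -> 4 outputs

def linear_map(x):
    # More outputs (4) than inputs (2)
    return W @ x

x = torch.rand(2)

# Default strategy: reverse-mode AD
jac_rev = jacobian(linear_map, x)

# Forward-mode AD; the parameter description says this requires vectorize=True
jac_fwd = jacobian(linear_map, x, vectorize=True, strategy="forward-mode")

print(jac_rev.shape)                     # shape is (outputs, inputs) = (4, 2)
print(torch.allclose(jac_rev, jac_fwd))  # the two strategies should agree
print(torch.allclose(jac_rev, W))        # the Jacobian of W @ x is W
```

Which strategy is faster in practice depends on the function and the PyTorch version, so it is worth timing both on a real workload.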

#### **Returns**  
- If there is a single input and a single output, this will be a single Tensor containing the Jacobian for the linearized inputs and output. If one of the two is a tuple, then the Jacobian will be a tuple of Tensors. If both of them are tuples, then the Jacobian will be a tuple of tuples of Tensors, where <font color="DarkBlue">Jacobian[i][j]</font> contains the Jacobian of the **<font color="DarkBlue">ith</font>** output with respect to the **<font color="DarkBlue">jth</font>** input. Its size is the concatenation of the sizes of the corresponding output and the corresponding input, and it is on the same device as the corresponding input. If strategy is <font color="DarkBlue">forward-mode</font>, its dtype will be that of the output; otherwise, that of the input.
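
To make the nested return structure concrete, here is a minimal sketch with two inputs and two outputs. The function `two_outputs` and the tensor sizes are assumptions chosen for illustration only.

```python
import torch
from torch.autograd.functional import jacobian

def two_outputs(x, y):
    # Output 0 has shape (3,), output 1 has shape (2,)
    return x * y.sum(), (x ** 2).sum() + y

x = torch.rand(3)
y = torch.rand(2)

jac = jacobian(two_outputs, (x, y))

# jac[i][j] is the Jacobian of output i with respect to input j;
# its shape is the output shape followed by the input shape
for i, row in enumerate(jac):
    for j, block in enumerate(row):
        print(f"jac[{i}][{j}].shape = {tuple(block.shape)}")
```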

#### Return type
- Jacobian ([<font color="red">Tensor</font>](https://pytorch.org/docs/master/tensors.html#torch.Tensor) or nested tuple of Tensors)
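
As a small illustration of the `strict` flag described under Parameters, the sketch below uses a hypothetical function `ignores_y` whose output does not depend on its second input. With the default `strict=False` the corresponding Jacobian should be all zeros, and with `strict=True` an error is expected instead.

```python
import torch
from torch.autograd.functional import jacobian

def ignores_y(x, y):
    # The output depends on x only; y is unused
    return 2 * x

x = torch.rand(3)
y = torch.rand(2)

# strict=False (default): the Jacobian with respect to y is filled with zeros
jac_x, jac_y = jacobian(ignores_y, (x, y))
print(jac_y)

# strict=True: the independence is reported as an error instead
try:
    jacobian(ignores_y, (x, y), strict=True)
    raised = False
except RuntimeError as err:
    raised = True
    print("RuntimeError:", err)
```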

In [1]:
# Example
import torch
from torch.autograd.functional import jacobian

def exp_reducer(x): # takes a tensor x, computes e^x, and sums over dim 1
  return x.exp().sum(dim=1)
inputs = torch.rand(2, 2) # rand samples uniformly from [0, 1)
jacobian(exp_reducer, inputs) # inputs is passed to exp_reducer; the result is the derivative of exp_reducer(inputs) with respect to inputs

tensor([[[2.7033, 2.0317],
         [0.0000, 0.0000]],

        [[0.0000, 0.0000],
         [1.1729, 1.0367]]])

In [2]:
jacobian(exp_reducer, inputs, create_graph=True)

tensor([[[2.7033, 2.0317],
         [0.0000, 0.0000]],

        [[0.0000, 0.0000],
         [1.1729, 1.0367]]], grad_fn=<ViewBackward0>)
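
The `grad_fn` on the result above means the Jacobian can itself be differentiated. The following is a minimal sketch of that idea, assuming the input is created with `requires_grad=True` so that the Jacobian stays connected to it; reducing with `jac.sum()` is an arbitrary choice made for illustration.

```python
import torch
from torch.autograd.functional import jacobian

def exp_reducer(x):
    return x.exp().sum(dim=1)

x = torch.rand(2, 2, requires_grad=True)

# With create_graph=True the Jacobian stays attached to the autograd graph
jac = jacobian(exp_reducer, x, create_graph=True)

# Differentiate a scalar function of the Jacobian with respect to x.
# The nonzero entries of jac are exp(x), so d(jac.sum())/dx is exp(x) again.
jac.sum().backward()
print(torch.allclose(x.grad, x.detach().exp()))
```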

In [3]:
def exp_adder(x, y):
  return 2 * x.exp() + 3 * y
inputs = (torch.rand(2), torch.rand(2))
jacobian(exp_adder, inputs)

(tensor([[4.2750, 0.0000],
         [0.0000, 2.2751]]),
 tensor([[3., 0.],
         [0., 3.]]))

In [4]:
x = torch.randn(2,3) # randn samples from a standard normal (Gaussian) distribution
x

tensor([[-0.4765, -0.7783, -0.3851],
        [-2.1350, -0.5474, -1.8763]])

In [5]:
y = exp_reducer(x) # compute e^x and sum over dim 1
y

tensor([1.7605, 0.8499])

In [6]:
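jacobian(exp_reducer, x) # the upper 2x3 block holds the partial derivatives of y1 with respect to each element of x
                         # the lower 2x3 block holds the partial derivatives of y2 with respect to each element of x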

tensor([[[0.6210, 0.4592, 0.6804],
         [0.0000, 0.0000, 0.0000]],

        [[0.0000, 0.0000, 0.0000],
         [0.1182, 0.5785, 0.1531]]])
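
The block structure of this result can be checked against the closed form. Since y_i is the sum over l of exp(x[i, l]), the derivative of y_i with respect to x[k, l] is exp(x[k, l]) when i equals k and zero otherwise. A minimal sketch of that check, where building the expected tensor with `torch.einsum` is just one convenient choice:

```python
import torch
from torch.autograd.functional import jacobian

def exp_reducer(x):
    return x.exp().sum(dim=1)

x = torch.randn(2, 3)
jac = jacobian(exp_reducer, x)  # shape (2, 2, 3)

# expected[i, k, l] = exp(x[k, l]) if i == k, else 0
expected = torch.einsum("ik,kl->ikl", torch.eye(2), x.exp())

print(jac.shape)
print(torch.allclose(jac, expected))
```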

In [7]:
# Derivative of a vector with respect to a vector
a = torch.randn(3)
print("a =", a)
def func(x):
    return a+x
x = torch.randn(3)
print("x =", x)
print("a+x =",func(x))
jacobian(func, x), jacobian(func, x).shape 

a = tensor([-1.3400,  0.3056,  1.1258])
x = tensor([-0.6280,  0.0718,  0.2278])
a+x = tensor([-1.9680,  0.3773,  1.3535])


(tensor([[1., 0., 0.],
         [0., 1., 0.],
         [0., 0., 1.]]),
 torch.Size([3, 3]))

In [8]:
# Derivative of a matrix with respect to a matrix
a = torch.randn(3,3)
print("a =", a)
def func(x):
    return a+x
x = torch.randn(3,3)
print("x =", x)
print("a+x =",func(x))
jacobian(func, x), jacobian(func, x).shape

a = tensor([[ 0.2130,  0.9812, -1.7295],
        [-0.7298, -0.4272, -0.5411],
        [ 0.0050,  1.1093,  0.1938]])
x = tensor([[-1.9332, -1.1321,  0.5352],
        [ 0.0445, -0.2164,  2.1953],
        [ 0.0195, -0.6834,  1.1046]])
a+x = tensor([[-1.7202, -0.1509, -1.1943],
        [-0.6853, -0.6436,  1.6543],
        [ 0.0245,  0.4258,  1.2983]])


(tensor([[[[1., 0., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],
 
          [[0., 1., 0.],
           [0., 0., 0.],
           [0., 0., 0.]],
 
          [[0., 0., 1.],
           [0., 0., 0.],
           [0., 0., 0.]]],
 
 
         [[[0., 0., 0.],
           [1., 0., 0.],
           [0., 0., 0.]],
 
          [[0., 0., 0.],
           [0., 1., 0.],
           [0., 0., 0.]],
 
          [[0., 0., 0.],
           [0., 0., 1.],
           [0., 0., 0.]]],
 
 
         [[[0., 0., 0.],
           [0., 0., 0.],
           [1., 0., 0.]],
 
          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 1., 0.]],
 
          [[0., 0., 0.],
           [0., 0., 0.],
           [0., 0., 1.]]]]),
 torch.Size([3, 3, 3, 3]))
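
The four-dimensional result is easier to read once flattened. The sketch below reshapes it so that rows index the 9 output elements and columns index the 9 input elements; for `a + x` this should be the 9x9 identity matrix.

```python
import torch
from torch.autograd.functional import jacobian

a = torch.randn(3, 3)

def func(x):
    return a + x

x = torch.randn(3, 3)
jac = jacobian(func, x)  # shape (3, 3, 3, 3): output dims followed by input dims

# Flatten the output dims into rows and the input dims into columns
jac_2d = jac.reshape(func(x).numel(), x.numel())
print(jac_2d.shape)
print(torch.equal(jac_2d, torch.eye(9)))
```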

In [9]:
# Compute with .backward()
a = torch.randn(3)
print("a =", a)
def func(x):
    return a+x
x = torch.randn(3, requires_grad=True)
print("x =", x)
y = func(x)
print("y =", y)
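y.backward(torch.ones_like(y)) # y is not a scalar, so backward() raises an error unless it is given a tensor of the same size as y; passing an all-ones tensor is equivalent to differentiating y.sum() with respect to x
print("x.grad =",x.grad) 
print("jacobian:", jacobian(func, x))
torch.ones_like(y) @ jacobian(func, x) # vector-matrix product v @ J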

a = tensor([ 0.1043, -0.0819, -0.9433])
x = tensor([-0.7089, -0.8569,  0.6958], requires_grad=True)
y = tensor([-0.6047, -0.9388, -0.2475], grad_fn=<AddBackward0>)
x.grad = tensor([1., 1., 1.])
jacobian: tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])


tensor([1., 1., 1.])
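
The same relationship holds for any weight vector, not only all ones: `y.backward(v)` accumulates the vector-Jacobian product `v @ J` into `x.grad`. The following is a minimal sketch using a nonlinear function so that the Jacobian is not the identity; the function and the values in `v` are assumptions chosen for illustration. `torch.autograd.functional.vjp` computes the same product without building the full Jacobian.

```python
import torch
from torch.autograd.functional import jacobian, vjp

a = torch.randn(3)

def func(x):
    return a * x ** 2  # nonlinear, so the Jacobian depends on x

x = torch.randn(3, requires_grad=True)
v = torch.tensor([1.0, 2.0, 3.0])  # arbitrary weights, chosen for illustration

y = func(x)
y.backward(v)  # accumulates v @ J into x.grad

J = jacobian(func, x.detach())            # explicit 3x3 Jacobian
_, vjp_result = vjp(func, x.detach(), v)  # same product without building J

print(torch.allclose(x.grad, v @ J))
print(torch.allclose(x.grad, vjp_result))
```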

In [10]:
a = torch.randn(2, 3, requires_grad=True)
print("a =", a)
b = torch.randn(3, 2, requires_grad=True)
print("b =", b)
y = a @ b
y

a = tensor([[-0.5171, -0.9421,  0.8334],
        [-0.0342, -0.8039, -0.1967]], requires_grad=True)
b = tensor([[ 0.7732, -0.9926],
        [ 0.3547, -0.4250],
        [-0.4115,  0.5164]], requires_grad=True)


tensor([[-1.0770,  1.3441],
        [-0.2306,  0.2740]], grad_fn=<MmBackward0>)

In [11]:
y.backward()    # raises an error because y is not a scalar

RuntimeError: grad can be implicitly created only for scalar outputs
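
There are two common ways around this error, and they should give the same gradient: reduce the output to a scalar before calling `backward()`, or pass an all-ones tensor of the output's shape. A minimal sketch on fresh tensors (the names here are independent of the cells above):

```python
import torch

a = torch.randn(2, 3, requires_grad=True)
b = torch.randn(3, 2)

# Option 1: reduce to a scalar, then call backward() with no argument
(a @ b).sum().backward()
grad_from_sum = a.grad.clone()

# Option 2: pass an all-ones tensor with the same shape as the output
a.grad = None  # reset the accumulated gradient
y = a @ b
y.backward(torch.ones_like(y))
grad_from_ones = a.grad.clone()

print(torch.allclose(grad_from_sum, grad_from_ones))
```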

In [12]:
y.backward(torch.ones_like(y))
print("a.grad:", a.grad)
print("b.grad:", b.grad)

a.grad: tensor([[-0.2194, -0.0703,  0.1049],
        [-0.2194, -0.0703,  0.1049]])
b.grad: tensor([[-0.5513, -0.5513],
        [-1.7460, -1.7460],
        [ 0.6367,  0.6367]])


In [13]:
# Verify with the jacobian function
def func(a):
    return a @ b
func(a)

tensor([[-1.0770,  1.3441],
        [-0.2306,  0.2740]], grad_fn=<MmBackward0>)

In [14]:
func(a[0]) # check the first row

tensor([-1.0770,  1.3441], grad_fn=<SqueezeBackward3>)

In [15]:
torch.ones_like(func(a[0])) @ jacobian(func, a[0])   # same computation as a.grad in the previous section: (v.T)·J

tensor([-0.2194, -0.0703,  0.1049])

In [16]:
a.grad[0]

tensor([-0.2194, -0.0703,  0.1049])
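
The row-by-row check above can be extended to the whole matrix in one step. The sketch below contracts the all-ones tensor with the full four-dimensional Jacobian over the output dimensions; using `torch.einsum` for the contraction is an assumption of this example, not something the cells above rely on.

```python
import torch
from torch.autograd.functional import jacobian

a = torch.randn(2, 3, requires_grad=True)
b = torch.randn(3, 2)

def func(a):
    return a @ b

y = func(a)
v = torch.ones_like(y)
y.backward(v)  # fills a.grad with the vector-Jacobian product

J = jacobian(func, a.detach())  # shape (2, 2, 2, 3): output dims then input dims

# Sum over the output indices (i, j) to get a tensor shaped like the input
vjp_full = torch.einsum("ij,ijkl->kl", v, J)

print(J.shape)
print(torch.allclose(vjp_full, a.grad))
```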