<br>

<div align=center><font color=maroon size=8><b>torch.nn</b></font></div>

<font size=4><b>References:</b></font>
* `Docs > `<a href="https://pytorch.org/docs/stable/tensor_attributes.html" style="text-decoration:none;">Tensor Attributes</a>
* `Docs > `<a href="https://pytorch.org/docs/stable/tensor_view.html" style="text-decoration:none;">Tensor Views</a>

* `Docs > `<a href="https://pytorch.org/docs/stable/torch.html" style="text-decoration:none;">torch</a>
    * Docs > torch > <a href="https://pytorch.org/docs/stable/generated/torch.tensor.html" style="text-decoration:none;">torch.tensor</a> 
    * Docs > torch > <a href="https://pytorch.org/docs/stable/generated/torch.no_grad.html" style="text-decoration:none;">no_grad</a>
    * Docs > torch > <a href="https://pytorch.org/docs/stable/generated/torch.numel.html" style="text-decoration:none;">torch.numel</a>

* `Docs > `<a href="https://pytorch.org/docs/stable/tensors.html" style="text-decoration:none;">torch.Tensor</a>
* `Docs > torch.Tensor > `<a href="https://pytorch.org/docs/stable/generated/torch.Tensor.view.html" style="text-decoration:none;">torch.Tensor.view</a>
* `Docs > torch.Tensor > `<a href="https://pytorch.org/docs/stable/generated/torch.Tensor.detach.html" style="text-decoration:none;">torch.Tensor.detach</a>
* `Docs > torch.Tensor > `<a href="https://pytorch.org/docs/stable/generated/torch.Tensor.register_hook.html" style="text-decoration:none;">torch.Tensor.register_hook</a>
* `Docs > torch.Tensor > `<a href="https://pytorch.org/docs/stable/generated/torch.Tensor.zero_.html" style="text-decoration:none;">torch.Tensor.zero_</a>

* `Docs > `<a href="https://pytorch.org/docs/stable/nn.init.html" style="text-decoration:none;">torch.nn.init</a>
* `Docs > `<a href="https://pytorch.org/docs/stable/nn.html" style="text-decoration:none;">torch.nn</a>
    * `Docs > torch.nn > `<a href="https://pytorch.org/docs/stable/generated/torch.nn.Module.html" style="text-decoration:none;">Module</a>
    * `Docs > torch.nn > `<a href="https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html" style="text-decoration:none;">BatchNorm2d</a>
    * `Docs > torch.nn > `<a href="https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html" style="text-decoration:none;">Dropout</a>

<br>
<br>
<br>

In [2]:
import torch

<br>
<br>
<br>

<font size=3 color=gray>Docs > </font>
# Tensor Attributes <a href="https://pytorch.org/docs/stable/tensor_attributes.html" style="text-decoration:none;font-size:70%">[link]</a>

<font size=3>Each `torch.Tensor` has a <a href="https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.dtype" style="text-decoration:none;font-size:120%">torch.dtype</a>, <a href="https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.device" style="text-decoration:none;font-size:120%">torch.device</a>, and <a href="https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.layout" style="text-decoration:none;font-size:120%">torch.layout</a>.</font>

<br>

## torch.dtype <a href="https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.dtype" style="text-decoration:none;font-size:70%">[link]</a>

<div class="alert alert-block alert-info">

<font size=3 color=gray><b>CLASS</b></font>&emsp;
<font size=4><b>torch.dtype</b></font>

</div>

<font size=3>A `torch.dtype` is an object that represents the data type of a <a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor" style="text-decoration:none;font-size:120%">torch.Tensor</a>. PyTorch has <font color=maroon>**twelve**</font> different data types:</font>

<img src="./1 PyTorch documentation/1 Notes/images/torch-dtype.jpeg" width=700px align=left>

<font color=red>1</font> `torch.float16` or `torch.half`, sometimes referred to as `binary16`: uses 1 sign, 5 exponent, and 10 significand bits. <font color=maroon>Useful when **precision** is important</font> at the expense of range.

<font color=red>2</font> `torch.bfloat16`, sometimes referred to as `Brain Floating Point`: uses 1 sign, 8 exponent, and 7 significand bits. <font color=maroon>Useful when **range** is important</font>, since it has the same number of exponent bits as `float32`.
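
The precision/range trade-off between the two 16-bit formats can be inspected with `torch.finfo`. This is a minimal sketch; the variable names `fp16` and `bf16` are illustrative only.

```python
import torch

# finfo reports the machine epsilon (smaller means finer precision)
# and the largest finite value (larger means wider range) of a float dtype.
fp16 = torch.finfo(torch.float16)
bf16 = torch.finfo(torch.bfloat16)

print(fp16.eps, fp16.max)  # 10 significand bits, 5 exponent bits
print(bf16.eps, bf16.max)  # 7 significand bits, 8 exponent bits

# float16 resolves finer steps, bfloat16 covers a far wider range.
print(fp16.eps < bf16.eps)  # True
print(bf16.max > fp16.max)  # True
```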

<br>

To find out if a `torch.dtype` is a floating point data type, the property <a href="https://pytorch.org/docs/stable/generated/torch.is_floating_point.html" style="text-decoration:none;font-size:130%">is_floating_point</a> can be used, which returns True if the data type is a floating point data type.

To find out if a `torch.dtype` is a complex data type, the property <a href="https://pytorch.org/docs/stable/generated/torch.is_complex.html" style="text-decoration:none;font-size:130%">is_complex</a> can be used, which returns True if the data type is a complex data type.
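
Both checks are available as properties on a `torch.dtype` object and as functions that take a tensor. A short sketch of the two forms (the tensor `x` is just an example value):

```python
import torch

# Property form, on the dtype object itself
print(torch.float32.is_floating_point)  # True
print(torch.int64.is_floating_point)    # False
print(torch.complex64.is_complex)       # True

# Function form, on a tensor
x = torch.ones(2, dtype=torch.bfloat16)
print(torch.is_floating_point(x))  # True
print(torch.is_complex(x))         # False
```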

<br>

<font size=3>When the dtypes of inputs to an arithmetic operation (add, sub, div, mul) differ, we promote by finding the minimum dtype that satisfies the following rules:


* If the type of a scalar operand is of a <font color=maroon>higher category</font> than tensor operands (where <font color=maroon>complex > floating > integral > boolean</font>), we promote to a type with sufficient size to hold all scalar operands of that category.


* If a zero-dimension tensor operand has a higher category than dimensioned operands, we promote to a type with sufficient size and category to hold all zero-dim tensor operands of that category.


* If there are no higher-category zero-dim operands, we promote to a type with sufficient size and category to hold all dimensioned operands.</font>

<br>

<font size=3 color=maroon>A floating point scalar operand has dtype <font color=royalblue>torch.get_default_dtype()</font> and an integral non-boolean scalar operand has dtype torch.int64. Unlike numpy, we do not inspect values when determining the minimum dtypes of an operand. Quantized and complex types are not yet supported.</font>

In [3]:
float_tensor = torch.ones(1, dtype=torch.float)
double_tensor = torch.ones(1, dtype=torch.double)

complex_float_tensor = torch.ones(1, dtype=torch.complex64)
complex_double_tensor = torch.ones(1, dtype=torch.complex128)

int_tensor = torch.ones(1, dtype=torch.int)
long_tensor = torch.ones(1, dtype=torch.long)
uint_tensor = torch.ones(1, dtype=torch.uint8)
bool_tensor = torch.ones(1, dtype=torch.bool)

# zero-dim tensors
long_zerodim = torch.tensor(1, dtype=torch.long)
int_zerodim = torch.tensor(1, dtype=torch.int)

In [4]:
torch.add(5, 5).dtype

torch.int64

In [5]:
# 5 is an int64, but does not have higher category than int_tensor so is not considered.
# Here the scalar 5 and int_tensor are both in the integral category.
(int_tensor + 5).dtype

torch.int32

In [6]:
(int_tensor + long_zerodim).dtype

torch.int32

In [7]:
(long_tensor + int_tensor).dtype

torch.int64

In [8]:
(bool_tensor + long_tensor).dtype

torch.int64

In [9]:
(bool_tensor + uint_tensor).dtype

torch.uint8

In [10]:
(float_tensor + double_tensor).dtype

torch.float64

In [11]:
(complex_float_tensor + complex_double_tensor).dtype

torch.complex128

In [12]:
(bool_tensor + int_tensor).dtype

torch.int32

In [13]:
# Since long is a different kind than float, result dtype only needs to be large enough
# to hold the float.
torch.add(long_tensor, float_tensor).dtype

torch.float32
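
The promoted dtype can also be queried without running an arithmetic operation, using `torch.result_type` (for tensor/scalar operands) and `torch.promote_types` (for two dtypes). A small sketch that mirrors the cases above; the names `r1` to `r4` are illustrative, and the default dtype is assumed to be unchanged:

```python
import torch

int_tensor = torch.ones(3, dtype=torch.int)

# A float scalar is of a higher category than an int tensor,
# so the result takes the default floating point dtype.
r1 = torch.result_type(int_tensor, 1.0)

# An int scalar is in the same category, so the tensor dtype is kept.
r2 = torch.result_type(int_tensor, 5)

# promote_types works directly on two dtypes.
r3 = torch.promote_types(torch.uint8, torch.int64)
r4 = torch.promote_types(torch.int64, torch.float32)

print(r1, r2, r3, r4)
```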

<br>

<font size=3>When the output tensor of an arithmetic operation is specified, we allow casting to its ***dtype*** except that:

* An integral output tensor cannot accept a floating point tensor.


* A boolean output tensor cannot accept a non-boolean tensor.


* A non-complex output tensor cannot accept a complex tensor.</font>

Casting Examples:

In [14]:
# allowed:
float_tensor *= float_tensor
float_tensor *= int_tensor
float_tensor *= uint_tensor
float_tensor *= bool_tensor

float_tensor *= double_tensor
int_tensor *= long_tensor

int_tensor *= uint_tensor

uint_tensor *= int_tensor

In [15]:
# disallowed (RuntimeError: result type can't be cast to the desired output type):
int_tensor *= float_tensor
bool_tensor *= int_tensor
bool_tensor *= uint_tensor
float_tensor *= complex_float_tensor

RuntimeError: result type Float can't be cast to the desired output type Int
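
Whether a cast is permitted under these rules can be checked up front with `torch.can_cast`. The sketch below also catches the error from a disallowed in-place operation and shows one possible workaround, an explicit out-of-place conversion; the names `raised` and `converted` are illustrative.

```python
import torch

int_tensor = torch.ones(1, dtype=torch.int)
float_tensor = torch.ones(1, dtype=torch.float)

print(torch.can_cast(torch.int, torch.float))  # True: int result into float output
print(torch.can_cast(torch.float, torch.int))  # False: float result into int output

try:
    int_tensor *= float_tensor  # result type Float cannot be written into Int
    raised = False
except RuntimeError as err:
    raised = True
    print(err)

# Compute out of place, then convert explicitly.
converted = (int_tensor * float_tensor).to(torch.int)
print(converted.dtype)  # torch.int32
```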

<br>
<br>

## torch.device <a href="https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.device" style="text-decoration:none;font-size:70%">[link]</a>

<div class="alert alert-block alert-info">

<font size=3 color=gray><b>CLASS</b></font>&emsp;
<font size=4><b>torch.device</b></font>

</div>

A `torch.device` is an object representing the device on which a <a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor" style="text-decoration:none;font-size:120%">torch.Tensor</a> is or will be allocated.

The `torch.device` contains a device type ('<font color=maroon size=3>**cpu**</font>' or '<font color=maroon size=3>**cuda**</font>') <font color=maroon>and optional device ordinal for the device type</font>. If the device ordinal is not present, this object will always represent the current device for the device type, even after <a href="https://pytorch.org/docs/stable/generated/torch.cuda.set_device.html" style="text-decoration:none;font-size:120%">torch.cuda.set_device()</a> is called; e.g., a `torch.Tensor` constructed with device **`'cuda'`** is equivalent to **`'cuda:X'`** where X is the result of <a href="https://pytorch.org/docs/stable/generated/torch.cuda.current_device.html#torch.cuda.current_device" style="text-decoration:none;font-size:120%">torch.cuda.current_device()</a>.

A `torch.Tensor`’s device can be accessed via the <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.device.html" style="text-decoration:none;font-size:120%">Tensor.device</a> property.

A `torch.device` can be constructed via a string, or via a string and a device ordinal.

<br>

Via a string:

In [16]:
torch.device('cuda:0')

device(type='cuda', index=0)

In [17]:
torch.device('cpu')

device(type='cpu')

In [18]:
torch.device('cuda')  # current cuda device

device(type='cuda')

<br>

Via a string and device ordinal:

In [19]:
torch.device('cuda', 0)

device(type='cuda', index=0)

In [20]:
torch.device('cpu', 0)

device(type='cpu', index=0)

<div class="alert alert-block alert-info">

<font size=3 color=red><b>NOTE: </b></font>
<br><br>
The `torch.device` argument in functions can generally be substituted with a string. This allows for fast prototyping of code.
<br>

    # Example of a function that takes in a torch.device
    >>> cuda1 = torch.device('cuda:1')
    >>> torch.randn((2,3), device=cuda1)

    # You can substitute the torch.device with a string
    >>> torch.randn((2,3), device='cuda:1')

</div>

<div class="alert alert-block alert-info">

<font size=3 color=red><b>NOTE: </b></font>
<br>
<br>
For legacy reasons, a device can be constructed via a single device ordinal, which is treated as a cuda device. This matches <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.get_device.html" style="text-decoration:none;color:blue;font-size:120%">Tensor.get_device()</a>, which returns an ordinal for cuda tensors and is not supported for cpu tensors.

    >>> torch.device(1)
    device(type='cuda', index=1)
</div>

<div class="alert alert-block alert-info">

<font size=3 color=red><b>NOTE: </b></font>
<br>
<br>
Methods which take a device will generally accept a (properly formatted) string or (legacy) integer device ordinal, i.e. the following are all equivalent:

    >>> torch.randn((2,3), device=torch.device('cuda:1'))
    >>> torch.randn((2,3), device='cuda:1')
    >>> torch.randn((2,3), device=1)  # legacy

</div>
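
A common pattern that follows from the notes above is to pick the device once and pass it around, so the same code runs with or without a GPU. This is a sketch rather than a prescribed idiom; the fallback to `'cpu'` is an assumption for machines without CUDA.

```python
import torch

# Use the current CUDA device when available, otherwise the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.randn((2, 3), device=device)  # allocate directly on the device
y = torch.ones(2, 3).to(device)         # or move an existing tensor

print(x.device, (x + y).device)
```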

<br>
<br>

## torch.layout <a href="https://pytorch.org/docs/stable/tensor_attributes.html#torch-layout" style="text-decoration:none;font-size:70%">[link]</a>

<div class="alert alert-block alert-info">

<font size=3 color=gray><b>CLASS</b></font>&emsp;
<font size=4><b>torch.layout</b></font>

</div>

<div class="alert alert-block alert-danger">

<font size=3 color=red><b>WARNING: </b></font>

The `torch.layout` class is in beta and subject to change.

</div>

<font size=3>A `torch.layout` is an object that represents the memory layout of a <a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor" style="text-decoration:none;font-size:120%">torch.Tensor</a>. Currently, we support <font size=4 color=blue><b>torch.strided</b> (dense Tensors)</font> and have ***beta support*** for <font size=4 color=blue><b>torch.sparse_coo</b> (sparse COO Tensors)</font>.</font>

<font size=3>`torch.strided` represents dense Tensors and is the memory layout that is most commonly used. Each strided tensor has an associated <font color=blue>**torch.Storage**</font>, which holds its data. These tensors provide a multi-dimensional, <a href="https://en.wikipedia.org/wiki/Stride_of_an_array" style="text-decoration:none;color:maroon;font-size:120%;">strided</a> view of a storage.
<br>
<br>
<font color=maroon><b>Strides are a list of integers:</b> the k-th stride represents the jump in the memory necessary to go from one element to the next one in the k-th dimension of the Tensor. This concept makes it possible to perform many tensor operations efficiently.</font></font>

Example:

In [21]:
x = torch.tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
x.stride()

(5, 1)

In [22]:
x.t().stride()

(1, 5)
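
To make the stride definition concrete, the sketch below computes where an element sits in the flat data from its index and the strides. The helper `offset` is hypothetical (not a PyTorch function), `flat` stands in for the underlying storage, and a storage offset of zero is assumed.

```python
import torch

x = torch.tensor([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
flat = x.flatten()  # stand-in for the 1-D storage (x is contiguous)

def offset(index, strides):
    """Hypothetical helper: position of the element at `index` in the flat data."""
    return sum(i * s for i, s in zip(index, strides))

# x[1, 2] lives at 1*5 + 2*1 = 7
print(flat[offset((1, 2), x.stride())].item(), x[1, 2].item())  # 8 8

# The transpose reuses the same data with swapped strides (1, 5):
# xt[2, 1] lives at 2*1 + 1*5 = 7
xt = x.t()
print(flat[offset((2, 1), xt.stride())].item(), xt[2, 1].item())  # 8 8
```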

<br>

<font size=3>For more information on `torch.sparse_coo` tensors, see <a href="https://pytorch.org/docs/stable/sparse.html#sparse-docs" style="text-decoration:none;color:maroon;font-size:120%;">torch.sparse</a>.</font>

<br>

<img src="./1 PyTorch documentation/1 Notes/images/Tensor.png" width=500px>
Cited from <a href="http://blog.ezyang.com/2019/05/pytorch-internals/" style="text-decoration:none;color:maroon;font-size:110%;">ezyang’s blogpost about PyTorch Internals</a>

<br>
<br>

## torch.memory_format <a href="https://pytorch.org/docs/stable/tensor_attributes.html#torch-memory-format" style="text-decoration:none;font-size:70%">[link]</a>

<div class="alert alert-block alert-info">

<font size=3 color=gray><b>CLASS</b></font>&emsp;
<font size=4><b>torch.memory_format</b></font>

</div>

<font size=3>A `torch.memory_format` is an object representing the memory format on which a <a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor" style="text-decoration:none;font-size:120%">torch.Tensor</a> is or will be allocated.</font>
<br>
<br>
* <font size=4>**`torch.contiguous_format`**</font>: Tensor is or will be allocated in dense non-overlapping memory. Strides represented by values in decreasing order.
<br>
<br>
* <font size=4>**`torch.channels_last`**</font>: Tensor is or will be allocated in dense non-overlapping memory. Strides represented by values in `strides[0] > strides[2] > strides[3] > strides[1] == 1` aka ***NHWC order***.
<br>
<br>
* <font size=4>**`torch.preserve_format`**</font>: Used in functions like ***clone*** to preserve the memory format of the input tensor. If input tensor is allocated in dense non-overlapping memory, the output tensor strides will be copied from the input. Otherwise output strides will follow `torch.contiguous_format`
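
The three memory formats can be compared by looking at strides. A minimal sketch on a 4-D tensor, with the shape `(2, 3, 4, 5)` chosen only for illustration:

```python
import torch

x = torch.randn(2, 3, 4, 5)                  # N, C, H, W in contiguous format
y = x.to(memory_format=torch.channels_last)  # same shape, NHWC order in memory

print(x.stride())  # (60, 20, 5, 1)
print(y.stride())  # (60, 1, 15, 3)
print(y.shape == x.shape)  # True

# y is dense, but only "contiguous" with respect to channels_last.
print(y.is_contiguous())                                   # False
print(y.is_contiguous(memory_format=torch.channels_last))  # True

# clone() uses torch.preserve_format by default, so strides carry over.
z = y.clone()
print(z.stride() == y.stride())  # True
```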


<br>
<br>
<br>

<font size=3 color=gray>Docs > </font>
# Tensor Views <a href="https://pytorch.org/docs/stable/tensor_view.html" style="text-decoration:none;font-size:70%">[link]</a>

<font size=3>PyTorch allows a tensor to be a **`View`** of an existing tensor. <font color=maroon>A view tensor shares the same underlying data with its base tensor.</font> Supporting **`View`** avoids explicit data copies and thus allows fast, memory-efficient reshaping, slicing, and element-wise operations.</font>

For example, to get a view of an existing tensor `t`, you can call `t.view(...)`.

In [23]:
t = torch.rand(4, 4)
b = t.view(2, 8)

# `t` and `b` share the same underlying data.
t.storage().data_ptr() == b.storage().data_ptr()  

True

In [24]:
# Modifying view tensor changes base tensor as well.
b[0][0] = 3.14
t[0][0]

tensor(3.1400)

Since a view shares underlying data with its base tensor, any edit to the data in the view is reflected in the base tensor as well.

<br>

<font size=3>
Typically a PyTorch op returns a new tensor as output, e.g.  <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.add.html" style="text-decoration:none;font-size:110%">torch.Tensor.add()</a>. 

But in the case of view ops, outputs are views of the input tensors, which avoids unnecessary data copies. No data movement occurs when creating a view; the view tensor just changes the way it interprets the same data. <font color=maroon>Taking a view of a contiguous tensor could **potentially produce a non-contiguous tensor**. Users should pay additional attention, as contiguity might have an **implicit performance impact**.</font> <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.transpose.html" style="text-decoration:none;font-size:110%">torch.Tensor.transpose()</a> is a common example.
</font>

<br>

In [25]:
base = torch.tensor([[0, 1],[2, 3]])
base.is_contiguous()

True

In [26]:
# `t` is a view of `base`. No data movement happened here.
t = base.transpose(0, 1)

# View tensors might be non-contiguous.
t.is_contiguous()

False

<br>

In [40]:
print(base.stride())
print(t.stride())

(2, 1)
(1, 2)


<br>

In [32]:
# `t` and `base` share the same underlying data.
t.storage().data_ptr() == base.storage().data_ptr()  

True

In [33]:
id(base)

1706612480256

In [34]:
id(t)

1706609599888

<br>

In [41]:
# To get a contiguous tensor, call `.contiguous()` to enforce
# copying data when `t` is not contiguous.
c = t.contiguous()
c.stride()

(2, 1)

<br>

In [42]:
# `t` and `base` share the same underlying data.
t.storage().data_ptr() == base.storage().data_ptr()  

True

In [43]:
# `c` is a contiguous copy, so it does not share underlying data with `t`.
t.storage().data_ptr() == c.storage().data_ptr()  

False

In [46]:
print(id(base))  # same as the earlier id(base)
print(id(t))     # same as the earlier id(t)
print(id(c))

1706612480256
1706609599888
1706620570896


<br>

<font size=3><font color=maroon>For reference, here’s a full list of view ops in PyTorch:</font>

* Basic slicing and indexing op, e.g. `tensor[0, 2:, 1:7:2]` returns a view of base tensor, see note below.
<br>
<br>
* (remaining ops omitted)
</font>

<div class="alert alert-block alert-info">

<font size=3 color=red><b>NOTE: </b></font>

When accessing the contents of a tensor via indexing, PyTorch follows Numpy behaviors that basic indexing returns views, while advanced indexing returns a copy. Assignment via either basic or advanced indexing is in-place. See more examples in <a href="https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html" style="text-decoration:none;color:maroon;font-size:120%;">Numpy indexing documentation</a>.

</div>
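
The difference between basic and advanced indexing described in the note can be checked by writing through the result and looking at the base tensor. A small sketch with example values:

```python
import torch

t = torch.tensor([[0, 1, 2], [3, 4, 5]])

v = t[0, 1:]  # basic indexing (ints and slices) returns a view
v[0] = 100
print(t[0, 1].item())  # 100: the base tensor changed

c = t[[0, 1]]  # advanced indexing (index list) returns a copy
c[0, 0] = -1
print(t[0, 0].item())  # 0: the base tensor is untouched

t[[0, 1], [0, 0]] = 7  # assignment via advanced indexing is in-place
print(t[:, 0].tolist())  # [7, 7]
```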

<br>

<font size=3 color=maroon>It’s also worth mentioning a few ops with special behaviors:</font>
* <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.reshape.html" style="text-decoration:none;font-size:120%">reshape()</a>, <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.reshape_as.html" style="text-decoration:none;font-size:120%">reshape_as()</a> and <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.flatten.html" style="text-decoration:none;font-size:120%">flatten()</a> can return either a view or new tensor, user code shouldn’t rely on whether it’s view or not.
<br>
<br>
* <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.contiguous.html" style="text-decoration:none;font-size:120%">contiguous()</a> returns **itself** if input tensor is already contiguous, otherwise it returns a new contiguous tensor by copying data.
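
The sketch below compares data pointers to show when these ops share memory with their input. It illustrates observed behavior for two simple cases only; as noted above, code should not rely on `reshape()` returning a view. The variable names are illustrative.

```python
import torch

a = torch.arange(6).view(2, 3)  # contiguous
at = a.t()                      # non-contiguous view of the same data

# reshape on a contiguous tensor can reuse the data ...
shares = a.reshape(3, 2).data_ptr() == a.data_ptr()
# ... but flattening the transposed view needs a copy.
copies = at.reshape(6).data_ptr() != a.data_ptr()

# contiguous() is a no-op for contiguous input and copies otherwise.
noop = a.contiguous().data_ptr() == a.data_ptr()
fresh = at.contiguous().data_ptr() != at.data_ptr()

print(shares, copies, noop, fresh)  # True True True True
```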

<br>

<font size=4>For a more detailed walk-through of PyTorch internal implementation, please refer to <a href="http://blog.ezyang.com/2019/05/pytorch-internals/" style="text-decoration:none;color:maroon;font-size:120%;">ezyang’s blogpost about PyTorch Internals</a>.</font>

<br>
<br>
<br>

<font size=3 color=gray>Docs > </font>
# torch <a href="https://pytorch.org/docs/stable/torch.html" style="text-decoration:none;font-size:70%">[link]</a>

<font size=3>The torch package contains data structures for multi-dimensional tensors and defines mathematical operations over these tensors. Additionally, it provides many utilities for efficient serializing of Tensors and arbitrary types, and other useful utilities.</font>

It has a CUDA counterpart that enables you to run your tensor computations on an NVIDIA GPU with compute capability >= 3.0.

## Tensors    <a href="https://pytorch.org/docs/stable/torch.html#tensors" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.numel() <a href="https://pytorch.org/docs/stable/generated/torch.numel.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.is_tensor() <a href="https://pytorch.org/docs/stable/generated/torch.is_tensor.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.is_storage() <a href="https://pytorch.org/docs/stable/generated/torch.is_storage.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.is_complex() <a href="https://pytorch.org/docs/stable/generated/torch.is_complex.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.is_floating_point() <a href="https://pytorch.org/docs/stable/generated/torch.is_floating_point.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.set_default_dtype() <a href="https://pytorch.org/docs/stable/generated/torch.set_default_dtype.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.set_printoptions() <a href="https://pytorch.org/docs/stable/generated/torch.set_printoptions.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.set_flush_denormal() <a href="https://pytorch.org/docs/stable/generated/torch.set_flush_denormal.html" style="text-decoration:none;font-size:70%">[link]</a>

。

。

。

<br>

### Creation Ops <a href="https://pytorch.org/docs/stable/torch.html#creation-ops" style="text-decoration:none;font-size:70%">[link]</a>

### Indexing, Slicing, Joining, Mutating Ops <a href="https://pytorch.org/docs/stable/torch.html#indexing-slicing-joining-mutating-ops" style="text-decoration:none;font-size:70%">[link]</a>

<br>
<br>

## Generators <a href="https://pytorch.org/docs/stable/torch.html#generators" style="text-decoration:none;font-size:70%">[link]</a>

<br>
<br>

## Random sampling <a href="https://pytorch.org/docs/stable/torch.html#random-sampling" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.seed() <a href="https://pytorch.org/docs/stable/generated/torch.seed.html#torch.seed" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.manual_seed(seed) <a href="https://pytorch.org/docs/stable/generated/torch.manual_seed.html#torch.manual_seed" style="text-decoration:none;font-size:70%">[link]</a>
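
A brief sketch of how seeding gives reproducible random numbers. Re-seeding the default generator replays the same sequence; a separate `torch.Generator` (here named `g`, an illustrative choice) keeps local sampling independent of the global state.

```python
import torch

torch.manual_seed(0)
a = torch.rand(3)

torch.manual_seed(0)  # reset the default generator to the same state
b = torch.rand(3)

print(torch.equal(a, b))  # True

# A local generator leaves the default generator untouched.
g = torch.Generator().manual_seed(0)
c = torch.rand(3, generator=g)
print(c.shape)
```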

。

。

。

<br>

### <font color=maroon>torch.default_generator</font> Returns the default CPU torch.Generator

#### torch.bernoulli(input, *, generator=None, out=None) <a href="https://pytorch.org/docs/stable/generated/torch.bernoulli.html" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.rand() <a href="https://pytorch.org/docs/stable/generated/torch.rand.html" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.rand_like(input, *, dtype=None, ...) <a href="https://pytorch.org/docs/stable/generated/torch.rand_like.html" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.normal(mean, std, *, generator=None, out=None) <a href="https://pytorch.org/docs/stable/generated/torch.normal.html" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.randint(low=0, high, size, ...) <a href="https://pytorch.org/docs/stable/generated/torch.randint.html" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.randn() <a href="https://pytorch.org/docs/stable/generated/torch.randn.html" style="text-decoration:none;font-size:70%">[link]</a>

。

。

。

<br>

### In-place random sampling <a href="https://pytorch.org/docs/stable/torch.html#in-place-random-sampling" style="text-decoration:none;font-size:70%">[link]</a>

<font size=3>There are a few more in-place random sampling functions defined on Tensors as well. Click through to refer to their documentation:</font>

。

。

。

<br>

### Quasi-random sampling <a href="https://pytorch.org/docs/stable/torch.html#quasi-random-sampling" style="text-decoration:none;font-size:70%">[link]</a>

。

。

。

<br>
<br>

## <font color=red>Serialization</font> <a href="https://pytorch.org/docs/stable/torch.html#serialization" style="text-decoration:none;font-size:70%">[link]</a>

### torch.save <a href="https://pytorch.org/docs/stable/generated/torch.save.html" style="text-decoration:none;font-size:70%">[link]</a>

### torch.load() <a href="https://pytorch.org/docs/stable/generated/torch.load.html" style="text-decoration:none;font-size:70%">[link]</a>
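
A minimal round trip through `torch.save` and `torch.load`, using an in-memory buffer instead of a file path so the sketch leaves nothing on disk. The dictionary contents are made-up example data.

```python
import io

import torch

state = {'weight': torch.arange(4.0), 'step': 10}

buffer = io.BytesIO()
torch.save(state, buffer)  # serialize into the buffer

buffer.seek(0)             # rewind before reading back
restored = torch.load(buffer)

print(torch.equal(restored['weight'], state['weight']), restored['step'])
```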

<br>
<br>

## <font color=red>Parallelism</font> <a href="https://pytorch.org/docs/stable/torch.html#parallelism" style="text-decoration:none;font-size:70%">[link]</a>

<br>
<br>

## <font color=red>Locally disabling gradient computation</font> <a href="https://pytorch.org/docs/stable/torch.html#locally-disabling-gradient-computation" style="text-decoration:none;font-size:70%">[link]</a>

<font size=3>The context managers <a href="https://pytorch.org/docs/stable/generated/torch.no_grad.html" style="text-decoration:none;font-size:120%">torch.no_grad()</a>, <a href="https://pytorch.org/docs/stable/generated/torch.enable_grad.html" style="text-decoration:none;font-size:120%">torch.enable_grad()</a>, and <a href="https://pytorch.org/docs/stable/generated/torch.set_grad_enabled.html" style="text-decoration:none;font-size:120%">torch.set_grad_enabled()</a> are helpful for locally disabling and enabling gradient computation. See <a href="https://pytorch.org/docs/stable/autograd.html#locally-disable-grad" style="text-decoration:none;color:maroon;font-size:110%;">Locally disabling gradient computation</a> for more details on their usage. These context managers are thread local, so they won’t work if you send work to another thread using the threading module, etc.</font>

In [3]:
x = torch.zeros(1, requires_grad=True)
with torch.no_grad():
    y = x * 2
y.requires_grad

False

In [4]:
is_train = False
with torch.set_grad_enabled(is_train):
    y = x * 2
y.requires_grad

False

In [5]:
torch.set_grad_enabled(True)  # this can also be used as a function
y = x * 2
y.requires_grad

True

In [6]:
torch.set_grad_enabled(False)
y = x * 2
y.requires_grad

False
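
The thread-local behavior mentioned above can be observed with the standard `threading` module. In this sketch (the `worker` function and `results` dict are illustrative), grad mode is disabled in the main thread only; the new thread starts with the default setting.

```python
import threading

import torch

results = {}

def worker():
    # Runs in a new thread, which does not inherit the caller's grad mode.
    results['worker'] = torch.is_grad_enabled()

with torch.no_grad():
    results['main'] = torch.is_grad_enabled()
    th = threading.Thread(target=worker)
    th.start()
    th.join()

print(results)  # {'main': False, 'worker': True}
```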

### torch.no_grad() <a href="https://pytorch.org/docs/stable/generated/torch.no_grad.html" style="text-decoration:none;font-size:70%">[link]</a>

### torch.enable_grad() <a href="https://pytorch.org/docs/stable/generated/torch.enable_grad.html" style="text-decoration:none;font-size:70%">[link]</a>

### torch.set_grad_enabled() <a href="https://pytorch.org/docs/stable/generated/torch.set_grad_enabled.html" style="text-decoration:none;font-size:70%">[link]</a>

### torch.is_grad_enabled() <a href="https://pytorch.org/docs/stable/generated/torch.is_grad_enabled.html" style="text-decoration:none;font-size:70%">[link]</a>

### torch.inference_mode() <a href="https://pytorch.org/docs/stable/generated/torch.inference_mode.html" style="text-decoration:none;font-size:70%">[link]</a>

### torch.is_inference_mode_enabled() <a href="https://pytorch.org/docs/stable/generated/torch.is_inference_mode_enabled.html" style="text-decoration:none;font-size:70%">[link]</a>

<br>
<br>

## Math operations <a href="https://pytorch.org/docs/stable/torch.html#math-operations" style="text-decoration:none;font-size:70%">[link]</a>

### Pointwise Ops <a href="https://pytorch.org/docs/stable/torch.html#pointwise-ops" style="text-decoration:none;font-size:70%">[link]</a>

### Reduction Ops <a href="https://pytorch.org/docs/stable/torch.html#reduction-ops" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.argmax() <a href="https://pytorch.org/docs/stable/generated/torch.argmax.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.amax() <a href="https://pytorch.org/docs/stable/generated/torch.amax.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.aminmax() <a href="https://pytorch.org/docs/stable/generated/torch.aminmax.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.all() <a href="https://pytorch.org/docs/stable/generated/torch.all.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.any() <a href="https://pytorch.org/docs/stable/generated/torch.any.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.max() <a href="https://pytorch.org/docs/stable/generated/torch.max.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.dist() <a href="https://pytorch.org/docs/stable/generated/torch.dist.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.logsumexp() <a href="https://pytorch.org/docs/stable/generated/torch.logsumexp.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.mean() <a href="https://pytorch.org/docs/stable/generated/torch.mean.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.nanmean() <a href="https://pytorch.org/docs/stable/generated/torch.nanmean.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.median() <a href="https://pytorch.org/docs/stable/generated/torch.median.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.nanmedian() <a href="https://pytorch.org/docs/stable/generated/torch.nanmedian.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.mode() <a href="https://pytorch.org/docs/stable/generated/torch.mode.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.norm() <a href="https://pytorch.org/docs/stable/generated/torch.norm.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.nansum() <a href="https://pytorch.org/docs/stable/generated/torch.nansum.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.prod() <a href="https://pytorch.org/docs/stable/generated/torch.prod.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.std() <a href="https://pytorch.org/docs/stable/generated/torch.std.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.std_mean() <a href="https://pytorch.org/docs/stable/generated/torch.std_mean.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.unique() <a href="https://pytorch.org/docs/stable/generated/torch.unique.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.var() <a href="https://pytorch.org/docs/stable/generated/torch.var.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.var_mean() <a href="https://pytorch.org/docs/stable/generated/torch.var_mean.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.count_nonzero() <a href="https://pytorch.org/docs/stable/generated/torch.count_nonzero.html" style="text-decoration:none;font-size:70%">[link]</a>

。

。

。

<br>

### Comparison Ops <a href="https://pytorch.org/docs/stable/torch.html#comparison-ops" style="text-decoration:none;font-size:70%">[link]</a>

#### torch.allclose() <a href="https://pytorch.org/docs/stable/generated/torch.allclose.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.argsort() <a href="https://pytorch.org/docs/stable/generated/torch.argsort.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.eq() <a href="https://pytorch.org/docs/stable/generated/torch.eq.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.equal() <a href="https://pytorch.org/docs/stable/generated/torch.equal.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.ge() <a href="https://pytorch.org/docs/stable/generated/torch.ge.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.greater_equal() <a href="https://pytorch.org/docs/stable/generated/torch.greater_equal.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.isnan() <a href="https://pytorch.org/docs/stable/generated/torch.isnan.html" style="text-decoration:none;font-size:70%">[link]</a>
#### torch.sort() <a href="https://pytorch.org/docs/stable/generated/torch.sort.html" style="text-decoration:none;font-size:70%">[link]</a>


<br>

### Spectral Ops <a href="https://pytorch.org/docs/stable/torch.html#spectral-ops" style="text-decoration:none;font-size:70%">[link]</a>

### Other Operations <a href="https://pytorch.org/docs/stable/torch.html#other-operations" style="text-decoration:none;font-size:70%">[link]</a>

### BLAS and LAPACK Operations <a href="https://pytorch.org/docs/stable/torch.html#blas-and-lapack-operations" style="text-decoration:none;font-size:70%">[link]</a>

<br>
<br>

## Utilities <a href="https://pytorch.org/docs/stable/torch.html#utilities" style="text-decoration:none;font-size:70%">[link]</a>

<br>
<br>

<br>
<br>
<br>

<font size=3 color=gray>Docs > </font>
# torch.Tensor <a href="https://pytorch.org/docs/stable/tensors.html" style="text-decoration:none;font-size:70%">[link]</a>

A <a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor" style="text-decoration:none;font-size:120%">torch.Tensor</a> is a multi-dimensional matrix containing elements of a single data type.

## Data types <a href="https://pytorch.org/docs/stable/tensors.html#data-types" style="text-decoration:none;font-size:70%">[link]</a>


<a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor" style="text-decoration:none;font-size:120%">torch.Tensor</a> is an alias for the default tensor type (`torch.FloatTensor`).

<br>
<br>

## Initializing and basic operations <a href="https://pytorch.org/docs/stable/tensors.html#initializing-and-basic-operations" style="text-decoration:none;font-size:70%">[link]</a>

A tensor can be constructed from a Python <a href="https://docs.python.org/3/library/stdtypes.html#list" style="text-decoration:none;font-size:120%">list</a> or sequence using the <a href="https://pytorch.org/docs/stable/generated/torch.tensor.html" style="text-decoration:none;font-size:120%">torch.tensor()</a> constructor:

In [47]:
torch.tensor([[1., -1.], [1., -1.]])

tensor([[ 1., -1.],
        [ 1., -1.]])

In [48]:
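import numpy as np  # assumed to have been imported in an earlier cell

# The dtype follows the NumPy array: int32 on the machine used here, int64 on many other platforms.
torch.tensor(np.array([[1, 2, 3], [4, 5, 6]]))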

tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)

<div class="alert alert-block alert-danger">

<font size=3 color=red><b>WARNING: </b></font>

`torch.tensor()` always copies data. If you already have a tensor and only want to change its ***requires_grad*** flag, use <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.requires_grad_.html" style="text-decoration:none;font-size:120%">requires_grad_()</a> or <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.detach.html" style="text-decoration:none;font-size:120%">detach()</a> to avoid a copy. If you have a NumPy array and want to avoid a copy, use <a href="https://pytorch.org/docs/stable/generated/torch.as_tensor.html" style="text-decoration:none;font-size:120%">torch.as_tensor()</a>.

</div>
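
The difference between copying and sharing can be checked directly. The following is a minimal sketch, assuming a CPU-only setup; the variable names are illustrative and not taken from the original docs.

```python
import numpy as np
import torch

arr = np.array([1, 2, 3])

copied = torch.tensor(arr)     # always copies the data
shared = torch.as_tensor(arr)  # reuses the NumPy buffer when dtype and device allow it

arr[0] = 100                   # mutate the NumPy array in place

print(copied[0].item())        # 1   (independent copy, unaffected)
print(shared[0].item())        # 100 (shares memory with arr)

# Changing requires_grad on an existing tensor without copying it
t = torch.ones(3)
t.requires_grad_()             # flips the flag in place
d = t.detach()                 # new tensor object on the same storage, requires_grad=False

print(t.requires_grad, d.requires_grad)  # True False
print(t.data_ptr() == d.data_ptr())      # True
```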

<br>

A tensor of specific data type can be constructed by passing a <a href="https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.dtype" style="text-decoration:none;font-size:120%">torch.dtype</a> and/or a <a href="https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.device" style="text-decoration:none;font-size:120%">torch.device</a> to a constructor or tensor creation op:

In [49]:
torch.zeros([2, 4], dtype=torch.int32)

tensor([[0, 0, 0, 0],
        [0, 0, 0, 0]], dtype=torch.int32)

In [50]:
cuda0 = torch.device('cuda:0')
torch.ones([2, 4], dtype=torch.float64, device=cuda0)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]], device='cuda:0', dtype=torch.float64)

<font size=3>For more information about building Tensors, see <a href="https://pytorch.org/docs/stable/torch.html#tensor-creation-ops" style="text-decoration:none;color:maroon;font-size:120%;">Creation Ops</a></font>

<br>

The contents of a tensor can be accessed and modified using Python’s indexing and slicing notation:

In [51]:
x = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(x[1][2])

x[0][1] = 8
print(x)

tensor(6)
tensor([[1, 8, 3],
        [4, 5, 6]])


<br>

Use <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.item.html#torch.Tensor.item" style="text-decoration:none;font-size:120%">torch.Tensor.item()</a> to get a Python number from a tensor containing a single value:

In [52]:
x = torch.tensor([[1]])
x

tensor([[1]])

In [53]:
x.item()

1

In [54]:
x = torch.tensor(2.5)
x

tensor(2.5000)

In [55]:
x.item()

2.5

<font size=3>For more information about indexing, see <a href="https://pytorch.org/docs/stable/torch.html#indexing-slicing-joining" style="text-decoration:none;color:maroon;font-size:120%;">Indexing, Slicing, Joining, Mutating Ops</a></font>

<br>

A tensor can be created with ***requires_grad=True*** so that <a href="https://pytorch.org/docs/stable/autograd.html#module-torch.autograd" style="text-decoration:none;font-size:120%">torch.autograd</a> records operations on it for automatic differentiation.

In [56]:
x = torch.tensor([[1., -1.], [1., 1.]], requires_grad=True)
out = x.pow(2).sum()
out.backward()
x.grad

tensor([[ 2., -2.],
        [ 2.,  2.]])

<br>

Each tensor has an associated `torch.Storage`, which holds its data. The tensor class also provides a multi-dimensional, <a href="https://en.wikipedia.org/wiki/Stride_of_an_array" style="text-decoration:none;color:maroon;font-size:120%;">strided</a> view of a storage and defines numeric operations on it.
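
To make the "strided view of a storage" idea concrete, here is a small sketch (the tensors are illustrative). A transpose changes only the strides and leaves the underlying memory untouched.

```python
import torch

x = torch.arange(6).reshape(2, 3)   # contiguous, row-major layout
xt = x.t()                          # transposed view, no data is copied

print(x.stride())                     # (3, 1)
print(xt.stride())                    # (1, 3)
print(x.data_ptr() == xt.data_ptr())  # True: same memory
print(xt.is_contiguous())             # False
```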

<div class="alert alert-block alert-info">

<font size=3 color=red><b>NOTE: </b></font>

For more information on tensor views, see <a href="https://pytorch.org/docs/stable/tensor_view.html" style="text-decoration:none;color:maroon;font-size:120%;">Tensor Views</a>.

</div>

<div class="alert alert-block alert-info">

<font size=3 color=red><b>NOTE: </b></font>

For more information on the `torch.dtype`, `torch.device`, and `torch.layout` attributes of a <a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor" style="text-decoration:none;font-size:120%;color:blue;">torch.Tensor</a>, see <a href="https://pytorch.org/docs/stable/tensor_attributes.html" style="text-decoration:none;color:maroon;font-size:120%;">Tensor Attributes</a>.

</div>

<div class="alert alert-block alert-info">

<font size=3 color=red><b>NOTE: </b></font>

Methods which mutate a tensor are marked with an <font color=red>underscore suffix</font>. For example, `torch.FloatTensor.abs_()` computes the absolute value in-place and returns the modified tensor, while `torch.FloatTensor.abs()` computes the result in a new tensor.

</div>
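
A quick sketch of the underscore convention, using `abs()` and `abs_()` on an illustrative tensor:

```python
import torch

t = torch.tensor([-1.0, 2.0, -3.0])

out = t.abs()        # out-of-place: result goes into a new tensor
print(t.tolist())    # [-1.0, 2.0, -3.0]  (t is unchanged)
print(out.tolist())  # [1.0, 2.0, 3.0]

same = t.abs_()      # in-place: t itself is modified and returned
print(t.tolist())    # [1.0, 2.0, 3.0]
print(same.data_ptr() == t.data_ptr())  # True
```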

<div class="alert alert-block alert-info">

<font size=3 color=red><b>NOTE: </b></font>

To change an existing tensor’s `torch.device` and/or `torch.dtype`, consider using the <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.to.html" style="text-decoration:none;font-size:120%;color:blue;">to()</a> method on the tensor.

</div>
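
A minimal sketch of `to()`. The fallback to the CPU when no GPU is available is an assumption added here so that the snippet runs anywhere.

```python
import torch

x = torch.tensor([1, 2, 3])      # integer lists give torch.int64
y = x.to(torch.float32)          # change the dtype only

# Pick a device that actually exists on this machine
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
z = x.to(device=device, dtype=torch.float64)  # change device and dtype together

print(x.dtype)                 # torch.int64 (x itself is not modified)
print(y.dtype)                 # torch.float32
print(z.dtype, z.device.type)
```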

<div class="alert alert-block alert-danger">

<font size=3 color=red><b>WARNING: </b></font>

The current implementation of <a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor" style="text-decoration:none;font-size:120%">torch.Tensor</a> introduces memory overhead, so it may lead to unexpectedly high memory usage in applications with many tiny tensors. If this is your case, consider using one large structure.

</div>

<br>
<br>

## Tensor class reference <a href="https://pytorch.org/docs/stable/tensors.html#tensor-class-reference" style="text-decoration:none;font-size:70%">[link]</a>

<div class="alert alert-block alert-info">

<font size=3 color=gray><b>CLASS</b></font>&emsp;
<font size=4><b>torch.Tensor</b></font>

</div>

<font size=3>There are a few main ways to create a tensor, depending on your use case.
<br>
<br>
* To create a tensor with pre-existing data, use <a href="https://pytorch.org/docs/stable/generated/torch.tensor.html#torch.tensor" style="text-decoration:none;font-size:120%">torch.tensor()</a>.
<br>
<br>
* To create a tensor with specific size, use `torch.*` tensor creation ops (see <a href="https://pytorch.org/docs/stable/torch.html#tensor-creation-ops" style="text-decoration:none;color:maroon;font-size:100%;">Creation Ops</a>).
<br>
    <br>
* To create a tensor with the same size (and similar types) as another tensor, use `torch.*_like` tensor creation ops (see <a href="https://pytorch.org/docs/stable/torch.html#tensor-creation-ops" style="text-decoration:none;color:maroon;font-size:100%;">Creation Ops</a>).
<br>
<br>
* To create a tensor with similar type but different size as another tensor, use `tensor.new_*` creation ops.</font>
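
The four creation routes listed above can be sketched side by side. The values and sizes are illustrative, and the default dtype is assumed to be the usual `torch.float32`.

```python
import torch

data = torch.tensor([[1.0, 2.0], [3.0, 4.0]], dtype=torch.float64)  # from existing data
zeros = torch.zeros(2, 3)         # specific size, default dtype
like = torch.ones_like(data)      # same size and dtype as `data`
new = data.new_zeros(5)           # same dtype and device as `data`, different size

print(zeros.shape)                # torch.Size([2, 3])
print(like.shape, like.dtype)     # torch.Size([2, 2]) torch.float64
print(new.shape, new.dtype)       # torch.Size([5]) torch.float64
```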

<br>

<div class="alert alert-block alert-info">

<font size=4><b>Tensor.T</b></font>

</div>

Returns a view of this tensor with its dimensions reversed.

If `n` is the number of dimensions in `x`, `x.T` is equivalent to `x.permute(n-1, n-2, ..., 0)`.

<div class="alert alert-block alert-danger">

<font size=3 color=red><b>WARNING: </b></font>

The use of <a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor.T" style="text-decoration:none;font-size:120%">Tensor.T</a> on tensors of dimension other than 2 to reverse their shape is deprecated and will throw an error in a future release. Consider <a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor.mT" style="text-decoration:none;font-size:120%">mT</a> to transpose batches of matrices or `x.permute(*torch.arange(x.ndim - 1, -1, -1))` to reverse the dimensions of a tensor.

</div>
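
A short sketch of `Tensor.T` on a matrix, followed by the explicit `permute` form for a tensor with more than two dimensions. The tensors are illustrative.

```python
import torch

x = torch.arange(6).reshape(2, 3)
xt = x.T                                  # attribute access, no parentheses

print(xt.shape)                           # torch.Size([3, 2])
print(torch.equal(xt, x.permute(1, 0)))   # True
print(xt.data_ptr() == x.data_ptr())      # True: a view, not a copy

# For more than two dimensions, spell out the reversal explicitly
y = torch.zeros(2, 3, 4)
rev = y.permute(*range(y.ndim - 1, -1, -1))
print(rev.shape)                          # torch.Size([4, 3, 2])
```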

<br>

<div class="alert alert-block alert-info">

<font size=4><b>Tensor.H</b></font>

</div>

Returns a view of a matrix (2-D tensor) conjugated and transposed.

`x.H` is equivalent to `x.transpose(0, 1).conj()` for complex matrices and `x.transpose(0, 1)` for real matrices.

SEE ALSO: <br>
<a href="https://pytorch.org/docs/stable/tensors.html#torch.Tensor.mH" style="text-decoration:none;font-size:120%">mH</a>: An attribute that also works on batches of matrices.
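
As a small check of the equivalence stated above, the sketch below compares `H` with an explicit transpose (and conjugate) on an illustrative complex and real matrix.

```python
import torch

z = torch.tensor([[1 + 2j, 3 - 1j],
                  [0 + 1j, 2 + 0j]])
r = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])

# Complex matrix: transpose and conjugate
print(torch.equal(z.H, z.transpose(0, 1).conj()))  # True
print(z.H[0, 1].item() == -1j)                     # True: conjugate of z[1, 0] = 1j

# Real matrix: a plain transpose
print(torch.equal(r.H, r.transpose(0, 1)))         # True
```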

<br>

<div class="alert alert-block alert-info">

<font size=4><b>Tensor.mT</b></font>

</div>

Returns a view of this tensor with the last two dimensions transposed.

`x.mT` is equivalent to `x.transpose(-2, -1)`.

<br>

<div class="alert alert-block alert-info">

<font size=4><b>Tensor.mH</b></font>

</div>

Accessing this property is equivalent to calling <a href="https://pytorch.org/docs/stable/generated/torch.adjoint.html#torch.adjoint" style="text-decoration:none;font-size:120%">adjoint()</a>.
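
Unlike `T` and `H`, the `mT` and `mH` attributes work on batches of matrices. A minimal sketch with illustrative random batches:

```python
import torch

batch = torch.randn(5, 2, 3)   # a batch of five 2x3 matrices
print(batch.mT.shape)          # torch.Size([5, 3, 2])
print(torch.equal(batch.mT, batch.transpose(-2, -1)))           # True

cbatch = torch.randn(5, 2, 3, dtype=torch.complex64)
print(cbatch.mH.shape)         # torch.Size([5, 3, 2])
print(torch.equal(cbatch.mH, torch.adjoint(cbatch)))            # True
print(torch.equal(cbatch.mH, cbatch.transpose(-2, -1).conj()))  # True
```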

<br>
<br>

<font size=4 color=maroon>(The original page has a very long table of Tensor.* methods here.)</font>

<br>
<br>

## (None)

<br>
<br>

<font size=3 color=gray>Docs > torch.Tensor > </font>
## torch.Tensor.view <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.view.html" style="text-decoration:none;font-size:70%">[link]</a>

### view(*shape)

<div class="alert alert-block alert-info">

<font size=4><b>Tensor.view(*shape)</b> → Tensor</font>

</div>

<font size=3>Returns <font color=maroon>a **new** tensor with the **same** data</font> as the `self` tensor but of a different `shape`.</font>

<font size=3>The returned tensor shares the same data and must have the same number of elements, but may have a different size. For a tensor to be viewed, the new view size must be compatible with its original size and stride, i.e., 
<br>
<br>
* each new view dimension must either be a ***subspace*** of an original dimension, 
<br>
<br>
* or only **span** across original dimensions <font size=4 color=maroon>$d, d+1, \dots, d+k$</font> that satisfy the following contiguity-like condition that <font size=4 color=maroon>$\forall i = d, \dots, d+k-1$</font>,
<br>
<br>
<font size=5 color=maroon>$$stride[i]=stride[i+1]×size[i+1]$$</font>

Otherwise, it will not be possible to view `self` tensor as `shape` without copying it (e.g., via <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.contiguous.html" style="text-decoration:none;font-size:120%">contiguous()</a>).


When it is unclear whether a `view()` can be performed, it is advisable to use <a href="https://pytorch.org/docs/stable/generated/torch.reshape.html" style="text-decoration:none;font-size:120%">reshape()</a>, which returns a view if the shapes are compatible, and copies (equivalent to calling `contiguous()`) otherwise.
</font>

**Parameters**
* `shape` (torch.Size or int...) – the desired size

Example:

In [57]:
x = torch.randn(4, 4)
x.size()

torch.Size([4, 4])

In [58]:
y = x.view(16)
y.size()

torch.Size([16])

In [59]:
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
z.size()

torch.Size([2, 8])

In [67]:
# `x`, `y` and `z` share the same underlying data.
x.storage().data_ptr() == y.storage().data_ptr() == z.storage().data_ptr()  

True

In [69]:
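# torch.equal() requires equal sizes as well as equal elements, so tensors of different shapes never compare equal.
torch.equal(x, y), torch.equal(x, z), torch.equal(y, z)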

(False, False, False)

<br>

In [60]:
a = torch.randn(1, 2, 3, 4)
a.size()

torch.Size([1, 2, 3, 4])

In [61]:
b = a.transpose(1, 2)  # Swaps 2nd and 3rd dimension
b.size()

torch.Size([1, 3, 2, 4])

In [62]:
c = a.view(1, 3, 2, 4)  # Does not change tensor layout in memory
c.size()

torch.Size([1, 3, 2, 4])

In [70]:
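# `b` and `c` have the same shape but order the elements differently; `a` differs from both in shape.
torch.equal(b, c), torch.equal(a, b), torch.equal(a, c)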

(False, False, False)

In [65]:
# `a`, `b` and `c` share the same underlying data.
a.storage().data_ptr() == b.storage().data_ptr() == c.storage().data_ptr()  

True

In [66]:
print(id(a))
print(id(b))
print(id(c))

1706614840752
1706620448176
1706561625840
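
The contiguity-like condition described earlier can be seen failing in practice. In the sketch below (illustrative tensors), a transposed tensor cannot be flattened with `view()`, while `reshape()` falls back to a copy.

```python
import torch

a = torch.arange(12).reshape(3, 4)
b = a.t()                      # shape (4, 3), strides (1, 4): not contiguous

view_failed = False
try:
    b.view(12)                 # stride[0] != stride[1] * size[1], so no view exists
except RuntimeError:
    view_failed = True
print(view_failed)             # True

flat_copy = b.contiguous().view(12)     # explicit copy, then view
flat = b.reshape(12)                    # reshape copies automatically here
print(torch.equal(flat, flat_copy))     # True
print(flat.data_ptr() == a.data_ptr())  # False: new memory

same = a.reshape(12)                    # compatible layout: reshape returns a view
print(same.data_ptr() == a.data_ptr())  # True
```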


<br>
<br>

### view(dtype)

<div class="alert alert-block alert-info">

<font size=4><b>Tensor.view(dtype)</b> → Tensor</font>

</div>

<font size=3>Returns <font color=maroon>a **new** tensor with the **same** data</font> as the `self` tensor but of a different `dtype`.</font>

(omitted)
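
As a rough illustration of `view(dtype)`, the sketch below reinterprets the bytes of a `float32` tensor as `int32`. Both dtypes have the same element size, so the shape stays the same; the values are illustrative.

```python
import torch

x = torch.tensor([1.0, -2.0], dtype=torch.float32)
bits = x.view(torch.int32)     # same bytes, read as 32-bit integers

print(bits[0].item())          # 1065353216, i.e. 0x3F800000, the IEEE 754 encoding of 1.0
print(bits[1].item())          # -1073741824, i.e. 0xC0000000 read as a signed integer
print(bits.shape == x.shape)   # True
print(bits.data_ptr() == x.data_ptr())  # True: no data was copied
```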

<br>
<br>

<font size=3 color=gray>Docs > torch.Tensor > </font>
##  <font color=red>torch.Tensor.detach</font> <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.detach.html" style="text-decoration:none;font-size:70%">[link]</a>

<div class="alert alert-block alert-info">

<font size=4><b>Tensor.detach()</b></font>

</div>

<font size=3>Returns a new Tensor, detached from the current graph.

The result will never require gradient.

This method also affects forward mode AD gradients and the result will never have forward mode AD gradients.</font>

<div class="alert alert-block alert-info">

<font size=3 color=red><b>NOTE: </b></font>

Returned Tensor shares the same storage with the original one. In-place modifications on either of them will be seen, and may trigger errors in correctness checks. 
<br>
<br>
**`IMPORTANT NOTE:`** 

Previously, in-place size / stride / storage changes (such as `resize_ / resize_as_ / set_ / transpose_`) to the returned tensor also updated the original tensor. <font color=maroon>Now, these in-place changes no longer update the original tensor and instead trigger an error.</font> 

For sparse tensors: In-place indices / values changes (such as `zero_ / copy_ / add_`) to the returned tensor <font color=maroon>will not update the original tensor anymore, and will instead trigger an error.</font>

</div>
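
A minimal sketch of the storage sharing described in the note, using illustrative tensors. No backward pass is run here.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2                 # part of the autograd graph
d = y.detach()            # same storage, cut off from the graph

print(y.requires_grad, d.requires_grad)  # True False
print(d.data_ptr() == y.data_ptr())      # True

d[0] = 100.0              # in-place change through the detached tensor
print(y[0].item())        # 100.0: the change is visible through y as well
```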

<br>
<br>

<font size=3 color=gray>Docs > torch.Tensor > </font>
##  torch.Tensor.register_hook <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.register_hook.html" style="text-decoration:none;font-size:70%">[link]</a>

<div class="alert alert-block alert-info">

<font size=4><b>Tensor.register_hook</b>(hook)</font>

</div>

<font size=3>Registers a backward hook.

The hook will be called every time a gradient with respect to the Tensor is computed. The hook should have the following signature:</font>

<font size=4>$$hook(grad) → Tensor \ or \ None$$</font>

<br>
<font size=3>The hook should not modify its argument, but it can optionally return a new gradient which will be used in place of <a href="https://pytorch.org/docs/stable/generated/torch.Tensor.grad.html" style="text-decoration:none;font-size:120%">grad</a>.

This function returns a handle with a method <font color=blue>***handle.remove()***</font> that removes the hook from the module.</font>

Example:

In [71]:
v = torch.tensor([0., 0., 0.], requires_grad=True)
h = v.register_hook(lambda grad: grad * 2)    # double the gradient
v.backward(torch.tensor([1., 2., 3.]))
v.grad

tensor([2., 4., 6.])

In [72]:
h.remove()  # removes the hook

<br>
<br>

<div class="alert alert-block alert-info">

<font size=4><b> torch.Tensor.zero_</b>() → Tensor</font>

</div>

<font size=3>Fills `self` tensor with zeros.</font>
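
A short sketch of `zero_()`. The second half shows one common use, resetting an accumulated gradient; that use is an illustration added here, not something stated in the text above.

```python
import torch

t = torch.tensor([1.0, 2.0, 3.0])
out = t.zero_()                        # in-place: t itself is filled with zeros
print(t.tolist())                      # [0.0, 0.0, 0.0]
print(out.data_ptr() == t.data_ptr())  # True

# Gradients accumulate across backward() calls until they are reset
w = torch.tensor([1.0, 2.0], requires_grad=True)
(w * 3).sum().backward()
(w * 3).sum().backward()
print(w.grad.tolist())                 # [6.0, 6.0]
w.grad.zero_()
print(w.grad.tolist())                 # [0.0, 0.0]
```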

<br>
<br>
<br>

<font size=3 color=gray>Docs > </font>
#  torch.nn.init <a href="https://pytorch.org/docs/stable/nn.init.html" style="text-decoration:none;font-size:70%">[link]</a>

<div class="alert alert-block alert-info">

<font size=4><b> torch.nn.init.calculate_gain</b>(nonlinearity, param=None)</font>

</div>

Return the recommended gain value for the given nonlinearity function. The values are as follows: 

<img src="./1 PyTorch documentation/1 Notes/images/init-calculate_gain.jpeg" width=600px>

<div class="alert alert-block alert-danger">

<font size=3 color=red><b>WARNING: </b></font>

In order to implement <a href="https://papers.nips.cc/paper/2017/hash/5d44ee6f2c3f71b73125876103c8f6c4-Abstract.html" style="text-decoration:none;color:maroon;font-size:120%;">Self-Normalizing Neural Networks</a>, you should use `nonlinearity='linear'` instead of `nonlinearity='selu'`. This gives the initial weights a variance of `1 / N`, which is necessary to induce a stable fixed point in the forward pass. In contrast, the default gain for `SELU` sacrifices the normalisation effect for more stable gradient flow in rectangular layers.

</div>

**Parameters**
* `nonlinearity` – the non-linear function (*nn.functional* name)

* `param` – optional parameter for the non-linear function

Examples

In [74]:
from torch import nn

In [75]:
gain = nn.init.calculate_gain('leaky_relu', 0.2)  # leaky_relu with negative_slope=0.2
gain

1.3867504905630728

In [78]:
nn.init.calculate_gain('tanh')   # the capitalized name 'Tanh' is not a valid argument

1.6666666666666667
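
The two values above can be reproduced by hand. The sketch below assumes the gain formulas listed in the PyTorch docs table, which is only included as an image here: `sqrt(2 / (1 + negative_slope^2))` for leaky ReLU, `sqrt(2)` for ReLU and `5/3` for tanh.

```python
import math
from torch import nn

negative_slope = 0.2
expected = math.sqrt(2.0 / (1.0 + negative_slope ** 2))  # leaky ReLU formula

leaky_ok = math.isclose(nn.init.calculate_gain('leaky_relu', negative_slope), expected)
relu_ok = math.isclose(nn.init.calculate_gain('relu'), math.sqrt(2.0))
tanh_ok = math.isclose(nn.init.calculate_gain('tanh'), 5.0 / 3.0)
print(leaky_ok, relu_ok, tanh_ok)   # True True True

# Names are case-sensitive: 'Tanh' is rejected
rejected = False
try:
    nn.init.calculate_gain('Tanh')
except ValueError:
    rejected = True
print(rejected)                     # True
```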

<br>
<br>

<div class="alert alert-block alert-info">

<font size=4><b> torch.nn.init.uniform_</b>(tensor, a=0.0, b=1.0)</font>

</div>

<font size=3>Fills the input Tensor with values drawn from the uniform distribution $U(a, b)$.</font>

Examples

In [81]:
w = torch.empty(3, 5)
w

tensor([[1.0194e-38, 1.0469e-38, 1.0010e-38, 8.4490e-39, 1.0102e-38],
        [9.0919e-39, 1.0102e-38, 8.9082e-39, 8.4489e-39, 1.0102e-38],
        [1.0561e-38, 1.0286e-38, 9.4592e-39, 9.9184e-39, 9.0000e-39]])

In [85]:
torch.manual_seed(1)
nn.init.uniform_(w)

tensor([[0.7576, 0.2793, 0.4031, 0.7347, 0.0293],
        [0.7999, 0.3971, 0.7544, 0.5695, 0.4388],
        [0.6387, 0.5247, 0.6826, 0.3051, 0.4635]])

In [86]:
w

tensor([[0.7576, 0.2793, 0.4031, 0.7347, 0.0293],
        [0.7999, 0.3971, 0.7544, 0.5695, 0.4388],
        [0.6387, 0.5247, 0.6826, 0.3051, 0.4635]])

<br>
<br>

<div class="alert alert-block alert-info">

<font size=4><b> torch.nn.init.normal_</b>(tensor, mean=0.0, std=1.0)</font>

</div>

<font size=3>Fills the input Tensor with values drawn from the normal distribution $\mathcal{N}(\text{mean}, \text{std}^2)$.</font>

(omitted)
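
A minimal sketch of `normal_`. The tensor size, seed and `std=0.02` are arbitrary choices for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
w = torch.empty(1000, 100)
nn.init.normal_(w, mean=0.0, std=0.02)   # fills w in place

# With 100,000 samples, the empirical statistics land close to the requested ones
print(w.mean().item())   # close to 0.0
print(w.std().item())    # close to 0.02
```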

<br>
<br>

<div class="alert alert-block alert-info">

<font size=4><b> torch.nn.init.constant_</b>(tensor, val)</font>

</div>

<font size=3>Fills the input Tensor with the value $\text{val}$.</font>

(omitted)
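
A minimal sketch of `constant_`, with an arbitrary fill value:

```python
import torch
from torch import nn

w = torch.empty(2, 3)
nn.init.constant_(w, 0.3)    # fills w in place with the value 0.3

print(torch.allclose(w, torch.full_like(w, 0.3)))   # True
```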

<br>

(There are other functions as well; see the original page for details.)

<br>
<br>

<br>
<br>
<br>

<font size=3 color=gray>Docs > </font>
#  torch.nn <a href="https://pytorch.org/docs/stable/nn.html" style="text-decoration:none;font-size:70%">[link]</a>

<br>

<a href="https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html" style="text-decoration:none;font-size:120%">Parameter</a> &emsp; A kind of Tensor that is to be considered a module parameter.

<a href="https://pytorch.org/docs/stable/generated/torch.nn.parameter.UninitializedParameter.html" style="text-decoration:none;font-size:120%">UninitializedParameter</a> &emsp; A parameter that is not initialized.

<a href="https://pytorch.org/docs/stable/generated/torch.nn.parameter.UninitializedBuffer.html" style="text-decoration:none;font-size:120%">UninitializedBuffer</a> &emsp; A buffer that is not initialized.
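
What "considered a module parameter" means in practice can be sketched with a hypothetical `Scale` module. The class and its attribute names are invented here for illustration.

```python
import torch
from torch import nn

class Scale(nn.Module):  # hypothetical module, not part of torch.nn
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(3))  # registered as a parameter
        self.offset = torch.zeros(3)               # plain tensor attribute: not registered

    def forward(self, x):
        return x * self.weight + self.offset

m = Scale()
names = [name for name, _ in m.named_parameters()]
print(names)                    # ['weight']
print(m.weight.requires_grad)   # True (the default for nn.Parameter)
```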

<br>

## <font style="font-size:110%">These are the basic building blocks for graphs:</font>

### <a href="https://pytorch.org/docs/stable/nn.html#containers" style="text-decoration:none;font-size:110%">Containers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#convolution-layers" style="text-decoration:none;font-size:110%">Convolution Layers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#pooling-layers" style="text-decoration:none;font-size:110%">Pooling layers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#padding-layers" style="text-decoration:none;font-size:110%">Padding Layers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity" style="text-decoration:none;font-size:110%">Non-linear Activations (weighted sum, nonlinearity)</a>

### <a href="https://pytorch.org/docs/stable/nn.html#non-linear-activations-other" style="text-decoration:none;font-size:110%">Non-linear Activations (other)</a>

### <a href="https://pytorch.org/docs/stable/nn.html#normalization-layers" style="text-decoration:none;font-size:110%">Normalization Layers</a>

&emsp;&emsp;&emsp;&ensp;
<a href="https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html" style="text-decoration:none;font-size:140%;color:maroon;font-weight:bold;">nn.BatchNorm2d</a>

<br>
<br>

### <a href="https://pytorch.org/docs/stable/nn.html#recurrent-layers" style="text-decoration:none;font-size:110%">Recurrent Layers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#transformer-layers" style="text-decoration:none;font-size:110%">Transformer Layers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#linear-layers" style="text-decoration:none;font-size:110%">Linear Layers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#dropout-layers" style="text-decoration:none;font-size:110%">Dropout Layers</a>

&emsp;&emsp;&emsp;&ensp;
<a href="https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html" style="text-decoration:none;color:maroon;font-size:140%;font-weight:bold;">nn.Dropout</a>

&emsp;&emsp;&emsp;&ensp;
<a href="https://pytorch.org/docs/stable/generated/torch.nn.Dropout2d.html" style="text-decoration:none;color:maroon;font-size:140%;font-weight:bold;">nn.Dropout2d</a>

<br>

### <a href="https://pytorch.org/docs/stable/nn.html#sparse-layers" style="text-decoration:none;font-size:110%">Sparse Layers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#distance-functions" style="text-decoration:none;font-size:110%">Distance Functions</a>

### <a href="https://pytorch.org/docs/stable/nn.html#loss-functions" style="text-decoration:none;font-size:110%">Loss Functions</a>

### <a href="https://pytorch.org/docs/stable/nn.html#vision-layers" style="text-decoration:none;font-size:110%">Vision Layers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#shuffle-layers" style="text-decoration:none;font-size:110%">Shuffle Layers</a>

### <a href="https://pytorch.org/docs/stable/nn.html#dataparallel-layers-multi-gpu-distributed" style="text-decoration:none;font-size:110%">DataParallel Layers (multi-GPU, distributed)</a>

### <a href="https://pytorch.org/docs/stable/nn.html#utilities" style="text-decoration:none;font-size:110%">Utilities</a>

### <a href="https://pytorch.org/docs/stable/nn.html#quantized-functions" style="text-decoration:none;font-size:110%">Quantized Functions</a>

### <a href="https://pytorch.org/docs/stable/nn.html#lazy-modules-initialization" style="text-decoration:none;font-size:110%">Lazy Modules Initialization</a>

<br>
<br>
<br>

# torch.nn.Module <a href="https://pytorch.org/docs/stable/generated/torch.nn.Module.html" style="text-decoration:none;font-size:70%">[link]</a>

<font size=3 color=gray>Docs > torch.nn > </font>
<font size=4>**Module**</font> 

***(To be completed)***

torch.nn.Module.train

https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.train

torch.nn.Module.eval

https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval
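
As a rough sketch of what `train()` and `eval()` change, `nn.Dropout` is used below as an example of a layer that behaves differently in the two modes. The layer choice and input are illustrative.

```python
import torch
from torch import nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.eval()                       # evaluation mode
print(drop.training)              # False
print(torch.equal(drop(x), x))    # True: dropout is the identity in eval mode

drop.train()                      # training mode
print(drop.training)              # True
out = drop(x)
# Each element is either dropped (0) or scaled by 1 / (1 - p) = 2
print(set(out.tolist()) <= {0.0, 2.0})   # True
```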

<br>
<br>
<br>