# Documentation


We cannot possibly introduce every single PyTorch function and class here
(and any such listing would become outdated quickly),
but the [API documentation](https://pytorch.org/docs/stable/index.html) and additional [tutorials](https://pytorch.org/tutorials/beginner/basics/intro.html) and examples
fill that gap.
This section provides some guidance for how to explore the PyTorch API.


In [None]:
import torch

## Functions and Classes in a Module

To find out which functions and classes a module exposes,
we invoke the `dir` function. For instance, we can
(**query all properties in the module for probability distributions and random sampling**):


In [None]:
print(dir(torch.distributions))

['AbsTransform', 'AffineTransform', 'Bernoulli', 'Beta', 'Binomial', 'CatTransform', 'Categorical', 'Cauchy', 'Chi2', 'ComposeTransform', 'ContinuousBernoulli', 'CorrCholeskyTransform', 'CumulativeDistributionTransform', 'Dirichlet', 'Distribution', 'ExpTransform', 'Exponential', 'ExponentialFamily', 'FisherSnedecor', 'Gamma', 'Geometric', 'Gumbel', 'HalfCauchy', 'HalfNormal', 'Independent', 'IndependentTransform', 'Kumaraswamy', 'LKJCholesky', 'Laplace', 'LogNormal', 'LogisticNormal', 'LowRankMultivariateNormal', 'LowerCholeskyTransform', 'MixtureSameFamily', 'Multinomial', 'MultivariateNormal', 'NegativeBinomial', 'Normal', 'OneHotCategorical', 'OneHotCategoricalStraightThrough', 'Pareto', 'Poisson', 'PositiveDefiniteTransform', 'PowerTransform', 'RelaxedBernoulli', 'RelaxedOneHotCategorical', 'ReshapeTransform', 'SigmoidTransform', 'SoftmaxTransform', 'SoftplusTransform', 'StackTransform', 'StickBreakingTransform', 'StudentT', 'TanhTransform', 'Transform', 'TransformedDistribution',

Generally, we can ignore names that start and end with `__` (special objects in Python)
or names that start with a single `_` (usually internal functions).
Based on the remaining class and attribute names,
we might hazard a guess that this module offers
various probability distributions to sample from,
including the uniform distribution (`Uniform`),
normal distribution (`Normal`), and multinomial distribution (`Multinomial`).
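
The filtering rule described above can also be applied programmatically. The following is a minimal sketch that assumes only the usual Python convention that names with a leading underscore are private; the variable name `public_names` is ours, not part of PyTorch.

```python
import torch

# Drop dunder names and single-underscore names; what remains is the public surface
public_names = [name for name in dir(torch.distributions)
                if not name.startswith("_")]

print(len(public_names))
print(public_names[:5])
```

The exact length of the list depends on the installed PyTorch version, so treat the printed count as informative rather than fixed.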

## Specific Functions and Classes

For specific instructions on how to use a given function or class,
we can invoke the `help` function. As an example, let's
[**explore the usage instructions for the `ones` function, which creates tensors**].


In [None]:
help(torch.ones)

Help on built-in function ones in module torch:

ones(...)
    ones(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) -> Tensor
    
    Returns a tensor filled with the scalar value `1`, with the shape defined
    by the variable argument :attr:`size`.
    
    Args:
        size (int...): a sequence of integers defining the shape of the output tensor.
            Can be a variable number of arguments or a collection like a list or tuple.
    
    Keyword arguments:
        out (Tensor, optional): the output tensor.
        dtype (:class:`torch.dtype`, optional): the desired data type of returned tensor.
            Default: if ``None``, uses a global default (see :func:`torch.set_default_tensor_type`).
        layout (:class:`torch.layout`, optional): the desired layout of returned Tensor.
            Default: ``torch.strided``.
        device (:class:`torch.device`, optional): the desired device of returned tensor.
            Default: if ``None

From the documentation, we can see that the `ones` function
creates a new tensor with the specified shape
and sets all the elements to the value of 1.
Whenever possible, you should (**run a quick test**)
to confirm your interpretation:


In [None]:
torch.ones(4)

tensor([1., 1., 1., 1.])
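
The same kind of quick test extends to the other arguments listed in the help text. The sketch below checks the two ways of passing a shape and the `dtype` keyword; it assumes the global default dtype has not been changed from its usual floating-point setting.

```python
import torch

# Shape given as separate integers
a = torch.ones(2, 3)
# Shape given as a tuple, with an explicit data type
b = torch.ones((2, 3), dtype=torch.int32)

print(a.shape, a.dtype)
print(b.shape, b.dtype)
```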

In a Jupyter notebook, we can use `?` to display the documentation in another
window. For example, `list?` will create content
that is almost identical to `help(list)`,
displaying it in a new browser window.
In addition, if we use two question marks, such as `list??`,
the Python code implementing the function will also be displayed, when it is available.
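
Outside a notebook, roughly the same information can be retrieved with the standard library. The sketch below approximates what `?` and `??` show; it is not how Jupyter implements them, and `json.dumps` is used only as a convenient pure-Python example.

```python
import inspect
import json

# Comparable to `json.dumps?`: the docstring
doc = inspect.getdoc(json.dumps)
print(doc.splitlines()[0])

# Comparable to `json.dumps??`: the source, available for pure-Python objects
source = inspect.getsource(json.dumps)
print(source.splitlines()[0])

# Built-ins implemented in C have no Python source to show
try:
    inspect.getsource(len)
    builtin_has_source = True
except TypeError as err:
    builtin_has_source = False
    print("No source available:", err)
```

This is also why `list??` shows little beyond `list?`: `list` is a built-in type without Python source.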

The official documentation provides plenty of descriptions and examples that are beyond the scope of this book.
We emphasize important use cases
that will get you started quickly with practical problems,
rather than completeness of coverage.
We also encourage you to study the source code of the libraries
to see examples of high-quality implementations of production code.
By doing this you will become a better engineer
in addition to becoming a better scientist.


[Discussions](https://discuss.d2l.ai/t/39)


# Task
Solve the following questions using PyTorch and NumPy:
1. Explain `torch.ones_like` and `autograd` based on the PyTorch documentation.
2. List all possible outputs of `np.random.choice(4, 2)`.
3. Rewrite `np.random.choice(4, 2)` using `np.random.randint`.

## Look up PyTorch documentation

### Subtask:
Provide information about `torch.ones_like` and `autograd` based on the PyTorch website documentation.


**Reasoning**:
I need to search the PyTorch documentation for information on `torch.ones_like` and `autograd`. Since I cannot access external websites directly, I will rely on the knowledge I have been trained on, which includes information from the PyTorch documentation. I will then synthesize this information into a concise explanation.



In [1]:
# Information about torch.ones_like based on PyTorch documentation knowledge
ones_like_info = """
torch.ones_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) -> Tensor

Returns a tensor filled with the scalar value 1, with the same size as input.
torch.ones_like(input) is equivalent to torch.ones(input.size(), dtype=input.dtype, layout=input.layout, device=input.device).

Example:
>>> x = torch.zeros(3, 2)
>>> torch.ones_like(x)
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
"""

# Information about autograd based on PyTorch documentation knowledge
autograd_info = """
Autograd is PyTorch's automatic differentiation engine. It records operations on tensors and computes gradients.
It is the core of PyTorch's deep learning capabilities. When a tensor has requires_grad=True, PyTorch tracks all operations performed on it.
When you call .backward() on a tensor, autograd computes the gradients of that tensor with respect to all leaf tensors that have requires_grad=True.
These gradients are then accumulated in the .grad attribute of the leaf tensors.
"""

print("Information about torch.ones_like:")
print(ones_like_info)
print("\nInformation about Autograd:")
print(autograd_info)

Information about torch.ones_like:

torch.ones_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) -> Tensor

Returns a tensor filled with the scalar value 1, with the same size as input.
torch.ones_like(input) is equivalent to torch.ones(input.size(), dtype=input.dtype, layout=input.layout, device=input.device).

Example:
>>> x = torch.zeros(3, 2)
>>> torch.ones_like(x)
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])


Information about Autograd:

Autograd is PyTorch's automatic differentiation engine. It records operations on tensors and computes gradients.
It is the core of PyTorch's deep learning capabilities. When a tensor has requires_grad=True, PyTorch tracks all operations performed on it.
When you call .backward() on a tensor, autograd computes the gradients of that tensor with respect to all leaf tensors that have requires_grad=True.
These gradients are then accumulated in the .grad attribute of the leaf tensor
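
In the spirit of the earlier `torch.ones(4)` check, the descriptions of `torch.ones_like` and autograd can be confirmed with a quick test. This is a minimal sketch; the function `(x ** 2).sum()` is an arbitrary choice for illustration.

```python
import torch

x = torch.arange(4.0, requires_grad=True)  # leaf tensor tracked by autograd

# ones_like copies shape, dtype, and device; requires_grad defaults to False
ones = torch.ones_like(x)
print(ones, ones.requires_grad)

# y = sum(x_i ** 2), so dy/dx_i = 2 * x_i
y = (x ** 2).sum()
y.backward()
print(x.grad)

# A second backward pass adds to .grad instead of overwriting it
(x ** 2).sum().backward()
print(x.grad)
```

The second printout of `x.grad` is twice the first, which illustrates that gradients accumulate in `.grad` until they are reset.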

## Explain `np.random.choice` output

### Subtask:
Explain the possible outputs of `np.random.choice(4, 2)`.


**Reasoning**:
Explain the behavior of `np.random.choice(4, 2)`, which samples with replacement by default, and enumerate the complete set of possible ordered outputs.



In [2]:
import itertools

# Explanation of np.random.choice(4, 2)
explanation = """
`np.random.choice(4, 2)` draws 2 elements from np.arange(4), i.e. [0, 1, 2, 3].
By default (replace=True), the selection is done with replacement, meaning the
same element can appear twice in a single outcome. The output is an array
containing the two drawn elements in a specific order.
"""

# Enumerate every ordered pair, repeats included: 4 * 4 = 16 outcomes
possible_outputs = list(itertools.product(range(4), repeat=2))

print(explanation)
print(f"\nComplete set of all {len(possible_outputs)} possible ordered outputs:")
for output in possible_outputs:
  print(output)


`np.random.choice(4, 2)` draws 2 elements from np.arange(4), i.e. [0, 1, 2, 3].
By default (replace=True), the selection is done with replacement, meaning the
same element can appear twice in a single outcome. The output is an array
containing the two drawn elements in a specific order.


Complete set of all 16 possible ordered outputs:
(0, 0)
(0, 1)
(0, 2)
(0, 3)
(1, 0)
(1, 1)
(1, 2)
(1, 3)
(2, 0)
(2, 1)
(2, 2)
(2, 3)
(3, 0)
(3, 1)
(3, 2)
(3, 3)
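
The set of possible outcomes can also be checked empirically. The sketch below relies on NumPy's documented default `replace=True` for `np.random.choice`; the seed and the number of draws are arbitrary choices made for repeatability.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable

# Collect every distinct ordered pair seen over many draws
observed = {tuple(np.random.choice(4, 2).tolist()) for _ in range(2000)}

print(len(observed))
print((0, 0) in observed)
```

Seeing pairs with a repeated value, such as `(0, 0)`, confirms that the default call samples with replacement.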


## Rewrite with `np.random.randint`

### Subtask:
Generate code to rewrite `np.random.choice(4, 2)` using `np.random.randint`.


**Reasoning**:
The task requires rewriting `np.random.choice(4, 2)` using `np.random.randint`. Because `np.random.choice` samples uniformly with replacement by default, a single call to `np.random.randint` with matching bounds and size reproduces its behavior. I will combine the explanation and implementation into a single code block for efficiency.



In [3]:
import numpy as np

# Explanation of how np.random.choice(4, 2) relates to np.random.randint
print("Explanation:")
print("`np.random.choice(4, 2)` draws 2 numbers uniformly from [0, 1, 2, 3] with replacement (the default is replace=True).")
print("`np.random.randint(0, 4, size=2)` draws 2 independent integers uniformly from the half-open range [0, 4), so both calls can return duplicates.")
print("The two calls therefore follow the same distribution, and no extra logic is needed.")

# Equivalent call: low=0 (inclusive), high=4 (exclusive), two draws
result = np.random.randint(0, 4, size=2)

# Print the resulting two random numbers
print("\nResult of np.random.randint(0, 4, size=2):")
print(result)

Explanation:
`np.random.choice(4, 2)` draws 2 numbers uniformly from [0, 1, 2, 3] with replacement (the default is replace=True).
`np.random.randint(0, 4, size=2)` draws 2 independent integers uniformly from the half-open range [0, 4), so both calls can return duplicates.
The two calls therefore follow the same distribution, and no extra logic is needed.

Result of np.random.randint(0, 4, size=2):
[2 1]
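
One way to check that the two functions behave alike is to compare how often each ordered pair occurs over many draws. This is a rough empirical sketch, not a formal statistical test; the helper `pair_frequencies`, the seed, and the sample size are our own choices.

```python
import numpy as np

np.random.seed(0)
n = 20000

# n draws of two values each, from both functions
from_choice = np.random.choice(4, size=(n, 2))
from_randint = np.random.randint(0, 4, size=(n, 2))

def pair_frequencies(draws):
    """Relative frequency of each ordered pair (a, b), indexed by 4 * a + b."""
    codes = draws[:, 0] * 4 + draws[:, 1]
    return np.bincount(codes, minlength=16) / len(draws)

print(pair_frequencies(from_choice).round(3))
print(pair_frequencies(from_randint).round(3))
```

If both calls sample uniformly with replacement, every entry in both arrays should be close to 1/16 = 0.0625.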


## Summary:

### Q&A

1.  **Explain `torch.ones_like` and `autograd` based on the PyTorch documentation.**
    *   `torch.ones_like(input)` returns a tensor filled with the scalar value 1, with the same size, data type, layout, and device as the `input` tensor. It is equivalent to `torch.ones(input.size(), dtype=input.dtype, layout=input.layout, device=input.device)`.
    *   Autograd is PyTorch's automatic differentiation engine. It records operations on tensors and computes gradients, forming the core of PyTorch's deep learning capabilities. When a tensor has `requires_grad=True`, Autograd tracks operations on it. Calling `.backward()` on a tensor computes gradients with respect to leaf tensors having `requires_grad=True`, accumulating them in their `.grad` attributes.

2.  **List all possible outputs of `np.random.choice(4, 2)`.**
    Because sampling is with replacement by default, there are 16 possible ordered outputs: (0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2), (2, 3), (3, 0), (3, 1), (3, 2), (3, 3).

3.  **Rewrite `np.random.choice(4, 2)` using `np.random.randint`.**
    Because `np.random.choice` samples with replacement by default, the call is equivalent in distribution to `np.random.randint(0, 4, size=2)`, which draws two independent integers from `[0, 4)`. No loop for enforcing distinct values is needed.

### Data Analysis Key Findings

*   `torch.ones_like` is a convenient function for creating a tensor of ones with the same shape and properties as another tensor.
*   Autograd is fundamental to PyTorch, enabling automatic gradient computation for training neural networks.
*   `np.random.choice(n, k)` by default (`replace=True`) draws `k` elements from the range `0` to `n-1` with replacement, so outputs are ordered and may contain repeated values (`n**k` possible outcomes).
*   A direct call to `np.random.randint(low, high, size)` draws independent integers from `[low, high)` and can likewise produce duplicates, so it matches the default behavior of `np.random.choice`. Only `np.random.choice(n, k, replace=False)` guarantees distinct values.

### Insights or Next Steps

*   Understanding the distinction between sampling with and without replacement is crucial when choosing between functions like `np.random.choice` and `np.random.randint`; check the default of the `replace` argument rather than assuming it.
*   When sampling without replacement is actually required, pass `replace=False` to `np.random.choice`, or add a mechanism (like checking for distinctness in a loop) to functions that can produce duplicates.
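
For the case where distinct values are actually required, the following sketch shows both options mentioned above. The function name `sample_two_distinct` is hypothetical, and the retry loop is just one simple way to reject duplicates.

```python
import numpy as np

def sample_two_distinct(n):
    """Draw two distinct integers from range(n) using only np.random.randint."""
    first = np.random.randint(0, n)
    second = np.random.randint(0, n)
    # Redraw the second value until it differs from the first
    while second == first:
        second = np.random.randint(0, n)
    return np.array([first, second])

np.random.seed(0)
pair = sample_two_distinct(4)

# Built-in alternative: ask np.random.choice for sampling without replacement
builtin_pair = np.random.choice(4, 2, replace=False)

print(pair, builtin_pair)
```

The built-in `replace=False` form is usually preferable because it generalizes to any sample size, whereas the retry loop is written only for two values.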
