#### torch.nn.Linear  

A linear (fully connected) layer is defined by two sizes:
* input dimension $d_{in}$
* output dimension $d_{out}$

The layer has two attributes, .weight and .bias, both of which are tensors.
.weight has shape $(d_{out}, d_{in})$, i.e. it is a matrix in $\mathbb{R}^{d_{out}\times d_{in}}$.
.bias has shape $(d_{out},)$, i.e. it is a vector in $\mathbb{R}^{d_{out}}$.
During forward propagation, the layer computes:

$y= x A^{\top} + b$

where $A$ is the weight matrix (.weight) and $b$ is the bias vector (.bias).

Note that nn.Linear supports batched computation: it acts only on the last dimension of $x$ and leaves all leading dimensions unchanged.
$x: (*, d_{in})$
$y: (*, d_{out})$

\* stands for any number of leading dimensions (including none).


In [2]:
import torch
import torch.nn as nn
in_dim=10
out_dim=5
fc=nn.Linear(in_dim,out_dim)
batch_input_data=torch.randn(3,in_dim)
batch_output_data=fc(batch_input_data)
print(fc.weight.shape)
print(fc.bias.shape)
print(batch_output_data.shape)

torch.Size([5, 10])
torch.Size([5])
torch.Size([3, 5])
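
To check the formula and the "last dimension only" behaviour, here is a minimal sketch that feeds an input with two leading dimensions through the layer and rebuilds the result by hand. The sizes and the seed are arbitrary choices made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the random tensors reproducible

fc = nn.Linear(10, 5)
x = torch.randn(2, 3, 10)  # two leading (batch-like) dimensions, d_in = 10

y = fc(x)

# Reproduce the layer by hand: y = x A^T + b, applied along the last dimension.
y_manual = x @ fc.weight.T + fc.bias

print(y.shape)
print(torch.allclose(y, y_manual, atol=1e-5))
```

The leading dimensions (2, 3) pass through untouched; only the last dimension changes from 10 to 5.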


## Hydra
How to use hydra to set configurations for the main function


In [None]:
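import hydra
from omegaconf import DictConfig, OmegaConf

# config_path is the directory that holds the config files (relative to this script);
# config_name is the config file name without the .yaml extension.
@hydra.main(version_base=None, config_path=".", config_name="config")
def my_app(cfg: DictConfig) -> None:
    # Your Hydra-enabled application code here
    print(OmegaConf.to_yaml(cfg))  # replaces cfg.pretty(), which newer OmegaConf versions no longer provide

if __name__ == "__main__":
    my_app()

In the above example:
@hydra.main decorates the entry point function my_app. It tells Hydra that it is the main function to be executed.
config_path="." and config_name="config" tell Hydra to load config.yaml from the directory that contains the script. Hydra versions before 1.0 accepted the file path directly, as in config_path="config.yaml"; from 1.0 onward the directory and the file name are given separately. The version_base argument requires Hydra 1.2 or later.
cfg: DictConfig is the configuration object that will be passed to your main function.
You can then run your Hydra-enabled script, and Hydra will handle the configuration loading and setup based on the config.yaml file you provide.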

### How to use Hydra to instantiate an object from a class specified in a config file

In [None]:
from hydra.utils import instantiate
# Define your configuration as a dictionary
config = {
    "_target_": "module_name.ClassName",  # Specify the target module and class to instantiate
    "param1": value1,
    "param2": value2
}
# Instantiate an object using the configuration
obj = instantiate(config)

_target_: Specifies the target module and class to instantiate. This key is mandatory.

Other keys such as "param1", "param2", etc., are used to provide values for the parameters of the class being instantiated.

When configuring the _target_ key, you should provide the full import path of the module and class you want to instantiate.

For example, "module_name.ClassName" refers to the class called ClassName inside the module named module_name.
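
To make the role of _target_ concrete, here is a simplified pure-Python sketch of the idea: import the module, look up the class, and pass the remaining keys as keyword arguments. This is not Hydra's implementation; `simple_instantiate` is a hypothetical helper, and `fractions.Fraction` is used only because it ships with Python.

```python
import importlib

def simple_instantiate(config):
    """Hypothetical, simplified stand-in for hydra.utils.instantiate."""
    kwargs = dict(config)              # copy so the caller's dict is left untouched
    target = kwargs.pop("_target_")    # full import path, e.g. "fractions.Fraction"
    module_name, _, class_name = target.rpartition(".")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    return cls(**kwargs)               # remaining keys become constructor arguments

config = {
    "_target_": "fractions.Fraction",
    "numerator": 3,
    "denominator": 4,
}
obj = simple_instantiate(config)
print(type(obj).__name__, obj)
```

The real instantiate does more than this sketch, for example handling nested configs, so treat it only as a mental model.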

For example, an activation entry in the YAML config can specify which activation function the application uses:

In [None]:
activation:
    _target_: torch.nn.ELU
    # parameters of the class's __init__ method can be defined here
    inplace: False  # set to True to apply the activation in place and save memory

### Assert

assert <boolean expression>, 'error message string'

If the expression evaluates to False, Python raises an AssertionError that carries the message.

In [3]:
import numpy as np

def func(var: int) -> float:
    # Both conditions must hold; otherwise an AssertionError with the message is raised.
    assert isinstance(var, int) and var < 0, 'var must be a negative integer'
    return np.sqrt(-1 * var)

print(func(-1))

print(func(1))

1.0


AssertionError: var must be a negative integer

### Cross referencing in .yaml files


train:
  nstep: ${buffer.nstep}
agent:
  gamma: 0.99
  nstep: ${buffer.nstep}

buffer:
  nstep: 1
  gamma: ${agent.gamma}
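
The `${section.key}` syntax is OmegaConf variable interpolation, which Hydra configs support: the value is looked up elsewhere in the same config when it is read. The following is a pure-Python sketch of that lookup rule applied to the config above. It is not OmegaConf's implementation; `lookup` and `resolve` are hypothetical helpers, and the sketch handles only values that consist of exactly one reference.

```python
import re

# The YAML above, written as the nested dict a YAML loader would produce.
config = {
    "train": {"nstep": "${buffer.nstep}"},
    "agent": {"gamma": 0.99, "nstep": "${buffer.nstep}"},
    "buffer": {"nstep": 1, "gamma": "${agent.gamma}"},
}

REFERENCE = re.compile(r"^\$\{([\w.]+)\}$")  # a value that is exactly ${a.b}

def lookup(root, dotted_key):
    """Follow a dotted key such as 'buffer.nstep' from the root of the config."""
    node = root
    for part in dotted_key.split("."):
        node = node[part]
    return node

def resolve(root, value):
    """Follow ${...} references until a plain value remains (no cycle detection)."""
    while isinstance(value, str) and (match := REFERENCE.match(value)):
        value = lookup(root, match.group(1))
    return value

resolved = {
    section: {key: resolve(config, value) for key, value in entries.items()}
    for section, entries in config.items()
}
print(resolved)
```

Each value is defined once (buffer.nstep and agent.gamma) and referenced everywhere else, so changing it in one place keeps the train, agent, and buffer sections consistent.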

### Object-oriented Python


In [None]:
class QNetwork(nn.Module):
    def __init__(self, state_size, action_size, hidden_size, activation):
        super(QNetwork, self).__init__()

If a class inherits from a superclass (nn.Module in this example), the first thing to do in the subclass's __init__ method is to call the superclass's constructor: super(SubClassName, self).__init__(SuperClassInitParameters).
For nn.Module this call sets up the internal bookkeeping that registers parameters and submodules, so it must run before any layer is assigned to self.
In Python 3 the arguments to super can be omitted: super().__init__() is equivalent to the explicit form, which is the older spelling that Python 2 required.
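
A small sketch with plain classes shows what goes wrong when the superclass constructor is skipped. The class names `Base`, `GoodChild`, and `BadChild` are made up for illustration and are not part of PyTorch.

```python
class Base:
    def __init__(self, name):
        self.name = name
        self.registry = []  # state that the base class's methods rely on

    def register(self, item):
        self.registry.append(item)

class GoodChild(Base):
    def __init__(self, name, size):
        super().__init__(name)  # same as super(GoodChild, self).__init__(name)
        self.size = size

class BadChild(Base):
    def __init__(self, name, size):
        # Base.__init__ is never called, so self.registry is never created.
        self.size = size

good = GoodChild("q_net", 4)
good.register("layer1")
print(good.name, good.registry)

bad = BadChild("q_net", 4)
try:
    bad.register("layer1")
except AttributeError as err:
    error_message = str(err)
    print("BadChild failed:", error_message)
```

nn.Module behaves in a similar way: assigning a layer to self before super().__init__() has run raises an error, because the containers that track submodules do not exist yet.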

### Use dir() to check the attributes/methods/structure of an unfamiliar module/instance/library

Here's how dir() works and what it returns:

Syntax: The syntax of dir() is dir(object). When you pass an object to dir(), it inspects the object and returns a sorted list of names comprising its attributes.

Output:

For a module: dir(module) returns a list of all the names defined in the module's namespace.

For a class or instance: dir(class_or_instance) returns a list of attributes and methods that the class or instance has access to.

If you call dir() without an object (i.e., dir()), it returns a sorted list of the names in the current local scope.

Using dir() is a handy way to explore an object's attributes and methods, especially when working with unfamiliar libraries or objects. It helps you discover what functionality is available and can be particularly useful for interactive exploration and debugging in Python scripts.
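
As a quick illustration, the sketch below calls dir() on a standard-library module and on a list instance. Filtering out names that start with an underscore is a common convenience, not something dir() does by itself.

```python
import math

# Names defined in a module, without the underscore-prefixed entries.
public_math = [name for name in dir(math) if not name.startswith("_")]
print(public_math[:5])

# Attributes and methods available on an instance (here: an empty list).
list_methods = [name for name in dir([]) if not name.startswith("_")]
print(list_methods)

# dir() returns its names in sorted order.
print(dir(math) == sorted(dir(math)))
```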

### Random number generator in NumPy

In [None]:
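import numpy as np
# Random generator, abbreviated as 'rng'. Note the parentheses: default_rng() returns
# a Generator instance; pass a seed, e.g. default_rng(0), for reproducible results.
rng = np.random.default_rng()

# Generate a random integer between 0 and 100 inclusive (the upper bound 101 is exclusive)
random_int = rng.integers(0, 101)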

# Generate an array of random numbers from a normal distribution
random_array = rng.normal(loc=0, scale=1, size=10)

# check all the methods in rng.
print(dir(rng))


### How to use DataLoader

DataLoader receives a dataset, splits it into batches of the given size, and returns an iterable. Iterating over it yields one batch of samples at a time.

* pin_memory

Pinned (page-locked) host memory can be transferred to the GPU more efficiently than ordinary pageable memory. Pinning applies to CPU tensors, so it has to happen before the transfer: either pass pin_memory=True to the DataLoader so that every batch is placed in pinned memory, or call .pin_memory() on a CPU tensor yourself.

* .to(device, non_blocking=True)

After a batch is loaded, use .to(device, non_blocking=True) to

1. move the data to the CPU/GPU specified by 'device'

2. make the host-to-GPU copy asynchronous, which speeds up data transfer. non_blocking means the CPU does not wait for the copy to finish and can continue with other work in the meantime. The copy is only asynchronous when the source tensor is in pinned memory.

* .is_contiguous()  and .contiguous()
If a Tensor is a view of another Tensor, for example one created by transposing it or by slicing some of its columns, its elements are usually not laid out contiguously in memory, which may slow down data processing and can be rejected by some operations such as .view().
So after loading data or receiving data as an input parameter, you can first check whether it is contiguous and, if it is not, make it so.

Here is_contiguous() returns a Boolean indicating whether the data is currently contiguous, and contiguous() returns a contiguous copy (or the tensor itself if it is already contiguous).


In [None]:
import torch
from torch.utils.data import DataLoader

# Create a PyTorch DataLoader
dataset = ...
# pin_memory=True places each batch in pinned host memory, so the transfer below can be asynchronous.
dataloader = DataLoader(dataset, batch_size=32, shuffle=True, pin_memory=True)
device = 'cuda'
# DataLoader receives the dataset, splits it into batches of the given size, and
# returns an iterable. Iterating over it yields one batch of samples at a time.

# Iterate through the DataLoader.
for data in dataloader:
    input_data, target = data
    # Do not call .pin_memory() after .to(device): only CPU tensors can be pinned.
    input_data = input_data.to(device, non_blocking=True)
    if not input_data.is_contiguous():
        input_data = input_data.contiguous()
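
The contiguity check can be tried on the CPU without a dataset. This is a minimal sketch on a small made-up tensor, showing which kinds of views stay contiguous and which do not.

```python
import torch

x = torch.arange(12).reshape(3, 4)  # freshly created, so contiguous
rows = x[:2]                        # slicing leading rows keeps the layout contiguous
cols = x[:, :2]                     # slicing columns skips elements in memory
xt = x.t()                          # transposing only swaps the strides

print(x.is_contiguous(), rows.is_contiguous(), cols.is_contiguous(), xt.is_contiguous())

xt_c = xt.contiguous()              # copies the data into a contiguous layout
print(xt_c.is_contiguous(), torch.equal(xt, xt_c))
```

The values are the same before and after .contiguous(); only the memory layout changes.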


#### Use \__repr\__() to specify the representation string of a class
You do this for readability. You can specify the name of your class when you print an object instantiated from it.

In [5]:
class MyClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Name_of_MyClass(x={self.x}, y={self.y})"

obj=MyClass(3,2)
print(obj)

Name_of_MyClass(x=3, y=2)


### Deep Copy 
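Deep copy: copy.deepcopy() is a function that makes a deep copy of an object. The object and everything nested inside it are copied recursively, so changing the source will not affect the copy, and vice versa.
v.s.
Shallow copy: copy.copy() creates a new outer object but fills it with references to the same nested objects, so changing a nested object through the source also affects the copy. Plain assignment (b = a) makes no copy at all; both names refer to the same object.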

In [None]:
import copy

# Original list
original_list = [1, 2, [3, 4], 5]

# Deep copy of the original list
copied_list = copy.deepcopy(original_list)

# Modify the nested list within the copied list
copied_list[2][0] = "Changed"

# Output the original and copied list
print("Original List:", original_list)
print("Copied List:", copied_list)
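
For contrast, here is a short sketch that puts plain assignment, a shallow copy, and a deep copy side by side. The variable names are chosen for illustration.

```python
import copy

original = [1, 2, [3, 4], 5]
alias = original                 # no copy: a second name for the same list
shallow = copy.copy(original)    # new outer list, shared inner list
deep = copy.deepcopy(original)   # new outer list and new inner list

original[0] = "outer change"     # rebinding a slot in the outer list
original[2][0] = "inner change"  # mutating the nested list

print("alias:  ", alias)
print("shallow:", shallow)
print("deep:   ", deep)
```

The alias sees both changes, the shallow copy sees only the change to the nested list, and the deep copy sees neither.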

### @torch.no_grad and .detach()

* @torch.no_grad() is used as a decorator to mark a function as not requiring gradient computation. The same object also works as a context manager: with torch.no_grad(): ...

**When do we use it:**
Inference (forward propagation only)
Evaluation (computing validation metrics, writing logs)

**How to use it:**
Write it on top of a function definition. Operations executed inside the function are then not recorded in the computation graph, so every tensor they produce has .requires_grad set to False. Tensors that already exist, such as model parameters, keep their own .requires_grad value.

**Why do we use it:**
To save memory and accelerate computation, because no intermediate results are stored for the backward pass.

* .detach() is a tensor method that returns a new tensor sharing the same storage as the original but cut off from the computation graph, so it does not require gradient computation.

**When do we use it:**
When you only need the values of a tensor and no gradient should flow back through it, for example when it enters a specific regularization term as a constant.
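
The difference between the two can be checked on a tiny example. This is a minimal sketch; the tensor `w` stands in for a model parameter and `evaluate` is a made-up function name.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)  # stands in for a model parameter

@torch.no_grad()
def evaluate(x):
    # Operations in here are not recorded for backpropagation.
    return (w * x).sum()

x = torch.tensor([3.0, 4.0])

tracked = (w * x).sum()      # normal forward pass, recorded in the graph
untracked = evaluate(x)      # same computation under no_grad
detached = tracked.detach()  # same value as `tracked`, cut off from the graph

print(tracked.requires_grad, untracked.requires_grad, detached.requires_grad)
print(w.requires_grad)       # the existing parameter itself is unchanged

# detach() shares storage with the original tensor.
print(detached.data_ptr() == tracked.data_ptr())
```

Because the detached tensor shares storage with the original, an in-place change to one is visible in the other.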

#### How to use zip():
It takes the i-th item from each of several iterables and packs them into a tuple. The iterables do not have to be the same length: zip() stops as soon as the shortest one is exhausted.


In [3]:
for i in zip([1,2,3],['a','b','c'],(-3,-5,-7)):
    print(i, type(i))

(1, 'a', -3) <class 'tuple'>
(2, 'b', -5) <class 'tuple'>
(3, 'c', -7) <class 'tuple'>
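
Two related behaviours are worth seeing once: what happens with inputs of unequal length, and how to undo a zip. This sketch uses only the standard library; the lists are made up.

```python
from itertools import zip_longest

names = ["a", "b", "c"]
scores = [1, 2]

print(list(zip(names, scores)))                       # stops at the shorter input
print(list(zip_longest(names, scores, fillvalue=0)))  # pads the shorter input instead

pairs = list(zip(names, scores))
unzipped_names, unzipped_scores = zip(*pairs)         # zip(*...) undoes the pairing
print(unzipped_names, unzipped_scores)
```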
