# 2.3.7 Non-Reduction Sum

### Keeping Axes Unchanged in Reduction Operations

When calculating the **sum** or **mean**, it can be useful to keep the number of axes unchanged.  
- This matters when we want to apply the **broadcasting mechanism** to the result later.  
- To achieve this, we can pass `keepdim=True` to the function, which keeps each reduced axis in the output tensor with size 1 instead of dropping it.

In [1]:
import torch

In [2]:
A = torch.arange(6, dtype=torch.float32).reshape(2, 3)
B = A.clone()  # call clone() to get an independent copy of A

In [3]:
sum_A = A.sum(axis=1, keepdim=True)
sum_A, sum_A.shape

(tensor([[ 3.],
         [12.]]),
 torch.Size([2, 1]))
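
For contrast, the sketch below shows what happens when `keepdim` is left at its default. The names `sum_flat`, `sum_kept`, and `broadcast_ok` are introduced here for illustration only, and `dim` is used as PyTorch's native name for the axis argument.

```python
import torch

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)

# Without keepdim, the summed axis is dropped: shape (2,)
sum_flat = A.sum(dim=1)
# With keepdim=True, the summed axis is kept with size 1: shape (2, 1)
sum_kept = A.sum(dim=1, keepdim=True)

print(sum_flat.shape)  # torch.Size([2])
print(sum_kept.shape)  # torch.Size([2, 1])

# Shapes (2, 3) and (2,) do not line up on the trailing axis,
# so dividing by the flat sum raises an error for this matrix.
try:
    A / sum_flat
    broadcast_ok = True
except RuntimeError:
    broadcast_ok = False

print(broadcast_ok)  # False
```

Note that for a square matrix the division by the flat sum would run without an error but divide column-wise, which is a reason to keep the axis explicitly.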

### Normalizing Rows with Broadcasting

By keeping the axes after summing, we can use **broadcasting** to normalize a matrix.  
- For example, if `sum_A` retains its two axes after summing each row, we can divide the original matrix `A` by `sum_A` using broadcasting.  
- This creates a matrix where each row sums up to 1.

In [4]:
A / sum_A

tensor([[0.0000, 0.3333, 0.6667],
        [0.2500, 0.3333, 0.4167]])
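
As a quick sanity check, the following sketch confirms that every row of the normalized matrix sums to 1 (up to floating-point rounding). The variable names `normalized`, `row_sums`, and `rows_ok` are illustrative and not part of the original notebook.

```python
import torch

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)

# Divide each row by its own sum; the (2, 1) sum broadcasts across the 3 columns
normalized = A / A.sum(dim=1, keepdim=True)

# Summing each normalized row should give approximately 1
row_sums = normalized.sum(dim=1)
rows_ok = torch.allclose(row_sums, torch.ones(2))

print(rows_ok)  # True
```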

### Cumulative Sum of Tensor Elements

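To calculate the **cumulative sum** of elements in a tensor along a specific axis, we can use the `cumsum` function. With `axis=0`, the sum accumulates down the rows, so each row of the result is the running total of that row and all rows above it.  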
- The `cumsum` function computes the cumulative sum without reducing the input tensor along any axis.  
- The output tensor retains the same shape as the input tensor.

In [9]:
A.cumsum(axis=0)

tensor([[0., 1., 2.],
        [3., 5., 7.]])
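
The same idea applies along the other axis. The sketch below accumulates across the columns instead, and checks that the shape is unchanged and that the last column matches the ordinary row sums. The name `cum_cols` is introduced here for illustration.

```python
import torch

A = torch.arange(6, dtype=torch.float32).reshape(2, 3)

# Accumulate across the columns within each row
cum_cols = A.cumsum(dim=1)
print(cum_cols)
# tensor([[ 0.,  1.,  3.],
#         [ 3.,  7., 12.]])

# The shape is unchanged, unlike a reduction such as sum()
print(cum_cols.shape == A.shape)  # True

# The last column of the running total equals the plain row sums
print(torch.equal(cum_cols[:, -1], A.sum(dim=1)))  # True
```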