# Tensor shape

Here are the constraints I introduced in the NSF project:

1. **Tensor shape**: assume all tensors are in shape **(batchsize, length, dim-1, dim-2, ...)**, where 
    * batchsize: batch size of a data batch;
    * length: maximum length of data sequences in the batch;
    * dim-1: dimension of the feature vector in one frame;
    * dim-2: size of the second feature dimension, present only when the feature in each frame has more than one dimension.

   Length is the number of frames or, for waveforms, the number of waveform sampling points.
   
2. **Behavior**: hidden layers should not change the **batchsize** and **length** of input tensors unless specified (e.g., down-sampling or up-sampling layers).
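
As a rough illustration of the shape convention above, the sketch below builds a batch from sequences of different lengths and reshapes a waveform into the same layout. The sequence lengths, the random data, and the use of `pad_sequence` are assumptions made for this example only; they are not taken from the project code.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three hypothetical feature sequences with different lengths, 80 dims per frame
feat_dim = 80
sequences = [torch.randn(n_frames, feat_dim) for n_frames in (554, 300, 420)]

# batch_first=True gives (batchsize, length, dim-1);
# length is the maximum sequence length, shorter sequences are zero-padded
batch = pad_sequence(sequences, batch_first=True)
print(batch.shape)  # torch.Size([3, 554, 80])

# A waveform follows the same layout: length = number of sampling points, dim-1 = 1
waveform = torch.randn(16000)
waveform_tensor = waveform.view(1, -1, 1)
print(waveform_tensor.shape)  # torch.Size([1, 16000, 1])
```

Putting the batch dimension first and the time axis second means a frame-level feature sequence and a waveform can be indexed in the same way.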
    


### 1. Examples on tensor shape

In [3]:
# At the beginning, let's load packages
from __future__ import absolute_import
from __future__ import print_function
import sys
import numpy as np
import torch

import tool_lib
import matplotlib
import matplotlib.pyplot as plt
matplotlib.rcParams['figure.figsize'] = (10, 5)


In [6]:
# load the mel-spectrogram features (80 dimensions per frame)
mel_dim = 80
input_mel = tool_lib.read_raw_mat("data_models/acoustic_features/slt_arctic_b0474.mfbsp", mel_dim)

# add a batch dimension to get the required shape (batchsize=1, length, dim)
input_mel_tensor = torch.tensor(input_mel).unsqueeze(0)

In [7]:
print("Shape of original data: " + str(input_mel.shape))
print("Shape of data as tensor: " + str(input_mel_tensor.shape))

Shape of original data: (554, 80)
Shape of data as tensor: torch.Size([1, 554, 80])


In [8]:
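# the first (and only) item in the batch holds the same values as the original matrix,
# so the difference should be all zeros
input_mel_tensor[0] - input_mel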

tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]])

### 2. Note

When using project-NN-Pytorch-scripts, we only need to load the data matrices in shape \[N, M\] (see the [`__getitem__`](https://github.com/nii-yamagishilab/project-NN-Pytorch-scripts/blob/8c8318612e467c61c9d7d9315714e522bce3f2fe/core_scripts/data_io/default_data_io.py#L232) method).

This data IO is a wrapper over [torch.utils.data.Dataset](https://pytorch.org/docs/stable/data.html#map-style-datasets) and [torch.utils.data.DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader), and it automatically creates a tensor of shape (batchsize=1, N, M).

The default IO can also return a mini-batch of shape (batchsize>1, N, M), but only when all the data files in the dataset have the same shape \[N, M\]. This is not a concern here because we only use batchsize=1 in this project.
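
The behavior described in this note can be sketched with a toy dataset. `ToyMatrixDataset` is a hypothetical stand-in written for this example, not the project's data IO class; it only assumes that `__getitem__` returns one \[N, M\] matrix and that `DataLoader` uses its default collate function.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyMatrixDataset(Dataset):
    """Hypothetical dataset: every item is a random [N, M] matrix."""

    def __init__(self, num_items=4, n_frames=554, feat_dim=80):
        self.data = [torch.randn(n_frames, feat_dim) for _ in range(num_items)]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Return a single matrix of shape [N, M], without a batch dimension
        return self.data[idx]

dataset = ToyMatrixDataset()

# With batch_size=1, the default collate function adds the batch dimension
loader = DataLoader(dataset, batch_size=1, shuffle=False)
first_batch = next(iter(loader))
print(first_batch.shape)  # torch.Size([1, 554, 80])

# batch_size>1 works here only because all items share the same shape [N, M]
loader_2 = DataLoader(dataset, batch_size=2, shuffle=False)
print(next(iter(loader_2)).shape)  # torch.Size([2, 554, 80])
```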

### 3. Conventions

In summary, the tensors used by neural networks in this project should follow two conventions:

1. They have shape (batchsize, length, dim).
2. Apart from special layers (e.g., down-sampling, up-sampling), the hidden layers and modules in this project do not change the 1st and 2nd dimensions of the data: each takes (batchsize, length, dimM) as input and returns (batchsize, length, dimN).
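
A minimal sketch of these conventions, using standard PyTorch layers rather than the project's own modules. The dimensions (80 in, 64 out) and the up-sampling rate of 80 are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

batchsize, length, dim_in, dim_out = 1, 554, 80, 64
x = torch.randn(batchsize, length, dim_in)

# nn.Linear acts on the last dimension only, so batchsize and length are kept
linear = nn.Linear(dim_in, dim_out)
y = linear(x)
print(y.shape)  # torch.Size([1, 554, 64])

# nn.Conv1d expects (batchsize, dim, length), so permute before and after
# to stay in the (batchsize, length, dim) convention
conv = nn.Conv1d(dim_in, dim_out, kernel_size=3, padding=1)
z = conv(x.permute(0, 2, 1)).permute(0, 2, 1)
print(z.shape)  # torch.Size([1, 554, 64])

# A "special" layer: up-sampling by repeating each frame changes the length
up_rate = 80
u = torch.repeat_interleave(y, up_rate, dim=1)
print(u.shape)  # torch.Size([1, 44320, 64])
```

Wrapping layers that use a different layout (such as `nn.Conv1d`) with a pair of `permute` calls is one way to keep every module's input and output in (batchsize, length, dim).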