# Demonstrating Redundant Code in PyTorch's Video Classification Training Script

Script --> https://github.com/pytorch/vision/blob/master/references/video_classification/train.py

In the `__init__` of a `VideoClips` object, `compute_clips` is called on the last line.

In the `train.py` script, `compute_clips` is called a second time after initialising the training and testing datasets of class `Kinetics400` (which internally constructs a `VideoClips` object):

```python
143.    dataset = torchvision.datasets.Kinetics400(
144.        traindir,
145.        frames_per_clip=args.clip_len,
146.        step_between_clips=1,
147.        transform=transform_train
148.    )

153.    dataset.video_clips.compute_clips(args.clip_len, 1, frame_rate=15)

175.    dataset_test = torchvision.datasets.Kinetics400(
176.        valdir,
177.        frames_per_clip=args.clip_len,
178.        step_between_clips=1,
179.        transform=transform_test
180.    )

185.    dataset_test.video_clips.compute_clips(args.clip_len, 1, frame_rate=15)
```
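To make the pattern concrete, here is a minimal toy sketch. `ToyVideoClips` is a hypothetical stand-in written for illustration only and is not torchvision's implementation; it just records how often `compute_clips` runs and with which settings.

```python
class ToyVideoClips:
    """Hypothetical stand-in for illustration; NOT torchvision's VideoClips."""

    def __init__(self, frame_counts, clip_len=16, step=1, frame_rate=None):
        self.frame_counts = frame_counts
        self.calls = 0
        # Mirrors the pattern described above: the constructor ends by
        # computing the clips with whatever settings it was given.
        self.compute_clips(clip_len, step, frame_rate)

    def compute_clips(self, num_frames, step, frame_rate=None):
        # A real implementation would rebuild the clip index here;
        # the toy only counts calls and remembers the latest settings.
        self.calls += 1
        self.settings = (num_frames, step, frame_rate)


# Two-step pattern, as in train.py: construct, then recompute at 15 fps.
two_step = ToyVideoClips([32, 47])
two_step.compute_clips(16, 1, frame_rate=15)

# One-step pattern: pass the frame rate at construction.
one_step = ToyVideoClips([32, 47], frame_rate=15)

print(two_step.calls, one_step.calls)           # 2 1
print(two_step.settings == one_step.settings)   # True
```

Both objects end up with the same settings, but the two-step version does the clip computation twice.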

The code below demonstrates that the effect of lines `153.` and `185.` above can be achieved while constructing the dataset itself, by passing `frame_rate` to the constructor.

In [5]:
from pathlib import Path
from torchvision.datasets.kinetics import Kinetics400

In [3]:
base_dir = Path('/Users/rahulsomani/01_github_projects/video-classification/')
data_dir = base_dir/'data'

In [4]:
!tree {data_dir/'train'}

/Users/rahulsomani/01_github_projects/video-classification/data/train
├── class1
│   ├── c1-sample1.mp4
│   └── c1-sample2.mp4
└── class2
    ├── c2-sample1.mp4
    └── c2-sample2.mp4

2 directories, 4 files


`data_fps_none` mirrors the dataset constructed in `train.py` (no `frame_rate` is passed to the constructor), while `data_fps_15` passes `frame_rate=15` at construction, showing that the extra `compute_clips` call after construction is not needed.

In [20]:
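def get_data(frame_rate=None, root=data_dir/'train', frames_per_clip=16, extensions=('mp4',), step=1):
    # Keyword arguments make it explicit which constructor parameter each value maps to
    return Kinetics400(root, frames_per_clip, step_between_clips=step,
                       frame_rate=frame_rate, extensions=extensions)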

In [None]:
data_fps_none, data_fps_15 = get_data(), get_data(frame_rate=15)

In [31]:
[len(x) for x in data_fps_none.video_clips.clips]

[17, 32, 2, 0]

In [32]:
[len(x) for x in data_fps_15.video_clips.clips]

[4, 13, 0, 0]
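The drop in clip counts can be checked with some back-of-the-envelope arithmetic. This is only a sketch under stated assumptions: clips are sliding windows of 16 frames with step 1, resampling leaves `floor(n * target_fps / source_fps)` frames, and the source videos are 25 fps. The 25 fps figure is a guess that happens to reproduce the outputs; the notebook does not record the actual frame rate. The frame counts are inferred from the native-rate clip counts (clips + 15), and the fourth video is left out because its length cannot be recovered from a count of 0.

```python
import math


def num_clips(num_frames, clip_len=16, step=1):
    """Number of sliding windows of `clip_len` frames taken every `step` frames."""
    if num_frames < clip_len:
        return 0
    return (num_frames - clip_len) // step + 1


def resampled_frames(num_frames, source_fps, target_fps):
    """Frame count after resampling, ASSUMING floor(n * target / source)."""
    return math.floor(num_frames * (target_fps / source_fps))


# Inferred from the native-rate clip counts 17, 32, 2 (clips + clip_len - 1).
frame_counts = [32, 47, 17]
ASSUMED_SOURCE_FPS = 25  # assumption, not stated anywhere in the notebook

native = [num_clips(n) for n in frame_counts]
at_15 = [num_clips(resampled_frames(n, ASSUMED_SOURCE_FPS, 15)) for n in frame_counts]

print(native)  # [17, 32, 2]
print(at_15)   # [4, 13, 0]
```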

In [33]:
data_fps_none.video_clips.compute_clips(num_frames=16, step=1, frame_rate=15)
[len(x) for x in data_fps_none.video_clips.clips]

[4, 13, 0, 0]
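The eyeball comparison above can also be written as an explicit check. The sketch below uses a hypothetical helper, `clip_counts`, and fake stand-in objects (built with `SimpleNamespace`) so that it runs without any video files; with the real datasets you would pass `data_fps_none` (after the recompute) and `data_fps_15` instead.

```python
from types import SimpleNamespace


def clip_counts(dataset):
    """Per-video clip counts for any object exposing `.video_clips.clips`."""
    return [len(clips) for clips in dataset.video_clips.clips]


def fake_dataset(counts):
    # Stand-in for a real dataset: one placeholder list per video,
    # with as many entries as that video has clips.
    clips = [[None] * n for n in counts]
    return SimpleNamespace(video_clips=SimpleNamespace(clips=clips))


# Counts copied from the outputs above.
recomputed = fake_dataset([4, 13, 0, 0])    # construct, then compute_clips(..., frame_rate=15)
built_at_15 = fake_dataset([4, 13, 0, 0])   # frame_rate=15 passed at construction

assert clip_counts(recomputed) == clip_counts(built_at_15)
print(clip_counts(built_at_15))  # [4, 13, 0, 0]
```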