# SlowFast

*Author: FAIR PyTorchVideo*

**SlowFast networks pretrained on the Kinetics 400 dataset**


### Example Usage

#### Imports

Load the model:

In [1]:
import torch
# Choose the `slowfast_r50` model 
model = torch.hub.load('facebookresearch/pytorchvideo', 'slowfast_r50', pretrained=True)

Using cache found in C:\Users\jlee0/.cache\torch\hub\facebookresearch_pytorchvideo_main


Import remaining functions:

In [2]:
from typing import Dict
import json
import urllib
from torchvision.transforms import Compose, Lambda
from torchvision.transforms._transforms_video import (
    CenterCropVideo,
    NormalizeVideo,
)
from pytorchvideo.data.encoded_video import EncodedVideo
from pytorchvideo.transforms import (
    ApplyTransformToKey,
    ShortSideScale,
    UniformTemporalSubsample,
    UniformCropVideo
) 

  "The 'torchvision.transforms._functional_video' module is deprecated since 0.12 and will be removed in 0.14. "
  "The 'torchvision.transforms._transforms_video' module is deprecated since 0.12 and will be removed in 0.14. "


#### Setup

Set the model to eval mode and move to desired device.

In [3]:
# Set to GPU or CPU
device = "cpu"
model = model.eval()
model = model.to(device)

Download the id to label mapping for the Kinetics 400 dataset on which the torch hub models were trained. This will be used to get the category label names from the predicted class ids.

In [4]:
json_url = "https://dl.fbaipublicfiles.com/pyslowfast/dataset/class_names/kinetics_classnames.json"
json_filename = "kinetics_classnames.json"
try: urllib.URLopener().retrieve(json_url, json_filename)
except: urllib.request.urlretrieve(json_url, json_filename)

In [5]:
with open(json_filename, "r") as f:
    kinetics_classnames = json.load(f)

# Create an id to label name mapping
kinetics_id_to_classname = {}
for k, v in kinetics_classnames.items():
    kinetics_id_to_classname[v] = str(k).replace('"', "")

#### Define input transform

In [6]:
side_size = 256 #원하는 video size
mean = [0.45, 0.45, 0.45] #정규화를 위한 mean 정의
std = [0.225, 0.225, 0.225] #정규화를 위한 std 정의
crop_size = 256 #원하는 video size
num_frames = 32 #샘플링을 위한 frames 정의
sampling_rate = 2 #input clip의 길이를 정의하기 위해 사용
frames_per_second = 30 #영상의 기본 fps
slowfast_alpha = 4 #slow path와 fast path의 frames 비율을 정해주기 위해 사용
num_clips = 10
num_crops = 3

class PackPathway(torch.nn.Module):
    """
    Transform for converting video frames as a list of tensors. 
    """
    def __init__(self):
        super().__init__()
        
    def forward(self, frames: torch.Tensor):
        fast_pathway = frames
        # Perform temporal sampling from the fast pathway.
        slow_pathway = torch.index_select(
            frames,
            1,
            torch.linspace(
                0, frames.shape[1] - 1, frames.shape[1] // slowfast_alpha
            ).long(),
        )
        frame_list = [slow_pathway, fast_pathway]
        return frame_list

transform =  ApplyTransformToKey(
    key="video",
    transform=Compose(
        [
            UniformTemporalSubsample(num_frames),
            Lambda(lambda x: x/255.0),
            NormalizeVideo(mean, std), #비디오 정규화, 각 채널(=3)에 대해 적용된다
            ShortSideScale(
                size=side_size
            ),
            CenterCropVideo(crop_size),
            PackPathway()
        ]
    ),
)

# The duration of the input clip is also specific to the model.
clip_duration = (num_frames * sampling_rate)/frames_per_second #전체 video를 몇 초 clip으로 나눌지 선정

#### Run Inference

Download an example video.

In [35]:
url_link = "https://dl.fbaipublicfiles.com/pytorchvideo/projects/archery.mp4"
video_path = 'archery.mp4'
try: urllib.URLopener().retrieve(url_link, video_path)
except: urllib.request.urlretrieve(url_link, video_path)

Load the video and transform it to the input format required by the model.

In [8]:
# Select the duration of the clip to load by specifying the start and end duration
# The start_sec should correspond to where the action occurs in the video
start_sec = 0
end_sec = start_sec + clip_duration

# Initialize an EncodedVideo helper class and load the video
video = EncodedVideo.from_path(video_path)

# Load the desired clip
video_data = video.get_clip(start_sec=start_sec, end_sec=end_sec)

# Apply a transform to normalize the video input
video_data = transform(video_data)

# Move the inputs to the desired device
inputs = video_data["video"]
inputs = [i.to(device)[None, ...] for i in inputs]

In [9]:
print('slow pathway :',inputs[0].shape)
print('fast pathway :',inputs[1].shape)

slow pathway : torch.Size([1, 3, 8, 256, 256])
fast pathway : torch.Size([1, 3, 32, 256, 256])


#### Get Predictions

In [10]:
# Pass the input clip through the model
preds = model(inputs)

# Get the predicted classes
post_act = torch.nn.Softmax(dim=1)
preds = post_act(preds)
pred_classes = preds.topk(k=5).indices[0]

# Map the predicted classes to the label names
pred_class_names = [kinetics_id_to_classname[int(i)] for i in pred_classes]
print("Top 5 predicted labels: %s" % ", ".join(pred_class_names))

Top 5 predicted labels: archery, throwing axe, playing paintball, disc golfing, riding or walking with horse


### Model Description
SlowFast model architectures are based on [1] with pretrained weights using the 8x8 setting
on the Kinetics dataset. 

| arch | depth | frame length x sample rate | top 1 | top 5 | Flops (G) | Params (M) |
| --------------- | ----------- | ----------- | ----------- | ----------- | ----------- |  ----------- | ----------- |
| SlowFast | R50   | 8x8                        | 76.94 | 92.69 | 65.71     | 34.57      |
| SlowFast | R101  | 8x8                        | 77.90 | 93.27 | 127.20    | 62.83      |


### References
[1] Christoph Feichtenhofer et al, "SlowFast Networks for Video Recognition"
https://arxiv.org/pdf/1812.03982.pdf

### Predict for overlapping clips (archery)

하나의 영상에 대해 frames를 overlapping해 분류 작업을 수행하였다

In [11]:
num_frames = 30
sampling_rate = 2 
frames_per_second = 30

clip_duration = (num_frames * sampling_rate)/frames_per_second

In [12]:
video = EncodedVideo.from_path(video_path)

start_sec = 0

while True:
    if start_sec == 9:
        print('end of prediction')
        break;
    start_sec = start_sec + 1
    end_sec = start_sec + clip_duration

    video_data = video.get_clip(start_sec=start_sec, end_sec=end_sec)
    video_data = transform(video_data)
    inputs = video_data["video"]
    inputs = [i.to(device)[None, ...] for i in inputs]

    preds = model(inputs)

    post_act = torch.nn.Softmax(dim=1)
    preds = post_act(preds)
    pred_classes = preds.topk(k=1).indices[0]

    # Map the predicted classes to the label names
    pred_class_names = [kinetics_id_to_classname[int(i)] for i in pred_classes]
    print(start_sec,"sec","Top 1 predicted labels: %s" % ", ".join(pred_class_names))

1 sec Top 1 predicted labels: archery
2 sec Top 1 predicted labels: archery
3 sec Top 1 predicted labels: archery
4 sec Top 1 predicted labels: archery
5 sec Top 1 predicted labels: archery
6 sec Top 1 predicted labels: archery
7 sec Top 1 predicted labels: archery
8 sec Top 1 predicted labels: archery
9 sec Top 1 predicted labels: archery
end of prediction


### Test Slowfast with Custom data

Kinetics 400에 대해 pretrained 된 slowfast 모델이 UCF 101 dataset에 대해서도 어느정도 잘 동작함을 확인하였다.

In [13]:
# step1.opencv 라이브러리 불러오기
import cv2

# step2.영상 파일 열기
cap = cv2.VideoCapture('Basketball_1.avi')

# step3.영상의 가로, 세로 사이즈, 전체 프레임수, FPS 등을 출력
print('Frame width:', int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)))
print('Frame height:', int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
print('Frame count:', int(cap.get(cv2.CAP_PROP_FRAME_COUNT)))
print('FPS:', cap.get(cv2.CAP_PROP_FPS))

# step4.영상 닫고 모든창 종료
cap.release()

Frame width: 320
Frame height: 240
Frame count: 171
FPS: 29.97002997002997


In [14]:
start_sec = 0
end_sec = start_sec + clip_duration

# Initialize an EncodedVideo helper class and load the video
video = EncodedVideo.from_path('Basketball_1.mp4')

# Load the desired clip
video_data = video.get_clip(start_sec=start_sec, end_sec=end_sec)

# Apply a transform to normalize the video input
video_data = transform(video_data)

# Move the inputs to the desired device
inputs = video_data["video"]
inputs = [i.to(device)[None, ...] for i in inputs]

In [15]:
print('slow pathway :',inputs[0].shape)
print('fast pathway :',inputs[1].shape)

slow pathway : torch.Size([1, 3, 8, 256, 256])
fast pathway : torch.Size([1, 3, 32, 256, 256])


In [16]:
# Pass the input clip through the model
preds = model(inputs)

# Get the predicted classes
post_act = torch.nn.Softmax(dim=1)
preds = post_act(preds)
pred_classes = preds.topk(k=5).indices[0]

# Map the predicted classes to the label names
pred_class_names = [kinetics_id_to_classname[int(i)] for i in pred_classes]
print("Top 5 predicted labels: %s" % ", ".join(pred_class_names))

Top 5 predicted labels: shooting basketball, dunking basketball, playing basketball, dribbling basketball, passing American football (not in game)


### fine tuning with custom data

kinetics 400에 대해 pretrained 된 slowfast 모델을 custom data에 대해 fine tuning 하겠다

In [17]:
# Choose the `slowfast_r50` model 
model = torch.hub.load('facebookresearch/pytorchvideo', 'slowfast_r50', pretrained=True)

Using cache found in C:\Users\jlee0/.cache\torch\hub\facebookresearch_pytorchvideo_main


In [18]:
from torchinfo import summary

summary(model)

Layer (type:depth-idx)                                       Param #
Net                                                          --
├─ModuleList: 1-1                                            --
│    └─MultiPathWayWithFuse: 2-1                             --
│    │    └─ModuleList: 3-1                                  15,432
│    │    └─FuseFastToSlow: 3-2                              928
│    └─MultiPathWayWithFuse: 2-2                             --
│    │    └─ModuleList: 3-3                                  225,760
│    │    └─FuseFastToSlow: 3-4                              14,464
│    └─MultiPathWayWithFuse: 2-3                             --
│    │    └─ModuleList: 3-5                                  1,287,552
│    │    └─FuseFastToSlow: 3-6                              57,600
│    └─MultiPathWayWithFuse: 2-4                             --
│    │    └─ModuleList: 3-7                                  10,369,536
│    │    └─FuseFastToSlow: 3-8                              229,8

In [19]:
print(model)

Net(
  (blocks): ModuleList(
    (0): MultiPathWayWithFuse(
      (multipathway_blocks): ModuleList(
        (0): ResNetBasicStem(
          (conv): Conv3d(3, 64, kernel_size=(1, 7, 7), stride=(1, 2, 2), padding=(0, 3, 3), bias=False)
          (norm): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (activation): ReLU()
          (pool): MaxPool3d(kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=[0, 1, 1], dilation=1, ceil_mode=False)
        )
        (1): ResNetBasicStem(
          (conv): Conv3d(3, 8, kernel_size=(5, 7, 7), stride=(1, 2, 2), padding=(2, 3, 3), bias=False)
          (norm): BatchNorm3d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (activation): ReLU()
          (pool): MaxPool3d(kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=[0, 1, 1], dilation=1, ceil_mode=False)
        )
      )
      (multipathway_fusion): FuseFastToSlow(
        (conv_fast_to_slow): Conv3d(8, 16, kernel_size=(7, 1, 1), st

#### Modify fc layer

기존 slowfast fc layer(2304,400)을 이진 분류 모델에 맞게 (2304,2)로 수정한다

In [20]:
# freezing
for param in model.parameters():
    param.requires_grad = False

In [21]:
len(model.blocks)

7

In [22]:
model.blocks[6].proj

Linear(in_features=2304, out_features=400, bias=True)

In [23]:
import torch
import torch.nn as nn
import torch.nn.functional as F

classes = ('basketball', 'not-basketball') #basketball인지 아닌지 분류하는 이진 분류 문제

# fc layer 수정
fc_in_features = model.blocks[6].proj.in_features
model.blocks[6].proj = nn.Linear(fc_in_features, len(classes))
model = model.to(device)

In [24]:
print(model) #fc layer(out_features = 2)로 수정 완료

Net(
  (blocks): ModuleList(
    (0): MultiPathWayWithFuse(
      (multipathway_blocks): ModuleList(
        (0): ResNetBasicStem(
          (conv): Conv3d(3, 64, kernel_size=(1, 7, 7), stride=(1, 2, 2), padding=(0, 3, 3), bias=False)
          (norm): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (activation): ReLU()
          (pool): MaxPool3d(kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=[0, 1, 1], dilation=1, ceil_mode=False)
        )
        (1): ResNetBasicStem(
          (conv): Conv3d(3, 8, kernel_size=(5, 7, 7), stride=(1, 2, 2), padding=(2, 3, 3), bias=False)
          (norm): BatchNorm3d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (activation): ReLU()
          (pool): MaxPool3d(kernel_size=(1, 3, 3), stride=(1, 2, 2), padding=[0, 1, 1], dilation=1, ceil_mode=False)
        )
      )
      (multipathway_fusion): FuseFastToSlow(
        (conv_fast_to_slow): Conv3d(8, 16, kernel_size=(7, 1, 1), st

#### Create dataset

이진 분류를 위한 데이터 셋을 구축한다.

dataset은 UCF101의 basketball, boxing video로 이루어져 있으며, basketball 영상인지 아닌지를 분류하는 것을 목표로 한다.

In [25]:
cd dataset_prac/train

C:\Users\jlee0\Desktop\KYU\hanim ict\테트라부이&테트라인\추락감지 모델\dataset_prac\train


In [26]:
import pandas as pd

metadata = pd.read_csv('metadata_train.csv')

In [27]:
metadata

Unnamed: 0,video_name,preds,video_path
0,train_0.avi,1,C:/Users/jlee0/Desktop/KYU/hanim ict/테트라부이&테트라...
1,train_1.avi,1,C:/Users/jlee0/Desktop/KYU/hanim ict/테트라부이&테트라...
2,train_2.avi,1,C:/Users/jlee0/Desktop/KYU/hanim ict/테트라부이&테트라...
3,train_3.avi,1,C:/Users/jlee0/Desktop/KYU/hanim ict/테트라부이&테트라...
4,train_4.avi,1,C:/Users/jlee0/Desktop/KYU/hanim ict/테트라부이&테트라...
...,...,...,...
208,train_208.avi,0,C:/Users/jlee0/Desktop/KYU/hanim ict/테트라부이&테트라...
209,train_209.avi,0,C:/Users/jlee0/Desktop/KYU/hanim ict/테트라부이&테트라...
210,train_210.avi,0,C:/Users/jlee0/Desktop/KYU/hanim ict/테트라부이&테트라...
211,train_211.avi,0,C:/Users/jlee0/Desktop/KYU/hanim ict/테트라부이&테트라...


In [31]:
train = metadata

len(train)

213

In [28]:
side_size = 256 #원하는 video size
mean = [0.45, 0.45, 0.45] #정규화를 위한 mean 정의
std = [0.225, 0.225, 0.225] #정규화를 위한 std 정의
crop_size = 256 #원하는 video size
num_frames = 32 #샘플링을 위한 frames 정의
sampling_rate = 2 #input clip의 길이를 정의하기 위해 사용
frames_per_second = 30 #영상의 기본 fps
slowfast_alpha = 4 #slow path와 fast path의 frames 비율을 정해주기 위해 사용
num_clips = 10
num_crops = 3


class PackPathway(torch.nn.Module):
    """
    Transform for converting video frames as a list of tensors. 
    """
    def __init__(self):
        super().__init__()
        
    def forward(self, frames: torch.Tensor):
        fast_pathway = frames
        # Perform temporal sampling from the fast pathway.
        slow_pathway = torch.index_select(
            frames,
            1,
            torch.linspace(
                0, frames.shape[1] - 1, frames.shape[1] // slowfast_alpha
            ).long(),
        )
        frame_list = [slow_pathway, fast_pathway]
        return frame_list

    
transform =  ApplyTransformToKey(
    key="video",
    transform=Compose(
        [
            UniformTemporalSubsample(num_frames),
            Lambda(lambda x: x/255.0),
            NormalizeVideo(mean, std), #비디오 정규화, 각 채널(=3)에 대해 적용된다
            ShortSideScale(
                size=side_size
            ),
            CenterCropVideo(crop_size),
            PackPathway()
        ]
    ),
)

In [42]:
from torch.utils.data import Dataset, DataLoader

class CustomDataset(Dataset):
    def __init__(self,metadata,transform=None):
        self.metadata = metadata
        
        self.video_path_list = metadata['video_path']
        self.video_class_list = metadata['preds']
        
        self.transform = transform
        
    def __len__(self):
        return len(metadata)
    
    def __getitem__(self,idx):
        video_path = self.video_path_list[idx]
        video_class = self.video_class_list[idx]
        
        video = EncodedVideo.from_path(video_path)
        
        if self.transform is not None:
            video_data = video.get_clip(start_sec=1, end_sec=2)

            video_data = transform(video_data)
            
            inputs = video_data["video"]
            inputs = [i.to(device)[None, ...] for i in inputs]
            
        return inputs, video_class

In [65]:
dataset = CustomDataset(metadata=metadata,transform=transform)
trainloader = DataLoader(dataset=dataset,
                        batch_size=10,
                        shuffle=True,
                        drop_last=False)

### Training model

구축된 데이터 셋을 통해 UCF101에 대한 Slowfast 모델을 fine-tunning 한다.

In [66]:
from torch import optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001,
                      momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)

In [116]:
from tqdm import tqdm

def train(epoch, model, criterion, optimizer):
    model.train()
    train_loss = 0
    correct = 0
    total = 0
    for batch_idx, (videos, labels) in enumerate(tqdm(trainloader)):
        fuse = []
        for _, X in enumerate(videos):
            X = X.reshape((-1,) + X.shape[2:])
            fuse.append(X)
        
        optimizer.zero_grad()
        
        preds = model(fuse)
        post_act = torch.nn.Softmax(dim=1)
        preds = post_act(preds)
        #outputs = torch.argmax(preds,dim = 1)
        
        loss = criterion(preds, labels)
        loss.backward()
        optimizer.step()

        train_loss += loss.item()*videos[0].shape[0]
        _, predicted = preds.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()
    
    epoch_loss = train_loss/total
    epoch_acc = correct/total*100
    print("Train | Loss:%.4f Acc: %.2f%% (%s/%s)" 
        % (epoch_loss, epoch_acc, correct, total))
    return epoch_loss, epoch_acc

In [120]:
import time
import copy

start_time = time.time()
best_acc = 0
epoch_length = 2
save_loss = {"train":[]}
save_acc = {"train":[]}
for epoch in range(epoch_length):
    print("Epoch %s" % epoch)
    train_loss, train_acc = train(epoch, model, criterion, optimizer)
    save_loss['train'].append(train_loss)

    scheduler.step()

    # Save model
    if train_acc > best_acc:
        best_acc = train_acc
        best_model_wts = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_model_wts)
    
learning_time = time.time() - start_time
print(f'**Learning time: {learning_time // 60:.0f}m {learning_time % 60:.0f}s')


  0%|          | 0/22 [00:00<?, ?it/s][A

Epoch 0



  5%|▍         | 1/22 [00:25<08:50, 25.28s/it][A
  9%|▉         | 2/22 [00:50<08:26, 25.32s/it][A
 14%|█▎        | 3/22 [01:15<07:58, 25.17s/it][A
 18%|█▊        | 4/22 [01:40<07:31, 25.09s/it][A
 23%|██▎       | 5/22 [02:05<07:05, 25.01s/it][A
 27%|██▋       | 6/22 [02:30<06:40, 25.03s/it][A
 32%|███▏      | 7/22 [02:55<06:15, 25.05s/it][A
 36%|███▋      | 8/22 [03:20<05:50, 25.02s/it][A
 41%|████      | 9/22 [03:45<05:26, 25.08s/it][A
 45%|████▌     | 10/22 [04:10<05:02, 25.17s/it][A
 50%|█████     | 11/22 [04:35<04:35, 25.08s/it][A
 55%|█████▍    | 12/22 [05:00<04:10, 25.05s/it][A
 59%|█████▉    | 13/22 [05:27<03:49, 25.48s/it][A
 64%|██████▎   | 14/22 [05:54<03:29, 26.15s/it][A
 68%|██████▊   | 15/22 [06:23<03:07, 26.81s/it][A
 73%|███████▎  | 16/22 [06:49<02:40, 26.67s/it][A
 77%|███████▋  | 17/22 [07:15<02:12, 26.52s/it][A
 82%|████████▏ | 18/22 [07:41<01:45, 26.33s/it][A
 86%|████████▋ | 19/22 [08:07<01:18, 26.16s/it][A
 91%|█████████ | 20/22 [08:33<00:52, 26

Train | Loss:0.3684 Acc: 100.00% (213/213)



  0%|          | 0/22 [00:00<?, ?it/s][A

Epoch 1



  5%|▍         | 1/22 [00:26<09:26, 26.96s/it][A
  9%|▉         | 2/22 [00:52<08:53, 26.68s/it][A
 14%|█▎        | 3/22 [01:18<08:22, 26.45s/it][A
 18%|█▊        | 4/22 [01:45<07:59, 26.64s/it][A
 23%|██▎       | 5/22 [02:11<07:27, 26.30s/it][A
 27%|██▋       | 6/22 [02:37<06:57, 26.08s/it][A
 32%|███▏      | 7/22 [03:02<06:28, 25.92s/it][A
 36%|███▋      | 8/22 [03:29<06:05, 26.08s/it][A
 41%|████      | 9/22 [03:54<05:38, 26.01s/it][A
 45%|████▌     | 10/22 [04:20<05:10, 25.87s/it][A
 50%|█████     | 11/22 [04:46<04:45, 25.95s/it][A
 55%|█████▍    | 12/22 [05:11<04:17, 25.73s/it][A
 59%|█████▉    | 13/22 [05:37<03:51, 25.69s/it][A
 64%|██████▎   | 14/22 [06:03<03:25, 25.66s/it][A
 68%|██████▊   | 15/22 [06:29<03:00, 25.77s/it][A
 73%|███████▎  | 16/22 [06:56<02:36, 26.14s/it][A
 77%|███████▋  | 17/22 [07:22<02:10, 26.10s/it][A
 82%|████████▏ | 18/22 [07:47<01:43, 25.85s/it][A
 86%|████████▋ | 19/22 [08:13<01:17, 25.82s/it][A
 91%|█████████ | 20/22 [08:39<00:51, 25

Train | Loss:0.3576 Acc: 100.00% (213/213)
**Learning time: 18m 21s


In [121]:
# 모델의 state_dict 출력
print("Model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, "\t", model.state_dict()[param_tensor].size())

print()

# 옵티마이저의 state_dict 출력
print("Optimizer's state_dict:")
for var_name in optimizer.state_dict():
    print(var_name, "\t", optimizer.state_dict()[var_name])

Model's state_dict:
blocks.0.multipathway_blocks.0.conv.weight 	 torch.Size([64, 3, 1, 7, 7])
blocks.0.multipathway_blocks.0.norm.weight 	 torch.Size([64])
blocks.0.multipathway_blocks.0.norm.bias 	 torch.Size([64])
blocks.0.multipathway_blocks.0.norm.running_mean 	 torch.Size([64])
blocks.0.multipathway_blocks.0.norm.running_var 	 torch.Size([64])
blocks.0.multipathway_blocks.0.norm.num_batches_tracked 	 torch.Size([])
blocks.0.multipathway_blocks.1.conv.weight 	 torch.Size([8, 3, 5, 7, 7])
blocks.0.multipathway_blocks.1.norm.weight 	 torch.Size([8])
blocks.0.multipathway_blocks.1.norm.bias 	 torch.Size([8])
blocks.0.multipathway_blocks.1.norm.running_mean 	 torch.Size([8])
blocks.0.multipathway_blocks.1.norm.running_var 	 torch.Size([8])
blocks.0.multipathway_blocks.1.norm.num_batches_tracked 	 torch.Size([])
blocks.0.multipathway_fusion.conv_fast_to_slow.weight 	 torch.Size([16, 8, 7, 1, 1])
blocks.0.multipathway_fusion.norm.weight 	 torch.Size([16])
blocks.0.multipathway_fusion.nor

blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_a.weight 	 torch.Size([8])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_a.bias 	 torch.Size([8])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_a.running_mean 	 torch.Size([8])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_a.running_var 	 torch.Size([8])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_a.num_batches_tracked 	 torch.Size([])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.conv_b.weight 	 torch.Size([8, 8, 1, 3, 3])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_b.weight 	 torch.Size([8])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_b.bias 	 torch.Size([8])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_b.running_mean 	 torch.Size([8])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_b.running_var 	 torch.Size([8])
blocks.1.multipathway_blocks.1.res_blocks.1.branch2.norm_b.num_batches_tracked 	 torch.Size([])
blocks.1.multipathway_

blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_b.weight 	 torch.Size([128])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_b.bias 	 torch.Size([128])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_b.running_mean 	 torch.Size([128])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_b.running_var 	 torch.Size([128])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_b.num_batches_tracked 	 torch.Size([])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.conv_c.weight 	 torch.Size([512, 128, 1, 1, 1])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_c.weight 	 torch.Size([512])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_c.bias 	 torch.Size([512])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_c.running_mean 	 torch.Size([512])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_c.running_var 	 torch.Size([512])
blocks.2.multipathway_blocks.0.res_blocks.3.branch2.norm_c.num_batches_tracked 	 torch.Size([])
bl

blocks.3.multipathway_blocks.0.res_blocks.0.branch2.norm_b.bias 	 torch.Size([256])
blocks.3.multipathway_blocks.0.res_blocks.0.branch2.norm_b.running_mean 	 torch.Size([256])
blocks.3.multipathway_blocks.0.res_blocks.0.branch2.norm_b.running_var 	 torch.Size([256])
blocks.3.multipathway_blocks.0.res_blocks.0.branch2.norm_b.num_batches_tracked 	 torch.Size([])
blocks.3.multipathway_blocks.0.res_blocks.0.branch2.conv_c.weight 	 torch.Size([1024, 256, 1, 1, 1])
blocks.3.multipathway_blocks.0.res_blocks.0.branch2.norm_c.weight 	 torch.Size([1024])
blocks.3.multipathway_blocks.0.res_blocks.0.branch2.norm_c.bias 	 torch.Size([1024])
blocks.3.multipathway_blocks.0.res_blocks.0.branch2.norm_c.running_mean 	 torch.Size([1024])
blocks.3.multipathway_blocks.0.res_blocks.0.branch2.norm_c.running_var 	 torch.Size([1024])
blocks.3.multipathway_blocks.0.res_blocks.0.branch2.norm_c.num_batches_tracked 	 torch.Size([])
blocks.3.multipathway_blocks.0.res_blocks.1.branch2.conv_a.weight 	 torch.Size([256

blocks.3.multipathway_blocks.1.res_blocks.0.branch2.conv_b.weight 	 torch.Size([32, 32, 1, 3, 3])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.norm_b.weight 	 torch.Size([32])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.norm_b.bias 	 torch.Size([32])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.norm_b.running_mean 	 torch.Size([32])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.norm_b.running_var 	 torch.Size([32])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.norm_b.num_batches_tracked 	 torch.Size([])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.conv_c.weight 	 torch.Size([128, 32, 1, 1, 1])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.norm_c.weight 	 torch.Size([128])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.norm_c.bias 	 torch.Size([128])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.norm_c.running_mean 	 torch.Size([128])
blocks.3.multipathway_blocks.1.res_blocks.0.branch2.norm_c.running_var 	 torch.Size([128])
block

blocks.4.multipathway_blocks.0.res_blocks.0.branch1_norm.num_batches_tracked 	 torch.Size([])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.conv_a.weight 	 torch.Size([512, 1280, 3, 1, 1])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.norm_a.weight 	 torch.Size([512])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.norm_a.bias 	 torch.Size([512])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.norm_a.running_mean 	 torch.Size([512])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.norm_a.running_var 	 torch.Size([512])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.norm_a.num_batches_tracked 	 torch.Size([])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.conv_b.weight 	 torch.Size([512, 512, 1, 3, 3])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.norm_b.weight 	 torch.Size([512])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.norm_b.bias 	 torch.Size([512])
blocks.4.multipathway_blocks.0.res_blocks.0.branch2.norm_b.running_mean 	 torch.Size([

In [131]:
#파일 위치 설정할 것
torch.save(model.state_dict(),'model_prac')

In [132]:
#파일 위치 설정할 것
torch.save(model,'wholemodel_prac')

### load model test

학습된 모델이 잘 작동하는지 확인한다.

In [None]:
#파일 위치 설정할 것
loadmodel = torch.load('wholemodel_prac')

In [144]:
dataset = CustomDataset(metadata=metadata,transform=transform)

In [191]:
for i in range(1,200,10):
    video,label = dataset[i]
    
    prediction = loadmodel(video)
    post_act = torch.nn.Softmax(dim=1)
    prediction = torch.argmax(post_act(prediction))
    
    if prediction == 1:
        pred_label_name = 'Basketball'
    else:
        pred_label_name = 'Boxing'
    
    if label == 1:
        label_name = 'Basketball'
    else:
        label_name = 'Boxing'
    
    if prediction == label:
        print('prediction :',pred_label_name,'label :',label_name,',',i,'th prediction is correct')
    else:
        print('prediction :',pred_label_name,'label :',label_name,',',i,'th prediction is wrong')

prediction : Basketball label : Basketball , 1 th prediction is correct
prediction : Basketball label : Basketball , 11 th prediction is correct
prediction : Basketball label : Basketball , 21 th prediction is correct
prediction : Basketball label : Basketball , 31 th prediction is correct
prediction : Basketball label : Basketball , 41 th prediction is correct
prediction : Basketball label : Basketball , 51 th prediction is correct
prediction : Basketball label : Basketball , 61 th prediction is correct
prediction : Basketball label : Basketball , 71 th prediction is correct
prediction : Basketball label : Basketball , 81 th prediction is correct
prediction : Basketball label : Basketball , 91 th prediction is correct
prediction : Boxing label : Boxing , 101 th prediction is correct
prediction : Boxing label : Boxing , 111 th prediction is correct
prediction : Boxing label : Boxing , 121 th prediction is correct
prediction : Boxing label : Boxing , 131 th prediction is correct
predict