<a href="https://colab.research.google.com/github/kazuhiro1999/Automatic-Evaluation-of-Dance-Movements/blob/main/inference.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Inference Notebook

This notebook demonstrates inference on my dataset using a custom feature extraction and data processing pipeline. It covers:

1. **Loading the Dataset**: The notebook assumes that the dataset is preprocessed and ready for inference. A program to create the dataset using MediaPipe is in development and will be made publicly available in the future.
2. **Temporary Code Updates**: The `DataLoader` and `FeatureExtractor` implementations in the repository are outdated, so updated versions of both are defined inside this notebook for demonstration purposes.
3. **Inference Workflow**: The notebook showcases the end-to-end inference process, including feature extraction and predictions using the pre-trained model.

Please note that this notebook is designed specifically for my dataset and may require modifications to work with other datasets or configurations.

In [1]:
# clone repository
!git clone https://github.com/kazuhiro1999/Automatic-Evaluation-of-Dance-Movements.git
%cd Automatic-Evaluation-of-Dance-Movements

Cloning into 'Automatic-Evaluation-of-Dance-Movements'...
remote: Enumerating objects: 168, done.
remote: Counting objects: 100% (168/168), done.
remote: Compressing objects: 100% (92/92), done.
remote: Total 168 (delta 77), reused 152 (delta 68), pack-reused 0 (from 0)
Receiving objects: 100% (168/168), 6.04 MiB | 18.91 MiB/s, done.
Resolving deltas: 100% (77/77), done.
/content/Automatic-Evaluation-of-Dance-Movements


In [2]:
!pip install onnxruntime==1.17.1

Collecting onnxruntime==1.17.1
  Downloading onnxruntime-1.17.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (4.3 kB)
Collecting coloredlogs (from onnxruntime==1.17.1)
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime==1.17.1)
  Downloading humanfriendly-10.0-py2.py3-none-any.whl.metadata (9.2 kB)
Downloading onnxruntime-1.17.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.8 MB)
Downloading coloredlogs-15.0.1-py2.py3-none-any.whl (46 kB)
Downloading humanfriendly-10.0-py2.py3-none-any.whl (86 kB)
Inst

In [3]:
import gdown
import numpy as np
import onnxruntime

%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib import animation, rc
from mpl_toolkits.mplot3d import Axes3D

In [4]:
# Download Dataset
%mkdir data

gdown.download('https://drive.google.com/uc?id=1HhCZ1SrpI4E-5IE2cGAoJcP8LcyuVlyJ', './data/annotations.csv')
gdown.download('https://drive.google.com/uc?id=164qa0uFIc4iX0WORmxvL3AZavrq2VY96', './data/keypoints.zip')

!unzip ./data/keypoints.zip -d data
!rm ./data/keypoints.zip

Downloading...
From: https://drive.google.com/uc?id=1HhCZ1SrpI4E-5IE2cGAoJcP8LcyuVlyJ
To: /content/Automatic-Evaluation-of-Dance-Movements/data/annotations.csv
100%|██████████| 3.04k/3.04k [00:00<00:00, 781kB/s]
Downloading...
From (original): https://drive.google.com/uc?id=164qa0uFIc4iX0WORmxvL3AZavrq2VY96
From (redirected): https://drive.google.com/uc?id=164qa0uFIc4iX0WORmxvL3AZavrq2VY96&confirm=t&uuid=d455c2e2-9ecf-4318-9a1e-3863f454b11f
To: /content/Automatic-Evaluation-of-Dance-Movements/data/keypoints.zip
100%|██████████| 234M/234M [00:06<00:00, 34.4MB/s]


Archive:  ./data/keypoints.zip
  inflating: data/keypoints2024/d0_20221213_1.pkl  
  inflating: data/keypoints2024/d0_20221213_2.pkl  
  inflating: data/keypoints2024/d0_20230528_1.pkl  
  inflating: data/keypoints2024/d0_20230528_2.pkl  
  inflating: data/keypoints2024/d1_20221213_1.pkl  
  inflating: data/keypoints2024/d1_20221213_2.pkl  
  inflating: data/keypoints2024/d11_20230528_1.pkl  
  inflating: data/keypoints2024/d11_20230528_2.pkl  
  inflating: data/keypoints2024/d12_20230528_1.pkl  
  inflating: data/keypoints2024/d12_20230528_2.pkl  
  inflating: data/keypoints2024/d13_20230528_1.pkl  
  inflating: data/keypoints2024/d13_20230528_2.pkl  
  inflating: data/keypoints2024/d14_20230528_1.pkl  
  inflating: data/keypoints2024/d14_20230528_2.pkl  
  inflating: data/keypoints2024/d15_20230528_1.pkl  
  inflating: data/keypoints2024/d15_20230528_2.pkl  
  inflating: data/keypoints2024/d16_20230528_1.pkl  
  inflating: data/keypoints2024/d16_20230528_2.pkl  
  inflating: data/key

In [5]:
# Download pretrained model (onnx)
# These models are available from the GitHub release v1.0
%mkdir onnx

gdown.download('https://github.com/kazuhiro1999/Automatic-Evaluation-of-Dance-Movements/releases/download/v1.0/autoencoder_20241120.onnx', './onnx/autoencoder.onnx')
gdown.download('https://github.com/kazuhiro1999/Automatic-Evaluation-of-Dance-Movements/releases/download/v1.0/encoder_triplet_euclidean_20241120.onnx', './onnx/encoder_triplet_euclidean.onnx')
gdown.download('https://github.com/kazuhiro1999/Automatic-Evaluation-of-Dance-Movements/releases/download/v1.0/reference_model_dynamics_20241120.onnx', './onnx/reference_model_dynamics.onnx')

Downloading...
From: https://github.com/kazuhiro1999/Automatic-Evaluation-of-Dance-Movements/releases/download/v1.0/autoencoder_20241120.onnx
To: /content/Automatic-Evaluation-of-Dance-Movements/onnx/autoencoder.onnx
100%|██████████| 21.0M/21.0M [00:00<00:00, 130MB/s] 
Downloading...
From: https://github.com/kazuhiro1999/Automatic-Evaluation-of-Dance-Movements/releases/download/v1.0/encoder_triplet_euclidean_20241120.onnx
To: /content/Automatic-Evaluation-of-Dance-Movements/onnx/encoder_triplet_euclidean.onnx
100%|██████████| 36.8M/36.8M [00:00<00:00, 88.5MB/s]
Downloading...
From: https://github.com/kazuhiro1999/Automatic-Evaluation-of-Dance-Movements/releases/download/v1.0/reference_model_dynamics_20241120.onnx
To: /content/Automatic-Evaluation-of-Dance-Movements/onnx/reference_model_dynamics.onnx
100%|██████████| 572k/572k [00:00<00:00, 12.2MB/s]


'./onnx/reference_model_dynamics.onnx'

## Dataset

In [6]:
import pandas as pd
import pickle
import numpy as np

class DataLoader:

    def __init__(self, path, root_dir="./data/keypoints"):

        self.data = pd.read_csv(path)
        self.items = ['Dynamics', 'Sharpness', 'Scalability', 'Timing', 'Accuracy', 'Stability']
        self.keypoints = {}
        self.rotations = {}
        self.scores = {}
        self.standardized_scores = {}

        valid_keys = []

        for _, row in self.data.iterrows():
            key = row['ID']
            keypoints_path = row['DataPath']
            try:
                with open(f"{root_dir}/{keypoints_path}", 'rb') as p:
                    keypoints = pickle.load(p)
                self.keypoints[key] = keypoints['position18']
                self.rotations[key] = keypoints['rotation18']
                valid_keys.append(key)
            except Exception as e:
                print(f"couldn't load data from {root_dir}/{keypoints_path}: {e}")

        self.data = self.data[self.data['ID'].isin(valid_keys)].reset_index(drop=True)
        self.keys = valid_keys

        scores_np = self.data[self.items].to_numpy()
        mask = np.any(scores_np > 0, axis=-1)
        valid_scores = scores_np[mask]
        means = valid_scores.mean(axis=0)
        stds = valid_scores.std(axis=0)

        scores_np[self.data['IsReference'].tolist()] = 10
        standardized_scores_np = np.where(scores_np>0, (scores_np - means) / stds, -1)

        for i, key in enumerate(self.keys):
            self.scores[key] = {}
            self.standardized_scores[key] = {}
            for j, item in enumerate(self.items):
                self.scores[key][item] = scores_np[i,j]
                self.standardized_scores[key][item] = standardized_scores_np[i,j]

        return

    def load_keypoints3d(self, key, start_frame=0, end_frame=None):
        # end_frame=None slices through the final frame of the sequence
        return self.keypoints[key][start_frame:end_frame]

    def load_rotations(self, key, start_frame=0, end_frame=None):
        return self.rotations[key][start_frame:end_frame]

    def load_score(self, key, item, standard=False):
        if standard:
            return self.standardized_scores[key][item]
        else:
            return self.scores[key][item]
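
The score standardization inside `DataLoader.__init__` can be sketched in isolation as follows. The score values below are hypothetical, and the sketch leaves out the step that sets reference recordings to 10. Note that `-1` doubles as the "not annotated" marker, so a standardized value that happens to equal `-1` would be indistinguishable from a missing one.

```python
import numpy as np

# Hypothetical scores: rows are recordings, columns are two evaluation items.
# -1 marks a recording without annotations, as in annotations.csv.
scores = np.array([
    [4.0, 5.0],
    [-1.0, -1.0],
    [2.0, 3.0],
    [3.0, 4.0],
])

# Rows with at least one positive score count as annotated.
annotated = np.any(scores > 0, axis=-1)
means = scores[annotated].mean(axis=0)
stds = scores[annotated].std(axis=0)

# Standardize positive entries; everything else keeps the -1 marker.
standardized = np.where(scores > 0, (scores - means) / stds, -1)
print(standardized)
```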

In [7]:
path = 'data/annotations.csv'
root_dir = 'data/keypoints2024'
dataloader = DataLoader(path, root_dir)

couldn't load data from data/keypoints2024/d4_20221213_1.pkl: [Errno 2] No such file or directory: 'data/keypoints2024/d4_20221213_1.pkl'
couldn't load data from data/keypoints2024/d4_20221213_2.pkl: [Errno 2] No such file or directory: 'data/keypoints2024/d4_20221213_2.pkl'
couldn't load data from data/keypoints2024/d10_20230528_1.pkl: [Errno 2] No such file or directory: 'data/keypoints2024/d10_20230528_1.pkl'
couldn't load data from data/keypoints2024/d10_20230528_2.pkl: [Errno 2] No such file or directory: 'data/keypoints2024/d10_20230528_2.pkl'


In [8]:
dataloader.data

| Unnamed: 0 | ID | Dancer | Date | DataPath | Grade | IsReference | Annotated | Dynamics | Sharpness | Scalability | Timing | Accuracy | Stability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20221201 | d0 | 20221213 | d0_20221213_1.pkl | 0 | True | False | -1 | -1 | -1 | -1 | -1 | -1 |
| 1 | 20221202 | d0 | 20221213 | d0_20221213_2.pkl | 0 | True | False | -1 | -1 | -1 | -1 | -1 | -1 |
| 2 | 20230501 | d0 | 20230528 | d0_20230528_1.pkl | 0 | True | False | -1 | -1 | -1 | -1 | -1 | -1 |
| 3 | 20230502 | d0 | 20230528 | d0_20230528_2.pkl | 0 | True | False | -1 | -1 | -1 | -1 | -1 | -1 |
| 4 | 20221203 | d1 | 20221213 | d1_20221213_1.pkl | 1 | False | True | 4 | 5 | 4 | 4 | 3 | 4 |
| 5 | 20221204 | d1 | 20221213 | d1_20221213_2.pkl | 1 | False | False | -1 | -1 | -1 | -1 | -1 | -1 |
| 6 | 20221205 | d2 | 20221213 | d2_20221213_1.pkl | 1 | False | True | 5 | 5 | 5 | 5 | 4 | 4 |
| 7 | 20221206 | d2 | 20221213 | d2_20221213_2.pkl | 1 | False | False | -1 | -1 | -1 | -1 | -1 | -1 |
| 8 | 20221207 | d3 | 20221213 | d3_20221213_1.pkl | 1 | False | True | 3 | 3 | 3 | 1 | 2 | 1 |
| 9 | 20221208 | d3 | 20221213 | d3_20221213_2.pkl | 1 | False | False | -1 | -1 | -1 | -1 | -1 | -1 |


In [9]:
import numpy as np
from scipy import stats

class FeatureExtractor:
    """
    A feature extraction class for processing dance motion data.

    This class supports various feature extraction options including:
    - Position
    - Rotation
    - Velocity
    """

    INPUT_LENGTH = 89  # currently model input is 89 frames @ 60 fps
    NUM_JOINTS = 18
    JOINTS = {
        'hips': 0, 'chest': 1, 'neck': 2, 'head': 3,
        'left_upperarm': 4, 'left_lowerarm': 5, 'left_hand': 6,
        'right_upperarm': 7, 'right_lowerarm': 8, 'right_hand': 9,
        'left_upperleg': 10, 'left_lowerleg': 11, 'left_foot': 12, 'left_toe': 13,
        'right_upperleg': 14, 'right_lowerleg': 15, 'right_foot': 16, 'right_toe': 17
    }

    def __init__(self, config=None):
        """
        Initialize feature extractor with configuration.

        :param config: Dictionary of configuration options
        """
        default_config = {
            'use_position': True,
            'use_rotation': True,
            'use_velocity': True,
            'normalize': False,
            'use_root': True,
            'output_type': 'default'
        }
        self.config = {**default_config, **(config or {})}

    def apply(self, positions, rotations):
        """
        Extract features from positions and rotations and split into batches.

        :param positions: 3D keypoint positions, shape (n_frames, n_joints, 3)
        :param rotations: Rotation data, shape (n_frames, n_joints, 3)
        :return: Extracted features with shape based on configuration
        """
        n_frames, n_joints, _ = positions.shape
        positions = positions.copy()
        rotations = rotations.copy()
        features = []

        # Normalize positions if configured
        if self.config['normalize']:
            positions = self._normalize_positions(positions)

        # Position and relative positioning
        if self.config['use_position'] or self.config['use_velocity']:
            position = self._process_position(positions)
            if self.config['use_position']:
                features.append(position)

        # Rotation features
        if self.config['use_rotation']:
            features.append(rotations)

        # Velocity features (computed from the positions, after optional normalization)
        if self.config['use_velocity']:
            features.append(self._calculate_velocity(positions))

        # Concatenate features
        features = np.concatenate(features, axis=-1, dtype=np.float32)  # shape (n_frames, n_joints, n_features)

        # Reshape into batches of INPUT_LENGTH
        n_features = features.shape[-1]
        n_batches = n_frames // self.INPUT_LENGTH
        truncated_length = n_batches * self.INPUT_LENGTH
        features = features[:truncated_length].reshape(n_batches, self.INPUT_LENGTH, n_joints, n_features)

        # Format output based on config
        if self.config['output_type'] != "graph":
            # Flatten joints and features: (n_batches, INPUT_LENGTH, n_joints * n_features)
            features = features.reshape(n_batches, self.INPUT_LENGTH, n_joints * n_features)
        # "graph" output keeps the per-joint layout: (n_batches, INPUT_LENGTH, n_joints, n_features)

        return features

    def _normalize_positions(self, positions):
        """Normalize positions based on height"""
        y_min = positions[:,:,1].min()
        positions[:,:,1] = positions[:,:,1] - y_min
        y_max = positions[:,3,1].mean()  # Head reference point
        scale_factor = 1 / y_max
        return positions * scale_factor

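    def _process_position(self, positions):
        """Express joints relative to the root joint (hips, index 0).

        With use_root the root keeps its absolute position; otherwise it is
        subtracted from itself as well and becomes zero.
        """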
        position = positions.copy()
        if self.config['use_root']:
            position[:,1:] -= position[:,:1]
        else:
            position -= position[:,:1]
        return position

    def _calculate_velocity(self, positions):
        """Central difference between neighbouring frames, with the first and
        last frames repeated as padding (not divided by the time step)"""
        array_pad = np.concatenate([positions[:1], positions, positions[-1:]])
        diff = array_pad[2:] - array_pad[:-2]
        return diff
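
To make the behaviour of `_calculate_velocity` concrete, here is a small standalone sketch of the same padded central difference on a hypothetical one-joint trajectory. `central_difference` is an illustrative name, not a function from the repository.

```python
import numpy as np

def central_difference(positions):
    """Unscaled central difference with the edge frames repeated as padding."""
    padded = np.concatenate([positions[:1], positions, positions[-1:]])
    return padded[2:] - padded[:-2]

# Hypothetical trajectory: one joint moving 1 unit per frame along x.
positions = np.zeros((5, 1, 3))
positions[:, 0, 0] = np.arange(5)

velocity = central_difference(positions)
print(velocity[:, 0, 0])  # → [1. 2. 2. 2. 1.]
```

Interior frames span two frame intervals, so their values are twice the per-frame displacement, while the first and last frames reduce to a one-sided difference. If physical units were needed, the interior values would have to be scaled by `fps / 2`; the notebook feeds the unscaled difference to the model.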

In [10]:
# Feature Extractor for triplet_encoder
feature_cfg_1 = {
    'use_position':True,
    'use_rotation':True,
    'use_velocity':True,
    'normalize':False,
    'use_root':True,
    'output_type':'default'
}

feature_extractor_1 = FeatureExtractor(feature_cfg_1)

In [11]:
# Feature extractor for autoencoder

feature_cfg_2 = {
    'use_position':True,
    'use_rotation':False,
    'use_velocity':False,
    'normalize':False,
    'use_root':False,
    'output_type':'graph'
}

feature_extractor_2 = FeatureExtractor(feature_cfg_2)

In [12]:
# preprocess data for encoder
def extract_features(positions, rotations):
    features1 = feature_extractor_1.apply(positions, rotations)
    features2 = feature_extractor_2.apply(positions, rotations)
    return features1, features2

## Load model

In [13]:
import onnxruntime

# load models as onnx inference sessions
# providers are tried in order; onnxruntime falls back to CPU when CUDA is unavailable
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
autoencoder_sess = onnxruntime.InferenceSession('onnx/autoencoder.onnx', providers=providers)
triplet_encoder_sess = onnxruntime.InferenceSession('onnx/encoder_triplet_euclidean.onnx', providers=providers)
evaluation_sess = onnxruntime.InferenceSession('onnx/reference_model_dynamics.onnx', providers=providers)



In [14]:
def run_encoder(session, features):
    inputs = {
        session.get_inputs()[0].name: features,
    }
    outputs = [output.name for output in session.get_outputs()]

    out = session.run(outputs, inputs)
    return out[0]

In [15]:
def run_evaluation(session, student_f1, student_f2, reference_f1, reference_f2):
    inputs = {
        session.get_inputs()[0].name: student_f1,
        session.get_inputs()[1].name: student_f2,
        session.get_inputs()[2].name: reference_f1,
        session.get_inputs()[3].name: reference_f2
    }
    outputs = [output.name for output in session.get_outputs()]

    out = session.run(outputs, inputs)
    return out[0]
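
When adapting the notebook to other models, it can help to check what each session expects before building the input dictionary. The following is a minimal sketch: `describe_io` is a hypothetical helper, not part of the repository, and it relies only on the `name`, `shape`, and `type` attributes of the objects returned by `get_inputs()` and `get_outputs()`.

```python
def describe_io(session):
    """Return (name, shape, type) tuples for a session's inputs and outputs."""
    inputs = [(node.name, node.shape, node.type) for node in session.get_inputs()]
    outputs = [(node.name, node.shape, node.type) for node in session.get_outputs()]
    return inputs, outputs

# Example usage once the sessions above are loaded:
#   inputs, outputs = describe_io(evaluation_sess)
#   for name, shape, dtype in inputs:
#       print(name, shape, dtype)
```

Comparing the reported input shapes with `student_f1.shape` and `student_f2.shape` is a quick way to catch a feature configuration that does not match the model.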

## Inference

In [16]:
student_key = dataloader.keys[-1]
reference_key = dataloader.keys[0]

start_frame = 594
end_frame = 1309

# preprocess student data
student_positions = dataloader.load_keypoints3d(student_key, start_frame, end_frame)
student_rotations = dataloader.load_rotations(student_key, start_frame, end_frame)
student_f1, student_f2 = extract_features(student_positions, student_rotations)

# preprocess reference data
reference_positions = dataloader.load_keypoints3d(reference_key, start_frame, end_frame)
reference_rotations = dataloader.load_rotations(reference_key, start_frame, end_frame)
reference_f1, reference_f2 = extract_features(reference_positions, reference_rotations)
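
The frame range above determines how many model inputs `FeatureExtractor.apply` produces, since it cuts the sequence into non-overlapping windows of `INPUT_LENGTH` frames and discards the remainder. A short worked calculation, assuming both recordings contain at least 1309 frames:

```python
INPUT_LENGTH = 89  # frames per model input, from FeatureExtractor
FPS = 60           # frame rate stated in the FeatureExtractor comment

start_frame, end_frame = 594, 1309
n_frames = end_frame - start_frame                   # frames in the slice
n_windows, dropped = divmod(n_frames, INPUT_LENGTH)  # full windows, leftover frames

print(n_frames, n_windows, dropped)              # → 715 8 3
print(f"{INPUT_LENGTH / FPS:.2f} s per window")  # → 1.48 s per window
```

So each of `student_f1`, `student_f2`, `reference_f1`, and `reference_f2` should have a leading dimension of 8, and the last 3 frames of the slice are not used.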

In [17]:
# run both encoders on student and reference features, then the evaluation model
student_z1 = run_encoder(triplet_encoder_sess, student_f1)
student_z2 = run_encoder(autoencoder_sess, student_f2)

reference_z1 = run_encoder(triplet_encoder_sess, reference_f1)
reference_z2 = run_encoder(autoencoder_sess, reference_f2)

preds = run_evaluation(evaluation_sess, student_z1, student_z2, reference_z1, reference_z2)

In [18]:
# score weights (1-10)
weights = np.arange(1, 11)

# weighted sum per window: (N, 10) · (10,) -> (N,)
# this equals the expected score when each row of preds sums to 1
weighted_scores = np.dot(preds, weights)

# average over windows to get the final predicted score
predicted_score = weighted_scores.mean()

print(f"predicted score for dynamics: {predicted_score:.2f}")

predicted score for dynamics: 7.95
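
The score computation can be illustrated on made-up model output. This sketch assumes that each row of `preds` is a probability distribution over the ten score levels, which the notebook does not state explicitly; the values below are hypothetical.

```python
import numpy as np

# Hypothetical output for two windows; each row is assumed to be a
# probability distribution over the score levels 1..10.
preds = np.zeros((2, 10), dtype=np.float32)
preds[0, 7] = 1.0        # all mass on score 8
preds[1, [5, 7]] = 0.5   # split evenly between scores 6 and 8

weights = np.arange(1, 11)

# The dot product is an expected value only if every row sums to 1.
assert np.allclose(preds.sum(axis=1), 1.0)

per_window = preds @ weights  # expected score per window
print(per_window, per_window.mean())
```

Here the two windows score 8.0 and 7.0, giving a final score of 7.5. If the model output were unnormalized, the rows would need a softmax (or division by their sum) before the dot product.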
