# 01 - Parsing MPII Dataset

- Date: 2022
- Author: Walid BENBIHI
- Source: [wbenbihi/hourglasstensorflow](https://github.com/wbenbihi/hourglasstensorflow) 

## Setup

### Imports

In [1]:
# Standard imports
import os
import sys
import re
import json
sys.path.append(os.path.join('..'))

In [2]:
# Specific Imports
import scipy.io
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [3]:
# Hourglass Tensorflow Imports
from hourglass_tensorflow.utils.parsers import mpii as mpii_parser

from hourglass_tensorflow.utils.parsers.mpii import MPIIAct
from hourglass_tensorflow.utils.parsers.mpii import MPIIDataset
from hourglass_tensorflow.utils.parsers.mpii import MPIIAnnorect
from hourglass_tensorflow.utils.parsers.mpii import MPIIAnnoPoint
from hourglass_tensorflow.utils.parsers.mpii import MPIIDatapoint
from hourglass_tensorflow.utils.parsers.mpii import MPIIAnnotation

### Global Variables

In [4]:
ROOT_FOLDER = '..'
DATA_FOLDER = "data"
MPII_MAT = "mpii.ignore.mat"
MPII_FILE = os.path.join(ROOT_FOLDER, DATA_FOLDER, MPII_MAT)

## Function definition

## MPII Documentation

--------------------------------------------------------------------------- 
MPII Human Pose Dataset, Version 1.0 
Copyright 2015 Max Planck Institute for Informatics 
Licensed under the Simplified BSD License [see bsd.txt] 
--------------------------------------------------------------------------- 

We are making the annotations and the corresponding code freely available for research 
purposes. If you would like to use the dataset for any other purposes please contact 
the authors. 

### Introduction
MPII Human Pose dataset is a state of the art benchmark for evaluation
of articulated human pose estimation. The dataset includes around
**25K images** containing over **40K people** with annotated body
joints. The images were systematically collected using an established
taxonomy of everyday human activities. Overall the dataset covers
**410 human activities** and each image is assigned an activity
label. Each image was extracted from a YouTube video and provided with
preceding and following un-annotated frames. In addition, for the test
set we obtained richer annotations including body part occlusions and
3D torso and head orientations.

Following the best practices for the performance evaluation benchmarks
in the literature we withhold the test annotations to prevent
overfitting and tuning on the test set. We are working on an automatic
evaluation server and performance analysis tools based on rich test
set annotations.

### Citing the dataset
```
@inproceedings{andriluka14cvpr,
               author = {Mykhaylo Andriluka and Leonid Pishchulin and Peter Gehler and Schiele, Bernt},
               title = {2D Human Pose Estimation: New Benchmark and State of the Art Analysis},
               booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
               year = {2014},
               month = {June}
}
```

### Download

- **Images (12.9 GB)**

  http://datasets.d2.mpi-inf.mpg.de/andriluka14cvpr/mpii_human_pose_v1.tar.gz
- **Annotations (12.5 MB)**

  http://datasets.d2.mpi-inf.mpg.de/andriluka14cvpr/mpii_human_pose_v1_u12.tar.gz
- **Videos for each image (25 batches x 17 GB)**

  http://datasets.d2.mpi-inf.mpg.de/andriluka14cvpr/mpii_human_pose_v1_sequences_batch1.tar.gz
  ...
  http://datasets.d2.mpi-inf.mpg.de/andriluka14cvpr/mpii_human_pose_v1_sequences_batch25.tar.gz
- **Image - video mapping (239 KB)**

  http://datasets.d2.mpi-inf.mpg.de/andriluka14cvpr/mpii_human_pose_v1_sequences_keyframes.mat

### Annotation description 
Annotations are stored in a MATLAB structure `RELEASE` with the following fields:

- `.annolist(imgidx)` - annotations for image `imgidx`
  - `.image.name` - image filename
  - `.annorect(ridx)` - body annotations for a person `ridx`
    - `.x1, .y1, .x2, .y2` - coordinates of the head rectangle
    - `.scale` - person scale w.r.t. 200 px height
    - `.objpos` - rough human position in the image
    - `.annopoints.point` - person-centric body joint annotations
      - `.x, .y` - coordinates of a joint
      - `id` - joint id (0 - r ankle, 1 - r knee, 2 - r hip, 3 - l hip, 4 - l knee, 5 - l ankle, 6 - pelvis, 7 - thorax, 8 - upper neck, 9 - head top, 10 - r wrist, 11 - r elbow, 12 - r shoulder, 13 - l shoulder, 14 - l elbow, 15 - l wrist)
      - `is_visible` - joint visibility
  - `.vididx` - video index in `video_list`
  - `.frame_sec` - image position in video, in seconds
 
- `img_train(imgidx)` - training/testing image assignment 
- `single_person(imgidx)` - contains rectangle id `ridx` of *sufficiently separated* individuals
- `act(imgidx)` - activity/category label for image `imgidx`
  - `act_name` - activity name
  - `cat_name` - category name
  - `act_id` - activity id
- `video_list(videoidx)` - video id as provided by YouTube. To watch a video on YouTube, go to `https://www.youtube.com/watch?v=video_list(videoidx)`
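
For reference, the joint ids used by `.annopoints.point` can be kept in a small lookup table. The sketch below follows the joint numbering published with the MPII annotations; the names `MPII_JOINT_NAMES` and `joint_name` are illustrative and belong neither to the dataset nor to `hourglass_tensorflow`.

```python
# Joint id -> name, following the MPII annotation convention.
# MPII_JOINT_NAMES and joint_name are illustrative names, not library API.
MPII_JOINT_NAMES = {
    0: "r ankle", 1: "r knee", 2: "r hip",
    3: "l hip", 4: "l knee", 5: "l ankle",
    6: "pelvis", 7: "thorax", 8: "upper neck", 9: "head top",
    10: "r wrist", 11: "r elbow", 12: "r shoulder",
    13: "l shoulder", 14: "l elbow", 15: "l wrist",
}


def joint_name(joint_id: int) -> str:
    """Return the readable name of a joint id, or raise KeyError if unknown."""
    return MPII_JOINT_NAMES[joint_id]


print(joint_name(6))  # pelvis
```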

### Browsing the dataset
- Please use our online tool for browsing the data
http://human-pose.mpi-inf.mpg.de/#dataset
- Red rectangles mark testing images

### References
- **2D Human Pose Estimation: New Benchmark and State of the Art Analysis.**

  Mykhaylo Andriluka, Leonid Pishchulin, Peter Gehler and Bernt Schiele. 

  IEEE CVPR'14
- **Fine-grained Activity Recognition with Holistic and Pose based Features.**

  Leonid Pishchulin, Mykhaylo Andriluka and Bernt Schiele.

  GCPR'14

### Contact
You can reach us via `<lastname>@mpi-inf.mpg.de`
We are looking forward to your feedback. If you have any questions related to the dataset please let us know.


## Main Code

### Read Data

The MPII Human Pose Dataset labels are stored in a MATLAB `.mat` file, so we need to parse them into a clean pandas DataFrame. The structure is heavily nested and takes some exploration to parse completely.

In [5]:
# Load .mat file
mat = scipy.io.loadmat(MPII_FILE, struct_as_record=False)
release_mat = mat['RELEASE'][0][0]
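
The `[0][0]` indexing is needed because `scipy.io.loadmat` returns every MATLAB value, including a scalar struct, wrapped in a 2-D array. A minimal round trip on a toy struct illustrates this. The field values below are made up for the example and are not taken from the MPII file.

```python
import io

import numpy as np
import scipy.io

# Write a toy struct named RELEASE to an in-memory .mat file (illustrative values)
buffer = io.BytesIO()
scipy.io.savemat(buffer, {"RELEASE": {"version": "12", "img_train": np.array([1, 0, 1])}})
buffer.seek(0)

# struct_as_record=False yields objects with attribute access, as in the cell above
toy_mat = scipy.io.loadmat(buffer, struct_as_record=False)
wrapped = toy_mat["RELEASE"]   # object array wrapping the single struct
toy_release = wrapped[0][0]    # the struct itself

print(wrapped.shape)
print(sorted(toy_release._fieldnames))
print(toy_release.img_train.shape)
```

The same wrapping applies at every nesting level, which is why the exploration cells in this notebook chain several `[0][0]` lookups.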

### Explore Data

We check that the `_fieldnames` match the documented fields and that a deeply nested value can be reached.

In [6]:
print("MAT _fieldnames",release_mat._fieldnames)
# Accessing coordinates X of Point 0 from Person 0 in Image 4
print("<Joint 0> X Coordinate of <Person 0> from <Image 4>", release_mat.__dict__['annolist'][0][4].annorect[0][0].annopoints[0][0].point[0][0].x)

MAT _fieldnames ['annolist', 'img_train', 'version', 'single_person', 'act', 'video_list']
<Joint 0> X Coordinate of <Person 0> from <Image 4> [[610]]


In [7]:
# Train/Test Label
img_train = release_mat.img_train[0]
# List of Videos
video_list = release_mat.video_list[0]
video_list_json = [{'video': {'videoidx':i, 'video_list':item[0]}} for i, item in enumerate(video_list)]
# Read Data
mpii_version = release_mat.version[0]
annolist = release_mat.annolist[0]
single_person = release_mat.single_person
act = release_mat.act
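
Each `video_list` entry is a YouTube video id, so a watch URL follows from the pattern given in the MPII documentation. A small sketch, using a placeholder id rather than one from the dataset; `youtube_url` is an illustrative helper.

```python
from urllib.parse import urlencode


def youtube_url(video_id: str) -> str:
    """Build the watch URL for a YouTube video id."""
    return "https://www.youtube.com/watch?" + urlencode({"v": video_id})


# Record shaped like one element of video_list_json (placeholder id)
record = {"video": {"videoidx": 0, "video_list": "VIDEO_ID"}}
print(youtube_url(record["video"]["video_list"]))
```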

#### Explore Act

In [8]:
# Explore Object Length and Type
print("Act List Size", len(act))
print("Act List Types", {type(i) for i in act})
print("Act List Object Types", {type(i[0]) for i in act})
# Get _fieldnames
print("Act Fieldnames", act[4][0]._fieldnames)

Act List Size 24987
Act List Types {<class 'numpy.ndarray'>}
Act List Object Types {<class 'scipy.io.matlab._mio5_params.mat_struct'>}
Act Fieldnames ['cat_name', 'act_name', 'act_id']


In [9]:
# Parse Act
act_json = [
    {
        'act':{
            'imgidx':i,
            'cat_name':elem[0].cat_name[0] if len(elem[0].cat_name) else None,
            'act_name':elem[0].act_name[0].split(', ') if len(elem[0].act_name) else None,
            'act_id':elem[0].act_id[0][0]
        }
    } 
    for i, elem in enumerate(act)
]
print("Sample Act", act_json[4])

Sample Act {'act': {'imgidx': 4, 'cat_name': 'sports', 'act_name': ['curling'], 'act_id': 1}}
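
Once flattened, the activity records are easy to summarise. The sketch below counts images per category with `collections.Counter`. Apart from the curling sample shown above, the records are made-up placeholders with the same shape as `act_json`.

```python
from collections import Counter

# Records shaped like act_json; only the first one comes from the output above,
# the other two are placeholders for illustration
sample_acts = [
    {"act": {"imgidx": 4, "cat_name": "sports", "act_name": ["curling"], "act_id": 1}},
    {"act": {"imgidx": 100, "cat_name": "sports", "act_name": ["placeholder"], "act_id": 2}},
    {"act": {"imgidx": 101, "cat_name": None, "act_name": None, "act_id": -1}},
]

# Count images per category, skipping images without a category label
category_counts = Counter(
    rec["act"]["cat_name"] for rec in sample_acts if rec["act"]["cat_name"] is not None
)
print(category_counts)
```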


#### Explore Single Person

In [10]:
# Explore Object Length and Type
print("Single Person List Size", len(single_person))
print("Single Person List Types", {type(i) for i in single_person})
print("Single Person List Object Types", {type(i[0][0]) if 0 not in i[0].shape else type(i[0]) for i in single_person})
# Show a sample entry (single_person holds plain arrays, so there are no _fieldnames)
print("Single Person Fieldnames", single_person[4])

Single Person List Size 24987
Single Person List Types {<class 'numpy.ndarray'>}
Single Person List Object Types {<class 'numpy.ndarray'>}
Single Person Fieldnames [array([[1],
        [2]], dtype=uint8)]


In [11]:
#Parse single_person
single_person_json = [
    {
        'single_person':{
            'imgidx':i,
            'ridx': [elm[0] for elm in item[0]] if 0 not in item[0].shape else None
        }
    }
    for i, item in enumerate(single_person)
]
print("Sample Single Person", single_person_json[4])

Sample Single Person {'single_person': {'imgidx': 4, 'ridx': [1, 2]}}
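
The `ridx` values appear to follow MATLAB's 1-based indexing (image 4 has two people and its `ridx` is `[1, 2]`), so they need an offset before indexing a Python list of `annorect` entries. A sketch under that assumption; `to_python_indices` is an illustrative helper.

```python
from typing import List, Optional


def to_python_indices(ridx: Optional[List[int]]) -> List[int]:
    """Convert 1-based rectangle ids to 0-based list indices (empty if None)."""
    if ridx is None:
        return []
    return [int(r) - 1 for r in ridx]


sample = {"single_person": {"imgidx": 4, "ridx": [1, 2]}}
print(to_python_indices(sample["single_person"]["ridx"]))  # [0, 1]
```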


#### Explore Annolist

In [12]:
# Explore Object Length and Type
print("Annolist List Size", len(annolist))
print("Annolist List Types", {type(i) for i in annolist})
# Get _fieldnames
print("Annolist Fieldnames", annolist[4]._fieldnames)

Annolist List Size 24987
Annolist List Types {<class 'scipy.io.matlab._mio5_params.mat_struct'>}
Annolist Fieldnames ['image', 'annorect', 'frame_sec', 'vididx']


In [13]:
#Parse annolist
annolist_parse_json = [
    {
        'annopoint':{
            'imgidx':i,
            'image':item.image[0][0].name[0],
            'annorect':item.annorect,
            'frame_sec':item.frame_sec[0] if 0 not in item.frame_sec.shape else None,
            'vididx':item.vididx[0][0] if 0 not in item.vididx.shape else None,
        }
    }
    for i, item in enumerate(annolist)
]
print("Annolist Person", annolist_parse_json[4])

Annolist Person {'annopoint': {'imgidx': 4, 'image': '015601864.jpg', 'annorect': array([[<scipy.io.matlab._mio5_params.mat_struct object at 0x2a2a878b0>,
        <scipy.io.matlab._mio5_params.mat_struct object at 0x2a2a8f370>]],
      dtype=object), 'frame_sec': array([11], dtype=uint8), 'vididx': 1660}}


In [14]:
# Sample with raw parsing
annolist_parse_json[2:5]

[{'annopoint': {'imgidx': 2,
   'image': '073199394.jpg',
   'annorect': array([[<scipy.io.matlab._mio5_params.mat_struct object at 0x2a2a87550>]],
         dtype=object),
   'frame_sec': None,
   'vididx': None}},
 {'annopoint': {'imgidx': 3,
   'image': '059865848.jpg',
   'annorect': array([[<scipy.io.matlab._mio5_params.mat_struct object at 0x2a2a87700>]],
         dtype=object),
   'frame_sec': None,
   'vididx': None}},
 {'annopoint': {'imgidx': 4,
   'image': '015601864.jpg',
   'annorect': array([[<scipy.io.matlab._mio5_params.mat_struct object at 0x2a2a878b0>,
           <scipy.io.matlab._mio5_params.mat_struct object at 0x2a2a8f370>]],
         dtype=object),
   'frame_sec': array([11], dtype=uint8),
   'vididx': 1660}}]

##### Explore Annolist Sample

In [15]:
# Set Variables
IMAGE_INDEX = 4
BODY_INDEX = 0
JOINT_INDEX = 0

In [16]:
# Get Persons on the IMAGE_INDEX
bodies = annolist_parse_json[IMAGE_INDEX]['annopoint']['annorect'][0]
print(f"Image {IMAGE_INDEX} contains {len(bodies)} person(s)")
bodies


Image 4 contains 2 person(s)


array([<scipy.io.matlab._mio5_params.mat_struct object at 0x2a2a878b0>,
       <scipy.io.matlab._mio5_params.mat_struct object at 0x2a2a8f370>],
      dtype=object)

In [17]:
# Explore Annorect _fieldnames
print("Annorect structure has the following fields", bodies[BODY_INDEX]._fieldnames)
# Get Person Bounding Box
print(f"<Person {BODY_INDEX}> from <Image {IMAGE_INDEX}> has the following bounding box {{ x1:{bodies[BODY_INDEX].x1[0][0]}, y1:{bodies[BODY_INDEX].y1[0][0]}, x2:{bodies[BODY_INDEX].x2[0][0]}, y2:{bodies[BODY_INDEX].y2[0][0]} }}")
# Get the scale attribute
print(f"<Person {BODY_INDEX}> from <Image {IMAGE_INDEX}> has Scale={bodies[BODY_INDEX].scale[0][0]}")
# Get the objpos attribute
print(f"<Person {BODY_INDEX}> from <Image {IMAGE_INDEX}> has ObjPos={{ x:{bodies[BODY_INDEX].objpos[0][0].x[0][0]}, y:{bodies[BODY_INDEX].objpos[0][0].y[0][0]} }}")
# Get the joints
print(f"<Person {BODY_INDEX}> from <Image {IMAGE_INDEX}> has {len(bodies[BODY_INDEX].annopoints[0][0].point[0])} joints")
print(f"<Person {BODY_INDEX}> from <Image {IMAGE_INDEX}> Joints=", [
    dict(x = j.x[0, 0], y=j.y[0, 0], id=j.id[0, 0], is_visible=j.is_visible)
    for j in bodies[BODY_INDEX].annopoints[0][0].point[0]
])

Annorect structure has the following fields ['x1', 'y1', 'x2', 'y2', 'annopoints', 'scale', 'objpos']
<Person 0> from <Image 4> has the following bounding box { x1:627, y1:100, x2:706, y2:198 }
<Person 0> from <Image 4> has Scale=3.021046176409755
<Person 0> from <Image 4> has ObjPos={ x:594, y:257 }
<Person 0> from <Image 4> has 16 joints
<Person 0> from <Image 4> Joints= [{'x': 610, 'y': 187, 'id': 6, 'is_visible': array([[0]], dtype=uint8)}, {'x': 647, 'y': 176, 'id': 7, 'is_visible': array([[1]], dtype=uint8)}, {'x': 637.0201, 'y': 189.8183, 'id': 8, 'is_visible': array([], shape=(0, 0), dtype=uint8)}, {'x': 695.9799, 'y': 108.1817, 'id': 9, 'is_visible': array([], shape=(0, 0), dtype=uint8)}, {'x': 620, 'y': 394, 'id': 0, 'is_visible': array([[1]], dtype=uint8)}, {'x': 616, 'y': 269, 'id': 1, 'is_visible': array([[1]], dtype=uint8)}, {'x': 573, 'y': 185, 'id': 2, 'is_visible': array([[1]], dtype=uint8)}, {'x': 647, 'y': 188, 'id': 3, 'is_visible': array([[0]], dtype=uint8)}, {'x':

### Parse MPII

The package provides utility functions to help parse the MPII dataset:

```python
from hourglass_tensorflow.utils.parsers import mpii as mpii_parser

from hourglass_tensorflow.utils.parsers.mpii import parse_mpii

from hourglass_tensorflow.utils.parsers.mpii import parse_objpos
from hourglass_tensorflow.utils.parsers.mpii import parse_is_visible
from hourglass_tensorflow.utils.parsers.mpii import parse_annopoints
from hourglass_tensorflow.utils.parsers.mpii import parse_additional_annorect_item
from hourglass_tensorflow.utils.parsers.mpii import parse_annorect_item
from hourglass_tensorflow.utils.parsers.mpii import parse_annorect
from hourglass_tensorflow.utils.parsers.mpii import parse_annolist
from hourglass_tensorflow.utils.parsers.mpii import parse_img_train
from hourglass_tensorflow.utils.parsers.mpii import parse_single_person
from hourglass_tensorflow.utils.parsers.mpii import parse_video_list
from hourglass_tensorflow.utils.parsers.mpii import parse_act
```

In [18]:
# Parse data with utility functions
## `remove_null_keys` is a keyword argument that removes None values when set to True
parsed_img_train = mpii_parser.parse_img_train(img_train, remove_null_keys=False)
parsed_video_list = mpii_parser.parse_video_list(video_list, remove_null_keys=False)
parsed_annolist = mpii_parser.parse_annolist(annolist, remove_null_keys=False)
parsed_single_person = mpii_parser.parse_single_person(single_person, remove_null_keys=False)
parsed_act = mpii_parser.parse_act(act[0], remove_null_keys=False)
# Sample
parsed_annolist[3:5]

[{'index': 3,
  'frame_sec': None,
  'image': '059865848.jpg',
  'vididx': None,
  'annorect': [{'index': 0,
    'objpos': {'x': 684, 'y': 309},
    'scale': 4.928480496055553}]},
 {'index': 4,
  'frame_sec': 11,
  'image': '015601864.jpg',
  'vididx': 1660,
  'annorect': [{'index': 0,
    'annopoints': [{'id': 6, 'x': 610, 'y': 187, 'is_visible': 0},
     {'id': 7, 'x': 647, 'y': 176, 'is_visible': 1},
     {'id': 8, 'x': 637, 'y': 189},
     {'id': 9, 'x': 695, 'y': 108},
     {'id': 0, 'x': 620, 'y': 394, 'is_visible': 1},
     {'id': 1, 'x': 616, 'y': 269, 'is_visible': 1},
     {'id': 2, 'x': 573, 'y': 185, 'is_visible': 1},
     {'id': 3, 'x': 647, 'y': 188, 'is_visible': 0},
     {'id': 4, 'x': 661, 'y': 221, 'is_visible': 1},
     {'id': 5, 'x': 656, 'y': 231, 'is_visible': 1},
     {'id': 10, 'x': 606, 'y': 217, 'is_visible': 1},
     {'id': 11, 'x': 553, 'y': 161, 'is_visible': 1},
     {'id': 12, 'x': 601, 'y': 167, 'is_visible': 1},
     {'id': 13, 'x': 692, 'y': 185, 'is_v
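
Note that the joints are not listed in id order and that some entries have no `is_visible` key. A downstream consumer usually wants a fixed-size, id-ordered layout. The sketch below builds one from a few of the joints shown above; `joints_by_id` and the `(x, y, is_visible)` tuple layout are assumptions for illustration, not part of the package.

```python
from typing import Dict, List, Optional, Tuple

Joint = Tuple[int, int, Optional[int]]  # (x, y, is_visible)


def joints_by_id(annopoints: List[Dict], num_joints: int = 16) -> List[Optional[Joint]]:
    """Place each joint at the slot of its id; unannotated joints stay None."""
    slots: List[Optional[Joint]] = [None] * num_joints
    for point in annopoints:
        # .get() tolerates records where the is_visible key is absent
        slots[point["id"]] = (point["x"], point["y"], point.get("is_visible"))
    return slots


# A subset of the joints of person 0 in image 4, copied from the output above
annopoints = [
    {"id": 6, "x": 610, "y": 187, "is_visible": 0},
    {"id": 8, "x": 637, "y": 189},
    {"id": 0, "x": 620, "y": 394, "is_visible": 1},
]
ordered = joints_by_id(annopoints)
print(ordered[0], ordered[6], ordered[8], ordered[1])
```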

The arrays parsed above are lists of plain dictionaries. To enable linting and autocompletion, equivalent `pydantic.BaseModel` classes are available:

```python
from hourglass_tensorflow.utils.parsers.mpii import MPIIAct
from hourglass_tensorflow.utils.parsers.mpii import MPIIDataset
from hourglass_tensorflow.utils.parsers.mpii import MPIIAnnorect
from hourglass_tensorflow.utils.parsers.mpii import MPIIAnnoPoint
from hourglass_tensorflow.utils.parsers.mpii import MPIIDatapoint
from hourglass_tensorflow.utils.parsers.mpii import MPIIAnnotation
```

In [19]:
# Use a BaseModel to turn parsed_annolist into a list of structured objects
structured_annolist = [MPIIAnnotation.parse_obj(an) for an in parsed_annolist]
structured_annolist[4]


MPIIAnnotation(index=4, annorect=[MPIIAnnorect(index=0, annopoints=[MPIIAnnoPoint(id=6, x=610, y=187, is_visible=0), MPIIAnnoPoint(id=7, x=647, y=176, is_visible=1), MPIIAnnoPoint(id=8, x=637, y=189, is_visible=None), MPIIAnnoPoint(id=9, x=695, y=108, is_visible=None), MPIIAnnoPoint(id=0, x=620, y=394, is_visible=1), MPIIAnnoPoint(id=1, x=616, y=269, is_visible=1), MPIIAnnoPoint(id=2, x=573, y=185, is_visible=1), MPIIAnnoPoint(id=3, x=647, y=188, is_visible=0), MPIIAnnoPoint(id=4, x=661, y=221, is_visible=1), MPIIAnnoPoint(id=5, x=656, y=231, is_visible=1), MPIIAnnoPoint(id=10, x=606, y=217, is_visible=1), MPIIAnnoPoint(id=11, x=553, y=161, is_visible=1), MPIIAnnoPoint(id=12, x=601, y=167, is_visible=1), MPIIAnnoPoint(id=13, x=692, y=185, is_visible=1), MPIIAnnoPoint(id=14, x=693, y=240, is_visible=1), MPIIAnnoPoint(id=15, x=688, y=313, is_visible=1)], objpos=MPIIObjPos(x=594, y=257), scale=3.021046176409755, x1=627, y1=627, x2=706, y2=706, head_r11=None, head_r12=None, head_r13=None, 
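
The benefit of structured records is attribute access and type hints instead of string keys. The idea can be illustrated with standard-library dataclasses standing in for the package's pydantic models. `JointRecord` and its `from_dict` method are hypothetical names for this sketch and do not reproduce the real `MPIIAnnoPoint` definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JointRecord:
    """Stand-in for a typed joint record (not the package's MPIIAnnoPoint)."""

    id: int
    x: int
    y: int
    is_visible: Optional[int] = None

    @classmethod
    def from_dict(cls, data: dict) -> "JointRecord":
        # A missing is_visible key falls back to the default None
        return cls(**data)


joint = JointRecord.from_dict({"id": 8, "x": 637, "y": 189})
print(joint.x, joint.is_visible)  # 637 None
```

Unlike pydantic, a dataclass does not validate or coerce field types at runtime, so this only mirrors the attribute-access side of the real models.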

### Command Line Interface (HTF)

These utility functions are also exposed through a command line interface (CLI), which becomes available once the `hourglass_tensorflow` package is installed:

```bash
$ htf [OPTIONS] COMMAND [ARGS]
$ htf mpii --help # For MPII related operations
$ htf mpii parse   ... # See documentation
$ htf mpii convert ... # See documentation
```

In [20]:
!htf --help

Usage: htf [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  mpii  Operation related to MPII management / parsing


In [21]:
!htf mpii --help

Usage: htf mpii [OPTIONS] COMMAND [ARGS]...

  Operation related to MPII management / parsing

Options:
  --help  Show this message and exit.

Commands:
  convert  Convert a MPII .mat file to a HTF compliant record
  parse    Parse a MPII .mat file to a more readable record


In [22]:
!htf mpii convert --help

Usage: htf mpii convert [OPTIONS] INPUT OUTPUT

  Convert a MPII .mat file to a HTF compliant record

Options:
  -v, --verbose / --no-verbose  Activate Logs
  --help                        Show this message and exit.


In [23]:
!htf mpii parse --help

Usage: htf mpii parse [OPTIONS] INPUT OUTPUT

  Parse a MPII .mat file to a more readable record

Options:
  -v, --verbose / --no-verbose  Activate Logs
  --validate / --no-validate    Whether to use validation checks (default
                                false)
  --struct / --no-struct        Whether or not to apply pydantic parsing
                                (default false)
  --as-list / --no-as-list      Activate to return list of records (default
                                false)
  --null / --no-null            Keep null values in records (default true)
  --help                        Show this message and exit.
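
The options listed above can also be assembled programmatically, for example to call the CLI from a script. The sketch below only builds the argument list from the flags shown in the help text. `build_parse_command` and the output path are illustrative placeholders, and the command is not executed here.

```python
from typing import List


def build_parse_command(
    input_path: str,
    output_path: str,
    struct: bool = False,
    validate: bool = False,
    as_list: bool = False,
    null: bool = True,
    verbose: bool = False,
) -> List[str]:
    """Assemble an `htf mpii parse` call using the flags from its --help text."""

    def flag(name: str, enabled: bool) -> str:
        # Each option is a boolean pair: --name / --no-name
        return f"--{name}" if enabled else f"--no-{name}"

    return [
        "htf", "mpii", "parse",
        flag("verbose", verbose),
        flag("validate", validate),
        flag("struct", struct),
        flag("as-list", as_list),
        flag("null", null),
        input_path,
        output_path,
    ]


# "parsed_output" is a placeholder path for the OUTPUT argument
command = build_parse_command("data/mpii.ignore.mat", "parsed_output", struct=True, null=False)
print(" ".join(command))
# To run it from Python: subprocess.run(command, check=True)
```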
