# Video Actor Synchrony and Causality (VASC)
## RAEng: Measuring Responsive Caregiving Project
### Caspar Addyman, 2020
### https://github.com/infantlab/VASC

# Step 1  Process videos using OpenPose

This script uses the [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) human figure recognition neural network to create labeled wireframes for each figure in each frame of a video. OpenPoseDemo goes through a video frame by frame, outputting a JSON file for each frame that contains a set of coordinate points for each detected figure. It also writes a wireframe version of each video.

## 1.1 - Libraries

In [28]:
#import the python libraries we need
import os
import sys
import time
import glob
import json
import cv2               #computervision toolkit
import numpy as np
from datetime import datetime


#turn on debugging
%pdb on

Automatic pdb calling has been turned ON


### 1.2 Where is OpenPose?

We need the full path to your openpose directory

In [29]:
# location of openposedemo - THIS WILL BE DIFFERENT ON YOUR COMPUTER
openposepath = "C:\\Users\\cas\\openpose-1.5.0-binaries-win64-gpu-python-flir-3d_recommended\\"
#openposepath = "C:\\Users\\caspar\\openpose-1.4.0-win64-cpu-binaries\\"

if sys.platform == "win32":
    app = "bin\\OpenPoseDemo.exe"
else:
    #forward slashes here - NB the name and location of the binary may differ on your build
    app = "bin/OpenPoseDemo.bin"

openposeapp = openposepath + app
print(openposeapp)

C:\Users\cas\openpose-1.5.0-binaries-win64-gpu-python-flir-3d_recommended\bin\OpenPoseDemo.exe


### 1.3 Where are your videos?

In the next cell you need to specify the folder containing your set of video files so that we can process them. These scripts use the following directory structure. They expect your videos to be in a subfolder of your project

```
path\to\project\myvideos
```

and then it creates a folder `out` in the project at the same level as the videos with three subfolders for JSON files, the aggregated timeseries and the analyses

```
path\to\project\out\openpose
path\to\project\out\timeseries
path\to\project\out\analyses
```
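The cells in this step do not show these folders being created, so as a minimal sketch (assuming they may not exist yet) they could be made up front with `os.makedirs`. The helper name `make_output_folders` and the temporary project folder in the usage lines are hypothetical and only for illustration.

```python
import os
import tempfile

def make_output_folders(projectpath):
    """Create out/openpose, out/timeseries and out/analyses under projectpath.

    Hypothetical helper: returns a dict of the folder paths it ensured exist.
    """
    out = os.path.join(projectpath, "out")
    folders = {name: os.path.join(out, name)
               for name in ("openpose", "timeseries", "analyses")}
    for path in folders.values():
        # exist_ok=True makes this safe to re-run on an existing project
        os.makedirs(path, exist_ok=True)
    return folders

# usage on a throwaway folder standing in for the real project path
demo_project = tempfile.mkdtemp()
folders = make_output_folders(demo_project)
print(sorted(folders))
```

Using `os.path.join` rather than hard-coded `\\` separators would also let the same lines run on macOS or Linux.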

In [30]:
# where's the project folder? (with trailing slash)
# projectpath = os.getcwd() + "\\..\\lookit\\"
projectpath = "C:\\Users\\Cas\\OneDrive - Goldsmiths College\\Projects\\Measuring Responsive Caregiving\\lookit\\"

# locations of videos and output
videos_in = projectpath 
videos_out   = projectpath + "out"
videos_out_openpose   = videos_out + "\\openpose"
videos_out_timeseries = videos_out + "\\timeseries"
videos_out_analyses   = videos_out + "\\analyses"

print(videos_in)
print(videos_out)
print(videos_out_openpose)
print(videos_out_timeseries)
print(videos_out_analyses)

C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\
C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out
C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\openpose
C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\timeseries
C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\analyses


In [31]:
#first get list of videos in the inbox
avis = glob.glob(videos_in + "*.avi")
mp4s = glob.glob(videos_in + "*.mp4")

print("We found %d avis" % len(avis))
print("We found %d mp4s" % len(mp4s))

#For the moment we will manually specify what videos to process. 
#TODO generate a list of force or skip videos to automate things slightly
allvideos = []
allvideos.extend(avis)
allvideos.extend(mp4s)

We found 0 avis
We found 10 mp4s


### 1.4 Calling the OpenPose app
To operate OpenPose we pass a set of parameters to the demo executable. For the full list of options see [OpenPoseDemo](https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/demo_overview.md).

Our main parameters are

```
--video        path\to\video_to_process   #input video
--write_json   path\to\output_directory   #one json file per frame
--write_video  path\to\output_video.avi   #video with identified figures (a file name, not a folder)
--write_images path\to\output_directory   #one image per frame with wireframes
--disable_blending true/false             # wireframes on black background (true) or blended on top of video (false)
 ```

Other useful params
 ```
--frame_first  100    #start from frame 100
--display 0           #don't show the images as they are processed
 ```
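As a rough sketch of how these options turn into a command line, the hypothetical helper below (`build_openpose_command` is not part of OpenPose or of this project) assembles an argument list of the kind `subprocess.run` accepts. Passing a list avoids hand-quoting paths that contain spaces. The executable and file paths are placeholders.

```python
def build_openpose_command(app, video, params):
    """Return [app, "--video", video, "--key", "value", ...] for the given options."""
    cmd = [app, "--video", video]
    for key, value in params.items():
        cmd.extend(["--" + key, str(value)])
    return cmd

# placeholder paths, for illustration only
cmd = build_openpose_command(
    "bin\\OpenPoseDemo.exe",
    "path\\to\\my video.mp4",
    {"write_json": "path\\to\\out\\openpose", "display": 0},
)
print(cmd)
# The list could then be handed to subprocess.run(cmd, cwd=...) with cwd set to
# the OpenPose folder.
```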


In [32]:
#put all params in a dictionary object
params = dict()
params["write_json"] = videos_out_openpose
params["write_images"] = videos_out_openpose  #for the moment dump images in the output folder - TODO name subfolder
params["disable_blending"] = "true"
params["display"]  = "1"

#### The main openpose loop

Call the OpenPose app for each video in turn. For each one, print the full command that we use so that you can run it manually to investigate any errors.

Finally, we write a list of the processed videos to a file called `videos.json`. 
Note that we will add other information to this file as we go through other steps. 

In [33]:
currdir =  os.getcwd() + "\\" #keep track of current directory so we can change back to it after processing

optstring = ""
for key in params:
    optstring += " --" + key +  ' "' + params[key] + '"' #need to quote paths 

print(optstring)

videos = {}
count = 0
os.chdir(openposepath)
for vid in allvideos:
    #first we need base name of video for the output file name
    fullname = os.path.basename(vid)
    base, fmt = os.path.splitext(fullname)
    video_outname = base + "_output.avi"
    #log some info about this video
    videos[base] = {}
    videos[base]["fullname"] = fullname
    videos[base]["fullpath"] = vid
    videos[base]["index"] = None         #the numerical index this data will have in np.array.
    videos[base]["format"] = fmt
    videos[base]["openpose"] =  {"exitcode" : None, "when" : None} 
    
    print("\n\nStarting openpose processing of " + vid)
    try:
        # Log the time
        time_start = time.time()
        video = ' --video "' + vid + '"'
        video_out = ' --write_video "' + videos_out_openpose + '\\' + video_outname + '"'
        opbin = openposeapp + video + video_out + optstring
        print(opbin)
        exitcode = os.system(opbin)
        videos[base]["openpose"]["exitcode"] = exitcode
        # Log the time again
        time_end = time.time()
        if (exitcode == 0):
            videos[base]["index"] = count  #TODO - Use this 
            count += 1
            videos[base]["openpose"]["when"] = datetime.now().isoformat()
            videos[base]["openpose"]["out"] = videos_out_openpose + '\\' + video_outname
            print ("Done " + vid)
            print ("It took %d seconds for conversion." % (time_end-time_start))
        else:
            print("OpenPose error. Exit code %d" % exitcode)
    except Exception as e:
        #report the problem and carry on with the next video
        print("Error: ", e)
    
#change the directory back
os.chdir(currdir)
    
#now we've finished, write a list of processed videos to a file
with open(videos_out + '\\videos.json', 'w') as outfile:
    json.dump(videos, outfile)

 --write_json "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\openpose" --write_images "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\openpose" --disable_blending "true" --display "1"


Starting openpose processing of C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\lookit.01.mp4
C:\Users\cas\openpose-1.5.0-binaries-win64-gpu-python-flir-3d_recommended\bin\OpenPoseDemo.exe --video "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\lookit.01.mp4" --write_video "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\openpose\lookit.01_output.avi" --write_json "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\openpose" --write_images "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\

Done C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\lookit.09.mp4
It took 17 seconds for conversion.


Starting openpose processing of C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\lookit.10.mp4
C:\Users\cas\openpose-1.5.0-binaries-win64-gpu-python-flir-3d_recommended\bin\OpenPoseDemo.exe --video "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\lookit.10.mp4" --write_video "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\openpose\lookit.10_output.avi" --write_json "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\openpose" --write_images "C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregiving\lookit\out\openpose" --disable_blending "true" --display "1"
Done C:\Users\Cas\OneDrive - Goldsmiths College\Projects\Measuring Responsive Caregivi

## 1.5 Gather the data into a usable format

OpenPose has created one JSON file per frame of video. We want to group these up into bigger arrays. 

This routine needs to know where to find the processed videos and what their base names are. These are listed in the `videos.json` file we created.

In [34]:
#retrieve the list of base names of processed videos.
with open(videos_out + '\\videos.json') as json_file:
    videos = json.load(json_file)

First find out the height, width, frames per second and frame count for each video and add these to `videos.json`.

In [35]:
#make a note of some video properties

for vid in videos:
    cap = cv2.VideoCapture(videos[vid]["fullpath"]) #open the video file (passing 0 instead would open the camera)
    if cap.isOpened(): 
        videos[vid]["height"] = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        videos[vid]["width"] = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        videos[vid]["fps"] = int(cap.get(cv2.CAP_PROP_FPS))
        videos[vid]["n_frames"] = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.release()

In [36]:
#optional
#print these out to remind ourselves. 
for vid in videos:  
    print(vid)
    print(videos[vid])

lookit.01
{'fullname': 'lookit.01.mp4', 'fullpath': 'C:\\Users\\Cas\\OneDrive - Goldsmiths College\\Projects\\Measuring Responsive Caregiving\\lookit\\lookit.01.mp4', 'index': 0, 'format': '.mp4', 'openpose': {'exitcode': 0, 'when': '2020-02-24T10:33:45.063115', 'out': 'C:\\Users\\Cas\\OneDrive - Goldsmiths College\\Projects\\Measuring Responsive Caregiving\\lookit\\out\\openpose\\lookit.01_output.avi'}, 'height': 480, 'width': 640, 'fps': 15, 'n_frames': 162}
lookit.02
{'fullname': 'lookit.02.mp4', 'fullpath': 'C:\\Users\\Cas\\OneDrive - Goldsmiths College\\Projects\\Measuring Responsive Caregiving\\lookit\\lookit.02.mp4', 'index': 1, 'format': '.mp4', 'openpose': {'exitcode': 0, 'when': '2020-02-24T10:33:56.472137', 'out': 'C:\\Users\\Cas\\OneDrive - Goldsmiths College\\Projects\\Measuring Responsive Caregiving\\lookit\\out\\openpose\\lookit.02_output.avi'}, 'height': 480, 'width': 640, 'fps': 19, 'n_frames': 162}
lookit.03
{'fullname': 'lookit.03.mp4', 'fullpath': 'C:\\Users\\Cas\\O

####  Extracting all the numeric data from the json files

We loop through the list of names in `videos` and search for all json files associated with that name. We then extract all the coordinates and confidence scores for all identified people in each frame and store them in one big multidimensional padded array.

```
1st dimension - number of videos
2nd dimension - max number of frames
3rd dimension - max number of people
4th dimension - number of values (per person) output by openpose
```
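To make the 4th dimension concrete: with OpenPose's default BODY_25 model, each person's `pose_keypoints_2d` entry is a flat list of 75 numbers, that is 25 keypoints times (x, y, confidence). The sketch below parses a made-up frame (the JSON string is synthetic, not real OpenPose output) and reshapes the list so that each row is one keypoint.

```python
import json
import numpy as np

# synthetic stand-in for one per-frame JSON file: one person, 25 keypoints
fake_person = [float(n) for n in range(75)]
frame_json = json.dumps({"people": [{"pose_keypoints_2d": fake_person}]})

data = json.loads(frame_json)
for person in data["people"]:
    # each row becomes (x, y, confidence) for one keypoint
    keypoints = np.array(person["pose_keypoints_2d"]).reshape(25, 3)
    xy = keypoints[:, :2]          # pixel coordinates
    confidence = keypoints[:, 2]   # detection confidence per keypoint
    print(keypoints.shape, xy.shape, confidence.shape)
```

The notebook itself keeps the flat 75-value layout; the reshape is only a convenient way to look at the values.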

For example, if we had the following videos 

```
video1 - 200 frames  - 3 people (max) 
video2 - 203 frames  - 2 people (max) 
video3 - 219 frames  - 4 people (max) 
```

then we'd create a `3 x 219 x 4 x 75` array.
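The arithmetic behind that shape can be checked with a short sketch. The frame and people counts are the illustrative ones from the example above, and 75 is the number of values per person.

```python
import numpy as np

# (frames, max people) for the three illustrative videos
example_videos = {"video1": (200, 3), "video2": (203, 2), "video3": (219, 4)}
ncoords = 75

nvideos = len(example_videos)
maxframes = max(frames for frames, _ in example_videos.values())
maxpeople = max(people for _, people in example_videos.values())

# unused slots (shorter videos, fewer people) stay as zero padding
padded = np.zeros((nvideos, maxframes, maxpeople, ncoords))
print(padded.shape)  # (3, 219, 4, 75)
```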

First we work out how big the first two dimensions of the array have to be, and create a NumPy array called `keypoints_array` big enough to hold all of this.

As a sanity check we count the number of frames processed by OpenPose. This ought to match the `n_frames` value found above.


In [37]:
nvideos = len(videos)
maxframes = 0
maxpeople = 10 #maximum people we might expect (large upper bound)
ncoords = 75 #the length of the array coming back from openpose: 25 keypoints x (x, y, confidence)

for vid in videos:    
    #use glob to get all the individual json files.
    alljson = glob.glob(videos_out_openpose + "\\" + vid + "*.json")
    nframes = len(alljson)
    print("Video", vid, "has {0} frames.".format(nframes))
    videos[vid]["frames"] = nframes
    maxframes = max(maxframes,nframes)
    
    
keypoints_array = np.zeros([nvideos,maxframes,maxpeople,ncoords]) #big array to hold all the numbers
print("Initialise numpy array of size", keypoints_array.shape)

Video lookit.01 has 162 frames.
Video lookit.02 has 161 frames.
Video lookit.03 has 164 frames.
Video lookit.04 has 379 frames.
Video lookit.05 has 412 frames.
Video lookit.06 has 229 frames.
Video lookit.07 has 282 frames.
Video lookit.08 has 222 frames.
Video lookit.09 has 245 frames.
Video lookit.10 has 259 frames.
Initialise numpy array of size (10, 412, 10, 75)


Now loop through all the videos copying the frame data into our big `keypoints_array` and also seeing how many people (max) are detected in each one. 

In [38]:
globalmaxpeople =  0

for vid in videos:  
    v = videos[vid]["index"]
    if v is None:
        continue #openpose failed on this video so there is nothing to copy
    #use glob to get all the individual json files.
    #sorted so that frames stay in order (openpose zero-pads the frame number in each file name)
    alljson = sorted(glob.glob(videos_out_openpose + "\\" + vid + "*.json"))
    npeople = np.zeros(maxframes)  #how many people detected per frame (reset for each video)
    for i, frame in enumerate(alljson):
        with open(frame, "r") as read_file:
            data = json.load(read_file)
        for j, p in enumerate(data["people"]):
            keypoints_array[v,i,j,:] = p["pose_keypoints_2d"]
        npeople[i] = len(data["people"])
    #end loop for this video
    people = int(max(npeople))
    print("Video", vid, "has {0} people detected.".format(people))
    videos[vid]["maxpeople"] = people
    #how many people did it contain? Is this biggest number so far?
    globalmaxpeople = max(globalmaxpeople, people)
    
#and just like that n videos have been reduced to a big block of people coords.
#we now truncate the array for the maximum number of people as the rest of it is all zeros

keypoints_array = np.delete(keypoints_array,np.s_[int(globalmaxpeople):],2)

print("keypoints_array has size", keypoints_array.shape)

Video lookit.01 has 4 people detected.
Video lookit.02 has 2 people detected.
Video lookit.03 has 3 people detected.
Video lookit.04 has 3 people detected.
Video lookit.05 has 4 people detected.
Video lookit.06 has 4 people detected.
Video lookit.07 has 3 people detected.
Video lookit.08 has 3 people detected.
Video lookit.09 has 3 people detected.
Video lookit.10 has 3 people detected.
keypoints_array has size (10, 412, 4, 75)


## 1.6 Save the data!

We save the data at this stage so that we don't have to repeat these steps if we reorganise or reanalyse the data.

We create a compressed NumPy array `allframedata.npz` containing the person location data for all the videos. 

We also update the `videos.json` file with more info about the videos. 

In [39]:
#update the json file in the video out directory
with open(videos_out + '\\videos.json', 'w') as outfile:
    json.dump(videos, outfile)

# in the time series folder we save the data file. 
#in a compressed format as it has a lot of empty values
np.savez_compressed(videos_out_timeseries + '\\allframedata.npz', keypoints_array=keypoints_array)
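
For reference, a later step could read the array back with `np.load`, using the `keypoints_array` key it was saved under. This is only a sketch: it saves a small dummy array to a temporary folder rather than touching the real `allframedata.npz`.

```python
import os
import tempfile
import numpy as np

# dummy data standing in for the real keypoints_array
dummy = np.zeros((2, 5, 3, 75))
dummy[0, 0, 0, :] = 1.0

path = os.path.join(tempfile.mkdtemp(), "allframedata.npz")
np.savez_compressed(path, keypoints_array=dummy)

# np.load on an .npz file returns a dict-like object keyed by the names used when saving
with np.load(path) as npz:
    restored = npz["keypoints_array"]

print(restored.shape, np.array_equal(restored, dummy))
```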



#### That's it. 

Now go on to [Step 2 - Organising the data](Step2.OrganiseData.ipynb)