# Sync activity video clips with accelerometer data

Project status:
- COMPLETE: Get start times for videos using Python
- COMPLETE: Use start/stop frame number and convert to UTC
- IN PROGRESS: deal with cisrol12
- Modify GUI function to use my start/stop times to label data for relevant subjects and cycles below

Notes:
- Video frame rate (fps) = 29.97
- 33.367 milliseconds per frame (1000 / 29.97)
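The two numbers above are linked by simple arithmetic. A minimal sketch of it, assuming the video creation time is available in epoch milliseconds; the function names are illustrative and not part of the notebook:

```python
def ms_per_frame(fps: float = 29.97) -> float:
    """Duration of one video frame in milliseconds."""
    return 1000.0 / fps


def frame_to_utc_ms(frame: int, create_utc_ms: float, fps: float = 29.97) -> float:
    """Epoch milliseconds of a frame, given the video's creation time in epoch ms."""
    return create_utc_ms + frame * ms_per_frame(fps)


print(round(ms_per_frame(), 3))   # 33.367
print(frame_to_utc_ms(300, 0.0))  # offset of frame 300 from the start, in ms
```

The cells below use the rounded constant 33.367 directly, which gives the same result to well under a millisecond for clips of this length.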

In [163]:
# Importing the Libraries
import os
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import re
import datetime as dt
from moviepy.video.io.ffmpeg_tools import ffmpeg_extract_subclip

# Test case video clip to get timestamps


## subject 1050
subj_code = 'cisuabn14'  # renamed from id to avoid shadowing the built-in id()
path = r'//FS2.smpp.local\RTO\CIS-PD Videos'
subj_path = os.path.join(path, subj_code)
video_name = 'cisuabn14_cycle2.mp4'
video_clip_path = os.path.join(subj_path, video_name)

video_clip_path

## these return the time the file was downloaded, not the time it was recorded
import time
print("Last modified: %s" % time.ctime(os.path.getmtime(video_clip_path)))
print("Created: %s" % time.ctime(os.path.getctime(video_clip_path)))
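Because the file-system times reflect the download, the recording time has to come from the container metadata instead. The notebook does not show how `video_utc_timestamp.csv` was produced, so the following is only one possible approach: it assumes `ffprobe` is on the PATH and that the mp4 carries a `creation_time` tag. The function names and the sample JSON are hypothetical.

```python
import json
import subprocess
from datetime import datetime, timezone


def parse_creation_time_ms(ffprobe_json: str) -> int:
    """Return the container creation_time tag as epoch milliseconds (UTC)."""
    tags = json.loads(ffprobe_json)["format"]["tags"]
    stamp = tags["creation_time"]  # assumed format: 2018-03-01T15:04:05.000000Z
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return round(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)


def probe_creation_time_ms(video_path: str) -> int:
    """Run ffprobe on a video file (requires ffprobe on PATH)."""
    cmd = ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_format", video_path]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_creation_time_ms(out)


# Made-up metadata, used only to exercise the parser
sample = '{"format": {"tags": {"creation_time": "1970-01-01T00:00:01.500000Z"}}}'
print(parse_creation_time_ms(sample))  # 1500
```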

# Load all sec_annotation.csv files for each subj, concatenate into 1 df

In [164]:
# read in timestamp file with mp4 metadata
path = r'X:\CIS-PD Videos\timestamp'
filename = os.path.join(path, 'video_utc_timestamp.csv')
timestamp_df = pd.read_csv(filename)
timestamp_df = timestamp_df.drop(columns=['Unnamed: 0','videoname','CreateDate','ModifyDate','UTC_modify_date'])

In [165]:
# list of subjects without ciscid4, ciscih8, ciscij10, because their videos were edited multiple times

# cisrol12 is also omitted because it has no sec_annotation file
names_minus3 = ['cisnwh8','cisuabd4','cisuabe5','cisuabf6','cisuabg7','cisnwe5','cisnwf6','cisuabn14']

# create empty list
appended_data = []

# create 1 dataframe from each subject's sec_annotation.csv file
for k in names_minus3:
    path = r'X:\CIS-PD Videos'
    path_subj = os.path.join(path,k) 
    path_file = os.path.join(path_subj,'sec_annotation.csv')
    data = pd.read_csv(path_file)
    appended_data.append(data)
    
# concatenate list of dataframes
appended_data = pd.concat(appended_data, ignore_index=True)
appended_data = appended_data.drop(columns=['Unnamed: 0'])

# combine subjid and cycle number column to create a key for merge
# combine strings of both columns
timestamp_df.cycle = timestamp_df.cycle.astype(str)
timestamp_df.cycle = timestamp_df.subjid + timestamp_df.cycle
# drop subjid column
timestamp_df = timestamp_df.drop(columns=['subjid'])
# change name of column
timestamp_df = timestamp_df.rename(index=str,columns={'cycle':'subj_cycle'})

In [166]:
# Combine subject code and cycle column to create a key for merge in appended_data dataframe that
# has the activity clip frame annotations
appended_data.cycle = appended_data.cycle.astype(str)
appended_data['subj_cycle'] = appended_data['subject code'] + appended_data.cycle

# Merge dataframes based on subj_cycle columns in both

In [167]:
utc_df = pd.merge(timestamp_df, appended_data, on='subj_cycle',how='outer')
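An outer merge keeps keys that appear on only one side, which is how subjects without annotation rows end up with NaN columns later on. A small sketch of how such rows could be listed, using the `indicator` option of `pd.merge`; the two frames are made-up stand-ins for `timestamp_df` and `appended_data`:

```python
import pandas as pd

# Toy stand-ins for timestamp_df and appended_data (values are made up)
left = pd.DataFrame({"subj_cycle": ["a1", "a2", "b1"], "UTC_create_date": [10, 20, 30]})
right = pd.DataFrame({"subj_cycle": ["a1", "a1", "a2"], "start frame": [0, 50, 5]})

# indicator=True adds a _merge column: 'both', 'left_only' or 'right_only'
merged = pd.merge(left, right, on="subj_cycle", how="outer", indicator=True)
unmatched = merged.loc[merged["_merge"] != "both", ["subj_cycle", "_merge"]]
print(unmatched)
```

Here only `b1` is reported, because it has a timestamp but no annotation rows.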

# Transform start and stop frame with UTC create time

In [168]:
utc_df['start_utc'] = utc_df['start frame']*33.367+utc_df.UTC_create_date
utc_df['stop_utc'] = utc_df['stop frame']*33.367+utc_df.UTC_create_date
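One way to sanity-check the result is to render the values as readable dates. This sketch assumes `start_utc` and `stop_utc` are epoch milliseconds, which the millisecond constants used later suggest; the sample values are made up:

```python
import pandas as pd

# Made-up epoch-millisecond values standing in for start_utc
start_utc = pd.Series([0.0, 1_500_000_000_000.0])
readable = pd.to_datetime(start_utc, unit="ms", utc=True)
print(readable)
```

A timestamp that lands a year or several hours away from the recording session is easy to spot in this form.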

# Adjust UAB site data by 1 yr 5 hrs
- UAB subjects: 1003, 1005, 1007, 1009, 1050
- Subject codes: cisuabd4, cisuabe5, cisuabf6, cisuabg7, cisuabn14

In [169]:
# millisecond conversions
year = 31556952000
fivehr = 18000000
uab_convertor = year + fivehr
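The constants above can be reproduced from first principles. The sketch below assumes the year value corresponds to a mean Gregorian year of 365.2425 days, which is what 31556952000 ms works out to; the variable names are illustrative:

```python
MS_PER_HOUR = 60 * 60 * 1000
MS_PER_DAY = 24 * MS_PER_HOUR

# 365.2425 days (mean Gregorian year), kept in integer arithmetic
year_ms = 3652425 * MS_PER_DAY // 10000
five_hr_ms = 5 * MS_PER_HOUR

print(year_ms, five_hr_ms, year_ms + five_hr_ms)
# 31556952000 18000000 31574952000

# For comparison: a 365-day calendar year
print(365 * MS_PER_DAY)  # 31536000000
```

The mean year is 20952000 ms (about 5.8 hours) longer than a 365-day year, which may be worth keeping in mind when checking the remaining per-subject offsets.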

In [170]:
# Add 1 year and 5 hrs to uab subjects
uab_names = ('cisuabd4','cisuabe5','cisuabf6','cisuabg7','cisuabn14')
for k in uab_names:
    utc_df.loc[utc_df['subject code'] == k, 'start_utc'] += uab_convertor
    utc_df.loc[utc_df['subject code'] == k, 'stop_utc'] += uab_convertor
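The same adjustment can be written without a loop by selecting all UAB rows at once with `isin`. A minimal sketch on a made-up frame standing in for `utc_df`:

```python
import pandas as pd

uab_names = ("cisuabd4", "cisuabe5")
offset_ms = 31556952000 + 18000000

# Made-up rows standing in for utc_df
demo = pd.DataFrame({
    "subject code": ["cisuabd4", "cisnwh8", "cisuabe5"],
    "start_utc": [0.0, 0.0, 100.0],
    "stop_utc": [10.0, 10.0, 200.0],
})

# One boolean mask covers every UAB subject; both columns shift together
is_uab = demo["subject code"].isin(uab_names)
demo.loc[is_uab, ["start_utc", "stop_utc"]] += offset_ms
print(demo)
```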

# Combine the NtsBts activity, split across the cycle 6 part 1 and part 2 videos, into 1 row

In [171]:
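# combine rows 503 and 504: extend row 503 to the stop time of row 504
# (.loc avoids chained assignment, which may write to a copy)
utc_df.loc[503, 'stop_utc'] = utc_df.loc[504, 'stop_utc']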

In [172]:
# drop row and reindex
utc_df = utc_df.drop([504]).reset_index(drop=True)

# Change cycle 7 to 6

In [173]:
utc_df.loc[utc_df.cycle == '7', 'cycle'] = '6'

# temp dataframe

In [174]:
df = utc_df.copy()

# Deal with cisrol12 (1048) separately
- utc_df does NOT have cisrol12 data on it
- Create separate script for cisrol12 since it doesn't have sec_annotation.csv

In [175]:
def keeprightstring(string, sep='cisrol12'):
    """Take a string and keep text after specified character.
    Default character is 'cisrol12'."""
    new_string = string.split(sep, 1)[-1]
    return new_string

In [176]:
# Add necessary data for cisrol12
# subject code
df.loc[df['subj_cycle'].str.contains('cisrol12'), 'subject code'] = 'cisrol12'
# start frame
df.loc[df['subj_cycle'].str.contains('cisrol12'), 'start frame'] = 0
# start_utc = create time
df.loc[df['subj_cycle'].str.contains('cisrol12'), 'start_utc'] = df.UTC_create_date
# cycle
df.loc[df['subj_cycle'].str.contains('cisrol12'), 'cycle'] = df.subj_cycle
for i in range(391,433):
    # .loc avoids chained assignment, which may write to a copy
    df.loc[i, 'cycle'] = keeprightstring(df.loc[i, 'cycle'])
# SKIP start time
# SKIP stop time

# Test to get cycle number from subj_cycle

In [16]:
df['subj_cycle'][391] # - 'cisrol12'

'cisrol121'

In [17]:
df['subj_cycle'][391].replace('cisrol12','')

'1'
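The same extraction can be applied to a whole column at once with the pandas string accessor, mirroring what `keeprightstring` does per value. A sketch on made-up keys (`cisrol1210` is hypothetical and only shows that multi-digit cycles survive):

```python
import pandas as pd

subj_cycle = pd.Series(["cisrol121", "cisrol122", "cisrol1210"])

# Split once on the subject code and keep the part to its right
cycle = subj_cycle.str.split("cisrol12", n=1).str[-1]
print(cycle.tolist())  # ['1', '2', '10']
```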

# Add column containing 4 digit id
- need to complete cisrol12 data first to add its 4 digit id
- make sure to use the 'df' dataframe

In [178]:
# Get subject id or code
path_id = r'X:\CIS-PD MUSC\decoded_forms'
filename_id = os.path.join(path_id, 'videoID.csv') # ie. file = 'videoID.csv'
subjid_df = pd.read_csv(filename_id)
subjid_df.SubjectCode = subjid_df.SubjectCode.astype('int')
# get 4 digit subject code
reverse_id_dict = subjid_df.set_index('FoxInsightID').to_dict()['SubjectCode']

In [19]:
# test
reverse_id_dict.get('cisrol12', 'Unknown')

1048

In [179]:
# map each subject code to its 4-digit id; codes not found in the lookup fall back to 1048
df['subject_number'] = df['subject code'].map(lambda k: reverse_id_dict.get(k, 1048))

In [192]:
df.groupby('subject_number').count()
# unknown is cisrol12

subject_number,UTC_create_date,subj_cycle,subject code,start frame,stop frame,activity,cycle,shortname,start time sec,stop time sec,start_utc,stop_utc
1003,90,90,90,90,90,90,90,90,90,90,90,90
1005,60,60,60,60,60,60,60,60,60,60,60,60
1007,60,60,60,60,60,60,60,60,60,60,60,60
1009,45,45,45,45,45,45,45,45,45,45,45,45
1019,15,15,15,15,15,15,15,15,15,15,15,15
1023,5,5,5,0,0,0,0,0,0,0,0,0
1024,15,15,15,15,15,15,15,15,15,15,15,15
1030,90,90,90,90,90,90,90,90,90,90,90,90
1039,6,6,6,0,0,0,0,0,0,0,0,0
1043,5,5,5,0,0,0,0,0,0,0,0,0


# Adjust cycle for Nick's GUI

In [182]:
# Change cycle dtype from str to integer
df.cycle = pd.to_numeric(df.cycle, downcast='integer')

In [183]:
# Start cycle 1 at 0 for GUI
df.cycle += -1

# Correct subject_number and subject code for 3 subjects
- ciscid4, ciscih8, ciscij10

In [190]:
df.loc[df['subj_cycle'].str.contains('ciscid4'), 'subject_number'] = 1023
df.loc[df['subj_cycle'].str.contains('ciscih8'), 'subject_number'] = 1039
df.loc[df['subj_cycle'].str.contains('ciscij10'), 'subject_number'] = 1043

df.loc[df['subject_number']==1023, 'subject code'] = 'ciscid4'
df.loc[df['subject_number']==1039, 'subject code'] = 'ciscih8'
df.loc[df['subject_number']==1043, 'subject code'] = 'ciscij10'

# Fix Drinking activity label with full name
'Taking a glass of water and drinking'

In [197]:
df.loc[df['shortname']=='Drnkg', 'activity'] = 'Taking a glass of water and drinking'

In [None]:
# check label
df.loc[df['shortname']=='Drnkg']

# Add cisrol12 missing data

In [199]:
# missing data

# shortname list
shortname = (
# cisrol12 cycle 0 (cycle 1 folder)
'Fldg_trial1', 'Fldg_trial2', 'Sitng',
# cisrol12 cycle 1 (cycle 2 folder)
'none','none',
'Stndg', 'Wlkg', 'WlkgCnt', 'FtnR', 'FtnL', 
'RamR', 'RamL', 'SitStand', 'Drwg', 'Typg', 'NtsBts',
'none',
'Drnkg', 'Sheets_trial1', 'Sheets_trial2',
'Fldg', 'Sitng',
# cycle 2 empty (cycle 3) not included
# cisrol12 cycle 3 (cycle 4 folder)
'none',
'Stndg', 'FtnR', 'FtnL', 'RamR', 'RamL', 
'SitStand', 'Drwg', 'Typg', 'NtsBts','Drnkg','Sitng',
# cisrol12 cycle 4 (cycle 5 folder)
'FtnL', 'RamR', 'RamL', 'SitStand', 'Drwg', 'Typg', 'NtsBts', 'Sitng'
)

# end frame
stopframe = (585,1288,1033,
0, 0,# shaking for video 35 and 36
1048,1063,988,988,1228,1333,538,403,1438,988,973,
253, # nothing for video 50
1363,1048,1258,2158,1048,
# omitted for cycle 3
0, # shaking for 75
1453,913,958,1198,523,403,988,973,1093,1348,1258,
958,703,583,403,988,943,958,1003)

# add shortname
df.loc[df['subject code']=='cisrol12', 'shortname'] = shortname
# add end frame
df.loc[df['subject code']=='cisrol12', 'stop frame'] = stopframe

# Delete 'none' Shortnames (should be 4 rows) to get rid of Shaking rows
df.drop(df[df.shortname == 'none'].index, inplace=True)

# Recreate stop_utc column which will update cisrol12 data
df.loc[df['subject code']=='cisrol12', 'stop_utc'] = df['stop frame']*33.367+df.UTC_create_date
#df['stop_utc'] = df['stop frame']*33.367+df.UTC_create_date

# add start and stop time 0 / empty value
df.loc[df['subject code']=='cisrol12', 'start time sec'] = 0
df.loc[df['subject code']=='cisrol12', 'stop time sec'] = 0

# combine rows: Fldg_trial1 and Fldg_trial2 timestamps in cycle 1
df.loc[391, 'stop_utc'] = df.loc[392, 'stop_utc']
# combine rows: Sheets_trial1 and Sheets_trial2 timestamps in cycle 2
df.loc[409, 'stop_utc'] = df.loc[410, 'stop_utc']
# drop row and reindex
df = df.drop([392, 410]).reset_index(drop=True)
# change activity short name (Sheets moves from label 409 to 405 after the drops and reindex)
df.loc[391, 'shortname'] = 'Fldg'
df.loc[405, 'shortname'] = 'Sheets'

# create full activity name
key = ['Stndg', 'Wlkg', 'WlkgCnt', 'FtnR', 'FtnL', 'RamR', 'RamL', 'SitStand', 
                  'Drwg', 'Typg', 'NtsBts', 'Drnkg', 'Sheets', 'Fldg', 'Sitng']
value = ['Standing','Walking','Walking while counting','Finger to nose--right hand',
        'Finger to nose--left hand','Alternating right hand movements',
        'Alternating left hand movements','Sit to stand','Drawing on a paper',
        'Typing on a computer keyboard','Assembling nuts and bolts',
        'Taking a glass of water and drinking','Organizing sheets in a folder',
        'Folding towels','Sitting']
name_dict = dict(zip(key,value))
df.loc[df['subject code']=='cisrol12', 'activity'] = df['shortname'].map(name_dict)
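The `shortname` and `stopframe` tuples in the cell above are assigned by boolean mask, so they must contain exactly one entry per cisrol12 row. A small guard along these lines could catch a miscount before the assignment; the frame and labels here are made up:

```python
import pandas as pd

# Made-up frame standing in for df
demo = pd.DataFrame({"subject code": ["x", "cisrol12", "cisrol12"], "shortname": [None] * 3})
labels = ("Fldg", "Sitng")

mask = demo["subject code"] == "cisrol12"
n_rows = int(mask.sum())
if len(labels) != n_rows:
    raise ValueError(f"{len(labels)} labels for {n_rows} rows")

# Safe to assign: one label per selected row, in row order
demo.loc[mask, "shortname"] = list(labels)
print(demo)
```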

In [None]:
# check cisrol12
df.loc[df['subject code']=='cisrol12']

# Save df as csv file

In [201]:
# data set has missing data for ciscid4, ciscih8, ciscij10
path = r'//FS2.smpp.local\RTO\CIS-PD Videos\timestamp'
fname = 'GUI_timestamp.csv'
filename = os.path.join(path, fname)
# to_csv opens and closes the file itself, so no separate open() is needed
df.to_csv(filename, sep=',')
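By default `to_csv` also writes the row index, which is where the `Unnamed: 0` columns dropped earlier in this notebook come from when such a file is read back. A short sketch of the round trip, using an in-memory buffer and made-up data instead of the network path:

```python
import io
import pandas as pd

demo = pd.DataFrame({"subj_cycle": ["a1", "a2"], "start_utc": [1.0, 2.0]})

# Default: the index is written as an unnamed first column
with_index = pd.read_csv(io.StringIO(demo.to_csv()))
# index=False leaves it out
no_index = pd.read_csv(io.StringIO(demo.to_csv(index=False)))

print(list(with_index.columns))  # ['Unnamed: 0', 'subj_cycle', 'start_utc']
print(list(no_index.columns))    # ['subj_cycle', 'start_utc']
```

Whether to keep the index depends on what the GUI expects when it loads `GUI_timestamp.csv`, which is not shown here.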

In [202]:
df.shape

(502, 13)

# Drop NaN for 3 subjects
- df_minus3 is a temporary df

In [39]:
df_minus3 = df.dropna()

In [40]:
print(df_minus3.shape)

(450, 13)


In [45]:
df_minus3 = df_minus3.reset_index(drop=True)

# Note: UTC timestamps for subjects 1030, 1019, and 1024 were not adjusted for any offset from the watch timestamp.

# Combine cisuabn14 cycle 6 part 1 and part 2 NtsBts videos

In [None]:
# option 1 (ffmpeg command line; these are shell commands, not Python)
ffmpeg -i input1.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts intermediate1.ts
ffmpeg -i input2.mp4 -c copy -bsf:v h264_mp4toannexb -f mpegts intermediate2.ts
ffmpeg -i "concat:intermediate1.ts|intermediate2.ts" -c copy -bsf:a aac_adtstoasc output.mp4
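If option 1 were to be driven from Python, the three commands could be built as argument lists and passed to `subprocess`. This is only a sketch: it reuses the flags from the commands above unchanged, assumes `ffmpeg` is on the PATH, and the function names are hypothetical.

```python
import subprocess


def concat_commands(input1, input2, output):
    """Build the three ffmpeg calls from option 1 as argument lists."""
    parts = ["intermediate1.ts", "intermediate2.ts"]
    to_ts = [
        ["ffmpeg", "-i", src, "-c", "copy", "-bsf:v", "h264_mp4toannexb", "-f", "mpegts", dst]
        for src, dst in zip((input1, input2), parts)
    ]
    join = ["ffmpeg", "-i", "concat:" + "|".join(parts), "-c", "copy",
            "-bsf:a", "aac_adtstoasc", output]
    return to_ts + [join]


def run_all(commands):
    """Run each command in order; requires ffmpeg on PATH."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Print the commands instead of running them
for cmd in concat_commands("input1.mp4", "input2.mp4", "output.mp4"):
    print(" ".join(cmd))
```

Passing lists rather than one shell string sidesteps quoting problems with paths that contain spaces, such as `CIS-PD Videos`.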

In [58]:
# option 2 - Worked
from moviepy.editor import VideoFileClip, concatenate_videoclips
path_video1 = r'//FS2.smpp.local\RTO\CIS-PD Videos\cisuabn14\cycle6_part1'
videoname1 = os.path.join(path_video1, 'NtsBts_part1.mp4')
path_video2 = r'//FS2.smpp.local\RTO\CIS-PD Videos\cisuabn14\cycle6_part2'
videoname2 = os.path.join(path_video2, 'NtsBts_part2.mp4')

clip1 = VideoFileClip(videoname1)
clip2 = VideoFileClip(videoname2)#.subclip(50,60)
final_clip = concatenate_videoclips([clip1,clip2])
path_video3 = r'//FS2.smpp.local\RTO\CIS-PD Videos\cisuabn14\cycle6_part1'
videoname_final = os.path.join(path_video3, 'NtsBts.mp4')
final_clip.write_videofile(videoname_final)

Imageio: 'ffmpeg-win32-v3.2.4.exe' was not found on your computer; downloading it now.
Try 1. Download from https://github.com/imageio/imageio-binaries/raw/master/ffmpeg/ffmpeg-win32-v3.2.4.exe (34.1 MB)

100%|███████████████████████████████████████████████████████████████████████████████| 728/728 [00:03<00:00, 184.72it/s]


[MoviePy] Done.
[MoviePy] Writing video //FS2.smpp.local\RTO\CIS-PD Videos\cisuabn14\cycle6_part1\NtsBts.mp4


100%|████████████████████████████████████████████████████████████████████████████████| 990/990 [00:35<00:00, 27.81it/s]


[MoviePy] Done.
[MoviePy] >>>> Video ready: //FS2.smpp.local\RTO\CIS-PD Videos\cisuabn14\cycle6_part1\NtsBts.mp4 



# Combine Fldg_trial1 and Fldg_trial2 videos for cisrol12 cycle 1

In [127]:
path_video1 = r'//FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle1'
videoname1 = os.path.join(path_video1, 'Fldg_trial1.mp4')
path_video2 = r'//FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle1'
videoname2 = os.path.join(path_video2, 'Fldg_trial2.mp4')

clip1 = VideoFileClip(videoname1)
clip2 = VideoFileClip(videoname2)#.subclip(50,60)
final_clip = concatenate_videoclips([clip1,clip2])
path_video3 = r'//FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle1'
videoname_final = os.path.join(path_video3, 'Fldg.mp4')
final_clip.write_videofile(videoname_final)

[MoviePy] >>>> Building video //FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle1\Fldg.mp4
[MoviePy] Writing audio in FldgTEMP_MPY_wvf_snd.mp3


100%|█████████████████████████████████████████████████████████████████████████████| 1380/1380 [00:07<00:00, 186.20it/s]


[MoviePy] Done.
[MoviePy] Writing video //FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle1\Fldg.mp4


100%|██████████████████████████████████████████████████████████████████████████████| 1875/1875 [01:13<00:00, 25.41it/s]


[MoviePy] Done.
[MoviePy] >>>> Video ready: //FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle1\Fldg.mp4 



# Combine Sheets_trial1 and Sheets_trial2 videos for cisrol12 cycle 2

In [128]:
path_video1 = r'//FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle2'
videoname1 = os.path.join(path_video1, 'Sheets_trial1.mp4')
path_video2 = r'//FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle2'
videoname2 = os.path.join(path_video2, 'Sheets_trial2.mp4')

clip1 = VideoFileClip(videoname1)
clip2 = VideoFileClip(videoname2)#.subclip(50,60)
final_clip = concatenate_videoclips([clip1,clip2])
path_video3 = r'//FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle2'
videoname_final = os.path.join(path_video3, 'Sheets.mp4')
final_clip.write_videofile(videoname_final)

[MoviePy] >>>> Building video //FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle2\Sheets.mp4
[MoviePy] Writing audio in SheetsTEMP_MPY_wvf_snd.mp3


100%|█████████████████████████████████████████████████████████████████████████████| 1700/1700 [00:09<00:00, 176.44it/s]


[MoviePy] Done.
[MoviePy] Writing video //FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle2\Sheets.mp4


100%|██████████████████████████████████████████████████████████████████████████████| 2311/2311 [01:27<00:00, 26.32it/s]


[MoviePy] Done.
[MoviePy] >>>> Video ready: //FS2.smpp.local\RTO\CIS-PD Videos\cisrol12\cycle2\Sheets.mp4 
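The three concatenation cells above differ only in their folder and file names, so they could share one helper. A sketch, assuming the `moviepy.editor` import path used in this notebook (newer moviepy releases expose these names differently); `concat_two_clips` is a hypothetical name:

```python
import os


def concat_two_clips(folder1, name1, folder2, name2, out_folder, out_name):
    """Join two video files end to end and write the result."""
    # Imported lazily so the helper can be defined without moviepy installed
    from moviepy.editor import VideoFileClip, concatenate_videoclips

    clip1 = VideoFileClip(os.path.join(folder1, name1))
    clip2 = VideoFileClip(os.path.join(folder2, name2))
    try:
        final_clip = concatenate_videoclips([clip1, clip2])
        out_path = os.path.join(out_folder, out_name)
        final_clip.write_videofile(out_path)
        return out_path
    finally:
        # Release the file handles held by the readers
        clip1.close()
        clip2.close()
```

For the Sheets case this would be called with the cycle2 folder for all three folder arguments and `'Sheets_trial1.mp4'`, `'Sheets_trial2.mp4'`, `'Sheets.mp4'` as the file names.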



# Reference info

Project status

Completed
1019

In progress
1003 (cycle 0-5 / time 0,30,60,90,120,150) offset -21800
- Error: saved Standing, Walking, Walk with count to cycle 2. Can I just save over in this situation?
- Not clear on stand, walk, and walk with count in cycle 1
1005 (cycle 0-3 / time 0,30,60,90) offset -20420
- Error: saved first cycle under '2 Weeks: Time 90' with offset -20929.32926742833
1007 (cycle 0-3 / time 0,30,60,90) offset -21940.197417575815
- unclear on activities
1009 (cycle 0,3,5 / time 0,90,150)
- offset -19000 doesn't look right
1023 - skip UAB
1024 (cycle 2 / time 60)
- no offset, but walk vs walk and count not very clear
1038 ???
1039 - skip UAB
1043 - skip UAB
1048 - skip cisrol12
1050 (cycle 1-5 / time 30,60,90,120,150) offset -20800
- unclear on activities


Steps to complete annotation:
Load GUI
Set subj number
Set time: 90
Activity: Standing
Look for standing and walking activities
Set offset
Check start/stop, adjust
Save Data
Next Task - don't modify NtsBts activity

Note on cycle/time
cycle 0 / time 0
cycle 1 / time 30
cycle 2 / time 60
cycle 3 / time 90
cycle 4 / time 120
cycle 5 / time 150
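The cycle/time pairs listed above follow a fixed step of 30 per cycle. A tiny sketch of that rule; the function name is illustrative and the unit of the time value is whatever the GUI uses:

```python
def cycle_to_time(cycle: int) -> int:
    """GUI time label for a zero-based cycle number (30 per cycle)."""
    if cycle < 0:
        raise ValueError("cycle must be zero-based and non-negative")
    return 30 * cycle


print([cycle_to_time(c) for c in range(5)])  # [0, 30, 60, 90, 120]
```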

Summary of data that is off
- 1003 is off by 1 yr and 5 hours and some seconds, cycle 3 was off by an additional 9 min
- 1030 is off by several seconds (usually around 30 sec)
- 1005 is off by 1 yr 5 hrs and some seconds, cycle 2 is off by an additional 5 min
- 1007 is off by 1 yr 5 hrs and some seconds
- 1009 is off by 1 yr 5 hrs and some seconds
- 1019 is off by 29.5 min
- 1024 is off by 50 sec
- 1048: cycle 1 is 3.5 hours off, cycle 2 is missing, cycle 4 is 4 hrs off, cycle 5 is about 4 hrs off
- 1050 is off by 1 yr, 5 hrs and some sec

We suspect these videos were edited, so skip these subjects (ciscid4, ciscih8, ciscij10):
- 1023 is off by 18 days, but the watch shaking time is the same for all cycles
- 1039 is off by 2 months, 13 days, and a variable time, but the watch shaking time is the same for all cycles
- 1043 is off by 2 months, 1 day, but the watch shaking time is the same for all cycles