Extracting info from the H5 files #32

Open
mirix opened this issue Aug 8, 2023 · 6 comments

Comments


mirix commented Aug 8, 2023

Hello,

I am interested in training an audio-only model (or perhaps a bimodal audio-text one) using CMU-MOSEI data.

I would be recomputing the audio embeddings.

So I would need only the links to the videos plus the timestamps and the annotated emotions per timestamp range.

How would I go about extracting this information?

Thanks,

Ed

mirix changed the title from "Links and annotated transcriptions" to "Extracting info from the H5 files" on Aug 8, 2023

mirix commented Aug 8, 2023

OK, perhaps I am getting somewhere:

import h5py
import pandas as pd

filename = '/home/emoman/Downloads/mosei/CMU_MOSEI_Labels.csd'
video_id = 'zv0Jl4TIQDc'

# The .csd file is an HDF5 container; open it read-only and close it when done
with h5py.File(filename, 'r') as hf:
    group = hf[f'All Labels/data/{video_id}']
    # [()] reads each dataset into a NumPy array before the file is closed
    feat = group['features'][()]
    intval = group['intervals'][()]

# One row per annotated segment: the label values
df_feat = pd.DataFrame(feat)
print(df_feat)

# One row per annotated segment: the interval bounds
df_intval = pd.DataFrame(intval)
print(df_intval)

This gives:

          0         1    2         3    4    5    6
0  0.333333  0.666667  0.0  0.666667  0.0  0.0  0.0
1  1.000000  2.000000  0.0  0.000000  0.0  0.0  0.0
2  2.333333  2.666667  0.0  0.000000  0.0  0.0  0.0
        0       1
0  56.852  60.845
1  29.764  35.633
2  42.146  49.242

My interpretation is that video zv0Jl4TIQDc has three intervals annotated with the relative weights of Ekman's basic emotions.

Is that correct?

If that is the case, what would be the mapping of the emotions?

What is the highest possible value for a given emotion?
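The single-video lookup above could be extended to every video in the file. The following is a minimal sketch, not the SDK's own method. It assumes that row i of `intervals` belongs to row i of `features`, that the interval values are start and end times in seconds, and that the group keys are YouTube video IDs (so the URL pattern is also an assumption). `Segment`, `iter_segments`, and `youtube_url` are hypothetical names introduced here. The demo uses only the three rows printed above.

```python
from typing import Iterator, Mapping, NamedTuple


class Segment(NamedTuple):
    video_id: str
    start: float
    end: float
    labels: tuple  # raw label row; column meaning is not assumed here


def youtube_url(video_id: str) -> str:
    # Assumption: the group key is a YouTube video ID.
    return f"https://www.youtube.com/watch?v={video_id}"


def iter_segments(data: Mapping) -> Iterator[Segment]:
    """Yield one Segment per annotated interval, sorted by start time within each video.

    `data` maps video_id -> {"intervals": rows, "features": rows}.
    Assumption: interval row i and feature row i describe the same segment.
    """
    for video_id in data:
        entry = data[video_id]
        intervals = list(entry["intervals"])
        features = list(entry["features"])
        if len(intervals) != len(features):
            raise ValueError(
                f"{video_id}: {len(intervals)} intervals but {len(features)} label rows"
            )
        # Sort on the start time only, so the row pairs themselves are never compared.
        pairs = sorted(zip(intervals, features), key=lambda pair: float(pair[0][0]))
        for (start, end), labels in pairs:
            yield Segment(
                video_id, float(start), float(end), tuple(float(x) for x in labels)
            )


# The three rows printed above for video zv0Jl4TIQDc.
example = {
    "zv0Jl4TIQDc": {
        "intervals": [[56.852, 60.845], [29.764, 35.633], [42.146, 49.242]],
        "features": [
            [0.333333, 0.666667, 0.0, 0.666667, 0.0, 0.0, 0.0],
            [1.000000, 2.000000, 0.0, 0.000000, 0.0, 0.0, 0.0],
            [2.333333, 2.666667, 0.0, 0.000000, 0.0, 0.0, 0.0],
        ],
    }
}

segments = list(iter_segments(example))
for seg in segments:
    print(youtube_url(seg.video_id), seg.start, seg.end, seg.labels)
```

With the real file, the same function should accept the h5py group directly, e.g. `iter_segments(hf['All Labels/data'])` inside the `with h5py.File(...)` block, since h5py groups can be iterated and indexed by key like a mapping.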


mirix commented Aug 8, 2023

Each sentence is annotated for sentiment on a [-3,3] Likert scale of: [−3: highly negative, −2 negative, −1 weakly negative, 0 neutral, +1 weakly positive, +2 positive, +3 highly positive]. Ekman emotions (Ekman et al., 1980) of {happiness, sadness, anger, fear, disgust, surprise} are annotated on a [0,3] Likert scale for presence of emotion x: [0: no evidence of x, 1: weakly x, 2: x, 3: highly x].

So column 0 is the sentiment score, and the remaining columns would be, in this order, {happiness, sadness, anger, fear, disgust, surprise}?


mirix commented Aug 8, 2023

The issue with this interpretation is that segment 0 above would have been labelled with happiness and anger in similar amounts...


mirix commented Aug 8, 2023

Or is it (Anger, Disgust, Fear, Happy, Sad, Surprise) as in Table 3?

Then it would be Anger and Fear, which is more consistent, but the sentiment would be slightly positive...
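To make the two readings of segment 0 concrete, here is a small sketch that applies both candidate column orders to that row. `ORDER_TEXT`, `ORDER_TABLE3`, and `name_emotions` are hypothetical names introduced for illustration; the Table 3 labels "Happy" and "Sad" are written as "happiness" and "sadness" so the two orders use the same vocabulary. It assumes column 0 is the sentiment score and columns 1 to 6 are the emotions.

```python
# Candidate 1: the order in which the emotions are listed in the quoted text.
ORDER_TEXT = ("happiness", "sadness", "anger", "fear", "disgust", "surprise")
# Candidate 2: the order of the Table 3 columns.
ORDER_TABLE3 = ("anger", "disgust", "fear", "happiness", "sadness", "surprise")


def name_emotions(row, order):
    """Split a 7-value label row into (sentiment, {emotion: score}).

    Only emotions with a non-zero score are kept.
    Assumption: column 0 is sentiment, columns 1-6 follow `order`.
    """
    sentiment, *scores = row
    if len(scores) != len(order):
        raise ValueError(f"expected {len(order)} emotion columns, got {len(scores)}")
    present = {name: score for name, score in zip(order, scores) if score > 0}
    return sentiment, present


# Segment 0 of video zv0Jl4TIQDc, as printed earlier in the thread.
segment0 = [0.333333, 0.666667, 0.0, 0.666667, 0.0, 0.0, 0.0]

by_text = name_emotions(segment0, ORDER_TEXT)
by_table3 = name_emotions(segment0, ORDER_TABLE3)
print("text order:   ", by_text)
print("Table 3 order:", by_table3)
```

Under the first order the non-zero emotions are happiness and anger; under the second they are anger and fear. A single row cannot settle which order is right, which is why a check across many rows is needed.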


mirix commented Aug 8, 2023

After checking the entries with the most negative and the most positive sentiment, the order seems to be {happiness, sadness, anger, fear, disgust, surprise}.
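One way to run that kind of check is to average each emotion column over the k most negative and the k most positive rows, then see which column dominates at each extreme. The sketch below is an illustration of the idea, not the exact procedure used here. `mean_emotions_at_extremes` is a hypothetical helper, and `toy_rows` are invented values (not taken from CMU-MOSEI) constructed so that the expected answer is known in advance.

```python
def mean_emotions_at_extremes(rows, k=2):
    """Return (negative_means, positive_means).

    Each result is a list with the mean of every emotion column (columns 1..n)
    over the k rows with the lowest / highest sentiment (column 0).
    """
    if not 0 < k <= len(rows):
        raise ValueError("k must be between 1 and the number of rows")
    ranked = sorted(rows, key=lambda row: row[0])  # ascending sentiment
    n_columns = len(rows[0])

    def column_means(subset):
        return [
            sum(row[col] for row in subset) / len(subset)
            for col in range(1, n_columns)
        ]

    return column_means(ranked[:k]), column_means(ranked[-k:])


# Toy rows, invented for illustration only: [sentiment, six emotion columns].
toy_rows = [
    [-3.0, 0.0, 2.0, 1.0, 0.0, 1.0, 0.0],
    [-2.0, 0.0, 1.0, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [3.0, 3.0, 0.0, 0.0, 0.0, 0.0, 1.0],
]

negative_means, positive_means = mean_emotions_at_extremes(toy_rows, k=2)
print("most negative rows:", negative_means)
print("most positive rows:", positive_means)
```

If the first emotion column is happiness, it should have the largest mean among the most positive rows and be close to zero among the most negative ones, as it does in the toy rows. On the real labels, the same comparison would be run over the rows of all videos.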


mirix commented Aug 10, 2023

I have forked MOSEI to build a unimodal SER dataset:

https://github.com/mirix/messaih/tree/main
