This repo looks at the data used in the test. 

We first need data where we expect activation (the localizer data), and then control data, which is what we contrast against (subtract out).

The goal is to generate ~100 positive examples and ~100 negative examples.
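
To make the contrast idea concrete, here is a minimal sketch, assuming activations have already been collected as arrays with one row per example and one column per unit. The `contrast` helper and the toy numbers are hypothetical and are not part of this notebook's pipeline.

```python
import numpy as np

def contrast(localizer_acts: np.ndarray, control_acts: np.ndarray) -> np.ndarray:
    """Per-unit mean activation on localizer rows minus mean on control rows."""
    return localizer_acts.mean(axis=0) - control_acts.mean(axis=0)

# Toy activations (made-up numbers): rows are examples, columns are units.
localizer = np.array([[1.0, 0.25],
                      [3.0, 0.75]])
control = np.array([[1.0, 0.5],
                    [1.0, 0.5]])

# Unit 0 responds more to the localizer set; unit 1 shows no difference.
diff = contrast(localizer, control)
```

Units with a large positive difference are the ones that respond to the localizer examples more than to the controls.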

In [1]:
import pandas as pd

In [2]:
splits = {'train': 'split/train-00000-of-00001.parquet', 'validation': 'split/validation-00000-of-00001.parquet', 'test': 'split/test-00000-of-00001.parquet'}
df = pd.read_parquet("hf://datasets/dair-ai/emotion/" + splits["validation"])


In [4]:
# maps label index to emotion name (i.e. label=0 corresponds to sadness)
label_map = [
    "sadness",
    "joy",
    "love",
    "anger",
    "fear",
    "surprise",
]

In [5]:
df["emotion"] = df["label"].apply(lambda i: label_map[i])
df

Unnamed: 0,text,label,emotion
0,im feeling quite sad and sorry for myself but ...,0,sadness
1,i feel like i am still looking at a blank canv...,0,sadness
2,i feel like a faithful servant,2,love
3,i am just feeling cranky and blue,3,anger
4,i can have for a treat or if i am feeling festive,1,joy
...,...,...,...
1995,im having ssa examination tomorrow in the morn...,0,sadness
1996,i constantly worry about their fight against n...,1,joy
1997,i feel its important to share this info for th...,1,joy
1998,i truly feel that if you are passionate enough...,1,joy


Which examples count as positive? I'm looking for anger circuits, so I'll take the anger examples as positives and sample from the remaining emotions to get the negative examples.

In [10]:
df_anger = df[df["emotion"] == "anger"]
df_other = df[df["emotion"] != "anger"]

In [11]:
df_anger


Unnamed: 0,text,label,emotion
3,i am just feeling cranky and blue,3,anger
10,i feel bitchy but not defeated yet,3,anger
17,i know what it feels like he stressed glaring ...,3,anger
38,i feel like taking a whack at someone s eye an...,3,anger
44,i feel mmf and i cant be bothered to fight it,3,anger
...,...,...,...
1982,i feel like the world is just being bitter and...,3,anger
1983,i see people who have accomplished so much mor...,3,anger
1985,i started to see a concerning pattern i d rush...,3,anger
1993,i feel so tortured by it,3,anger


In [12]:
df_other

Unnamed: 0,text,label,emotion
0,im feeling quite sad and sorry for myself but ...,0,sadness
1,i feel like i am still looking at a blank canv...,0,sadness
2,i feel like a faithful servant,2,love
4,i can have for a treat or if i am feeling festive,1,joy
5,i start to feel more appreciative of what god ...,1,joy
...,...,...,...
1995,im having ssa examination tomorrow in the morn...,0,sadness
1996,i constantly worry about their fight against n...,1,joy
1997,i feel its important to share this info for th...,1,joy
1998,i truly feel that if you are passionate enough...,1,joy


In [19]:
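import numpy as np
np.random.seed(0)  # fix the seed so the negative sample is reproducible
# draw as many non-anger rows as there are anger rows, without replacement
rows = np.random.choice(df_other.index.values, len(df_anger), replace=False)
df_other_sampled = df_other.loc[rows]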

In [20]:
df_other_sampled

Unnamed: 0,text,label,emotion
1876,diagnosis that i have a stomache ulcer,4,fear
147,ive been having more frequent hot flashes thro...,2,love
750,i was laughing at my husband because he was st...,4,fear
288,i walk into a restaurant well any public place...,4,fear
1858,ive just been feeling extremely outcasted and ...,4,fear
...,...,...,...
34,i feel that i m so pathetic and downright dumb...,0,sadness
760,i am definitely feeling a bit melancholy but i...,0,sadness
1591,i won t feel like there would be a dull moment...,0,sadness
602,id feel so defeated and id have to lick my wounds,0,sadness
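
The sampling cell above draws as many non-anger rows as there are anger rows. A minimal sketch of the same idea using pandas' own `DataFrame.sample`, shown on a made-up toy frame rather than the real dataset. It is an alternative, not a drop-in replacement: with the same seed it will generally pick different rows than `np.random.choice`.

```python
import pandas as pd

# Toy stand-in for the labelled frame (made-up rows, not from the dataset).
toy = pd.DataFrame({
    "text": [f"example {i}" for i in range(10)],
    "emotion": ["anger", "joy", "fear", "anger", "sadness",
                "love", "joy", "anger", "fear", "sadness"],
})

toy_anger = toy[toy["emotion"] == "anger"]
toy_other = toy[toy["emotion"] != "anger"]

# Draw as many non-anger rows as there are anger rows, without replacement.
toy_other_sampled = toy_other.sample(n=len(toy_anger), random_state=0)
```

Using `random_state` keeps the seed local to this call instead of setting NumPy's global seed.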


Now combine these into one dataframe that holds just the text and a flag for whether each row is a positive or a negative example.

In [21]:
df_combined = pd.concat((df_other_sampled, df_anger)) 

In [26]:
df_combined["angry"] = (df_combined["emotion"] == "anger").astype(int)
df_combined

Unnamed: 0,text,label,emotion,positive,angry
1876,diagnosis that i have a stomache ulcer,4,fear,False,0
147,ive been having more frequent hot flashes thro...,2,love,False,0
750,i was laughing at my husband because he was st...,4,fear,False,0
288,i walk into a restaurant well any public place...,4,fear,False,0
1858,ive just been feeling extremely outcasted and ...,4,fear,False,0
...,...,...,...,...,...
1982,i feel like the world is just being bitter and...,3,anger,True,1
1983,i see people who have accomplished so much mor...,3,anger,True,1
1985,i started to see a concerning pattern i d rush...,3,anger,True,1
1993,i feel so tortured by it,3,anger,True,1


In [27]:
# note: the ./data directory must already exist, otherwise to_csv raises an OSError
df_combined[["text", "angry"]].to_csv("./data/angry.csv", index=False)
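
The goal stated at the top is roughly 100 examples per class, whereas the cells above keep every anger row in the validation split, which may be more than 100. A minimal sketch of capping each class at `n` rows, assuming a frame with an `angry` column like the one saved above. `cap_per_class` is a hypothetical helper, demonstrated here on made-up data.

```python
import pandas as pd

def cap_per_class(df: pd.DataFrame, label_col: str = "angry",
                  n: int = 100, seed: int = 0) -> pd.DataFrame:
    """Keep at most n randomly chosen rows for each value of label_col."""
    parts = [
        group.sample(n=min(len(group), n), random_state=seed)
        for _, group in df.groupby(label_col)
    ]
    return pd.concat(parts)

# Toy frame (made-up rows): 7 negatives and 5 positives, capped at 4 each.
toy = pd.DataFrame({
    "text": [f"t{i}" for i in range(12)],
    "angry": [0] * 7 + [1] * 5,
})
capped = cap_per_class(toy, n=4)
```

Taking `min(len(group), n)` means a class with fewer than `n` rows is kept whole instead of raising an error.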