### Naive Bayes Classifier Task
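### Predicting the Emotion Expressed in a Sentence
##### Multiclass Classification
- As a remote psychological counselor, you have collected emotion data from patients who sent messages.
- Each message is labeled with an emotion.
- Build a model to find out which psychological treatment may suit patients who send the same messages in the future.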

In [1]:
import pandas as pd

emo = pd.read_csv('./dataset/feeling.csv')
emo

Unnamed: 0,message;feeling
0,im feeling quite sad and sorry for myself but ...
1,i feel like i am still looking at a blank canv...
2,i feel like a faithful servant;love
3,i am just feeling cranky and blue;anger
4,i can have for a treat or if i am feeling fest...
...,...
17995,i just had a very brief time in the beanbag an...
17996,i am now turning and i feel pathetic that i am...
17997,i feel strong and good overall;joy
17998,i feel like this was such a rude comment and i...


In [2]:
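# Split the combined 'message;feeling' column into text and label columns
emo[['Message', 'Target']] = emo['message;feeling'].str.split(';', expand=True)
# Drop the original combined column
emo = emo.drop(labels=['message;feeling'], axis=1)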
emo

Unnamed: 0,Message,Target
0,im feeling quite sad and sorry for myself but ...,sadness
1,i feel like i am still looking at a blank canv...,sadness
2,i feel like a faithful servant,love
3,i am just feeling cranky and blue,anger
4,i can have for a treat or if i am feeling festive,joy
...,...,...
17995,i just had a very brief time in the beanbag an...,sadness
17996,i am now turning and i feel pathetic that i am...,sadness
17997,i feel strong and good overall,joy
17998,i feel like this was such a rude comment and i...,anger
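The split above works as long as no message contains a `;` of its own. As a minimal sketch, assuming a message could contain extra semicolons, splitting only at the last `;` keeps the label in the final piece. The `toy` frame below is hypothetical data that mimics the `message;feeling` column and is not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy frame that mimics the 'message;feeling' column above
toy = pd.DataFrame({'message;feeling': [
    'i feel like a faithful servant;love',
    'i feel strong; and good overall;joy',  # extra ';' inside the message
]})

# rsplit with n=1 splits only at the last ';', so the label is always the final piece
toy[['Message', 'Target']] = toy['message;feeling'].str.rsplit(';', n=1, expand=True)
toy = toy.drop(columns=['message;feeling'])
print(toy)
```

With a plain `str.split(';', expand=True)`, the second row would produce three pieces and the two-column assignment would fail.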


In [3]:
from sklearn.preprocessing import LabelEncoder

emo_encorder = LabelEncoder()
targets = emo_encorder.fit_transform(emo.loc[:,'Target'])
emo['Target'] = targets

# LabelEncoder assigns integer codes in sorted (alphabetical) label order:
# anger       0
# fear        1
# joy         2
# love        3
# sadness     4
# surprise    5
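The mapping listed in the comments can also be read from the fitted encoder instead of being written by hand. The sketch below uses a small hypothetical label list (not the dataset) and a hypothetical `encoder` variable to show `classes_` and `inverse_transform`.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical label list covering the six emotions in the dataset
labels = ['sadness', 'love', 'anger', 'joy', 'fear', 'surprise', 'joy']

encoder = LabelEncoder()
codes = encoder.fit_transform(labels)

# classes_ is sorted, so each label's position is its integer code
mapping = {label: code for code, label in enumerate(encoder.classes_)}
print(mapping)

# inverse_transform turns integer codes back into the original labels
decoded = encoder.inverse_transform([4, 0])
print(decoded)
```

The same `inverse_transform` call can be applied to model predictions to report an emotion name rather than a number.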

In [4]:
emo

Unnamed: 0,Message,Target
0,im feeling quite sad and sorry for myself but ...,4
1,i feel like i am still looking at a blank canv...,4
2,i feel like a faithful servant,3
3,i am just feeling cranky and blue,0
4,i can have for a treat or if i am feeling festive,2
...,...,...
17995,i just had a very brief time in the beanbag an...,4
17996,i am now turning and i feel pathetic that i am...,4
17997,i feel strong and good overall,2
17998,i feel like this was such a rude comment and i...,0


In [5]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = \
train_test_split(emo.Message, emo.Target, stratify=emo.Target, test_size=0.2, random_state=124)
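The `stratify` argument keeps the class proportions the same in the training and test sets, which matters when some emotions are much rarer than others. A minimal sketch on hypothetical imbalanced toy data (the `toy` frame and its 70/20/10 class sizes are assumptions for illustration):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced data: 70 / 20 / 10 rows for classes 0, 1, 2
toy = pd.DataFrame({
    'Message': [f'message {i}' for i in range(100)],
    'Target': [0] * 70 + [1] * 20 + [2] * 10,
})

x_tr, x_te, y_tr, y_te = train_test_split(
    toy.Message, toy.Target, stratify=toy.Target, test_size=0.2, random_state=124
)

# Class proportions in each split, ordered by class code
train_ratio = y_tr.value_counts(normalize=True).sort_index()
test_ratio = y_te.value_counts(normalize=True).sort_index()
print(train_ratio)
print(test_ratio)
```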

In [6]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

m_nb_pipe = Pipeline([('count_vectorizer', CountVectorizer()), ('multinimual_NB', MultinomialNB())])
m_nb_pipe.fit(x_train, y_train)

<img src='./images/image16.png'>

In [7]:
prediction = m_nb_pipe.predict(x_test)

In [8]:
m_nb_pipe.score(x_test, y_test)

0.7536111111111111
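For a classifier pipeline, `score` reports mean accuracy, which can hide weak performance on individual emotions. As a hedged sketch, a confusion matrix and a per-class report give a finer view. The `y_true` and `y_pred` lists below are hypothetical; in the notebook, `y_test` and `prediction` would take their place.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hypothetical true and predicted labels for three classes
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0]

# Overall accuracy, the same quantity that score() returns
acc = accuracy_score(y_true, y_pred)
print(acc)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)

# Per-class precision, recall, and F1
report = classification_report(y_true, y_pred, zero_division=0)
print(report)
```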

In [9]:
m_nb_pipe.predict([emo.iloc[3333].Message])

array([4])

In [15]:
# Undersample: draw 653 rows from each class; all 'surprise' rows are kept as-is
anger = emo[emo.Target == 0].sample(653, random_state=124)
fear = emo[emo.Target == 1].sample(653, random_state=124)
joy = emo[emo.Target == 2].sample(653, random_state=124)
love = emo[emo.Target == 3].sample(653, random_state=124)
sadness = emo[emo.Target == 4].sample(653, random_state=124)
surprise = emo[emo.Target == 5]
# Combine the per-class samples into one frame with a fresh index
emo = pd.concat([anger, fear, joy, love, sadness, surprise]).reset_index(drop=True)
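The per-class sampling above can also be written without hard-coding the sample size. The sketch below assumes the intent is to undersample every class to the size of the smallest one (653 is presumably the number of `surprise` rows, which the source does not state). The `toy` frame is hypothetical.

```python
import pandas as pd

# Hypothetical imbalanced data: 6 / 4 / 2 rows for classes 0, 1, 2
toy = pd.DataFrame({
    'Message': [f'm{i}' for i in range(12)],
    'Target': [0] * 6 + [1] * 4 + [2] * 2,
})

# Size of the smallest class
n_min = toy.Target.value_counts().min()

# Draw n_min rows from every class, then rebuild the index
balanced = (
    toy.groupby('Target')
       .sample(n=n_min, random_state=124)
       .reset_index(drop=True)
)
print(balanced.Target.value_counts())
```

`DataFrameGroupBy.sample` requires pandas 1.1 or later.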