
## ***EMOTION DETECTOR FROM GIVEN TEXT***  

+ Uses ML Model  
+ Uses sklearn  
+ Dataset from internet  
    

### Importing the necessary libraries

In [2]:
import re 
from collections import Counter
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_extraction import DictVectorizer

#### Importing the training dataset into the program  
+ The function read_data() reads the data set from the txt file and returns it as a list of [label, text] pairs  
    
    

In [3]:
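def read_data(file):
    # Each line is expected to hold a bracketed label vector followed by the sentence
    data = []
    with open(file, 'r') as f:
        for line in f:
            line = line.strip()
            # Label: everything between '[' and ']', with repeated spaces collapsed
            label = ' '.join(line[1:line.find("]")].strip().split())
            # Text: everything after the closing ']'
            text = line[line.find("]")+1:].strip()
            data.append([label, text])
    return data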

file = 'text.txt'
data = read_data(file)
print(f"Number of instances: {len(data)}")

Number of instances: 7480
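
To make the parsing in read_data() concrete, here is a small sketch applied to a single line. The sample line is made up for illustration; it only assumes the layout the code implies, namely a bracketed label vector followed by the sentence.

```python
# Hypothetical sample line; the real text.txt is assumed to follow the same layout
line = "[ 1.  0.  0.  0.  0.  0.  0.] I passed my exam today"

close = line.find("]")
# Everything between '[' and ']' is the label; collapse repeated spaces
label = ' '.join(line[1:close].strip().split())
# Everything after ']' is the sentence
text = line[close + 1:].strip()

print(label)
print(text)
```

Collapsing the whitespace means that two label vectors with different spacing still compare equal as strings.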


### Tokenisation and feature generation
* The function ngram() builds n-grams (runs of n consecutive tokens) from the tokenised text  
    >Tokenisation is the process of breaking the raw text into small chunks (tokens) so that the ML model can analyse it more easily.  
* The function create_feature() generates features from the existing data  
    >Feature generation is an important step in building our ML model, as it derives new, refined data from the existing data. Here each text becomes a count of its word n-grams and punctuation tokens.  

In [4]:
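def ngram(token, n): 
    # Slide a window of n consecutive tokens over the list and join each window with spaces
    output = []
    for i in range(n-1, len(token)): 
        gram = ' '.join(token[i-n+1:i+1])
        output.append(gram) 
    return output

def create_feature(text, nrange=(1, 1)):
    text_features = [] 
    text = text.lower() 
    # Word features: keep letters, digits and '#', then collect n-grams for every n in nrange
    text_alphanum = re.sub('[^a-z0-9#]', ' ', text)
    tokens = text_alphanum.split()
    for n in range(nrange[0], nrange[1]+1): 
        text_features += ngram(tokens, n)    
    # Punctuation features: blank out letters and digits, keep each remaining run of symbols as a unigram
    text_punc = re.sub('[a-z0-9]', ' ', text)
    text_features += ngram(text_punc.split(), 1)
    return Counter(text_features)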
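
As a quick illustration of what ngram() produces, the sketch below applies the same sliding-window logic to a short, made-up sentence. The helper is redefined here in compact form only so that the snippet runs on its own.

```python
def ngram(token, n):
    # Same sliding-window logic as the notebook's ngram()
    return [' '.join(token[i - n + 1:i + 1]) for i in range(n - 1, len(token))]

tokens = "i am so happy".split()
unigrams = ngram(tokens, 1)
bigrams = ngram(tokens, 2)
too_long = ngram(tokens, 5)

print(unigrams)  # ['i', 'am', 'so', 'happy']
print(bigrams)   # ['i am', 'am so', 'so happy']
print(too_long)  # []
```

When n is larger than the number of tokens, the result is simply an empty list, so short sentences contribute fewer features rather than raising an error.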

#### Function to create list containing the emotions and the labels  
##### ***Emotions are***  
- **Joy**  
- **Anger**
- **Fear**
- **Sadness**  
- **Disgust**  
- **Shame**
- **Guilt**

In [5]:
def convert_label(item, name): 
    items = list(map(float, item.split()))
    label = ""
    for idx in range(len(items)): 
        if items[idx] == 1: 
            label += name[idx] + " "
    
    return label.strip()

emotions = ["joy", 'fear', "anger", "sadness", "disgust", "shame", "guilt"]

xAll = []
yAll = []
for label, text in data:
    yAll.append(convert_label(label, emotions))
    xAll.append(create_feature(text, nrange=(1, 4)))
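
The following sketch shows how convert_label() turns a one-hot label string into an emotion name. The function is restated in a shorter form so the snippet is self-contained; the label string is an example in the same format as the data set labels.

```python
def convert_label(item, name):
    # Same logic as the notebook's convert_label(): pick the names whose flag is 1
    items = list(map(float, item.split()))
    return ' '.join(name[idx] for idx in range(len(items)) if items[idx] == 1)

emotions = ["joy", "fear", "anger", "sadness", "disgust", "shame", "guilt"]

result = convert_label("0. 0. 1. 0. 0. 0. 0.", emotions)
print(result)  # anger
```

The position of the 1 in the vector selects the entry at the same index in the emotions list, so the order of that list must match the order used in the data set.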

### Splitting the data into a test set and a training set.  
>Holding back 20% of the data as a test set lets us measure how well the model performs on text it has not seen during training.  


In [6]:
xTrain, xTest, yTrain, yTest = train_test_split(xAll, yAll, test_size = 0.2, random_state = 123)

def train_test(clf, X_train, X_test, y_train, y_test):
    clf.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, clf.predict(X_train))
    test_acc = accuracy_score(y_test, clf.predict(X_test))
    return train_acc, test_acc

# Learn the feature vocabulary from the training split only, then reuse it for the test split
vectorizer = DictVectorizer(sparse = True)
xTrain = vectorizer.fit_transform(xTrain)
xTest = vectorizer.transform(xTest)
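
A minimal sketch of what DictVectorizer does with the Counter features, using tiny made-up inputs. It assumes scikit-learn is installed; the feature names and counts are illustrative only.

```python
from collections import Counter
from sklearn.feature_extraction import DictVectorizer

# Toy feature dictionaries standing in for create_feature() output
train = [Counter({"happy": 1, "!": 2}), Counter({"sad": 1})]
test = [Counter({"happy": 1, "unseen": 3})]

vec = DictVectorizer(sparse=True)
X_train = vec.fit_transform(train)   # learns one column per feature name
X_test = vec.transform(test)         # reuses the learned columns

print(vec.feature_names_)
print(X_train.shape)
print(X_test.toarray())
```

Features that appear only at transform time (here "unseen") have no column and are silently dropped, which is why the vocabulary should be fitted on the training split only.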

## Training the ML Model  
+ Here we train four ML classifiers and keep the one with the best test accuracy  
    - ***SVC*** \- stands for Support Vector Classification. It is a supervised classification model from the Support Vector Machine family.  
    - ***LSVC*** \- stands for Linear Support Vector Classification. It is another supervised algorithm from the Support Vector Machine family, restricted to a linear kernel.  
    - ***Random Forest Classifier*** \- It is a meta estimator that fits many decision tree classifiers on sub-samples of the data set and averages their predictions to improve accuracy and reduce over-fitting.  
    - ***Decision Tree Classifier*** \- It is a classifier that learns a tree of decision rules from the training data and uses those rules to classify test data.  

In [7]:
svc = SVC()
lsvc = LinearSVC(random_state=123)
rforest = RandomForestClassifier(random_state=123)
dtree = DecisionTreeClassifier()

clifs = [svc, lsvc, rforest, dtree]

# train and test them 
print("| {:25} | {} | {} |".format("Classifier", "Training Accuracy", "Test Accuracy"))
print("| {} | {} | {} |".format("-"*25, "-"*17, "-"*13))
for clf in clifs: 
    clf_name = clf.__class__.__name__
    train_acc, test_acc = train_test(clf, xTrain, xTest, yTrain, yTest)
    print("| {:25} | {:17.7f} | {:13.7f} |".format(clf_name, train_acc, test_acc))

| Classifier                | Training Accuracy | Test Accuracy |
| ------------------------- | ----------------- | ------------- |
| SVC                       |         0.9067513 |     0.4512032 |
| LinearSVC                 |         0.9988302 |     0.5768717 |
| RandomForestClassifier    |         0.9988302 |     0.5541444 |
| DecisionTreeClassifier    |         0.9988302 |     0.4632353 |
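
One way to pick the model programmatically is to compare the test accuracies. The sketch below hard-codes the numbers printed in the table above; the variable names are illustrative and not part of the notebook.

```python
# (training accuracy, test accuracy) copied from the results table
results = {
    "SVC": (0.9067513, 0.4512032),
    "LinearSVC": (0.9988302, 0.5768717),
    "RandomForestClassifier": (0.9988302, 0.5541444),
    "DecisionTreeClassifier": (0.9988302, 0.4632353),
}

# Choose by test accuracy, since training accuracy is nearly identical for three models
best_name = max(results, key=lambda name: results[name][1])

# Gap between training and test accuracy; a large gap suggests over-fitting
gaps = {name: train - test for name, (train, test) in results.items()}

print(best_name)  # LinearSVC
for name, gap in gaps.items():
    print(f"{name:25} gap = {gap:.4f}")
```

All four models score far higher on the training set than on the test set, so the test column is the only meaningful basis for the choice.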


### Detecting the emotions
>We now map each one-hot label in our data set to its emotion name and print the number of instances per emotion to check the class balance.

In [8]:
emo = ["joy", 'fear', "anger", "sadness", "disgust", "shame", "guilt"]
emo.sort()
label_freq = {}
for label, _ in data: 
    label_freq[label] = label_freq.get(label, 0) + 1

# print the labels and their counts in sorted order 
for l in sorted(label_freq, key=label_freq.get, reverse=True):
    print("{:10}({})  {}".format(convert_label(l, emotions), l, label_freq[l]))

joy       (1. 0. 0. 0. 0. 0. 0.)  1084
anger     (0. 0. 1. 0. 0. 0. 0.)  1080
sadness   (0. 0. 0. 1. 0. 0. 0.)  1079
fear      (0. 1. 0. 0. 0. 0. 0.)  1078
disgust   (0. 0. 0. 0. 1. 0. 0.)  1057
guilt     (0. 0. 0. 0. 0. 0. 1.)  1057
shame     (0. 0. 0. 0. 0. 1. 0.)  1045
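
The class counts above also give a useful reference point for the accuracies reported earlier. As a rough sketch, the snippet below computes the accuracy of always guessing the most frequent emotion; the counts are copied from the output above, and the variable names are illustrative.

```python
# Instances per emotion, copied from the printed counts
label_counts = {
    "joy": 1084, "anger": 1080, "sadness": 1079, "fear": 1078,
    "disgust": 1057, "guilt": 1057, "shame": 1045,
}

total = sum(label_counts.values())
majority = max(label_counts, key=label_counts.get)
# Accuracy of a classifier that always predicts the most frequent class
baseline = label_counts[majority] / total

print(total)     # 7480
print(majority)  # joy
print(f"{baseline:.3f}")
```

Because the seven classes are almost evenly balanced, this baseline is close to one in seven, so a test accuracy well above it indicates that the model has learned something from the text.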


>#### ***Finally, we run the model on some text. The text can be predefined or entered by the user; the model then predicts an emotion for it and prints the result.***

>Here we take a paragraph as input from the user and let the model analyse each sentence. This lets us find the user's dominant emotion and give advice based on it.
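
Splitting the paragraph into sentences can be sketched as below. Note that this is a variation, not the notebook's exact approach: it also treats '!' and '?' as sentence endings and drops empty pieces, and split_sentences is a hypothetical helper name.

```python
import re

def split_sentences(paragraph):
    # Split on runs of '.', '!' or '?', then drop empty or whitespace-only pieces
    parts = re.split(r"[.!?]+", paragraph)
    return [p.strip() for p in parts if p.strip()]

sentences = split_sentences("I am happy. I won the match!  ")
print(sentences)  # ['I am happy', 'I won the match']
```

Dropping the empty pieces matters because a paragraph that ends with a full stop would otherwise yield one extra empty "sentence" that still receives a prediction.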

In [17]:
#emoji_dict = {"joy":"😂", "fear":"😱", "anger":"😠", "sadness":"😢", "disgust":"😒", "shame":"😳", "guilt":"😳"}
user_input = input("Enter whatever sentences you feel! Please use '.' after each sentence.")

# Use the classifier with the best test accuracy (LinearSVC);
# after the training loop, clf only refers to the last model trained
best_clf = lsvc

user_emo = []
# Split into sentences and drop the empty string left after the final '.'
texts = [t.strip() for t in user_input.split(".") if t.strip()]
if not texts:
    raise ValueError("Please enter at least one sentence.")
for text in texts: 
    features = create_feature(text, nrange=(1, 4))
    features = vectorizer.transform(features)
    prediction = best_clf.predict(features)[0]
    #print(text,emoji_dict[prediction])
    user_emo.append(prediction)
# The most frequently predicted emotion is the dominant one
dom_emo = Counter(user_emo).most_common(1)[0][0]
emotional_text={"joy":"Happiness", "fear":"Fear", "anger":"Anger", "sadness":"Sadness", "disgust":"Disgust", "shame":"Shame", "guilt":"Guilt"}
emotional_advice = {'joy':"It is nice that you are happy and cheerful.\nKeep going the same way and don't lose hope!!",
                    'sadness':"Don't be sad. Being sad won't solve anything.\nFind the cause of that sadness and fight it to be happy.\nEnjoy life and be happy!",
                    'fear':"Fear is absolutely normal.\nBut staying afraid isn't nice.\nHence break out of that shell and face your fear.\nTake help from a close friend and overcome it!!!",
                    'anger':"Anger is a dangerous emotion!\nNever give in to anger.\nMeditate and let go of it.\nRevenge and anger are never the solution to anything.\nLet go of that emotion!!",
                    'disgust':"Feeling disgust about something?\nMake it into something else.\nBeing disgusted will only make you hate it more.\nHence let it go and be calm.",
                    'shame':"Ashamed of something you did?\nIt is natural. But don't stay that way.\nDo something to make it right.\nIt'll make you happier and less ashamed.",
                    'guilt':"Guilty?\nIt is not good.\nBeing in guilt will ruin your life.\nTry to get rid of it as soon as possible and make amends.\nThat is the only way to move forward."}
print(f"So you are mainly feeling {emotional_text[dom_emo]}!!\n")
print("Here is some advice for you!!!")
print(emotional_advice[dom_emo])

So you are mainly feeling Happiness!!

Here is some advice for you!!!
It is nice that you are happy and cheerful.
Keep going the same way and don't lose hope!!
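
Finding the dominant emotion amounts to a majority vote over the per-sentence predictions. A compact sketch with Counter.most_common follows; the list of predictions is made up for illustration.

```python
from collections import Counter

# Hypothetical per-sentence predictions
predictions = ["joy", "sadness", "joy", "fear", "joy"]

counts = Counter(predictions)
# most_common(1) returns a list with the single (label, count) pair that has the highest count
dominant, votes = counts.most_common(1)[0]

print(dominant, votes)  # joy 3
```

If two emotions are tied, most_common returns the one that was encountered first, so the result then depends on sentence order.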
