### Step 0: Install important libraries


Step 0.1: Install the libraries with the “pip” command by running the following code. pip skips any library that is already installed on the system.

In [1]:
%%time
!pip install numpy
!pip install pandas
!pip install nltk
!pip install scikit-learn
!pip install matplotlib
import nltk
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')
nltk.download('stopwords')
nltk.download('omw-1.4')



[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\orange pc\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package wordnet to C:\Users\orange
[nltk_data]     pc\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package stopwords to C:\Users\orange
[nltk_data]     pc\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!

Wall time: 44.7 s

True

### Step 1: Import necessary libraries


Step 1.1: Import the libraries used in the rest of the notebook by running the following code.

In [2]:
%%time
import numpy as np
import pandas as pd
import re
import time
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.tag import pos_tag
from nltk.corpus import wordnet
from nltk.corpus import stopwords
from sklearn import tree
import matplotlib.pyplot as plt

Wall time: 7.3 s


### Step 2: Import Dataset

Step 2.1: The dataset is stored on GitHub. The following code reads it from “Tweets.csv” into a pandas DataFrame.

In [3]:
%%time
tweets = pd.read_csv("Tweets.csv")
pd.DataFrame(tweets)

Wall time: 353 ms


Unnamed: 0,sentiment,tweet
0,Positive,im getting on borderlands and i will murder yo...
1,Positive,I am coming to the borders and I will kill you...
2,Positive,im getting on borderlands and i will kill you ...
3,Positive,im coming on borderlands and i will murder you...
4,Positive,im getting on borderlands 2 and i will murder ...
...,...,...
74677,Positive,Just realized that the Windows partition of my...
74678,Positive,Just realized that my Mac window partition is ...
74679,Positive,Just realized the windows partition of my Mac ...
74680,Positive,Just realized between the windows partition of...


The dataset is now loaded into “tweets”, which is a pandas DataFrame. We can preview it with the following code.

Called without an argument, the pandas function head() returns the first five rows of the DataFrame. Here we pass 600 to get the first 600 rows, which the notebook displays in truncated form.

In [4]:
tweets.head(600)

Unnamed: 0,sentiment,tweet
0,Positive,im getting on borderlands and i will murder yo...
1,Positive,I am coming to the borders and I will kill you...
2,Positive,im getting on borderlands and i will kill you ...
3,Positive,im coming on borderlands and i will murder you...
4,Positive,im getting on borderlands 2 and i will murder ...
...,...,...
595,Positive,IN THE SO FICKING EXCELLENT
596,Positive,The Proton-M launch vehicle that disappeared f...
597,Positive,IM SO FUCKING IN
598,Positive,IM AT SO GO FUCKING EXCITED


### Step 3: Data Preprocessing

Step 3.1: Separate features and labels for further preprocessing.

In [5]:
%%time
features = tweets["tweet"]
labels = tweets["sentiment"]

Wall time: 0 ns


Step 3.2: Clean individual tweets by removing hyperlinks, reply tags, special characters, extra spaces, and single characters, and finally convert them to lower case.

In [6]:
%%time
processed_features = []
for sentence in range(0, len(features)):
    processed_feature = re.sub("(@[A-Za-z0-9_]+)", "",
                               str(features[sentence]))
    processed_feature = re.sub(
        r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+#]|[!*\(\),]|'
        r'(?:%[0-9a-fA-F][0-9a-fA-F]))+', '', processed_feature)
 
    processed_feature = re.sub(r"'", '', processed_feature)
 
    processed_feature = re.sub(r'\W', ' ', processed_feature)
 
    processed_feature = re.sub(r'\s+[a-zA-Z1-9]\s+', ' ', processed_feature)
 
    processed_feature = re.sub(r'^[a-zA-Z1-9]\s+', '', processed_feature) 
 
    processed_feature = re.sub(r'\s+[a-zA-Z1-9]$', '', processed_feature) 
 
    processed_feature = re.sub(r'\s+', ' ', processed_feature)
 
    processed_feature = processed_feature.strip()
 
    processed_feature = processed_feature.lower()
    
    processed_features.append(processed_feature)
    

Wall time: 4.2 s
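To see what the cleaning steps do to a single tweet, the sketch below wraps the same regular expressions in a helper and applies it to one made-up string. The function name `clean_tweet`, the sample text, the `@gamer` handle, and the example.com URL are illustrative assumptions, not part of the dataset.

```python
import re

def clean_tweet(text):
    """Apply the cleaning steps from Step 3.2 to one string (hypothetical helper)."""
    text = re.sub(r"(@[A-Za-z0-9_]+)", "", str(text))      # reply tags such as @user
    text = re.sub(
        r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+#]|[!*\(\),]|"
        r"(?:%[0-9a-fA-F][0-9a-fA-F]))+", "", text)         # hyperlinks
    text = re.sub(r"'", "", text)                           # apostrophes
    text = re.sub(r"\W", " ", text)                         # remaining special characters
    text = re.sub(r"\s+[a-zA-Z1-9]\s+", " ", text)          # single characters inside the text
    text = re.sub(r"^[a-zA-Z1-9]\s+", "", text)             # single character at the start
    text = re.sub(r"\s+[a-zA-Z1-9]$", "", text)             # single character at the end
    text = re.sub(r"\s+", " ", text)                        # collapse extra spaces
    return text.strip().lower()

# Made-up sample tweet used only for illustration
sample = "@gamer I can't wait!! Check https://example.com/x now"
print(clean_tweet(sample))  # cant wait check now
```

Note that the single-character rule also removes the pronoun "I", which is why the cleaned previews in this notebook read "and will murder" rather than "and i will murder".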


Step 3.3:  Preview the cleaned dataset with the following code.

In [7]:
pd.DataFrame(processed_features).head()

Unnamed: 0,0
0,im getting on borderlands and will murder you all
1,am coming to the borders and will kill you all
2,im getting on borderlands and will kill you all
3,im coming on borderlands and will murder you all
4,im getting on borderlands and will murder you ...


Step 3.4: Tokenize each tweet, lemmatize the tokens, then remove stop words and numbers.

In [8]:
%%time
lemmaTokens = []
rawTokens = []
lemmatizer = WordNetLemmatizer()
# Build the stop-word set once; calling stopwords.words() for every token is slow
stopWords = set(stopwords.words("english"))
 
# Tokenizing
for sentence in processed_features:
    tempTokens = sentence.split()
    rawTokens.append(tempTokens)
    lemmatizedTokens = []
    
    # proceed lemmatization
    for singleToken, tag in pos_tag(tempTokens):
        # Preparing for morphological analysis
        if tag.startswith("NN"):
            pos = 'n'
        elif tag.startswith('VB'):
            pos = 'v'
        else:
            pos = 'a'
            
        token = lemmatizer.lemmatize(singleToken, pos)
        
        # keep the token only if it is neither a stop word nor a number
        if token not in stopWords and not token.isdigit():
            lemmatizedTokens.append(token)
        
    # append the lemmatized tokens of sentence in the list
    lemmaTokens.append(lemmatizedTokens)

Wall time: 11min 38s
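The lemmatizer in Step 3.4 needs a WordNet part-of-speech letter, while pos_tag returns Penn Treebank tags such as NNS or VBG. The sketch below isolates that mapping so it can be checked on its own. The function name `to_wordnet_pos` and the hand-written (token, tag) pairs are assumptions for illustration; they stand in for real tagger output.

```python
def to_wordnet_pos(tag):
    """Map a Penn Treebank tag to the WordNet POS letter used in Step 3.4."""
    if tag.startswith("NN"):
        return "n"  # nouns: NN, NNS, NNP, NNPS
    if tag.startswith("VB"):
        return "v"  # verbs: VB, VBD, VBG, VBN, VBP, VBZ
    return "a"      # every other tag is treated as an adjective

# Hand-written pairs standing in for pos_tag output (illustrative only)
tagged = [("borderlands", "NNS"), ("getting", "VBG"),
          ("excited", "JJ"), ("really", "RB")]

for token, tag in tagged:
    print(token, tag, to_wordnet_pos(tag))
```

With this mapping, adverbs and all other non-noun, non-verb tags fall into the adjective bucket, which is the same simplification the loop above makes.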


Step 3.5: Remove empty documents

In [9]:
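# Keep only documents that still contain at least one token. Filtering with a
# mask avoids popping from the lists while iterating over them, which skips
# items and shifts the positions of the remaining ones.
keep = [len(tok) > 0 for tok in lemmaTokens]
lemmaTokens = [tok for tok, k in zip(lemmaTokens, keep) if k]
rawTokens = [tok for tok, k in zip(rawTokens, keep) if k]
labels = labels[keep].reset_index(drop=True)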

Now we have a cleaned token list for each tweet. Preview the dataset with the following code.

In [10]:
pd.DataFrame(lemmaTokens).head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,188,189,190,191,192,193,194,195,196,197
0,im,get,borderland,murder,,,,,,,...,,,,,,,,,,
1,come,border,kill,,,,,,,,...,,,,,,,,,,
2,im,get,borderland,kill,,,,,,,...,,,,,,,,,,
3,im,come,borderland,murder,,,,,,,...,,,,,,,,,,
4,im,get,borderland,murder,,,,,,,...,,,,,,,,,,


Step 3.6: Convert the text data into numerical form by calculating TF-IDF.

Step 3.6.1: Calculate TF. To calculate TF-IDF we first calculate the term frequency (TF), then the inverse document frequency (IDF), and finally their product, TF-IDF.
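Written out, the three quantities computed in Steps 3.6.1 to 3.6.3 are as follows. The symbols are notation introduced here for illustration: $f_{t,d}$ is the number of times token $t$ occurs in document $d$, $|d|$ is the number of tokens in $d$, $N$ is the number of documents, and $n_t$ is the number of documents that contain $t$. The base-10 logarithm matches the `math.log10` call in Step 3.6.2.

$$
\mathrm{tf}(t,d)=\frac{f_{t,d}}{|d|},\qquad
\mathrm{idf}(t)=\log_{10}\frac{N}{n_t},\qquad
\mathrm{tfidf}(t,d)=\mathrm{tf}(t,d)\cdot\mathrm{idf}(t)
$$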

In [None]:
%%time
words = set()
for documentTokens in lemmaTokens:
    for token in documentTokens:
        words.add(token)
uniqueWords = list(words)
documents = []
for i in range(0,len(lemmaTokens)):
    documents.append(dict.fromkeys(uniqueWords,0))
    
count = 0
for documentTokens in lemmaTokens:
    for token in documentTokens:
        documents[count][token]+=1
    count+=1

def calculateTF(wordDict, bow):
    tfDict = {}
    bowCount = len(bow)
    for word, count in wordDict.items():
        if bowCount == 0:
            print(wordDict)
            print(bow)
        try:
            tfDict[word] = count/float(bowCount)
        except ZeroDivisionError:
            print("error")
    return tfDict
 
TF = []
count = 0
for document in documents:
    TF.append(calculateTF(document,lemmaTokens[count]))
    count += 1

Step 3.6.2: Calculate IDF.

In [None]:
def calculateIDF(docList):
    import math
    idfDict = {}
    N = len(docList)
    
    idfDict = dict.fromkeys(docList[0].keys(), 0)
    for doc in docList:
        for word, val in doc.items():
            if val > 0:
                idfDict[word] += 1
    
    for word, val in idfDict.items():
        idfDict[word] = math.log10(N / float(val))
        
    return idfDict

IDFs = calculateIDF(documents)

Step 3.6.3: Calculate TFIDF.

In [None]:
%%time
 
def calculateTFIDF(tfBow, idfs):
    tfidf = {}
    for word, val in tfBow.items():
        tfidf[word] = val*idfs[word]
    return tfidf
 
TFIDF = []
count = 0
for document in TF:
    TFIDF.append(calculateTFIDF(document,IDFs))
    count += 1

TFIDF = pd.DataFrame(TFIDF)

We have successfully converted our text data into numerical data.
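To make the calculation concrete, here is a minimal sketch of the same TF, IDF, and TF-IDF arithmetic on a three-document toy corpus. The corpus and the function names `term_frequency` and `inverse_document_frequency` are made up for illustration. Unlike the notebook code, this sketch stores only the words that occur in each document instead of a zero for every word in the vocabulary.

```python
import math

# Toy corpus: each document is already a list of cleaned tokens (illustrative only)
docs = [["good", "game"], ["bad", "game"], ["good", "good", "fun"]]

def term_frequency(tokens):
    """Share of the document's tokens taken up by each word."""
    return {t: tokens.count(t) / len(tokens) for t in set(tokens)}

def inverse_document_frequency(documents):
    """log10(N / number of documents containing the word), as in Step 3.6.2."""
    n_docs = len(documents)
    vocab = {t for d in documents for t in d}
    return {t: math.log10(n_docs / sum(1 for d in documents if t in d))
            for t in vocab}

idf = inverse_document_frequency(docs)
tfidf = [{t: tf * idf[t] for t, tf in term_frequency(d).items()} for d in docs]

for i, row in enumerate(tfidf):
    print(i, {t: round(v, 3) for t, v in sorted(row.items())})
```

In this toy corpus "bad" appears in only one document, so it gets a higher weight in the second document than "game", which appears in two.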

### Step 4: Split Train and Test data.

To evaluate the Machine Learning Algorithms on data they were not trained on, we have to split our dataset into train and test data. The following code holds out 30% of the rows as test data.

In [None]:
%%time
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(TFIDF, labels, test_size=0.3, random_state = 40)

### Step 5: Train Machine Learning Algorithms with Training data.

Step 5.1: Training with Decision Tree Classifier.

In [None]:
%%time
from sklearn.tree import DecisionTreeClassifier
decisionTree = DecisionTreeClassifier()
decisionTree.fit(X_train,y_train)

Step 5.2: Training with Naïve Bayes classifier.

In [None]:
%%time
from sklearn.naive_bayes import MultinomialNB
naiveBayes = MultinomialNB()
naiveBayes.fit(X_train,y_train)

Step 5.3: Training with KNN (K-Nearest Neighbors) classifier.

In [None]:
%%time
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier()
knn.fit(X_train,y_train)

Step 5.4: Training with Support Vector classifier.

In [None]:
%%time
from sklearn.svm import SVC
svc = SVC()
svc.fit(X_train,y_train)

### Step 6: Training Ensemble Learning Algorithms

Step 6.1: Training with Voting Classifier.

In [None]:
%%time
from sklearn.ensemble import VotingClassifier
votingClf = VotingClassifier(estimators=[('decisionTree',decisionTree),
    ('naiveBayes',naiveBayes),('knn',knn),('svc',svc)],voting='hard')
votingClf.fit(X_train,y_train)

Step 6.2: Training with Bagging Classifier.

In [None]:
%%time
from sklearn.ensemble import BaggingClassifier
bagging = BaggingClassifier(n_estimators=4)
bagging.fit(X_train,y_train)

Step 6.3: Training with AdaBoost Classifier.

In [None]:
%%time
from sklearn.ensemble import AdaBoostClassifier
adaBoost = AdaBoostClassifier(n_estimators=4)
adaBoost.fit(X_train,y_train)

Step 6.4: Training with RandomForest Classifier.

In [None]:
%%time
from sklearn.ensemble import RandomForestClassifier
randomForest = RandomForestClassifier(n_estimators=4)
randomForest.fit(X_train,y_train)

### Step 7: Predict the labels of Test data using Trained Algorithms.

Step 7.1: Predicting with Decision Tree.

In [None]:
%%time
dtPreds = decisionTree.predict(X_test)

Step 7.2: Predicting with Naïve Bayes.

In [None]:
%%time
nbPreds = naiveBayes.predict(X_test)

Step 7.3: Predicting with KNN.

In [None]:
%%time
knnPreds = knn.predict(X_test)

Step 7.4: Predicting with Support Vector Machine

In [None]:
%%time
svmPreds = svc.predict(X_test)

Step 7.5: Predicting with Voting Classifier

In [None]:
%%time
votingPreds = votingClf.predict(X_test)

Step 7.6: Predicting with Bagging Classifier

In [None]:
%%time
bagPreds = bagging.predict(X_test)

Step 7.7: Predicting with AdaBoost Classifier

In [None]:
%%time
adaBoostPreds = adaBoost.predict(X_test)

Step 7.8: Predicting with RandomForest Classifier

In [None]:
%%time
randomForestPreds = randomForest.predict(X_test)

### Step 8: Finding Confusion Matrix, Classification Report, Accuracy Score

Step 8.1: For Decision Tree.

In [None]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,dtPreds))
print('\nClassification Report:\n\n',classification_report(y_test,dtPreds))
print('\nAccuracy Score: ',accuracy_score(y_test, dtPreds))
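As a reminder of what the confusion matrix and the accuracy score measure, the sketch below computes both by hand for five made-up predictions. The label values and the helper names `confusion_counts` and `accuracy` are assumptions for illustration; the rows are the true classes and the columns are the predicted classes, in sorted order, which is also the layout scikit-learn uses.

```python
from collections import Counter

def confusion_counts(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(t, p)] for p in classes] for t in classes]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up labels for illustration only
y_true = ["Positive", "Negative", "Positive", "Neutral", "Negative"]
y_pred = ["Positive", "Positive", "Positive", "Neutral", "Negative"]

classes = sorted(set(y_true))  # ['Negative', 'Neutral', 'Positive']
matrix = confusion_counts(y_true, y_pred, classes)
for name, row in zip(classes, matrix):
    print(name, row)
print("accuracy:", accuracy(y_true, y_pred))  # accuracy: 0.8
```

The single off-diagonal entry is the one "Negative" tweet that was predicted as "Positive"; everything on the diagonal counts toward the accuracy.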

Step 8.2: For Naïve Bayes.

In [None]:
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,nbPreds))
print('\nClassification Report:\n\n',classification_report(y_test,nbPreds))
print('\nAccuracy Score: ',accuracy_score(y_test, nbPreds))

Step 8.3: For KNN (K Nearest Neighbors)

In [None]:
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,knnPreds))
print('\nClassification Report:\n\n',classification_report(y_test,knnPreds))
print('\nAccuracy Score: ',accuracy_score(y_test, knnPreds))

Step 8.4: For Support Vector Machine.

In [None]:
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,svmPreds))
print('\nClassification Report:\n\n',classification_report(y_test,svmPreds))
print('\nAccuracy Score: ',accuracy_score(y_test, svmPreds))

Step 8.5: For Voting Classifier.

In [None]:
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,votingPreds))
print('\nClassification Report:\n\n',
	classification_report(y_test,votingPreds))
print('\nAccuracy Score: ',accuracy_score(y_test, votingPreds))


Step 8.6: For Bagging Classifier.

In [None]:
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,bagPreds))
print('\nClassification Report:\n\n',classification_report(y_test,bagPreds))
print('\nAccuracy Score: ',accuracy_score(y_test, bagPreds))

Step 8.7: For AdaBoost Classifier.

In [None]:
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,adaBoostPreds))
print('\nClassification Report:\n\n',
classification_report(y_test,adaBoostPreds))
print('\nAccuracy Score: ',accuracy_score(y_test, adaBoostPreds))

Step 8.8: For RandomForest Classifier.

In [None]:
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,randomForestPreds))
print('\nClassification Report:\n\n',
classification_report(y_test,randomForestPreds))
print('\nAccuracy Score: ',accuracy_score(y_test, randomForestPreds))

### Step 9: Implementing the proposed TieWeakVoting Ensemble Learning algorithm, an updated version of the Voting Ensemble Learning algorithm.

Step 9.1: Implementation of the TieWeakVoting Class

In [None]:
import pandas as panda
import time


class TieWeakVoting:
    def __init__(self,estimators,allPreds=None):
        self.estimators = estimators
        self.allPreds = allPreds
    
    def fit(self,X_train,y_train):
        print("Finding weak label on training data")
        
        uniqueEntry = self.getUniqueEntry(y_train)
        print(uniqueEntry)
        entryCount = self.getEntryCount(uniqueEntry,y_train)
        print(entryCount)
        self.weak = self.findWeakLabel(entryCount)
        print("Weak Label Found: "+self.weak)
        
        for clf in self.estimators:
            startTime = time.time()
            print(clf[0] + " Fitting...")
            clf[1].fit(X_train,y_train)
            print(clf[0] +" Fitted in "+str((time.time()- startTime)))
        print("Fitting Successful")
       
    
    def getEntryCount(self,uniqueEntry,y_train):
        entryCount = dict()
        for name in uniqueEntry:
            entryCount[name] = 0

        for row in y_train:
            for entry in uniqueEntry:
                if row==entry:
                    entryCount[entry]  = entryCount[entry] + 1
        
        return entryCount
    
    
    def predict(self,X_test):
        self.allPreds = pd.DataFrame()
        
        #Predicting with estimators
        for clf in self.estimators:
            
            start_time = time.time() 
            print(clf[0] +" Predicting...")
            preds = clf[1].predict(X_test)
            predsDF = panda.DataFrame(preds)
            self.allPreds[clf[0]] = predsDF[0]
            print(clf[0] +" Predicted in "+str((time.time()- start_time)))
        
        
        print("Applying Ensemble Learning Algorithm on Predicted Data.")
        
        
        return self.getPreds()
    
    
    
    def getPreds(self):
        newPreds = list()

        for index, row in self.allPreds.iterrows():
            totalItemDict = dict()
            tempSet = set(row)

            for item in tempSet:
                totalItemDict[item] = 0

            for item in list(row):
                totalItemDict[item] = totalItemDict[item] + 1

            newPreds.append(self.chooseLabel(totalItemDict,self.weak))
        print("Prediction Successful")
        return newPreds
    
    
    
    def getUniqueEntry(self,y_train):
        return list(y_train.drop_duplicates())
    
    
    def findWeakLabel(self,entryCount):
        entries = list(entryCount.items())

        label = entries[0][0]
        weak = entries[0][1]

        for count in entries:

            if weak>count[1]:
                weak = count[1]
                label = count[0] 
        return label
    
    
    def chooseLabel(self,entryCount,weak):
        high = 0
        highLabel = ''
        low = 0
        lowLabel = ''

        entries = list(entryCount.items())
        finalLabel = ''
        if len(entries)>1:
            evenCount = 0
            for count in entries:
                if count[1]<high:
                    low = count[1]
                elif count[1]>high:
                    weakMark = False
                    high = count[1]
                    highLabel = count[0]
                else:
                    weakMark = True
                    low = count[1]
                    lowLabel = count[0]


            if low<high and not weakMark:
                return highLabel
            else:
                return weak
        else:
            return entries[0][0]
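The voting rule in the class above can be hard to follow through the nested counters, so here is a minimal sketch of the behavior as it reads from the code: the "weak" label is the least frequent label in the training data, each sample gets the label with the most votes, and when the top vote count is shared the weak label is returned instead. The function names `weak_label` and `tie_weak_vote` and the sample labels are assumptions for illustration, not part of the class.

```python
from collections import Counter

def weak_label(y_train):
    """Least frequent label in the training data (first one wins on equal counts)."""
    counts = Counter(y_train)
    return min(counts, key=counts.get)

def tie_weak_vote(votes, weak):
    """Majority vote; if the highest count is shared, fall back to the weak label."""
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return weak
    return ranked[0][0]

# Made-up training labels: "Neutral" is the rarest class
y_train = ["Positive"] * 5 + ["Negative"] * 3 + ["Neutral"] * 2
weak = weak_label(y_train)

print(tie_weak_vote(["Positive", "Negative", "Positive", "Positive"], weak))  # Positive
print(tie_weak_vote(["Positive", "Positive", "Negative", "Negative"], weak))  # Neutral
```

The second call shows a consequence of this rule worth keeping in mind: on a tie the weak label is returned even when none of the estimators voted for it.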

Step 9.2: Training the TieWeakVoting Algorithm on the Training Dataset

In [None]:
tieWeakVoting = TieWeakVoting(estimators=[('decisionTree',decisionTree),
('naiveBayes',naiveBayes),('knn',knn),('svc',svc)])
tieWeakVoting.fit(X_train,y_train)

Step 9.3: Predicting with the TieWeakVoting Algorithm on the Test Dataset

In [None]:
twvPreds = tieWeakVoting.predict(X_test)

We have the predictions from the TieWeakVoting Algorithm in the ‘twvPreds’ variable.

Step 9.4: Finding Confusion Matrix, Classification Report, Accuracy Score of TieWeakVoting Algorithm.

In [None]:
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,twvPreds))
print('\nClassification Report:\n\n',classification_report(y_test,twvPreds))
print('\nAccuracy Score: ',accuracy_score(y_test, twvPreds))

In [None]:
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
rf  = RandomForestClassifier()
Sfolds = StratifiedKFold(n_splits=5,shuffle=True)
sc_Srf=cross_val_score(rf,TFIDF,labels,cv=Sfolds)*100
print('Random Forest--> Accuracy for all iteration ', sc_Srf)
ME_Srf = sc_Srf.mean()
print('Random Forest-->Mean Accuracy %.2f' %ME_Srf)

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import KFold
from sklearn.multiclass import OneVsRestClassifier
OvR = OneVsRestClassifier(RandomForestClassifier())
OvR.fit(X_train, y_train)

In [None]:
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
rf  = RandomForestClassifier(n_estimators=50)
Sfolds = StratifiedKFold(n_splits=4,shuffle=True)
sc_Srf=cross_val_score(rf,TFIDF,labels,cv=20)*100
print('Random Forest--> Accuracy for all iteration ', sc_Srf)
ME_Srf = sc_Srf.mean()
print('Random Forest-->Mean Accuracy %.2f' %ME_Srf)

In [None]:
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
rf  = RandomForestClassifier(n_estimators=50)
Sfolds = StratifiedKFold(n_splits=246,shuffle=True)
sc_Srf=cross_val_score(rf,TFIDF,labels,cv=20)*100
print('Random Forest--> Accuracy for all iteration ', sc_Srf)
ME_Srf = sc_Srf.mean()
print('Random Forest-->Mean Accuracy %.2f' %ME_Srf)

In [None]:
df=pd.read_csv(r'C:\Users\orange pc\Downloads\new.csv')  #new.csv = TFIDF data 
df.head(5)

In [None]:
x=df.drop('labels',axis=1)  # cluster on the features only, not on the label column
y=df['labels']

In [None]:
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=5, random_state=0) 
kmeans.fit(x)

In [None]:
kmeans.cluster_centers_

In [None]:
labels = kmeans.labels_
correct_labels = sum(y == labels)
print("Result: %d out of %d samples were correctly labeled." % (correct_labels, y.size))

In [None]:
np_array=df.to_numpy()

In [None]:
from sklearn.model_selection import KFold
kfold = KFold(n_splits=5, shuffle=True, random_state=1)
model = 1
for train, test in kfold.split(np_array):
    print('Model #%d:' % model)
    print('train: %s, test: %s' % (train, test))
    model = model+1  # advance the counter once per fold


In [None]:
x1=df.drop('labels',axis=1)
y1=df['labels']

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
x_train, x_test, y_train, y_test = train_test_split(x1, y1, test_size=0.2, random_state=42,stratify=y1)


In [None]:
from sklearn.tree import DecisionTreeClassifier
dt= DecisionTreeClassifier()
dt.fit(x_train,y_train)
y_pred=dt.predict(x_test)


In [None]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test,y_pred))
print('\nClassification Report:\n\n',classification_report(y_test,y_pred))
print('\nAccuracy Score: ',accuracy_score(y_test, y_pred))

In [None]:
from sklearn.ensemble import BaggingClassifier

In [None]:
bag_model = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # this parameter was named base_estimator before scikit-learn 1.2
    n_estimators=100,
    max_samples=0.8,
    bootstrap=True,
    oob_score=True,
    random_state=0
)

In [None]:
bag_model.fit(x_train, y_train)

In [None]:
bag_model.oob_score_

In [None]:
bag_model.score(x_test, y_test)

In [None]:
from sklearn import svm
clf = svm.SVC()

In [None]:
bag_model1 = BaggingClassifier(
    estimator=svm.SVC(),  # this parameter was named base_estimator before scikit-learn 1.2
    n_estimators=100,
    max_samples=0.8,
    bootstrap=True,
    oob_score=True,
    random_state=0
)

In [None]:
bag_model1.fit(x_train, y_train)

In [None]:
bag_model1.oob_score_

In [None]:
def clustered_Sampling(df, n_per_cluster, num_select_clusters):
    N = len(df)
    K = int(N/n_per_cluster)
    data = None
    for k in range(K):
        sample_k = df.sample(n_per_cluster)
        sample_k["cluster"] = np.repeat(k,len(sample_k))
        df = df.drop(index = sample_k.index)
        data = pd.concat([data,sample_k],axis = 0)

    # Pick distinct clusters; randint could return the same cluster more than once
    random_chosen_clusters = np.random.choice(K, size = num_select_clusters, replace = False)
    samples = data[data.cluster.isin(random_chosen_clusters)]
    return samples

sample = clustered_Sampling(df = df, n_per_cluster = 2806, num_select_clusters = 5)
sample

In [None]:
def systematic_sampling(df, step):
 
    indexes = np.arange(0, len(df), step=step)
    systematic_sample = df.iloc[indexes]
    return systematic_sample
 
systematic_sample = systematic_sampling(df,5)
 

In [None]:
x2=sample.drop('labels',axis=1)
y2=sample['labels']

In [None]:
x_train1, x_test1, y_train1, y_test1 = train_test_split(x2, y2, test_size=0.2)


In [None]:
from sklearn.tree import DecisionTreeClassifier
dt= DecisionTreeClassifier()
dt.fit(x_train1,y_train1)
y_pred10=dt.predict(x_test1)


In [None]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
print('\nConfusion Matrix:\n\n',confusion_matrix(y_test1,y_pred10))
print('\nClassification Report:\n\n',classification_report(y_test1,y_pred10))
print('\nAccuracy Score: ',accuracy_score(y_test1, y_pred10))