### We have built text classification models using scikit-learn; now let's look inside the model to understand how it makes predictions and which features are most important
First, let's quickly build the model again using SGDClassifier

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn import metrics

In [2]:
df = pd.read_csv('bbc-text.csv')
print(df.shape, df['category'].nunique())
df.head(2)

(2225, 2) 5


Unnamed: 0,category,text
0,tech,tv future in the hands of viewers with home th...
1,business,worldcom boss left books alone former worldc...


In [3]:
df['category'].value_counts()

sport            511
business         510
politics         417
tech             401
entertainment    386
Name: category, dtype: int64

In [4]:
sgd = Pipeline([("tfidf_vector_com",TfidfVectorizer()), ( "clf", SGDClassifier())])

In [5]:
%%time
X_train, X_test, y_train, y_test = train_test_split(
    df['text'], df['category'], test_size=.2, stratify=df['category'], random_state=42)
sgd.fit(X_train, y_train)
pred_test = sgd.predict(X_test)
pred_train = sgd.predict(X_train)
print("test accuracy", np.mean(pred_test == y_test))
print("train accuracy", np.mean(pred_train == y_train))
print(metrics.classification_report(y_test, pred_test))

test accuracy 0.9775280898876404
train accuracy 1.0
               precision    recall  f1-score   support

     business       0.97      0.97      0.97       102
entertainment       0.96      0.99      0.97        77
     politics       0.98      0.96      0.97        84
        sport       1.00      1.00      1.00       102
         tech       0.97      0.96      0.97        80

     accuracy                           0.98       445
    macro avg       0.98      0.98      0.98       445
 weighted avg       0.98      0.98      0.98       445

CPU times: user 1.92 s, sys: 274 ms, total: 2.2 s
Wall time: 899 ms


### Now that we have the model, let's dissect it to gain insights
The model pipeline contains two steps: TfidfVectorizer for feature extraction and SGDClassifier as the classifier

In [6]:

sgd

Pipeline(steps=[('tfidf_vector_com', TfidfVectorizer()),
                ('clf', SGDClassifier())])

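The idf_ attribute of the vectorizer holds the inverse document frequency (IDF) weight learned for each word token, one value per feature. TF-IDF is a common term weighting scheme in information retrieval; these IDF weights scale the term counts to produce the TF-IDF values that are fed into the SGD classifier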

In [7]:
print(sgd['tfidf_vector_com'].idf_.shape)
sgd['tfidf_vector_com'].idf_

(26795,)


array([6.69317081, 2.47121513, 7.7917831 , ..., 7.7917831 , 7.7917831 ,
       7.38631799])
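The values in the array above can be reproduced by hand. The following is a minimal sketch, assuming the vectorizer runs with its default settings (smooth_idf=True), under which scikit-learn documents the IDF as ln((1 + n) / (1 + df)) + 1, and assuming the training set has 1780 documents (2225 rows minus the 445 test rows). The helper name smoothed_idf is made up for illustration.

```python
import math

def smoothed_idf(n_docs, doc_freq):
    """Smoothed IDF as documented for scikit-learn's default smooth_idf=True."""
    return math.log((1 + n_docs) / (1 + doc_freq)) + 1

# Assumption: 2225 rows in total, 445 of them held out for testing
n_train = 2225 - 445

# A token that appears in exactly one training document
print(round(smoothed_idf(n_train, 1), 4))  # 7.7918
# A token that appears in two training documents
print(round(smoothed_idf(n_train, 2), 4))  # 7.3863
```

Under these assumptions, the repeated value 7.7917831 in idf_ would correspond to tokens that occur in a single training document, so a high IDF marks a rare token.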

Check the model's target classes and the feature coefficients for each class

In [8]:
print(sgd.classes_)
print(sgd['clf'].coef_.shape) # <-- (number of classes, number of features)
sgd['clf'].coef_

['business' 'entertainment' 'politics' 'sport' 'tech']
(5, 26795)


array([[-0.06424924,  0.32160756,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       [-0.04559256, -0.08990268,  0.        , ...,  0.05277412,
        -0.10943533,  0.        ],
       [ 0.04959907,  0.49723019,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       [ 0.02885362, -0.5829393 ,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       [ 0.05720101,  0.01831583,  0.        , ...,  0.        ,
         0.        ,  0.        ]])

From the above we can see there are 5 target classes and 26795 features. Those 26795 features correspond to the TF-IDF values of 26795 distinct word tokens. Let's find out which word tokens are most important when the model scores a target class
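To see why a large coefficient matters, it helps to recall how a linear classifier turns features into a prediction: each class score is the dot product of the document's feature vector with that class's coefficient row, plus an intercept, and the highest score wins. The sketch below uses made-up numbers with two classes and three features; it is an illustration of the idea, not the fitted model.

```python
import numpy as np

# Hypothetical toy values, not taken from the fitted pipeline
classes = np.array(["business", "sport"])
coef = np.array([[2.0, -1.0, 0.0],    # one row of coefficients per class
                 [-1.5, 1.0, 0.5]])
intercept = np.array([-0.1, 0.2])     # one intercept per class
x = np.array([0.6, 0.1, 0.0])         # TF-IDF vector of a single document

# Score per class: weighted sum of the features plus the intercept
scores = coef @ x + intercept
predicted = classes[np.argmax(scores)]

print(scores)     # [ 1.  -0.6]
print(predicted)  # business
```

A token with a large positive coefficient for a class pushes that class's score up whenever the token appears in a document, and a large negative coefficient pushes it down.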

The "business" class is at index position 0 in the classes_ attribute array, so we can use the same index to get the feature coefficients for "business"

In [9]:
sgd['clf'].coef_[0]

array([-0.06424924,  0.32160756,  0.        , ...,  0.        ,
        0.        ,  0.        ])

The index of the maximum coefficient and its value are

In [10]:
idx = np.argmax(sgd['clf'].coef_[0])
print(idx)
sgd['clf'].coef_[0][idx]

13260


2.4005051825497112

Let's find the word token corresponding to this maximum coefficient using the vocabulary_ attribute of TfidfVectorizer

In [11]:
# Invert the word -> index vocabulary into an index -> word dictionary for easy lookup by index
idx_to_word = {i: word for (word, i) in sgd['tfidf_vector_com'].vocabulary_.items()}
idx_to_word[idx]

'its'

Putting all this together, let's get the top N features for a class

In [18]:
def top_n_features_by_coef(n, classname):
    """
        Args:
            n: the number of words to return from each end of the ranking,
            classname: the class label
        Returns:
            DataFrame with the n most positive words followed by the n most negative words, with coefficients
    """
    class_idx = np.where(sgd.classes_== classname)[0][0]
    idx_coef = sorted(
        [(i,v) for (i, v) in enumerate(sgd['clf'].coef_[class_idx])], key=lambda e: e[1],reverse=True)
    top_n_idx_coef = idx_coef[:n]
    bottom_n_idx_coef = idx_coef[-n:] # n most negative words (list is sorted descending, so the most negative is last)
    top_word_coef = list(map(lambda e: (idx_to_word[e[0]], round(e[1], 4)), top_n_idx_coef))
    bottom_word_coef = list(map(lambda e: (idx_to_word[e[0]], round(e[1], 4)), bottom_n_idx_coef))
    df = pd.DataFrame(top_word_coef, columns=[f"{classname}_word", 'coef'])
    df_bottom = pd.DataFrame(bottom_word_coef, columns=[f"{classname}_word", 'coef'])
    return pd.concat([df, df_bottom])
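The sort-and-slice step in the function above could also be written with np.argsort, which avoids building a Python list of (index, value) pairs. This is a minimal sketch on made-up coefficients and words; the helper name top_and_bottom and the toy data are hypothetical.

```python
import numpy as np

def top_and_bottom(coefs, words, n):
    """Return the n most positive and n most negative (word, coef) pairs."""
    order = np.argsort(coefs)[::-1]  # feature indices, largest coefficient first
    top = [(words[i], float(coefs[i])) for i in order[:n]]
    bottom = [(words[i], float(coefs[i])) for i in order[-n:]]
    return top, bottom

# Hypothetical toy data standing in for one row of coef_ and the vocabulary
coefs = np.array([0.5, -1.2, 2.0, 0.0, -0.3])
words = ["firm", "music", "bank", "the", "film"]

top, bottom = top_and_bottom(coefs, words, 2)
print(top)     # [('bank', 2.0), ('firm', 0.5)]
print(bottom)  # [('film', -0.3), ('music', -1.2)]
```

As in the original function, the bottom list stays in descending order, so the most negative word comes last.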

Which word tokens have the top positive and negative coefficients?

In [19]:
n = 10
df_list = []
for _class in sgd.classes_:
    df_list.append(top_n_features_by_coef(n, _class))
pd.concat(df_list, axis=1)

Unnamed: 0,business_word,coef,entertainment_word,coef.1,politics_word,coef.2,sport_word,coef.3,tech_word,coef.4
0,its,2.4005,film,3.4698,party,2.3294,cup,1.7925,computer,2.4449
1,bank,2.3939,show,2.7745,blair,2.3008,players,1.7733,technology,2.3884
2,economic,2.3341,music,2.3008,government,1.9766,match,1.7472,online,2.3405
3,shares,2.1957,singer,2.1822,labour,1.9396,club,1.6202,software,2.3355
4,firm,1.9372,album,2.126,lord,1.9291,athletics,1.6165,game,2.0802
5,company,1.8363,star,1.8786,mr,1.8367,liverpool,1.6002,games,2.0757
6,business,1.7527,band,1.8117,committee,1.8335,rugby,1.4954,digital,2.0665
7,investment,1.5838,festival,1.7344,secretary,1.828,coach,1.491,users,1.7263
8,sales,1.5324,tv,1.6898,minister,1.8098,win,1.4361,ink,1.7228
9,market,1.4782,chart,1.6275,straw,1.7584,champion,1.3948,internet,1.6905


From the above we can see that the word tokens with the highest positive coefficients for "business" are "its", "bank", "economic", "shares", etc., and the word tokens with the most negative coefficients for "business" are "committee", "uk", "people", "brown", "music", etc. Word tokens with strongly negative coefficients tend to have high coefficients in some other category; for example, "committee" appears among the top coefficients for "politics" and "music" appears among the top coefficients for "entertainment"

What are the top positive and negative word tokens by the feature output, calculated here as IDF weight * coefficient?

In [20]:
def top_n_features_by_feature_output(n, classname):
    """
        Args:
            n: the number of words to return from each end of the ranking,
            classname: the class label
        Returns:
            DataFrame with the n most positive words followed by the n most negative words, with idf * coefficient values
    """
    class_idx = np.where(sgd.classes_== classname)[0][0]
    feature_output = sgd['tfidf_vector_com'].idf_ * sgd['clf'].coef_
    idx_coef = sorted(
        [(i,v) for (i, v) in enumerate(feature_output[class_idx])], key=lambda e: e[1],reverse=True)
    top_n_idx_coef = idx_coef[:n]
    bottom_n_idx_coef = idx_coef[-n:] # n most negative words (list is sorted descending, so the most negative is last)
    top_word_coef = list(map(lambda e: (idx_to_word[e[0]], round(e[1], 4)), top_n_idx_coef))
    bottom_word_coef = list(map(lambda e: (idx_to_word[e[0]], round(e[1], 4)), bottom_n_idx_coef))
    df = pd.DataFrame(top_word_coef, columns=[f"{classname}_word", 'coef'])
    df_bottom = pd.DataFrame(bottom_word_coef, columns=[f"{classname}_word", 'coef'])
    return pd.concat([df, df_bottom])

In [15]:
n = 10
df_list = []
for _class in sgd.classes_:
    df_list.append(top_n_features_by_feature_output(n, _class))
pd.concat(df_list, axis=1)

Unnamed: 0,business_word,coef,entertainment_word,coef.1,politics_word,coef.2,sport_word,coef.3,tech_word,coef.4
0,crossrail,9.44,film,11.2,ict,13.44,athletics,8.22,ink,11.27
1,wto,8.81,ballet,10.82,straw,9.19,liverpool,7.6,argonaut,9.65
2,bank,8.77,hendrix,10.01,lord,8.17,balco,7.44,spam,8.98
3,datamonitor,8.26,album,9.56,councils,8.05,doping,7.28,computer,8.9
4,economic,8.11,gallery,9.51,blair,8.01,bates,6.82,software,8.78
5,shares,8.1,singer,9.09,snooker,7.84,tennis,6.69,online,8.43
6,boeing,7.93,festival,8.36,duchy,7.7,cup,6.57,seafarers,8.41
7,feta,7.88,show,8.27,party,7.63,rugby,6.45,simonetti,8.32
8,davos,7.44,freeview,7.93,committee,7.29,conte,6.24,blog,8.13
9,plastic,7.33,band,7.73,ukip,7.27,mido,6.19,robot,7.97
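The ranking changes because idf_ has shape (n_features,) while coef_ has shape (n_classes, n_features), so the product broadcasts the IDF weight of each token across every class row. Tokens with a high IDF (rare tokens) get boosted, which is a likely reason why less common words such as "crossrail" rise to the top here. A minimal sketch with made-up numbers:

```python
import numpy as np

# Hypothetical toy values: 3 features, 2 classes
idf = np.array([1.0, 7.79, 2.5])      # shape (n_features,)
coef = np.array([[2.4, 1.2, 0.0],     # shape (n_classes, n_features)
                 [-0.5, 0.1, 3.0]])

# Broadcasting multiplies every class row element-wise by the idf vector
weighted = idf * coef
print(weighted.shape)  # (2, 3)

# For class 0 the top feature by raw coefficient is feature 0,
# but after IDF weighting the rare feature 1 overtakes it
print(np.argmax(coef[0]), np.argmax(weighted[0]))  # 0 1
```

Note that this score depends only on how rare a token is in the corpus, not on how often it appears in any particular document.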
