#### Bilingual (Chinese + English) Product Label Classification



This project applies machine learning (ML) techniques from natural language processing (NLP) to classify retail products into pre-defined product categories, based on product labels that mix Chinese and English characters. There are 23 categories in total, ranging from food and beverages to personal healthcare products. Text preprocessing followed this workflow:
<ol>
<li>extracted the Chinese and English characters separately</li>
<li>tokenized the Chinese portions with Jieba, with its pre-built Hidden Markov Model enabled</li>
<li>tokenized the English portions with the NLTK tokenizer</li>
<li>Feature Engineering:
<ul><li>One-hot encoding of the 23 product categories</li>
    <li>Word2Vec embeddings trained on the input texts</li>
    <li>Encoder dictionary and 3-dimensional (sample × position × vocabulary) one-hot encoder sequences built from the input texts</li></ul>
</li>
</ol>
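
As a rough illustration of the first step, the sketch below separates a mixed label into its English and Chinese parts. It is a simplified stand-in, not the extractor used in this notebook: the hypothetical `split_scripts` helper only checks the main CJK Unified Ideographs block (U+4E00 to U+9FFF), and whitespace handling takes the place of NLTK and Jieba.

```python
import re

# Assumption: only the main CJK Unified Ideographs block counts as Chinese.
CJK = re.compile(r'[\u4e00-\u9fff]')

def split_scripts(label):
    """Return (english_part, chinese_part) of a mixed-script label."""
    # Replace digits, punctuation and underscores with spaces.
    cleaned = re.sub(r'[\d\W_]+', ' ', label)
    english = ''.join(' ' if CJK.match(ch) else ch for ch in cleaned)
    chinese = ''.join(ch if CJK.match(ch) else ' ' for ch in cleaned)
    # Collapse repeated spaces and lowercase the English part.
    english = ' '.join(english.split()).lower()
    chinese = ' '.join(chinese.split())
    return english, chinese

english, chinese = split_scripts('Strawberry Yoghurt Drops 9g 士多啤梨味乳酪片')
print(english.split())
print(chinese)
```

The Chinese part would then go to a word segmenter such as Jieba, since it contains no spaces to split on.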

For the modelling part, two designs with different encoding methods were proposed:
<ul>
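    <li>Bi-directional LSTM states computed from the one-hot encoder sequence, concatenated with an LSTM encoding of the Word2Vec embeddings</li>
    <li>Transformer-style encoder on top of token embeddings (over the vocabulary) summed with learned positional embeddings (over the sequence positions)</li>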
</ul>

The Transformer encoder has more parameters, and it did achieve slightly better results than the Bi-LSTM and Word2Vec embeddings. For this task, however, the difference in multi-class accuracy between the two models turned out to be marginal. The task was probably not complex enough to separate the two models or to bring out the advantage of the Transformer structure. A comparison of per-category precision, recall and F1-score for the two sets of results can be found at the end of this notebook.

In [2]:
import pandas as pd
import numpy as np
import seaborn
import matplotlib.pyplot as plt
from scipy import sparse
import re
import json

In [3]:
data = pd.read_excel('/content/drive/My Drive/Colab Notebooks/NLP/product label classification/df_prod_cat.xlsx')

In [4]:
label_data = data[pd.notnull(data['product_cat'])]
unlabel_data = data[pd.isnull(data['product_cat'])]

In [None]:
data['product_name'].values.tolist()[0:10]

['奇亞籽生機蘇打餅-蕎麥紫菜250g',
 '奇亞籽生機蘇打餅-黑芝麻養生250g',
 '奇亞籽生機蘇打餅-黑椒岩鹽245g',
 'SILICON AIR CUSHION 42CMX 42CM X5CM  #YWON-00029',
 'YT 一次性尿袋 2000ml',
 'Strawberry Yoghurt Drops 9g 士多啤梨味乳酪片',
 'Mixed Berry Yoghurt Drops 9g 雜莓味乳酪片',
 '瑜珈運動地墊 (6mm厚)',
 '聖誕禮袋-433',
 '聖誕禮籃-510']

In [5]:
print(label_data.shape)
print(unlabel_data.shape)

(7512, 3)
(6645, 3)


In [6]:
import nltk
from nltk import word_tokenize
from nltk import corpus
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

In [7]:
## Jieba for tokenization of Chinese texts
import jieba

In [8]:
## define Unicode code point ranges corresponding to Chinese (CJK) characters
cjk_ranges = [
        ( 0x4E00,  0x62FF),
        ( 0x6300,  0x77FF),
        ( 0x7800,  0x8CFF),
        ( 0x8D00,  0x9FCC),
        ( 0x3400,  0x4DB5),
        (0x20000, 0x215FF),
        (0x21600, 0x230FF),
        (0x23100, 0x245FF),
        (0x24600, 0x260FF),
        (0x26100, 0x275FF),
        (0x27600, 0x290FF),
        (0x29100, 0x2A6DF),
        (0x2A700, 0x2B734),
        (0x2B740, 0x2B81D),
        (0x2B820, 0x2CEAF),
        (0x2CEB0, 0x2EBEF),
        (0x2F800, 0x2FA1F)
    ]

## separator of label texts into Chinese and English tokens
def chi_eng_extractor(string):

    string = re.sub(r'\d+|[!?@#%^&*\[\]\\(){}<>]|[.]|[/]|[$]|[-;:,`~=_+]', ' ', string)

    es = list()
    cs = list()

    for i in range(len(string)):
        char = ord(string[i])
        counter = 0
        for bottom, top in cjk_ranges:
            if char >= bottom and char <= top:
                counter += 1
        if counter > 0:
            cs.append(string[i])
            es.append(" ")
        else:
            cs.append(" ")
            es.append(string[i])

    es = ''.join(es)
    cs = ''.join(cs)
    es = es.strip()
    es = es.lower()
    cs = cs.strip()

    es = re.sub(" +", " ", es)
    cs = re.sub(" +", " ", cs)

    return es, cs

## implementation
training_label_list = [(chi_eng_extractor(x)) for x in label_data['product_name']]
testing_label_list = [(chi_eng_extractor(x)) for x in unlabel_data['product_name']]

In [9]:
## Jieba Chinese tokenization and NLTK English tokenization
token_train_en = [word_tokenize(x[0]) for x in training_label_list]
token_train_chi = [jieba.lcut(x[1], cut_all=False, HMM=True) for x in training_label_list]
token_test_en = [word_tokenize(x[0]) for x in testing_label_list]
token_test_chi = [jieba.lcut(x[1], cut_all=False, HMM=True) for x in testing_label_list]

Building prefix dict from the default dictionary ...
Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.800 seconds.
Prefix dict has been built successfully.


In [10]:
## re-concatenate token lists
token_train_list = []
token_test_list = []
for x in range(len(token_train_chi)):
    chi_text = [w for w in token_train_chi[x] if w != ' ']
    token_train_list.append(token_train_en[x] + chi_text)
for x in range(len(token_test_chi)):
    chi_text = [w for w in token_test_chi[x] if w != ' ']
    token_test_list.append(token_test_en[x] + chi_text)

In [None]:
token_train_list[0:10]

[['g', '奇亞籽生', '機蘇', '打餅', '蕎麥', '紫菜'],
 ['g', '奇亞籽生', '機蘇', '打餅', '黑芝麻', '養生'],
 ['g', '奇亞籽生', '機蘇', '打餅', '黑椒', '岩鹽'],
 ['silicon', 'air', 'cushion', 'cmx', 'cm', 'x', 'cm', 'ywon'],
 ['yt', 'ml', '一次性', '尿袋'],
 ['strawberry', 'yoghurt', 'drops', 'g', '士多啤梨', '味', '乳酪', '片'],
 ['mixed', 'berry', 'yoghurt', 'drops', 'g', '雜莓味', '乳酪', '片'],
 ['mm', '瑜珈', '運動', '地', '墊', '厚'],
 ['聖誕禮袋'],
 ['聖誕禮籃']]

In [11]:
from sklearn.preprocessing import OneHotEncoder
fit_onehot = OneHotEncoder().fit(label_data[['product_cat']])
target = fit_onehot.transform(label_data[['product_cat']]).toarray()
cat_labels = fit_onehot.categories_

In [None]:
{i: v for i, v in enumerate(cat_labels[0].tolist())}

{0: 'Fine Food / Snacks',
 1: 'Healthcare Products',
 2: 'Drinks',
 3: 'Sports Equipments',
 4: 'Personal Hygene Products',
 5: 'Nursing Supplies',
 6: 'Books',
 7: 'Baby Products',
 8: 'Oils',
 9: 'Bathroom Accessories',
 10: 'Skin Care Products',
 11: 'Water Strainers',
 12: 'Nutritional Supplements',
 13: 'Physiotherapy Equipments',
 14: 'Pharmacy',
 15: 'Gifts',
 16: 'Staples',
 17: 'Organic Food',
 18: 'Health Monitoring Devices',
 19: 'Seasoning',
 20: 'Wheelchairs',
 21: 'Masks',
 22: 'Electronics'}
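
As a small illustration of how class ids relate to the one-hot target rows, and how `np.argmax` later recovers a class id from a softmax output, consider the sketch below. The `one_hot` and `argmax` helpers are hypothetical plain-Python stand-ins, and the probability vector is made up.

```python
num_classes = 23

def one_hot(class_id, n=num_classes):
    # A row of zeros with a single 1.0 at the class position.
    return [1.0 if i == class_id else 0.0 for i in range(n)]

def argmax(vector):
    # Index of the largest entry, like np.argmax on a 1-D array.
    return max(range(len(vector)), key=vector.__getitem__)

row = one_hot(2)  # class id 2 is 'Drinks' in the mapping above
assert sum(row) == 1.0
print(argmax(row))

# A softmax-like prediction is decoded the same way.
probs = [0.01] * num_classes
probs[5] = 0.78
print(argmax(probs))
```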

In [14]:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import LSTM, Embedding, Bidirectional, Dense
from tensorflow.keras.layers import TimeDistributed, Dropout, Activation, Concatenate, Dot
from tensorflow.keras.optimizers import Adam, RMSprop
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.models import model_from_json
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical
import tensorflow as tf

In [15]:
from sklearn.metrics import classification_report, confusion_matrix

In [16]:
## define encoder
def process_seq2seq_encoder_input(encoder):
    ## reserved key-index for padding sequence length, out-of-dictionary words
    reserved = {'<PAD>': 0, '<UNK>': 1}
    ## convert into dictionary
    enc_list = [w for i in encoder for w in i]
    enc_dict = {e:i+2 for i,e in enumerate(set(enc_list))}
    enc_dict = {**reserved, **enc_dict}
    ## map tokens to indices; unseen tokens fall back to <UNK>
    enc_seq = []
    for e in range(len(encoder)):
        enc_sub_seq = []
        for se in encoder[e]:
            enc_sub_seq.append(enc_dict.get(se, enc_dict['<UNK>']))
        enc_seq.append(enc_sub_seq)
    ## padding sequence
    MAX_LEN = max([len(x) for x in enc_seq])
    padded_seq = pad_sequences(enc_seq, maxlen=MAX_LEN, padding='post')
    ## one-hot vectors
    enc_seq_cat = to_categorical(padded_seq, num_classes=len(enc_dict))
    return enc_dict, enc_seq, padded_seq, enc_seq_cat

In [17]:
encoder_dict, encoder_seq, pad_encode_seq, encoder_seq_cat = process_seq2seq_encoder_input(token_train_list)

In [18]:
print(encoder_seq_cat.shape)

(7512, 27, 6522)
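
The shape `(7512, 27, 6522)` reads as 7,512 labeled products, at most 27 tokens per label, and a vocabulary of 6,522 entries including `<PAD>` and `<UNK>`. The toy sketch below rebuilds the same three steps (dictionary, post-padding, one-hot) in plain Python on two short token lists. It then estimates the memory of the full tensor, assuming 4-byte `float32` elements (the default dtype of `to_categorical` in the TF 2.x Keras API).

```python
# Toy corpus for illustration.
tokens = [['yt', 'ml', '一次性', '尿袋'], ['聖誕禮袋']]

# Index 0 is reserved for padding, 1 for out-of-vocabulary tokens.
vocab = {'<PAD>': 0, '<UNK>': 1}
for sentence in tokens:
    for tok in sentence:
        vocab.setdefault(tok, len(vocab))

max_len = max(len(s) for s in tokens)
# Post-padding: real tokens first, zeros at the end.
padded = [[vocab.get(t, 1) for t in s] + [0] * (max_len - len(s)) for s in tokens]
# One-hot: one vector of length len(vocab) per position.
one_hot = [[[1.0 if i == idx else 0.0 for i in range(len(vocab))] for idx in row]
           for row in padded]

print(padded)
print(len(one_hot), len(one_hot[0]), len(one_hot[0][0]))

# Memory of the notebook's tensor at 4 bytes per element.
n_bytes = 7512 * 27 * 6522 * 4
print(round(n_bytes / 1e9, 2), 'GB')
```

A tensor of roughly 5 GB is presumably why `encoder_seq_cat` is deleted and `gc.collect()` is called once the train/test split has been made.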


In [19]:
print(target.shape)

(7512, 23)


In [21]:
from gensim.models import word2vec
w2v = word2vec.Word2Vec(token_train_list, size=200, window=3, min_count=1, seed=42)
w2v.train(token_train_list, total_examples=len(token_train_list), epochs=1000)

(40218928, 46661000)
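
The tuple returned by `train` is, per the gensim documentation, the effective word count followed by the raw word count. A little arithmetic on the printed values recovers the corpus size. The variable names are illustrative.

```python
effective_words, raw_words = 40218928, 46661000  # values printed above
epochs = 1000
n_labels = 7512

# Raw count = tokens in the corpus times the number of epochs.
tokens_per_epoch = raw_words // epochs
avg_tokens_per_label = tokens_per_epoch / n_labels
effective_share = effective_words / raw_words

print(tokens_per_epoch)
print(round(avg_tokens_per_label, 2))
print(round(effective_share, 3))
```

The corpus therefore holds 46,661 tokens, a little over six per label. Since `min_count=1` keeps every token in the vocabulary, the gap between the two counts is presumably due to gensim's downsampling of frequent tokens.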

In [None]:
## extract trained word vectors
word_vec_array = np.zeros((pad_encode_seq.shape[0], pad_encode_seq.shape[1], 200))

## reverse lookup (index -> token), built once instead of once per token
index_to_token = {i: w for w, i in encoder_dict.items()}
## <UNK> tokens fall back to the mean of all trained vectors
unk_vector = w2v.wv.vectors.mean(axis=0)

for x in range(len(pad_encode_seq)):
  for y in range(len(pad_encode_seq[x])):
    idx = pad_encode_seq[x][y]
    if idx == 0:
      continue  ## <PAD> positions stay as zero vectors
    elif idx == 1:
      word_vec_array[x][y] = unk_vector
    else:
      word_vec_array[x][y] = w2v.wv[index_to_token[idx]]

In [23]:
from sklearn.model_selection import train_test_split
encoder_seq_cat_train, encoder_seq_cat_test, w2v_arr_train, w2v_arr_test, pad_encode_seq_train, pad_encode_seq_test, y_train, y_test = \
train_test_split(encoder_seq_cat, word_vec_array, pad_encode_seq, target,
                 stratify = label_data[['product_cat']], test_size = 0.1, random_state = 42)
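
The split sizes that follow from `test_size = 0.1` and the later `validation_split = 0.2` can be worked out by hand. The sketch assumes that scikit-learn rounds the test size up and that Keras takes the validation samples from the end of the arrays passed to `fit`. Both are stated here as assumptions about library behavior.

```python
import math

n_labeled = 7512
n_test = math.ceil(n_labeled * 0.1)       # held-out test set
n_fit = n_labeled - n_test                # rows passed to model.fit
n_val = n_fit - math.floor(n_fit * 0.8)   # validation_split=0.2
n_train = n_fit - n_val                   # rows actually used for updates
steps_per_epoch = math.ceil(n_train / 8)  # batch_size=8

print(n_test, n_fit, n_val, n_train, steps_per_epoch)
```

The test size of 752 agrees with the total support shown in the classification reports.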

In [None]:
import gc
del encoder_seq_cat
del pad_encode_seq
gc.collect()

In [24]:
def Word2Vec_Seq2Label(encoder_dict, encoder_seq):

    len_en = len(encoder_dict)
    max_length_en = max([len(x) for x in encoder_seq])

    encoder_inputs = Input(shape=(None, len_en))
    encoder_LSTM = Bidirectional(LSTM(400, return_state=True, return_sequences=True))
    encoder_hidden_vec, forward_last_h, forward_last_c, backward_last_h, backward_last_c = encoder_LSTM(encoder_inputs)
    enc_state_last_h = Concatenate()([forward_last_h, backward_last_h])
    enc_state_last_c = Concatenate()([forward_last_c, backward_last_c])

    w2v_encoder_inputs = Input(shape=(None, 200))
    w2v_LSTM = LSTM(200, return_sequences=False)
    w2v_LSTM_layer = w2v_LSTM(w2v_encoder_inputs)

    concat_layer = Concatenate()([enc_state_last_h, enc_state_last_c, w2v_LSTM_layer])
    
    dense_1 = Dense(1800, activation='relu')(concat_layer)
    dense_2 = Dense(200, activation='relu')(dense_1)
    dense_3 = Dense(50, activation='relu')(dense_2)
    outputs = Dense(len(cat_labels[0]), activation='softmax')(dense_3)

    model = Model([encoder_inputs, w2v_encoder_inputs], outputs)

    return model

In [None]:
## pad_encode_seq was deleted above, so pass the unpadded encoder_seq instead
simpler_model = Word2Vec_Seq2Label(encoder_dict, encoder_seq)
simpler_model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['acc'])

In [27]:
simpler_model.summary(line_length=125)

Model: "model"
_____________________________________________________________________________________________________________________________
 Layer (type)                            Output Shape               Param #        Connected to                              
 input_1 (InputLayer)                    [(None, None, 6522)]       0              []                                        
                                                                                                                             
 bidirectional (Bidirectional)           [(None, None, 800),        22153600       ['input_1[0][0]']                         
                                          (None, 400),                                                                       
                                          (None, 400),                                                                       
                                          (None, 400),                                                 
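
The 22,153,600 parameters reported for the bidirectional layer can be checked with the standard LSTM parameter formula, 4 × units × (input_dim + units + 1), doubled for the two directions. The helper name `lstm_params` is hypothetical.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * units * (input_dim + units + 1)

vocab_size, units = 6522, 400
bi_lstm = 2 * lstm_params(vocab_size, units)
print(bi_lstm)

# The Word2Vec branch: LSTM(200) over 200-dimensional vectors.
w2v_lstm = lstm_params(200, 200)
print(w2v_lstm)
```

Nearly all of these weights come from feeding 6,522-dimensional one-hot vectors straight into the LSTM. The Word2Vec branch, with its 200-dimensional input, is far smaller.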

In [28]:
simpler_model.fit([encoder_seq_cat_train, w2v_arr_train], y_train,
                  validation_split = 0.2, batch_size=8, epochs=12, 
                  callbacks = [tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=4)])

Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


<keras.callbacks.History at 0x7fe7947c9710>

In [29]:
## make predictions
pred_y_test = simpler_model.predict([encoder_seq_cat_test, w2v_arr_test])
pred_y_test_class = np.argmax(pred_y_test, axis=1)

In [None]:
print(classification_report(np.argmax(y_test, axis=1), pred_y_test_class))

              precision    recall  f1-score   support

           0       0.89      0.93      0.91       115
           1       0.73      0.66      0.69        62
           2       0.89      0.90      0.89        70
           3       0.75      0.80      0.77        15
           4       0.98      0.94      0.96        47
           5       0.91      0.93      0.92       144
           6       0.00      0.00      0.00         2
           7       0.50      0.50      0.50         2
           8       0.50      0.40      0.44         5
           9       0.61      0.76      0.67        41
          10       0.82      0.65      0.72        48
          11       1.00      0.86      0.92         7
          12       0.90      0.76      0.83        25
          13       0.67      0.73      0.70        45
          14       0.75      0.75      0.75         4
          15       0.60      1.00      0.75         3
          16       1.00      0.80      0.89        20
          17       0.91    

In [31]:
class TransformerEncoder(tf.keras.layers.Layer):
    def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):
        super(TransformerEncoder, self).__init__(**kwargs)

        self.embed_dim = embed_dim
        self.dense_dim = dense_dim
        self.num_heads = num_heads

        self.attention = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.dense_proj = tf.keras.models.Sequential(
            [tf.keras.layers.Dense(dense_dim, activation="relu"), 
             tf.keras.layers.Dense(embed_dim),
             ]
        )
        self.layernorm_1 = tf.keras.layers.LayerNormalization()
        self.layernorm_2 = tf.keras.layers.LayerNormalization()

        self.supports_masking = True

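    def call(self, inputs, mask=None):
        ## note: `mask` is not forwarded as `attention_mask`, so padding is only
        ## masked if the installed Keras version applies the implicit mask itself
        attention_output = self.attention(query=inputs, value=inputs, key=inputs)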
        proj_input = self.layernorm_1(inputs + attention_output)
        proj_output = self.dense_proj(proj_input)
        
        return self.layernorm_2(proj_input + proj_output)

In [32]:
class PositionalEmbedding(tf.keras.layers.Layer):
    def __init__(self, sequence_length, vocab_size, embed_dim, **kwargs):
        super(PositionalEmbedding, self).__init__(**kwargs)

        self.token_embeddings = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
        self.position_embeddings = tf.keras.layers.Embedding(input_dim=sequence_length, output_dim=embed_dim)
        
        self.sequence_length = sequence_length
        self.vocab_size = vocab_size
        self.embed_dim = embed_dim

    def call(self, inputs):
        length = tf.shape(inputs)[-1]
        positions = tf.range(start=0, limit=length, delta=1)
        embedded_tokens = self.token_embeddings(inputs)
        embedded_positions = self.position_embeddings(positions)
        return embedded_tokens + embedded_positions

    def compute_mask(self, inputs, mask=None):
        return tf.math.not_equal(inputs, 0)
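
To see what `PositionalEmbedding` computes, the toy sketch below adds a token embedding and a position embedding at each position and derives the padding mask. The embedding tables are made-up numbers, and plain Python lists stand in for tensors.

```python
# Toy tables: 4 vocabulary entries, 3 positions, embedding width 2.
token_table = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
position_table = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]

sequence = [2, 3, 0]  # two real tokens followed by one <PAD>

# Each position's vector is token embedding + position embedding.
embedded = [[t + p for t, p in zip(token_table[tok], position_table[pos])]
            for pos, tok in enumerate(sequence)]

# The mask marks real tokens (non-zero ids), mirroring compute_mask.
mask = [tok != 0 for tok in sequence]

print(embedded)
print(mask)
```

Note that the padded position still receives a position embedding. Only the mask tells downstream layers to ignore it.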

In [35]:
def Transformer_Seq2Label():

    embed_dim = len(encoder_dict) // 8
    latent_dim = len(encoder_dict) // 8 * 2
    num_heads = 8
    num_cats = len(cat_labels[0])
    vocab_size = len(encoder_dict)
    sequence_length = pad_encode_seq_train.shape[1]

    encoder_inputs = tf.keras.layers.Input(shape=(None,), dtype="int64", name="encoder_inputs")
    encoder_x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)
    encoder_outputs = TransformerEncoder(embed_dim, latent_dim, num_heads)(encoder_x)

    x = tf.keras.layers.Dropout(0.5)(encoder_outputs)
    x = tf.keras.layers.Reshape(target_shape=(sequence_length * embed_dim,))(x)
    x = tf.keras.layers.Dense(4000, activation='relu')(x)
    x = tf.keras.layers.Dense(800, activation='relu')(x)
    x = tf.keras.layers.Dense(200, activation='relu')(x)
    x = tf.keras.layers.Dense(50, activation='relu')(x)
    outputs = tf.keras.layers.Dense(num_cats, activation='softmax')(x)

    transformer = tf.keras.models.Model(encoder_inputs, outputs, name="transformer")

    return transformer

In [None]:
seq2label = Transformer_Seq2Label()
seq2label.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['acc'])

In [37]:
seq2label.summary(line_length=125)

Model: "transformer"
_____________________________________________________________________________________________________________________________
 Layer (type)                                           Output Shape                                      Param #            
 encoder_inputs (InputLayer)                            [(None, None)]                                    0                  
                                                                                                                             
 positional_embedding (PositionalEmbedding)             (None, None, 815)                                 5337435            
                                                                                                                             
 transformer_encoder (TransformerEncoder)               (None, None, 815)                                 23938180           
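
Both parameter counts in this summary can be reproduced by hand from `embed_dim = 6522 // 8 = 815`, `dense_dim = 1630`, 8 heads and a sequence length of 27. The sketch assumes that Keras `MultiHeadAttention` projects queries, keys and values to `num_heads * key_dim` units each (with biases) and projects the result back to `embed_dim`.

```python
vocab_size, seq_len, num_heads = 6522, 27, 8
embed_dim = vocab_size // 8   # 815
dense_dim = embed_dim * 2     # 1630
key_dim = embed_dim           # as passed in TransformerEncoder

# Token table plus position table.
positional_embedding = vocab_size * embed_dim + seq_len * embed_dim

inner = num_heads * key_dim
qkv = 3 * (embed_dim * inner + inner)      # query, key, value projections
attn_out = inner * embed_dim + embed_dim   # output projection
feed_forward = ((embed_dim * dense_dim + dense_dim)
                + (dense_dim * embed_dim + embed_dim))
layer_norms = 2 * (2 * embed_dim)          # gamma and beta, twice
transformer_encoder = qkv + attn_out + feed_forward + layer_norms

print(positional_embedding, transformer_encoder)
```

Because `key_dim` is set to the full `embed_dim` rather than `embed_dim // num_heads`, every head is 815 units wide, and the attention block alone accounts for about 21.3 million of the encoder's weights.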
                                                                                                 

In [38]:
seq2label.fit(pad_encode_seq_train, y_train, validation_split=0.2, batch_size=8, epochs=12,
              callbacks=[tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=4)])

Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


<keras.callbacks.History at 0x7fe77cb2b590>

In [39]:
## make predictions
pred_y_test = seq2label.predict(pad_encode_seq_test)
pred_y_test_class = np.argmax(pred_y_test, axis=1)

In [None]:
print(classification_report(np.argmax(y_test, axis=1), pred_y_test_class))

              precision    recall  f1-score   support

           0       0.91      0.93      0.92       115
           1       0.61      0.77      0.68        62
           2       0.94      0.89      0.91        70
           3       0.80      0.80      0.80        15
           4       1.00      0.94      0.97        47
           5       0.93      0.95      0.94       144
           6       0.50      0.50      0.50         2
           7       0.00      0.00      0.00         2
           8       0.75      0.60      0.67         5
           9       0.64      0.68      0.66        41
          10       0.81      0.71      0.76        48
          11       1.00      0.86      0.92         7
          12       0.92      0.88      0.90        25
          13       0.80      0.62      0.70        45
          14       0.75      0.75      0.75         4
          15       0.75      1.00      0.86         3
          16       0.95      0.90      0.92        20
          17       0.81    



<br>

The Transformer encoding model attained an accuracy of 85%, while the Bi-directional LSTM and Word2Vec embedding model came in slightly lower at 84%. At the class level, 14 of the 23 categories reached an F1-score of 80% or higher with the Transformer encoding model, with major improvements for "oils", "health monitoring devices", "seasoning" and "gifts". For categories with very few items, such as "books" and "baby products" (two test samples each), more test samples are needed to validate or refine the model's performance on these specific classes. "Nutritional supplements" also gained recall (76% to 88%) while keeping a similar level of precision, which raised its F1-score to 90%.

<table>
    <thead>
        <tr>
            <th>ID</th>
            <th>Product Category</th>
            <th>Support</th>
            <th colspan="3">Bi-LSTM + Word2Vec Embeddings</th>
            <th colspan="3">Transformer Positional Embeddings</th>
        </tr>
        <tr>
            <th></th>
            <th></th>
            <th></th>
            <th>precision</th>
            <th>recall</th>
            <th>f1-score</th>
            <th>precision</th>
            <th>recall</th>
            <th>f1-score</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>0</td>
            <td>Fine Food / Snacks</td>
            <td>115</td>
            <td>0.89</td>
            <td>0.93</td>
            <td>0.91</td>
            <td>0.91</td>
            <td>0.93</td>
            <td>0.92</td>
        </tr>
        <tr>
            <td>1</td>
            <td>Healthcare Products</td>
            <td>62</td>
            <td>0.73</td>
            <td>0.66</td>
            <td>0.69</td>
            <td>0.61</td>
            <td>0.77</td>
            <td>0.68</td>
        </tr>
        <tr>
            <td>2</td>
            <td>Drinks</td>
            <td>70</td>
            <td>0.89</td>
            <td>0.90</td>
            <td>0.89</td>
            <td>0.94</td>
            <td>0.89</td>
            <td>0.91</td>
        </tr>
        <tr>
            <td>3</td>
            <td>Sports Equipments</td>
            <td>15</td>
            <td>0.75</td>
            <td>0.80</td>
            <td>0.77</td>
            <td>0.80</td>
            <td>0.80</td>
            <td>0.80</td>
        </tr>
        <tr>
            <td>4</td>
            <td>Personal Hygene Products</td>
            <td>47</td>
            <td>0.98</td>
            <td>0.94</td>
            <td>0.96</td>
            <td>1.00</td>
            <td>0.94</td>
            <td>0.97</td>
        </tr>
        <tr>
            <td>5</td>
            <td>Nursing Supplies</td>
            <td>144</td>
            <td>0.91</td>
            <td>0.93</td>
            <td>0.92</td>
            <td>0.93</td>
            <td>0.95</td>
            <td>0.94</td>
        </tr>
        <tr>
            <td>6</td>
            <td>Books</td>
            <td>2</td>
            <td>0.00</td>
            <td>0.00</td>
            <td>0.00</td>
            <td>0.50</td>
            <td>0.50</td>
            <td>0.50</td>
        </tr>
        <tr>
            <td>7</td>
            <td>Baby Products</td>
            <td>2</td>
            <td>0.50</td>
            <td>0.50</td>
            <td>0.50</td>
            <td>0.00</td>
            <td>0.00</td>
            <td>0.00</td>
        </tr>
        <tr>
            <td>8</td>
            <td>Oils</td>
            <td>5</td>
            <td>0.50</td>
            <td>0.40</td>
            <td>0.44</td>
            <td>0.75</td>
            <td>0.60</td>
            <td>0.67</td>
        </tr>
        <tr>
            <td>9</td>
            <td>Bathroom Accessories</td>
            <td>41</td>
            <td>0.61</td>
            <td>0.76</td>
            <td>0.67</td>
            <td>0.64</td>
            <td>0.68</td>
            <td>0.66</td>
        </tr>
        <tr>
            <td>10</td>
            <td>Skin Care Products</td>
            <td>48</td>
            <td>0.82</td>
            <td>0.65</td>
            <td>0.72</td>
            <td>0.81</td>
            <td>0.71</td>
            <td>0.76</td>
        </tr>
        <tr>
            <td>11</td>
            <td>Water Strainers</td>
            <td>7</td>
            <td>1.00</td>
            <td>0.86</td>
            <td>0.92</td>
            <td>1.00</td>
            <td>0.86</td>
            <td>0.92</td>
        </tr>
        <tr>
            <td>12</td>
            <td>Nutritional Supplements</td>
            <td>25</td>
            <td>0.90</td>
            <td>0.76</td>
            <td>0.83</td>
            <td>0.92</td>
            <td>0.88</td>
            <td>0.90</td>
        </tr>
        <tr>
            <td>13</td>
            <td>Physiotherapy Equipments</td>
            <td>45</td>
            <td>0.67</td>
            <td>0.73</td>
            <td>0.70</td>
            <td>0.80</td>
            <td>0.62</td>
            <td>0.70</td>
        </tr>
        <tr>
            <td>14</td>
            <td>Pharmacy</td>
            <td>4</td>
            <td>0.75</td>
            <td>0.75</td>
            <td>0.75</td>
            <td>0.75</td>
            <td>0.75</td>
            <td>0.75</td>
        </tr>
        <tr>
            <td>15</td>
            <td>Gifts</td>
            <td>3</td>
            <td>0.60</td>
            <td>1.00</td>
            <td>0.75</td>
            <td>0.75</td>
            <td>1.00</td>
            <td>0.86</td>
        </tr>
        <tr>
            <td>16</td>
            <td>Staples</td>
            <td>20</td>
            <td>1.00</td>
            <td>0.80</td>
            <td>0.89</td>
            <td>0.95</td>
            <td>0.90</td>
            <td>0.92</td>
        </tr>
        <tr>
            <td>17</td>
            <td>Organic Food</td>
            <td>23</td>
            <td>0.91</td>
            <td>0.91</td>
            <td>0.91</td>
            <td>0.81</td>
            <td>0.96</td>
            <td>0.88</td>
        </tr>
        <tr>
            <td>18</td>
            <td>Health Monitoring Devices</td>
            <td>6</td>
            <td>0.71</td>
            <td>0.83</td>
            <td>0.77</td>
            <td>1.00</td>
            <td>1.00</td>
            <td>1.00</td>
        </tr>
        <tr>
            <td>19</td>
            <td>Seasoning</td>
            <td>11</td>
            <td>0.67</td>
            <td>0.91</td>
            <td>0.77</td>
            <td>0.90</td>
            <td>0.82</td>
            <td>0.86</td>
        </tr>
        <tr>
            <td>20</td>
            <td>Wheelchairs</td>
            <td>10</td>
            <td>0.83</td>
            <td>0.50</td>
            <td>0.62</td>
            <td>0.67</td>
            <td>0.60</td>
            <td>0.63</td>
        </tr>
        <tr>
            <td>21</td>
            <td>Masks</td>
            <td>25</td>
            <td>0.79</td>
            <td>0.88</td>
            <td>0.83</td>
            <td>0.79</td>
            <td>0.88</td>
            <td>0.83</td>
        </tr>
        <tr>
            <td>22</td>
            <td>Electronics</td>
            <td>22</td>
            <td>0.95</td>
            <td>0.86</td>
            <td>0.90</td>
            <td>1.00</td>
            <td>0.82</td>
            <td>0.90</td>
        </tr>
        <tr>
            <td><br></td>
            <td><b>accuracy</b></td>
            <td><b>752</b></td>
            <td><br></td>
            <td><br></td>
            <td><b>0.84</b></td>
            <td><br></td>
            <td><br></td>
            <td><b>0.85</b></td>
        </tr>
        <tr>
            <td><br></td>
            <td><b>macro average</b></td>
            <td><b>752</b></td>
            <td><b>0.75</b></td>
            <td><b>0.75</b></td>
            <td><b>0.75</b></td>
            <td><b>0.79</b></td>
            <td><b>0.78</b></td>
            <td><b>0.78</b></td>
        </tr>
        <tr>
            <td><br></td>
            <td><b>weighted average</b></td>
            <td><b>752</b></td>
            <td><b>0.84</b></td>
            <td><b>0.84</b></td>
            <td><b>0.83</b></td>
            <td><b>0.85</b></td>
            <td><b>0.85</b></td>
            <td><b>0.85</b></td>
        </tr>
    </tbody>
</table>
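
The per-class F1-scores in the table are enough to re-derive two summary figures for the Transformer model: the number of categories with an F1-score of at least 0.80, and the macro-averaged F1. The list below is copied from the table's last column. Because those values are already rounded, recomputed averages can differ from scikit-learn's in the last digit.

```python
# Transformer F1-scores for class ids 0..22, copied from the table.
f1_transformer = [0.92, 0.68, 0.91, 0.80, 0.97, 0.94, 0.50, 0.00, 0.67, 0.66,
                  0.76, 0.92, 0.90, 0.70, 0.75, 0.86, 0.92, 0.88, 1.00, 0.86,
                  0.63, 0.83, 0.90]

# Categories at or above the 0.80 threshold.
strong = sum(1 for f1 in f1_transformer if f1 >= 0.80)
# Macro average: unweighted mean over classes, so tiny classes count fully.
macro_f1 = sum(f1_transformer) / len(f1_transformer)

# F1 is the harmonic mean of precision and recall, e.g. class 12.
precision, recall = 0.92, 0.88
f1_class_12 = 2 * precision * recall / (precision + recall)

print(strong, round(macro_f1, 2), round(f1_class_12, 2))
```

The macro average sits well below the accuracy because classes with only two to five test samples, such as "Books" and "Baby Products", weigh as much as the large classes.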