<a href="https://colab.research.google.com/github/Benjamin-morel/TensorFlow/blob/main/09_%5BUL%5D_word_embedding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---


# **ML Model: text classifier confrontation & word embedding visualization**

| | |
|------|------|
| Filename | 09_[UL]_word_embedding.ipynb |
| Author(s) | Benjamin (contact.upside830@silomails.com) |
| Date | February 12, 2025 |
| Aim(s) | compare the results and performance of text classification models and visualize the word embedding spaces they generate |
| Dataset(s) | - |
| Version | Python 3.10.12 - TensorFlow 2.17.1 - final notebook version |


<br> **!!Read before running!!** <br>
* **Step 1.** Fill in the inputs.
* **Step 2.** CPU execution is enough for training.
* **Step 3.** Run all and read comments.

---

#### **Motivation**

Binary text classification models developed in earlier notebooks ([02_classfication_text.ipynb](https://github.com/Benjamin-morel/TensorFlow/tree/main), [07_word2vec.ipynb](https://github.com/Benjamin-morel/TensorFlow/tree/main) and [08_RNN_classification.ipynb](https://github.com/Benjamin-morel/TensorFlow/tree/main)) are compared on their metric values. As an additional analysis, the word embedding space generated by each classifier is visualized with an unsupervised dimension reduction algorithm.

#### **Outline**

*   data acquisition and predictions
*   metrics confrontation and ROC curves plotting
*   unsupervised learning algorithms for dimension reduction
*   neighbors localisation and word embedding space visualization
*   comparisons and conclusion

---

## **0. Input section**

The following inputs apply to the word embedding visualization part. Choose the target words to locate in the word embedding space and the number of nearest neighbors to display for each. Neighbor words are defined either by the 2-norm (Euclidean distance) or by cosine similarity. The dimension reduction algorithm can also be chosen by the user.
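The two neighbor criteria can rank words differently. The minimal sketch below uses hypothetical 2-D vectors (not taken from any of the models) to show that a vector can be far away in 2-norm while pointing in exactly the same direction as the target.

```python
import math

def euclidean_distance(u, v):
    """2-norm of the difference between two vectors (smaller = closer)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 2-D "embeddings", chosen only for illustration.
target = [1.0, 0.0]
same_direction_far = [5.0, 0.0]    # same direction as the target, but far away
other_direction_near = [1.0, 1.0]  # close to the target, but rotated by 45 degrees

d_far = euclidean_distance(target, same_direction_far)     # 4.0
d_near = euclidean_distance(target, other_direction_near)  # 1.0
c_far = cosine_similarity(target, same_direction_far)      # 1.0
c_near = cosine_similarity(target, other_direction_near)   # about 0.707

print("2-norm prefers the rotated vector:", d_near < d_far)
print("cosine prefers the aligned vector:", c_far > c_near)
```

With the 2-norm the rotated vector is the nearest neighbor, whereas cosine similarity prefers the aligned one, so the choice of `metric_similarity` can change which neighbors are highlighted.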

In [None]:
# words used for the surrounding-words analysis in the word embedding space (at most 7: one plot color per word)
target_words = ['french', 'money', 'terrible', 'amazing']

# number of surrounding words per target word
nb_neighbors = 100

# similarity measure between words in the embedding space: "norm" (2-norm) or "cosinus" (cosine similarity)
metric_similarity = "cosinus"

# dimension reduction algorithm to use: "PCA" (Principal Component Analysis) or "TSNE" (t-distributed Stochastic Neighbor Embedding)
DR_algo_name = "PCA"

---

## **1. Python libraries & display utilities**

In [None]:
# @title 1.1. Python libraries [RUN ME]

"""math"""
import numpy as np # linear algebra
import sklearn.metrics # scores and evaluation metrics

"""data manipulation and pre-processing"""
import os # miscellaneous operating system interfaces
import pandas as pd # data manipulation tool
from re import escape # regular expressions
import string # string manipulation
import shutil # operations on files

"""ML models"""
from sklearn.decomposition import PCA # dimension reduction method PCA
from sklearn.manifold import TSNE # dimension reduction method t-SNE
import tensorflow as tf # framework for ML/DL
from tensorflow import keras # API used to build model in TensorFlow

"""display"""
import plotly.graph_objects as go # graphing package
from plotly.subplots import make_subplots # make subplots

"""performances"""
from time import time # timer
start = time()

In [None]:
# @title 1.2. Import Github files [RUN ME]

"""text standardization function, registered so that Keras can deserialize the saved model 02 (see section 2)"""
@keras.utils.register_keras_serializable()
def custom_standardization(input_text):
  no_uppercase = tf.strings.lower(input_text) # upper case --> lower case letters
  no_html_uppercase = tf.strings.regex_replace(no_uppercase, '<br />', ' ') # remove HTML strings
  no_punctuation_html_uppercase = tf.strings.regex_replace(no_html_uppercase, '[%s]' % escape(string.punctuation), '') # remove punctuation
  return no_punctuation_html_uppercase

"""clone the Github repository TensorFlow and import the pre-trained models (see section 2)"""
def get_github_models():

  !git clone https://github.com/Benjamin-morel/TensorFlow.git TensorFlow_duplicata
  path_model_02 = 'TensorFlow_duplicata/99_pre_trained_models/02_classification_text/02_classification_text.keras'
  path_model_07 = 'TensorFlow_duplicata/99_pre_trained_models/07_word2vec/07_classification_text.keras'
  path_model_08 = 'TensorFlow_duplicata/99_pre_trained_models/08_RNN_classification/08_RNN_classification.keras'

  model_02 = keras.models.load_model(path_model_02, custom_objects={'custom_standardization': custom_standardization})
  model_07 = keras.models.load_model(path_model_07)
  model_08 = keras.models.load_model(path_model_08)
  !rm -rf TensorFlow_duplicata/

  return [model_02, model_07, model_08]

"""clone the Github repository TensorFlow and import the embedding files required (see section 3)"""
def get_github_files():

  !git clone https://github.com/Benjamin-morel/TensorFlow.git TensorFlow_duplicata
  path_vector_model_02 = 'TensorFlow_duplicata/99_pre_trained_models/09_word_embedding/vectors_02.tsv'
  path_metadata_model_02 = 'TensorFlow_duplicata/99_pre_trained_models/09_word_embedding/metadata_02.tsv'
  path_vector_model_07 = 'TensorFlow_duplicata/99_pre_trained_models/09_word_embedding/vectors_07.tsv'
  path_metadata_model_07 = 'TensorFlow_duplicata/99_pre_trained_models/09_word_embedding/metadata_07.tsv'
  path_vector_model_08 = 'TensorFlow_duplicata/99_pre_trained_models/09_word_embedding/vectors_08.tsv'
  path_metadata_model_08 = 'TensorFlow_duplicata/99_pre_trained_models/09_word_embedding/metadata_08.tsv'

  df = pd.read_csv(path_vector_model_02, sep='\t')
  vectors_02 = df.values

  df = pd.read_csv(path_metadata_model_02, sep='\t')
  metadata_02 = df.values

  df = pd.read_csv(path_vector_model_07, sep='\t')
  vectors_07 = df.values

  df = pd.read_csv(path_metadata_model_07, sep='\t')
  metadata_07 = df.values

  df = pd.read_csv(path_vector_model_08, sep='\t')
  vectors_08 = df.values

  df = pd.read_csv(path_metadata_model_08, sep='\t')
  metadata_08 = df.values
  !rm -rf TensorFlow_duplicata/

  return [vectors_02, vectors_07, vectors_08, metadata_02, metadata_07, metadata_08]

In [None]:
# @title 1.3. Figure plots [RUN ME]

"""compute and show metrics (see section 2)"""
def metrics_confrontation(pred, dataset, model_names):
  actuals = tf.concat([y for x, y in dataset], axis=0)
  actuals = actuals.numpy()
  metrics = pd.DataFrame(columns=['model', 'accuracy', 'recall', 'f1_score'])
  for i in range(len(model_names)):
    predicted_labels = np.round(pred[i], 0)
    accuracy = sklearn.metrics.accuracy_score(actuals, predicted_labels)
    recall = sklearn.metrics.recall_score(actuals, predicted_labels)
    F1_score = sklearn.metrics.f1_score(actuals, predicted_labels)
    metrics.loc[i] = [model_names[i], accuracy, recall, F1_score]

  return metrics

"""plot ROC curves (see section 2)"""
def ROC_confrontation(pred, dataset, model_names):
  actuals = tf.concat([y for x, y in dataset], axis=0)
  fig = go.Figure()
  for i in range(len(model_names)): # one ROC curve per model
    fpr, tpr, _ = sklearn.metrics.roc_curve(actuals,  pred[i].ravel())
    fig.add_traces(go.Scatter(x=fpr, y=tpr, mode='lines', name=model_names[i]))
  fig.add_traces(go.Scatter(x=tf.linspace(0,1,100), y=tf.linspace(0,1,100), mode='lines', name="random classifier", line=dict(dash='dash')))
  fig.update_layout(width=800,
                    height=600,
                    title=dict(text="ROC curves"),
                    xaxis=dict(title=dict(text="FPR")),
                    yaxis=dict(title=dict(text="TPR")),
                    legend=dict(title=dict(text="Models"), xanchor="left", yanchor="top", x=0.7, y=0.37),
                    font=dict(family="arial", size=18, color="black"))

  fig.show()

"""plot word embedding space (see section 5)"""
def plot_word_embedding(model_names, dictionary_words_vectors_DR, list_neighbors_names, totale_variance):

  specs_mat = []
  titles = []

  for i in range(len(model_names)):
    specs_mat.append({"type": "scatter3d"}) # generate 3d scatter specificity
    titles.append("Model: {} \nVariance: {}".format(model_names[i], totale_variance[i]))

  color = ['orange', 'black', 'red', 'green', 'magenta', 'goldenrod', 'lime']

  fig = make_subplots(rows=1, cols=len(model_names), subplot_titles=titles, specs=[specs_mat])

  for i in range(len(model_names)): # model loop

    vectors = list(dictionary_words_vectors_DR[i].values())
    vectors = np.vstack(vectors)
    words =  list(dictionary_words_vectors_DR[i].keys())

    if i == (len(model_names)-1): activate_legend = True # plot legend for the last model
    else: activate_legend = False

    """plot all word-points"""
    fig.add_trace(go.Scatter3d(x=vectors[:,0],
                            y=vectors[:,1],
                            z=vectors[:,2],
                            mode='markers',
                            marker_symbol='circle',
                            opacity=0.1,
                            marker_size=2,
                            marker_color='blue',
                            text=['{}'.format(words[j]) for j in range(len(words))],
                            hovertemplate='%{text}',
                            name="all words",
                            showlegend=activate_legend), row=1, col=i+1)

    """emphasize neighbors"""
    neighbor_names = list_neighbors_names[i]

    for j, target_word in enumerate(target_words): # target word loop

      neighbor_names_target_word = neighbor_names[j]
      neighbor_vectors_target_word = []

      for k, neighbor in enumerate(neighbor_names_target_word): # neighbor loop

        if (k==0)and(activate_legend==True): activate_legend1=True
        else: activate_legend1=False

        neighbor_vector = dictionary_words_vectors_DR[i][neighbor]

        x_pos = neighbor_vector[0]
        y_pos = neighbor_vector[1]
        z_pos = neighbor_vector[2]

        fig.add_trace(go.Scatter3d(x=[x_pos],
                                   y=[y_pos],
                                   z=[z_pos],
                                   mode='markers',
                                   marker_symbol='circle',
                                   marker_size=3,
                                   marker_color=color[j],
                                   text=['{}'.format(str(neighbor))],
                                   hovertemplate='%{text}',
                                   name=str(target_word),
                                   showlegend=activate_legend1,
                                   ), row=1, col=i+1)

  fig.update_scenes(xaxis_visible=False, yaxis_visible=False,zaxis_visible=False) # no axis
  fig.show()

---

## **2. Model confrontation**

The 3 models developed previously are compared with each other. As a reminder:
*   feedforward model (in [02_classfication_text.ipynb](https://github.com/Benjamin-morel/TensorFlow/tree/main)): NN composed of an embedding layer and a dense layer
*   skip-Gram model (in [07_word2vec.ipynb](https://github.com/Benjamin-morel/TensorFlow/tree/main)): construction of a word embedding space using the skip-gram method, and creation of a classification model using this word embedding space
*   RNN model (in [08_RNN_classification.ipynb](https://github.com/Benjamin-morel/TensorFlow/tree/main)): NN composed of an LSTM layer

To compare, we retrieve the predictions made on the test set for all models.

In [None]:
"""extract dataset and generate subsets"""
def get_data(url):
  dataset_name = "Imdb_dataset_1"

  path = tf.keras.utils.get_file(dataset_name, url, extract=True)
  path = os.path.join(path, 'aclImdb')
  test_path = os.path.join(path, 'test')
  raw_test_ds = keras.utils.text_dataset_from_directory(test_path, batch_size=32)

  return raw_test_ds

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

url = "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"

raw_test_ds = get_data(url)
raw_test_ds = raw_test_ds.cache().prefetch(buffer_size=AUTOTUNE)

Downloading data from https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
84125825/84125825 ━━━━━━━━━━━━━━━━━━━━ 13s 0us/step
Found 25000 files belonging to 2 classes.


In [None]:
"""make predictions"""
def get_predictions(model, dataset):
  predictions = model.predict(dataset, verbose=0)
  return predictions

In [None]:
model_imported = get_github_models()
predictions_all_model = []

for _, model in enumerate(model_imported):
  predictions = get_predictions(model, raw_test_ds)
  predictions_all_model.append(predictions)

Cloning into 'TensorFlow_duplicata'...
remote: Enumerating objects: 894, done.
remote: Counting objects: 100% (196/196), done.
remote: Compressing objects: 100% (60/60), done.
remote: Total 894 (delta 183), reused 136 (delta 136), pack-reused 698 (from 3)
Receiving objects: 100% (894/894), 194.67 MiB | 22.80 MiB/s, done.
Resolving deltas: 100% (453/453), done.


In [None]:
"""compute statistic metrics"""
def stat_prediction(predictions, model_names):
  stats = pd.DataFrame(columns=['model', 'mean', 'median', 'std'])
  for i, pred in enumerate(predictions):
    stats.loc[i] = [model_names[i], np.mean(pred), np.median(pred), np.std(pred)]
  return stats

In [None]:
model_names = ["feedforward", "skip-Gram", "RNN"]
stat_prediction(predictions_all_model, model_names)

| | model | mean | median | std |
|------|------|------|------|------|
| 0 | feedforward | 0.505472 | 0.531175 | 0.397387 |
| 1 | skip-Gram | 0.542174 | 0.559531 | 0.208092 |
| 2 | RNN | 0.531523 | 0.628869 | 0.425472 |


Different metrics are calculated to compare the models:
*   **Accuracy** (or classification accuracy): fraction of predicted labels matching exactly with true target labels
*   **Recall**: ratio of true positive count to the total actual positive count for a given class (i.e. TP / (TP+FN))
*   **F1-score**: harmonic mean between recall and precision
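As a small worked example of these definitions, the sketch below computes the three metrics from the confusion counts of a toy label set. The labels are hypothetical, and the function assumes at least one actual positive and one predicted positive (otherwise a division by zero occurs).

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, recall, precision and F1-score for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1_score = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1_score": f1_score}

# Hypothetical labels: 4 positive and 4 negative reviews, 2 mistakes in total.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

toy_metrics = binary_metrics(y_true, y_pred)
print(toy_metrics)
```

Here TP = 3, TN = 3, FP = 1 and FN = 1, so all four values equal 0.75.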

Analysis and conclusions are made in section 6.

In [None]:
metrics_confrontation(predictions_all_model, raw_test_ds, model_names)

| | model | accuracy | recall | f1_score |
|------|------|------|------|------|
| 0 | feedforward | 0.88392 | 0.89576 | 0.885278 |
| 1 | skip-Gram | 0.76956 | 0.8816 | 0.792777 |
| 2 | RNN | 0.86224 | 0.89568 | 0.866698 |


The ROC curve plots the True Positive Rate (TPR) vs. the False Positive Rate (FPR) as the decision threshold varies. These ratios are defined as:
*   TPR = TP / (TP + FN)
*   FPR = FP / (FP + TN)

The ideal classifier model is the one whose ROC curve is closest to the top left-hand corner (area under the curve equal to 1).
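To make the construction of the curve concrete, here is a minimal pure-Python sketch that sweeps the decision threshold over a handful of hypothetical scores and integrates the curve with the trapezoidal rule. It is an illustration only; the notebook itself relies on `sklearn.metrics.roc_curve`.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by lowering the decision threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Rank examples from the most to the least confident positive prediction.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for idx, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all examples sharing this score are counted.
        is_last = idx == len(ranked) - 1
        if is_last or ranked[idx + 1][0] != score:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr

def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
               for i in range(1, len(fpr)))

# Hypothetical labels and scores: one negative is ranked above one positive.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

fpr, tpr = roc_points(y_true, scores)
auc = auc_trapezoid(fpr, tpr)
print("AUC:", round(auc, 3))
```

In this toy case 8 of the 9 positive/negative pairs are ranked correctly, which gives an area of 8/9; a random classifier would give about 0.5.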

In [None]:
ROC_confrontation(predictions_all_model, raw_test_ds, model_names)

See section 6 for analysis and conclusions.

In [None]:
"""print a verdict and the model score (between 0 and 1) for a single review"""
def give_sentiment(model, review):
  prediction = model.predict(review, verbose=0)
  score = float(prediction[0][0]) # scalar score of the single review
  if score < 0.5:
    print("The movie looks pretty bad. (score model:", round(score, 2), ")")
  else:
    print("Great movie, go see it in the cinema! (score model:", round(score, 2), ")")

In [None]:
my_review = ["Contrary to other reviews saying the film is terrible and boring. The visuals are impressive, the story is captivating and the actors are brilliant."]
my_review_2 = ["My husband didn't like the film, finding it too boring, but I loved the actors and found the plot very interesting and captivating."]

my_review_2 = tf.constant(my_review_2)

for i in range(len(model_imported)):
  give_sentiment(model_imported[i], my_review_2)

The movie looks pretty bad. (score model: 0.44 )
Great movie, go see it in the cinema! (score model: 0.54 )
The movie looks pretty bad. (score model: 0.34 )


---

## **3. Dimension reduction**

A dimension reduction algorithm is used to transform a multi-dimensional word embedding space into a 3-dimensional space that is easy to visualize. Two algorithms can be used: PCA (linear reduction) and t-SNE (non-linear reduction).
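For intuition about the linear case, the sketch below performs PCA by hand with a singular value decomposition and reports the share of variance kept by the first 3 components. The input is a hypothetical random matrix standing in for an embedding table, not one of the notebook's vector files, and the helper name `pca_reduce` is illustrative.

```python
import numpy as np

def pca_reduce(vectors, n_components=3):
    """Project the rows of `vectors` onto their first principal components."""
    centered = vectors - vectors.mean(axis=0)  # PCA operates on centered data
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    variance = singular_values ** 2  # proportional to the variance per direction
    explained_ratio = variance[:n_components].sum() / variance.sum()
    return reduced, explained_ratio

rng = np.random.default_rng(0)
# Hypothetical stand-in for an embedding matrix: 200 "words" in 16 dimensions,
# built so that almost all the variance lies along 3 latent directions.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 16))
toy_vectors = latent @ mixing + 0.01 * rng.normal(size=(200, 16))

reduced, ratio = pca_reduce(toy_vectors)
print(reduced.shape, round(float(ratio), 3))
```

A ratio close to 1 means that the 3-D plot preserves most of the structure of the original space. A low ratio suggests that a linear projection discards much of it, in which case a non-linear method such as t-SNE may be more informative.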

In this section, dictionaries are also built to facilitate the manipulation and extraction of word/vector data (raw or reduced):

*   `dict_words_vectors_all_models = {word 1: high dimension vector, word 2: high dimension vector, ...}`

*   `dict_words_vectors_DR_all_models = {word 1: low dimension vector, word 2: low dimension vector, ...}`

In [None]:
imported_files = get_github_files()

Cloning into 'TensorFlow_duplicata'...
remote: Enumerating objects: 894, done.
remote: Counting objects: 100% (196/196), done.
remote: Compressing objects: 100% (60/60), done.
remote: Total 894 (delta 183), reused 136 (delta 136), pack-reused 698 (from 3)
Receiving objects: 100% (894/894), 194.67 MiB | 31.43 MiB/s, done.
Resolving deltas: 100% (453/453), done.


In [None]:
"""build a words/vectors dictionary"""
def get_dictionary(vectors, words):
  return {word[0]: vectors[index] for index, word in enumerate(words)}

In [None]:
dict_words_vectors_all_models = [] # list containing all words/vectors dictionaries for all models

for i in range(len(model_names)):
  dict_words_vectors_model = get_dictionary(imported_files[i], imported_files[i+3])
  dict_words_vectors_all_models.append(dict_words_vectors_model)

In [None]:
"""reduce dataset dimension to 3"""
def dim_reduction(coordinate_array, algo_name):

  if algo_name == "TSNE":
    DR_algorithm = TSNE(n_components=3, max_iter=1000, n_iter_without_progress=100, perplexity=50) # DR = dimension reduction
  else:
    DR_algorithm = PCA(n_components=3)

  new_coordinate_array = DR_algorithm.fit_transform(coordinate_array) # apply dimension reduction

  if algo_name == "TSNE":
    total_variance = "NaN" # no variance for t-SNE
  else:
    total_variance = np.sum(DR_algorithm.explained_variance_ratio_)
    total_variance = round(total_variance, 3)

  return new_coordinate_array, total_variance

In [None]:
dict_words_vectors_DR_all_models = [] # list containing all words/DR vectors dictionaries for all models
totale_var = [] # list containing totale variance for all models

for i in range(len(dict_words_vectors_all_models)):
  vectors_DR, var = dim_reduction(imported_files[i], DR_algo_name)
  totale_var.append(var)

  dict_words_vectors_model = get_dictionary(vectors_DR, imported_files[i+3])
  dict_words_vectors_DR_all_models.append(dict_words_vectors_model)

---

## **4. Target word and neighbors**

From a given target word, neighbor words are found and located in the word embedding space. Similarities are computed in the original 16-dimensional word embedding space, not in the reduced one, using either a distance criterion (2-norm) or an angle criterion (cosine similarity).
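The neighbor search can also be written in vectorized form. The sketch below is an alternative illustration, not the notebook's implementation: `nearest_words`, the word list and the 2-D matrix are hypothetical, and non-zero vectors are assumed for the cosine branch.

```python
import numpy as np

def nearest_words(target, words, matrix, n, metric="cosinus"):
    """Return the n words closest to `target`, excluding the target itself."""
    target_vec = matrix[words.index(target)]
    if metric == "norm":
        # Euclidean distance to every word: smaller means closer.
        scores = np.linalg.norm(matrix - target_vec, axis=1)
        order = np.argsort(scores)
    else:
        # Cosine similarity to every word: larger means closer.
        norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(target_vec)
        scores = matrix @ target_vec / norms
        order = np.argsort(-scores)
    return [words[i] for i in order if words[i] != target][:n]

# Hypothetical vocabulary with 2-D vectors, for illustration only.
words = ["good", "great", "bad", "awful"]
matrix = np.array([[1.0, 0.1],
                   [0.9, 0.2],
                   [-1.0, 0.1],
                   [-0.9, -0.1]])

by_cosine = nearest_words("good", words, matrix, 1)
by_norm = nearest_words("good", words, matrix, 1, metric="norm")
print(by_cosine, by_norm)
```

Computing all scores in one matrix operation avoids a Python-level loop over the vocabulary, which matters when the search is repeated for several target words and models.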

In [None]:
"""get the distance between two elements in the embedding space"""
def get_distance(token1, token2, dictionary):
  p1 = dictionary[token1]
  p2 = dictionary[token2]
  distance = np.linalg.norm(p2-p1)
  return distance

"""get the cosine similarity between two elements in the embedding space"""
def get_cosinus_similarity(token1, token2, dictionary):
  p1 = dictionary[token1]
  p2 = dictionary[token2]
  dot_product = np.dot(p1, p2)
  magnitude_1 = np.linalg.norm(p1)
  magnitude_2 = np.linalg.norm(p2)
  cosine_sim = dot_product / (magnitude_1 * magnitude_2)
  return cosine_sim

In [None]:
"""get elements closest to a specific element in the embedding space"""
def get_neighbors(target, words, dictionary, n, metric):

  candidate_list = {} # similarity score of every word with respect to the target

  for i in range(len(words)):
    word_candidate = words[i]

    if metric == "norm":
      candidate_list[word_candidate] = get_distance(target, word_candidate, dictionary)
    else:
      candidate_list[word_candidate] = get_cosinus_similarity(target, word_candidate, dictionary)

  sorted_items = sorted(candidate_list.items(), key=lambda item: item[1])

  if metric == "norm":
    neighbor_list = sorted_items[1:n+1]
  else:
    neighbor_list = sorted_items[-(n+1):-1]

  neighbors = [item[0] for item in neighbor_list]
  return neighbors

In [None]:
neighbor_names_all_models = [] # list containing neighbor names for all models and all target words

for i in range(len(dict_words_vectors_all_models)): # model loop (02, 07, 08)
  words = dict_words_vectors_all_models[i].keys()
  dict_model = dict_words_vectors_all_models[i]

  all_neighbor_names = [] # list containing neighbor names for all target words

  for j, word in enumerate(target_words): # target word loop (defined in section 0)
    neighbors = get_neighbors(word, list(words), dict_model, nb_neighbors, metric_similarity)
    neighbors = [word] + neighbors # include the target word
    all_neighbor_names.append(neighbors)

  neighbor_names_all_models.append(all_neighbor_names)

---

## **5. Word embedding plot**

Visualizing a dimensionally reduced word embedding space helps assess how well a classification model groups and links words. Using the target words and their neighbors, clusters of similar words are formed. Depending on the model and the dimension reduction algorithm used, the quality of the partitioning of the points in the word embedding space can vary:

*   **feedforward model**: both PCA (high variance) and t-SNE work
*   **skip-Gram model**: only t-SNE works
*   **RNN model**: both PCA (high variance) and t-SNE work

In [None]:
plot_word_embedding(model_names, dict_words_vectors_DR_all_models, neighbor_names_all_models, totale_var)

Word analogies can often be solved with vector arithmetic. Analogy analysis is therefore a useful way to measure the quality of the word embedding generated by a classification model.
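As an illustration of the vector arithmetic, the sketch below solves an analogy on hand-made 2-D embeddings (hypothetical values: the first axis loosely encodes "royalty" and the second "gender"). It ranks candidates by L1 distance to the query vector and, unlike a plain nearest-vector search, excludes the three query words from the candidates.

```python
def solve_analogy(a, b, c, embeddings, n=1):
    """Words closest (L1 distance) to vec(a) - vec(b) + vec(c)."""
    query = [x - y + z
             for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]

    def l1_distance(vec):
        return sum(abs(q - v) for q, v in zip(query, vec))

    # Exclude the query words, which otherwise tend to rank first.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return sorted(candidates, key=lambda w: l1_distance(embeddings[w]))[:n]

# Hypothetical embeddings, for illustration only.
toy_embeddings = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "movie": [0.5, 0.0],
}

answer = solve_analogy("king", "man", "woman", toy_embeddings)
print(answer)
```

In this toy space, king - man + woman lands exactly on "queen". With learned embeddings the match is only approximate, which is why the closest few candidates are usually inspected rather than a single one.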

In [None]:
"""get elements found with an analogy"""
def get_analogy(analogy, dictionary_word_vect_models, model_names, n):

  analogy_all_model = []

  for i in range(len(model_names)):
    vector_words_1_model = dictionary_word_vect_models[i]

    vectors = np.stack(list(vector_words_1_model.values()))
    words = list(vector_words_1_model.keys())

    vector_analogy = vector_words_1_model[analogy[0]] - vector_words_1_model[analogy[1]] + vector_words_1_model[analogy[2]]
    candidate_list = {}

    for j in range(len(words)): # word loop (j avoids shadowing the model index i)
      token_candidate = words[j]
      vector_candidate = vectors[j]
      candidate_list[token_candidate] = sum(abs(vector_analogy - vector_candidate)) # L1 distance to the analogy vector

    sorted_items = sorted(candidate_list.items(), key=lambda item: item[1])
    synonym_list = sorted_items[0:n]
    words = [item[0] for item in synonym_list]
    analogy_all_model.append(words)

  return analogy_all_model

In [None]:
my_analogy = ["boy", "man", "girl"]
get_analogy(my_analogy, dict_words_vectors_all_models, model_names, 5)

[['alone', 'intentionally', 'girl', 'next', 'movie'],
 ['boy', 'girl', 'lady', 'babe', 'blonde'],
 ['scope', 'that', 'hence', 'been', 'space']]

Unfortunately, the analogy "woman is to queen as man is to king" is not correctly understood by any model. The error can be explained by the absence of the words "queen" and "king" from the training dataset.

However, sentiment words are correctly understood and analogies are quite accurate for every model.

---

## **6. Conclusion**

The ROC curves and metrics provide a good comparison of the 3 text classification models.

**feedforward model:**<br>

* Performances & prediction quality (P&PQ): <br>

  *   **high values for accuracy and recall**: correctly classifies a large share of movie reviews, with no dominant or favored prediction class (recall = 0.9 --> prediction errors come as much from positive as from negative reviews).
  *   **low-complexity model**: 80,000 parameters; training on a CPU takes only a few minutes.

* Limitations: <br>

  *   **too specific**: word embedding not generalizable to other datasets/tasks (only two important clusters in the word embedding space, and poor at analogies). Training on larger datasets with different classes would be a good idea.
  *   **no context capture**: sufficient for movie review sentiment classification, but not applicable to complex text classification tasks such as literary texts, translation, DNA/protein sequencing, grammar checking, etc.

**skip-Gram model:**<br>

* P&PQ: <br>
  *   **understands relatively well the context**:
  *   **no binary predictions**:

* Limitations: <br>
  *   **low values for accuracy and recall**:
  *   **word embedding space poorly organised**:

**RNN model:**<br>

* P&PQ: <br>
  *   **high values for accuracy and recall**: correctly classifies a large share of movie reviews, with no dominant or favored prediction class (recall = 0.9 --> prediction errors come as much from positive as from negative reviews).
  *   **word embedding space easy to reduce**: both PCA and t-SNE succeed in reducing the dimension of the initial word embedding space. Clusters are well separated.

* Limitations: <br>
  *   **high-complexity model**: training phase too long. Inference time too high. Requires a GPU configuration.
  *   **inefficient context capture**:

| | | | |
|------|------|------|------|
| | feedforward | skip-Gram | RNN  |
| metric evaluation | ++ | - | ++ |
| computing complexity | ++ | + | - |
| word embedding space quality | + | - | ++ |
| context | - | + | - |
| generalization | - | + | ++ |

---


## **7. References**

| | | | | |
|------|------|------|------|------|
| Index | Title | Author(s) | Type | Comments |
|[[1]](https://aclanthology.org/P11-1015.pdf) | IMDB dataset | Andrew L. Maas et al. | dataset & paper | - |
|[[2]](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html) | TSNE | Scikit-learn | documentation | - |
|[[3]](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html) | PCA | Scikit-learn | documentation | - |
|[[4]](https://distill.pub/2016/misread-tsne/) | How to Use t-SNE Effectively | Wattenberg, et al. | paper | - |

In [None]:
print("Notebook run in %.1f seconds on %s" % ((time() - start), tf.config.list_physical_devices(device_type=None)[-1][-1]))

Notebook run in 121.4 seconds on CPU
