# WELCOME TO **"TensorFlow for Natural Language Processing" Series**  😁 

TensorFlow makes it easy for beginners and experts to create machine learning models for desktop, mobile, web, and cloud. TensorFlow provides a collection of workflows to develop and train models using Python, JavaScript, or Swift, and to easily deploy in the cloud, on-prem, in the browser, or on-device no matter what language you use.

We will see how to gain insights into text data and, hands-on, how to use those insights to train NLP models that perform tasks mimicking human language skills. Let’s dive in and look at some of the basics of NLP.
<br/> <br/>
**In this series of 4 project courses, you will learn hands-on how to build Natural Language Processing algorithms and how to build, train, and test neural networks for NLP with TensorFlow!** 😎

## 👉🏻 Course 1: Text Embedding and Classification

## 👉🏻 Course 2: Semantic Similarity in Texts

## 👉🏻 Course 3: Sentiment Analysis in Texts

## 👉🏻 Course 4: Text Generation with RNNs

In [None]:
print ("Let's start with Course 1: Word and Text Embeddings")

# WELCOME to this guided project "Semantic Similarity in Texts" on Coursera Labs! 😁 
#### This project course is part of the "TensorFlow for Natural Language Processing" series of project courses on Coursera.<br/><br/>

In this project, we will start coding, and we will go through 5 tasks:<br/> <br/>
👉🏻 **Task 1**: Introduction and Overview of the Project. <br/><br/>
👉🏻 **Task 2**: Import Libraries and Create Text Representations. <br/><br/>
👉🏻 **Task 3**: Create and Visualize Semantic Similarity. <br/><br/>
👉🏻 **Task 4**: Download the Data for Semantic Similarity. <br/><br/>
👉🏻 **Task 5**: Evaluate the Semantic Textual Similarity. <br/><br/>

## 👉🏻 Task 1: Introduction and Overview of the Project
This project illustrates how to access the Universal Sentence Encoder and use it for sentence similarity and sentence classification tasks. 🌌
<br/>
The Universal Sentence Encoder makes getting sentence-level embeddings as easy as it has historically been to look up the embeddings for individual words. 🔤<br/>
The sentence embeddings can then be used directly to compute sentence-level meaning similarity, and to improve performance on downstream classification tasks with less supervised training data. <br/>

At the end of this project, you will try out an amazing Bonus Exercise! 🤩

## 👉🏻 Task 2: Import Libraries and Create Text Representations

**Import Libraries**

This section sets up the environment for access to the Universal Sentence Encoder on TF Hub and provides examples of applying the encoder to words, sentences, and paragraphs.🔤

In [None]:
%%capture
!pip3 install seaborn

In [None]:
#@title Load the Universal Sentence Encoder's TF Hub module
from absl import logging

import tensorflow as tf

import tensorflow_hub as hub
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import re
import seaborn as sns

module_url = "https://tfhub.dev/google/universal-sentence-encoder/4" #@param ["https://tfhub.dev/google/universal-sentence-encoder/4", "https://tfhub.dev/google/universal-sentence-encoder-large/5"]
model = hub.load(module_url)
print ("module %s loaded" % module_url)
def embed(input):
  return model(input)

In [None]:
#@title Compute a representation for each message, showing various lengths supported.
word = "Elephant"
sentence = "I am a sentence for which I would like to get its embedding."
paragraph = (
    "Universal Sentence Encoder embeddings also support short paragraphs. "
    "There is no hard limit on how long the paragraph is. Roughly, the longer "
    "the more 'diluted' the embedding will be.")
messages = [word, sentence, paragraph]

# Reduce logging output.
logging.set_verbosity(logging.ERROR)

message_embeddings = embed (messages) ### YOUR CODE HERE

for i, message_embedding in enumerate(np.array(message_embeddings).tolist()):
  print("Message: {}".format(messages[i]))
  print("Embedding size: {}".format(len(message_embedding)))
  message_embedding_snippet = ", ".join(
      (str(x) for x in message_embedding[:3]))
  print("Embedding: [{}, ...]\n".format(message_embedding_snippet))

## 👉🏻 Task 3: Create and Visualize Semantic Similarity

The embeddings produced by the Universal Sentence Encoder are approximately normalized.<br/>
The semantic similarity of two sentences can be trivially computed as the inner product of the encodings. 👨🏽‍🤝‍👨🏽
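
To make the inner-product idea concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and their labels are made up for illustration (they are not real encoder outputs), and `l2_normalize` and `inner` are hypothetical helpers, not part of TF Hub. Once vectors have unit length, their inner product equals their cosine similarity.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def inner(a, b):
    """Inner (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy "embeddings", invented for this example only.
toy_embeddings = {
    "phone_a": [0.9, 0.1, 0.0],
    "phone_b": [0.8, 0.2, 0.1],
    "weather": [0.0, 0.2, 0.9],
}
unit = {name: l2_normalize(vec) for name, vec in toy_embeddings.items()}

# For unit vectors, the inner product is the cosine similarity.
sim_same_topic = inner(unit["phone_a"], unit["phone_b"])
sim_diff_topic = inner(unit["phone_a"], unit["weather"])
print("same topic:", sim_same_topic)
print("different topic:", sim_diff_topic)
```

The two vectors that point in nearly the same direction score close to 1, while the unrelated pair scores close to 0, which is the pattern the heat map in this task visualizes.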

### Create the Semantic Textual Similarity

In [None]:
def plot_similarity(labels, features, rotation):
  corr = np.inner(features, features)
  sns.set(font_scale=1.2)
  g = sns.heatmap(
      corr,
      xticklabels= labels, ### YOUR CODE HERE
      yticklabels= labels, ### YOUR CODE HERE
      vmin=0,
      vmax=1,
      cmap="YlOrRd")
  g.set_xticklabels(labels, rotation=rotation)
  g.set_title("Semantic Textual Similarity")

def run_and_plot(messages_):
  message_embeddings_ = embed(messages_)
  plot_similarity(messages_, message_embeddings_, 90)

### Visualize the Similarity 
Here we show the similarity in a heat map. 🗺 <br/>
The final graph is an 11x11 matrix (one row and one column per sentence in the list below) where each entry `[i, j]` is colored based on the inner product of the encodings for sentence `i` and `j`.

In [None]:
messages = [
    # Smartphones
    "I like my phone",
    "My phone is not good.",
    "Your cellphone looks great.",

    # Weather
    "Will it snow tomorrow?",
    "Recently a lot of hurricanes have hit the US",
    "Global warming is real",

    # Food and health
    "An apple a day, keeps the doctors away",
    "Eating strawberries is healthy",
    "Is paleo better than keto?",

    # Asking about age
    "How old are you?",
    "what is your age?",
]

run_and_plot(messages) ### YOUR CODE HERE 
               

## 👉🏻 Task 4: Download the Data for Semantic Similarity

The **STS Benchmark** provides an intrinsic evaluation of the degree to which similarity scores computed using sentence embeddings align with human judgements.⚖ <br/>
The benchmark requires systems to return similarity scores for a diverse selection of sentence pairs. ✌🏻<br/>
Pearson correlation is then used to evaluate the quality of the machine similarity scores against human judgements. ⚖
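
As a rough illustration of that evaluation step, the sketch below computes a Pearson correlation coefficient by hand in plain Python. The `pearson` helper and both score lists are hypothetical and invented for this example; the notebook itself relies on `scipy.stats.pearsonr` for the real computation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers: human ratings on a 0-5 scale and model scores on a 0-1 scale.
human_scores = [0.5, 1.5, 3.0, 4.2, 5.0]
model_scores = [0.30, 0.45, 0.60, 0.80, 0.90]

r = pearson(human_scores, model_scores)
print("Pearson r =", r)
```

Because Pearson correlation is invariant to shifting and rescaling either list, the model scores do not need to be on the same scale as the human ratings; only the linear relationship between them matters.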

### ⭐Download data

In [None]:
import pandas
import scipy.stats  # explicit submodule import; scipy.stats.pearsonr is used in Task 5
import math
import csv

sts_dataset = tf.keras.utils.get_file(
    fname="Stsbenchmark.tar.gz",
    origin="http://ixa2.si.ehu.es/stswiki/images/4/48/Stsbenchmark.tar.gz",
    extract=True)
sts_dev = pandas.read_table(
    os.path.join(os.path.dirname(sts_dataset), "stsbenchmark", "sts-dev.csv"),
    on_bad_lines="skip",  # replaces error_bad_lines, which was removed in pandas 2.0
    skip_blank_lines=True,
    usecols=[4, 5, 6],
    names=["sim", "sent_1", "sent_2"])
sts_test = pandas.read_table(
    os.path.join(
        os.path.dirname(sts_dataset), "stsbenchmark", "sts-test.csv"),
    on_bad_lines="skip",  # replaces error_bad_lines, which was removed in pandas 2.0
    quoting=csv.QUOTE_NONE,
    skip_blank_lines=True,
    usecols=[4, 5, 6],
    names=["sim", "sent_1", "sent_2"])
# cleanup some NaN values in sts_dev
sts_dev = sts_dev[[isinstance(s, str) for s in sts_dev['sent_2']]]

## 👉🏻 Task 5: Evaluate the Semantic Textual Similarity

⭐ Let's now evaluate the semantic textual embeddings!

In [None]:
sts_data = sts_dev #@param ["sts_dev", "sts_test"] {type:"raw"}

def run_sts_benchmark(batch):
  """Returns the similarity scores"""
  # Normalize both sets of embeddings so their dot product is the cosine similarity.
  sts_encode1 = tf.nn.l2_normalize(embed(tf.constant(batch['sent_1'].tolist())), axis=1)
  sts_encode2 = tf.nn.l2_normalize(embed(tf.constant(batch['sent_2'].tolist())), axis=1)
  cosine_similarities = tf.reduce_sum(tf.multiply(sts_encode1, sts_encode2), axis=1)
  # Clip to [-1, 1] so rounding error cannot push acos out of its domain.
  clip_cosine_similarities = tf.clip_by_value(cosine_similarities, -1.0, 1.0)
  # Convert the angle between the vectors into a similarity in [0, 1].
  scores = 1.0 - tf.acos(clip_cosine_similarities) / math.pi ### YOUR CODE HERE
  return scores

dev_scores = sts_data['sim'].tolist()
scores = []
for batch in np.array_split(sts_data, 10):
  scores.extend(run_sts_benchmark(batch))

pearson_correlation = scipy.stats.pearsonr(scores, dev_scores)
print('Pearson correlation coefficient = {0}\np-value = {1}'.format(
    pearson_correlation[0], pearson_correlation[1]))
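
The score used in `run_sts_benchmark` turns a cosine similarity into an angular similarity. The plain-Python sketch below mirrors that one formula outside TensorFlow; `angular_similarity` is a hypothetical helper name used only here.

```python
import math

def angular_similarity(cosine):
    """Map a cosine similarity to 1 - angle/pi, clipping to the valid acos domain first."""
    clipped = max(-1.0, min(1.0, cosine))
    return 1.0 - math.acos(clipped) / math.pi

# Identical direction, orthogonal, and opposite direction.
for cosine in (1.0, 0.0, -1.0):
    print(cosine, "->", angular_similarity(cosine))

# A value slightly above 1 (as floating-point rounding can produce) is clipped, not an error.
print(angular_similarity(1.0000001))
```

Identical directions map to 1, orthogonal vectors to 0.5, and opposite directions to 0. The clipping step matters because `acos` is undefined outside [-1, 1], and rounding can push a computed cosine slightly past those bounds.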

## Bonus: Extra Exercise!
_Refresh Your Memory..._ 😋

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

There are three built-in RNN layers in Keras:

- keras.layers.SimpleRNN, a fully-connected RNN where the output from the previous timestep is fed to the next timestep.

- keras.layers.GRU, first proposed in Cho et al., 2014.

- keras.layers.LSTM, first proposed in Hochreiter & Schmidhuber, 1997.

In early 2015, Keras had the first reusable open-source Python implementations of LSTM and GRU.

Here is a simple example of a Sequential model that processes sequences of integers, embeds each integer into a 64-dimensional vector, then processes the sequence of vectors using an LSTM layer.

In [None]:
#creating the RNN model

model = keras.Sequential()
# Add an Embedding layer expecting input vocab of size 1000, and
# output embedding dimension of size 64.
model.add(layers.Embedding(input_dim=1000, output_dim=64))

# Add a LSTM layer with 128 internal units.
model.add(layers.LSTM(128))

# Add a Dense layer with 10 units.
model.add(layers.Dense(10))

model.summary()

By default, the output of an RNN layer contains a single vector per sample. This vector is the RNN cell output corresponding to the last timestep, containing information about the entire input sequence. The shape of this output is (batch_size, units), where units corresponds to the units argument passed to the layer's constructor.

An RNN layer can also return the entire sequence of outputs for each sample (one vector per timestep per sample) if you set return_sequences=True. The shape of this output is (batch_size, timesteps, units).
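
To see the difference between the two output modes without running Keras, here is a minimal plain-Python sketch of a tanh recurrent layer applied to a single sample. It is an illustration only: `simple_rnn`, the random weights, and the sizes are assumptions made for this example, there is no bias term, and it is not how Keras implements its layers.

```python
import math
import random

def simple_rnn(sequence, w_in, w_rec, return_sequences=False):
    """Run a tanh RNN over one sample; `sequence` is a list of input vectors."""
    units = len(w_rec)
    h = [0.0] * units  # initial hidden state
    outputs = []
    for x_t in sequence:
        # New state mixes the current input with the previous state.
        h = [
            math.tanh(
                sum(w_in[j][i] * x_t[i] for i in range(len(x_t)))
                + sum(w_rec[j][k] * h[k] for k in range(units))
            )
            for j in range(units)
        ]
        outputs.append(h)
    # Either every timestep's output, or only the last one.
    return outputs if return_sequences else outputs[-1]

random.seed(0)
timesteps, features, units = 5, 3, 4
w_in = [[random.uniform(-1, 1) for _ in range(features)] for _ in range(units)]
w_rec = [[random.uniform(-1, 1) for _ in range(units)] for _ in range(units)]
sample = [[random.uniform(-1, 1) for _ in range(features)] for _ in range(timesteps)]

last = simple_rnn(sample, w_in, w_rec)
full = simple_rnn(sample, w_in, w_rec, return_sequences=True)
print(len(last))                # one vector of length `units`
print(len(full), len(full[0]))  # `timesteps` vectors, each of length `units`
```

For one sample, the default mode yields a single vector of length `units`, while the full-sequence mode yields one such vector per timestep, and the last of those is the same vector the default mode returns.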

In [None]:
# Outputs and states in RNN

model = keras.Sequential()
model.add(layers.Embedding(input_dim=1000, output_dim=64))

# The output of GRU will be a 3D tensor of shape (batch_size, timesteps, 256)
model.add(layers.GRU(256, return_sequences=True))

# The output of SimpleRNN will be a 2D tensor of shape (batch_size, 128)
model.add(layers.SimpleRNN(128))

model.add(layers.Dense(10))

model.summary()

In addition, an RNN layer can return its final internal state(s). The returned states can be used to resume the RNN execution later, or to initialize another RNN. This setting is commonly used in the encoder-decoder sequence-to-sequence model, where the encoder's final state is used as the initial state of the decoder.

To configure an RNN layer to return its internal state, set the return_state parameter to True when creating the layer. Note that LSTM has 2 state tensors, but GRU only has one.

To configure the initial state of the layer, just call the layer with additional keyword argument initial_state. Note that the shape of the state needs to match the unit size of the layer, like in the example below.

In [None]:
encoder_vocab = 1000
decoder_vocab = 2000

encoder_input = layers.Input(shape=(None,))
encoder_embedded = layers.Embedding(input_dim=encoder_vocab, output_dim=64)(
    encoder_input
)

# Return states in addition to output
output, state_h, state_c = layers.LSTM(64, return_state=True, name="encoder")(
    encoder_embedded
)
encoder_state = [state_h, state_c]

decoder_input = layers.Input(shape=(None,))
decoder_embedded = layers.Embedding(input_dim=decoder_vocab, output_dim=64)(
    decoder_input
)

# Pass the 2 states to a new LSTM layer, as initial state
decoder_output = layers.LSTM(64, name="decoder")(
    decoder_embedded, initial_state=encoder_state
)
output = layers.Dense(10)(decoder_output)

model = keras.Model([encoder_input, decoder_input], output)
model.summary()

Which of the following is true? 🤔 <br/>

i) On average, neural networks have higher computational rates than conventional computers. <br/>
ii) Neural networks learn by example. <br/>
iii) Neural networks mimic the way the human brain works. <br/>
<br/>
a) All of the mentioned are true <br/>
b) (ii) and (iii) are true <br/>
c) (i), (ii) and (iii) are true <br/>
d) None of the mentioned <br/>
<br/> . <br/> .<br/> . <br/> .<br/> . <br/> .<br/> . <br/> .<br/> . <br/> .<br/> . <br/> .<br/> . <br/> .<br/> . <br/> .<br/> . <br/> .<br/> . <br/> .

a)✔ <br/>
✨ Neural networks have higher computational rates than conventional computers because many of their operations run in parallel. That is not the case when the neural network is simulated on a conventional computer. The idea behind neural nets is based on the way the human brain works. Neural nets are not explicitly programmed; they learn from examples.

# CONGRATULATIONS! 🤩