<a href="https://colab.research.google.com/github/jessica-guan/Python-DataSci-ML/blob/main/Natural%20Language%20Processing%3A%20Classification%20and%20Generation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Homework 21: Natural Language Processing II**
---

### **Description**
In this week's homework, you will see how to use more advanced forms of neural nets to perform tasks in NLP such as classification and generation.

<br>

### **Structure**
**Part 1**: Sarcasm Detection in the News




<br>

### **Cheat Sheets**
[Natural Language Processing II](https://docs.google.com/document/d/1p3xVUL1F6SEkusCI4klPLYqQwCkVN5s00ZvJjBpiSqM/edit?usp=sharing)

<br>

**Before starting, run the code below to import all necessary functions and libraries.**

In [None]:
!pip install lime

from lime import lime_text
import numpy as np
import pandas as pd

import tensorflow as tf
import os

from keras.models import Sequential
from keras.layers import *
from keras.optimizers import Adam, SGD
from keras.utils import to_categorical

from sklearn.model_selection import train_test_split

from random import choices

import warnings
warnings.filterwarnings('ignore')



---
## **Part 1: Sarcasm Detection in the News**
---

In this section, we will see how to apply what we learned yesterday, in combination with more advanced tools, to determine whether a given text is sarcastic.

<br>

We will be working with a dataset containing the headlines of many news articles, each labeled as sarcastic or not.

<br>


**Run the code provided below to import the dataset.**

In [None]:
data = pd.read_csv('https://docs.google.com/spreadsheets/d/e/2PACX-1vTHrKLcHxF-DSkeH5FMmpFm5KQzDbzdaCdj1aP89wmUVIg_TxLPaveVXt1C8kG2b3aLnuON6cqfABd5/pub?output=csv')
data.head()

x_train, x_test, y_train, y_test = train_test_split(data["headline"], data["is_sarcastic"], test_size = 0.2, random_state = 42)

x_train = np.array(x_train)
x_test = np.array(x_test)

y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
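
`to_categorical` turns each 0/1 label into a two-element one-hot vector, which is the label format a two-unit `softmax` output trained with `categorical_crossentropy` expects. The sketch below mimics that encoding in plain Python. The helper name `one_hot` is hypothetical and is not part of Keras.

```python
def one_hot(labels, num_classes=None):
    """Return one one-hot row per integer label (plain-Python stand-in for to_categorical)."""
    if num_classes is None:
        # Infer the number of classes from the largest label, as an assumption of this sketch
        num_classes = max(labels) + 1
    return [[1.0 if i == label else 0.0 for i in range(num_classes)] for label in labels]


# 0 = not sarcastic, 1 = sarcastic
encoded = one_hot([0, 1, 1, 0])
print(encoded)  # [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
```

Column 0 is therefore the "not sarcastic" class and column 1 the "sarcastic" class, which is also the order of the two `softmax` outputs later on.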

---
### **Part 1.1:  Simple Deep Neural Networks**
---

In this section, you will use the dense layers provided by Keras to build a simple DNN.

#### **Problem #1.1.1: Create the `TextVectorization` layer**


To get started, let's create a `TextVectorization` layer to vectorize this data.

Specifically,
1. Initialize the layer with the specified parameters.

2. Adapt the layer to the training data.

3. Look at the newly built vocabulary.

##### **1. Initialize the layer with the specified parameters.**

* The vocabulary should be at most 2000 words.
* The layer's output should always be 20 integers.
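
To make these two parameters concrete, here is a rough plain-Python sketch of what a text vectorizer in `'int'` mode does: it standardizes the text, keeps the most frequent words up to the vocabulary limit, and maps every input to a fixed-length list of integers. It is only an approximation of `TextVectorization`. The function names, the toy headlines, and the alphabetical tie-breaking are assumptions made for illustration.

```python
import string
from collections import Counter


def standardize(text):
    """Lowercase the text and strip ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


def build_vocabulary(texts, max_tokens):
    """Index 0 is padding, index 1 is the unknown token, the rest are the most frequent words."""
    counts = Counter(word for text in texts for word in standardize(text).split())
    # Descending frequency, ties broken alphabetically (an assumption of this sketch)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ["", "[UNK]"] + ranked[: max_tokens - 2]


def vectorize(text, vocabulary, output_sequence_length):
    """Map words to indices, then truncate or zero-pad to a fixed length."""
    index = {word: i for i, word in enumerate(vocabulary)}
    ids = [index.get(word, 1) for word in standardize(text).split()]
    ids = ids[:output_sequence_length]
    return ids + [0] * (output_sequence_length - len(ids))


headlines = ["Man finds the keys", "The man, the myth", "New keys for the city"]
vocab = build_vocabulary(headlines, max_tokens=5)
print(vocab)  # ['', '[UNK]', 'the', 'keys', 'man']
print(vectorize("The man finds a city!", vocab, output_sequence_length=8))
# [2, 4, 1, 1, 1, 0, 0, 0]
```

Words outside the vocabulary all collapse to index 1, and short inputs are padded with 0, so with `max_tokens = 2000` and `output_sequence_length = 20` every headline becomes exactly 20 integers between 0 and 1999.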

In [None]:
vectorize_layer = TextVectorization(
    max_tokens = 2000,
    output_mode = 'int',
    output_sequence_length = 20
  )

##### **2. Adapt the layer to the training data.**

In [None]:
vectorize_layer.adapt(x_train)

##### **3. Look at the newly built vocabulary.**

In [None]:
vectorize_layer.get_vocabulary()[:50]

['',
 '[UNK]',
 'to',
 'of',
 'the',
 'in',
 'for',
 'a',
 'on',
 'and',
 'with',
 'is',
 'new',
 'trump',
 'man',
 'from',
 'at',
 'about',
 'you',
 'this',
 'by',
 'after',
 'be',
 'how',
 'out',
 'it',
 'up',
 'that',
 'as',
 'not',
 'your',
 'his',
 'are',
 'what',
 'he',
 'just',
 'us',
 'has',
 'who',
 'more',
 'will',
 'all',
 'one',
 'into',
 'report',
 'why',
 'have',
 'donald',
 'area',
 'over']

#### **Problem #1.1.2: Build the model**


Complete the code below to build a model with the following layers.

An `Embedding` layer such that:
* The vocabulary contains 2000 tokens.
* The input length corresponds to the output of the vectorization layer.
* The number of outputs per input is 128.

<br>

Hidden layers:

* A  `Flatten` layer.
* A  `Dense` layer with 64 units (hidden states).

<br>

An output `Dense` layer with 2 units, one per class. You can use the `softmax` activation.
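
Before building the model, it can help to work out the shapes and parameter counts these layers imply. The arithmetic below is a sketch based only on the sizes listed above; the variable names are illustrative.

```python
vocab_size = 2000       # tokens known to the Embedding layer
sequence_length = 20    # integers per headline from the vectorization layer
embedding_dim = 128     # outputs per input token
hidden_units = 64
num_classes = 2

# Embedding: one 128-dimensional vector per vocabulary entry
embedding_params = vocab_size * embedding_dim

# Flatten: (20, 128) becomes a single vector of 20 * 128 values
flattened_size = sequence_length * embedding_dim

# Dense layers: one weight per input-unit pair, plus one bias per unit
hidden_params = flattened_size * hidden_units + hidden_units
output_params = hidden_units * num_classes + num_classes

total_params = embedding_params + hidden_params + output_params
print(embedding_params, flattened_size, hidden_params, output_params, total_params)
# 256000 2560 163904 130 420034
```

Most of the trainable parameters sit in the embedding table and the first dense layer, which is worth keeping in mind when comparing training and test accuracy later.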

In [None]:
model = Sequential()

# Input, Vectorization, and Embedding Layers
model.add(Input(shape=(1,), dtype=tf.string))
model.add(vectorize_layer)
model.add(Embedding(input_dim=2000, output_dim=128))

# Hidden Layers
model.add(Flatten())
model.add(Dense(64, activation='relu'))

# Output Layer
model.add(Dense(2, activation='softmax'))

#### **Problem #1.1.3: Compile and fit the model**


Using standard parameters for classification, compile and train this neural network using:
* A learning rate of 0.01.
* A batch size of 200.
* 5 epochs.
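
For a two-unit `softmax` output with one-hot labels, the usual loss is categorical cross-entropy, which is presumably what "standard parameters" refers to here. The following plain-Python sketch shows how that loss is computed for a single example. The function names and the example scores are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probabilities):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probabilities) if t > 0)


probs = softmax([2.0, 0.0])  # the model favors class 0
loss_if_class_0 = categorical_crossentropy([1.0, 0.0], probs)
loss_if_class_1 = categorical_crossentropy([0.0, 1.0], probs)
print(probs, loss_if_class_0, loss_if_class_1)
```

The loss is small when the model puts high probability on the true class and large when it is confidently wrong, which is the signal the optimizer uses during training.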

In [None]:
opt = Adam(learning_rate = 0.01)
model.compile(optimizer = opt, loss = 'categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=200, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x7975c45b1e40>

#### **Problem #1.1.4: Evaluate the model**


Now, evaluate the model for both the training and test sets.

In [None]:
# Evaluate the model on the training set
train_loss, train_accuracy = model.evaluate(x_train, y_train, verbose=0)
print("Training Accuracy:", train_accuracy)
print("Training Loss:", train_loss)

# Evaluate the model on the test set
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
print("Test Accuracy:", test_accuracy)
print("Test Loss:", test_loss)

Training Accuracy: 0.9813263416290283
Training Loss: 0.05133132264018059
Test Accuracy: 0.8161737322807312
Test Loss: 0.8460763096809387
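
The two accuracies are worth comparing directly: a training accuracy far above the test accuracy usually indicates that the model is overfitting. A small sketch using the values printed above (your own numbers will differ from run to run):

```python
# Values copied from the output above
train_accuracy = 0.9813263416290283
test_accuracy = 0.8161737322807312

gap = train_accuracy - test_accuracy
print(f"Accuracy gap: {gap:.4f}")  # Accuracy gap: 0.1652
```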


---
### **Part 1.2: Convolutional Neural Nets**
---

In this section, you will approach the same problem using the `Conv1D` and `MaxPooling1D` layers provided by Keras.

#### **Problem #1.2.1: Build a CNN**


Complete the code below to build, *but not train*, a new CNN model. Specifically, create a model identical to the one above except with the following hidden layers placed before the `Flatten` layer:

* A `Conv1D` layer with `filters = 1`, `kernel_size = 4`, and `activation = 'relu'`.
* A `MaxPooling1D` layer with `pool_size = 2`.

In [None]:
model = Sequential()

# Input, Vectorization, and Embedding Layers
model.add(Input(shape=(1,), dtype=tf.string))
model.add(vectorize_layer)
model.add(Embedding(input_dim=2000, output_dim=128))

# Hidden Layers: convolution and pooling, then flatten and dense
model.add(Conv1D(filters=1, kernel_size=4, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(64, activation='relu'))

# Output Layer
model.add(Dense(2, activation='softmax'))

#### **Problem #1.2.2: Examine the CNN's structure**


Now, let's look at the structure of this neural network.

**Run the code below to print the input and output shapes of each layer.**

In [None]:
for layer in model.layers:
  print(str(layer.input_shape) + " -> " + str(layer.output_shape))

(None, 1) -> (None, 20)
(None, 20) -> (None, 20, 128)
(None, 20, 128) -> (None, 17, 1)
(None, 17, 1) -> (None, 8, 1)
(None, 8, 1) -> (None, 8)
(None, 8) -> (None, 64)
(None, 64) -> (None, 2)
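
The lengths 17 and 8 in this output follow from how the two layers slide over the sequence. With default settings, a kernel of size 4 fits into 20 positions at 20 - 4 + 1 = 17 places, and pooling in pairs keeps 17 // 2 = 8 values. The plain-Python sketch below reimplements both operations for a single filter on a toy input to show this. The function names and the 3-channel input are assumptions for illustration (the notebook's embedding has 128 channels).

```python
def conv1d_single_filter(sequence, kernel, bias=0.0):
    """'Valid' 1D convolution with one filter, stride 1, and ReLU.

    sequence: list of time steps, each a list of channel values.
    kernel: list of kernel_size steps, each a list of channel weights.
    """
    kernel_size = len(kernel)
    outputs = []
    for start in range(len(sequence) - kernel_size + 1):
        total = bias
        for step in range(kernel_size):
            total += sum(w * x for w, x in zip(kernel[step], sequence[start + step]))
        outputs.append(max(0.0, total))  # ReLU
    return outputs


def max_pool1d(values, pool_size):
    """Non-overlapping max pooling; leftover values at the end are dropped."""
    return [max(values[i:i + pool_size])
            for i in range(0, len(values) - pool_size + 1, pool_size)]


# Toy input: 20 time steps with 3 channels each
sequence = [[float(t), 1.0, -1.0] for t in range(20)]
kernel = [[0.25, 0.0, 0.0]] * 4  # averages channel 0 over a window of 4 steps

convolved = conv1d_single_filter(sequence, kernel)
pooled = max_pool1d(convolved, pool_size=2)
print(len(convolved), len(pooled))  # 17 8
```

Because there is only one filter, each window of four word embeddings is summarized by a single number, which is why the last dimension shrinks from 128 to 1.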


#### **Problem #1.2.3: Train and Evaluate the CNN**


Now, complete the code below to train and evaluate this model.

In [None]:
# Fitting
opt = Adam(learning_rate = 0.01)
model.compile(optimizer = opt, loss = 'categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=200, epochs=5)


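# Evaluating
print("\n\n\n")
model.evaluate(x_train, y_train, verbose=0)  # not displayed: a cell only shows the value of its last expression
model.evaluate(x_test, y_test, verbose=0)    # displayed below as [test loss, test accuracy]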

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5






[0.7122322916984558, 0.8178584575653076]

#### **Problem #1.2.4: Adjust the model**


Complete the code below to train a new CNN model. Specifically, create a model identical to the ones above except with hidden layers as follows:

* A `Conv1D` layer with `filters = 64`, `kernel_size = 3`, and `activation = 'relu'`.
* A `MaxPooling1D` layer with `pool_size = 2`.

In [None]:
model = Sequential()

# Input, Vectorization, and Embedding Layers
model.add(Input(shape=(1,), dtype=tf.string))
model.add(vectorize_layer)
model.add(Embedding(input_dim=2000, output_dim=128))

# Hidden Layers: convolution and pooling, then flatten and dense
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(64, activation='relu'))

# Output Layer
model.add(Dense(2, activation='softmax'))


# Printing Structure
for layer in model.layers:
  print(str(layer.input_shape) + " -> " + str(layer.output_shape))
print("\n\n\n")


# Fitting
opt = Adam(learning_rate = 0.01)
model.compile(optimizer = opt, loss = 'categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=200, epochs=5)


# Evaluating
print("\n\n\n")
model.evaluate(x_train, y_train, verbose=0)  # not displayed: a cell only shows the value of its last expression
model.evaluate(x_test, y_test, verbose=0)    # displayed below as [test loss, test accuracy]

(None, 1) -> (None, 20)
(None, 20) -> (None, 20, 128)
(None, 20, 128) -> (None, 18, 64)
(None, 18, 64) -> (None, 9, 64)
(None, 9, 64) -> (None, 576)
(None, 576) -> (None, 64)
(None, 64) -> (None, 2)




Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5






[0.6236959099769592, 0.828528642654419]

#### **Problem #1.2.5: Improve the model**
---

This last model likely performs better on the test set than most others you have seen today. It will be challenging to beat; however, see if you can improve it further by adjusting parameters such as:

* **Number of filters**: Can we get away with fewer filters? Should we add more filters?
* **Kernel size**: What happens when we change this more significantly?
* **Pool size**: Should we be pooling more inputs together or fewer?
* **Dense layers**: Would adding more dense layers after the convolved results are pooled and flattened help?
* **Number of convolutional and pooling layers**: If you're careful about the input and output shapes, it is possible to stack multiple convolutional and pooling layers.
* **Training parameters**: Would it help to adjust the learning rate, number of epochs, or batch size?
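
When stacking several convolutional and pooling layers, the sequence length shrinks at every step, so it is worth checking the shapes before building the model. The helper below is a hypothetical planning aid, not a Keras function. It assumes the default settings used in this notebook: `'valid'` padding, a stride of 1 for `Conv1D`, and a stride equal to `pool_size` for `MaxPooling1D`.

```python
def plan_lengths(sequence_length, layers):
    """Return the sequence length after each layer in a conv/pool stack.

    layers: list of ("conv", kernel_size) or ("pool", pool_size) tuples.
    """
    lengths = []
    length = sequence_length
    for kind, size in layers:
        if kind == "conv":
            length = length - size + 1   # kernel must fit entirely inside the sequence
        elif kind == "pool":
            length = length // size      # leftover positions are dropped
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        if length < 1:
            raise ValueError(f"{kind} layer with size {size} leaves no positions")
        lengths.append(length)
    return lengths


# Two conv/pool blocks on 20-token headlines
print(plan_lengths(20, [("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2)]))
# [18, 9, 7, 3]
```

With 20-token inputs, a third block of the same kind would shrink the sequence to nothing, so two such blocks are the practical limit unless you use smaller kernels or pool sizes.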

In [None]:
model = Sequential()

# Input, Vectorization, and Embedding Layers
model.add(Input(shape=(1,), dtype=tf.string))
model.add(vectorize_layer)
model.add(Embedding(input_dim=2000, output_dim=128))

# Hidden Layers: convolution and pooling, then flatten and dense
model.add(Conv1D(filters=63, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(64, activation='relu'))

# Output Layer
model.add(Dense(2, activation='softmax'))


# Printing Structure
for layer in model.layers:
  print(str(layer.input_shape) + " -> " + str(layer.output_shape))
print("\n\n\n")


# Fitting
opt = Adam(learning_rate = 0.01)
model.compile(optimizer = opt, loss = 'categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=200, epochs=5)


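# Evaluating
print("\n\n\n")
model.evaluate(x_train, y_train, verbose=0)  # not displayed: a cell only shows the value of its last expression
model.evaluate(x_test, y_test, verbose=0)    # displayed below as [test loss, test accuracy]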

(None, 1) -> (None, 20)
(None, 20) -> (None, 20, 128)
(None, 20, 128) -> (None, 18, 63)
(None, 18, 63) -> (None, 9, 63)
(None, 9, 63) -> (None, 567)
(None, 567) -> (None, 64)
(None, 64) -> (None, 2)




Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5






[0.6140491366386414, 0.8294646143913269]

---
# End of notebook

© 2024 The Coding School, All rights reserved