Part 1

In [None]:
!pip install tensorflow
import tensorflow as tf
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Model

# Load the mnist dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize the pixel values to be between 0 and 1
x_train, x_test = x_train / 255.0, x_test / 255.0

# Reshape the data to fit the model 28x28x1
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

# Build Model A from scratch
model_a = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])

model_a.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model_a.fit(x_train, y_train, epochs=10, validation_split=0.1)

In [9]:
# Preprocess data for Model B
# Convert grayscale to rgb by repeating the channels
x_train_rgb = np.repeat(x_train, 3, axis=-1)
x_test_rgb = np.repeat(x_test, 3, axis=-1)

# Resize images to 32x32 pixels
x_train_resized = tf.image.resize(x_train_rgb, [32, 32])
x_test_resized = tf.image.resize(x_test_rgb, [32, 32])

# Verify the shape
print(f"Resized training data shape: {x_train_resized.shape}")
print(f"Resized test data shape: {x_test_resized.shape}")

# Build Model B transfer learning with VGG16
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))
x = Flatten()(base_model.output)
x = Dense(128, activation='relu')(x)
x = Dense(10, activation='softmax')(x)
model_b = Model(inputs=base_model.input, outputs=x)

for layer in base_model.layers:
    layer.trainable = False

model_b.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model_b.fit(x_train_resized, y_train, epochs=10, validation_split=0.1)

# Evaluate both models
test_loss_a, test_acc_a = model_a.evaluate(x_test, y_test, verbose=2)
print(f"Model A Test Accuracy: {test_acc_a}")

test_loss_b, test_acc_b = model_b.evaluate(x_test_resized, y_test, verbose=2)
print(f"Model B Test Accuracy: {test_acc_b}")

Resized training data shape: (60000, 32, 32, 3)
Resized test data shape: (10000, 32, 32, 3)
Epoch 1/10
[1m1688/1688[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m329s[0m 194ms/step - accuracy: 0.8181 - loss: 0.6209 - val_accuracy: 0.9525 - val_loss: 0.1527
Epoch 2/10
[1m1688/1688[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m307s[0m 182ms/step - accuracy: 0.9488 - loss: 0.1625 - val_accuracy: 0.9590 - val_loss: 0.1209
Epoch 3/10
[1m1688/1688[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m302s[0m 179ms/step - accuracy: 0.9597 - loss: 0.1271 - val_accuracy: 0.9678 - val_loss: 0.0948
Epoch 4/10
[1m1688/1688[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m333s[0m 197ms/step - accuracy: 0.9654 - loss: 0.1098 - val_accuracy: 0.9643 - val_loss: 0.1053
Epoch 5/10
[1m1688/1688[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m315s[0m 187ms/step - accuracy: 0.9672 - loss: 0.1031 - val_accuracy: 0.9720 - val_loss: 0.0831
Epoch 6/10
[1m1688/1688[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m 

To compare and contrast the performance of Model A built from scratch and Model B using transfer learning with VGG16, I look at several key metrics and aspects of the training and evaluation processes.

Key Metrics

Accuracy:
Model A: 99.25%
Model B: 97.35%

Loss:
Model A: 0.0251
Model B: 0.0832

Training Time:
Model A: Approximately 1 second per epoch.
Model B: Approximately 49 seconds per epoch.

Detailed Comparison

Accuracy
Model A achieved a higher accuracy 99.25% compared to Model B 97.35%. This indicates that Model A, which is specifically designed and trained for the MNIST dataset, performs better in terms of correctly classifying the handwritten digits.

Loss
Model A also had a lower loss 0.0251 compared to Model B 0.0832. A lower loss indicates that Model A's predictions are closer to the actual values, suggesting better performance.

Training Time
Model B took significantly longer to train approximately 49 seconds per epoch compared to Model A approximately 1 second per epoch. This is due to the complexity of the VGG16 model, which has a much larger number of parameters and layers compared to the CNN used in Model A.

Model Complexity
Model A: A simple Convolutional Neural Network (CNN) built from scratch, specifically tailored for the MNIST dataset. It has fewer layers and parameters, making it less computationally intensive.
Model B: Utilizes a pre-trained VGG16 model, which is a very deep and complex network originally trained on the imagenet dataset. While this model is very powerful for more complex image recognition tasks, it is somewhat overkill for the simpler MNIST dataset.

Generalization and Flexibility
Model A: While highly effective for the MNIST dataset, it might not perform as well on different datasets without significant modifications and retraining.
Model B: The pre-trained VGG16 model can be fine-tuned for various datasets with potentially better results due to its generalization capabilities derived from training on the diverse ImageNet dataset. However, for simpler tasks like MNIST, its complexity may not be fully utilized.

Transfer Learning Benefits
Model B demonstrates the concept of transfer learning, where a pre-trained model on a large and diverse dataset can be adapted to a specific task with relatively less training data and time. However, its performance on MNIST shows that transfer learning isn't always superior, especially for simpler tasks where a specialized model can outperform a more general one.

Conclusion
Model A is more specialized and performs better on the MNIST dataset in terms of accuracy and loss, with significantly faster training times. It is less complex and computationally efficient for this specific task.
Model B leverages the power of transfer learning with the VGG16 model, showcasing its generalization capabilities but resulting in lower accuracy and higher computational costs for the MNIST dataset.
For the MNIST dataset, Model A is the clear winner due to its higher accuracy, lower loss, and faster training time. However, Model B illustrates the potential of transfer learning, which can be highly beneficial for more complex tasks and datasets where training a model from scratch would be more challenging.

Part 2

In [12]:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Load the airline tweets dataset
airline_tweets = pd.read_csv('Tweets.csv')

# Select relevant columns
tweets = airline_tweets['text'].values
labels = airline_tweets['airline_sentiment'].values

# Convert labels to binary positive: 1 negative: 0
labels = np.where(labels == 'positive', 1, 0)

# Preprocess the text data
tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(tweets)
sequences = tokenizer.texts_to_sequences(tweets)
padded_sequences = pad_sequences(sequences, maxlen=100)

In [13]:
from tensorflow.keras.datasets import imdb

# Load the IMDB dataset
num_words = 5000
max_len = 100
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=num_words)

# Pad the sequences
x_train = pad_sequences(x_train, maxlen=max_len)
x_test = pad_sequences(x_test, maxlen=max_len)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
[1m17464789/17464789[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 0us/step


In [14]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

def build_model():
    model = Sequential()
    model.add(Embedding(input_dim=5000, output_dim=128, input_length=100))
    model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

In [15]:
# Split the data into training and testing sets
from sklearn.model_selection import train_test_split

x_train_tweets, x_test_tweets, y_train_tweets, y_test_tweets = train_test_split(padded_sequences, labels, test_size=0.2, random_state=42)

# Build and train Model C
model_c = build_model()
model_c.fit(x_train_tweets, y_train_tweets, epochs=5, batch_size=64, validation_split=0.2)

Epoch 1/5




[1m147/147[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m13s[0m 78ms/step - accuracy: 0.8327 - loss: 0.4459 - val_accuracy: 0.8920 - val_loss: 0.2684
Epoch 2/5
[1m147/147[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m12s[0m 81ms/step - accuracy: 0.9218 - loss: 0.2058 - val_accuracy: 0.9091 - val_loss: 0.2372
Epoch 3/5
[1m147/147[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m13s[0m 86ms/step - accuracy: 0.9458 - loss: 0.1364 - val_accuracy: 0.9112 - val_loss: 0.2465
Epoch 4/5
[1m147/147[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m13s[0m 89ms/step - accuracy: 0.9609 - loss: 0.1046 - val_accuracy: 0.9117 - val_loss: 0.2858
Epoch 5/5
[1m147/147[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m13s[0m 90ms/step - accuracy: 0.9698 - loss: 0.0819 - val_accuracy: 0.9052 - val_loss: 0.3288


<keras.src.callbacks.history.History at 0x15d048da4d0>

In [16]:
# Build and train Model D
model_d = build_model()
model_d.fit(x_train, y_train, epochs=5, batch_size=64, validation_data=(x_test, y_test))

Epoch 1/5
[1m391/391[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m43s[0m 104ms/step - accuracy: 0.7064 - loss: 0.5460 - val_accuracy: 0.8303 - val_loss: 0.3886
Epoch 2/5
[1m391/391[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m47s[0m 121ms/step - accuracy: 0.8530 - loss: 0.3493 - val_accuracy: 0.8322 - val_loss: 0.3819
Epoch 3/5
[1m391/391[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m55s[0m 142ms/step - accuracy: 0.8746 - loss: 0.3100 - val_accuracy: 0.8471 - val_loss: 0.3553
Epoch 4/5
[1m391/391[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m55s[0m 142ms/step - accuracy: 0.8919 - loss: 0.2694 - val_accuracy: 0.8492 - val_loss: 0.3626
Epoch 5/5
[1m391/391[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m55s[0m 140ms/step - accuracy: 0.9062 - loss: 0.2410 - val_accuracy: 0.8410 - val_loss: 0.3895


<keras.src.callbacks.history.History at 0x15d05703290>

In [17]:
# Evaluate Model C
loss_c, accuracy_c = model_c.evaluate(x_test_tweets, y_test_tweets, verbose=2)
print(f"Model C Test Accuracy: {accuracy_c}")

# Evaluate Model D on the airline tweets dataset
loss_d, accuracy_d = model_d.evaluate(x_test_tweets, y_test_tweets, verbose=2)
print(f"Model D Test Accuracy on Airline Tweets: {accuracy_d}")

92/92 - 1s - 14ms/step - accuracy: 0.9088 - loss: 0.3098
Model C Test Accuracy: 0.9088114500045776
92/92 - 2s - 17ms/step - accuracy: 0.4737 - loss: 1.0728
Model D Test Accuracy on Airline Tweets: 0.47370219230651855


The results indicate the performance of both models Model C and Model D on the test subset of the airline tweets dataset. I have analyze the results:

Performance Metrics

Model C (Trained on Airline Tweets Dataset)
Test Accuracy: 90.88%
Test Loss: 0.3098

Model D (Trained on IMDB Dataset)
Test Accuracy on Airline Tweets Dataset: 47.37%
Test Loss on Airline Tweets Dataset: 1.0728

Interpretation
Model C Performance:
Model C achieved a high accuracy of 90.88% on the airline tweets test set. This indicates that the model is well-suited for the specific dataset it was trained on.
Low Loss: The low loss value of 0.3098 suggests that the model's predictions are close to the true labels, further supporting its strong performance.

Model D Performance:
Model D achieved a significantly lower accuracy of 47.37% when evaluated on the airline tweets test set. This indicates that the model trained on the IMDB dataset did not generalize well to the airline tweets dataset.
High Loss: The high loss value of 1.0728 indicates that the model's predictions are not close to the true labels, highlighting its poor performance on this dataset.

Key Takeaways
Dataset Specificity:
Model C: Since Model C was trained on the same type of data it was tested on airline tweets, it performs very well. This demonstrates the importance of training on a dataset that closely matches the target data.
Model D: Training on a different dataset IMDB movie reviews resulted in poor performance when tested on the airline tweets dataset. This suggests that the model did not transfer well between the different types of text data.

Transfer Learning Limitations:
The poor performance of Model D highlights the limitations of transfer learning when the source and target datasets are significantly different in terms of content and context. Although transfer learning can be powerful, its effectiveness depends on the similarity between the datasets.

Model Architecture:
Both models used the same architecture, emphasizing that the difference in performance is due to the data they were trained on rather than the model structure itself.

Conclusion
Model C is highly effective for sentiment analysis on the airline tweets dataset, achieving a high accuracy of 90.88%.
Model D, trained on the IMDB dataset, performed poorly on the airline tweets dataset, achieving an accuracy of only 47.37%.
These results underline the critical role of dataset relevance in training effective sentiment analysis models. For best results, models should be trained on data that closely matches the target application domain.