
TensorFlow vs PyTorch
TensorFlow:
- Developed by Google.
- Originally built around static graphs (TF 1.x); TF 2.x runs eagerly by default and uses `tf.function` to compile graphs when needed.
- Ideal for deployment using TensorFlow Serving, TensorFlow Lite, or TensorFlow.js.

PyTorch:
- Developed by Facebook (now Meta).
- Eager execution by default (more Pythonic).
- Widely used in academic research and for fast experimentation.

Choose TensorFlow when:
- Production deployment is the priority.

Choose PyTorch when:
- Flexibility and rapid prototyping are important.


Two use cases for Jupyter Notebooks
1. **Experiment tracking:** Step-by-step model training, testing, and documentation.
2. **Interactive visualization:** Plotting and exploring datasets visually using matplotlib or seaborn.


In [25]:
# Scikit-learn - Iris Dataset
# Import libraries
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score
import pandas as pd

In [26]:
# Load dataset
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = iris.target

In [27]:
# Handle missing values (none in this dataset, just in case)
df = df.dropna()

In [28]:
# Separate features and target (labels are already integer-encoded)
X = df.drop('species', axis=1)
y = df['species']

In [29]:
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [30]:
# Train Decision Tree model
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

In [31]:
# Predict
y_pred = clf.predict(X_test)

In [32]:
# Evaluation
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Precision (macro):", precision_score(y_test, y_pred, average='macro'))
print("Recall (macro):", recall_score(y_test, y_pred, average='macro'))

Accuracy: 1.0
Precision (macro): 1.0
Recall (macro): 1.0
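
Macro averaging computes precision and recall per class and then takes the unweighted mean, so every class counts equally regardless of its size. The sketch below reimplements that idea in plain Python on hypothetical toy labels (not the Iris predictions above); `macro_precision_recall` is an illustrative helper, not a scikit-learn function. With only 30 test samples in the split above, a perfect score is plausible but should not be over-interpreted.

```python
from collections import Counter

def macro_precision_recall(y_true, y_pred):
    """Return (macro precision, macro recall) for parallel label sequences."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[t] += 1  # true class t was missed
    precisions, recalls = [], []
    for c in classes:
        # Score 0.0 when a class was never predicted or never present
        precisions.append(tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0)
        recalls.append(tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0)
    return sum(precisions) / len(classes), sum(recalls) / len(classes)

# Hypothetical toy labels for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
precision, recall = macro_precision_recall(y_true, y_pred)
print(f"Precision (macro): {precision:.3f}")
print(f"Recall (macro): {recall:.3f}")
```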


TensorFlow: MNIST CNN Classifier

In [33]:

import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt


In [34]:
# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

In [35]:
# Normalize and reshape
x_train = x_train.reshape(-1, 28, 28, 1) / 255.0
x_test = x_test.reshape(-1, 28, 28, 1) / 255.0

In [36]:
# Build CNN model
model = models.Sequential([
    # An explicit Input layer avoids the Keras warning about passing input_shape to a layer
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])


In [37]:
# Compile model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
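
The compile step uses `sparse_categorical_crossentropy` because the MNIST labels are integer class indices. As a rough illustration, the sketch below computes the per-sample loss both ways in plain Python and shows that the sparse and one-hot forms agree when they describe the same label. The two functions are simplified stand-ins written for this example (Keras also clips probabilities and averages over the batch), and the probability vector is hypothetical.

```python
import math

def sparse_categorical_crossentropy(label, probs):
    """Per-sample loss when the label is an integer class index."""
    return -math.log(probs[label])

def categorical_crossentropy(one_hot, probs):
    """Per-sample loss when the label is a one-hot vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

# Hypothetical softmax output over 3 classes (sums to 1)
probs = [0.1, 0.7, 0.2]
label = 1            # integer label, the format of MNIST's y_train
one_hot = [0, 1, 0]  # the same label, one-hot encoded

sparse_loss = sparse_categorical_crossentropy(label, probs)
dense_loss = categorical_crossentropy(one_hot, probs)
print(f"sparse: {sparse_loss:.4f}  one-hot: {dense_loss:.4f}")
```

Passing one-hot labels to the sparse loss (or integers to the dense one) is the mismatch to watch for; the values only agree when the label format matches the loss.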

In [None]:
# Train model
model.fit(x_train, y_train, epochs=5, validation_split=0.1)

Epoch 1/5
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 38s 22ms/step - accuracy: 0.9043 - loss: 0.3220 - val_accuracy: 0.9837 - val_loss: 0.0560
Epoch 2/5
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 29s 17ms/step - accuracy: 0.9837 - loss: 0.0512 - val_accuracy: 0.9887 - val_loss: 0.0353
Epoch 3/5
 556/1688 ━━━━━━━━━━━━━━━━━━━━ 18s 16ms/step - accuracy: 0.9893 - loss: 0.0319

In [None]:
# Evaluate
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)

In [None]:
# Visualize predictions
import numpy as np

In [None]:
predictions = model.predict(x_test[:5])
for i in range(5):
    plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
    plt.title(f"Prediction: {np.argmax(predictions[i])} | True: {y_test[i]}")
    plt.axis('off')
    plt.show()

spaCy: NER and Sentiment on Amazon Reviews

In [None]:
import spacy
from spacy import displacy
from textblob import TextBlob

In [None]:
# Load spaCy English model
nlp = spacy.load("en_core_web_sm")

In [None]:
# Sample reviews
reviews = [
    "I love the Apple iPhone 13. The camera is amazing!",
    "Samsung Galaxy phones have too much bloatware.",
    "This Sony headphone is very comfortable and the sound is great!"
]

for review in reviews:
    print("\nReview:", review)
    # Named Entity Recognition
    doc = nlp(review)
    for ent in doc.ents:
        print(f"Entity: {ent.text} - Label: {ent.label_}")
    # Sentiment Analysis using TextBlob
    sentiment = TextBlob(review).sentiment.polarity
    sentiment_label = "Positive" if sentiment > 0 else "Negative" if sentiment < 0 else "Neutral"
    print(f"Sentiment: {sentiment_label} (score: {sentiment:.2f})")
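
TextBlob polarity scores fall in the range [-1.0, 1.0], and the cell above treats anything other than exactly 0 as positive or negative. One possible refinement, sketched below, is a small neutral band around zero so that near-zero scores are not over-read. `label_polarity` and its `margin` parameter are hypothetical names for this example, and the scores are made up rather than actual TextBlob output.

```python
def label_polarity(score, margin=0.05):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label.

    `margin` is a hypothetical neutral band; margin=0 reproduces the
    strict greater-than / less-than zero rule.
    """
    if score > margin:
        return "Positive"
    if score < -margin:
        return "Negative"
    return "Neutral"

# Made-up polarity scores for illustration only
for score in (0.62, 0.03, -0.40):
    print(f"{score:+.2f} -> {label_polarity(score)}")
```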

Ethics & Debugging

**Bias in MNIST:**
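- Digit classes are only roughly balanced (about 5,400 to 6,700 training images per digit), so accuracy should be checked per class.
- Handwriting styles vary by gender, age, and nationality, so a model may perform worse on styles that are underrepresented in the training data.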
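
A quick way to check for class imbalance is to count the labels and compare the largest class to the smallest. The sketch below does this in plain Python; `class_balance` is a hypothetical helper and the labels are toy data. In the notebook, `y_train.tolist()` could be passed instead.

```python
from collections import Counter

def class_balance(labels):
    """Return per-class counts and the largest-to-smallest class ratio."""
    counts = Counter(labels)
    ratio = max(counts.values()) / min(counts.values())
    return dict(sorted(counts.items())), ratio

# Toy labels for illustration only
labels = [0, 1, 1, 2, 2, 2, 1, 0, 1]
counts, ratio = class_balance(labels)
print(counts)
print(f"largest/smallest class ratio: {ratio:.2f}")
```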

**Bias in Amazon Reviews:**
- Prejudiced language or overrepresented product types can skew sentiment.

**Mitigation Tools:**
- TensorFlow Fairness Indicators: Identify model performance gaps across demographic slices.
- spaCy rule-based matching (for example `Matcher` or `EntityRuler`): Flag or mask sensitive terms, such as race or gender words, before sentiment scoring.
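
The idea behind slice-based evaluation can be illustrated without any library: compute accuracy separately for each group and look at the gap. The sketch below is a plain-Python stand-in, not the Fairness Indicators API; `accuracy_by_slice`, the labels, and the product-category slices are all hypothetical.

```python
from collections import defaultdict

def accuracy_by_slice(y_true, y_pred, slices):
    """Return {slice_name: accuracy} for parallel label and slice sequences."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, s in zip(y_true, y_pred, slices):
        total[s] += 1
        correct[s] += int(t == p)
    return {s: correct[s] / total[s] for s in total}

# Made-up sentiment labels tagged with a made-up product category
y_true = ["pos", "neg", "pos", "neg", "pos", "pos"]
y_pred = ["pos", "neg", "neg", "neg", "pos", "neg"]
slices = ["phones", "phones", "audio", "phones", "audio", "audio"]

per_slice = accuracy_by_slice(y_true, y_pred, slices)
gap = max(per_slice.values()) - min(per_slice.values())
print(per_slice, f"gap={gap:.2f}")
```

A large gap between slices is a signal to inspect the underperforming group's data, not proof of bias by itself.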


Troubleshooting Challenge

Common TensorFlow bugs:
- Input shape mismatch: Use `model.summary()` to verify shapes.
- Wrong loss function: Use `categorical_crossentropy` for one-hot labels, `sparse_categorical_crossentropy` for integers.
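- Learning rate too high/low: Pass an explicit optimizer, for example `tf.keras.optimizers.Adam(learning_rate=1e-3)`, and adjust the rate if the loss diverges or plateaus.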
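
For shape mismatches, it can help to work out by hand what `model.summary()` should report. The sketch below traces the spatial size through the MNIST CNN defined earlier, assuming the Keras defaults of 'valid' padding, stride 1 for convolutions, and a pooling stride equal to the pool size. The helper names are hypothetical.

```python
def conv2d_valid(size, kernel, stride=1):
    """Output height/width of a convolution with 'valid' padding."""
    return (size - kernel) // stride + 1

def max_pool_valid(size, pool):
    """Output height/width of max pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

# (kind, kernel or pool size, filters) for the conv/pool stack of the CNN
layer_spec = [("conv", 3, 32), ("pool", 2, None), ("conv", 3, 64), ("pool", 2, None)]

size, channels = 28, 1  # MNIST input: 28 x 28 x 1
for kind, k, filters in layer_spec:
    if kind == "conv":
        size, channels = conv2d_valid(size, k), filters
    else:
        size = max_pool_valid(size, k)
    print(f"{kind}: ({size}, {size}, {channels})")

flat_units = size * size * channels  # input width of the first Dense layer
print("Flatten:", flat_units)
```

If the numbers from `model.summary()` differ from this trace, the input shape or a layer argument is probably not what was intended.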
