[Open in Colab](https://colab.research.google.com/github/IsraelVessel/AI-Assignment/blob/main/AIToolsAssignment.ipynb)  
# AI Tools Assignment — Mastering the AI Toolkit 🛠️🧠

This notebook covers theory, practical implementations, and ethics and optimization, plus a Colab helper for running the heavy task (MNIST training) on a GPU. Run the cells in order.

In [None]:
# Colab helper: mount Drive, choose the artifact directory, install dependencies
import os

try:
    from google.colab import drive
    drive.mount('/content/drive')
    base_out = '/content/drive/MyDrive/AI_Assignment_Artifacts'
    os.makedirs(base_out, exist_ok=True)
    print('Drive mounted. Artifacts will be saved to', base_out)
except Exception as e:
    # Outside Colab, or if mounting fails, fall back to /content.
    base_out = '/content'
    print('Drive not mounted; saving artifacts locally to', base_out, '| reason:', e)

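# Install packages (optional in Colab, where most of these are preinstalled).
# The pinned TensorFlow release installs only on Python versions it supports; relax the pin if pip finds no matching wheel.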
!pip install -q tensorflow==2.12.0 scikit-learn spacy matplotlib pillow
!python -m spacy download en_core_web_sm -q

print('Helper ready. Use the MNIST cell below to train the model.')

## Part 1 — Theory (short answers)

In [None]:
# Part 1: Theory answers (compact)

# Q1: TensorFlow vs PyTorch
# - TensorFlow: strong production tooling (TF Serving, TF Lite) and the Keras API for high-level models.
# - PyTorch: Pythonic, eager execution by default, popular in research.

# Q2: Jupyter use cases
# 1. Exploratory data analysis (EDA) and visualization.
# 2. Prototyping models and sharing reproducible experiments.

# Q3: spaCy vs string ops
# spaCy provides robust tokenization, POS tagging, dependency parsing, pretrained NER, and matchers,
# which makes it far more reliable than ad-hoc string rules.

print('Theory section: short answers are in the comments above.')
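
To make Q3 concrete, here is a minimal standard-library sketch of where an ad-hoc string rule falls short. It does not use spaCy; the sample sentence and the helper name `naive_tokens` are illustrative assumptions, not part of the assignment code.

```python
# Illustration only: whitespace splitting, the kind of ad-hoc rule Q3 warns about.
def naive_tokens(text):
    """Split on whitespace; punctuation stays glued to the neighbouring word."""
    return text.split()

sentence = "I love the Acme SmartWatch, but it isn't cheap."  # hypothetical sample
tokens = naive_tokens(sentence)
print(tokens)

# An exact-match lookup misses the product name because of the trailing comma.
print('SmartWatch' in tokens)   # False: the token is 'SmartWatch,'
print('SmartWatch,' in tokens)  # True
```

A trained tokenizer such as spaCy's typically separates the comma into its own token, so the product name can be matched without hand-written punctuation rules.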

## Part 2 — Practical: Iris (scikit-learn)

# Iris Decision Tree example
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

data = load_iris(as_frame=False)
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(classification_report(y_test, y_pred, target_names=data.target_names))
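
The `stratify=y` argument above asks for a split that keeps class proportions equal in the train and test sets. The sketch below shows that idea in plain Python. It is an illustration under simple assumptions, not scikit-learn's implementation, and `stratified_split` is a hypothetical helper name.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with every class sampled at the same rate."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label][:]
        rng.shuffle(idxs)                       # shuffle within the class
        n_test = round(len(idxs) * test_size)   # same fraction for each class
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx

# Iris has 150 samples: 50 for each of its 3 classes.
labels = [0] * 50 + [1] * 50 + [2] * 50
train_idx, test_idx = stratified_split(labels)
test_counts = Counter(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx))  # 120 30
print(dict(test_counts))              # 10 test samples per class
```

Without stratification, a random 30-sample test set could over- or under-represent a class, which makes the per-class scores in the classification report noisier.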

## Part 2 — Practical: MNIST CNN (run in Colab)

# MNIST CNN (Colab recommended)
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype('float32')/255.0
x_test = x_test.astype('float32')/255.0
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)

model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()

history = model.fit(x_train, y_train, epochs=8, batch_size=128, validation_split=0.1)
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print('Test accuracy:', test_acc)

# Save sample predictions and plots to base_out if available
try:
    out = base_out
except NameError:
    out = '/content'

plt.figure(figsize=(6,4))
plt.plot(history.history['accuracy'], label='train_acc')
plt.plot(history.history['val_accuracy'], label='val_acc')
plt.legend()
plt.savefig(str(out) + '/mnist_training.png')
plt.close()

import random
idx = random.sample(range(len(x_test)), 5)
preds = model.predict(x_test[idx])
fig, axs = plt.subplots(1,5, figsize=(12,3))
for i, j in enumerate(idx):
    axs[i].imshow(x_test[j].squeeze(), cmap='gray')
    axs[i].set_title(f'Pred: {np.argmax(preds[i])}\nTrue: {y_test[j]}')
    axs[i].axis('off')
fig.savefig(str(out) + '/mnist_samples.png')
plt.close(fig)  # release the figure instead of only clearing it
model.save(str(out) + '/mnist_cnn.h5')
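
The shapes and parameter counts that `model.summary()` reports for the CNN above can be checked by hand. The sketch below does the arithmetic in plain Python, assuming the Keras defaults used by that model (valid padding and stride 1 for the convolutions, pooling stride equal to the pool size). The helper names are hypothetical.

```python
def conv_out(size, kernel):
    """Output width/height of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output width/height of non-overlapping max pooling (floor division)."""
    return size // pool

def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

size = 28
size = conv_out(size, 3)   # 26 after Conv2D(32, 3x3)
size = pool_out(size, 2)   # 13 after MaxPooling2D(2x2)
size = conv_out(size, 3)   # 11 after Conv2D(64, 3x3)
size = pool_out(size, 2)   # 5 after MaxPooling2D(2x2)
flat = size * size * 64    # 1600 values after Flatten

total = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
         + dense_params(flat, 128) + dense_params(128, 10))
print(flat, total)  # 1600 225034
```

Most of the parameters sit in the first dense layer (1600 x 128 weights), so that layer is the first place to look when shrinking the model.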

## Part 2 — Practical: spaCy NER & simple sentiment

In [None]:
# spaCy NER demo
import spacy
from spacy.matcher import PhraseMatcher
from PIL import Image, ImageDraw, ImageFont

nlp = spacy.load('en_core_web_sm')
reviews = [
    'I love the Acme SmartWatch, the battery lasts all week and the strap is comfortable.',
    'The Zeta Vacuum is noisy and stopped working after two weeks. Terrible experience.',
    'Great headphones by SoundMax — amazing bass and clear mids.'
]
matcher = PhraseMatcher(nlp.vocab, attr='LOWER')
patterns = [nlp.make_doc('Acme SmartWatch'), nlp.make_doc('Zeta Vacuum'), nlp.make_doc('SoundMax')]
matcher.add('PRODUCT', patterns)
lines = []
for r in reviews:
    doc = nlp(r)
    ents = [(ent.text, ent.label_) for ent in doc.ents]
    matches = matcher(doc)
    found = [doc[start:end].text for _, start, end in matches]
    lines.append('Review: ' + r)
    lines.append('Entities: ' + str(ents))
    lines.append('Matched: ' + str(found))
    lines.append('')

# render text to PNG for report
txt = '\n'.join(lines)
W, H = (1200, 400)
img = Image.new('RGB', (W, H), color='white')
draw = ImageDraw.Draw(img)
try:
    font = ImageFont.truetype('DejaVuSansMono.ttf', 14)
except OSError:  # font file not found; fall back to PIL's built-in font
    font = ImageFont.load_default()
draw.multiline_text((10,10), txt, fill='black', font=font)
out_path = base_out if 'base_out' in globals() else '/content'
img.save(str(out_path) + '/spacy_ner.png')
print('Saved spaCy NER output to', str(out_path) + '/spacy_ner.png')
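
The heading for this part also mentions simple sentiment. One way to approach it is a rule-based scorer that counts lexicon hits. The sketch below is a minimal standard-library version. The word lists are hypothetical and hand-picked for the three sample reviews, the third review is lightly adapted from the cell above, and `rule_sentiment` is an assumed helper name rather than code from the repository.

```python
import re

# Hypothetical mini-lexicons chosen for the sample reviews; not a real resource.
POSITIVE = {'love', 'great', 'amazing', 'comfortable', 'clear'}
NEGATIVE = {'noisy', 'terrible', 'stopped'}

def rule_sentiment(text):
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return 'positive'
    if score < 0:
        return 'negative'
    return 'neutral'

reviews = [
    'I love the Acme SmartWatch, the battery lasts all week and the strap is comfortable.',
    'The Zeta Vacuum is noisy and stopped working after two weeks. Terrible experience.',
    'Great headphones by SoundMax, amazing bass and clear mids.',
]
labels = [rule_sentiment(r) for r in reviews]
for label, review in zip(labels, reviews):
    print(label, '|', review)
```

The scorer only sees individual words, so a sarcastic sentence such as "Great, it broke after a day." is labelled positive because "great" is the only lexicon hit.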

## Part 3 — Ethics & Optimization

### Biases: MNIST and Reviews
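- MNIST: the dataset covers a limited range of handwriting styles and capture conditions, so a model can underperform on digits written, scanned, or photographed differently from the training data.
- Reviews: rule-based sentiment misreads sarcasm and culture- or dialect-specific wording.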

### Mitigations
- Data augmentation, auditing datasets for diversity, and using human-in-the-loop evaluation for sensitive predictions.
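
One simple form of auditing is to break accuracy down by subgroup instead of reporting a single number. The sketch below assumes each sample carries a group label (for example a writer cohort). MNIST as loaded above has no such metadata, so the groups and the toy data here are hypothetical, as is the helper name `accuracy_by_group`.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} so gaps between subgroups become visible."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}

# Hypothetical toy data: digit labels, predictions, and a made-up writer group.
y_true = [3, 7, 1, 9, 4, 4, 2, 8]
y_pred = [3, 7, 1, 9, 9, 4, 7, 3]
groups = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']

report = accuracy_by_group(y_true, y_pred, groups)
print(report)  # {'A': 1.0, 'B': 0.25}
```

Overall accuracy here is 62.5%, which hides the fact that group B is served far worse than group A. A gap like this is a signal to collect or augment data for the weaker group.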

## How to run
- For best results, open this notebook in Colab using the link at the top, enable the GPU runtime, then run the Colab helper cell followed by the MNIST CNN cell.
- After the run, download the artifacts from Drive (`/MyDrive/AI_Assignment_Artifacts/`) for inclusion in the final PDF.

Final notes: This notebook is intentionally concise. For detailed code and alternative scripts, see the repository files: `mnist_cnn.py`, `iris_classification.py`, `spacy_ner_sentiment.py`, and `md_to_pdf.py`.


Note: Use Colab for GPU-heavy tasks. For local runs, install TensorFlow and confirm that your Python version is one that TensorFlow release supports.


Last updated: AI assistant (GitHub Copilot).