# AI Tools Assignment: Mastering the AI Toolkit 🛠️🧠

This notebook covers theoretical and practical tasks on AI tools, frameworks, and ethical considerations.

## Part 1: Theoretical Understanding

### Q1: Explain the primary differences between TensorFlow and PyTorch. When would you choose one over the other?

**Answer:**
- **TensorFlow** executes eagerly by default since 2.x and can compile Python functions into static graphs with `tf.function` (TensorFlow 1.x was graph-first). It is widely used in production and has strong deployment support (e.g., TensorFlow Lite, TensorFlow Serving).
- **PyTorch** uses dynamic computation graphs, is more Pythonic, and is popular in research for its flexibility and ease of debugging.
- **Choose TensorFlow** for production-ready, scalable solutions. **Choose PyTorch** for rapid prototyping, research, and when dynamic graph construction is needed.
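
To make the graph distinction concrete, here is a toy sketch in plain Python. It is not TensorFlow or PyTorch code, and `Node`, `run`, and `eager_forward` are hypothetical names invented for illustration. The deferred style records operations in a graph object and evaluates it later; the eager style computes each value as the line executes, so ordinary control flow can depend on intermediate results.

```python
# Toy illustration only: neither TensorFlow nor PyTorch works exactly like this.

class Node:
    """A deferred operation: records what to compute, not the result."""
    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs


def run(node, feed):
    """Evaluate a deferred graph once concrete input values are supplied."""
    if isinstance(node, str):          # a named placeholder
        return feed[node]
    args = [run(i, feed) for i in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError(f"unknown op: {node.op}")


# Define-then-run: build the graph first, then execute it with data.
graph = Node("add", Node("mul", "x", "w"), "b")
deferred_result = run(graph, {"x": 3.0, "w": 2.0, "b": 1.0})


# Define-by-run: values exist as soon as each line runs,
# so a branch can depend on an intermediate result.
def eager_forward(x, w, b):
    h = x * w
    if h > 5.0:                        # data-dependent branch
        h = h + b
    return h


eager_result = eager_forward(3.0, 2.0, 1.0)
print(deferred_result, eager_result)
```

Both styles compute `x * w + b` for these inputs; the difference is when the computation happens and how easily a debugger can inspect the intermediate value `h`.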

### Q2: Describe two use cases for Jupyter Notebooks in AI development.

**Answer:**
1. **Interactive Experimentation:** Test and visualize data preprocessing, model training, and evaluation in real time.
2. **Documentation & Sharing:** Combine code, results, and explanations for reproducible research and team collaboration.

### Q3: How does spaCy enhance NLP tasks compared to basic Python string operations?

**Answer:**
- spaCy provides advanced NLP features like tokenization, part-of-speech tagging, named entity recognition, and dependency parsing.
- It handles linguistic nuances and context, whereas basic string operations only perform simple pattern matching or splitting.

### Comparative Analysis: Scikit-learn vs. TensorFlow

| Feature | Scikit-learn | TensorFlow |
|---------|--------------|------------|
| Target Applications | Classical ML (e.g., regression, SVM) | Deep Learning (e.g., neural networks) |
| Ease of Use | Beginner-friendly, simple API | Steeper learning curve, more flexible |
| Community Support | Large, mature community | Large, active community, strong industry adoption |

## Part 2: Practical Implementation

### Task 1: Classical ML with Scikit-learn (Iris Dataset)

In [None]:
# Load and preprocess the Iris dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
import pandas as pd

iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target)

# Impute missing values with column means (Iris has none, so this is only a safeguard)
X = X.fillna(X.mean())

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Decision Tree
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Predict and evaluate
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='macro')
recall = recall_score(y_test, y_pred, average='macro')

print(f'Accuracy: {accuracy:.2f}')
print(f'Precision: {precision:.2f}')
print(f'Recall: {recall:.2f}')
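
The `average='macro'` argument used above computes the metric separately for each class and then takes an unweighted mean. The sketch below reproduces that arithmetic in plain Python on a small made-up label set (these are not Iris results). `macro_precision_recall` is a hypothetical helper, not a scikit-learn function, and scoring an empty class as 0 is an assumption of this sketch.

```python
def macro_precision_recall(y_true, y_pred):
    """Per-class precision and recall, averaged with equal weight per class."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # Assumption of this sketch: a class with no predictions (or no samples) scores 0.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(classes), sum(recalls) / len(classes)


# Made-up labels for three classes (not taken from the Iris run above).
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 0, 1, 2, 2, 2]
macro_p, macro_r = macro_precision_recall(toy_true, toy_pred)
print(f"Macro precision: {macro_p:.3f}, macro recall: {macro_r:.3f}")
```

Because every class counts equally, a single misclassified sample in a small class moves the macro score more than it moves overall accuracy.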

### Task 2: Deep Learning with TensorFlow (MNIST)

In [None]:
# Build and train a CNN on MNIST using TensorFlow
import tensorflow as tf
from tensorflow.keras import layers, models

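# Load MNIST and scale pixel values from [0, 255] to [0, 1]
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
# Add a trailing channel axis: (N, 28, 28) -> (N, 28, 28, 1)
x_train = x_train[..., None]
x_test = x_test[..., None]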

model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_split=0.1)

test_loss, test_acc = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_acc:.2f}')
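
The layer sizes in this model follow from simple arithmetic: a 3x3 convolution with Keras's default `'valid'` padding and stride 1 shrinks each spatial side by 2, and a 2x2 max pool halves it (rounding down). The sketch below traces those shapes and the resulting parameter counts in plain Python. The helper names are hypothetical, and `model.summary()` remains the authoritative check.

```python
def conv_out(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units


side = 28
side = pool_out(conv_out(side))      # Conv2D(32) -> 26, MaxPooling2D -> 13
side = pool_out(conv_out(side))      # Conv2D(64) -> 11, MaxPooling2D -> 5
flat_units = side * side * 64        # Flatten -> 5 * 5 * 64 = 1600

total_params = (
    conv_params(3, 1, 32)            # 320
    + conv_params(3, 32, 64)         # 18,496
    + dense_params(flat_units, 64)   # 102,464
    + dense_params(64, 10)           # 650
)
print(f"Flattened units: {flat_units}, trainable parameters: {total_params}")
```

Most of the parameters sit in the first dense layer, which is typical when a `Flatten` feeds directly into a fully connected layer.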

In [None]:
# Visualize predictions on 5 sample images
import matplotlib.pyplot as plt
import numpy as np

sample_idx = np.random.choice(len(x_test), 5, replace=False)
sample_images = x_test[sample_idx]
sample_labels = y_test[sample_idx]
preds = model.predict(sample_images)

for i in range(5):
    plt.imshow(sample_images[i].reshape(28,28), cmap='gray')
    plt.title(f'True: {sample_labels[i]}, Pred: {np.argmax(preds[i])}')
    plt.axis('off')
    plt.show()

### Task 3: NLP with spaCy (Amazon Reviews)

In [None]:
# Named Entity Recognition with spaCy, plus a simple keyword-based sentiment rule
import spacy
nlp = spacy.load('en_core_web_sm')

reviews = [
    "I love my new Apple iPhone! The camera is amazing.",
    "The Samsung headphones broke after a week. Very disappointed."
]

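positive_words = {'love', 'amazing', 'great'}

for review in reviews:
    doc = nlp(review)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    # Match whole tokens and lemmas so that e.g. 'glove' does not count as 'love';
    # any review without a positive keyword falls back to 'negative'
    words = {token.lower_ for token in doc} | {token.lemma_.lower() for token in doc}
    sentiment = 'positive' if words & positive_words else 'negative'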
    print(f'Review: {review}')
    print(f'Entities: {entities}')
    print(f'Sentiment: {sentiment}\n')
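
The keyword rule above labels every review without a positive word as negative. As a hedged sketch of one way to tighten it, the plain-Python scorer below adds a negative lexicon, a neutral fallback, and a one-word negation flip. The lexicons and the `rule_sentiment` helper are hypothetical and far too small for real use; the sketch does not use spaCy.

```python
import re

# Hypothetical mini-lexicons chosen for illustration; a real one would be far larger.
POSITIVE = {"love", "amazing", "great"}
NEGATIVE = {"broke", "disappointed", "terrible"}
NEGATORS = {"not", "never", "no"}


def rule_sentiment(text):
    """Score = positive hits minus negative hits; a negator flips the next word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)   # +1, -1, or 0
        score += -polarity if negate else polarity
        negate = False                                     # negation covers one word only
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for text in [
    "I love my new Apple iPhone! The camera is amazing.",
    "The Samsung headphones broke after a week. Very disappointed.",
    "Not great.",
    "It arrived on Tuesday.",
]:
    print(f"{rule_sentiment(text):8s} | {text}")
```

Even with these additions, the rule still misses sarcasm and longer-range negation, which is why it is best treated as a baseline.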

## Part 3: Ethics & Optimization

### Ethical Considerations

- **Potential Biases:**
  - MNIST: The digits come from a limited pool of writers, so the model may perform worse on handwriting styles that are underrepresented in the training data.
  - Amazon Reviews: Keyword-based sentiment rules miss sarcasm, negation, and cultural context.
- **Mitigation:**
  - Use tools like TensorFlow Fairness Indicators to compare model metrics across slices of the data.
  - Use spaCy's rule-based components (e.g., `Matcher`, `EntityRuler`) to refine entity and sentiment extraction.
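
Fairness evaluation of the kind mentioned above usually starts by comparing a metric across slices of the data. The sketch below does that by hand in plain Python. It is not the Fairness Indicators API, and the group names and records are made up for illustration.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {g: correct[g] / total[g] for g in total}


# Made-up evaluation records: (writer group, true digit, predicted digit).
records = [
    ("group_a", 3, 3), ("group_a", 7, 7), ("group_a", 1, 1), ("group_a", 4, 4),
    ("group_b", 3, 8), ("group_b", 7, 7), ("group_b", 1, 7), ("group_b", 4, 4),
]
per_group = accuracy_by_group(records)
gap = max(per_group.values()) - min(per_group.values())
print(per_group, f"accuracy gap: {gap:.2f}")
```

A large gap between slices is a signal to collect more data for the weaker group or to inspect its errors, even when overall accuracy looks fine.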

### Troubleshooting Challenge

Below is the corrected version of a buggy TensorFlow script. Typical errors to look for in such scripts are dimension mismatches and incorrect loss functions:

In [None]:
# Fixed TensorFlow script for classification
import tensorflow as tf
from tensorflow.keras import layers, models

# Dummy data
X = tf.random.normal((100, 20))
y = tf.random.uniform((100,), maxval=2, dtype=tf.int32)

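model = models.Sequential([
    layers.Dense(32, activation='relu', input_shape=(20,)),
    # One sigmoid unit outputs P(class 1), matching the 0/1 labels in y
    layers.Dense(1, activation='sigmoid')
])

# binary_crossentropy pairs with a single sigmoid output
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])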
model.fit(X, y, epochs=3)
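
The pairing used in the script, one sigmoid output with `binary_crossentropy`, expects one probability per sample and a 0/1 label. The sketch below computes that loss by hand in plain Python to show what the model is minimizing. The probabilities are made up, and `binary_cross_entropy` is a hypothetical helper rather than the Keras implementation.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    if len(y_true) != len(y_prob):
        raise ValueError("labels and predictions must have the same length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.8, 0.2]   # made-up probabilities of class 1
confident_wrong = [0.1, 0.9, 0.2, 0.8]
loss_right = binary_cross_entropy(labels, confident_right)
loss_wrong = binary_cross_entropy(labels, confident_wrong)
print(f"right: {loss_right:.3f}, wrong: {loss_wrong:.3f}")
```

The length check mirrors the kind of dimension mismatch the challenge refers to: labels and predictions must line up one to one before the loss is meaningful.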

## Bonus: Model Deployment

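Deploy your MNIST classifier using Streamlit or Flask. The Streamlit example below assumes the trained model was saved beforehand as `mnist_cnn.h5` (e.g., with `model.save('mnist_cnn.h5')`):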

```python
import streamlit as st
import numpy as np
from tensorflow.keras.models import load_model

model = load_model('mnist_cnn.h5')
uploaded_file = st.file_uploader('Upload an image', type=['png', 'jpg'])
if uploaded_file:
    # Preprocess and predict
    st.image(uploaded_file)
    # ...
```
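
Whatever the elided preprocessing step does, it has to produce the same input format the CNN was trained on. As one possible sketch, assuming the upload has already been decoded and resized to a 28x28 grayscale array (that decoding step is not shown), a hypothetical helper `to_mnist_batch` could scale and reshape it as follows. MNIST digits are white on black, so whether user uploads need inverting is a further assumption to check.

```python
import numpy as np


def to_mnist_batch(gray_image):
    """Turn a 28x28 grayscale array (0-255) into a (1, 28, 28, 1) float32 batch."""
    arr = np.asarray(gray_image, dtype=np.float32)
    if arr.shape != (28, 28):
        raise ValueError(f"expected a 28x28 image, got shape {arr.shape}")
    arr = arr / 255.0                  # same scaling as the training data
    return arr[None, ..., None]        # add batch and channel axes


# Stand-in for a decoded upload: a blank image with one bright pixel.
dummy = np.zeros((28, 28), dtype=np.uint8)
dummy[14, 14] = 255
batch = to_mnist_batch(dummy)
print(batch.shape, batch.dtype, batch.max())
```

The resulting `batch` has the shape that `model.predict` expects for this CNN, and `np.argmax` over the returned probabilities gives the predicted digit.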

Include a screenshot and live demo link in your report.