### Question 1
**What is Deep Learning? Briefly describe how it evolved and how it differs from traditional machine learning.**

**Answer:**

Deep Learning is a subset of Machine Learning that uses multi-layered neural networks to automatically learn complex data representations. It evolved from early perceptron models in the 1950s, became practical to train once backpropagation was popularized in the 1980s, and has grown rapidly since around 2010 due to advances in computing power (GPUs), the availability of large datasets, and improved algorithms. Unlike traditional ML, which typically relies on handcrafted features, Deep Learning learns hierarchical features directly from raw data, which has led to strong performance in vision, speech, and NLP tasks.

### Question 2
**Explain the basic architecture and functioning of a Perceptron. What are its limitations?**

**Answer:**

A Perceptron consists of input nodes, associated weights, a bias term, a summation function, and an activation function (classically a step function). Inputs are multiplied by their respective weights, summed together with the bias, and passed through the activation function to produce a binary output. Its main limitation is that it can only learn linear decision boundaries, so it cannot solve problems that are not linearly separable (like XOR). It is also sensitive to input scaling.
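The XOR limitation can be checked directly. The sketch below is a minimal illustration, assuming a step activation and the classic error-driven update rule; the helper names `train_perceptron` and `accuracy` are made up for this example. It trains the same perceptron on AND (linearly separable) and on XOR (not linearly separable).

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=100):
    """Train a single perceptron with the classic error-driven update rule."""
    weights = np.zeros(X.shape[1])
    bias = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if np.dot(xi, weights) + bias >= 0 else 0  # step activation
            update = lr * (target - pred)                       # 0 when correct
            weights += update * xi
            bias += update
    return weights, bias

def accuracy(X, y, weights, bias):
    """Fraction of samples the trained perceptron classifies correctly."""
    preds = (X @ weights + bias >= 0).astype(int)
    return float(np.mean(preds == y))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])  # linearly separable
y_xor = np.array([0, 1, 1, 0])  # not linearly separable

acc_and = accuracy(X, y_and, *train_perceptron(X, y_and))
acc_xor = accuracy(X, y_xor, *train_perceptron(X, y_xor))
print("AND accuracy:", acc_and)
print("XOR accuracy:", acc_xor)
```

No single straight line separates the XOR classes, so a single perceptron can classify at most three of the four XOR points correctly, however long it trains.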

### Question 3
**Describe the purpose of activation functions in neural networks. Compare Sigmoid, ReLU, and Tanh functions.**

**Answer:**

Activation functions introduce non-linearity in neural networks, enabling them to learn complex mappings.
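- **Sigmoid:** Outputs in range (0,1), good for probabilities, but saturates for large positive or negative inputs, which causes vanishing gradients.
- **Tanh:** Outputs in range (-1,1), zero-centered, usually better than Sigmoid for hidden layers but still prone to vanishing gradients when it saturates.
- **ReLU:** Outputs max(0, x), so positive inputs pass through unchanged and negative inputs become 0. It is efficient and widely used, but suffers from the 'dying ReLU' problem, where a neuron that only receives negative inputs gets zero gradient and stops learning.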
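The vanishing-gradient and dying-ReLU remarks above are statements about derivatives, so a short numerical sketch may help. It evaluates the analytic derivative of each function at a few points; the helper names are chosen for this example, and the ReLU derivative at exactly 0 is taken as 0 by convention.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # peaks at 0.25 when z = 0

def tanh_grad(z):
    return 1.0 - np.tanh(z) ** 2  # peaks at 1.0 when z = 0

def relu_grad(z):
    # Convention: treat the derivative at z == 0 as 0
    return (z > 0).astype(float)

z = np.array([-5.0, 0.0, 5.0])
print("sigmoid'(z):", sigmoid_grad(z))
print("tanh'(z):   ", tanh_grad(z))
print("relu'(z):   ", relu_grad(z))
```

Sigmoid and Tanh gradients shrink toward 0 at both extremes, while the ReLU gradient is exactly 0 for negative inputs and 1 for positive ones.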

### Question 4
**What is the difference between Loss function and Cost function in neural networks? Provide examples.**

**Answer:**

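- **Loss function** measures the error for a single training example (e.g., the squared error or Binary Cross-Entropy of one sample).
- **Cost function** aggregates the per-example losses over the entire training dataset (or a mini-batch), usually as their average (e.g., Mean Squared Error), sometimes with a regularization term added.

For example, Binary Cross-Entropy computed for one sample is a loss, while the mean of all individual losses is the cost that training minimizes. In practice the two terms are often used interchangeably.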
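As a small illustration of the distinction, the sketch below computes Binary Cross-Entropy per sample and then averages it into a single cost. The labels and predicted probabilities are made-up values, and `bce_per_sample` is a helper written for this example.

```python
import numpy as np

def bce_per_sample(y_true, y_prob, eps=1e-12):
    """Binary Cross-Entropy loss for each sample separately."""
    y_prob = np.clip(y_prob, eps, 1 - eps)  # avoid log(0)
    return -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.6, 0.7])  # illustrative predicted probabilities

losses = bce_per_sample(y_true, y_prob)  # one loss value per example
cost = losses.mean()                     # single scalar for the whole set

print("Per-sample losses:", losses)
print("Cost (mean loss): ", cost)
```

The last sample is a confident wrong prediction (0.7 for a true label of 0), so its loss is the largest of the four.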

### Question 5
**What is the role of optimizers in neural networks? Compare Gradient Descent, Adam, and RMSprop.**

**Answer:**

Optimizers adjust weights to minimize the loss function.
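- **Gradient Descent:** Moves the weights a fixed step (the learning rate) in the direction opposite to the gradient of the loss; simple, but can be slow and is sensitive to the choice of learning rate.
- **RMSprop:** Adapts the step size per parameter by dividing each gradient by a running root-mean-square of recent gradients; good for non-stationary problems.
- **Adam:** Combines momentum (a running average of gradients) with RMSprop-style scaling and adds bias correction; efficient and widely used for training deep networks.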
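The three update rules can be compared on a toy problem. The following is a minimal sketch, assuming a one-dimensional loss f(w) = (w - 3)^2 and commonly used default hyperparameters; it is not how a library implements these optimizers internally.

```python
import numpy as np

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, which is minimized at w = 3."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, lr=0.01, steps=2000):
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def rmsprop(w, lr=0.01, beta=0.9, eps=1e-8, steps=2000):
    v = 0.0  # running average of squared gradients
    for _ in range(steps):
        g = grad(w)
        v = beta * v + (1 - beta) * g ** 2
        w -= lr * g / (np.sqrt(v) + eps)
    return w

def adam(w, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    m, v = 0.0, 0.0  # running averages of gradients and squared gradients
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

results = {
    "Gradient Descent": gradient_descent(0.0),
    "RMSprop": rmsprop(0.0),
    "Adam": adam(0.0),
}
for name, w_final in results.items():
    print(f"{name}: w = {w_final:.4f}")
```

All three start from w = 0 and should end close to the minimum at w = 3. The adaptive methods differ mainly in how the effective step size changes as the gradient shrinks.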

### Question 6
**Write a Python program to implement a single-layer perceptron from scratch using NumPy to solve the logical AND gate.**

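```python
import numpy as np

# AND gate truth table: inputs and target outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

# Initialize weights, bias, and learning rate
weights = np.zeros(2)
bias = 0.0
lr = 0.1

# Training: perceptron learning rule, one sample at a time
for epoch in range(20):
    for i in range(len(X)):
        linear = np.dot(X[i], weights) + bias
        y_pred = 1 if linear >= 0 else 0   # step activation
        error = y[i] - y_pred              # 0 when the prediction is correct
        weights += lr * error * X[i]
        bias += lr * error

print("Trained weights:", weights)
print("Trained bias:", bias)

# Check the learned gate against the truth table
predictions = (X @ weights + bias >= 0).astype(int)
print("Predictions:", predictions, "Targets:", y)
```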

### Question 7
**Implement and visualize Sigmoid, ReLU, and Tanh activation functions using Matplotlib.**

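```python
import numpy as np
import matplotlib.pyplot as plt

# A narrower input range keeps ReLU from dwarfing the bounded functions
x = np.linspace(-5, 5, 200)

sigmoid = 1 / (1 + np.exp(-x))
relu = np.maximum(0, x)
tanh = np.tanh(x)

plt.plot(x, sigmoid, label='Sigmoid')
plt.plot(x, relu, label='ReLU')
plt.plot(x, tanh, label='Tanh')
plt.xlabel('x')
plt.ylabel('Activation output')
plt.title('Sigmoid, ReLU, and Tanh activation functions')
plt.grid(True)
plt.legend()
plt.show()
```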

### Question 8
**Use Keras to build and train a simple multilayer neural network on the MNIST digits dataset. Print the training accuracy.**

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Load dataset and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build model: flatten each 28x28 image, one hidden layer, softmax over 10 digits
model = models.Sequential([
    layers.Input(shape=(28, 28)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Integer labels (0-9) pair with the sparse categorical cross-entropy loss
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

# Accuracy recorded for the final training epoch
print("Training Accuracy:", history.history['accuracy'][-1])
```

### Question 9
**Visualize the loss and accuracy curves for a neural network model trained on the Fashion MNIST dataset. Interpret the training behavior.**

```python
import tensorflow as tf
import matplotlib.pyplot as plt

# Load dataset and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Model
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train, epochs=5, validation_split=0.2)

# Plot loss and accuracy on separate axes, since they are on different scales
fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(12, 4))

ax_loss.plot(history.history['loss'], label='Training Loss')
ax_loss.plot(history.history['val_loss'], label='Validation Loss')
ax_loss.set_xlabel('Epoch')
ax_loss.set_ylabel('Loss')
ax_loss.legend()

ax_acc.plot(history.history['accuracy'], label='Training Accuracy')
ax_acc.plot(history.history['val_accuracy'], label='Validation Accuracy')
ax_acc.set_xlabel('Epoch')
ax_acc.set_ylabel('Accuracy')
ax_acc.legend()

plt.tight_layout()
plt.show()
```

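**Interpretation:** If the training and validation curves stay close together, the model generalizes well. A widening gap, with training loss still falling while validation loss flattens or rises, indicates overfitting. Low accuracy on both the training and validation sets indicates underfitting.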
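The same reading of the curves can be expressed as a small check on the final epoch. This is only a sketch: `summarize_fit` is a hypothetical helper, the thresholds `gap_tol` and `low_acc` are arbitrary illustrative choices, and the example dictionaries contain made-up numbers rather than results from any training run. With a real run, `history.history` can be passed in directly.

```python
def summarize_fit(history_dict, gap_tol=0.05, low_acc=0.70):
    """Label the final epoch using a train/validation accuracy comparison.

    gap_tol and low_acc are illustrative thresholds, not standard values.
    """
    train_acc = history_dict['accuracy'][-1]
    val_acc = history_dict['val_accuracy'][-1]
    gap = train_acc - val_acc
    if train_acc < low_acc and val_acc < low_acc:
        verdict = 'underfitting'
    elif gap > gap_tol:
        verdict = 'overfitting'
    else:
        verdict = 'reasonable fit'
    return {'train_acc': train_acc, 'val_acc': val_acc, 'gap': gap, 'verdict': verdict}

# Made-up histories, used only to exercise the helper
example_close = {'accuracy': [0.80, 0.88], 'val_accuracy': [0.79, 0.87]}
example_gap = {'accuracy': [0.85, 0.99], 'val_accuracy': [0.84, 0.86]}
example_low = {'accuracy': [0.40, 0.55], 'val_accuracy': [0.38, 0.52]}

for name, hist in [('close', example_close), ('gap', example_gap), ('low', example_low)]:
    print(name, '->', summarize_fit(hist)['verdict'])
```

A single-epoch check like this is cruder than looking at the full curves, which also show whether the gap is still growing.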

### Question 10
**Fraudulent transaction detection workflow.**

**Answer:**

- **Model Design:** A multilayer neural network with dense layers.
- **Activation & Loss:** ReLU for hidden layers, Sigmoid for output; Binary Cross-Entropy loss since it's a binary classification task.
- **Training & Evaluation:** Use stratified sampling, class weighting, or SMOTE for imbalance; evaluate using Precision, Recall, F1-score, and AUC.
- **Optimizer & Regularization:** Adam optimizer for fast convergence; apply dropout, early stopping, and L2 regularization to prevent overfitting.
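To see why the evaluation bullet prefers Precision, Recall, F1-score, and AUC over plain accuracy, consider a hand-built toy example. The labels and scores below are invented for illustration (5 fraud cases out of 100), and the 0.5 decision threshold is an assumption; the metric functions are from `sklearn.metrics`.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Toy labels: 95 legitimate (0) and 5 fraudulent (1) transactions
y_true = np.array([0] * 95 + [1] * 5)

# Baseline that labels every transaction as legitimate
y_naive = np.zeros_like(y_true)

# Invented model scores: two false alarms and one missed fraud at threshold 0.5
y_prob = np.array([0.1] * 93 + [0.7] * 2 + [0.9] * 4 + [0.3])
y_pred = (y_prob >= 0.5).astype(int)

naive_accuracy = accuracy_score(y_true, y_naive)
naive_recall = recall_score(y_true, y_naive)

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_prob)

print("Naive baseline: accuracy =", naive_accuracy, "recall =", naive_recall)
print("Model: precision =", precision, "recall =", recall, "F1 =", f1, "AUC =", auc)
```

The baseline reaches 95% accuracy while catching no fraud at all, which is why accuracy alone is misleading on imbalanced data.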

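```python
import numpy as np
import tensorflow as tf
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras import layers, models

# Synthetic stand-in for a real transactions table (about 2% positives);
# replace with the actual preprocessed features and fraud labels.
X, y = make_classification(n_samples=20000, n_features=20,
                           weights=[0.98, 0.02], random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Weight each class inversely to its frequency to counter the imbalance
classes = np.unique(y_train)
weights = compute_class_weight(class_weight='balanced', classes=classes, y=y_train)
class_weights = {int(c): float(w) for c, w in zip(classes, weights)}

# Example model
model = models.Sequential([
    layers.Input(shape=(X_train.shape[1],)),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy', tf.keras.metrics.Precision(),
                       tf.keras.metrics.Recall(), tf.keras.metrics.AUC()])

# A stratified validation set replaces validation_split, which is not stratified
history = model.fit(X_train, y_train, epochs=10, batch_size=64,
                    validation_data=(X_val, y_val), class_weight=class_weights)
```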