<a href="https://colab.research.google.com/github/saffarizadeh/INSY5378/blob/main/Assignment_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://kambizsaffari.com/Logo/College_of_Business.cmyk-hz-lg.png" width="500px"/>

# *INSY 5378*

# **Assignment: Introduction to TensorFlow, PyTorch, JAX, and Keras**

Instructor: Dr. Kambiz Saffari

---

**Instructions**
- This assignment covers concepts from **Chapter 3: Introduction to TensorFlow, PyTorch, JAX, and Keras**.
- Write your answers directly in the provided cells.
- You may add cells to test ideas, but only answers in the marked cells will be graded.


## Question 1: Tensors, Variables, and Gradients Across Frameworks

The chapter shows that TensorFlow, PyTorch, and JAX each handle tensors and gradient computation differently. This question gives you hands-on experience with each one.

**Part A - TensorFlow: Variables and GradientTape**

1. Create a TensorFlow constant tensor `b` from the list `[[1, 2, 3], [4, 5, 6]]` with dtype `float32`. Try to assign `b[0, 0] = 99.0`. In a comment, explain why this fails.
2. Create a `tf.Variable` called `v` initialized with the same values. Use `v[0, 0].assign(99.0)` to modify it. Print `v`.
3. Create a `tf.Variable` called `input_var` with value `3.0`. Inside a `tf.GradientTape` scope, compute `result = tf.square(input_var)`. Retrieve the gradient of `result` with respect to `input_var` and print it. In a comment, verify the result analytically.
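
A gradient returned by any of the three frameworks can also be sanity-checked numerically. The block below is a minimal, framework-free sketch of a central finite difference. The helper `central_difference` and the example function `cube` are illustrative names that do not come from the chapter, and the function is deliberately different from the one used in this question.

```python
def central_difference(f, x, eps=1e-4):
    """Approximate df/dx at x with a symmetric finite difference."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def cube(x):
    # Illustrative function (not the one from the assignment): f(x) = x**3
    return x ** 3


x0 = 2.0
numeric_grad = central_difference(cube, x0)
analytic_grad = 3 * x0 ** 2  # d/dx x**3 = 3 * x**2

print(f"numeric:  {numeric_grad:.4f}")
print(f"analytic: {analytic_grad:.4f}")
```

If the two values agree to several decimal places, the analytic derivative is very likely correct. The same check can be applied to the value a framework's automatic differentiation returns.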

**Part B - PyTorch: Autograd and Gradient Accumulation**

1. Create `input_var = torch.tensor(3.0, requires_grad=True)`. Compute `result = torch.square(input_var)`, call `result.backward()`, and print `input_var.grad`.
2. **Without resetting** the gradient, compute `result` again and call `result.backward()` a second time. Print `input_var.grad`. In a comment, explain why the value doubled and why `model.zero_grad()` matters during training.

**Part C - JAX: Statelessness and Functional Gradients**

1. Create `x = jnp.array([1, 2, 3], dtype="float32")`. Try `x[0] = 10.0` (it will fail). Use `x.at[0].set(10.0)` to create `new_x`. Print both. In a comment, explain why JAX forbids in-place modification.
2. Define `compute_loss(x)` returning `jnp.square(x)`. Use `jax.value_and_grad(compute_loss)` to get the loss and gradient for `jnp.array(3.0)`. Print both.

In a comment at the end, briefly describe one key strength and one key weakness of each framework (TensorFlow, PyTorch, JAX) as discussed in the chapter.


In [None]:
# Write your answer to Question 1 here

## Question 2: The Keras Layer Class

The chapter describes the `Layer` as the central abstraction in Keras. A layer encapsulates weights (state) and a forward-pass computation.

**Part A - Implement a Custom Layer**

Following the `SimpleDense` example from the chapter, implement a custom Keras layer by subclassing `keras.Layer`:

- `__init__(self, units, activation=None)` should store `units` and `activation`.
- `build(self, input_shape)` should create weights `self.W` and `self.b` using `self.add_weight()`.
- `call(self, inputs)` should compute `y = keras.ops.matmul(inputs, self.W) + self.b` and apply the activation if provided.

Instantiate your layer with `units=32` and `activation=keras.ops.relu`, pass a test input of shape `(2, 784)` through it, and print the output shape.

**Part B - Understanding the Design**

In a comment, answer:
1. Why does Keras create weights in `build()` rather than `__init__()`? What is "automatic shape inference"?
2. Why do we put our computation in `call()` instead of `__call__()`?
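
The relationship between `__call__()`, `build()`, and `call()` can be sketched in plain Python. The class below, `LazyDense`, is a hypothetical toy written with NumPy for illustration only. It is not the Keras implementation and leaves out everything Keras does beyond lazy weight creation.

```python
import numpy as np


class LazyDense:
    """Toy stand-in for a layer; it is NOT the Keras implementation."""

    def __init__(self, units):
        # Only the output size is known at construction time.
        self.units = units
        self.built = False

    def build(self, input_shape):
        # The input feature size is only known once data arrives.
        input_dim = input_shape[-1]
        rng = np.random.default_rng(0)
        self.W = rng.normal(size=(input_dim, self.units)).astype("float32")
        self.b = np.zeros((self.units,), dtype="float32")

    def call(self, inputs):
        # The forward pass only; no bookkeeping here.
        return inputs @ self.W + self.b

    def __call__(self, inputs):
        # Bookkeeping lives here: build once, then delegate to call().
        if not self.built:
            self.build(inputs.shape)
            self.built = True
        return self.call(inputs)


layer = LazyDense(units=4)
outputs = layer(np.ones((3, 10), dtype="float32"))
print(layer.W.shape)
print(outputs.shape)
```

In this sketch the weight matrix gets its first dimension from the first batch it sees, so the caller never has to state the input size up front.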


In [None]:
# Write your answer to Question 2 here

## Question 3: Compile, Fit, and the Keras Workflow

Before training a model, you must configure the learning process with `compile()`. This question explores those choices.

**Part A - Compile a Model Two Ways**

Create a simple model:
```python
model = keras.Sequential([keras.layers.Dense(1)])
```

1. Compile it using **string shortcuts**: optimizer `"rmsprop"`, loss `"mean_squared_error"`, metrics `["accuracy"]`.
2. Compile the same model again using **object instances** instead, and this time pass a custom `learning_rate` of `0.01` to the optimizer. Print the model's optimizer learning rate to confirm it is set correctly.

In a comment, explain when you would prefer using object instances over string shortcuts.

**Part B - Choosing the Right Loss Function**

The chapter emphasizes that picking the right loss function is critical.

For each scenario below, write a one-line `model.compile()` call with the appropriate loss (you can reuse the model above or create new ones). You do not need to train these models.

1. Binary classification (two classes, single output with sigmoid activation).
2. Multi-class classification with integer labels (e.g., MNIST with labels 0-9).
3. Regression (predicting a continuous value).

In a comment, explain: what are the three things `compile()` configures, and why does the chapter say the loss function is the most important one to get right?
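
To make the three problem types concrete, the block below sketches in NumPy what a loss of each kind measures. These are simplified illustrations under stated assumptions: the function names mirror common loss names, but the bodies are not the Keras implementations, and the sample arrays are made up.

```python
import numpy as np


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Penalizes confident wrong probabilities for 0/1 targets.
    y_prob = np.clip(y_prob, eps, 1 - eps)
    per_sample = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(per_sample))


def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    # labels are integer class ids; probs has one row of class probabilities per sample.
    probs = np.clip(probs, eps, 1.0)
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))


def mean_squared_error(y_true, y_pred):
    # Average squared distance between continuous targets and predictions.
    return float(np.mean((y_true - y_pred) ** 2))


# Made-up sample values for illustration.
bce = binary_crossentropy(np.array([1.0, 0.0]), np.array([0.9, 0.2]))
scce = sparse_categorical_crossentropy(
    np.array([0, 2]),
    np.array([[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]),
)
mse = mean_squared_error(np.array([1.0, 2.0]), np.array([1.5, 2.0]))

print(f"binary crossentropy:             {bce:.4f}")
print(f"sparse categorical crossentropy: {scce:.4f}")
print(f"mean squared error:              {mse:.4f}")
```

Each function expects a different target format (0/1 values, integer class ids, or real numbers), which is one reason a loss has to match the task.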


In [None]:
# Write your answer to Question 3 here

## Question 4: End-to-End Keras Model with Validation

This question walks through the complete Keras workflow on a binary classification task.

**Part A - Generate Data and Create a Validation Split**

Create a linearly separable dataset (as shown in the chapter):

```python
import numpy as np

num_samples_per_class = 1000
negative_samples = np.random.multivariate_normal(
    mean=[0, 3], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class
)
positive_samples = np.random.multivariate_normal(
    mean=[3, 0], cov=[[1, 0.5], [0.5, 1]], size=num_samples_per_class
)
inputs = np.vstack((negative_samples, positive_samples)).astype(np.float32)
targets = np.vstack((
    np.zeros((num_samples_per_class, 1), dtype="float32"),
    np.ones((num_samples_per_class, 1), dtype="float32"),
))
```

Shuffle the data and split it into 70% training and 30% validation using `np.random.permutation` (as shown in the chapter). Print the shapes of your training and validation sets.

In a comment, explain why we must keep training and validation data strictly separate.
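
The shuffle-and-split pattern can be sketched on a tiny made-up dataset. All names below (`toy_inputs`, `toy_targets`, and so on) are hypothetical and chosen to avoid clashing with the assignment's variables. The key point is that one permutation of indices is applied to both arrays, so each input stays paired with its target.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

# Ten made-up samples with two features each; row i is [2*i, 2*i + 1].
toy_inputs = np.arange(20, dtype="float32").reshape(10, 2)
toy_targets = (np.arange(10) % 2).astype("float32").reshape(10, 1)

# One shared permutation keeps inputs and targets aligned.
indices = np.random.permutation(len(toy_inputs))
shuffled_inputs = toy_inputs[indices]
shuffled_targets = toy_targets[indices]

# Hold out the first 30% for validation, train on the rest.
num_val = int(0.3 * len(toy_inputs))
val_inputs, val_targets = shuffled_inputs[:num_val], shuffled_targets[:num_val]
train_inputs, train_targets = shuffled_inputs[num_val:], shuffled_targets[num_val:]

print(train_inputs.shape, val_inputs.shape)
```

Shuffling before slicing matters for data like the dataset above, where all samples of one class are stacked before the other. Slicing without shuffling would put only one class in the validation set.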

**Part B - Build, Compile, Train**

```python
model = keras.Sequential([
    keras.layers.Dense(1, activation="sigmoid"),
])
```

Compile with optimizer `"rmsprop"`, loss `"binary_crossentropy"`, and metrics `["binary_accuracy"]`. Train for 15 epochs with `batch_size=16`, passing the validation data. Store the return value of `fit()` in `history`.

**Part C - Evaluate, Predict, and Interpret**

1. Use `model.evaluate()` on the validation set and print the loss and accuracy.
2. Use `model.predict()` on the first 5 validation inputs. For each, print the predicted probability, the predicted class (threshold 0.5), and the actual label.
3. Print the training loss and validation loss from the last epoch using `history.history`.

In a comment, answer:
1. What is the difference between `model(inputs)` and `model.predict(inputs)`?
2. This model is a single Dense layer (a linear classifier). The chapter says choosing a topology defines a "hypothesis space." What kind of decision boundary can this model learn, and what would you change to handle data that is not linearly separable?
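
Turning sigmoid outputs into class labels can be sketched without a trained model. In the block below, the weights `W` and bias `b` are hypothetical values picked by hand for illustration, not results from training, and `sigmoid` is a small helper defined here.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical weights for illustration only (not trained values).
W = np.array([[1.0], [-1.0]], dtype="float32")
b = np.array([0.0], dtype="float32")

points = np.array([[3.0, 0.0], [0.0, 3.0], [1.0, 1.0]], dtype="float32")

# A single Dense unit with sigmoid activation computes sigmoid(x @ W + b).
probabilities = sigmoid(points @ W + b)
predicted_classes = (probabilities > 0.5).astype("int32")

for point, prob, cls in zip(points, probabilities[:, 0], predicted_classes[:, 0]):
    print(f"point={point}, probability={prob:.3f}, class={cls}")
```

With these hand-picked weights, the probability is exactly 0.5 wherever the two features are equal, so the set of points where the prediction flips is a straight line.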


In [None]:
# Write your answer to Question 4 here

---
**Submission Reminder**
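- Make sure your notebook runs from top to bottom without errors. Steps that are expected to fail (the in-place assignments in Question 1) should be wrapped in `try`/`except` or commented out so they do not stop execution.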
- Clearly label all answers.
- Include all required comments and explanations. They are part of your grade.
