# EECS 545 (WN 2024) Homework 2 Q2: Softmax Regression via Gradient Ascent

<span class="instruction">Before starting the assignment, please fill in the following cell.</span>

In [None]:
###################################################################
# Enter your first and last name, e.g. "John Doe"                 #
# for example                                                     #
# __NAME__ = "Yunseok Jang"                                       #
# __UNIQID__ = "yunseokj"                                         #
###################################################################
raise NotImplementedError("TODO: Add your implementation here.")
###################################################################
#                        END OF YOUR CODE                         #
###################################################################

print(f"Your name and email: {__NAME__} <{__UNIQID__}@umich.edu>")
assert __NAME__ and __UNIQID__

# Softmax Regression via Gradient Ascent
In this notebook, you will implement gradient ascent for softmax regression on a given dataset.

Among the various ways of computing the gradient, we will use the gradient ascent update rule derived in our homework.

After implementing it, you will report the accuracy of your implementation.

## Setup code
Before getting started, we need to run some boilerplate code to set up our environment. You'll need to rerun this setup code each time you start the notebook. Let's start by checking whether we are using Python 3.11 or higher.

In [None]:
import sys
if sys.version_info[0] < 3:
    raise Exception("You must use Python 3")

# Compare (major, minor) together so the check does not depend on the minor version alone
if sys.version_info < (3, 11):
    print("Autograder will execute your code based on Python 3.11 environment. Please use Python 3.11 or higher to prevent any issues")
    print("You can create a conda environment with Python 3.11 like 'conda create --name eecs545 python=3.11'")
    raise Exception("Python 3 version is too low: {}".format(sys.version))
else:
    print("You are good to go")

First, run this cell to load the [autoreload](https://ipython.readthedocs.io/en/stable/config/extensions/autoreload.html) extension. This allows us to edit `.py` source files and re-import them into the notebook for a seamless editing and debugging experience.

In [None]:
%load_ext autoreload
%autoreload 2

Once you have placed `softmax_regression.py` in the correct location, run the following cell to import from `softmax_regression.py`. If it works correctly, it should print the message:
```Hello from softmax_regression.py```

In [None]:
from softmax_regression import hello
hello()

Then, we run some setup code for this notebook: import some useful packages and define the style used to highlight instructions.

In [None]:
# install required libraries
# !pip install numpy==1.24.1 scikit-learn==1.2.0

# import libraries
import math
import numpy as np

In [None]:
from IPython.display import display_html, HTML

display_html(HTML('''
<style type="text/css">
  .instruction { background-color: yellow; font-weight:bold; padding: 3px; }
</style>
'''));

## Load the dataset
The following code loads the dataset and prints the shape and the first three rows of each array.

In [None]:
import os
filename = 'data/q2_data.npz'

assert os.path.exists(filename), f"{filename} cannot be found."

q2_data = np.load(filename)
for k, v in q2_data.items():
    print(f'Key "{k}" has shape {v.shape}')
    print(f'First three rows of {k} are {v[:3]}')

num_classes = len(np.unique(q2_data['q2y_test']))
print(f'We have {num_classes} different classes in our dataset')

Note: In this problem, the class indices in y start from 1. We have three classes here, but please be aware that the Autograder will test your code with various numbers of classes.
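
Because the labels start from 1 while NumPy indexing starts from 0, an off-by-one shift is needed whenever labels are used as indices. The following is a minimal sketch of that shift, not part of the assignment code: the helper name `one_hot_from_one_indexed` is hypothetical, and the `(N, 1)` label shape is an assumption based on how the labels are reshaped later in this notebook.

```python
import numpy as np

def one_hot_from_one_indexed(y, num_classes):
    """Hypothetical helper: map labels in {1, ..., K} to one-hot rows of shape (N, K)."""
    y = np.asarray(y).reshape(-1).astype(int)  # flatten (N, 1) -> (N,)
    return np.eye(num_classes)[y - 1]          # shift to 0-based before indexing

labels = np.array([[1], [3], [2]])  # made-up labels, assumed to be shaped (N, 1)
print(one_hot_from_one_indexed(labels, num_classes=3))
```

Passing `num_classes` explicitly, rather than inferring it from the labels, keeps the sketch usable when some class happens to be absent from a batch.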

In [None]:
print(np.unique(q2_data['q2y_train']))
print(np.unique(q2_data['q2y_test']))

## Fit W to the train set of the data

Now that we have prepared our data, it is time to implement the gradient ascent algorithm for softmax regression. As the first step, we will implement the softmax probability computation. <span class="instruction">In the file `softmax_regression.py`, implement the function `compute_softmax_probs` that computes softmax for the data X and weight W.</span> You should double-check the numerical stability of your `compute_softmax_probs` implementation.
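
To illustrate what numerical stability means here, the sketch below applies the common max-subtraction trick to a generic `(N, K)` array of scores. It is only an illustration under stated assumptions: `stable_softmax` is a hypothetical name, it takes precomputed scores rather than X and W, and it is not necessarily the form expected by `compute_softmax_probs`.

```python
import numpy as np

def stable_softmax(logits):
    """Row-wise softmax of an (N, K) array of scores (hypothetical helper)."""
    # Subtracting the row maximum leaves the softmax unchanged, because the
    # common factor exp(-max) cancels between numerator and denominator.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)  # all exponents are <= 0, so nothing overflows
    return exp / exp.sum(axis=1, keepdims=True)

logits = np.array([[1000.0, 1001.0, 1002.0],  # a naive np.exp(1000.0) overflows to inf
                   [-5.0, 0.0, 5.0]])
probs = stable_softmax(logits)
print(probs)
print(probs.sum(axis=1))  # each row sums to 1 up to rounding
```

Without the shift, the first row would evaluate to `inf / inf` and produce `nan`.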

After implementing the softmax function, we will compute the weight W by fitting the train set. In this problem, we will use the gradient ascent algorithm. <span class="instruction">Please implement gradient ascent in `gradient_ascent_train` of `softmax_regression.py`.</span>
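
As a rough picture of what a gradient ascent loop looks like, the sketch below maximizes the mean log-likelihood of a generic softmax model on made-up toy data. Everything in it is an assumption for illustration: the function names, the `(K, D)` weight layout with one row per class, the learning rate, and the iteration count. The parameterization and update rule required by `gradient_ascent_train` are the ones derived in the homework and may differ.

```python
import numpy as np

def softmax_rows(scores):
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilize before exp
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def gradient_ascent_sketch(X, y, num_classes, lr=0.1, num_iters=500):
    """Maximize the mean log-likelihood; y holds labels in {1, ..., K}."""
    N, D = X.shape
    Y = np.eye(num_classes)[np.asarray(y).reshape(-1).astype(int) - 1]  # (N, K) one-hot
    W = np.zeros((num_classes, D))   # assumed layout: one weight row per class
    for _ in range(num_iters):
        P = softmax_rows(X @ W.T)    # (N, K) predicted class probabilities
        grad = (Y - P).T @ X / N     # gradient of the mean log-likelihood w.r.t. W
        W += lr * grad               # ascent: step along the gradient
    return W

# Toy, made-up data: three well-separated 2-D clusters plus a constant bias feature.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 4.0], [4.0, 0.0], [-4.0, -4.0]])
y_toy = np.repeat(np.arange(1, 4), 30)
X_toy = np.hstack([centers[y_toy - 1] + rng.normal(size=(90, 2)), np.ones((90, 1))])
W_toy = gradient_ascent_sketch(X_toy, y_toy, num_classes=3)
pred = np.argmax(X_toy @ W_toy.T, axis=1) + 1
print(f"toy training accuracy: {np.mean(pred == y_toy):.2f}")
```

The sign is the only difference from gradient descent: the log-likelihood is maximized, so the step adds the gradient instead of subtracting it.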

We then measure the accuracy with respect to W. You need to implement the `compute_accuracy` function in `softmax_regression.py`. Once you have implemented all the code correctly, you should be able to get an accuracy above 90%.
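
Two details are easy to get wrong when measuring accuracy: the labels start from 1 while `np.argmax` is 0-based, and comparing an `(N,)` prediction array with `(N, 1)` labels broadcasts to an `(N, N)` matrix. The sketch below shows one way to handle both. It assumes a `(K, D)` weight layout and uses a hypothetical function name and made-up data, so it is not necessarily the signature of `compute_accuracy`.

```python
import numpy as np

def accuracy_sketch(X, y, W):
    """Fraction of rows whose highest-scoring class matches the 1-indexed label."""
    scores = X @ W.T                      # assumed W layout: (K, D)
    # Softmax is monotone in the scores, so the argmax of the scores
    # is also the argmax of the class probabilities.
    pred = np.argmax(scores, axis=1) + 1  # argmax is 0-based; labels start from 1
    return np.mean(pred == np.asarray(y).reshape(-1))  # flatten (N, 1) labels first

X_demo = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 1.5]])
W_demo = np.eye(2)                   # class 1 scores feature 0, class 2 scores feature 1
y_demo = np.array([[1], [2], [1]])   # the third sample scores higher for class 2, so it is a miss
print(accuracy_sketch(X_demo, y_demo, W_demo))
```

The returned value is a fraction in [0, 1], which matches how the next cell multiplies the result by 100 to report a percentage.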

In [None]:
from softmax_regression import gradient_ascent_train, compute_accuracy
import numpy as np

np.random.seed(0)
your_accuracy = None
W = gradient_ascent_train(q2_data['q2x_train'], q2_data['q2y_train'], num_classes)
print(W.shape)

your_accuracy = compute_accuracy(q2_data['q2x_test'], q2_data['q2y_test'], W, num_classes) * 100
print(f'The accuracy of Softmax Regression - our implementation: {your_accuracy:.2f}%')

## Bonus: Performance comparison with SciKit-Learn

At the end of this question, we would like to check whether our performance is reasonable. Here, we use [Logistic Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) from [SciKit-Learn](https://scikit-learn.org/stable/). Your implementation should achieve accuracy similar to, or possibly even better than, scikit-learn's.

In [None]:
# !pip install scikit-learn if you haven't done so
from sklearn.linear_model import LogisticRegression

# check the previous cell output
assert your_accuracy is not None

# Note: accuracy varies depending on the solver
MLR = LogisticRegression(multi_class='multinomial', solver='newton-cg')
MLR.fit(q2_data['q2x_train'], np.reshape(q2_data['q2y_train'], -1) - 1)

# Generate predictions and compute accuracy
preds = MLR.predict(q2_data['q2x_test']) + 1  # the shape is (50, )
preds = preds[:, np.newaxis]

# Compute the percentage of predictions that match the labels
accuracy = 100 * np.mean((preds == q2_data['q2y_test']).astype(np.float32))

print(f'The accuracy of Sklearn Logistic Regression is: {accuracy:.2f}%')
print(f'Please compare with the accuracy of your implementation: {your_accuracy:.2f}%')