# In-Class Challenge Assignment: Experimenting with the Perceptron
# Day 19 Extension
# CMSE 202

## Now that you have a working Perceptron Classifier... let's experiment with it a bit!

When building and testing your Perceptron Classifier you used a simplified version of the iris dataset that has been reduced to just two features and two class labels. 

### Will your Perceptron classifier work on a more complex dataset?

Another widely used dataset for experimenting with binary classification is the [sonar dataset](https://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+(Sonar,+Mines+vs.+Rocks)).

A version of this dataset can be found here:

`https://raw.githubusercontent.com/msu-cmse-courses/cmse202-supplemental-data/main/data/sonar.csv`

Make sure you take a moment to read the [UC Irvine Machine Learning Repository page](https://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+(Sonar,+Mines+vs.+Rocks)) to understand exactly what is in this dataset. Essentially, it is a collection of sonar measurements of rocks and "mines" (metal cylinders).



---
### Testing your new tool and exploring others

With any time that you have left in class, see if you can accomplish the following:

1. Load up the sonar dataset and change the class labels so that they can be used with the Perceptron classifier.

2. Use the Perceptron classifier you built from scratch to see how well you can distinguish rocks from mines. You may need to modify your code if you didn't build it to be flexible enough to accept an arbitrary number of data features. Experiment with the learning rate and number of iterations to see how high an accuracy you can get with your classifier.

3. If you get your Perceptron classifier working, can you figure out how to use the Perceptron Classifier that is available in [scikit-learn](https://scikit-learn.org/stable/index.html)? You may need to do a bit of Google searching and exploration of the documentation to figure this out. How well does the scikit-learn version do compared to the one you built?

<!--
4. If you're feeling really ambitious, can you build a Perceptron classifier with [Tensorflow](https://www.tensorflow.org/)? Remember, the Perceptron is basically just a single-neuron single-layer neural network. This requires installation of Tensorflow and relevant APIs.
-->

4. The logistic regression model (Day 15) is also a multi-variable classifier. Use it on the same dataset. Compare the results of your perceptron classifier against those obtained from logistic regression and discuss your observations.
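
One thing worth knowing before tuning the learning rate in item 2: if the weights and bias start at zero and the activation is a hard threshold at zero, the learning rate only rescales the weights, so the predictions should not change. The sketch below illustrates this on synthetic data. `train_perceptron` and `predict` are hypothetical helpers rather than the course's reference code, and the data are random numbers, not the sonar measurements. The two learning rates are powers of two so that the rescaling is exact in floating point.

```python
import numpy as np

def train_perceptron(X, y, learning_rate, n_iterations):
    """Hypothetical helper: perceptron rule, zero-initialized weights, labels in {0, 1}."""
    weights = np.zeros(X.shape[1])
    bias = 0.0
    for _ in range(n_iterations):
        for x_i, y_i in zip(X, y):
            predicted = 1 if np.dot(x_i, weights) + bias >= 0 else 0
            error = y_i - predicted  # 0 when the sample is classified correctly
            weights += learning_rate * error * x_i
            bias += learning_rate * error
    return weights, bias

def predict(X, weights, bias):
    """Hypothetical helper: threshold the linear output at zero."""
    return (X @ weights + bias >= 0).astype(int)

# Synthetic, noisy two-class data (an assumption made for illustration only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 5))
y_demo = (X_demo[:, 0] + 0.5 * rng.normal(size=100) > 0).astype(int)

preds = {}
for lr in (0.25, 4.0):
    w, b = train_perceptron(X_demo, y_demo, learning_rate=lr, n_iterations=20)
    preds[lr] = predict(X_demo, w, b)
    print(f"learning_rate={lr}: training accuracy = {(preds[lr] == y_demo).mean():.2f}")

same_predictions = np.array_equal(preds[0.25], preds[4.0])
print("Identical predictions:", same_predictions)
```

If your own classifier behaves the same way, the number of iterations (or a non-zero weight initialization) is likely to matter more than the learning rate.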

---
&#9989; **Do This**: Load up the sonar data.

In [10]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.linear_model import Perceptron as SklearnPerceptron
from sklearn.linear_model import LogisticRegression

# Step 1: Load the dataset from the given URL
url = "https://raw.githubusercontent.com/msu-cmse-courses/cmse202-supplemental-data/main/data/sonar.csv"
data = pd.read_csv(url, header=None)

# Step 2: Preprocess the data
X = data.iloc[:, :-1].values  # Features (all columns except the last)
y = data.iloc[:, -1].values   # Labels (last column)

# Convert labels to binary: 'M' -> 1, 'R' -> 0
y = np.where(y == 'M', 1, 0)

# Ensure X is a NumPy array (this step guarantees correct data types for matrix operations)
X = np.array(X)

# Step 3: Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Step 4: The custom Perceptron classifier is defined in the next cell as MyPerceptron
# (the name Perceptron is avoided because it clashes with sklearn's Perceptron class).



Copy your perceptron class to the cell below. **Note**: sklearn has a **Perceptron** class, so you should avoid using the same name for your own perceptron class.

In [11]:
import numpy as np

class MyPerceptron:
    def __init__(self, learning_rate=0.01, n_iterations=1000):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
        self.weights = None
        self.bias = None

    def _activation(self, x):
        """Step activation function"""
        return 1 if x >= 0 else 0

    def fit(self, X, y):
        """Train the perceptron model using the provided training data."""
        # Initialize weights and bias to zero
        self.weights = np.zeros(X.shape[1])
        self.bias = 0

        # Perceptron learning rule: sweep over the samples and update only on mistakes
        for _ in range(self.n_iterations):
            for i in range(len(y)):
                linear_output = np.dot(X[i], self.weights) + self.bias
                predicted = self._activation(linear_output)
                
                # Perceptron weight update rule
                if predicted != y[i]:
                    error = y[i] - predicted
                    self.weights += self.learning_rate * error * X[i]
                    self.bias += self.learning_rate * error

    def predict(self, X):
        """Predict using the trained perceptron model."""
        linear_output = np.dot(X, self.weights) + self.bias
        return np.array([self._activation(x) for x in linear_output])

# Usage example:
# perceptron = MyPerceptron(learning_rate=0.01, n_iterations=1000)
# perceptron.fit(X_train, y_train)
# y_pred = perceptron.predict(X_test)
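
For reference, the update that `fit` applies to each training sample $(\mathbf{x}_i, y_i)$ can be written compactly. This is just a restatement of the code above, with $\eta$ standing for `learning_rate`, $\mathbf{w}$ for `weights`, and $b$ for `bias`:

$$
\begin{aligned}
\hat{y}_i &= \begin{cases} 1 & \text{if } \mathbf{w} \cdot \mathbf{x}_i + b \ge 0 \\ 0 & \text{otherwise} \end{cases} \\
\mathbf{w} &\leftarrow \mathbf{w} + \eta \, (y_i - \hat{y}_i) \, \mathbf{x}_i \\
b &\leftarrow b + \eta \, (y_i - \hat{y}_i)
\end{aligned}
$$

When the prediction is correct, $y_i - \hat{y}_i = 0$ and nothing changes, which is why the code only updates inside the `if predicted != y[i]` branch.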



Train your perceptron class with the sonar data. What's the accuracy?

In [12]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Step 1: Load the Sonar dataset
url = "https://raw.githubusercontent.com/msu-cmse-courses/cmse202-supplemental-data/main/data/sonar.csv"
df = pd.read_csv(url, header=None)

# Step 2: Preprocess the data
# Replace class labels ('R' and 'M') with 0 and 1
df[60] = df[60].map({'R': 0, 'M': 1})

# Separate features and labels
X = df.drop(60, axis=1).values  # Features (all columns except the last one)
y = df[60].values  # Labels (last column)

# Step 3: Split the dataset into training and test sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 4: Train the Perceptron model
perceptron = MyPerceptron(learning_rate=0.01, n_iterations=1000)
perceptron.fit(X_train, y_train)

# Step 5: Make predictions on the test set
y_pred = perceptron.predict(X_test)

# Step 6: Evaluate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy on Sonar dataset: {accuracy * 100:.2f}%")



TypeError: can't multiply sequence by non-int of type 'float'
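
An error like this usually means that the feature array holds strings (NumPy dtype `object`) instead of numbers, so `np.dot` ends up multiplying a string by a float. One possible cause, which is an assumption because the file contents are not shown here, is a header row being read as data when `header=None` is passed. The sketch below reproduces the situation on a small made-up table and shows one way to coerce the features to floats. The toy values and variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for a CSV whose first row is a header that was read as data
raw = pd.DataFrame([["f0", "f1", "label"],
                    ["0.02", "0.37", "R"],
                    ["0.45", "0.11", "M"]])

X_bad = raw.iloc[:, :-1].values
print("dtype before cleaning:", X_bad.dtype)

weights = np.zeros(2)
try:
    np.dot(X_bad[1], weights)  # strings times floats
except TypeError as exc:
    print("TypeError:", exc)

# Coerce every feature column to numbers; anything non-numeric becomes NaN
features = raw.iloc[:, :-1].apply(pd.to_numeric, errors="coerce")
valid = features.notna().all(axis=1)  # keep only fully numeric rows

X_clean = features[valid].to_numpy(dtype=float)
labels = raw.loc[valid, raw.columns[-1]]
y_clean = np.where(labels == "M", 1, 0)

print("dtype after cleaning:", X_clean.dtype)
print("dot product now works:", np.dot(X_clean[0], weights))
```

On the real data, printing `X_train.dtype` and `df.head()` before training is a quick way to check whether this is what is happening.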

---
&#9989; **Do This**: Use the **Perceptron** class from the sklearn library to classify the same dataset in the cell below. Compared to your perceptron classifier, how does the sklearn perceptron perform?

In [None]:
# put your code here
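
As a starting point, here is a minimal sketch of the scikit-learn workflow on synthetic data. The dataset comes from `make_classification` rather than the sonar file, and the hyperparameter values (`eta0`, `max_iter`, `tol`) are illustrative assumptions, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron as SklearnPerceptron
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem standing in for the sonar data
X_demo, y_demo = make_classification(n_samples=200, n_features=10,
                                     n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo,
                                          test_size=0.2, random_state=42)

# eta0 is the learning rate; max_iter caps the number of passes over the data
clf = SklearnPerceptron(eta0=0.01, max_iter=1000, tol=1e-3, random_state=42)
clf.fit(X_tr, y_tr)

sk_predictions = clf.predict(X_te)
sk_accuracy = accuracy_score(y_te, sk_predictions)
print(f"sklearn Perceptron accuracy on synthetic data: {sk_accuracy * 100:.2f}%")
```

To apply this to the sonar data, swap in your own `X_train`, `X_test`, `y_train`, and `y_test`.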


---
&#9989; **Do This**: Use the **logistic regression model** from the statsmodels library to classify the same dataset in the cell below.

* Note that the full sonar dataset contains some values that lead to singular-matrix problems in the logistic regression. Thus, we will use only the first 40 attributes (columns) in the sonar dataset. The class label is still the last column in the sonar dataset.
* We will add a constant to the model, which is equivalent to the bias weight.
* The Logit function requires the labels to be 1 or 0. You'll need to replace '-1' in the labels with '0' for the Logit function.
* Let's set test_size = 0.15 in the train-test split, and fit the model using the training set.
* Predict the labels of the test set. What is the accuracy?

In [None]:
# put your code here


Train your perceptron class with the training set.
* Don't forget that the sonar dataset uses '-1' as a class label. You will probably need to convert '0' in train_labels and test_labels back to '-1'.
* Pass the features of the test set to your perceptron's prediction function to predict the labels.

In [None]:
# put your code here
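
Switching between the two label conventions is a one-liner with `np.where`. The sketch below uses a small made-up label array. The variable names are hypothetical.

```python
import numpy as np

# Made-up labels in the 0/1 convention used for the Logit function
labels_01 = np.array([0, 1, 1, 0, 1])

# Convert 0 -> -1 and keep 1 as it is
labels_pm1 = np.where(labels_01 == 0, -1, 1)
print(labels_pm1)

# Convert back: -1 -> 0 and keep 1 as it is
back_to_01 = np.where(labels_pm1 == -1, 0, 1)
print(back_to_01)
```

Whichever convention you pick, the activation function of your perceptron needs to return the same pair of values as the labels. A step function that returns 0 and 1, for example, can never match a label of -1.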


---
&#9989; **Do This**: Give a short discussion of the comparison between the results from the different classifiers.


-----
### Congratulations, we're done!

Now, you just need to submit this assignment by uploading it to today's submission folder on the course <a href="https://d2l.msu.edu/">Desire2Learn</a> web page (don't forget to add your names to the first cell).


&#169; Copyright 2025, The Department of Computational Mathematics, Science and Engineering; Michigan State University