# DS 542 Notebook 6



## Download Data

The following cells download the UCI Abalone data set.
You should not need to modify any of these cells.

In [2]:
!wget https://archive.ics.uci.edu/static/public/1/abalone.zip

--2024-09-28 03:53:49--  https://archive.ics.uci.edu/static/public/1/abalone.zip
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified
Saving to: ‘abalone.zip’

abalone.zip             [  <=>               ]  54.06K   168KB/s    in 0.3s    

2024-09-28 03:53:50 (168 KB/s) - ‘abalone.zip’ saved [55357]



In [3]:
!unzip abalone.zip

Archive:  abalone.zip
  inflating: Index                   
  inflating: abalone.data            
  inflating: abalone.names           


In [4]:
!cat abalone.names

1. Title of Database: Abalone data

2. Sources:

   (a) Original owners of database:
	Marine Resources Division
	Marine Research Laboratories - Taroona
	Department of Primary Industry and Fisheries, Tasmania
	GPO Box 619F, Hobart, Tasmania 7001, Australia
	(contact: Warwick Nash +61 02 277277, wnash@dpi.tas.gov.au)

   (b) Donor of database:
	Sam Waugh (Sam.Waugh@cs.utas.edu.au)
	Department of Computer Science, University of Tasmania
	GPO Box 252C, Hobart, Tasmania 7001, Australia

   (c) Date received: December 1995


3. Past Usage:

   Sam Waugh (1995) "Extending and benchmarking Cascade-Correlation", PhD
   thesis, Computer Science Department, University of Tasmania.

   -- Test set performance (final 1044 examples, first 3133 used for training):
	24.86% Cascade-Correlation (no hidden nodes)
	26.25% Cascade-Correlation (5 hidden nodes)
	21.5%  C4.5
	 0.0%  Linear Discriminate Analysis
	 3.57% k=5 Nearest Neighbour
      (Problem encoded as a classification task)

   -- Data set samp

## Prepare Data

This section reads the data set and converts it to PyTorch tensors.
You probably do not need to modify this section.

In [5]:
import numpy as np
import torch

In [6]:
abalone_X = []
abalone_Y = []
with open('abalone.data') as abalone_file:
    for line in abalone_file:
        line = line.rstrip("\n")
        line = line.split(",")

        # drop initial sex column
        line = line[1:]

        # convert from strings to numbers
        line = [float(v) for v in line]

        abalone_X.append(line[:-1])
        abalone_Y.append(line[-1])

abalone_X = np.array(abalone_X)
abalone_Y = np.array(abalone_Y)
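
The same parsing can also be sketched with `np.loadtxt`, which can skip the sex column through `usecols`. This is an alternative under the assumption that every row has nine comma-separated fields. The three rows below are illustrative stand-ins in that layout rather than a read of `abalone.data`, and the names `sample`, `table`, `X`, and `Y` are hypothetical.

```python
import io
import numpy as np

# Illustrative rows in the abalone.data layout: sex, seven measurements, rings.
sample = io.StringIO(
    "M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15\n"
    "M,0.35,0.265,0.09,0.2255,0.0995,0.0485,0.07,7\n"
    "F,0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9\n"
)

# Columns 1..8 are numeric; column 0 (sex) is never parsed.
table = np.loadtxt(sample, delimiter=",", usecols=tuple(range(1, 9)))

X = table[:, :-1]  # the seven input measurements
Y = table[:, -1]   # the ring counts
print(X.shape, Y.shape)
```

Passing the file name `'abalone.data'` in place of `sample` should give the full arrays, assuming the file has no header or malformed rows.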

In [7]:
# GPU configuration

def to_gpu(t):
    if torch.cuda.is_available():
        return t.cuda()
    return t

def to_numpy(t):
    return t.detach().cpu().numpy()

device = to_gpu(torch.ones(1,1)).device
device

device(type='cuda', index=0)

In [8]:
# switch from NumPy arrays to Torch tensors

abalone_X = torch.tensor(abalone_X, device=device)
abalone_Y = torch.tensor(abalone_Y, device=device)

In [9]:
abalone_X

tensor([[0.4550, 0.3650, 0.0950,  ..., 0.2245, 0.1010, 0.1500],
        [0.3500, 0.2650, 0.0900,  ..., 0.0995, 0.0485, 0.0700],
        [0.5300, 0.4200, 0.1350,  ..., 0.2565, 0.1415, 0.2100],
        ...,
        [0.6000, 0.4750, 0.2050,  ..., 0.5255, 0.2875, 0.3080],
        [0.6250, 0.4850, 0.1500,  ..., 0.5310, 0.2610, 0.2960],
        [0.7100, 0.5550, 0.1950,  ..., 0.9455, 0.3765, 0.4950]],
       device='cuda:0', dtype=torch.float64)

In [10]:
abalone_X.shape

torch.Size([4177, 7])

In [11]:
abalone_Y

tensor([15.,  7.,  9.,  ...,  9., 10., 12.], device='cuda:0',
       dtype=torch.float64)

In [16]:
abalone_Y.shape

torch.Size([4177, 1])

## Problem 1 - Linear Regression

Use PyTorch to implement linear regression of the abalone Rings column saved in `abalone_Y` using the columns in `abalone_X` as inputs.
Train your linear model using gradient descent as described in lecture.

You can freely use code from the [example training notebook shared in class](https://colab.research.google.com/drive/1xWo_rF0exGdewtaMZP5LNBKJolxUVt8u?usp=sharing).
This model should be much simpler than that example - in particular, you do not need (and should not have) the Fourier features, hidden layers, or activation functions.

Feel free to add extra cells as you feel appropriate.
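
For reference, the quantities involved in gradient descent for a linear model can be written out as follows. This is a sketch that borrows the names used in the code cell below ($\theta$ for the coefficients, $b$ for the bias, $m$ for the number of rows) and the $\tfrac{1}{2m}$ cost scaling from that cell; $\alpha$ stands for the learning rate, and the lecture may use a different convention.

$$
\hat{y} = X\theta + b, \qquad
J(\theta, b) = \frac{1}{2m} \sum_{i=1}^{m} \left(\hat{y}_i - y_i\right)^2
$$

$$
\nabla_\theta J = \frac{1}{m} X^\top \left(\hat{y} - y\right), \qquad
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}_i - y_i\right)
$$

$$
\theta \leftarrow \theta - \alpha \, \nabla_\theta J, \qquad
b \leftarrow b - \alpha \, \frac{\partial J}{\partial b}
$$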

In [13]:
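# BUILD AND TRAIN YOUR LINEAR MODEL HERE
def gradient_descent(X, y, theta, bias, learning_rate, iterations):
    """Batch gradient descent on the cost (1/(2m)) * sum((X @ theta + bias - y)**2)."""
    m = X.shape[0]
    # Keep y as a column so predictions - y stays (m, 1) instead of broadcasting to (m, m).
    y = y.reshape(-1, 1)
    cost_history = torch.zeros(iterations, device=X.device)

    for i in range(iterations):
        predictions = X.mm(theta) + bias
        errors = predictions - y

        # Gradients of the cost with respect to theta and bias.
        gradient_theta = (1 / m) * X.t().mm(errors)
        gradient_b = (1 / m) * torch.sum(errors)

        theta = theta - learning_rate * gradient_theta
        bias = bias - learning_rate * gradient_b

        # Cost of the parameters used for this iteration's predictions (before the update).
        cost_history[i] = (1 / (2 * m)) * torch.sum(errors ** 2)

    return theta, bias, cost_history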
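
One way to gain confidence in hand-written gradients like these is to compare them with PyTorch's autograd on a small problem. The sketch below does that on synthetic data (an assumption for illustration, not the abalone set), using the same cost scaling as `cost_history`; the names `theta_matches` and `bias_matches` are hypothetical.

```python
import torch

torch.manual_seed(0)

# Small synthetic problem; float64 keeps the comparison tight.
m, n = 50, 7
X = torch.rand(m, n, dtype=torch.float64)
y = torch.rand(m, 1, dtype=torch.float64)
theta = torch.randn(n, 1, dtype=torch.float64, requires_grad=True)
bias = torch.zeros((), dtype=torch.float64, requires_grad=True)

# Cost with the same 1/(2m) scaling as cost_history.
errors = X.mm(theta) + bias - y
cost = (1 / (2 * m)) * torch.sum(errors ** 2)
cost.backward()  # autograd fills theta.grad and bias.grad

# Hand-written gradients, following the formulas in gradient_descent.
with torch.no_grad():
    gradient_theta = (1 / m) * X.t().mm(errors)
    gradient_b = (1 / m) * torch.sum(errors)

theta_matches = torch.allclose(theta.grad, gradient_theta)
bias_matches = torch.allclose(bias.grad, gradient_b)
print(theta_matches, bias_matches)
```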


In [18]:
# PRINT YOUR LINEAR MODEL COEFFICIENTS HERE
abalone_X = abalone_X.to(torch.float32)  # Ensure X is float32
abalone_Y = abalone_Y.to(torch.float32)  # Ensure Y is float32
theta_initial = torch.randn((abalone_X.shape[1], 1), dtype=torch.float32).to(device)
bias = torch.tensor(0.0, dtype=torch.float32, device=device)

theta_final, bias, cost_history = gradient_descent(abalone_X, abalone_Y, theta_initial, bias, 0.00001, 1000)

print("theta:", theta_final)


theta: tensor([[-0.7853],
        [-1.4158],
        [-0.8171],
        [-0.6219],
        [ 0.7944],
        [ 0.9757],
        [-0.6478]], device='cuda:0')


In [19]:
# PRINT YOUR LINEAR BIAS HERE.
print('bias:', bias)


bias: tensor(0.1127, device='cuda:0')
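
How far a run like this gets depends heavily on the step size: with a very small learning rate and a fixed number of iterations, the parameters may still be close to their starting values. The sketch below compares the final cost of the same update rule at two learning rates against the least-squares optimum from `torch.linalg.lstsq`. It uses synthetic stand-in data (an assumption, not the abalone set), and the helper `final_cost` and the chosen rates are hypothetical.

```python
import torch

def final_cost(X, y, learning_rate, iterations):
    """Run batch gradient descent from zeros and return the final cost."""
    m, n = X.shape
    theta = torch.zeros(n, 1)
    bias = torch.tensor(0.0)
    for _ in range(iterations):
        errors = X @ theta + bias - y
        theta = theta - learning_rate * (X.t() @ errors) / m
        bias = bias - learning_rate * errors.mean()
    errors = X @ theta + bias - y
    return (errors ** 2).sum() / (2 * m)

torch.manual_seed(0)

# Synthetic data: seven features in [0, 1], a known linear rule, small noise.
m = 500
X = torch.rand(m, 7)
true_theta = torch.arange(1.0, 8.0).reshape(-1, 1)
y = X @ true_theta + 10.0 + 0.1 * torch.randn(m, 1)

cost_small = final_cost(X, y, learning_rate=1e-5, iterations=1000)
cost_large = final_cost(X, y, learning_rate=1e-1, iterations=1000)

# Closed-form reference: append a column of ones so the bias is fitted too.
A = torch.cat([X, torch.ones(m, 1)], dim=1)
solution = torch.linalg.lstsq(A, y).solution
cost_best = ((A @ solution - y) ** 2).sum() / (2 * m)

print(cost_small.item(), cost_large.item(), cost_best.item())
```

Comparing the last value of `cost_history` with such a reference is one way to judge whether more iterations or a larger learning rate would help.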


In [20]:
abalone_X.shape

torch.Size([4177, 7])

## Problem 2 - A Better Model

Build and train a separate PyTorch model with at least one hidden layer.
Then answer the questions below.

Again, you can freely use code from the [example training notebook shared in class](https://colab.research.google.com/drive/1xWo_rF0exGdewtaMZP5LNBKJolxUVt8u?usp=sharing).
You should not need the Fourier features for this particular model.

Feel free to add extra cells as you feel appropriate.

In [33]:
import torch
import torch.nn as nn
import torch.optim as optim

class SimpleNN(nn.Module):
    """Fully connected network: two ReLU hidden layers and a sigmoid output."""

    def __init__(self, input_size, hidden_width, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_width)
        self.fc2 = nn.Linear(hidden_width, hidden_width)
        self.fc3 = nn.Linear(hidden_width, output_size)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        # Note: the sigmoid keeps every prediction inside (0, 1).
        x = torch.sigmoid(self.fc3(x))
        return x


input_size = 7        # number of feature columns in abalone_X
hidden_width = 128    # units in each hidden layer
output_size = 1       # one predicted Rings value per row
learning_rate = 0.001
iterations = 1000

x_train = abalone_X
y_train = abalone_Y

Describe the second model that you built.
Include the number and widths of the hidden layers, the activation functions, and anything else that you deem important.

The model has two hidden layers, each 128 units wide (`hidden_width = 128`), followed by an output layer. The hidden layers use ReLU activations and the output layer uses a sigmoid. I trained it with the Adam optimizer and mean squared error loss, using a learning rate of 0.001 for 1000 iterations.
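
To make the size of this architecture concrete, the sketch below rebuilds the same layer widths with `nn.Sequential` (a stylistic substitution for the `SimpleNN` class, not the code that was trained) and counts the trainable parameters. The names `model`, `per_layer`, and `total` are hypothetical.

```python
import torch.nn as nn

# Same widths as described above: 7 inputs, two hidden layers of 128, one output.
model = nn.Sequential(
    nn.Linear(7, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1), nn.Sigmoid(),
)

# Each nn.Linear(n_in, n_out) holds n_in * n_out weights plus n_out biases.
per_layer = [
    sum(p.numel() for p in layer.parameters())
    for layer in model
    if isinstance(layer, nn.Linear)
]
total = sum(per_layer)
print(per_layer, total)  # [1024, 16512, 129] 17665
```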

What loss value did your second model achieve?

In [32]:
# PRINT YOUR LOSS HERE

def train_network(x_train, y_train, input_size, hidden_width, output_size, learning_rate, iterations, device):
    # Work on float32 copies on the target device. The targets are kept as an
    # (N, 1) column so they match the shape of the model output.
    x_train = x_train.clone().detach().float().to(device)
    y_train = y_train.clone().detach().float().to(device).reshape(-1, 1)

    model = SimpleNN(input_size=input_size, hidden_width=hidden_width, output_size=output_size).to(device)
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    # Full-batch training: every iteration uses the whole data set.
    for epoch in range(iterations):
        outputs = model(x_train)
        loss = criterion(outputs, y_train)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Report the training loss every 100 iterations.
        if (epoch+1) % 100 == 0:
            print(f'Epoch [{epoch+1}/{iterations}], Loss: {loss.item():.4f}')

    return model

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


trained_model = train_network(x_train, y_train, input_size, hidden_width, output_size, learning_rate, iterations, device)


Epoch [100/1000], Loss: 90.2152
Epoch [200/1000], Loss: 90.2042
Epoch [300/1000], Loss: 90.2037
Epoch [400/1000], Loss: 90.2036
Epoch [500/1000], Loss: 90.2036
Epoch [600/1000], Loss: 90.2035
Epoch [700/1000], Loss: 90.2035
Epoch [800/1000], Loss: 90.2035
Epoch [900/1000], Loss: 90.2035
Epoch [1000/1000], Loss: 90.2035
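
One property of this architecture is relevant when reading these numbers: a sigmoid output lies between 0 and 1, so if every target is at least 1, the mean squared error cannot fall below the mean of $(y - 1)^2$. The sketch below illustrates that bound on a handful of illustrative ring counts (an assumption, not a read of the data set); `floor`, `logits`, and `losses` are hypothetical names. Whether this accounts for the plateau above would need to be checked against the actual `abalone_Y` values.

```python
import torch

# Illustrative targets: a few ring counts, all at least 1.
y = torch.tensor([[15.0], [7.0], [9.0], [10.0], [7.0]])

# Lowest MSE reachable when no prediction can exceed 1.
floor = torch.mean((y - 1.0) ** 2)

torch.manual_seed(0)
# 1000 arbitrary sets of pre-activation values standing in for fc3 outputs.
logits = 10.0 * torch.randn(1000, 5, 1)
predictions = torch.sigmoid(logits)  # never above 1
losses = torch.mean((predictions - y) ** 2, dim=(1, 2))

# No set of sigmoid outputs does better than the floor.
print(floor.item(), losses.min().item())
```

For an unbounded regression target, returning `self.fc3(x)` directly, without the sigmoid, is the more common choice.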
