# **PyTorch**

PyTorch is one of the most widely used machine learning frameworks. It provides the tools to build, train, and test models. We will go over how to run a simple linear regression model using Python 3 and PyTorch.

First, we can start by installing the PyTorch library using pip.

In [1]:
pip install torch

Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch)
  Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch)
  Using cached nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch)
  Using cached nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch)
  Using cached nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch)
  Using cached nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch)
  Using cached nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.2.106 (from torch)
  Using cached nvidia_curand_cu12-10.3.2.106-py3-

# **Tensors**

Tensors are the data structures PyTorch uses to encode a model's parameters, inputs, and outputs. They are very similar to arrays and matrices, and they interoperate closely with **ndarrays** from the **numpy** library in Python. We can create tensors in many different ways, for example from Python lists, from NumPy arrays, or filled with random values.

For more information on Tensors, check out the PyTorch guide: https://pytorch.org/tutorials/beginner/basics/tensorqs_tutorial.html
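As a minimal sketch of a few of those creation routes, the cell below builds tensors from a nested list, from the shape of an existing tensor, and from a NumPy array. The variable names are illustrative only. One detail worth knowing is that `torch.from_numpy` shares memory with the source array on the CPU, so an in-place change to the array shows up in the tensor.

```python
import numpy as np
import torch

# From a nested Python list
from_list = torch.tensor([[1, 2], [3, 4]])

# Same shape as an existing tensor, filled with ones or random values
ones = torch.ones_like(from_list)
rand = torch.rand_like(from_list, dtype=torch.float32)  # override the integer dtype

# From a NumPy array: the tensor and the array share memory on the CPU
arr = np.array([[1, 2], [3, 4]])
shared = torch.from_numpy(arr)
arr[0, 0] = 100  # in-place change to the array
print(shared[0, 0].item())  # 100, because the tensor sees the change

# Back to NumPy
back = shared.numpy()
print(type(back).__name__)  # ndarray
```

If an independent copy is wanted instead, `torch.tensor(arr)` copies the data rather than sharing it.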

In [2]:
# Import our PyTorch library
import torch
import numpy as np

data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)

print(x_data)


# Converting a np array to tensor
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

print(x_np)

# Creating a tensor with random values and checking the attributes of our tensor
randomTensor = torch.rand(3,4)

print(randomTensor)
print(f"Shape of tensor: {randomTensor.shape}")
print(f"Datatype of tensor: {randomTensor.dtype}")
print(f"Device tensor is stored on: {randomTensor.device}")


# Storing the tensor on a GPU
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    randomTensor = randomTensor.to("cuda")

tensor([[1, 2],
        [3, 4]])
tensor([[1, 2],
        [3, 4]])
tensor([[0.5330, 0.4052, 0.6336, 0.9844],
        [0.1022, 0.8949, 0.9983, 0.6709],
        [0.7443, 0.6058, 0.5377, 0.3247]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
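The GPU move at the end of the cell above is often written in a device-agnostic form. This is a small sketch with `device` and `t` as illustrative names; it runs unchanged on machines without a GPU.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

t = torch.rand(3, 4)
t = t.to(device)  # returns a tensor that lives on the chosen device

# A tensor has to be on the CPU before it can be converted to NumPy
as_numpy = t.cpu().numpy()
print(t.device.type, as_numpy.shape)
```

Keeping the device in one variable makes it easier to move the model and its inputs to the same place later on.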


# **Linear Regression with PyTorch**

We are going to run linear regression on some stock data. We will use **Yahoo Finance** (through the `yfinance` package) to get the daily adjusted closing prices for NVIDIA over the past year. The code is commented to explain each step and keep it easy to read, but the idea is to download the data, parse it and store it in tensors, then fit a linear regression with Mean Squared Error as our loss function.
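To make the loss concrete: Mean Squared Error is the average of the squared differences between predictions and targets. The sketch below uses made-up numbers (not stock data) and checks that `nn.MSELoss` agrees with the formula written out by hand.

```python
import torch
import torch.nn as nn

# Made-up predictions and targets, purely for illustration
predictions = torch.tensor([[2.0], [4.0], [6.0]])
targets = torch.tensor([[1.0], [4.0], [8.0]])

# MSE by hand: squared errors are 1, 0, and 4, so the mean is 5/3
manual_mse = ((predictions - targets) ** 2).mean()

# The same quantity from PyTorch's built-in loss (default reduction is 'mean')
builtin_mse = nn.MSELoss()(predictions, targets)

print(manual_mse.item(), builtin_mse.item())
```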

In [3]:
pip install yfinance torch matplotlib

In [None]:
import yfinance as yf
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime, timedelta

# Step 1: Download NVIDIA stock data for the past year
end_date = datetime.now()
start_date = end_date - timedelta(days=365)
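# auto_adjust=False keeps the 'Adj Close' column, which newer yfinance releases may drop by default
stock_data = yf.download('NVDA', start=start_date, end=end_date, auto_adjust=False)
# Some yfinance versions return (field, ticker) MultiIndex columns; keep only the field level
if stock_data.columns.nlevels > 1:
    stock_data.columns = stock_data.columns.get_level_values(0)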

# Use the adjusted closing price
stock_data = stock_data[['Adj Close']].reset_index()

# Step 2: Prepare data for linear regression
# Convert dates to numerical format (days since start date)
stock_data['Days'] = (stock_data['Date'] - stock_data['Date'].min()).dt.days

# Prepare data for PyTorch
days = torch.tensor(stock_data['Days'].values, dtype=torch.float32).view(-1, 1)
y = torch.tensor(stock_data['Adj Close'].values, dtype=torch.float32).view(-1, 1)

# Standardize the inputs: raw day counts (0 to ~365) are large enough to make plain SGD diverge
x = (days - days.mean()) / days.std()

# Step 3: Define a simple linear regression model in PyTorch
class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)  # one input feature, one output

    def forward(self, x):
        return self.linear(x)

# Initialize model, define loss and optimizer
model = LinearRegressionModel()
criterion = nn.MSELoss()
# With standardized inputs, lr=0.01 converges well within 1000 epochs
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Step 4: Train the model
num_epochs = 1000
for epoch in range(num_epochs):
    model.train()

    # Forward pass
    predictions = model(x)
    loss = criterion(predictions, y)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch+1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Step 5: Plot the results
plt.figure(figsize=(10, 6))
plt.scatter(stock_data['Date'], stock_data['Adj Close'], label='Actual Data', color='blue')
predicted_prices = model(x).detach().numpy()
plt.plot(stock_data['Date'], predicted_prices, label='Linear Regression Line', color='red')
plt.xlabel('Date')
plt.ylabel('Adjusted Close Price (USD)')
plt.title('NVIDIA Stock Price Linear Regression')
plt.legend()
plt.grid(True)
plt.show()
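
One way to sanity-check a gradient-descent fit like the one above is to compare it with the closed-form least-squares line. The sketch below does this on synthetic data rather than the NVIDIA prices; the data, the learning rate of 0.1, the epoch count, and the use of `np.polyfit` as the reference are all assumptions made for illustration. The inputs are standardized first, since plain SGD tends to be unstable on large unscaled inputs such as raw day counts.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: price = 0.5 * day + 100 plus noise (illustrative, not real prices)
days = torch.arange(0, 250, dtype=torch.float32).view(-1, 1)
prices = 0.5 * days + 100 + 5 * torch.randn_like(days)

# Standardize the inputs so gradient descent is well conditioned
x = (days - days.mean()) / days.std()

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(500):
    optimizer.zero_grad()
    loss = criterion(model(x), prices)
    loss.backward()
    optimizer.step()

# Closed-form least-squares fit on the same standardized inputs
slope_ref, intercept_ref = np.polyfit(x.squeeze().numpy(), prices.squeeze().numpy(), 1)

slope_sgd = model.weight.item()
intercept_sgd = model.bias.item()
print(f"SGD:         slope={slope_sgd:.3f}, intercept={intercept_sgd:.3f}")
print(f"Closed form: slope={slope_ref:.3f}, intercept={intercept_ref:.3f}")
```

If the two sets of numbers disagree noticeably, the usual suspects are a learning rate that is too small, too few epochs, or inputs that were not scaled.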
