In [0]:
!pip install torch torchvision

# Imports, Documentation


---



Imports necessary Torch, Torchvision, NumPy, and plotting libraries. 

Main Docs: https://pytorch.org/docs/stable/_modules/torch.html

Neural Networks: https://pytorch.org/docs/stable/nn.html

Functional: https://pytorch.org/docs/stable/_modules/torch/nn/functional.html 

Dataset Creation: https://pytorch.org/docs/stable/data.html

NumPy: https://docs.scipy.org/doc/numpy-1.15.1/reference/

Plotting: https://matplotlib.org/contents.html

In [0]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from IPython.core.debugger import set_trace
from torch import nn
from torch import optim
from torch.utils.data import Dataset
from pprint import pprint




%matplotlib inline

# GPU Operations


---



Example: Set the device to the GPU when one is available, otherwise fall back to the CPU. GPU operations are usually faster for large tensors.

**Sunview cannot use GPU operations**, so don't use this unless we can run everything on Google's servers.

Docs: https://pytorch.org/docs/stable/notes/cuda.html
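
As a minimal sketch of the device selection described above: the tensor `example` is made up for illustration, and on a machine without CUDA this simply reports `cpu`.

```python
import torch

# Use the first CUDA GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Tensors (and models) are not moved automatically; .to(device) does it explicitly.
example = torch.ones(2, 3).to(device)
print(example.device)
```

Every tensor that takes part in one operation has to live on the same device, so the model and its inputs would both need the `.to(device)` call.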

In [0]:
#device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 
#device

# Dataset Importing

----

Each feature needs to be converted from a *numpy* object to a *torch* object to be used with PyTorch's library. They are mostly interchangeable.
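
A short sketch of the conversion in both directions, using made-up values rather than the real CSV data. `torch.from_numpy` shares memory with the source array, while `torch.tensor` makes a copy.

```python
import numpy as np
import torch

# Made-up usage values, only for illustration.
array = np.array([[12.0, 15.0], [92.0, 95.0]], dtype=np.float32)

shared = torch.from_numpy(array)  # shares memory with `array`
copied = torch.tensor(array)      # independent copy of the data
back = shared.numpy()             # back to NumPy, still sharing memory

# Writing to the NumPy array is visible through the shared tensor only.
array[0, 0] = 99.0
print(shared[0, 0].item(), copied[0, 0].item())
```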

Mount the Colab drive and put the CSV in the same folder as the program. On a Windows machine this process will be simpler, since it will just be a relative path.
  

In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

TrashingDataset accepts a CSV file as input and splits the data into two chunks: CPU/RAM usage and hard faults. These become two tensors for feature crossing, one (2xN) and one (1xN).

Scaling can be done to reduce the range the data actually covers. I'm not sure yet if this actually helps or not.
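
One way to picture the scaling step is the sketch below, which divides each column by its own maximum. The values are made up, and it assumes every column has a positive maximum; a column of zeros would produce a division by zero.

```python
import torch

# Made-up CPU/RAM usage rows, only for illustration.
usage = torch.tensor([[12.0, 15.0],
                      [92.0, 95.0],
                      [46.0, 19.0]])

# torch.max along dim 0 returns (values, indices); only the values are needed.
column_max, _ = torch.max(usage, dim=0)

# Broadcasting divides every row by the per-column maximum, so the
# largest entry in each column becomes exactly 1.
scaled = usage / column_max
print(scaled)
```

Keeping inputs in a small range such as 0 to 1 often helps gradient descent behave, but whether it helps on this dataset would still need to be checked.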

In [0]:
class TrashingDataset(Dataset):
  def __init__(self, csv_file):
    
    #Read in data, split into RAM/CPU usages VS hard faults
    self.dataframe = pd.read_csv(csv_file)
    
    # Use self.dataframe here: the global `dataset` does not exist yet during __init__
    self.faults = self.dataframe.iloc[:, 3:]
    self.usage = self.dataframe.iloc[:, 0:2]

    self.tensor_faults = torch.tensor(self.faults.values, dtype = torch.float)
    self.tensor_usage = torch.tensor(self.usage.values, dtype = torch.float)
    
    self.batch_size_faults = self.tensor_faults.size()[0]
    self.dimension_count_faults = self.tensor_faults.size()[1]

    self.batch_size_usage = self.tensor_usage.size()[0]
    self.dimension_count_usage = self.tensor_usage.size()[1]
    
#     self.scale()
  
  def __len__(self):
    return len(self.dataframe)
  
  def scale(self):
    
    #Scales down data
    usage_max, _ = torch.max(self.tensor_usage, 0)
    faults_max, _ = torch.max(self.tensor_faults, 0)
    self.tensor_usage = torch.div(self.tensor_usage, usage_max)  
    self.tensor_faults = torch.div(self.tensor_faults, faults_max)
    
  #WIP
#   def __getitem__(self, index):
#     item = self.dataframe.iloc[:,index]


with open('/content/gdrive/My Drive/Colab_Notebooks/stats.csv', 'r') as csv_file:
  dataset = TrashingDataset(csv_file)
  

X = dataset.tensor_usage
Y = dataset.tensor_faults
  
# print(dataset.tensor_faults)
# print(dataset.tensor_usage)


# dataset.dataframe

# fig, ax = plt.subplots()
# ax.plot(dataset.dataframe)

# Building Neural Nets

TO DO: Add in Relu layers, predict function

---

Every neural net in PyTorch has three core components:

**Model**: Defined by a class with, at minimum, \__init__() and forward() methods. This is where you actually build the graph your data will traverse.

**Loss Function**: This is how you measure how accurate your model's predictions are. If you have a line/model/etc. predicting where your data will fall, and a data point does not lie on it, the distance between that point and the prediction is called "loss". Minimizing this loss is the ultimate goal of ML. The simplest of these is MSE (mean squared error).

**Optimizer**: This is our gradient descent. [SGD = Stochastic Gradient Descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) is an iterative method that lets us incrementally optimize a differentiable function. The learning rate controls how large each step is. Rates that are too low take too long to converge; rates that are too high overshoot and fail to converge.
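
The three components can be sketched together in a few lines. This is an illustration only: it uses a single `nn.Linear` layer and made-up inputs and targets, not the notebook's model or data, and the learning rate of 0.01 is an arbitrary choice.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

model = nn.Linear(2, 1)                             # Model
criterion = nn.MSELoss()                            # Loss function
optimizer = optim.SGD(model.parameters(), lr=0.01)  # Optimizer

# Made-up inputs and targets, only for illustration.
inputs = torch.tensor([[0.1, 0.2], [0.9, 0.8]])
targets = torch.tensor([[0.0], [1.0]])

# MSE is the mean of the squared differences; nn.MSELoss computes the same value.
predictions = model(inputs)
loss_before = criterion(predictions, targets)
manual_mse = torch.mean((predictions - targets) ** 2)

# One gradient descent step: clear old gradients, backpropagate, update.
optimizer.zero_grad()
loss_before.backward()
optimizer.step()

loss_after = criterion(model(inputs), targets)
print(loss_before.item(), loss_after.item())
```

With a learning rate this small, a single step should lower the loss slightly; repeating the step many times is what the training loop does.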


We build this model using existing data, and then we want to know whether it's successful. This means we need both **Training Data** and **Test Data**. In both cases, we have data whose classification we already know. A classic example would be "Is this e-mail spam?", where we have an e-mail and its features (subject line, e-mail origin, percent caps in body) and have labeled whether or not it's spam. In our case, we have a set of features (CPU usage, page faults, etc.) and will be labeling whether or not the system is considered "thrashing" at that time. We feed the system this data, and it builds a model.

Then we expose it to our test data. This data should be similar to the training data, except we don't tell the model what it's classified as (spam/not spam, thrashing/not thrashing). This is how we determine whether the model was built correctly. 
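
A minimal sketch of holding out test data, assuming an 80/20 split and random stand-in tensors in place of the real usage and fault data. The names `features`, `labels`, and the `_train`/`_test` variables are hypothetical.

```python
import torch

torch.manual_seed(0)

# Random stand-ins for the real tensors: 10 samples, 2 features, 1 label each.
features = torch.rand(10, 2)
labels = torch.rand(10, 1)

# Shuffle the row indices, then cut them 80/20 (integer arithmetic gives 8).
permutation = torch.randperm(features.size(0))
split = (features.size(0) * 8) // 10
train_idx, test_idx = permutation[:split], permutation[split:]

X_train, Y_train = features[train_idx], labels[train_idx]
X_test, Y_test = features[test_idx], labels[test_idx]
print(X_train.shape, X_test.shape)
```

The model would be fitted on `X_train`/`Y_train` only, and its loss on `X_test`/`Y_test` is the number that says whether it generalizes.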

We have to be careful not to [overfit](https://www.investopedia.com/terms/o/overfitting.asp) to our data. Your model will always be amazing at predicting its own training data, but if you feed it the exact same data points when testing it, you're feeding your own confirmation bias. Remember, ML is essentially overcomplicated linear regression: you have points on a graph and are drawing a line to match them. If you test it using the same or nearly identical points it already had, you haven't learned anything about your model.

In [0]:
del model

In [0]:
device = torch.device("cpu")

class MyModel(nn.Module):
  def __init__(self):
    super(MyModel, self).__init__()
    
    self.input_dimensions = 2
    self.output_dimensions = 1
    self.hidden_layer = 3
    
    self.learning_rate = 0.000005
    self.training_iterations = 1000
    
    self.weight1 = torch.randn(self.input_dimensions, self.hidden_layer) #[2,3]
    self.weight2 = torch.randn(self.hidden_layer, self.output_dimensions) #[3,1]!
    
    self.linear_layer_1 = nn.Linear(self.input_dimensions, self.hidden_layer)
    self.linear_layer_2 = nn.Linear(self.hidden_layer, 1)
    
    
    
  def forward(self, x):
    h_relu = self.linear_layer_1(x)#.clamp(min=0)
    y_pred = self.linear_layer_2(h_relu)
    output = self.sigmoid(y_pred)
    return output
  
#   def train(self, X, Y):
#     output = self.forward(X)
#     self.backward
    
  
  def sigmoid(self, x):
    return 1 / (1 + torch.exp(-x))
  
  #Returns a percentage from 0 to 1, how "likely" %
  def predict(self, inputs):
    weight = list(self.linear_layer_1.parameters())
    self.weight = weight
    self.weight_data = weight[0]
    print(type(inputs))
    print(type(self.weight))
    print(type(self.weight_data))
#     input1 = np.dot(inputs, weight)
#     return self.sigmoid()

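  def save_weights(self, model):
    # Save only the parameters (state_dict) rather than pickling the whole object
    torch.save(model.state_dict(), "neural_net")
  
  def load_weights(self):
    # Load the saved parameters back into this model; the result was previously discarded
    self.load_state_dict(torch.load("neural_net"))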

# model = MyModel()
# model(X)




model = MyModel()

# Build the loss and optimizer once, outside the loop
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=model.learning_rate)

for i in range(model.training_iterations):
    model.train()
    optimizer.zero_grad()  # clear gradients left over from the previous iteration
    Y_next = model(X)
    loss = criterion(Y_next, Y)
    loss.backward()  # a scalar loss needs no argument here
    optimizer.step()  # updates the model parameters

    model.eval()
    with torch.no_grad(): 
      y_next = model(X)
#     print ("#" + str(i) + " Loss: " + str(torch.mean((Y - Y_next)**2).detach().item()))  # mean sum squared loss

#     model.train(X, Y)

test_tensor_usage = torch.FloatTensor([12, 15])
test_tensor_faults = torch.FloatTensor([146])

# usage_max, _ = torch.max(test_tensor_usage, 0)
# faults_max, _ = torch.max(test_tensor_faults, 0)
# test_tensor_usage = torch.div(test_tensor_usage, usage_max) 
# test_tensor_faults = torch.div(test_tensor_faults, faults_max) 


model.forward(test_tensor_usage)

  
# model.save_weights(model)
# model.predict()

# # x = dataset.tensor.to(device)
# x = dataset.tensor.float()
# type(x)

# Training Neural Nets

TO DO: Training needs to be part of the model itself

----

WIP
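
One possible shape for this TO DO is sketched below. The class name `TrainableModel`, the method name `fit`, the default learning rate, and the demo data are all hypothetical, and the ReLU in `forward` is an assumption about the planned layers. The method is called `fit` rather than `train` because `nn.Module` already defines `train()` for switching modes, and overriding it would break `model.train()`.

```python
import torch
from torch import nn, optim


class TrainableModel(nn.Module):  # hypothetical name, not from the notebook
    def __init__(self, input_dimensions=2, hidden_layer=3):
        super().__init__()
        self.linear_layer_1 = nn.Linear(input_dimensions, hidden_layer)
        self.linear_layer_2 = nn.Linear(hidden_layer, 1)

    def forward(self, x):
        hidden = torch.relu(self.linear_layer_1(x))
        return torch.sigmoid(self.linear_layer_2(hidden))

    def fit(self, X, Y, learning_rate=0.1, iterations=200):
        """Full-batch gradient descent; returns the loss at every iteration."""
        optimizer = optim.SGD(self.parameters(), lr=learning_rate)
        criterion = nn.MSELoss()
        history = []
        self.train()
        for _ in range(iterations):
            optimizer.zero_grad()
            loss = criterion(self(X), Y)
            loss.backward()
            optimizer.step()
            history.append(loss.item())
        self.eval()
        return history


# Made-up demo data: label is 1 when the two features sum to more than 1.
torch.manual_seed(0)
X_demo = torch.rand(16, 2)
Y_demo = (X_demo.sum(dim=1, keepdim=True) > 1.0).float()

sketch = TrainableModel()
losses = sketch.fit(X_demo, Y_demo)
print(losses[0], losses[-1])
```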


In [0]:
model.train()
optimizer.zero_grad()

y_next = model(X)
y_next.size()
X.size()  # `x` is never defined; the input tensor is `X`
# loss = criterion(y_next.squeeze(1), x) #Squeeze prevents mismatch error 

# loss.backward(loss) 
# optimizer.step() #updates params of lin reg model 

# model.eval()
# with torch.no_grad(): 
#   y_next = model(x)
  

# fig, ax = plt.subplots()
# ax.plot(x.cpu().numpy(), y_next.cpu().numpy(), ".", label = "pred")
# # ax.plot(x.cpu().numpy(), y.cpu().numpy(), ".", label = "data")
# ax.set_title(f"MSE: {loss.item():0.1f}")
# ax.legend();

In [0]:

test_list = torch.FloatTensor([92, 95, 2, 146])

type(test_list)

model(test_list[:2])  # the model takes only the two usage features, so pass the first two values

# print("TEST")
# print(model.weight[0])
# print("TEST")