# Develop an NN Solution 🧠

Apply your PyTorch skills to the problem of optimized battery charging (OBC): train a simple NN to predict the `duration` for which a user will plug in and charge their device in the future, based on past data. 

At this point, you are expected to have two featurized battery-charging datasets: training data and test data. In this notebook, you should load that data and use it to train and evaluate a simple neural network. 

This neural network does not need to be your best solution, just a proof-of-concept. 

> To know whether or not you are on the right track, aim for a test RMSE of around 3 hours. 

RMSE is the root mean squared error: the square root of the average squared difference between the predicted and actual values. A test RMSE of 3 hours roughly means that a typical error in predicting the duration of a plug-in charge event is around 3 hours. The ideal is zero error, but being within a few hours of the correct duration is still very helpful in deciding when to pause and resume charging, so that battery lifespan can be increased without sacrificing the user experience! 
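To make the metric concrete, here is a minimal worked example. The four predicted and actual durations are made-up numbers for illustration, not values from the battery dataset.

```python
import torch

# Hypothetical durations in hours, made up for illustration only.
predicted = torch.tensor([2.0, 5.0, 9.0, 1.0])
actual = torch.tensor([1.0, 8.0, 6.0, 1.0])

# Mean squared error: the average of the squared differences.
mse = torch.mean((predicted - actual) ** 2)

# RMSE: the square root of the MSE, which puts the error back in hours.
rmse = torch.sqrt(mse)

# Mean absolute error, shown only for comparison.
mae = torch.mean(torch.abs(predicted - actual))

print(f"MSE:  {mse.item():.2f}")   # 4.75
print(f"RMSE: {rmse.item():.2f}")  # 2.18
print(f"MAE:  {mae.item():.2f}")   # 1.75
```

The RMSE (2.18) is larger than the mean absolute error (1.75) because squaring gives the two 3-hour misses more weight than the 1-hour miss.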

### Your tasks
To create an NN-based solution to OBC, complete the following tasks:

**Part 1: Loading Custom Data**

1. Load in your train/test battery datasets
2. Create DataLoaders for those datasets

**Part 2: Training and Evaluating a Simple NN**

3. Define and train a simple NN
4. Evaluate your NN on some test data, recording the resultant RMSE
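As a rough illustration of task 3, the sketch below defines and trains a small fully connected network. Everything in it is an assumption made for the example: the data is random stand-in data with 8 features per row, and the layer sizes, optimizer, learning rate, and epoch count are arbitrary choices, not recommended settings.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Stand-in data (assumption): 256 random rows with 8 features and one target.
x = torch.randn(256, 8)
y = x.sum(dim=1, keepdim=True)  # synthetic target so there is something to learn
loader = DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

# A small fully connected network; the layer sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

epoch_losses = []
for epoch in range(5):
    running_loss = 0.0
    for features, target in loader:
        optimizer.zero_grad()                      # clear old gradients
        loss = criterion(model(features), target)  # batch-mean MSE
        loss.backward()                            # backpropagate
        optimizer.step()                           # update the weights
        running_loss += loss.item() * features.size(0)
    # Average the loss over all samples seen in this epoch.
    epoch_losses.append(running_loss / len(loader.dataset))
    print(f"epoch {epoch + 1}: train MSE = {epoch_losses[-1]:.4f}")
```

With the real data, `loader` would be replaced by the training DataLoader, and the first `nn.Linear` layer would need as many inputs as there are feature columns.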

You may be wondering: *How do I submit this in parts?* 

> You will be expected to submit this notebook **twice** for grading: once when you've completed part one, and once when you've completed the entire baseline solution (part two). 

**Hint**: It may be helpful to reference the codebase for Household Power Prediction, which you saw a while ago. It contains several helper functions for training and testing a model, including one that converts a typical MSE value into an RMSE value. 
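The Household Power Prediction helpers are not reproduced in this notebook, so the function below is a hypothetical stand-in rather than that codebase's code. It sketches one way to turn an MSE loss into a test RMSE: sum the squared errors over every batch, divide by the number of target values, and take the square root once at the end. The name `evaluate_rmse` and the random stand-in data are assumptions.

```python
import math

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate_rmse(model, loader):
    """Return the RMSE of `model` over every sample in `loader` (hypothetical helper)."""
    # reduction="sum" lets batches of different sizes be combined correctly.
    criterion = nn.MSELoss(reduction="sum")
    model.eval()
    total_squared_error = 0.0
    n_values = 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for features, target in loader:
            total_squared_error += criterion(model(features), target).item()
            n_values += target.numel()
    return math.sqrt(total_squared_error / n_values)


# Usage on random stand-in data (assumption: 8 features, 1 target).
torch.manual_seed(0)
x = torch.randn(100, 8)
y = torch.randn(100, 1)
model = nn.Linear(8, 1)  # untrained placeholder model
rmse = evaluate_rmse(model, DataLoader(TensorDataset(x, y), batch_size=32))
print(f"RMSE: {rmse:.4f}")
```

Taking the square root once over the whole dataset avoids the bias that comes from averaging per-batch RMSE values, which matters here because the last batch holds only 4 of the 100 samples.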

In [1]:
# Import necessary libraries
import torch
from torch.utils.data import Dataset, DataLoader, random_split
import pandas as pd

In [27]:
# Create a custom Dataset class (the DataLoaders are built from it later)
class OBC_Dataset(Dataset):
    def __init__(self, filename):
        # Read-In DataFrame
        df = pd.read_csv(filename)

        # Separate features and target (target = last column)
        input_features = df.iloc[:, :-1].values.astype(dtype = 'float32')
        target = df.iloc[:, -1:].values.astype(dtype = 'float32')

        # Convert Features/Target into tensor dtype
        self.x = torch.tensor(input_features, dtype = torch.float32)
        self.y = torch.tensor(target, dtype = torch.float32)

    def __len__(self):
        # Required __len__ method: number of samples in the dataset
        return len(self.y)

    def __getitem__(self, index):
        # Required __getitem__ method: fetch one (features, target) pair
        return self.x[index], self.y[index]

    def split_data(self, n_test):
        # Randomly split into train and test sets; n_test is the test fraction (e.g. 0.2)
        test_size = round(n_test * len(self.x))
        train_size = len(self.x) - test_size
        return random_split(self, [train_size, test_size])

In [34]:
# Load the full dataset from CSV (an OBC_Dataset, not a pandas DataFrame, despite the name `df`)
df = OBC_Dataset("data/engineered_dataset.csv")

In [35]:
# Split Into Train-Test Split
train, test = df.split_data(n_test = 0.2)

In [37]:
# Examine Train Length to Ensure Validity
len(train)

1032899

In [38]:
# Examine Test Length to Ensure Validity
len(test)

258225
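The split above is random, so different rows land in the test set on each run, which makes RMSE values hard to compare between runs. A minimal sketch of a reproducible variant follows, assuming a fixed seed is acceptable. The function `seeded_split`, the seed value, and the 10-row stand-in dataset are hypothetical; `random_split` itself accepts a `generator` argument.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Tiny stand-in dataset (assumption): 10 rows, 1 feature, 1 target.
dataset = TensorDataset(torch.arange(10).float().unsqueeze(1), torch.zeros(10, 1))


def seeded_split(dataset, n_test, seed=42):
    """Split `dataset` into train/test subsets, reproducibly (hypothetical helper)."""
    test_size = round(n_test * len(dataset))
    train_size = len(dataset) - test_size
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [train_size, test_size], generator=generator)


train_a, test_a = seeded_split(dataset, n_test=0.2)
train_b, test_b = seeded_split(dataset, n_test=0.2)

print(len(train_a), len(test_a))           # 8 2
print(test_a.indices == test_b.indices)    # True: same seed, same split
```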

In [39]:
# Examine Input Features & Target of Sample
index = 2
features, target = train[index]
print("Features at position: ", index, ":", features)
print("\n\nTarget at position: ", index, ":", target)

Features at position:  2 : tensor([ 1.0000,  0.0000, 15.0000,  0.0000,  0.0000, 12.0000, 22.6158, 20.0000])


Target at position:  2 : tensor([0.])


In [40]:
# Add to Data Loaders
train_loader = DataLoader(train, batch_size = 32, shuffle = True)
test_loader = DataLoader(test, batch_size = 32)
# ------- END OF PART ONE ------- #
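Before moving on to training, it can help to pull a single batch from a loader and check its shapes. The sketch below uses random stand-in data with 8 features and 1 target per row, matching the sample printed earlier. The stand-in tensors are an assumption, and the same check should work on `train_loader`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-in data (assumption): 100 rows, 8 features, 1 target.
stand_in = TensorDataset(torch.randn(100, 8), torch.randn(100, 1))
loader = DataLoader(stand_in, batch_size=32, shuffle=True)

# Fetch one batch and inspect it.
features, target = next(iter(loader))
print(features.shape)  # torch.Size([32, 8])
print(target.shape)    # torch.Size([32, 1])
print(len(loader))     # 4 batches: three of 32 samples and one of 4
```

The target keeps a trailing dimension of 1 because the dataset slices the last column with `-1:`, so the model's output layer should also produce shape `[batch_size, 1]` for the loss to compare like with like.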