<a href="https://www.kaggle.com/code/datascientistsohail/pytorch-neural-network-se03ep05?scriptVersionId=118276024" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

### PyTorch Neural Network Implementation for Wine Quality Prediction

PyTorch is a deep learning framework for building and training neural networks. Neural networks are machine learning models loosely inspired by the structure of the human brain, and they are used for a wide range of tasks, including image classification, speech recognition, and natural language processing.

Implementing a neural network for wine quality prediction starts with preparing the data: splitting it into training and validation sets and converting it into tensors, the basic data structure in PyTorch. The inputs should also be normalized so that features with larger numeric ranges do not dominate training.
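As a minimal sketch of the normalization step, the snippet below standardizes two toy splits using statistics computed on the training split only. The tensors and their names (`X_train`, `X_valid`) are made-up stand-ins, not the wine data, and computing the statistics on the training split alone is an assumption of this sketch rather than what the notebook cell further down does.

```python
import torch

torch.manual_seed(0)

# Toy feature matrices standing in for the wine data (values are made up).
X_train = torch.rand(8, 3) * 10
X_valid = torch.rand(4, 3) * 10

# Compute per-feature statistics on the training split only.
mean = X_train.mean(dim=0)
std = X_train.std(dim=0)

# Apply the same statistics to both splits so they share one scale.
X_train_norm = (X_train - mean) / std
X_valid_norm = (X_valid - mean) / std

print(X_train_norm.mean(dim=0))  # close to 0 for every feature
print(X_train_norm.std(dim=0))   # close to 1 for every feature
```

Reusing the training statistics for the validation split keeps information about held-out rows out of the preprocessing.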

Next, define the structure of the neural network using PyTorch's nn module. This typically involves creating a subclass of nn.Module and defining the forward pass of the network, which takes inputs, passes them through a series of layers, and returns the final outputs.

Once the network structure is defined, define a loss function and an optimizer. The loss function measures how well the network is able to predict the target values, and the optimizer updates the network's parameters to minimize the loss.
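To illustrate how the loss function and optimizer interact, here is a small sketch of a single optimization step. The toy data and the one-layer `toy_model` are hypothetical and chosen only to keep the example short. `nn.MSELoss` and `optim.Adam` are the same classes the notebook uses later.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical toy problem: 16 samples, 4 features, one regression target.
inputs = torch.randn(16, 4)
targets = torch.randn(16, 1)

toy_model = nn.Linear(4, 1)
criterion = nn.MSELoss()
optimizer = optim.Adam(toy_model.parameters(), lr=0.01)

weights_before = toy_model.weight.detach().clone()

optimizer.zero_grad()                          # clear gradients from earlier steps
loss = criterion(toy_model(inputs), targets)   # forward pass and loss
loss.backward()                                # backpropagate to fill .grad
optimizer.step()                               # update parameters from the gradients

weights_after = toy_model.weight.detach().clone()
print("loss before the step:", loss.item())
print("weights changed:", not torch.equal(weights_before, weights_after))
```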

Finally, train the network by iterating over the training data and updating the parameters using backpropagation and the optimizer. The performance of the network is evaluated using cross-validation and a performance metric such as Cohen's Kappa score.
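The metric can be illustrated with a short sketch. With `weights='quadratic'`, as used in the evaluation cell further down, Cohen's Kappa penalizes a prediction more heavily the further it lands from the true label. The label lists below are invented for illustration.

```python
from sklearn.metrics import cohen_kappa_score

# Invented quality labels, for illustration only.
y_true = [5, 5, 6, 6, 7, 7]
near_miss = [5, 6, 6, 6, 7, 6]   # two predictions off by one class
far_miss = [5, 7, 6, 6, 7, 5]    # two predictions off by two classes

perfect = cohen_kappa_score(y_true, y_true, weights='quadratic')
near = cohen_kappa_score(y_true, near_miss, weights='quadratic')
far = cohen_kappa_score(y_true, far_miss, weights='quadratic')

print("perfect agreement:", perfect)  # 1.0
print("off by one:", near)
print("off by two:", far)             # lower than the off-by-one score
```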

This is a high-level overview of the process of building and training a neural network in PyTorch for wine quality detection. 

**Loading the important packages/libraries**

In [1]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import KFold


import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

/kaggle/input/playground-series-s3e5/sample_submission.csv
/kaggle/input/playground-series-s3e5/train.csv
/kaggle/input/playground-series-s3e5/test.csv


In [2]:
df = pd.read_csv('/kaggle/input/playground-series-s3e5/train.csv')
df_test = pd.read_csv('/kaggle/input/playground-series-s3e5/test.csv')
submission = pd.read_csv('/kaggle/input/playground-series-s3e5/sample_submission.csv')

In [3]:
print(df.shape)
print(df_test.shape)

(2056, 13)
(1372, 12)


**Checking for missing or null values**

In [4]:
print('Missing values if any in train set: ', df.isnull().sum().sum())
print('Missing values if any in test set: ', df_test.isnull().sum().sum())

Missing values if any in train set:  0
Missing values if any in test set:  0


In [5]:
useful_features = [c for c in df.columns if c not in ['Id', 'quality']]
print(useful_features)
print('Total number of useful features: ', len(useful_features))

['fixed acidity', 'volatile acidity', 'citric acid', 'residual sugar', 'chlorides', 'free sulfur dioxide', 'total sulfur dioxide', 'density', 'pH', 'sulphates', 'alcohol']
Total number of useful features:  11


**Dividing the Dataset into X and y.**

In [6]:
X = df[useful_features]
y = df['quality']
X_test = df_test[useful_features]
del df
del df_test

**PyTorch Neural Network for Wine Quality**

In [7]:
# Define the neural network class
class WineQualityNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(11, 128)
        self.fc2 = nn.Linear(128, 128)
        self.fc3 = nn.Linear(128, 1)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

**Training and cross-validating the PyTorch model on the training set**

In [8]:
# Convert data to tensors and normalize the inputs
X_tensor = torch.tensor(X.values, dtype=torch.float32)
X_tensor = (X_tensor - X_tensor.mean(dim=0)) / X_tensor.std(dim=0)

X_test_tensor = torch.tensor(X_test.values, dtype=torch.float32)
# Subtract the mean first, then divide by the standard deviation
X_test_tensor = (X_test_tensor - X_test_tensor.mean(dim=0)) / X_test_tensor.std(dim=0)

y_tensor = torch.tensor(y.values, dtype=torch.float32).unsqueeze(-1)

# Initialize the network, loss function, and optimizer
y_preds = []
model = WineQualityNet()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Define the training loop
num_epochs = 1000
for epoch in range(num_epochs):
    # Zero the gradients
    optimizer.zero_grad()
    # Forward pass
    output = model(X_tensor)
    # Compute the loss
    loss = criterion(output, y_tensor)
    # Backward pass
    loss.backward()
    # Update the parameters
    optimizer.step()

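# Implement cross-validation
# Note: the model was already trained on the full dataset above, and each fold
# only adds one more gradient step, so the validation rows have been seen during
# training and the scores below are optimistic.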
kf = KFold(n_splits=5)
cohen_kappa_scores = []
for train_index, test_index in kf.split(X_tensor):
    X_train, X_valid = X_tensor[train_index], X_tensor[test_index]
    y_train, y_valid = y_tensor[train_index], y_tensor[test_index]
    model.train()
    optimizer.zero_grad()
    output = model(X_train)
    loss = criterion(output, y_train)
    loss.backward()
    optimizer.step()
    model.eval()
    with torch.no_grad():
        y_pred = model(X_valid).squeeze(-1)
        y_pred = y_pred.round().clamp(0, 10)
        y_valid = y_valid.squeeze(-1)
    cohen_kappa_scores.append(cohen_kappa_score(y_valid, y_pred, weights='quadratic'))
print("Test Cohen's Kappa Score:", cohen_kappa_scores)

Test Cohen's Kappa Score: [0.8301432303729919, 0.8409428734772857, 0.8632003994148922, 0.8096768308940123, 0.8406265185614222]
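The loop above reuses a model that has already been fitted on every row, so its fold scores do not measure performance on unseen data. One possible alternative, sketched here, is to build a fresh model and optimizer inside each fold and train only on that fold's training rows. The synthetic data, the smaller network returned by the hypothetical `make_model` helper, the shuffled `KFold`, and the 200 training steps are all assumptions made to keep the example self-contained. The scores it prints say nothing about the wine data.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import KFold

torch.manual_seed(0)

# Synthetic stand-in data (not the competition data): 200 rows, 11 features,
# integer targets between 3 and 8.
X_all = torch.randn(200, 11)
y_all = torch.randint(3, 9, (200, 1)).float()

def make_model():
    # Hypothetical small network; layer sizes are illustrative only.
    return nn.Sequential(nn.Linear(11, 32), nn.ReLU(), nn.Linear(32, 1))

fold_scores = []
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, valid_idx in kf.split(X_all):
    X_tr, X_va = X_all[train_idx], X_all[valid_idx]
    y_tr, y_va = y_all[train_idx], y_all[valid_idx]

    # Normalize with statistics from the training fold only.
    mean, std = X_tr.mean(dim=0), X_tr.std(dim=0)
    X_tr, X_va = (X_tr - mean) / std, (X_va - mean) / std

    # Fresh model and optimizer, so nothing carries over between folds.
    fold_model = make_model()
    fold_optimizer = optim.Adam(fold_model.parameters(), lr=0.01)
    criterion = nn.MSELoss()

    fold_model.train()
    for _ in range(200):
        fold_optimizer.zero_grad()
        loss = criterion(fold_model(X_tr), y_tr)
        loss.backward()
        fold_optimizer.step()

    # Score on rows this fold's model never saw during training.
    fold_model.eval()
    with torch.no_grad():
        preds = fold_model(X_va).squeeze(-1).round().clamp(0, 10)
    fold_scores.append(cohen_kappa_score(
        y_va.squeeze(-1).numpy().astype(int),
        preds.numpy().astype(int),
        weights='quadratic'))

print("Validation kappa per fold:", fold_scores)
```

Because the targets here are random, the fold scores should hover around zero. On real data the same structure gives an estimate that is not inflated by training on the validation rows.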


**Prediction with the PyTorch model on the test set**

In [9]:
model.eval()
with torch.no_grad():
    y_test_pred = model(X_test_tensor).squeeze(-1)
    y_test_pred = y_test_pred.round().clamp(0, 10)


In [10]:
print(y_test_pred.shape)
print(submission.shape)

torch.Size([1372])
(1372, 2)


**Submission**

In [11]:
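# Convert the tensor to a NumPy integer array so the column holds whole-number labels
submission['quality'] = y_test_pred.numpy().astype(int)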
submission.to_csv('submission.csv',index = False)