# Spaceship Titanic Kaggle Competition

## Dataset Description
In this competition, your task is to predict whether a passenger was transported to an alternate dimension during the Spaceship Titanic's collision with the spacetime anomaly. To help you make these predictions, you're given a set of personal records recovered from the ship's damaged computer system.

## File and Data Field Descriptions
### `train.csv` 
Personal records for about two-thirds (~8700) of the passengers, to be used as training data.

`PassengerId` - A unique Id for each passenger. Each Id takes the form gggg_pp where gggg indicates a group the passenger is travelling with and pp is their number within the group. People in a group are often family members, but not always.

`HomePlanet` - The planet the passenger departed from, typically their planet of permanent residence.

`CryoSleep` - Indicates whether the passenger elected to be put into suspended animation for the duration of the voyage. Passengers in cryosleep are confined to their cabins.

`Cabin` - The cabin number where the passenger is staying. Takes the form deck/num/side, where side can be either P for Port or S for Starboard.

`Destination` - The planet the passenger will be debarking to.

`Age` - The age of the passenger.

`VIP` - Whether the passenger has paid for special VIP service during the voyage.

`RoomService`, `FoodCourt`, `ShoppingMall`, `Spa`, `VRDeck` - Amount the passenger has billed at each of the Spaceship Titanic's many luxury amenities.

`Name` - The first and last names of the passenger.

`Transported` - Whether the passenger was transported to another dimension. This is the target, the column you are trying to predict.

### `test.csv` 
Personal records for the remaining one-third (~4300) of the passengers, to be used as test data. Your task is to predict the value of Transported for the passengers in this set.

### `sample_submission.csv` 
A submission file in the correct format.

`PassengerId` - Id for each passenger in the test set.

`Transported` - The target. For each passenger, predict either True or False.
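
As a rough illustration of the submission format described above, the sketch below builds a two-column frame and serialises it without the index. The passenger IDs and probabilities are hypothetical placeholders, and the 0.5 threshold is an assumption rather than something the competition prescribes.

```python
import pandas as pd

# Hypothetical IDs and predicted probabilities, for illustration only
passenger_ids = ["0013_01", "0018_01"]
probabilities = [0.8, 0.3]

# The submission needs a boolean 'Transported' column next to 'PassengerId'
submission = pd.DataFrame({
    "PassengerId": passenger_ids,
    "Transported": [p >= 0.5 for p in probabilities],
})

# index=False keeps the pandas row index out of the file
csv_text = submission.to_csv(index=False)
print(csv_text)
```

Writing to disk would use the same call with a file path, for example `submission.to_csv('submission.csv', index=False)`.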

### Imports

In [43]:
import torch
import pandas as pd
import numpy as np
import time
from sklearn.preprocessing import LabelEncoder
import torch.nn as nn
import torch.nn.functional as F
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from torch.utils.tensorboard import SummaryWriter

In [44]:
writer = SummaryWriter(f'runs/exp-{time.time()}')

# Define the CSV file path
train_file = 'train.csv' 
test_file = 'test.csv' 

# Read the CSV file into a pandas DataFrame
train_df = pd.read_csv(train_file)
test_df = pd.read_csv(test_file)

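# Cast the flag columns to bool. Note that astype(bool) maps missing values (NaN) to True,
# so gaps in these columns can no longer be detected with isnull() afterwards.
train_df['CryoSleep'] = train_df['CryoSleep'].astype(bool)
train_df['VIP'] = train_df['VIP'].astype(bool)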

train_df.head()

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False
3,0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False
4,0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True


### Preprocessing 

Add a field called `GroupId` to the data. This is the first four characters of the `PassengerId` field (the `gggg` part), converted to an integer.

In [45]:
train_df['GroupId'] = train_df['PassengerId'].str[:4].astype(int)

# Display the updated DataFrame
train_df.head()

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported,GroupId
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False,1
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True,2
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False,3
3,0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False,3
4,0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True,4
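
The `gggg_pp` layout can also be split on the underscore, which yields the group and the passenger's number within the group in one step. This is a minimal sketch on three IDs taken from the rows shown above; the variable names are illustrative and not part of the notebook.

```python
import pandas as pd

# Sample IDs in the gggg_pp format
ids = pd.Series(["0001_01", "0003_01", "0003_02"])

# Split once on '_' and convert both halves to integers
parts = ids.str.split("_", expand=True)
group_id = parts[0].astype(int)
number_in_group = parts[1].astype(int)

print(group_id.tolist(), number_in_group.tolist())
```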


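Infer `CryoSleep` from spending. Where `CryoSleep` is missing and `RoomService`, `FoodCourt`, `ShoppingMall`, `Spa`, and `VRDeck` are all 0, set it to true. Where any of them is greater than 0, set it to false, since passengers in cryosleep are confined to their cabins.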

In [46]:
# Create a boolean condition to check if 'CryoSleep' is null and all specified columns are 0
condition = (train_df['CryoSleep'].isnull()) & (train_df['RoomService'] == 0) & (train_df['FoodCourt'] == 0) & (train_df['ShoppingMall'] == 0) & (train_df['Spa'] == 0) & (train_df['VRDeck'] == 0)

# Set 'CryoSleep' to True where the condition is True
train_df.loc[condition, 'CryoSleep'] = True

# Create a boolean condition to check if any of the specified columns are greater than 0
condition = (train_df['RoomService'] > 0) | (train_df['FoodCourt'] > 0) | (train_df['ShoppingMall'] > 0) | (train_df['Spa'] > 0) | (train_df['VRDeck'] > 0)

# Set 'CryoSleep' to False where the condition is True
train_df.loc[condition, 'CryoSleep'] = False


# Display the updated DataFrame
train_df.head()

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported,GroupId
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False,1
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True,2
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False,3
3,0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False,3
4,0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True,4
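
One thing worth checking in this step is the order of operations: if a column has already been cast with `astype(bool)`, missing values have become `True` and an `isnull()` test finds nothing to fill. The sketch below shows this on a toy frame with made-up values and only two spending columns, then applies the same inference before casting. It is an illustration under those assumptions, not the notebook's exact pipeline.

```python
import numpy as np
import pandas as pd

# Toy data: two passengers with a missing CryoSleep flag
toy = pd.DataFrame({
    "CryoSleep": [True, False, np.nan, np.nan],
    "RoomService": [0.0, 120.0, 0.0, 35.0],
    "Spa": [0.0, 0.0, 0.0, 10.0],
})

# Casting first hides the gaps: NaN is truthy, so it becomes True
cast_first = toy["CryoSleep"].astype(bool)
n_missing_after_cast = int(cast_first.isnull().sum())

# Inferring from spending before the cast keeps the missing rows visible
spend_cols = ["RoomService", "Spa"]
no_spend = (toy[spend_cols] == 0).all(axis=1)
any_spend = (toy[spend_cols] > 0).any(axis=1)

inferred = toy["CryoSleep"].copy()
inferred[inferred.isnull() & no_spend] = True   # no spending at all: assume asleep
inferred[any_spend] = False                     # any spending: must be awake
inferred = inferred.astype(bool)

print(cast_first.tolist(), inferred.tolist())
```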


If the `Cabin` field is missing, copy it from another row with the same `GroupId` whose `Cabin` is known.

In [47]:
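# Define a function to copy 'Cabin' from another row with the same 'GroupId'
def copy_cabin(row):
    if pd.isna(row['Cabin']):
        # Only consider group members whose 'Cabin' is known; otherwise the
        # first match could be the row itself (or another row with a missing cabin)
        same_group = train_df[(train_df['GroupId'] == row['GroupId']) & train_df['Cabin'].notna()]
        if not same_group.empty:
            return same_group.iloc[0]['Cabin']
    return row['Cabin']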

# Apply the function to fill missing 'Cabin' values
train_df['Cabin'] = train_df.apply(copy_cabin, axis=1)

train_df.head()

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported,GroupId
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False,1
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True,2
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False,3
3,0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False,3
4,0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True,4
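
A vectorised alternative to the row-wise `apply` is sketched below on toy data. It relies on `groupby(...).transform('first')`, which returns the first non-null value in each group, and should be noticeably faster on the full table. The toy values are made up; groups with no known cabin stay missing.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "GroupId": [1, 1, 2, 3, 3],
    "Cabin": [np.nan, "B/0/P", np.nan, "F/1/S", np.nan],
})

# First non-null cabin per group, broadcast back to every row of that group
group_cabin = toy.groupby("GroupId")["Cabin"].transform("first")

# Only rows with a missing cabin are changed
toy["Cabin"] = toy["Cabin"].fillna(group_cabin)

print(toy)
```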


Split field `Cabin` into `CabinDeck`, `CabinNum` and `CabinSide`.

In [48]:
# Split the 'Cabin' column into three new columns: 'CabinDeck', 'CabinNum', and 'CabinSide'
# (raw string so that \d is not treated as an invalid string escape)
train_df[['CabinDeck', 'CabinNum', 'CabinSide']] = train_df['Cabin'].str.extract(r'([A-Za-z]+)/(\d+)/([A-Za-z]+)')

# Display the updated DataFrame
train_df.head()

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported,GroupId,CabinDeck,CabinNum,CabinSide
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False,1,B,0,P
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True,2,F,0,S
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False,3,A,0,S
3,0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False,3,A,0,S
4,0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True,4,F,1,S
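
Because the format is always `deck/num/side`, splitting on `/` is an equivalent option. The sketch below uses toy values and also converts the cabin number to a numeric type, which is an optional extra step the notebook does not take (it keeps `CabinNum` as a string and label-encodes it later).

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({"Cabin": ["B/0/P", "F/12/S", np.nan]})

# expand=True returns one column per piece; a missing cabin stays missing in all three
parts = toy["Cabin"].str.split("/", expand=True)
parts.columns = ["CabinDeck", "CabinNum", "CabinSide"]

# Optional: treat the cabin number as a number rather than a string
parts["CabinNum"] = pd.to_numeric(parts["CabinNum"])

print(parts)
```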


Remove the fields `HomePlanet`, `Destination`, `Cabin`, and `Name`.

In [49]:
# List of columns to remove
columns_to_remove = ['HomePlanet', 'Destination', 'Cabin', 'Name']

# Drop the specified columns from the DataFrame
train_df = train_df.drop(columns=columns_to_remove)

# Display the updated DataFrame
train_df.head()

Unnamed: 0,PassengerId,CryoSleep,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Transported,GroupId,CabinDeck,CabinNum,CabinSide
0,0001_01,False,39.0,False,0.0,0.0,0.0,0.0,0.0,False,1,B,0,P
1,0002_01,False,24.0,False,109.0,9.0,25.0,549.0,44.0,True,2,F,0,S
2,0003_01,False,58.0,True,43.0,3576.0,0.0,6715.0,49.0,False,3,A,0,S
3,0003_02,False,33.0,False,0.0,1283.0,371.0,3329.0,193.0,False,3,A,0,S
4,0004_01,False,16.0,False,303.0,70.0,151.0,565.0,2.0,True,4,F,1,S


If `CryoSleep` is true, set `RoomService`, `FoodCourt`, `ShoppingMall`, `Spa`, and `VRDeck` to 0.
Fill any remaining missing values in these columns with the column mean over passengers who were not in cryosleep (zeros included, truncated to an integer).

In [50]:
spend_cols = ['RoomService', 'FoodCourt', 'ShoppingMall', 'Spa', 'VRDeck']

# Set the spending columns to 0 where 'CryoSleep' is true
train_df.loc[train_df['CryoSleep'], spend_cols] = 0

# Mean spend per amenity among passengers who were not in cryosleep, truncated to an integer
awake_means = train_df.loc[~train_df['CryoSleep'], spend_cols].mean().astype(int)

# Fill missing values with the calculated means (assigning the result back
# avoids relying on an in-place fillna on a column selection)
train_df[spend_cols] = train_df[spend_cols].fillna(awake_means)

# Display the updated DataFrame
train_df.head()

Unnamed: 0,PassengerId,CryoSleep,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Transported,GroupId,CabinDeck,CabinNum,CabinSide
0,0001_01,False,39.0,False,0.0,0.0,0.0,0.0,0.0,False,1,B,0,P
1,0002_01,False,24.0,False,109.0,9.0,25.0,549.0,44.0,True,2,F,0,S
2,0003_01,False,58.0,True,43.0,3576.0,0.0,6715.0,49.0,False,3,A,0,S
3,0003_02,False,33.0,False,0.0,1283.0,371.0,3329.0,193.0,False,3,A,0,S
4,0004_01,False,16.0,False,303.0,70.0,151.0,565.0,2.0,True,4,F,1,S


Replace missing ages with the mean age, then convert `Age` to whole years.

In [51]:
mean_age = train_df['Age'].mean()

# Fill missing values with the calculated mean, then truncate to an integer
train_df['Age'] = train_df['Age'].fillna(mean_age).astype(int)

# Display the updated DataFrame
train_df.head()

Unnamed: 0,PassengerId,CryoSleep,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Transported,GroupId,CabinDeck,CabinNum,CabinSide
0,0001_01,False,39,False,0.0,0.0,0.0,0.0,0.0,False,1,B,0,P
1,0002_01,False,24,False,109.0,9.0,25.0,549.0,44.0,True,2,F,0,S
2,0003_01,False,58,True,43.0,3576.0,0.0,6715.0,49.0,False,3,A,0,S
3,0003_02,False,33,False,0.0,1283.0,371.0,3329.0,193.0,False,3,A,0,S
4,0004_01,False,16,False,303.0,70.0,151.0,565.0,2.0,True,4,F,1,S


Convert categorical data to tensors using `LabelEncoder` and NumPy

In [52]:
cat_features=["CryoSleep", "VIP", "CabinDeck", "CabinNum", "CabinSide"]
lbl_encoders={}
for feature in cat_features:
    lbl_encoders[feature]=LabelEncoder()
    train_df[feature]=lbl_encoders[feature].fit_transform(train_df[feature])

### Stacking the features
cat_values=np.stack([train_df[i].values for i in cat_features], 1)

### Convert numpy to Tensors
cat_values=torch.tensor(cat_values,dtype=torch.int64)
cat_values

tensor([[   0,    0,    1,    0,    0],
        [   0,    0,    5,    0,    1],
        [   0,    1,    0,    0,    1],
        ...,
        [   0,    0,    6,  551,    1],
        [   0,    0,    4, 1385,    1],
        [   0,    0,    4, 1385,    1]])
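
To make the encoding step concrete, the sketch below runs `LabelEncoder` on a small made-up deck column. The encoder sorts the distinct values and maps each one to its position in that sorted list, and `inverse_transform` recovers the original labels.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Toy deck letters, not taken from the dataset
decks = np.array(["B", "F", "A", "A", "F"])

encoder = LabelEncoder()
codes = encoder.fit_transform(decks)

# classes_ holds the sorted distinct labels; codes index into it
print(list(encoder.classes_), codes.tolist())

# Round trip back to the original labels
restored = encoder.inverse_transform(codes)
```

Keeping the fitted encoders around matters later: the test data has to be transformed with the same mapping that was fitted on the training data.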

Convert continuous data to tensors using NumPy

In [53]:
cont_features=["GroupId", "Age", "RoomService", "FoodCourt", "ShoppingMall", "Spa", "VRDeck"]

### Stacking continuous variable to a tensor
cont_values=np.stack([train_df[i].values for i in cont_features],axis=1)
cont_values=torch.tensor(cont_values,dtype=torch.float)
cont_values

tensor([[1.0000e+00, 3.9000e+01, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
         0.0000e+00],
        [2.0000e+00, 2.4000e+01, 1.0900e+02,  ..., 2.5000e+01, 5.4900e+02,
         4.4000e+01],
        [3.0000e+00, 5.8000e+01, 4.3000e+01,  ..., 0.0000e+00, 6.7150e+03,
         4.9000e+01],
        ...,
        [9.2790e+03, 2.6000e+01, 0.0000e+00,  ..., 1.8720e+03, 1.0000e+00,
         0.0000e+00],
        [9.2800e+03, 3.2000e+01, 0.0000e+00,  ..., 0.0000e+00, 3.5300e+02,
         3.2350e+03],
        [9.2800e+03, 4.4000e+01, 1.2600e+02,  ..., 0.0000e+00, 0.0000e+00,
         1.2000e+01]])
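
The tensor above mixes columns on very different scales: `GroupId` runs into the thousands, ages sit in the tens, and spending ranges from 0 to several thousand. Standardising each column is a common preprocessing step that may make training easier. The sketch below is an optional addition on toy values, not something the notebook does.

```python
import torch

# Toy stand-in for cont_values: columns mimic GroupId, Age and one spending column
cont = torch.tensor([
    [1.0, 39.0, 0.0],
    [2.0, 24.0, 109.0],
    [3.0, 58.0, 43.0],
    [9280.0, 44.0, 126.0],
])

# Per-column statistics; clamp the std to avoid dividing by zero on constant columns
mean = cont.mean(dim=0, keepdim=True)
std = cont.std(dim=0, keepdim=True).clamp_min(1e-8)

# Each column now has mean 0 and standard deviation 1
cont_scaled = (cont - mean) / std
print(cont_scaled)
```

In a real pipeline the mean and standard deviation would usually be computed on the training rows only and then reused for the validation and test rows.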

Extract the dependent feature `Transported` 

In [54]:
### Dependent Feature
# Add to the existing dict so the categorical encoders fitted above are not discarded
lbl_encoders["Transported"]=LabelEncoder()
train_df["Transported"]=lbl_encoders["Transported"].fit_transform(train_df["Transported"])

y=torch.tensor(train_df["Transported"].values,dtype=torch.float).reshape(-1,1)
y

tensor([[0.],
        [1.],
        [0.],
        ...,
        [1.],
        [0.],
        [1.]])

Check the sizes of feature tensors

In [55]:
cat_values.shape, cont_values.shape, y.shape

(torch.Size([8693, 5]), torch.Size([8693, 7]), torch.Size([8693, 1]))

Calculate embedding dimensions for the categorical columns. A column with `x` distinct values gets an embedding of size `min(50, (x + 1) // 2)`.

In [56]:
#### Embedding Size For Categorical columns
cat_dims=[len(train_df[col].unique()) for col in cat_features]
embedding_dims = [(x, min(50, (x + 1) // 2)) for x in cat_dims]

embedding_dims 

[(2, 1), (2, 1), (9, 5), (1818, 50), (3, 2)]
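
The sizing rule can be checked by hand against the output above. The helper below is a hypothetical wrapper around the same expression, applied to the five cardinalities the notebook reports.

```python
def embedding_size(n_categories, max_dim=50):
    """Return (number of categories, embedding width) using the half-cardinality rule."""
    return n_categories, min(max_dim, (n_categories + 1) // 2)

# Cardinalities reported above: CryoSleep, VIP, CabinDeck, CabinNum, CabinSide
sizes = [embedding_size(n) for n in [2, 2, 9, 1818, 3]]
print(sizes)
```

For `CabinNum`, `(1818 + 1) // 2` is 909, so the cap of 50 is what determines the width.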

### Artificial Neural Network 

Create the Neural Network Class

In [57]:
class TitanicClassifier(nn.Module):

    def __init__(self, num_continuous, embedding_dims, hidden_dims):
        super(TitanicClassifier, self).__init__()

        # Embedding layers for categorical variables
        self.embeddings  = nn.ModuleList([nn.Embedding(inp,out) for inp,out in embedding_dims])

        # Fully connected layers for continuous variables
        self.fc_continuous = nn.Sequential(
            nn.Linear(num_continuous, hidden_dims[0]),
            nn.ReLU(),
            nn.Linear(hidden_dims[0], hidden_dims[1]),
            nn.ReLU()
        )

        # Combined feature representation
        self.combined_dim = sum((out for inp,out in embedding_dims)) + hidden_dims[1]

        # Fully connected layers for combined features
        self.fc_combined = nn.Sequential(
            nn.Linear(self.combined_dim, hidden_dims[2]),
            nn.ReLU(),
            nn.Dropout(0.4),  # Dropout for regularization
            nn.Linear(hidden_dims[2], hidden_dims[3]),
            nn.ReLU(),
            nn.Dropout(0.4)
        )

        # Output layer for binary classification
        self.fc_output = nn.Sequential(
            nn.Linear(hidden_dims[3], 1),
            nn.Sigmoid()            
        )
    
    def forward(self, x_cat, x_cont):
        # Embed categorical variables
        embedded_data = []
        for i,e in enumerate(self.embeddings):
            embedded_data.append(e(x_cat[:,i]))  # Select column i for every row and embed it
        embedded_data = torch.cat(embedded_data, dim=1)

        # Pass continuous variables through FC layers
        continuous_out = self.fc_continuous(x_cont)

        # Concatenate embeddings and continuous data
        combined_data = torch.cat([embedded_data, continuous_out], dim=1)

        # Pass through FC layers for combined features
        combined_out = self.fc_combined(combined_data)

        # Output layer for binary classification
        output = self.fc_output(combined_out)

        return output

Check model

In [58]:
torch.manual_seed(100)
model=TitanicClassifier(len(cont_features), embedding_dims, [100,80,100,50])
model

TitanicClassifier(
  (embeddings): ModuleList(
    (0-1): 2 x Embedding(2, 1)
    (2): Embedding(9, 5)
    (3): Embedding(1818, 50)
    (4): Embedding(3, 2)
  )
  (fc_continuous): Sequential(
    (0): Linear(in_features=7, out_features=100, bias=True)
    (1): ReLU()
    (2): Linear(in_features=100, out_features=80, bias=True)
    (3): ReLU()
  )
  (fc_combined): Sequential(
    (0): Linear(in_features=139, out_features=100, bias=True)
    (1): ReLU()
    (2): Dropout(p=0.4, inplace=False)
    (3): Linear(in_features=100, out_features=50, bias=True)
    (4): ReLU()
    (5): Dropout(p=0.4, inplace=False)
  )
  (fc_output): Sequential(
    (0): Linear(in_features=50, out_features=1, bias=True)
    (1): Sigmoid()
  )
)
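
The `in_features=139` of the first combined layer follows from the shapes: the five embeddings contribute 1 + 1 + 5 + 50 + 2 = 59 values per row, and the continuous branch contributes 80. The sketch below reproduces that concatenation with random category indices and a zero tensor standing in for the continuous branch output.

```python
import torch
import torch.nn as nn

embedding_dims = [(2, 1), (2, 1), (9, 5), (1818, 50), (3, 2)]
embeddings = nn.ModuleList([nn.Embedding(n, d) for n, d in embedding_dims])

# A batch of 4 rows with one valid category index per categorical column
x_cat = torch.stack([torch.randint(0, n, (4,)) for n, _ in embedding_dims], dim=1)

# Embed each column separately and concatenate along the feature axis
embedded = torch.cat([emb(x_cat[:, i]) for i, emb in enumerate(embeddings)], dim=1)

# Stand-in for the output of the continuous branch (hidden_dims[1] = 80)
continuous_out = torch.zeros(4, 80)
combined = torch.cat([embedded, continuous_out], dim=1)

print(embedded.shape, combined.shape)
```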

Split the data into training, validation, and test sets (60% / 20% / 20%), keeping the original row order

In [59]:
# Define the sizes for train, validate, and test sets
train_size = 0.6  # 60% for training
validate_size = 0.2  # 20% for validation
test_size = 0.2  # 20% for testing

# Split categorical and continuous data
X_train_categ, X_non_train_categ = train_test_split(cat_values, test_size=1-train_size, random_state=42, shuffle=False) 
X_validate_categ, X_test_categ = train_test_split(X_non_train_categ, test_size=test_size/(validate_size+test_size), random_state=42, shuffle=False)

X_train_cont, X_non_train_cont = train_test_split(cont_values, test_size=1-train_size, random_state=42, shuffle=False)
X_validate_cont, X_test_cont = train_test_split(X_non_train_cont, test_size=test_size/(validate_size+test_size), random_state=42, shuffle=False)

# Split target variable
y_train, y_non_train = train_test_split(y, test_size=1-train_size, random_state=42, shuffle=False) 
y_validate, y_test = train_test_split(y_non_train, test_size=test_size/(validate_size+test_size), random_state=42, shuffle=False)

print(X_train_categ.shape, X_train_cont.shape, y_train.shape)
print(X_validate_categ.shape, X_validate_cont.shape, y_validate.shape)
print(X_test_categ.shape, X_test_cont.shape, y_test.shape)

torch.Size([5215, 5]) torch.Size([5215, 7]) torch.Size([5215, 1])
torch.Size([1739, 5]) torch.Size([1739, 7]) torch.Size([1739, 1])
torch.Size([1739, 5]) torch.Size([1739, 7]) torch.Size([1739, 1])
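
`train_test_split` accepts several arrays at once and cuts them at the same row positions, which is one way to keep the categorical features, continuous features, and targets aligned without repeating the call. The sketch below uses small NumPy arrays with made-up values. With `shuffle=False` the split is purely positional, so `random_state` plays no role.

```python
import numpy as np
from sklearn.model_selection import train_test_split

n = 10
cat = np.arange(n).reshape(-1, 1)               # stand-in for cat_values
cont = np.arange(n, dtype=float).reshape(-1, 1) * 10  # stand-in for cont_values
y = np.arange(n) % 2                            # stand-in for the target

# First cut: 60% train, 40% held out
cat_tr, cat_rest, cont_tr, cont_rest, y_tr, y_rest = train_test_split(
    cat, cont, y, test_size=0.4, shuffle=False)

# Second cut: split the held-out part evenly into validation and test
cat_val, cat_te, cont_val, cont_te, y_val, y_te = train_test_split(
    cat_rest, cont_rest, y_rest, test_size=0.5, shuffle=False)

print(len(cat_tr), len(cat_val), len(cat_te))
```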


Train the model if a pretrained model (`model.pt`) cannot be loaded

In [60]:
loss_function = nn.MSELoss()
optimizer=torch.optim.Adam(model.parameters(), lr=0.1)

try:
    model.load_state_dict(torch.load('model.pt'))
    model.eval()

except Exception:
    epochs=1000
    for i in range(1, epochs + 1):
        y_pred=model(X_train_categ,X_train_cont)
        loss = loss_function(y_pred, y_train)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if i%10==1:
            print("Epoch number: {} and the loss : {}".format(i,loss.item()))

        # if i % 10 == 0:    # Run every 10 epochs
        #     # Check against the validation set
        #     model.train(False) # Switch to evaluation mode (disables dropout) for validation
        #     voutputs = model(X_validate_categ, X_validate_cont)
        #     vloss = loss_function(voutputs, y_validate)
        #     vloss = vloss.item()
        #     model.train(True)

        #     # Log the running loss averaged per batch
        #     print("Epoch number: {} Training Loss : {} Validation Loss : {}".format(i, loss.item(), vloss))
        #     writer.add_scalars('Training vs. Validation Loss',
        #                     { 'Training' : loss.item(), 'Validation' : vloss }, i)

    print("Training finished")
    writer.flush()

Epoch number: 1 and the loss : 0.4610167443752289
Epoch number: 11 and the loss : 0.49837008118629456
Epoch number: 21 and the loss : 0.498561829328537
Epoch number: 31 and the loss : 0.49837008118629456
Epoch number: 41 and the loss : 0.498561829328537
Epoch number: 51 and the loss : 0.498561829328537
Epoch number: 61 and the loss : 0.49837008118629456
Epoch number: 71 and the loss : 0.498561829328537
Epoch number: 81 and the loss : 0.49837008118629456
Epoch number: 91 and the loss : 0.49837008118629456
Epoch number: 101 and the loss : 0.498561829328537
Epoch number: 111 and the loss : 0.4981783330440521
Epoch number: 121 and the loss : 0.498561829328537
Epoch number: 131 and the loss : 0.49837008118629456
Epoch number: 141 and the loss : 0.49837008118629456
Epoch number: 151 and the loss : 0.49837008118629456
Epoch number: 161 and the loss : 0.498561829328537
Epoch number: 171 and the loss : 0.4981783330440521
Epoch number: 181 and the loss : 0.4981783330440521
Epoch number: 191 and the loss : 0.4981783330440521
Epoch number: 201 and the loss : 0.4979865849018097
Epoch number: 211 an
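
The loss in this log stays near 0.498 from the second report onward. A mean squared error of about 0.5 on 0/1 targets is consistent with a saturated sigmoid that predicts the same class for every row, assuming the classes are roughly balanced. A setup that is often tried in this situation is binary cross-entropy on raw logits with a smaller learning rate. The sketch below shows that pattern on synthetic data with a small stand-in network; the data, layer sizes, and learning rate are all assumptions, and it is not a claim about what this model would achieve.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic features and labels standing in for the real training tensors
X = torch.randn(256, 4)
y = (X[:, :1] + 0.5 * X[:, 1:2] > 0).float()

# No final Sigmoid: BCEWithLogitsLoss applies it internally in a numerically stable way
net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-2)  # smaller than the 0.1 used above

first_loss = None
for epoch in range(300):
    optimizer.zero_grad()
    loss = loss_fn(net(X), y)
    loss.backward()
    optimizer.step()
    if first_loss is None:
        first_loss = loss.item()

final_loss = loss.item()
print(first_loss, final_loss)
```

With this arrangement, probabilities are obtained at prediction time by applying `torch.sigmoid` to the network output.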

Draw the Neural Network Graph on Tensorboard

In [61]:
writer.add_graph(model, (X_test_categ, X_test_cont))
writer.flush()

Calculate the loss (mean squared error) of the model on the test data

In [62]:
# Run on the test data
model.eval()  # Disable dropout so the evaluation is deterministic
with torch.no_grad():
    y_pred=model(X_test_categ, X_test_cont)
    loss = loss_function(y_pred, y_test)
print('Error: {}'.format(loss.item()))

Error: 0.47556066513061523
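
The value above is a loss, not an accuracy. Since the model outputs a probability per passenger, accuracy can be obtained by thresholding those probabilities and comparing with the targets. The helper below is a hypothetical sketch on made-up values; the 0.5 threshold is an assumption.

```python
import torch

def binary_accuracy(probs, targets, threshold=0.5):
    """Fraction of rows where the thresholded probability matches the 0/1 target."""
    preds = (probs >= threshold).float()
    return (preds == targets).float().mean().item()

# Made-up probabilities and targets with the same (n, 1) shape as y_pred and y_test
probs = torch.tensor([[0.9], [0.2], [0.6], [0.4]])
targets = torch.tensor([[1.0], [0.0], [0.0], [0.0]])

accuracy = binary_accuracy(probs, targets)
print(accuracy)
```

In the notebook this would be called as `binary_accuracy(y_pred, y_test)` after the evaluation cell.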


Save the model

In [63]:
# torch.save(model.state_dict(),'model.pt')
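
For reference, the save and load round trip used by the training cell looks roughly like the sketch below. It uses a small stand-in network and a temporary directory so that it can run on its own; the layer sizes are arbitrary.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Small stand-in network; the notebook's model would be saved the same way
net = nn.Sequential(nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, 1))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pt")

    # Save only the parameters, not the whole module object
    torch.save(net.state_dict(), path)

    # Loading requires a model with the same architecture
    restored = nn.Sequential(nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, 1))
    restored.load_state_dict(torch.load(path))
    restored.eval()

# Both models should now produce identical outputs
x = torch.ones(2, 3)
with torch.no_grad():
    same_output = torch.allclose(net(x), restored(x))
print(same_output)
```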