Kaggle hosts a competition built on data from the famous Titanic shipwreck. The objective is to build a machine learning model that predicts whether a passenger survived the shipwreck. This is a supervised binary classification problem. The model is a neural network, and the programming language is R. The notebook is hosted on Kaggle and can be found here: https://www.kaggle.com/code/jarredpriester/using-a-neural-network-to-predict-titanic-survival
I chose this project for two reasons. First, I wanted to get started with Kaggle and decided this would be a good competition to begin with. Second, I wanted to practice building a neural network with the caret library.
I learned how to apply cross-validation when training a neural network with the caret library. I also learned that most passengers who survived were female and held tickets for the upper part of the ship, while most passengers who did not survive were male and held tickets for the lower part of the ship.
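As a rough illustration of cross-validating a neural network in caret, here is a minimal sketch. It is not the notebook's actual code: the toy data frame, the 5-fold setting, the tuning grid, and the chosen predictors are all assumptions made for the example. It uses caret's `train()` with `method = "nnet"`, which requires the nnet package to be installed.

```r
library(caret)  # method = "nnet" also needs the nnet package installed

set.seed(42)

# Toy stand-in for the Titanic training data (hypothetical values, not the real dataset).
n <- 200
toy <- data.frame(
  Sex    = factor(sample(c("female", "male"), n, replace = TRUE)),
  Pclass = factor(sample(1:3, n, replace = TRUE)),
  Age    = round(runif(n, 1, 70))
)

# Simulated outcome loosely tied to sex and class so the model has some signal to learn.
p <- plogis(1.5 * (toy$Sex == "female") - 0.6 * (as.integer(toy$Pclass) - 2))
toy$Survived <- factor(ifelse(runif(n) < p, "yes", "no"), levels = c("no", "yes"))

# 5-fold cross-validation; the fold count is an assumption, not the notebook's setting.
ctrl <- trainControl(method = "cv", number = 5)

# Small grid over hidden-layer size and weight decay,
# the two tuning parameters caret exposes for method = "nnet".
grid <- expand.grid(size = c(2, 4), decay = c(0.01, 0.1))

fit <- train(
  Survived ~ Sex + Pclass + Age,
  data       = toy,
  method     = "nnet",
  trControl  = ctrl,
  tuneGrid   = grid,
  preProcess = c("center", "scale"),
  trace      = FALSE  # passed through to nnet() to silence optimizer output
)

print(fit$bestTune)  # size/decay pair with the best cross-validated accuracy
print(fit$results)   # cross-validated accuracy and kappa for every grid row
```

With the real data, `toy` would be replaced by the data frame read from train.csv, with `Survived` converted to a factor so that caret treats the task as classification.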
Kaggle provides the dataset already split into training and test sets. Combined, the two sets have 1309 rows and 12 variables.
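One way to arrive at a single combined data frame is sketched below. This is an assumption about the workflow, not the notebook's confirmed code, and the small data frames are hypothetical stand-ins for the files read with `read.csv("train.csv")` and `read.csv("test.csv")`. The test file has no `Survived` column, so the sketch adds it as `NA` before stacking the rows.

```r
# Toy stand-ins for train.csv and test.csv (hypothetical rows, not the real data).
train <- data.frame(
  PassengerId = 1:3,
  Survived    = c(0, 1, 1),
  Pclass      = c(3, 1, 3),
  Sex         = c("male", "female", "female")
)
test <- data.frame(
  PassengerId = 4:5,
  Pclass      = c(3, 2),
  Sex         = c("male", "female")
)

# The test set has no Survived column, so add it as NA before stacking.
test$Survived <- NA

# rbind() matches data frame columns by name, so column order need not agree.
combined <- rbind(train, test)

dim(combined)                  # rows of train + test, columns of train
sum(is.na(combined$Survived))  # number of test rows still awaiting a prediction
```

Combining the sets this way makes it easier to apply the same cleaning and feature engineering to both before splitting them apart again for training and prediction.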
train.csv - training data
test.csv - test data
nn.titanic.preds.csv - predictions
Titanic_kaggle.R - R script
Titanic_kaggle.Rmd - R markdown
Titanic_kaggle.pdf - pdf of the notebook