Predicting-survivals-with-ML

Overview

This project is a data science exploration of the Spaceship Titanic dataset, a variant of the famous Titanic competition on Kaggle. The goal is to predict the survival of passengers on a spaceship that is about to collide with a 'spacetime anomaly', using machine learning techniques. The project report can be found in the html file (spaceship_titanic_ML.html) or the Rmd file (spaceship_titanic_ML.Rmd).

Feature engineering

The original features were listed in a figure (not reproduced here), where transported was the variable to be predicted.

The original dataset included features such as cabin and passenger_id that pack several pieces of information into a single string. For example, by splitting the cabin column on each '/', we were able to create 3 new columns. Similarly, by splitting the passenger_id column on the '_' symbol, we were able to create 2 new columns.
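The two splits described above can be sketched as follows. This is a minimal illustration in plain Python, not the project's own code (the report is written in R Markdown). The sample values, the new column names (cabin_deck, cabin_num, cabin_side, group, number_in_group) and the handling of missing cabins are assumptions made for the example.

```python
def split_cabin(cabin):
    """Split a cabin string such as 'B/0/P' into three parts on '/'.

    Returns (None, None, None) when the value is missing or does not
    have exactly three parts (an assumed way to handle bad values).
    """
    if not cabin:
        return (None, None, None)
    parts = cabin.split("/")
    if len(parts) != 3:
        return (None, None, None)
    return tuple(parts)


def split_passenger_id(passenger_id):
    """Split a passenger id such as '0001_01' into two parts on '_'."""
    group, number = passenger_id.split("_", 1)
    return (group, number)


# Made-up rows for illustration; the second one has a missing cabin.
rows = [
    {"passenger_id": "0001_01", "cabin": "B/0/P"},
    {"passenger_id": "0002_01", "cabin": None},
]

for row in rows:
    # Hypothetical names for the 3 + 2 derived columns.
    row["cabin_deck"], row["cabin_num"], row["cabin_side"] = split_cabin(row["cabin"])
    row["group"], row["number_in_group"] = split_passenger_id(row["passenger_id"])
```

Keeping the derived parts as strings leaves the choice of encoding (categorical or numeric) to the modelling step.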

Models used

This project is a binary classification task. We tried several machine learning models, including:

Logistic regression
Shrinkage methods (Ridge, Lasso, Elastic net)
Tree methods (Single tree, Random Forest, Bagging, Boosting)
Support Vector Machines (Linear and non-linear)
Neural networks (NN)

Results

Our analysis revealed that the best model was Boosting, with an Area Under the Curve (AUC) of 0.8824744.
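As a reminder of what this metric measures, here is a small sketch in plain Python. It assumes the reported figure is the area under the ROC curve, which equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name roc_auc and the toy labels and scores are made up for illustration and are unrelated to the project's results.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = transported, 0 = not transported (values are invented).
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)  # 5 of the 6 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of examples, so it is only suitable for small illustrations; a value of 0.5 corresponds to random ranking and 1.0 to a perfect one.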

Relevant columns + charts

Spa, VRDeck and CryoSleep turned out to be the most relevant features in terms of Mean Decrease in Gini Index for the tree-based methods. Let's look at the charts we made ex-ante to see whether this was visually intuitive:

CryoSleep was very clear. For the numerical columns, however, the evidence is harder to see because some extremely high values skew the data to the right.
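For context on the importance measure mentioned above: in a classification tree, each split lowers the Gini impurity of the node it divides, and Mean Decrease in Gini aggregates those reductions per feature across the trees of the ensemble. The sketch below computes the reduction for a single split in plain Python. The function names and the toy labels are hypothetical and do not come from the project's data.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    # For two classes, 1 - p^2 - (1 - p)^2 simplifies to 2 * p * (1 - p).
    return 2.0 * p * (1.0 - p)


def gini_decrease(parent, left, right):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Invented example: splitting 8 passengers on a binary feature.
left = [1, 1, 1, 0]    # labels of passengers with the feature set
right = [1, 0, 0, 0]   # labels of passengers without it
parent = left + right

decrease = gini_decrease(parent, left, right)  # 0.5 - 0.375 = 0.125
```

A feature that is used in many splits with large reductions ends up with a high importance score, which is the sense in which Spa, VRDeck and CryoSleep rank highest.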

Comments

This was an exciting and enlightening project that allowed us to dive deep into the dataset and uncover hidden insights. We were able to experiment with a variety of different models and understand their parameters and appropriate use cases.

manuelrech/predicting-survivals-with-ML
