Space Travel Passenger Transport Prediction Model

Overview

This repository contains a machine learning model that predicts whether passengers of a space travel company are likely to be transported. The model uses CatBoostClassifier, the gradient-boosting classifier from the CatBoost library, which handles categorical data effectively.

Data

The model is trained and tested using two datasets:

train.csv: Training data with passenger features and transportation status.
test.csv: Testing data with passenger features.

Features

The datasets mix categorical and numerical passenger features, including HomePlanet, CryoSleep, Destination, Age, VIP status, and usage of various services.

Preprocessing

Key preprocessing steps are:

Feature extraction from 'Cabin' to get 'Deck', 'Room', and 'Side'.
Handling missing values:
- Numerical features: Imputed with mean values.
- Categorical features: Imputed with the most frequent values.
Scaling numerical features using StandardScaler.
Encoding categorical features using OneHotEncoder.
Label encoding the target variable.
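
The imputation, scaling, and encoding steps above can be sketched with scikit-learn as follows. This is only an illustration, not the notebook's actual code: the column subsets, the toy rows, and the 'Transported' target name are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Assumed column subsets; the real datasets contain more features.
numeric_cols = ["Age", "RoomService"]
categorical_cols = ["HomePlanet", "Deck"]

# Numerical features: mean imputation, then standard scaling.
numeric_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# Categorical features: most-frequent imputation, then one-hot encoding.
categorical_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# sparse_threshold=0.0 forces a dense array, which is easier to inspect.
preprocessor = ColumnTransformer(
    [
        ("num", numeric_pipe, numeric_cols),
        ("cat", categorical_pipe, categorical_cols),
    ],
    sparse_threshold=0.0,
)

# Toy rows (made up) with one missing value per column.
toy = pd.DataFrame({
    "Age": [39.0, np.nan, 24.0, 58.0],
    "RoomService": [0.0, 109.0, np.nan, 43.0],
    "HomePlanet": ["Europa", "Earth", np.nan, "Earth"],
    "Deck": ["B", "F", "A", np.nan],
    "Transported": [False, True, False, True],
})

X = preprocessor.fit_transform(toy[numeric_cols + categorical_cols])
y = LabelEncoder().fit_transform(toy["Transported"])  # False -> 0, True -> 1
```

Fitting the preprocessor on the training data and reusing it with transform on the test data keeps the imputation values and scaling statistics consistent between the two sets.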

Feature Engineering

The 'Cabin' feature is parsed into three separate features: 'Deck', 'Room', and 'Side', to extract more meaningful information.
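
A minimal sketch of this split with pandas is shown below. It assumes 'Cabin' values are formatted as deck/room/side separated by slashes (for example "B/0/P"); the helper name split_cabin and the sample rows are hypothetical.

```python
import pandas as pd

def split_cabin(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with 'Cabin' replaced by 'Deck', 'Room', 'Side'."""
    out = df.copy()
    # Assumed format: "<deck>/<room number>/<side>"; missing cabins stay missing.
    parts = out["Cabin"].str.split("/", expand=True)
    out["Deck"] = parts[0]
    out["Room"] = pd.to_numeric(parts[1], errors="coerce")
    out["Side"] = parts[2]
    return out.drop(columns="Cabin")

# Made-up sample rows, including one missing cabin.
sample = pd.DataFrame({"Cabin": ["B/0/P", "F/12/S", None]})
cabin_features = split_cabin(sample)
```

Rows with a missing 'Cabin' produce missing values in all three new columns, so they pass through the same imputation as any other feature.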

Model Training

The CatBoostClassifier model is trained on the preprocessed data. Key aspects of the model training include:

Iterations: 5
Learning rate: 0.1
Loss function: CrossEntropy

Prediction and Output

The model predicts the transportation status for the test data. Predictions are converted to binary format based on a specified threshold and exported to 'predictions1.csv'.
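
As a rough illustration of the thresholding and export step, the sketch below turns positive-class probabilities into True/False labels and writes them to a CSV file. The 0.5 threshold, the 'PassengerId' and 'Transported' column names, the helper name to_submission, and the sample values are all assumptions; the README does not state them.

```python
import numpy as np
import pandas as pd

def to_submission(passenger_ids, proba, threshold=0.5):
    """Build a two-column table of IDs and thresholded predictions."""
    # A passenger is labelled True when the probability reaches the threshold.
    transported = np.asarray(proba) >= threshold
    return pd.DataFrame({
        "PassengerId": list(passenger_ids),
        "Transported": transported,
    })

# Made-up IDs and probabilities standing in for real model output.
submission = to_submission(["0013_01", "0018_01", "0019_01"], [0.81, 0.35, 0.50])
submission.to_csv("predictions1.csv", index=False)
```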

Usage

To use the model:

Load the training and testing datasets.
Preprocess the data by extracting 'Deck', 'Room', and 'Side' from 'Cabin', handling missing values, encoding categorical features, and scaling numerical features.
Train the CatBoostClassifier model using the training data.
Predict transportation status for the test data.
Export the predictions to 'predictions1.csv'.

Dependencies

pandas
numpy
scikit-learn
catboost

Note

This code is tailored for a specific scenario of predicting passenger transportation status in a space travel context and can be adapted for similar predictive modeling tasks.
