Skip to content

This is the legendary Titanic ML competition – the best, first challenge for you to dive into ML competitions and familiarize yourself with how the Kaggle platform works.

License

Notifications You must be signed in to change notification settings

pykrishna/Titanic---Machine-Learning-from-Disaster

Repository files navigation

Titanic - Machine Learning from Disaster

Welcome to the Titanic Machine Learning competition! This is the legendary challenge hosted on Kaggle, designed to be the perfect entry point for those looking to dive into Machine Learning competitions and get acquainted with the Kaggle platform.

Overview

The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, resulting in the loss of many lives. In this competition, your task is to build a predictive model that answers the question: "What sorts of people were more likely to survive?" using passenger data (e.g., name, age, gender, socio-economic class, etc.).

Objective

The main objective of this competition is to predict whether a passenger survived the Titanic disaster or not. This is a binary classification problem where the target variable is "Survived" (1 if survived, 0 otherwise).

Dataset

The dataset provided contains information about each passenger including various features such as:

  • PassengerId: Unique identifier for each passenger
  • Survived: Whether the passenger survived or not (0 = No, 1 = Yes)
  • Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd)
  • Name: Passenger's name
  • Sex: Passenger's sex
  • Age: Passenger's age in years
  • SibSp: Number of siblings/spouses aboard the Titanic
  • Parch: Number of parents/children aboard the Titanic
  • Ticket: Ticket number
  • Fare: Passenger fare
  • Cabin: Cabin number
  • Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)

Evaluation

The model's performance will be evaluated based on its accuracy in predicting whether a passenger survived or not. The evaluation metric for this competition is accuracy, the percentage of passengers correctly predicted.

Submission Format

Your submission should contain predictions for each passenger in the test set. The file should be a CSV with exactly 418 entries plus a header row. The file should have exactly two columns:

  • PassengerId (sorted in any order)
  • Survived (contains your binary predictions: 1 for survived, 0 for not survived)

Getting Started

To get started with the Titanic Machine Learning competition, follow these steps:

  1. Register on Kaggle and join the Titanic competition.
  2. Download the dataset provided.
  3. Explore the data and understand its structure and features.
  4. Preprocess the data as needed (e.g., handle missing values, encode categorical variables).
  5. Train machine learning models using the training data.
  6. Evaluate the models using cross-validation or holdout validation.
  7. Tune hyperparameters and improve model performance.
  8. Make predictions on the test set.
  9. Submit your predictions to Kaggle and see your score.

Resources

Best of luck with your Titanic Machine Learning journey! Feel free to explore, learn, and enjoy the competition experience.

About

This is the legendary Titanic ML competition – the best, first challenge for you to dive into ML competitions and familiarize yourself with how the Kaggle platform works.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published