diyadatascience/Titanic-Dataset

Titanic Dataset

This repository contains an analysis and visualization of the Titanic passenger dataset. The project explores the factors that affected passenger survival and builds a predictive model that estimates the likelihood of survival.

Project Overview

The study examines the Titanic disaster by analyzing data related to passengers’ demographics, ticket information, and survival outcomes. The goal is to identify key factors that influenced survival and to create a predictive model using this information.

Contents

  • Titanic Dataset PPT.pptx: A PowerPoint presentation that details the analysis of the Titanic dataset. It includes steps such as data cleaning, exploratory data analysis, visualizations, and model building to predict survival outcomes based on various factors.
  • Titanic IDS.ipynb: A Jupyter notebook containing the code for data cleaning, exploratory data analysis, and model building. This notebook provides a step-by-step walkthrough of the analysis process and the development of a predictive model for survival.

Dataset Description

The dataset includes the following columns:

  • PassengerId: Unique ID for each passenger.
  • Survived: Survival (0 = No, 1 = Yes).
  • Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).
  • Name: Name of the passenger.
  • Sex: Sex of the passenger (male or female).
  • Age: Age of the passenger in years.
  • SibSp: Number of siblings/spouses aboard the Titanic.
  • Parch: Number of parents/children aboard the Titanic.
  • Ticket: Ticket number.
  • Fare: Passenger fare.
  • Cabin: Cabin number.
  • Embarked: Port of Embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).
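
As a minimal sketch of how the columns above might be loaded and checked with pandas: the helper `load_passengers`, the `EXPECTED_COLUMNS` list, and the sample rows are illustrative assumptions and are not taken from the notebook. The rows are made up so that the example runs without the real CSV file.

```python
import io

import pandas as pd

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "PassengerId", "Survived", "Pclass", "Name", "Sex", "Age",
    "SibSp", "Parch", "Ticket", "Fare", "Cabin", "Embarked",
]

# Made-up rows for illustration only; not real passenger records.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T-001,7.25,,S
2,1,1,Passenger B,female,38,1,0,T-002,71.28,C85,C
3,1,3,Passenger C,female,,0,0,T-003,7.92,,Q
"""


def load_passengers(source):
    """Read a Titanic-style CSV and verify it has the documented columns."""
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


df = load_passengers(io.StringIO(SAMPLE_CSV))
print(df.isna().sum())  # number of missing values in each column
```

With the real data, `source` would be the path to the CSV file instead of the in-memory sample. Counting missing values early shows which columns (here `Age` and `Cabin`) need attention during cleaning.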

How It Works

  1. Data Preparation: The dataset is cleaned and preprocessed to handle missing values and inconsistencies.
  2. Exploratory Data Analysis (EDA): Various visualizations are created to understand the distribution of data and the relationships between different variables.
  3. Model Building: A predictive model is built using machine learning techniques to determine the likelihood of survival based on the available data.
  4. Visualization: Insights and patterns are visualized through charts and graphs to make the findings more accessible and understandable.
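
The preparation and model-building steps above could look roughly like the following sketch. The README does not say which imputation strategy or model the notebook uses, so the median and mode fills, the one-hot encoding, and the choice of scikit-learn's `LogisticRegression` are all assumptions. The `prepare` helper and the data rows are hypothetical and made up for illustration.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up rows for illustration only; not real passenger records.
raw = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 1, 0],
    "Pclass": [3, 1, 2, 3, 1, 3, 2, 2],
    "Sex": ["male", "female", "female", "male",
            "female", "male", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0, 54.0, None, 27.0, 40.0],
    "Fare": [7.25, 71.28, 13.0, 8.05, 51.86, 8.46, 11.13, 26.0],
    "Embarked": ["S", "C", "S", None, "S", "Q", "S", "S"],
})


def prepare(df):
    """Fill missing values and encode categorical columns as numbers."""
    out = df.copy()
    # Assumed strategy: median for Age, most frequent port for Embarked.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    # One 0/1 indicator column per embarkation port.
    return pd.get_dummies(out, columns=["Embarked"], dtype=int)


clean = prepare(raw)
X = clean.drop(columns="Survived")
y = clean["Survived"]

# Assumed model choice: a plain logistic regression classifier.
model = LogisticRegression(max_iter=1000).fit(X, y)
survival_prob = model.predict_proba(X)[:, 1]  # estimated P(Survived = 1)
```

This sketch fits and predicts on the same eight rows only to stay short. A real analysis would hold out a test set (for example with `sklearn.model_selection.train_test_split`) before reporting any accuracy.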

Analysis Highlights

  • Survival Analysis: Examines how different factors like class, gender, and age affected the likelihood of survival.
  • Passenger Demographics: Analyzes the distribution of passengers by age, gender, and class.
  • Ticket and Fare Analysis: Investigates the correlation between ticket fares, ticket class, and survival rates.
  • Embarkation Points: Explores the impact of embarkation points on survival.
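
The comparisons listed above mostly reduce to grouped survival rates. The sketch below shows one way to compute them with pandas; the helper `survival_rate_by` is hypothetical, and the rows are made up, so the resulting numbers say nothing about the real dataset.

```python
import pandas as pd

# Made-up rows for illustration only; not real passenger records.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
    "Pclass": [3, 1, 2, 3, 1, 3, 2, 2],
    "Sex": ["male", "female", "female", "male",
            "female", "male", "female", "male"],
    "Fare": [7.25, 71.28, 13.0, 8.05, 51.86, 8.46, 11.13, 26.0],
    "Embarked": ["S", "C", "S", "S", "S", "Q", "S", "S"],
})


def survival_rate_by(frame, column):
    """Return the fraction of survivors within each value of `column`."""
    # Survived is coded 0/1, so its mean is the survival rate.
    return frame.groupby(column)["Survived"].mean()


by_sex = survival_rate_by(df, "Sex")
by_class = survival_rate_by(df, "Pclass")
by_port = survival_rate_by(df, "Embarked")

# Pearson correlation between fare paid and the 0/1 survival outcome.
fare_corr = df["Fare"].corr(df["Survived"])

print(by_sex, by_class, by_port, sep="\n\n")
```

Each result is a pandas Series indexed by the grouping value, so it can be passed directly to a bar chart (for example `by_class.plot(kind="bar")`, if matplotlib is installed).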

Conclusion

This project analyzes the Titanic dataset, highlights the factors that influenced survival (such as class, gender, and age), and provides a predictive model of passenger survival. The findings may be useful for historical research and for understanding survival dynamics in disaster scenarios.
