Titanic-ML:

Introduction:

The sinking of the Titanic is one of the most infamous shipwrecks in history. The aim of the project is to predict whether a person can survive or not.

Dataset:

The dataset is downloaded from Kaggle. The downloaded data consist of 2 files (train and test sets). Train dataset consists of many features such as (PassengerId, Age, Sex elc) and one label, which represents if a person is survived or not. Test dataset consists of only the features and it is not labeled. In this project, only the train dataset has been used by splitting the whole rows into train and test dataset, so that the test dataset has labels.

The code:

In this Repository, there are one code file 'code.ipynb' has been scripted using Jupyter Notebook.

Project steps:

1. EDA:

In this section, different EDA methods has been used to observe visualize the nature of the data such as (distrobution, Box plots, missing values outliers etc).

3. Data Preprocessing:

Now we have a clear image of what is required to preprocess the data. by removing the outliers, adjusting the missing data, normalize the distribution of some features as well as creating new features based on existing features. As the dataset were either numerical or categorial, an ecoding step were required to transform the categorial into numerical data, as most of Machine Learning algorithms use only numerical data such as Logistic Regression, SVM and KNN.

4. Building and training the model:

Different models were build to classify the output, which are Logistic Regression, SVM, KNN, Decision Tree, Random Forest and ANN. Next, Prediction method has been applied based on the test set, so that we can compare the accuracy between the different applied algorithms as well as a Cross-Validation technique.

5. XAI (Shap Value):

Until this stage, we have measured the accuracy of each algorithm, but we have no idea what features contributed in the decision process of the algorithm to take such a decision. Here, measuring shap value is applied to understand the black box model of the algorithm and what features contributed the most compared to other features.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
dataset		dataset
venv		venv
README.md		README.md
code.ipynb		code.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Titanic-ML:

Introduction:

Dataset:

The code:

Project steps:

1. EDA:

3. Data Preprocessing:

4. Building and training the model:

5. XAI (Shap Value):

About

Releases

Packages

Languages

mohammedlajam/Titanic-ML

Folders and files

Latest commit

History

Repository files navigation

Titanic-ML:

Introduction:

Dataset:

The code:

Project steps:

1. EDA:

3. Data Preprocessing:

4. Building and training the model:

5. XAI (Shap Value):

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages