## Titanic Survival Project
#### Project Description

The Titanic problem is based on the sinking of the ‘Unsinkable’ ship Titanic in April 1912. The dataset describes individual passengers: their age, sex, number of siblings and spouses aboard, port of embarkation, and whether or not they survived the disaster.
Based on these features, the task is to predict whether a given passenger on the Titanic would have survived the sinking.

#### Attribute Information
1. PassengerId - Unique ID of the passenger
2. Pclass - Passenger class (1 = 1st; 2 = 2nd; 3 = 3rd)
3. Survived - Whether the passenger survived (0 = No; 1 = Yes)
4. Name - Name of the passenger
5. Sex - Sex of the passenger (male, female)
6. Age - Age of the passenger
7. SibSp - Number of siblings/spouses aboard
8. Parch - Number of parents/children aboard
9. Ticket - Ticket number
10. Fare - Passenger fare (British pounds)
11. Cabin - Cabin number
12. Embarked - Port of embarkation (C = Cherbourg; Q = Queenstown; S = Southampton)

#### Dataset Link
https://github.com/FlipRoboTechnologies/ML-Datasets/blob/main/Titanic/titanic_train.csv


In [1]:
import numpy as np  
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

In [2]:
df=pd.read_csv('https://raw.githubusercontent.com/FlipRoboTechnologies/ML-Datasets/main/Titanic/titanic_train.csv') 
df

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.2500,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.9250,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1000,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.0500,,S
...,...,...,...,...,...,...,...,...,...,...,...,...
886,887,0,2,"Montvila, Rev. Juozas",male,27.0,0,0,211536,13.0000,,S
887,888,1,1,"Graham, Miss. Margaret Edith",female,19.0,0,0,112053,30.0000,B42,S
888,889,0,3,"Johnston, Miss. Catherine Helen ""Carrie""",female,,1,2,W./C. 6607,23.4500,,S
889,890,1,1,"Behr, Mr. Karl Howell",male,26.0,0,0,111369,30.0000,C148,C
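
The preview shows empty `Age` and `Cabin` entries, so it can help to count missing values per column before choosing features. The following is a minimal sketch on a hypothetical four-row miniature (`sample`, built from rows visible in the preview); on the full data the equivalent call would be `df.isna().sum()`.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the dataset: four rows copied from the preview above.
sample = pd.DataFrame(
    {
        "Survived": [0, 1, 1, 0],
        "Pclass": [3, 1, 3, 3],
        "Sex": ["male", "female", "female", "female"],
        "Age": [22.0, 38.0, 26.0, np.nan],
        "Cabin": [np.nan, "C85", np.nan, np.nan],
    }
)

# isna() marks missing cells; summing the booleans counts them per column.
missing_per_column = sample.isna().sum()
print(missing_per_column)
```

Columns with many missing values (here `Cabin`) are candidates for dropping, while a column with only a few (here `Age`) may be worth imputing.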


In [3]:
#Removing the unwanted columns
df_class = df.drop(columns = ['PassengerId','Name','SibSp','Parch','Ticket','Fare','Embarked','Cabin']) 
df_class

Unnamed: 0,Survived,Pclass,Sex,Age
0,0,3,male,22.0
1,1,1,female,38.0
2,1,3,female,26.0
3,1,1,female,35.0
4,0,3,male,35.0
...,...,...,...,...
886,0,2,male,27.0
887,1,1,female,19.0
888,0,3,female,
889,1,1,male,26.0


The aim of the model is to predict survival (`Survived`) from the remaining inputs: `Pclass`, `Sex`, and `Age`.

In [4]:
#Replacing the null values in 'Age' with the column mean
df_class['Age'] = df_class['Age'].fillna(df_class['Age'].mean())
df_class

Unnamed: 0,Survived,Pclass,Sex,Age
0,0,3,male,22.000000
1,1,1,female,38.000000
2,1,3,female,26.000000
3,1,1,female,35.000000
4,0,3,male,35.000000
...,...,...,...,...
886,0,2,male,27.000000
887,1,1,female,19.000000
888,0,3,female,29.699118
889,1,1,male,26.000000
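
To make the imputation step concrete, here is a small sketch of mean imputation on an illustrative series (the values in `ages` are chosen for the example, not taken from the full dataset). It relies on the fact that `Series.mean()` skips missing values by default.

```python
import numpy as np
import pandas as pd

# Illustrative ages with one missing entry.
ages = pd.Series([22.0, 38.0, np.nan, 26.0], name="Age")

# mean() ignores NaN, so this averages the three known ages: (22 + 38 + 26) / 3.
mean_age = ages.mean()

# fillna returns a new Series; assigning it back avoids relying on inplace=True.
filled = ages.fillna(mean_age)
print(filled.tolist())
```

Mean imputation keeps every row, at the cost of giving all passengers with an unknown age the same value.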


In [21]:
#Reordering the columns and dropping any rows that still contain null values
df_class = df_class[['Sex','Survived','Pclass','Age']].dropna()
df_class

Unnamed: 0,Sex,Survived,Pclass,Age
0,male,0,3,22.000000
1,female,1,1,38.000000
2,female,1,3,26.000000
3,female,1,1,35.000000
4,male,0,3,35.000000
...,...,...,...,...
886,male,0,2,27.000000
887,female,1,1,19.000000
888,female,0,3,29.699118
889,male,1,1,26.000000


In [23]:
#Encoding 'Sex' as numbers: 'male' becomes 0 and 'female' becomes 1
df_class = df_class.replace({'male':0, 'female':1})
df_class

Unnamed: 0,Sex,Survived,Pclass,Age
0,0,0,3,22.000000
1,1,1,1,38.000000
2,1,1,3,26.000000
3,1,1,1,35.000000
4,0,0,3,35.000000
...,...,...,...,...
886,0,0,2,27.000000
887,1,1,1,19.000000
888,1,0,3,29.699118
889,0,1,1,26.000000
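
As an alternative sketch, the encoding can be restricted to the `Sex` column with `Series.map`, so that other columns are never touched. The frame `toy` and the dictionary name `sex_codes` are assumptions made for this example.

```python
import pandas as pd

# Illustrative frame with the same 'Sex' labels as the dataset.
toy = pd.DataFrame(
    {"Sex": ["male", "female", "female", "male"], "Age": [22.0, 38.0, 26.0, 35.0]}
)

# Explicit label-to-code mapping (hypothetical name).
sex_codes = {"male": 0, "female": 1}

# map() only affects this column; a label missing from the dict would become NaN.
toy["Sex"] = toy["Sex"].map(sex_codes)
print(toy)
```

Because unmapped labels turn into NaN rather than passing through unchanged, a typo or unexpected category shows up immediately in a missing-value check.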


In [25]:
#importing decision tree
from sklearn.tree import DecisionTreeClassifier 

In [26]:
#Assigning the features to X and the target to Y to fit the model
X = df_class.drop(columns=['Survived'])
Y = df_class['Survived']
X

Unnamed: 0,Sex,Pclass,Age
0,0,3,22.000000
1,1,1,38.000000
2,1,3,26.000000
3,1,1,35.000000
4,0,3,35.000000
...,...,...,...
886,0,2,27.000000
887,1,1,19.000000
888,1,3,29.699118
889,0,1,26.000000


In [27]:
#Creating a DecisionTreeClassifier instance named 'model' and fitting it on X and Y
model=DecisionTreeClassifier() 
model.fit(X,Y) 

In [28]:
#Predicting survival for one passenger; inputs follow the column order [Sex, Pclass, Age], here a 23-year-old female in 2nd class
ans=model.predict([[1,2,23]]) 
print(ans)

[1]
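
A bare list such as `[[1,2,23]]` depends on remembering the column order. The sketch below passes the passenger as a one-row DataFrame with named columns instead and also asks for class probabilities. It trains on eight rows taken from the previews above, so its output is illustrative only and says nothing about the full model; `X_toy`, `y_toy`, `toy_model`, and `passenger` are names invented for the example.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Eight encoded rows from the previews above (Sex: male=0, female=1).
X_toy = pd.DataFrame(
    {
        "Sex": [0, 1, 1, 1, 0, 0, 1, 0],
        "Pclass": [3, 1, 3, 1, 3, 2, 1, 1],
        "Age": [22.0, 38.0, 26.0, 35.0, 35.0, 27.0, 19.0, 26.0],
    }
)
y_toy = pd.Series([0, 1, 1, 1, 0, 0, 1, 1], name="Survived")

# random_state fixes tie-breaking so repeated runs build the same tree.
toy_model = DecisionTreeClassifier(random_state=0).fit(X_toy, y_toy)

# One passenger with named columns, in the same order used for training.
passenger = pd.DataFrame([{"Sex": 1, "Pclass": 2, "Age": 23.0}])

prediction = toy_model.predict(passenger)           # array with one class label
probabilities = toy_model.predict_proba(passenger)  # shape (1, 2): P(0), P(1)
print(prediction, probabilities)
```

Note that the model returns a class label (0 or 1) for the passenger, not a survival rate; `predict_proba` is the call that exposes the estimated class probabilities.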


In [29]:
#Accuracy of the model on the same data it was trained on (no train/test split yet)
model.score(X,Y) 

0.8799102132435466
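
An accuracy measured on the training data tends to be optimistic, because an unpruned decision tree can memorize its inputs. The sketch below illustrates this on synthetic data (an assumption for the example, unrelated to the Titanic file): the labels are pure noise, yet the training accuracy is perfect because every feature value is unique.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic data: one unique feature value per row, labels drawn at random.
X_noise = np.arange(200).reshape(-1, 1)
y_noise = rng.integers(0, 2, size=200)

# With default settings the tree keeps splitting until every leaf is pure.
tree = DecisionTreeClassifier(random_state=0).fit(X_noise, y_noise)

# score() returns mean accuracy; here it is measured on the training rows.
train_accuracy = tree.score(X_noise, y_noise)
print(train_accuracy)  # 1.0, even though the labels carry no signal
```

On the Titanic features the training accuracy stays below 1.0 because several passengers share the same `Sex`, `Pclass`, and `Age` but differ in outcome, which no tree can separate.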

In [16]:
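#Splitting the data into training and test sets (test_size=0.70 holds out 70% of the rows for testing)
from sklearn.model_selection import train_test_split
X1=df_class.drop(columns=['Survived'])
Y1=df_class['Survived']
X_train_1,X_test_1,Y_train_1,Y_test_1=train_test_split(X1,Y1,test_size=0.70)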

In [17]:
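#Fitting a second decision tree; as written it is fit on the full data (X1, Y1), not on X_train_1 and Y_train_1
traintestmodel=DecisionTreeClassifier()
traintestmodel.fit(X1,Y1)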

In [18]:
traintestans=traintestmodel.predict([[1,2,23]])
print(traintestans)

[1]


In [30]:
#This scores the first model on the full data (X, Y) again, so the value matches In [29]; it is not a held-out test accuracy
model.score(X,Y)

0.8799102132435466
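
For comparison, a held-out evaluation fits the tree on the training portion only and scores it on rows it has never seen. The following is a minimal sketch on synthetic data, assuming a 70/30 split; the frame `X_demo`, the labels `y_demo`, the model name `held_out_model`, and the choices `random_state=0` and `stratify` are all assumptions for the example, not part of the notebook above.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 200

# Synthetic features shaped like the notebook's X (Sex, Pclass, Age).
X_demo = pd.DataFrame(
    {
        "Sex": rng.integers(0, 2, size=n),
        "Pclass": rng.integers(1, 4, size=n),
        "Age": rng.uniform(1, 80, size=n).round(1),
    }
)
# Synthetic label that agrees with Sex about 80% of the time (illustration only).
y_demo = pd.Series(
    np.where(rng.random(n) < 0.8, X_demo["Sex"], 1 - X_demo["Sex"]), name="Survived"
)

# Hold out 30% of the rows; stratify keeps the class balance similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=0, stratify=y_demo
)

# Fit on the training rows only.
held_out_model = DecisionTreeClassifier(random_state=0)
held_out_model.fit(X_train, y_train)

# Compare accuracy on seen rows with accuracy on unseen rows.
train_accuracy = held_out_model.score(X_train, y_train)
test_accuracy = held_out_model.score(X_test, y_test)
print(train_accuracy, test_accuracy)
```

Applied to the notebook's variables, the same pattern would fit on `X_train_1` and `Y_train_1` and score on `X_test_1` and `Y_test_1`; the resulting number would generally differ from the training accuracy reported above.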