Titanic-Survival-Prediction

The popular Titanic dataset ia available and as part of practice competition in Kaggle. For the full HTML page output, please click this link.

Data Exploratory

In the Rmd script, I will use library(Amelia) for quick visualization on the missing data, and another code to actual get the number of the missing data points. There are also some other library which I used (apart from ggplot2) to visualize the relationship of the feature to survival binary output.

Sample code and output

library(vcd)
mosaicplot(training.data$pclass ~ training.data$survived, 
           main="Passenger Fate by Traveling Class", shade=FALSE, 
           color=TRUE, xlab="Passenger class", ylab="Survived")

Prediction

Finally we will use randomForest library to make the prediction as well as party::cforest and compare the result.

Outcome from randomForest library

confusionMatrix(test.batch.rf.pred, test.batch$survived)

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  0  1
##          0 96 18
##          1 13 50
##                                         
##                Accuracy : 0.825         
##                  95% CI : (0.761, 0.878)
##     No Information Rate : 0.616         
##     P-Value [Acc > NIR] : 1.3e-09       
##                                         
##                   Kappa : 0.625         
##  Mcnemar's Test P-Value : 0.472         
##                                         
##             Sensitivity : 0.881         
##             Specificity : 0.735         
##          Pos Pred Value : 0.842         
##          Neg Pred Value : 0.794         
##              Prevalence : 0.616         
##          Detection Rate : 0.542         
##    Detection Prevalence : 0.644         
##       Balanced Accuracy : 0.808         
##                                         
##        'Positive' Class : 0             
##

Outcome from party library (unbiased forest)

confusionMatrix(test.batch.party.pred, test.batch$survived)
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction   0   1
##          0 102  16
##          1   7  52
##                                         
##                Accuracy : 0.87          
##                  95% CI : (0.811, 0.916)
##     No Information Rate : 0.616         
##     P-Value [Acc > NIR] : 6e-14         
##                                         
##                   Kappa : 0.718         
##  Mcnemar's Test P-Value : 0.0953        
##                                         
##             Sensitivity : 0.936         
##             Specificity : 0.765         
##          Pos Pred Value : 0.864         
##          Neg Pred Value : 0.881         
##              Prevalence : 0.616         
##          Detection Rate : 0.576         
##    Detection Prevalence : 0.667         
##       Balanced Accuracy : 0.850         
##                                         
##        'Positive' Class : 0             
##

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
Image		Image
R Markdown Script		R Markdown Script
.gitignore		.gitignore
README.md		README.md
Titanic_survival_prediction.html		Titanic_survival_prediction.html

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Titanic-Survival-Prediction

Data Exploratory

Prediction

About

Releases

Packages

Languages

netsatsawat/Titanic-Survival-Prediction

Folders and files

Latest commit

History

Repository files navigation

Titanic-Survival-Prediction

Data Exploratory

Prediction

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages