Intro

In this machine learning project, I processed the data with the methods I judged best. The current accuracy of about 76.8% still leaves plenty of room for improvement. If you have any suggestions, please leave a message!

Kaggle competition: Titanic Survivor Prediction

The competition is simple: use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.

Goal

Your job is to predict whether a passenger survived the sinking of the Titanic. For each passenger in the test set, you must predict a 0 or 1 value for the Survived variable.
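As a rough illustration of what "a 0 or 1 value for each passenger" looks like in practice, here is a minimal sketch that writes predictions to a CSV. The column names `PassengerId` and `Survived` are assumed from the competition's data description, the ids and predictions are made up, and `write_submission` is a hypothetical helper, not something taken from the notebook.

```python
import csv
import io


def write_submission(passenger_ids, predictions, out):
    """Write one row per test passenger: its id and a 0/1 survival prediction."""
    if len(passenger_ids) != len(predictions):
        raise ValueError("need exactly one prediction per passenger")
    writer = csv.writer(out, lineterminator="\n")
    # Header names are an assumption based on the competition's data description.
    writer.writerow(["PassengerId", "Survived"])
    for pid, pred in zip(passenger_ids, predictions):
        if pred not in (0, 1):
            raise ValueError(f"prediction must be 0 or 1, got {pred!r}")
        writer.writerow([pid, pred])


# Made-up ids and predictions, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_submission([892, 893, 894], [0, 1, 0], buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

To produce a real file, the same helper could be called with an open file handle (opened with `newline=""`) in place of the buffer.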

I achieved an accuracy of 0.7679 on the public leaderboard, which put me in the top 20% of Kagglers once cheating submissions are excluded. I usually run experiments in Colab, but since the notebook is a standard .ipynb file, you can also open it in Jupyter Notebook.

Data

Survived: Whether the passenger died or survived.
Pclass: Divides passengers into three classes, from one to three.
SibSp: The number of siblings or spouses the passenger had on board.
Parch: The number of parents or children the passenger was traveling with.
Fare: The cost of the ticket.
Embarked: The port of embarkation: Southampton, Cherbourg, or Queenstown.
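To make the columns above concrete, here is a small sketch of how such a table might be cleaned and encoded with pandas. The rows are invented, the single-letter port codes (`S`, `C`, `Q`) and the `FamilySize` feature are assumptions for illustration, and none of this is claimed to match what the notebook actually does.

```python
import pandas as pd

# Hypothetical rows using the columns described above; all values are made up.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0],
    "Pclass": [3, 1, 3, 2, 3],
    "SibSp": [1, 1, 0, 0, 0],
    "Parch": [0, 0, 0, 2, 0],
    "Fare": [7.25, 71.28, None, 13.0, 8.05],
    "Embarked": ["S", "C", None, "Q", "S"],
})

# Fill the missing fare with the median and the missing port with the most common one.
df["Fare"] = df["Fare"].fillna(df["Fare"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# One possible engineered feature: family size including the passenger.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# One-hot encode the port so that models receive only numeric inputs.
features = pd.get_dummies(df.drop(columns="Survived"), columns=["Embarked"])
target = df["Survived"]
print(features.columns.tolist())
```

Median and mode imputation are just simple defaults here; other choices (for example, imputing by passenger class) may work better on the real data.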

Procedure

My data went through the following steps:

Exploratory data analysis with visualization
Data cleaning
Feature engineering
Modeling
Testing accuracy (cross-validation may be a good way to compare the accuracy of different models, but I prefer to submit predictions directly to Kaggle and compare the scores there)
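For readers who want to try the cross-validation route mentioned in the last step, a minimal sketch with scikit-learn might look like the following. It uses a synthetic dataset from `make_classification` as a stand-in for the cleaned Titanic features, so the numbers it prints say nothing about the real competition score.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the cleaned Titanic features (not the real data).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=0
)

# 5-fold cross-validation: train on four folds, score accuracy on the fifth, rotate.
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"fold accuracies: {scores.round(3)}, mean: {mean_accuracy:.3f}")
```

Cross-validation gives an estimate without spending a Kaggle submission, while submitting directly measures performance on the actual public test split; the two can disagree.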

Data modeling

K-NN
Decision tree
Random Forest
Stochastic Gradient Descent
Logistic Regression
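The five model families listed above could be compared along these lines. This is only a sketch: it assumes scikit-learn, uses synthetic data in place of the Titanic features, and the hyperparameters shown are library defaults or arbitrary choices rather than the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned Titanic features (not the real data).
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One estimator per model family; hyperparameters are illustrative assumptions.
models = {
    "K-NN": KNeighborsClassifier(n_neighbors=5),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Stochastic gradient descent": SGDClassifier(random_state=0),
    "Logistic regression": LogisticRegression(max_iter=1000),
}

# Fit each model on the training split and record accuracy on the held-out split.
val_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    val_accuracy[name] = model.score(X_val, y_val)

for name, acc in sorted(val_accuracy.items(), key=lambda item: -item[1]):
    print(f"{name}: {acc:.3f}")
```

On the real data, distance-based and linear models such as K-NN and SGD usually benefit from feature scaling, which this sketch leaves out for brevity.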

Conclusion

The random forest and the decision tree overfit. After comparing the accuracy of K-NN, logistic regression, and stochastic gradient descent, the logistic regression model gave me the highest accuracy.
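One common way to spot the kind of overfitting described here is to compare accuracy on the training data with accuracy on held-out data. The sketch below does that for an unrestricted decision tree and a depth-limited one. It assumes scikit-learn and uses noisy synthetic data, so it illustrates the check rather than reproducing the notebook's result; `train_val_accuracy` is a hypothetical helper.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 20% of labels flipped, so memorising the training set cannot generalise.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, flip_y=0.2, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def train_val_accuracy(model):
    """Fit on the training split and return (training accuracy, validation accuracy)."""
    model.fit(X_train, y_train)
    return model.score(X_train, y_train), model.score(X_val, y_val)


# An unrestricted tree can keep splitting until it fits every training row.
deep_train, deep_val = train_val_accuracy(DecisionTreeClassifier(random_state=0))
# Limiting the depth is one simple way to restrain the tree.
shallow_train, shallow_val = train_val_accuracy(
    DecisionTreeClassifier(max_depth=3, random_state=0)
)

print(f"unrestricted tree: train={deep_train:.3f}, validation={deep_val:.3f}")
print(f"max_depth=3 tree:  train={shallow_train:.3f}, validation={shallow_val:.3f}")
```

A large gap between training and validation accuracy is the usual sign of overfitting; a simpler model such as logistic regression tends to show a much smaller gap.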
