Week 3 – Titanic Working 🚢

This week focuses on the Titanic dataset from Kaggle, moving from data exploration to model serving with FastAPI and Docker.
It follows the internship training plan (Days 6–10).

Week 4 tokenization work lives in TokenHF: https://github.com/MFaresJA/TokenHF

📂 Structure

Week3/
├── TitanicWorking/
│   ├── Day6_EDA.ipynb              # Exploratory Data Analysis
│   ├── Day7_FeatureEngineering.ipynb
│   ├── Day8_ModelTraining.ipynb
│   ├── titanicModel.py             # Utility functions for features & predictions
│   ├── models/                     # Saved ML models (ignored by Git, tracked later with DVC)
│   └── data/                       # Dataset files
├── main.py                         # FastAPI service (Day 10)
├── requirements.txt
├── Dockerfile
├── .gitignore
└── README.md

🎯 Goals (Days 6–10)

Day 6: Perform Exploratory Data Analysis (EDA)
- Handle nulls, visualize survival by class/sex, plot distributions
Day 7: Feature Engineering
- Create FamilySize, IsAlone, extract Title from names
- Impute missing values, one-hot encode categoricals
Day 8: Train Models
- Logistic Regression, Decision Tree, Random Forest
- Evaluate with accuracy, F1, confusion matrix
Day 9: Model Optimization
- Tune Random Forest with GridSearchCV
Day 10: Serve Model via FastAPI + Docker
- Expose /predict and /predict_batch endpoints
- Build Docker image for easy deployment
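
The Day 7 features can be sketched in plain Python on a single passenger record. This is a minimal sketch under stated assumptions, not the code in titanicModel.py: the function name `derive_features`, the convention FamilySize = SibSp + Parch + 1 (counting the passenger), the "Unknown" fallback, and the title regex are all illustrative choices.

```python
import re

# Assumption: the title is the text between the comma and the next period,
# e.g. "Doe, Mr. John" -> "Mr". Names without that pattern map to "Unknown".
TITLE_PATTERN = re.compile(r",\s*([^.,]+)\.")


def derive_features(passenger: dict) -> dict:
    """Return a copy of the record with FamilySize, IsAlone, and Title added."""
    features = dict(passenger)
    # Assumed convention: siblings/spouses + parents/children + the passenger.
    family_size = passenger.get("SibSp", 0) + passenger.get("Parch", 0) + 1
    features["FamilySize"] = family_size
    # IsAlone is encoded as 0/1 so it can be fed to a model directly.
    features["IsAlone"] = int(family_size == 1)
    match = TITLE_PATTERN.search(passenger.get("Name", ""))
    features["Title"] = match.group(1).strip() if match else "Unknown"
    return features


# Example record using the same field names as the /predict payload.
example = {"Name": "Doe, Mr. John", "SibSp": 1, "Parch": 0}
print(derive_features(example))
```

Working on one record at a time mirrors what a /predict endpoint receives; in the notebooks the same logic would more likely be applied column-wise to a DataFrame.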

⚙️ How to Run

Local

uvicorn main:app --reload --host 0.0.0.0 --port 8000

Interactive API docs (Swagger UI): http://127.0.0.1:8000/docs

Docker

docker build -t titanic-api .
docker run -p 8000:8000 titanic-api

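If host port 8000 is already in use, map a different host port to the container's port 8000. The API is then served at http://127.0.0.1:8001: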

docker run -p 8001:8000 titanic-api

📡 Example Requests

Single passenger

curl -X POST http://127.0.0.1:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"PassengerId":1,"Pclass":3,"Name":"Doe, Mr. John","Sex":"male",
       "Age":22,"SibSp":1,"Parch":0,"Fare":7.25,"Embarked":"S"}'

Batch

curl -X POST http://127.0.0.1:8000/predict_batch \
  -H "Content-Type: application/json" \
  -d '[{"PassengerId":1,"Pclass":1,"Name":"Allen, Miss. Alice","Sex":"female","Age":35,"SibSp":0,"Parch":0,"Fare":71.28,"Embarked":"C"},
       {"PassengerId":2,"Pclass":3,"Name":"Kelly, Mr. James","Sex":"male","Age":22,"SibSp":1,"Parch":0,"Fare":7.25,"Embarked":"S"}]'
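
The same requests can also be made from Python using only the standard library. The sketch below assumes the service is reachable at http://127.0.0.1:8000, as in the curl examples; the helper names `build_request` and `post_json` are hypothetical, and nothing is assumed about the response beyond it being JSON.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumed: same host/port as the curl examples


def build_request(path: str, payload) -> urllib.request.Request:
    """Build a JSON POST request for /predict (dict) or /predict_batch (list)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload):
    """Send the request and decode the JSON response (requires the API to be running)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same fields as the single-passenger curl example above.
passenger = {
    "PassengerId": 1, "Pclass": 3, "Name": "Doe, Mr. John", "Sex": "male",
    "Age": 22, "SibSp": 1, "Parch": 0, "Fare": 7.25, "Embarked": "S",
}

# Building the request does not touch the network, so this runs without the API.
request = build_request("/predict", passenger)
print(request.get_method(), request.full_url)
```

With the API running, `post_json("/predict", passenger)` sends one passenger and `post_json("/predict_batch", [passenger])` sends a list.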

📌 Notes

Model artifacts (.joblib, .json) are excluded by .gitignore.
The notebooks demonstrate the progression from EDA → features → training → optimization.

For Docker, DVC, and tokenizer details, see README_API.md.
