🧠 Data Science with Python

Welcome to Data Science with Python — a curated portfolio of practical data science and machine learning projects developed by Tega Jarikre.
Each project demonstrates real-world applications of Python in solving problems across insurance, logistics, content moderation, and agricultural analytics.

This repository represents a continuous learning journey in data-driven problem-solving, from raw data wrangling to model deployment.

🚀 Project Objectives

Build and evaluate machine learning models across diverse domains
Apply data preprocessing, feature engineering, and model optimization techniques
Experiment with classification, regression, and clustering algorithms
Explore bias detection, economic efficiency, and yield analysis using real datasets
Strengthen portfolio readiness for data science career roles

🧩 Repository Structure

Data-Science-with-Python/
│
├── insurance_premium_prediction/            # Regression models for premium forecasting
├── fake_content_detection/                  # NLP + metadata-based bias and fake news detection
├── delivery_delay_classification/           # Predicting early, on-time, or late deliveries
├── earthquake damage classification/        # Predicting low-, medium-, or high-grade earthquake damage
├── china real estate demand prediction/     # Regression models for real estate demand forecasting
├── borehole functionality classification/   # Predicting functional, non-functional, or functional-needs-repair boreholes
├── notebooks/                               # Shared EDA, feature engineering, and model experiments
├── scripts/                                 # Reusable Python utilities
├── data/                                    # Sample datasets (clean or synthetic)
├── results/                                 # Visualizations and performance reports
├── requirements.txt                         # Dependencies for reproducibility
└── README.md                                # You’re here!

🧰 Tech Stack

| Category | Tools & Libraries |
| --- | --- |
| Core Language | Python 3.10+ |
| Data Manipulation | Pandas, NumPy |
| Visualization | Matplotlib, Seaborn, Plotly |
| Modeling & ML | Scikit-learn, XGBoost, Random Forest, Logistic Regression |
| NLP & Text Analytics | NLTK, spaCy, TF-IDF, Word2Vec |
| Evaluation & Metrics | Precision, Recall, F1-score, RMSE, ROC-AUC |
| Version Control | Git & GitHub |
| Notebooks & IDEs | Jupyter Notebook, VS Code |

📊 Highlighted Projects

1️⃣ Insurance Premium Prediction

Goal: Predict customer insurance premiums using demographic and risk variables.
Approach: Regression models (Linear Regression, XGBoost, Random Forest).
Focus: Feature selection, multicollinearity detection, and interpretability.
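
Multicollinearity is commonly screened with the variance inflation factor (VIF). The sketch below is illustrative only and is not taken from the project notebook: it assumes NumPy, runs on synthetic data, and uses hypothetical feature names (`age`, `bmi`, `age_months`). It relies on the identity that the VIFs are the diagonal of the inverse of the feature correlation matrix.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows = samples, columns = features)."""
    corr = np.corrcoef(X, rowvar=False)
    # The diagonal of the inverse correlation matrix equals 1 / (1 - R_j^2),
    # where R_j^2 comes from regressing feature j on all other features.
    return np.diag(np.linalg.inv(corr))

# Synthetic data; all feature names here are hypothetical.
rng = np.random.default_rng(0)
age = rng.normal(40, 10, size=500)
bmi = rng.normal(27, 4, size=500)
age_months = age * 12 + rng.normal(0, 5, size=500)  # near-duplicate of `age`

X = np.column_stack([age, bmi, age_months])
vifs = variance_inflation_factors(X)
for name, vif in zip(["age", "bmi", "age_months"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign that a feature is largely explained by the others. Here `age` and `age_months` should be flagged, while `bmi` should stay close to 1.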

2️⃣ Fake or Biased Content Detection

Goal: Classify online content as fake, biased, or neutral.
Approach: Natural Language Processing (NLP) with metadata features.
Focus: Text cleaning, vectorization (TF-IDF), and ensemble learning.
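
To make the cleaning and TF-IDF steps concrete, here is a minimal standard-library sketch. It is not the project's code: the example sentences are invented, and the weighting is the plain textbook variant (term frequency times `log(N / df)`). In practice a library implementation such as scikit-learn's `TfidfVectorizer`, which applies smoothing and normalization by default, would likely be used instead.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Basic cleaning: lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf(documents):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Invented example sentences, for illustration only.
docs = [
    "Shocking miracle cure doctors hate!",
    "The council approved the budget on Tuesday.",
    "The council is clearly corrupt and hates progress.",
]
vectors = tfidf(docs)
print(vectors[0])
```

Terms that appear in only one document (such as "miracle") receive a higher weight than terms shared across documents (such as "council"), which is what lets a downstream classifier pick up on distinctive vocabulary.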

3️⃣ Delivery Delay Classification

Goal: Predict delivery status — early, on time, or late.
Approach: Multi-class classification with Logistic Regression, XGBoost, and Random Forest.
Focus: Handling class imbalance, feature importance, and business impact analysis.
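
One common way to handle class imbalance is to weight each class inversely to its frequency. The sketch below is an assumption about how this could be done, not the notebook's actual approach. The label names and counts are made up, and the formula `n_samples / (n_classes * class_count)` is the same heuristic scikit-learn applies for `class_weight="balanced"`.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical label distribution: most deliveries arrive on time.
labels = ["on_time"] * 70 + ["late"] * 20 + ["early"] * 10
weights = balanced_class_weights(labels)
for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.3f}")
```

The resulting dictionary can be passed as the `class_weight` argument of scikit-learn estimators that accept it, such as `LogisticRegression` and `RandomForestClassifier`, so that errors on the rare classes cost more during training.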

4️⃣ Agricultural Efficiency & Productivity Studies

Goal: Analyze farm typology, post-harvest loss, and yield determinants.
Approach: Clustering, feature selection, and supervised learning for productivity prediction.
Focus: Data science-driven agricultural analytics.

📚 Learning Focus Areas

Data wrangling and cleaning workflows
Feature engineering and transformation
Model training, validation, and hyperparameter tuning
Model interpretability (SHAP, feature importance)
End-to-end data science pipeline documentation

🧑‍💻 How to Use

Clone the repository

git clone https://github.com/Tegazini/Data-Science-with-Python.git

Navigate into the folder
```
cd Data-Science-with-Python
```
Install dependencies
```
pip install -r requirements.txt
```
Run notebooks
```
jupyter notebook
```

Explore each project folder for its own datasets and notebook scripts.

🌟 Future Work

Add deep learning experiments with TensorFlow/PyTorch
Incorporate MLOps tools (e.g., MLflow, DVC) for versioned model tracking
Deploy selected models using Streamlit or FastAPI

🧾 Author

👤 Tega Jarikre

📧 Email: jarikretega@gmail.com

🔗 LinkedIn: https://www.linkedin.com/in/tega-jarikre-92138342

💻 GitHub: https://github.com/Tegazini

"Data science isn’t just about algorithms — it’s about understanding data deeply enough to tell meaningful stories."
