Welcome to my data science portfolio! I'm a passionate data scientist eager to contribute insights through data-driven solutions. Below are highlights of my projects showcasing skills in data analysis, machine learning, and visualization. Explore repositories on GitHub for details.
- Auto-MPG: Predictive modeling of automotive fuel efficiency (mpg) with ML. This project includes data preprocessing, model training, evaluation, and visualization.
- Exploring Wine Quality: Conducted EDA and built ML models to predict wine quality based on various attributes.
- Stroke Risk Forecasting: Forecasted stroke risk using health data, identifying risk factors and providing actionable insights.
- Predicting Cancellations And Gaining Hotel Booking: Predicts hotel booking cancellations, with a focus on revenue optimization and customer behavior analysis.
- Room Occupancy Estimation: Utilize non-intrusive sensor data and machine learning techniques to predict and monitor the number of occupants in a controlled environment.
- Anomaly Detection for Steel Plate Defects: Detects and classifies defects in steel plates using machine learning.
- Fraud Detection in Imbalanced Data Settings: Investigates and applies methods for identifying fraudulent activity in imbalanced datasets.
- Customer Churn Prediction: Analyze customer data for a subscription-based service and build a model to predict and reduce customer churn.
- Talent Insight Analytics: A comprehensive solution for businesses and HR professionals seeking to gain valuable insights into their talent pool and make data-driven decisions about recruitment, retention, and workforce planning.
- Employee Attrition Predictor: Predicts employee attrition using machine learning techniques. Likely inspired by the IBM HR Analytics Employee Attrition & Performance dataset, which is widely used in HR analytics and employee retention work.
Tools - Python, scikit-learn, XGBoost, pandas, matplotlib, collaborative filtering, seaborn, TensorFlow, Keras, deep learning, image classification, data analysis, network analysis, data visualization, feature engineering
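
For imbalanced problems such as the fraud detection project listed above, one common ingredient is to weight each class inversely to its frequency so that the rare class is not ignored during training. The snippet below is a minimal sketch of that idea, not code from the repository; the function name and the toy labels are hypothetical. The formula is the same heuristic scikit-learn uses for `class_weight="balanced"`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Uses n_samples / (n_classes * class_count), the heuristic behind
    scikit-learn's class_weight="balanced".
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels (not project data): 0 = legitimate, 1 = fraudulent.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
```

With these toy labels the fraudulent class receives a much larger weight than the legitimate class, which is the effect a weighted loss relies on.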
- COVID-19 CNN & Grad-CAM: Developed a CNN model for COVID-19 classification and employed Grad-CAM visualization to interpret model decisions.
- EarlyDetect: Cervical Cancer: Contributed to cervical cancer detection using image classification and transfer learning techniques.
- Devanagari Character Recognition: Built a deep learning model to identify Devanagari characters, enhancing South Asian language processing.
- Environmental Data Analysis: Analyzes data collected from approximately 497 unique locations across various regions of Rwanda, including farmlands, cities, and power plants.
- Mental Health Treatment Decisions: Leveraging data-driven insights to enhance treatment accuracy, efficacy, and patient outcomes in mental health care.
- CNN Based Face Mask Detection: CNN-powered solution for accurate and efficient face mask detection.
Tools - TensorFlow, Keras, deep learning, image classification
- Sarcasm-Detection-Through-Word-Embeddings: Development and exploration of machine learning models and techniques for detecting sarcasm in text using word embeddings.
- Noun Phrase Classification: Developed and implemented a noun phrase classification system that categorizes noun phrases within text documents into predefined or custom-defined classes.
- Movie Recommendation System: Developed a collaborative filtering-based recommendation system for personalized movie suggestions.
- Text-Summarization: Covers two fundamental approaches to text summarization, extractive and abstractive, along with ways to evaluate their effectiveness.
Tools - NLTK, spaCy, Gensim, Transformers (Hugging Face), TextBlob, AllenNLP
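
To illustrate the extractive approach mentioned in the Text-Summarization project, here is a minimal sketch that scores sentences by average word frequency and keeps the top ones in their original order. It is a simplified illustration with a hypothetical function name, not the project's implementation: it skips stop-word removal and uses a naive regex sentence splitter instead of NLTK or spaCy.

```python
import re
from collections import Counter


def extractive_summary(text, n_sentences=1):
    """Return the n highest-scoring sentences, kept in original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole text (no stop-word filtering).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favored by length alone.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)


summary = extractive_summary("Cats sleep. Cats chase mice. Dogs bark.", n_sentences=1)
```

An abstractive method would instead generate new sentences, typically with a sequence-to-sequence model, which is beyond what a short sketch can show.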
- EuroVision Network Insights: Analyzed EuroVision data, unveiling voting patterns and connections between participating countries.
- HPA Single-Cell Classification: Explored Human Protein Atlas data, gaining insights into single-cell classification and cellular structures.
- European Football Trends: Explored football data, revealing trends, patterns, and performance insights across multiple seasons.
Tools - pandas, matplotlib, seaborn, data analysis, data visualization