Explore my collection of projects covering Machine Learning, Data Analysis, and more. The repository is organized by project, and each directory contains a README, code, and datasets. I am self-taught.

moniquecardoso25/Data-Science-Projects

Data Science Projects

Welcome to my Data Science Projects repository! It collects my Data Science projects to show my skills in the field. Each project demonstrates a different aspect of Data Analysis, Visualization, Machine Learning, or Cloud Computing.

Projects

Description: This project is a captivating journey of a self-taught data science enthusiast who tackled the challenge of predicting house prices using the Kaggle dataset "House Prices: Advanced Regression Techniques." The goal was to showcase skills in exploratory analytics, feature engineering, and machine learning models.

Technologies Used:

  • Python
  • Exploratory Data Analysis (EDA)
  • Feature Engineering
  • Linear Regression, Tree Regression, K-Nearest Neighbors (KNN)
  • Data Visualization

Results:

  • Explored the dataset and handled missing values by replacing them with -1.
  • Conducted creative feature engineering to enhance model performance.
  • Utilized three machine learning models, with Linear Regression achieving the lowest mean squared error.
  • Validated the Linear Regression model through visualization, confirming its accuracy.
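  • Achieved a Kaggle score of 0.25 for house price predictions (the competition scores the RMSE between the logarithms of the predicted and actual sale prices, so lower is better).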
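
The two steps in these results that are easiest to show in code are the missing-value substitution and the Kaggle metric. The block below is a minimal sketch using only the standard library, not the project's actual code: the helper names `fill_missing` and `rmsle`, the two column names, and all prices are illustrative assumptions.

```python
import math


def fill_missing(rows, sentinel=-1):
    """Return copies of the rows with every None value replaced by a sentinel."""
    return [
        {key: (sentinel if value is None else value) for key, value in row.items()}
        for row in rows
    ]


def rmsle(y_true, y_pred):
    """RMSE between the logs of predicted and actual prices (lower is better)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [
        (math.log(pred) - math.log(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up rows with gaps, standing in for the real dataset.
rows = [
    {"LotFrontage": 65.0, "GarageArea": None},
    {"LotFrontage": None, "GarageArea": 460.0},
]
filled = fill_missing(rows)

# Made-up sale prices and predictions.
actual = [200000.0, 150000.0, 310000.0]
predicted = [180000.0, 165000.0, 300000.0]
score = rmsle(actual, predicted)
print(filled)
print(round(score, 4))
```

Because the metric works on logarithms, a 10% miss on a cheap house costs as much as a 10% miss on an expensive one.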

Check my article on Medium about this project

This project showcased the ability to independently tackle real-world data challenges and deliver valuable insights through exploratory analytics and feature engineering. The outcome solidified my understanding of evaluation methodologies and reinforced my passion for data science.

Description: The objective of this project is to predict customer churn in a telecom company. Customer churn is the rate at which customers stop doing business with a company or discontinue its services. To that end, I developed a machine learning model that predicts which customers will leave the company.

Technologies Used:

  • Python
  • Exploratory Data Analysis (EDA)
  • Data Preprocessing - Robust Scaler
  • Feature Engineering
  • Logistic Regression, Random Forest, XGB Classifier
  • Data Visualization
  • Encoding Variables using LabelEncoder
  • Model evaluation with a confusion matrix
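
To make the encoding and scaling steps in this list concrete, here is a from-scratch sketch of the two ideas. It is not the scikit-learn implementation and not the project's code: `label_encode` and `robust_scale` are hypothetical helpers, and the sample values are made up. Label encoding maps each sorted category to an integer; robust scaling subtracts the median and divides by the interquartile range, so a single extreme value does not distort the scale of the rest.

```python
import statistics


def label_encode(values):
    """Map each distinct category to an integer, in sorted order."""
    classes = sorted(set(values))
    index = {label: code for code, label in enumerate(classes)}
    return [index[value] for value in values], classes


def robust_scale(values):
    """Center on the median and scale by the interquartile range (IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    median = statistics.median(values)
    iqr = q3 - q1
    if iqr == 0:
        # Assumption for this sketch: leave a constant column unscaled.
        iqr = 1.0
    return [(value - median) / iqr for value in values]


# Made-up churn flags and monthly charges; 100.0 is a deliberate outlier.
codes, classes = label_encode(["Yes", "No", "No", "Yes"])
scaled = robust_scale([1.0, 2.0, 3.0, 4.0, 100.0])
print(classes, codes)
print(scaled)
```

The outlier still ends up far from the other values, but it does not compress them toward zero the way min-max scaling would.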

Results:

  • Developed an understanding of the business problem.
  • Explored the dataset through visualizations.
  • Conducted feature engineering to improve model performance.
  • Compared three machine learning models, of which Logistic Regression performed best.
  • Achieved 80% accuracy; the model predicted 460 customers as churning.
  • Presented conclusions and recommendations to the company based on the analysis.
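
The accuracy figure and the count of predicted churners above both come out of a confusion matrix. The following is a small illustrative sketch with made-up labels, not the project's data; the function names are assumptions introduced for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for true, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if true == positive else "fp"] += 1
        else:
            counts["fn" if true == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Share of all predictions that were correct."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


def predicted_positives(counts):
    """How many customers the model flagged as churning, right or wrong."""
    return counts["tp"] + counts["fp"]


# Made-up labels: 1 = churned, 0 = stayed.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(accuracy(counts), predicted_positives(counts))
```

Note that the number of predicted churners includes false positives, so it is worth reading alongside the confusion matrix rather than on its own.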

Description: The challenge is to recognize fraudulent credit card transactions so that the customers of credit card companies are not charged for items that they did not purchase.

Technologies Used:

  • Google Colab
  • Exploratory Data Analysis (EDA)
  • Data Preprocessing
  • Feature Engineering
  • MLP Classifier, Random Forest, Logistic Regression
  • Data Visualization
  • Robust Scaler for scaling, and sampling to deal with imbalanced data
  • Model evaluation

Results:

  • Developed an understanding of the business problem.
  • Explored the dataset through visualizations.
  • Conducted feature engineering to improve model performance.
  • Compared three machine learning models, of which Logistic Regression performed best.
  • Achieved 99.96% accuracy with Logistic Regression and 99.95% with the MLP Classifier.
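
Sampling to handle imbalanced data can be done in several ways. As one possibility, the sketch below shows random undersampling of the majority class with the standard library. It is an assumption that this matches the approach taken in the project; the `undersample` helper, the `Class` label column, and the synthetic rows are all hypothetical.

```python
import random


def undersample(rows, label_key="Class", seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    fraud = [row for row in rows if row[label_key] == 1]
    legit = [row for row in rows if row[label_key] == 0]
    minority, majority = sorted((fraud, legit), key=len)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Synthetic transactions: 20 labelled as fraud out of 1000.
rows = [
    {"Amount": float(i), "Class": 1 if i % 50 == 0 else 0}
    for i in range(1000)
]
balanced = undersample(rows)
print(len(balanced), sum(row["Class"] for row in balanced))
```

In this kind of setup the balanced sample is typically used for training only, while evaluation is done on data that keeps the original class ratio.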
