NadaAbbasMohamed/DataScience-and-ML-Projects

DataScience-and-ML-Projects

This repo currently contains 4 stages of my learning path as a machine learning engineer and data scientist. It is organized into stages W to Z, from the latest projects (W) to the earliest (Z).

Stage Z: Data visualization and data preprocessing

Contains 5 files:

  • Uber Data Analysis.ipynb
  • Sales_Analysis_1.ipynb
  • Sales_Analysis_2.ipynb
  • pre-processing Loan Data.ipynb
  • pre-processing Black Friday.ipynb

These were my earliest practice exercises. They cover different preprocessing operations on different data files, along with some visualizations that helped me select the appropriate preprocessing technique.

Stage Y: Early-stage machine learning and storytelling (creating insights from the data through charts and plots)

Contains 3 files and a folder:

  • Diamonds_classification_Visualization_and_preprocessing_stage.ipynb
  • Diamonds_classification_stage.ipynb
  • 50ulke data set Story telling.ipynb
  • FOLDER : Comparing-the-performance-of-SVM-and-KNN-classification-techniques-master

The diamond project was my first end-to-end machine learning project, from domain understanding through to the prediction and model enhancement stage. In this project I achieved an accuracy score of 98.45676 on the diamond classification problem. The folder contains code that compares the performance of two basic classifiers, SVM and KNN. There are 2 implementations: one uses randomly generated matrices, and the other uses a very small dataset of dog and flower images (the images are the reason it is kept in a folder).
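To illustrate the kind of comparison described for the randomly generated data, here is a minimal sketch. It is not the code from the folder: the function names (`make_blobs`, `knn_predict`, `train_linear_svm`), the two-cluster synthetic data, and all hyperparameters are assumptions made for this example, and the SVM is a simple linear one trained by sub-gradient descent rather than any particular library implementation.

```python
import random


def make_blobs(n_per_class, seed=0):
    """Generate two 2-D Gaussian clusters labelled -1 and +1 (synthetic data)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((-1, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k=5):
    """Majority vote among the k nearest training points (k odd, so no ties)."""
    def sq_dist(item):
        (x0, x1), _ = item
        return (x0 - point[0]) ** 2 + (x1 - point[1]) ** 2

    nearest = sorted(train, key=sq_dist)[:k]
    vote = sum(label for _, label in nearest)
    return 1 if vote > 0 else -1


def train_linear_svm(train, epochs=50, lr=0.01, lam=0.01, seed=0):
    """Fit a linear SVM by stochastic sub-gradient descent on the
    L2-regularised hinge loss, using a constant learning rate."""
    rng = random.Random(seed)
    w, b = [0.0, 0.0], 0.0
    samples = list(train)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            # Weight decay from the regularisation term.
            w = [(1 - lr * lam) * wi for wi in w]
            if margin < 1:
                # The sample violates the margin: step towards classifying it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def accuracy(predict, test):
    """Fraction of test points whose predicted label matches the true label."""
    return sum(1 for x, y in test if predict(x) == y) / len(test)


data = make_blobs(100)
train, test = data[:160], data[160:]  # 80/20 split

w, b = train_linear_svm(train)
knn_acc = accuracy(lambda x: knn_predict(train, x), test)
svm_acc = accuracy(lambda x: 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1, test)

print(f"KNN accuracy: {knn_acc:.3f}")
print(f"SVM accuracy: {svm_acc:.3f}")
```

Both classifiers are scored on the same held-out split, so the two accuracy values are directly comparable. On real data, a library implementation with tuned hyperparameters would be the more usual choice.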

Stage X: Comparing the performance of the regression techniques I learnt, on different datasets and at different preprocessing stages, to gain a better understanding of the use cases of each model

Contains 4 files:

  • comparing Regression techniques -Small Dataset-WITH Feature Engineering-diabetes,csv.ipynb
  • comparing Regression techniques -Small Dataset-NO Feature Engineering-diabetes,csv.ipynb
  • comparing Regression techniques -Large Dataset-WITH Feature Engineering-Salaries,csv.ipynb
  • comparing Regression techniques -Large Dataset-NO Feature Engineering-Salaries,csv.ipynb

The results are documented in the notebook files themselves.
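As a rough illustration of what a "with" versus "no" feature engineering comparison can look like, the sketch below fits the same least-squares model twice on synthetic data, once on the raw feature and once with an added squared feature, and scores both with R² on a held-out split. None of this is taken from the notebooks: the data is generated rather than read from diabetes.csv or Salaries.csv, and the helper names (`solve`, `fit_ols`, `r2_score`, `raw`, `engineered`) are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def r2_score(rows, targets, weights):
    """Coefficient of determination of the linear model on the given rows."""
    preds = [sum(w * v for w, v in zip(weights, r)) for r in rows]
    mean = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, preds))
    ss_tot = sum((t - mean) ** 2 for t in targets)
    return 1 - ss_res / ss_tot


def raw(x):
    """Intercept plus the original feature only."""
    return [1.0, x]


def engineered(x):
    """Intercept, the original feature, and an added squared feature."""
    return [1.0, x, x * x]


# Synthetic data with a quadratic relationship plus noise (an assumption for this sketch).
rng = random.Random(0)
xs = [rng.uniform(-3, 3) for _ in range(200)]
ys = [1.0 + 2.0 * x + 1.5 * x * x + rng.gauss(0, 1.0) for x in xs]

results = {}
for name, featurize in (("no feature engineering", raw),
                        ("with feature engineering", engineered)):
    rows = [featurize(x) for x in xs]
    train_rows, test_rows = rows[:150], rows[150:]
    train_y, test_y = ys[:150], ys[150:]
    weights = fit_ols(train_rows, train_y)
    results[name] = r2_score(test_rows, test_y, weights)
    print(f"{name}: R^2 = {results[name]:.3f}")
```

Because the synthetic target depends on the squared feature, the engineered variant is expected to score higher here. On a real dataset the gain depends on whether the added features capture structure the model could not otherwise express.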

Stage W: More practice with ML concepts - slightly more advanced projects

Contains 2 files:

  • Classification of Social_Network_Ads_csv.ipynb
  • Regression of 50_startups_csv.ipynb