peak27/Data-Preprocessing-Python-and-R-

Data-Preprocessing-Python-and-R-

This repository covers data preprocessing techniques: importing libraries and datasets, handling missing values, encoding categorical variables, splitting data into training and test sets, and feature scaling.

Project name: Data Preprocessing (w/ Python & R)

Description

    This project documents the process of learning and implementing data cleaning and preprocessing, the steps applied to a dataset before fitting machine learning models such as regression, classification, or clustering.

    It explains how to import the required libraries; create the matrices of dependent and independent variables; fill missing values in a data frame with the mean, median, or mode, depending on the column's variable type; encode categorical variables as numeric values; split the data into training and test sets for model training; and scale features to normalize the independent variables.
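
    The steps described above can be sketched in Python roughly as follows. This is a minimal illustration, not the repository's actual code: the toy dataset, its column names (Country, Age, Salary, Purchased), the 25% test size, and the choice of pandas get_dummies for the categorical column are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy dataset; in practice this would come from pd.read_csv(...)
df = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain",
                "Germany", "France", "Spain", "France"],
    "Age": [44, 27, 30, 38, 40, 35, np.nan, 48],
    "Salary": [72000, 48000, 54000, 61000, np.nan, 58000, 52000, 79000],
    "Purchased": ["No", "Yes", "No", "No", "Yes", "Yes", "No", "Yes"],
})

# 1. Independent variables (X) and dependent variable (y)
X = df[["Country", "Age", "Salary"]].copy()
y = df["Purchased"]

# 2. Missing values: fill numeric columns with the column mean
#    ("median" or "most_frequent" are the other strategies mentioned above)
num_cols = ["Age", "Salary"]
imputer = SimpleImputer(strategy="mean")
X[num_cols] = imputer.fit_transform(X[num_cols])

# 3. Encoding: one indicator column per country, integer labels for the target
X_encoded = pd.get_dummies(X, columns=["Country"], dtype=float)
y_encoded = LabelEncoder().fit_transform(y)

# 4. Train/test split (test_size and random_state are arbitrary choices)
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y_encoded, test_size=0.25, random_state=0
)

# 5. Feature scaling: fit on the training set only, then apply to both sets
scaler = StandardScaler()
X_train = X_train.copy()
X_test = X_test.copy()
X_train[num_cols] = scaler.fit_transform(X_train[num_cols])
X_test[num_cols] = scaler.transform(X_test[num_cols])

print(X_train.shape, X_test.shape)
```

    The sketch follows the order in which the steps are listed above. The scaler is fitted on the training set only so that the test set does not influence the scaling parameters.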

Installation

    To apply these transformations and methods to your dataset, you need the following tools and libraries:

  • Python 2.x or Python 3.x
  • Pandas
  • NumPy
  • Scikit-Learn
  • R (For R implementation)
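
    One quick way to confirm that the Python libraries listed above are available is a short check like the one below. It is only a sketch and assumes Python 3; the import names used (pandas, numpy, sklearn) are the standard module names for these packages.

```python
import importlib.util

# Display name -> import name (scikit-learn is imported as "sklearn")
required = {"Pandas": "pandas", "NumPy": "numpy", "Scikit-Learn": "sklearn"}

# find_spec returns None when a top-level module cannot be found
status = {
    label: importlib.util.find_spec(module) is not None
    for label, module in required.items()
}

for label, ok in status.items():
    print(f"{label}: {'installed' if ok else 'missing'}")
```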

Usage

    Data preprocessing is a crucial step in any data modeling workflow, so you can use this code to clean and preprocess your dataset at any stage of building a statistical model in Python or R.

Contributing

   You can apply these transformations to an existing model to check whether they improve its accuracy.
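
   As a rough illustration of such a check, the sketch below compares test accuracy with and without feature scaling. Everything in it is an assumption made for the example rather than part of this repository: the synthetic dataset from make_classification, the deliberately inflated first feature, and the k-nearest-neighbors classifier. Whether a transformation helps depends on the data and the model.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a real dataset (assumption for illustration)
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
X[:, 0] *= 1000  # put one feature on a much larger scale than the others

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Model trained on the raw features
raw_model = KNeighborsClassifier().fit(X_train, y_train)
acc_raw = raw_model.score(X_test, y_test)

# Same model trained on standardized features (scaler fitted on training data)
scaler = StandardScaler().fit(X_train)
scaled_model = KNeighborsClassifier().fit(scaler.transform(X_train), y_train)
acc_scaled = scaled_model.score(scaler.transform(X_test), y_test)

print(f"accuracy without scaling: {acc_raw:.3f}")
print(f"accuracy with scaling:    {acc_scaled:.3f}")
```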
