Ensemble method----Random-Forest using Python

Random Forest

It also comes under Bagging Technique, But here model used is strictly decision tree.

Random Forest can be used for both classification and regression task.Random Forest is a model made up of many decision trees rather than simply averaging the prediction of trees.

The model uses two key concepts:-

a) Random Sampling of training data points when building trees.
b) Random Subsets of features considered when splitting nodes.

It can also be used for Feature Engineering.

Random forest pseudocode:

  1)	Randomly select “k” features from total “m” features.
          Where,  k << m
  2)	Among the “k” features,calculate the node “d” using best split point.
  3)	Split the node into daughter nodes using best split.
  4)	Repeat 1 to 3 steps until “1” number of nodes has been reached.
  5)	Build forest by repeating steps 1 to 4 for “n” number times to create “n” number of trees.

The main advantage of random forest is it eliminates Overfitting problem observed in Decison tree.

Data Used :

                        Company dataset - for knowing the attribute that causes high sale using random forest.
                        Fraud dataset -  for treating those who have taxable_income <= 30000 as "Risky" and others are "Good" using random forest.
                        Pima-indians-diabetes - Classification of diabetes 
                        Iris - classification of Species

Programming:

                          Python

The Codes regarding Random Forest for classification of company data with company dataset, Classification of Risky with Fraud dataset, classification if pima indians diabetes with pima indians diabetes dataset and classification of species with iris dataset are present in this Repository in detail.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
Company_Data.csv		Company_Data.csv
Fraud_check.csv		Fraud_check.csv
README.md		README.md
Random forest.py		Random forest.py
company_RF.py		company_RF.py
fraud_RF.py		fraud_RF.py
iris.csv		iris.csv
iris_RF.py		iris_RF.py
pima-indians-diabetes.data.csv		pima-indians-diabetes.data.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Ensemble method----Random-Forest using Python

Random Forest

Data Used :

Programming:

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Ensemble method----Random-Forest using Python

Random Forest

Data Used :

Programming:

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages