IGBB: Information Gain based Bagging Boosting classifier

Description

This is an implementation that integrates the information gain (IG) algorithm into tree-based boosting for the two-class classification problem. IG is used to select a subset of the data's features. Because our data sets are imbalanced, we add a bagging structure to address the problem. In the boosting process, the algorithm sequentially applies a weak classifier to modified versions of the data. By increasing the weights of the misclassified observations, each weak learner focuses on the errors of the previous one. We assign the weights to all of the data (multiplicatively), and on that re-weighted data we run a feature selection method (information gain) to pick the top-ranked features. The predictions are aggregated through a weighted majority vote. In the bagging process, the negative (majority-class) input data sets are under-sampled to balance the class distribution. The final bagging predictions are likewise aggregated through a majority vote.
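A minimal sketch of the weighted information-gain ranking described above, assuming NumPy, discrete (or pre-binned) feature values, and binary labels in {0, 1}. Function names here are illustrative, not this repository's API:

```python
import numpy as np

def weighted_entropy(y, w):
    """Shannon entropy of labels y under (normalized) sample weights w."""
    w = w / w.sum()
    probs = np.array([w[y == c].sum() for c in np.unique(y)])
    probs = probs[probs > 0]
    return -(probs * np.log2(probs)).sum()

def information_gain(x, y, w):
    """Weighted IG of one discrete feature x with respect to labels y."""
    total = w.sum()
    cond = sum((w[x == v].sum() / total) * weighted_entropy(y[x == v], w[x == v])
               for v in np.unique(x))
    return weighted_entropy(y, w) - cond

def top_k_features(X, y, w, k):
    """Rank all columns of X by weighted IG; return indices of the top k."""
    gains = [information_gain(X[:, j], y, w) for j in range(X.shape[1])]
    return np.argsort(gains)[::-1][:k]
```

Because the sample weights change every boosting round, the IG ranking (and hence the selected feature subset) can differ from round to round.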

Method

Information Gain based Bagging Boosting algorithm:
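A rough end-to-end sketch of this loop, reusing `top_k_features` from the snippet above with scikit-learn decision stumps as the weak learners. The reweighting follows the standard discrete AdaBoost update from Hastie et al.; the exact update and under-sampling scheme used in this repository may differ:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_with_ig(X, y, rounds=10, k=20):
    """AdaBoost-style boosting that re-runs IG feature selection on the
    re-weighted data before fitting each weak learner."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    models, alphas, feats = [], [], []
    for _ in range(rounds):
        cols = top_k_features(X, y, w, k)           # IG selection (see above)
        clf = DecisionTreeClassifier(max_depth=1)
        clf.fit(X[:, cols], y, sample_weight=w)
        miss = clf.predict(X[:, cols]) != y
        err = w[miss].sum() / w.sum()
        if err == 0 or err >= 0.5:                  # degenerate weak learner
            break
        alpha = np.log((1 - err) / err)             # weak-learner weight (ESL Alg. 10.1)
        w *= np.exp(alpha * miss)                   # up-weight the mistakes
        w /= w.sum()
        models.append(clf); alphas.append(alpha); feats.append(cols)
    return models, alphas, feats

def bagged_igbb(X, y, bags=5, seed=0, **boost_kw):
    """Under-sample the negative (majority) class once per bag, then boost."""
    rng = np.random.default_rng(seed)
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    ensembles = []
    for _ in range(bags):
        idx = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])
        ensembles.append(boost_with_ig(X[idx], y[idx], **boost_kw))
    return ensembles
```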

Data

The original BGP data set is from the RIPE Network Coordination Centre: [RIPE RIS raw data](https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris/ris-raw-data)

Results

Using the Nimda data set, we observe a significant reduction in the error rate as the number of boosting iterations increases.
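One hypothetical way to reproduce this trend with the sketches above, where `X_train`/`y_train` and `X_test`/`y_test` stand for an assumed train/test split of the Nimda event data (the names and split are illustrative, not provided by the repository):

```python
def predict_igbb(ensembles, X):
    """Weighted majority vote inside each boosted bag, then a plain
    majority vote across bags (labels assumed to be 0/1)."""
    votes = []
    for models, alphas, feats in ensembles:
        score = np.zeros(len(X))
        for m, a, c in zip(models, alphas, feats):
            score += a * (2 * m.predict(X[:, c]) - 1)   # map {0,1} -> {-1,+1}
        votes.append((score > 0).astype(int))
    return (np.mean(votes, axis=0) > 0.5).astype(int)

# Hypothetical check of error rate versus number of boosting rounds:
for rounds in (1, 5, 10, 20, 50):
    ens = bagged_igbb(X_train, y_train, rounds=rounds)
    err = (predict_igbb(ens, X_test) != y_test).mean()
    print(f"{rounds:>3} rounds: test error {err:.3f}")
```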

References

- Trevor Hastie, Robert Tibshirani, Jerome Friedman. *The Elements of Statistical Learning*. Springer.

If you run into any problems, please email me at mc.cheng@my.cityu.edu.hk.

License

This project is licensed under the MIT License.
