GitHub - gstar1987td/IForest-On-Spark

This project implements the algorithm described in:

Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. "Isolation forest." Data Mining, 2008. ICDM'08. Eighth IEEE International Conference on. IEEE, 2008.

IForest On Spark uses Spark to sample the data and sends each partition to a Spark worker. Each partition trains n isolation trees, so training runs in parallel.
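To make the per-partition training concrete, here is a minimal sketch in plain Scala, without Spark, of what each worker might do. The names `ITree`, `Leaf`, `Node`, `ITreeSketch`, `build`, `trainForest` and `pathLength` are hypothetical and are not the project's actual classes. The README does not show how the project dispatches work to the workers, so the loop over partitions below is an ordinary local loop that stands in for that step.

```scala
import scala.util.Random

// Hypothetical types for illustration only; not the project's classes.
sealed trait ITree
case class Leaf(size: Int) extends ITree
case class Node(feature: Int, split: Double, left: ITree, right: ITree) extends ITree

object ITreeSketch {
  // Recursively isolate points with random axis-aligned splits.
  def build(data: Vector[Array[Double]], depth: Int, maxDepth: Int, rng: Random): ITree = {
    if (depth >= maxDepth || data.size <= 1) Leaf(data.size)
    else {
      val feature = rng.nextInt(data.head.length)
      val values = data.map(_(feature))
      val (lo, hi) = (values.min, values.max)
      if (lo == hi) Leaf(data.size) // simplification: stop when the chosen feature is constant
      else {
        val split = lo + rng.nextDouble() * (hi - lo)
        val (l, r) = data.partition(_(feature) < split)
        Node(feature, split, build(l, depth + 1, maxDepth, rng), build(r, depth + 1, maxDepth, rng))
      }
    }
  }

  // Train `treesPerPartition` trees on every partition and pool them into one forest.
  // On a cluster the outer loop would run on the workers; here it runs locally.
  def trainForest(partitions: Seq[Vector[Array[Double]]],
                  treesPerPartition: Int,
                  maxDepth: Int,
                  seed: Long): Seq[ITree] =
    partitions.zipWithIndex.flatMap { case (part, i) =>
      val rng = new Random(seed + i) // one generator per partition
      Seq.fill(treesPerPartition)(build(part, 0, maxDepth, rng))
    }

  // Number of edges from the root to the leaf that x falls into.
  // The paper's leaf-size adjustment is omitted to keep the sketch short.
  def pathLength(x: Array[Double], tree: ITree, depth: Int = 0): Int = tree match {
    case Leaf(_) => depth
    case Node(f, s, l, r) =>
      if (x(f) < s) pathLength(x, l, depth + 1) else pathLength(x, r, depth + 1)
  }
}
```

With two partitions and three trees per partition, `trainForest` returns six trees, and no path is longer than `maxDepth`.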

Prediction uses all of the isolation trees trained by Spark to compute the outlier factor of a sample.
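For reference, the cited paper turns the average path length over all trees into an anomaly score s(x, n) = 2^(-E[h(x)] / c(n)), where c(n) = 2H(n-1) - 2(n-1)/n and H(i) is approximated by ln(i) + 0.5772156649. The sketch below computes that score in plain Scala. The names `AnomalyScore`, `c` and `score` are hypothetical, and the README does not state whether `predict` returns exactly this value.

```scala
object AnomalyScore {
  private val EulerGamma = 0.5772156649

  // c(n) from the paper: average path length of an unsuccessful
  // binary-search-tree lookup over n points.
  // Assumption: return 0 for n < 2 as a guard against log(0).
  def c(n: Int): Double =
    if (n < 2) 0.0
    else {
      val harmonic = math.log(n - 1.0) + EulerGamma // approximation of H(n - 1)
      2.0 * harmonic - 2.0 * (n - 1.0) / n
    }

  // s(x, n) = 2^(-E[h(x)] / c(n)); values near 1 suggest an outlier.
  def score(pathLengths: Seq[Double], sampleSize: Int): Double = {
    require(pathLengths.nonEmpty, "need at least one path length")
    require(sampleSize > 1, "sample size must be greater than 1")
    val meanPath = pathLengths.sum / pathLengths.size
    math.pow(2.0, -meanPath / c(sampleSize))
  }
}
```

A point whose average path length equals c(n) scores 0.5, and shorter paths push the score toward 1.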

Comparison:

SKLearn IForest: http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html

SVM OneClass: [figure: SVM OneClass result]

IForest On Spark: [figure: IForest On Spark test result]

The project relies on spark-2.1.0-bin-hadoop2.7. Download it at: http://spark.apache.org/downloads.html

How To Use:

    // Assumption: the generic DenseMatrix/DenseVector types come from Breeze
    import breeze.linalg.{DenseMatrix, DenseVector}

    // Configure the forest
    val prop = new IForestProperty
    prop.max_sample = 5000
    prop.n_estimators = 1500
    // Tree depth limit: floor(log2(max_sample))
    prop.max_depth_limit = (math.log(prop.max_sample) / math.log(2)).toInt
    prop.bootstrap = true
    prop.partition = 10

    // Train on Spark
    val ift = new IForestOnSpark(prop)
    val data_mtx: DenseMatrix[Double] = ... (train data in matrix)
    ift.fit(data_mtx, spark)

    // Predict the outlier factor of one sample
    val x: DenseVector[Double] = ... (test data)
    val output = ift.predict(x)

    // Serialize model to HDFS
    val if_serializer = new IForestSerializer
    if_serializer.serialize("hdfs://127.0.0.1/ifserialized", ift)

    // Load model from HDFS
    val if_loader = new IForestSerializer
    val localmodel = if_loader.deserialize("hdfs://172.16.22.14:9000/ifserialized")
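A note on `max_depth_limit`: the example above truncates log2(max_sample) with `toInt`, while the cited paper sets the height limit to the ceiling of log2 of the sub-sample size. The two differ by one whenever the sample size is not a power of two. The short sketch below shows both variants; `DepthLimit`, `floorLimit` and `ceilLimit` are hypothetical helper names, not part of the project.

```scala
object DepthLimit {
  private def log2(x: Double): Double = math.log(x) / math.log(2)

  // Same computation as the README example: truncate toward zero.
  def floorLimit(maxSample: Int): Int = log2(maxSample).toInt

  // Height limit as defined in the paper: ceiling(log2(sample size)).
  def ceilLimit(maxSample: Int): Int = math.ceil(log2(maxSample)).toInt
}
```

For `max_sample = 5000`, log2 is about 12.29, so `floorLimit` gives 12 and `ceilLimit` gives 13.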
