# SMORK: Implementation of SMOTE - Synthetic Minority Over-sampling Technique in SparkML / MLlib

This is a very basic implementation of the SMOTE algorithm for Spark ML. To the author's knowledge, it is the only available implementation that plugs into Spark ML Pipelines.
## Prerequisites

- Spark 2.3.0+
## Building

Build the jar with sbt:

```
sbt clean package
```
## Setup

On Linux, add the built jar to the driver classpath when launching your Spark application:

```
--conf "spark.driver.extraClassPath=/path/to/smork-0.0.1.jar"
```
## Usage

```scala
import com.iresium.ml.SMOTE

// Point SMOTE at the feature and label columns of your imbalanced DataFrame
val smote = new SMOTE()
smote.setfeatureCol("myFeatures").setlabelCol("myLabel").setbucketLength(100)

// Fit on the data, then transform to obtain a DataFrame augmented with
// synthetic minority-class rows
val smoteModel = smote.fit(df)
val newDF = smoteModel.transform(df)
```
You can also see and run an example in `src/main/scala/SMORKApp.scala`.
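For reference, here is a minimal end-to-end sketch of the same flow. The SparkSession setup, the toy DataFrame, and the column names `myFeatures`/`myLabel` are illustrative assumptions made for this example; only the `SMOTE` calls themselves come from the snippet above.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession
import com.iresium.ml.SMOTE

// Hypothetical standalone driver, for illustration only
object SMOTEExample extends App {
  val spark = SparkSession.builder()
    .appName("SMOTEExample")
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._

  // Toy imbalanced dataset (illustrative): label 1.0 is the minority class
  val df = Seq(
    (Vectors.dense(1.0, 2.0), 0.0),
    (Vectors.dense(1.2, 1.9), 0.0),
    (Vectors.dense(0.8, 2.1), 0.0),
    (Vectors.dense(5.0, 8.0), 1.0),
    (Vectors.dense(5.5, 8.2), 1.0)
  ).toDF("myFeatures", "myLabel")

  val smote = new SMOTE()
  smote.setfeatureCol("myFeatures").setlabelCol("myLabel").setbucketLength(100)

  // Fit, then transform; the result should contain additional synthetic
  // minority-class rows alongside the original data
  val newDF = smote.fit(df).transform(df)
  newDF.groupBy("myLabel").count().show()

  spark.stop()
}
```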
## Future Work

- PySMORK - a Python wrapper for SMORK that allows you to use SMOTE from PySpark
- Support for categorical attributes
## Contributing

Looking for contributors! You are welcome to raise issues or send a pull request.
## Authors

- Abhinandan Dubey - @alivcor
## License

This project is licensed under the MIT License - see the LICENSE.md file for details.