Some Weka-based tools written in Jython


Set of Jython tools to perform data mining tasks using Weka

Needs Jython and Weka.

Uses the UCI soybean data set of Michalski and Chilausky.

Originally developed for a class assignment.


  1. **setup.bat** Shows how to set up the classpath to use WEKA from Jython
  2. Pre-processes the soybean data set
  3. Finds subset of attributes that give best classification accuracy for a given algorithm and data set
  4. Weka .arff file reader and writer
  5. Splits a WEKA .arff file to preserve class distribution and maximize or minimize aggregate accuracy of a set of classifiers. Output is 2 WEKA .arff files
  6. *find_soybean_split.bat / * Shows how to run on a pre-processed soybean .arff file
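
Item 5 above splits a data set while preserving class distribution. The stratification step alone might look roughly like the sketch below, written in plain Python that should also run under Jython. It assumes each instance is a list whose last element is the class label. The name `stratified_split` and its parameters are hypothetical and are not taken from this repository. The search that maximizes or minimizes aggregate classifier accuracy is not shown.

```python
import random
from collections import defaultdict


def stratified_split(instances, test_fraction=0.2, seed=0):
    """Split instances into (train, test) with similar class proportions.

    Assumption: the class label is the last element of each instance.
    """
    rng = random.Random(seed)

    # Group instances by class label.
    by_class = defaultdict(list)
    for inst in instances:
        by_class[inst[-1]].append(inst)

    train, test = [], []
    for label in sorted(by_class):
        group = by_class[label][:]
        rng.shuffle(group)
        # Take the same fraction of every class for the test set.
        n_test = int(round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test
```

Changing `seed` gives a different split with the same class proportions, which is one way a search over candidate splits could be driven.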

Results are in the data directory.

Example use of

The batch/shell file find_soybean_split.bat / runs on to create the training and test files and which give the classification results soybean.split.results.txt whose summary is

| Classifier | Correct (out of 60) | Percentage Correct |
| --- | --- | --- |
| NaiveBayes | 57 | 95 % |
| J48 | 58 | 96.67 % |
| BayesNet | 59 | 98.33 % |
| RandomForest | 59 | 98.33 % |
| JRip | 60 | 100 % |
| KStar | 60 | 100 % |
| SMO | 60 | 100 % |
| MLP | 60 | 100 % |
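
The percentages in the summary above follow directly from the correct counts over the 60 test instances. As a small check, the sketch below recomputes them in plain Python. The helper name `percent_correct` is hypothetical; the counts are copied from the summary.

```python
# Correct counts copied from the summary above (60 test instances).
TOTAL = 60
CORRECT = [
    ("NaiveBayes", 57),
    ("J48", 58),
    ("BayesNet", 59),
    ("RandomForest", 59),
    ("JRip", 60),
    ("KStar", 60),
    ("SMO", 60),
    ("MLP", 60),
]


def percent_correct(n_correct, total=TOTAL):
    """Return the percentage correct, rounded to two decimal places."""
    return round(100.0 * n_correct / total, 2)


for name, n in CORRECT:
    print("%-12s %2d  %6.2f %%" % (name, n, percent_correct(n)))
```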