# Using Weka from Python
Authors: Mauricio Beltrán, Juan Antonio Vicente, Alfonso Carabantes

This notebook walks through the setup needed to use **Weka** from **Python**. We use the **python-weka-wrapper3** module (http://fracpete.github.io/python-weka-wrapper3), which exposes the algorithms implemented in Weka by calling its Java API.

## Installing the modules

In [None]:
!pip install javabridge
!pip install python-weka-wrapper3
!apt-get install python3-pygraphviz

Reading package lists... Done
Building dependency tree       
Reading state information... Done
python3-pygraphviz is already the newest version (1.4~rc1-1build2.1).
The following package was automatically installed and is no longer required:
  libnvidia-common-440
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.
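
Before moving on, it can help to confirm that the modules installed above are importable from the current kernel. The following is a minimal sketch, assuming the import names are `javabridge`, `weka` and `pygraphviz`; the helper `missing_modules` is hypothetical and not part of any of these packages.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed for the packages installed above.
required = ["javabridge", "weka", "pygraphviz"]
missing = missing_modules(required)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All required modules are importable")
```

If a module is reported as not importable right after installation, restarting the kernel is usually the first thing to try.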


## Starting the Java virtual machine so we can use Weka

In [None]:
import weka.core.jvm as jvm

# jvm.stop()
# Start the JVM with Weka package support enabled
jvm.start(packages=True)

DEBUG:weka.core.jvm:Adding bundled jars
DEBUG:weka.core.jvm:Classpath=['/usr/local/lib/python3.6/dist-packages/javabridge/jars/rhino-1.7R4.jar', '/usr/local/lib/python3.6/dist-packages/javabridge/jars/runnablequeue.jar', '/usr/local/lib/python3.6/dist-packages/javabridge/jars/cpython.jar', '/usr/local/lib/python3.6/dist-packages/weka/lib/weka.jar', '/usr/local/lib/python3.6/dist-packages/weka/lib/python-weka-wrapper.jar']
DEBUG:weka.core.jvm:MaxHeapSize=default
DEBUG:weka.core.jvm:Package support enabled
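
In a notebook the JVM normally stays up across cells. In a standalone script it is common to pair `jvm.start` with `jvm.stop` so the JVM is shut down even if an error occurs. The sketch below illustrates that pattern under stated assumptions: `run_with_jvm` and `FakeJvm` are hypothetical names, and the stand-in object exists only so the example runs without Weka installed.

```python
def run_with_jvm(jvm, task):
    """Start the JVM, run task(), and always stop the JVM afterwards."""
    jvm.start(packages=True)
    try:
        return task()
    finally:
        jvm.stop()


# Intended use in a standalone script (requires python-weka-wrapper3):
#   import weka.core.jvm as jvm
#   run_with_jvm(jvm, main)  # `main` is a hypothetical function holding the Weka calls


class FakeJvm:
    """Stand-in that records calls instead of starting a real JVM."""

    def __init__(self):
        self.calls = []

    def start(self, packages=False):
        self.calls.append(("start", packages))

    def stop(self):
        self.calls.append(("stop",))


fake = FakeJvm()
result = run_with_jvm(fake, lambda: "done")
print(result, fake.calls)
```

According to the wrapper's documentation, as far as we are aware, a stopped JVM cannot be started again in the same Python process, which is likely why `jvm.stop()` is left commented out in the cell above.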


## Loading the iris data file

In [None]:
from weka.core.converters import Loader

data_dir = "./"
loader = Loader(classname="weka.core.converters.ArffLoader")
# Load the iris dataset directly from a URL
data = loader.load_url("https://gist.githubusercontent.com/myui/143fa9d05bd6e7db0114/raw/500f178316b802f1cade6e3bf8dc814a96e84b1e/iris.arff")
# Alternative: load a local copy of the file instead
# data = loader.load_file(data_dir + "iris.arff")
# Use the last attribute as the class attribute
data.class_is_last()

# print(data)
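
To make the effect of `data.class_is_last()` concrete, the sketch below reads the header of a small ARFF document in plain Python and picks the last attribute as the class. The sample text is a hand-written, abbreviated stand-in for the iris file (an assumption, not the downloaded data), and `arff_attribute_names` is a hypothetical helper that ignores quoted attribute names containing spaces.

```python
def arff_attribute_names(text):
    """Return the attribute names declared in an ARFF header, in order."""
    names = []
    for line in text.splitlines():
        stripped = line.strip()
        lower = stripped.lower()
        if lower.startswith("@data"):
            break  # the header ends where the data section starts
        if lower.startswith("@attribute"):
            # The second whitespace-separated token is the attribute name.
            names.append(stripped.split()[1])
    return names


sample = """@relation iris
@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}
@data
5.1,3.5,1.4,0.2,Iris-setosa
"""

names = arff_attribute_names(sample)
class_index = len(names) - 1  # what "class is last" amounts to
print(names[class_index], class_index)
```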

## Trying the J48 classifier

In [None]:
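from weka.classifiers import Classifier

# J48 decision tree; -C sets the pruning confidence factor
cls = Classifier(classname="weka.classifiers.trees.J48", options=["-C", "0.3"])
# Train on the full iris dataset
cls.build_classifier(data)

# Print the learned tree
print(cls)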


J48 pruned tree
------------------

petalwidth <= 0.6: Iris-setosa (50.0)
petalwidth > 0.6
|   petalwidth <= 1.7
|   |   petallength <= 4.9: Iris-versicolor (48.0/1.0)
|   |   petallength > 4.9
|   |   |   petalwidth <= 1.5: Iris-virginica (3.0)
|   |   |   petalwidth > 1.5: Iris-versicolor (3.0/1.0)
|   petalwidth > 1.7: Iris-virginica (46.0/1.0)

Number of Leaves  : 	5

Size of the tree : 	9
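
In the tree printed above, each leaf ends with `(N)` or `(N/E)`: `N` is the number of training instances that reach the leaf and `E` is how many of them the leaf misclassifies. The sketch below parses those counts from a copy of the printed tree to obtain the accuracy on the training data. The regular expression and the `leaf_counts` helper are illustrative assumptions, not part of Weka or the wrapper.

```python
import re

tree_text = """\
petalwidth <= 0.6: Iris-setosa (50.0)
petalwidth > 0.6
|   petalwidth <= 1.7
|   |   petallength <= 4.9: Iris-versicolor (48.0/1.0)
|   |   petallength > 4.9
|   |   |   petalwidth <= 1.5: Iris-virginica (3.0)
|   |   |   petalwidth > 1.5: Iris-versicolor (3.0/1.0)
|   petalwidth > 1.7: Iris-virginica (46.0/1.0)
"""

# Matches ": <label> (N)" or ": <label> (N/E)" at the end of a line.
LEAF = re.compile(r":\s*(\S+)\s*\((\d+(?:\.\d+)?)(?:/(\d+(?:\.\d+)?))?\)\s*$")


def leaf_counts(text):
    """Return (label, instances, errors) for every leaf line in the tree."""
    leaves = []
    for line in text.splitlines():
        match = LEAF.search(line)
        if match:
            label = match.group(1)
            instances = float(match.group(2))
            errors = float(match.group(3) or 0.0)
            leaves.append((label, instances, errors))
    return leaves


leaves = leaf_counts(tree_text)
total = sum(n for _, n, _ in leaves)
errors = sum(e for _, _, e in leaves)
accuracy = (total - errors) / total
print(len(leaves), total, errors, accuracy)
```

Note that this is accuracy on the same data the tree was trained on, so it tends to overestimate how well the tree would do on unseen instances.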



## Installing other Weka packages



We can list the packages that are currently installed

In [None]:
import weka.core.packages as packages
packages.installed_packages()


[AnDE (1.2.1), multilayerPerceptronCS (1.0.1)]

We can list the packages that are available for download

In [None]:
import weka.core.packages as packages
items = packages.all_packages()
for item in items:
    print(item.classname + " " + item.url)




weka.core.packageManagement.DefaultPackage https://github.com/felipebravom/AffectiveTweets/releases/download/1.0.2/AffectiveTweets1.0.2.zip
weka.core.packageManagement.DefaultPackage http://prdownloads.sourceforge.net/averagedndepend/AnDE1.2.1.zip?download
weka.core.packageManagement.DefaultPackage https://github.com/garfieldnate/Weka_AnalogicalModeling/releases/download/0.04/Weka_AnalogicalModeling-0.04.zip
weka.core.packageManagement.DefaultPackage http://prdownloads.sourceforge.net/ar-text-mining/ArabicStemmers_LightStemmers_1.0.0.zip?download
weka.core.packageManagement.DefaultPackage http://www.cs.ubc.ca/labs/beta/Projects/autoweka/autoweka-2.6.1.zip
weka.core.packageManagement.DefaultPackage https://github.com/fritziF/BANGFile-WekaPackage/releases/download/v1.0/BANGFile.zip
weka.core.packageManagement.DefaultPackage http://csusap.csu.edu.au/~zislam/code/CAIRAD.1.0.zip
weka.core.packageManagement.DefaultPackage https://github.com/jiangliangxiao/CFWNB/blob/master/CFWNB.zip?raw=true

With the **AveragedNDependenceEstimators** (AnDE) package installed, we instantiate its **A1DE** classifier

In [None]:
cls2 = Classifier(classname="weka.classifiers.bayes.AveragedNDependenceEstimators.A1DE" )
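
The cell above only works if the AnDE package is already installed. Here is a hedged sketch of how the installation step could be made explicit, assuming the wrapper's `weka.core.packages` module provides `installed_packages()` and `install_package(name)` and that package objects expose a `name` attribute. The helper `ensure_package` and the stand-in module are hypothetical; the stand-in exists only so the example runs without Weka.

```python
from types import SimpleNamespace


def ensure_package(pkg_module, name):
    """Install `name` through pkg_module unless it is already installed.

    Returns True if an installation was triggered, False otherwise.
    """
    installed = {pkg.name for pkg in pkg_module.installed_packages()}
    if name in installed:
        return False
    pkg_module.install_package(name)
    return True


# Intended use with the real module (requires python-weka-wrapper3 and a running JVM):
#   import weka.core.packages as packages
#   if ensure_package(packages, "AnDE"):
#       print("Package installed; restart the JVM before using it")

# Stand-in module so the sketch runs without Weka installed.
requested = []
fake_packages = SimpleNamespace(
    installed_packages=lambda: [SimpleNamespace(name="AnDE")],
    install_package=requested.append,
)

print(ensure_package(fake_packages, "AnDE"))            # already installed
print(ensure_package(fake_packages, "ExamplePackage"))  # triggers an install
```

A newly installed package typically becomes usable only after the JVM, and therefore the Python process or kernel, has been restarted.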


We try the **A1DE** classifier

In [None]:

# Build the model
cls2.build_classifier(data)

# Print the model
print(cls2)

# Make predictions on the data
for index, inst in enumerate(data):
    # Predicted class index
    pred = cls2.classify_instance(inst)
    # Class probabilities
    dist = cls2.distribution_for_instance(inst)
    print(str(index+1) + ": label index=" + str(pred) + ", class distribution=" + str(dist))

The A1DE Classifier

Class Iris-setosa: Prior probability = 0.33
Class Iris-versicolor: Prior probability = 0.33
Class Iris-virginica: Prior probability = 0.33

Dataset: iris-weka.filters.supervised.attribute.Discretize-Rfirst-last-precision6
Instances: 150
Attributes: 5
Frequency limit for superParents: (F = 1) 
Correction: m-estimate (m = 1.0)
Incremental Classifier Flag: (false)
Subsumption Resolution Flag: (false)
Critical Value for Subsumption Resolution (100)
Weighted AODE Flag: (false)

1: label index=0.0, class distribution=[9.99737472e-01 1.28257235e-04 1.34270696e-04]
2: label index=0.0, class distribution=[9.99653232e-01 1.45670854e-04 2.01097391e-04]
3: label index=0.0, class distribution=[9.99653232e-01 1.45670854e-04 2.01097391e-04]
4: label index=0.0, class distribution=[9.99653232e-01 1.45670854e-04 2.01097391e-04]
5: label index=0.0, class distribution=[9.99737472e-01 1.28257235e-04 1.34270696e-04]
6: label index=0.0, class distribution=[9.99737472e-01 1.28257235e-04 1