Hey, thanks for this awesome library. Would be awesome to see it integrated in sklearn some day.
I'm currently exploring sampling methods as a kind of hyperparameter search, running each sampler on my data set before classification. It makes it through all of the over-sampling methods listed in the IPython Notebook examples, but when it gets to EasyEnsemble (and BalanceCascade), my code breaks because the return type for X_train has changed.
I'm not sure why it's changing, or whether it's expected, but it doesn't seem to be mentioned in the docs or in any code comments. Later in the example, both ensemble methods seem to be given special treatment. Perhaps I should go and read the paper, but if some samplers return a different type than the others, I really think that should be documented.
elif sampler:
    sampler.ratio = float(np.count_nonzero(y==1)) / float(np.count_nonzero(y==0))
    print "Using {1} sampling method with ratio {0}".format(sampler.ratio,str(sampler))
    X_train, y_train = sampler.fit_transform(X_train,y_train)
    print "Training: Feature space holds %d observations and %d features" % X_train.shape

This code normally runs fine, but breaks on ensemble methods:
  File "code.py", line 168, in run_labelkfold
    print "Training: Feature space holds %d observations and %d features" % X_train.shape
AttributeError: 'list' object has no attribute 'shape'
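
For anyone hitting the same error, here is a minimal sketch of a workaround. It assumes that the ensemble samplers return a sequence of resampled subsets (a list or a 3-D array) rather than a single 2-D array, which is what the traceback suggests but which I have not confirmed against the library source. The helper name `iter_resampled_subsets` is hypothetical and not part of the library, and the arrays below are stand-in data rather than real sampler output.

```python
import numpy as np


def iter_resampled_subsets(X_res, y_res):
    """Yield (X, y) pairs whether the sampler returned one set or several.

    Hypothetical helper: assumes ensemble samplers return a sequence of
    subsets, while the other samplers return a single 2-D array.
    """
    if isinstance(X_res, np.ndarray) and X_res.ndim == 2:
        # Single resampled data set (the usual over-/under-sampler case).
        yield X_res, np.asarray(y_res)
    else:
        # Assumed ensemble case: iterate over the subsets one by one.
        for X_sub, y_sub in zip(X_res, y_res):
            yield np.asarray(X_sub), np.asarray(y_sub)


# Stand-in data: one 2-D array versus a list of two subsets.
X_single = np.zeros((6, 3))
y_single = np.array([0, 0, 0, 1, 1, 1])
X_multi = [np.zeros((4, 3)), np.zeros((4, 3))]
y_multi = [np.array([0, 0, 1, 1]), np.array([0, 0, 1, 1])]

single_shapes = [X.shape for X, _ in iter_resampled_subsets(X_single, y_single)]
multi_shapes = [X.shape for X, _ in iter_resampled_subsets(X_multi, y_multi)]
print(single_shapes)
print(multi_shapes)
```

With something like this, the training loop can report the shape of each subset (or fit one classifier per subset) without caring which kind of sampler produced it.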