ArrayIndexOutOfBoundsException after dimensionality reduction #27
Comments
Does this error occur with other classifiers or just StochasticMultinomialLogisticRegression?
@EdwardRaff It happens with MultinomialNaiveBayes too, so I suspect it is not an isolated phenomenon.
Can you share the data that this happens with? And have you tried using JSAT built from the head revision?
Of course: https://github.com/DJBen/genre-classifier/blob/master/data/dataset_train.libsvm I am using version 0.0.2 from Maven. I will get back to you once I have tried the head version.
I just pulled the head and it seems the problem persists.
I can't reproduce this issue. What does the rest of your code and environment look like? The code below works fine for me.

import java.io.File;
import java.io.IOException;
import jsat.classifiers.ClassificationDataSet;
import jsat.classifiers.Classifier;
import jsat.classifiers.bayesian.MultinomialNaiveBayes;
import jsat.classifiers.linear.StochasticMultinomialLogisticRegression;
import jsat.datatransform.PCA;
import jsat.datatransform.featureselection.MutualInfoFS;
import jsat.io.LIBSVMLoader;

/**
 *
 * @author Edward Raff <Raff.Edward@gmail.com>
 */
public class TmpTest
{
    public static void main(String[] args) throws IOException
    {
        String path = "dataset_train.libsvm";
        ClassificationDataSet cds = LIBSVMLoader.loadC(new File(path));
        // Size and dimension before the reduction
        System.out.println("N=" + cds.getSampleSize());
        System.out.println("D=" + cds.getNumNumericalVars());
        cds.applyTransform(new MutualInfoFS(cds, 200));
        // cds.applyTransform(new PCA(cds, 50));
        // Size and dimension after the reduction
        System.out.println("N=" + cds.getSampleSize());
        System.out.println("D=" + cds.getNumNumericalVars());
        // Classifier classifier = new MultinomialNaiveBayes();
        Classifier classifier = new StochasticMultinomialLogisticRegression();
        classifier.trainC(cds);
    }
}
I tried this on various systems and cannot reproduce it either.
I discovered that I didn't apply the PCA to the test dataset, so the test dataset still has more than 200 dimensions. Thank you, it was my mistake.
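For anyone landing here with the same symptom, the resolution above amounts to "fit the reduction once on the training data, then apply that same fitted reduction to every dataset the classifier will see". The sketch below illustrates the pattern in plain Java without JSAT. `FittedSelector`, its `fit` and `apply` methods, and the column-scoring rule are hypothetical stand-ins invented for illustration, not JSAT classes or the library's selection criterion. In JSAT terms this would presumably mean keeping the `MutualInfoFS` or `PCA` instance and passing it to `applyTransform` on the test set as well.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a fitted feature selector; not part of JSAT. */
final class FittedSelector {
    private final int[] keptIndices;

    private FittedSelector(int[] keptIndices) {
        this.keptIndices = keptIndices;
    }

    /** "Fits" on the training rows by keeping the k columns with the largest absolute sum. */
    static FittedSelector fit(double[][] trainRows, int k) {
        int d = trainRows[0].length;
        double[] score = new double[d];
        for (double[] row : trainRows) {
            for (int j = 0; j < d; j++) {
                score[j] += Math.abs(row[j]);
            }
        }
        Integer[] order = new Integer[d];
        for (int j = 0; j < d; j++) {
            order[j] = j;
        }
        // Sort column indices by descending score
        Arrays.sort(order, (a, b) -> Double.compare(score[b], score[a]));
        int[] kept = new int[k];
        for (int j = 0; j < k; j++) {
            kept[j] = order[j];
        }
        Arrays.sort(kept); // keep the surviving columns in their original order
        return new FittedSelector(kept);
    }

    /** Projects any row (training or test) onto the columns chosen during fit. */
    double[] apply(double[] row) {
        double[] out = new double[keptIndices.length];
        for (int j = 0; j < keptIndices.length; j++) {
            out[j] = row[keptIndices[j]];
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] train = {{1, 0, 5, 0}, {2, 0, 6, 0}};
        double[] test = {3, 9, 7, 9};
        FittedSelector selector = fit(train, 2);           // fit on training data only
        double[] reducedTrain = selector.apply(train[0]);  // 2 features
        double[] reducedTest = selector.apply(test);       // same 2 features
        System.out.println(reducedTrain.length + " " + reducedTest.length);
    }
}
```

The point of the design is that the test rows never influence which features are kept, and both datasets end up with the same dimension as the trained model.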
I tried to use PCA or MutualInfoFS for dimensionality reduction. When I then run the regression classifier, it crashes at index max dimension + 1.
The error is:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 201
at jsat.linear.DenseVector.get(DenseVector.java:116)
at jsat.linear.SparseVector.dot(SparseVector.java:511)
at jsat.classifiers.linear.StochasticMultinomialLogisticRegression.classify(StochasticMultinomialLogisticRegression.java:539)
Is that a bug? Or did I miss something?
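A trace of this shape (a sparse dot product reading past the end of a dense vector) is what one would expect when the input has more features than the model was trained on. The following is a minimal plain-Java sketch of that failure mode and of an explicit dimension check that gives a clearer message. `DimensionCheckSketch`, `sparseDot`, and `requireSameDimension` are hypothetical names for illustration, not JSAT code, and the arrays are made-up toy values.

```java
/** Hypothetical illustration of a dimension mismatch; not JSAT code. */
final class DimensionCheckSketch {

    /** Dot product of a sparse vector (parallel index/value arrays) with a dense array. */
    static double sparseDot(int[] indices, double[] values, double[] dense) {
        double sum = 0;
        for (int i = 0; i < indices.length; i++) {
            // Throws ArrayIndexOutOfBoundsException if indices[i] >= dense.length
            sum += values[i] * dense[indices[i]];
        }
        return sum;
    }

    /** Fails fast with a descriptive message when the dimensions disagree. */
    static void requireSameDimension(int expected, int actual) {
        if (expected != actual) {
            throw new IllegalArgumentException(
                "model was trained on " + expected + " features but input has " + actual);
        }
    }

    public static void main(String[] args) {
        double[] weights = {0.5, -1.0, 2.0};   // model trained on 3 features
        int[] indices = {0, 3};                // input still has a 4th feature (index 3)
        double[] values = {1.0, 4.0};
        int inputDimension = 4;

        try {
            requireSameDimension(weights.length, inputDimension);
            System.out.println(sparseDot(indices, values, weights));
        } catch (IllegalArgumentException e) {
            System.out.println("Dimension mismatch: " + e.getMessage());
        }
    }
}
```

Comparing the dataset dimension against the trained model before classifying turns an opaque index error into a message that points at the unreduced dataset.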