ArrayIndexOutOfBoundException after dimensional reduction #27

Closed
DJBen opened this issue Dec 15, 2015 · 8 comments

Comments

@DJBen

DJBen commented Dec 15, 2015

I tried to use PCA or MutualInfoFS for dimensionality reduction. When I then run the regression classifier, it crashes with an out-of-bounds index of the max dimension + 1.

MutualInfoFS transform = new MutualInfoFS(dataSet, 200);
dataSet.applyTransform(transform);
classifier = new StochasticMultinomialLogisticRegression(learningRate, iterations);
classifier.trainC(dataSet);

And the error is

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 201
at jsat.linear.DenseVector.get(DenseVector.java:116)
at jsat.linear.SparseVector.dot(SparseVector.java:511)
at jsat.classifiers.linear.StochasticMultinomialLogisticRegression.classify(StochasticMultinomialLogisticRegression.java:539)

Is that a bug? Or did I miss something?

@EdwardRaff
Owner

Does this error occur with other classifiers or just StochasticMultinomialLogisticRegression?

@DJBen
Author

DJBen commented Dec 15, 2015

@EdwardRaff It happens on MultinomialNaiveBayes too, so I suspect it is not an isolated phenomenon.

@EdwardRaff
Owner

Can you share the data that this happens with? And have you tried using JSAT built from the head revision?

@DJBen
Author

DJBen commented Dec 15, 2015

Of course.

https://github.com/DJBen/genre-classifier/blob/master/data/dataset_train.libsvm
https://github.com/DJBen/genre-classifier/blob/master/data/dataset_test.libsvm

I am using version 0.0.2 from Maven. I am trying the head version now and will get back to you.

@DJBen
Author

DJBen commented Dec 15, 2015

I just pulled the head and it seems the problem persists.

@EdwardRaff
Owner

I can't reproduce this issue. What do the rest of your code and your environment look like? The code below works fine for me.

import java.io.File;
import java.io.IOException;
import jsat.classifiers.ClassificationDataSet;
import jsat.classifiers.Classifier;
import jsat.classifiers.bayesian.MultinomialNaiveBayes;
import jsat.classifiers.linear.StochasticMultinomialLogisticRegression;
import jsat.datatransform.PCA;
import jsat.datatransform.featureselection.MutualInfoFS;
import jsat.io.LIBSVMLoader;

/**
 *
 * @author Edward Raff <Raff.Edward@gmail.com>
 */
public class TmpTest
{
    public static void main(String[] args) throws IOException
    {
        String path = "dataset_train.libsvm";

        ClassificationDataSet cds = LIBSVMLoader.loadC(new File(path));
        System.out.println("N=" + cds.getSampleSize());
        System.out.println("D=" + cds.getNumNumericalVars());


        cds.applyTransform(new MutualInfoFS(cds, 200));
//        cds.applyTransform(new PCA(cds, 50));

        System.out.println("N=" + cds.getSampleSize());
        System.out.println("D=" + cds.getNumNumericalVars());

//        Classifier classifier = new MultinomialNaiveBayes();
        Classifier classifier = new StochasticMultinomialLogisticRegression();

        classifier.trainC(cds);
    }
}

@TKlerx
Contributor

TKlerx commented Dec 15, 2015

I tried this on various systems and cannot reproduce it either.
What are your values for learningRate and iterations?

@DJBen
Author

DJBen commented Dec 16, 2015

I discovered that I didn't apply the PCA to the test dataset, so the test dataset still has more than 200 dimensions. Thank you, it was my mistake.
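
For anyone who hits the same exception, here is a minimal sketch of the mistake and its fix, written in plain Java without JSAT so that it runs on its own. The class `TransformBothSets`, the methods `select` and `score`, and the toy numbers are all hypothetical stand-ins: `select` plays the role of a fitted feature-selection transform, and `score` plays the role of the dot product that appears in the stack trace. In JSAT terms, the fix is presumably to call `applyTransform` with the same transform object on the test data set as well as on the training set.

```java
/** Hypothetical illustration only; none of these names are part of JSAT. */
class TransformBothSets
{
    /** Stand-in for a fitted feature selector: keeps the columns in {@code kept}, in order. */
    static double[] select(double[] x, int[] kept)
    {
        double[] out = new double[kept.length];
        for (int i = 0; i < kept.length; i++)
            out[i] = x[kept[i]];
        return out;
    }

    /** Stand-in for a linear model's dot product; checks dimensions before indexing. */
    static double score(double[] w, double[] x)
    {
        if (x.length != w.length)
            throw new IllegalArgumentException(
                "model expects " + w.length + " features but got " + x.length);
        double s = 0;
        for (int i = 0; i < w.length; i++)
            s += w[i] * x[i];
        return s;
    }

    public static void main(String[] args)
    {
        int[] kept = {0, 2};                        // columns chosen while fitting on the training set
        double[] w = {0.5, -1.0};                   // weights learned on the reduced training data
        double[] testPoint = {1.0, 7.0, 3.0, 9.0};  // raw test point, still 4-dimensional

        // Fix: push the test point through the same selection before scoring.
        System.out.println(score(w, select(testPoint, kept)));

        // Mistake: scoring the raw test point against a model trained on reduced data.
        try
        {
            score(w, testPoint);
        }
        catch (IllegalArgumentException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```

The explicit length check in `score` is a design choice for the sketch: it turns a bare `ArrayIndexOutOfBoundsException` into a message that names both dimensions, which makes a forgotten transform easier to spot.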

DJBen closed this as completed Dec 16, 2015