InvocationTargetException error #22

Closed
ajing opened this issue Aug 14, 2017 · 11 comments

Comments

@ajing

ajing commented Aug 14, 2017

Hi,

I encountered an error while running StackNet. Here is the command:

java -Xmx12144m -jar StackNet.jar train train_file='/home/jlu/Experiments/Examples/Instacart/imba/data/nz_train_slim.csv' test_file='/home/jlu/Experiments/Examples/Instacart/imba/data/all_data_test_V1.csv' has_head=true params='/home/jlu/Experiments/Examples/Instacart/imba/paramsv1.txt' sparse=false pred_file='/home/jlu/Experiments/Examples/Instacart/imba/data/stacknet_pred_V1.csv' test_target=false verbose=true Threads=10 folds=5 seed=1 metric=auc output_name=restack_instacart folds=10 seed=1 task=classification

Here is the error message. What does the InvocationTargetException here imply?

parameter name : train_file value :  /home/jlu/experiments/examples/instacart/imba/data/nz_train_slim.csv
parameter name : test_file value :  /home/jlu/experiments/examples/instacart/imba/data/all_data_test_v1.csv
parameter name : has_head value :  true
parameter name : params value :  /home/jlu/experiments/examples/instacart/imba/paramsv1.txt
parameter name : sparse value :  false
parameter name : pred_file value :  /home/jlu/experiments/examples/instacart/imba/data/stacknet_pred_v1.csv
parameter name : test_target value :  false
parameter name : verbose value :  true
parameter name : threads value :  10
parameter name : folds value :  5
parameter name : seed value :  1
parameter name : metric value :  auc
parameter name : output_name value :  restack_instacart
parameter name : folds value :  10
parameter name : seed value :  1
parameter name : task value :  classification
 Completed: 5.00 %
 Completed: 10.00 %
 Completed: 15.00 %
 Completed: 20.00 %
 Completed: 25.00 %
 Completed: 30.00 %
 Completed: 35.00 %
 Completed: 40.00 %
 Completed: 45.00 %
 Completed: 50.00 %
 Completed: 55.00 %
 Completed: 60.00 %
 Completed: 65.00 %
 Completed: 70.00 %
 Completed: 75.00 %
 Completed: 80.00 %
 Completed: 85.00 %
 Completed: 90.00 %
 Completed: 95.00 %
 Completed: 100.00 %
 Loaded File: /home/jlu/Experiments/Examples/Instacart/imba/data/nz_train_slim.csv
 Total rows in the file: 8474661
 Total columns in the file: 78
 Weighted variable : -1 counts: 0
 Int Id variable : -1 str id: -1 counts: 0
 Target Variables  : 1 values : [0]
 Actual columns number  : 77
 Number of Skipped rows   : 0
 Actual Rows (removing the skipped ones)  : 8474661
Loaded dense train data with 8474661 and columns 77
 loaded data in : 125.971000
 Level: 1 dimensionality: 893
 Starting cross validation
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.NegativeArraySizeException
        at matrix.fsmatrix.<init>(fsmatrix.java:85)
        at ml.stacknet.StackNetClassifier.fit(StackNetClassifier.java:2749)
        at stacknetrun.runstacknet.main(runstacknet.java:471)
        ... 5 more

@goldentom42

goldentom42 commented Aug 14, 2017

Hi ajing,

Looks like you're trying to predict what's in your next shopping cart ;-) But it may not be the right time to make a joke...

I assume the cause of the exception is:

Caused by: java.lang.NegativeArraySizeException
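
On the question of what InvocationTargetException implies: it is only a wrapper. The trace shows JarRsrcLoader calling the real main method through Method.invoke, and any exception thrown by a reflectively invoked method is reported this way, with the real error listed under "Caused by". A minimal sketch of that behaviour follows; the class UnwrapDemo and the method failingMain are hypothetical stand-ins, not StackNet code.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class UnwrapDemo {
    // Hypothetical stand-in for an application entry point that fails.
    public static int failingMain() {
        int size = -1;
        double[] data = new double[size]; // throws NegativeArraySizeException
        return data.length;
    }

    // Calls failingMain through reflection and returns the wrapped cause,
    // or null if the call completed normally.
    static Throwable invokeAndUnwrap() throws Exception {
        Method m = UnwrapDemo.class.getDeclaredMethod("failingMain");
        try {
            m.invoke(null); // static method, so no receiver object
            return null;
        } catch (InvocationTargetException e) {
            return e.getCause(); // the exception thrown inside failingMain
        }
    }

    public static void main(String[] args) throws Exception {
        Throwable cause = invokeAndUnwrap();
        System.out.println("Wrapped cause: " + cause);
    }
}
```

So when reading a trace like the one above, the first "Caused by" line is the one worth looking at.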

Somehow StackNet, and more specifically the fsmatrix initialization, fails at line 85:

this.data=new double [rows*columns];

So StackNet ends up with a negative value for either rows or columns, even though it successfully reads your files...

Just for the record, is there anything wrong in paramsv1.txt, like negative values?
One last question: what version of StackNet are you using? The stack trace does not seem to be in line with the latest master branch.
Goldentom.

@ajing
Author

ajing commented Aug 14, 2017

Exactly, Goldentom. These are the last few hours of the Instacart competition. I quickly threw a model together last night to see what would happen, so I was not very careful about selecting models. I just used the Quora example (because it is also binary classification...)

LogisticRegression Type:Liblinear C:0.8 threads:1 usescale:True maxim_Iteration:100 seed:1 verbose:false
RandomForestClassifier estimators:100 threads:1 rounding:3 cut_off_subsample:0.15 max_depth:7 max_features:0.7 min_leaf:3.0 min_split:5.0 Objective:ENTROPY row_subsample:0.95 seed:1 verbose:false
LogisticRegression Type:SGD C:0.00001 threads:1 learn_rate:0.1 usescale:True maxim_Iteration:20 seed:1 verbose:false
LSVC Type:Liblinear threads:1 usescale:True C:3.0 maxim_Iteration:100 seed:1 verbose:false copy:false
LSVC Type:SGD C:0.00001 threads:1 learn_rate:0.1 usescale:True maxim_Iteration:20 seed:1 verbose:false copy:false
RandomForestClassifier estimators:100 threads:1 rounding:3 cut_off_subsample:1.0 max_depth:5 max_features:0.7 min_leaf:3.0 min_split:5.0 Objective:ENTROPY row_subsample:0.95 seed:1 verbose:false
softmaxnnclassifier usescale:True seed:1 Type:SGD maxim_Iteration:30 C:0.00001 shuffle:false learn_rate:0.001 smooth:0.1 h1:20 h2:30 connection_nonlinearity:Relu init_values:0.01 verbose:false copy:false
LibFmClassifier maxim_Iteration:100 C:0.000001 C2:0.02 lfeatures:3 seed:1 usescale:True init_values:0.001 learn_rate:0.04 smooth:0.0001 threads:1 verbose:false
GradientBoostingForestClassifier rounding:3 estimators:1000 shrinkage:0.1 threads:1 offset:0.00001 max_depth:8 max_features:0.4 min_leaf:4.0 min_split:8.0 Objective:RMSE row_subsample:0.7 seed:1 verbose:false
LibFmRegressor maxim_Iteration:100 C:0.000001 C2:0.02 lfeatures:3 seed:1 usescale:True init_values:0.001 learn_rate:0.04 smooth:0.0001 threads:1 verbose:false
GradientBoostingForestRegressor rounding:3 estimators:100 shrinkage:0.2 threads:1 cut_off_subsample:0.8 offset:0.00001 max_depth:9 max_features:0.4 min_leaf:4.0 min_split:8.0 Objective:RMSE row_subsample:0.7 seed:1 verbose:false

RandomForestClassifier estimators:300 threads:3 rounding:3.0 max_depth:12 max_features:0.4 min_leaf:3.0 min_split:5.0 Objective:ENTROPY row_subsample:0.9 seed:1 verbose:false

I updated the package and here is the new error message:

Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.NegativeArraySizeException
        at matrix.fsmatrix.<init>(fsmatrix.java:85)
        at ml.stacknet.StackNetClassifier.fit(StackNetClassifier.java:2871)
        at stacknetrun.runstacknet.main(runstacknet.java:471)
        ... 5 more

@goldentom42

Thanks ajing, I was not expecting you to update the package.

Looking a bit more at the code (I found you had the version of 26/06/2017),
line 2749 (now 2871) of StackNetClassifier is

int temp_class=estimate_classes(level_grid,  this.n_classes, level==(parameters.length-1));
column_counts[level] = temp_class;
if (this.verbose){
	System.out.println(" Level: " +  (level+1) + " dimensionality: " + temp_class);
	System.out.println(" Starting cross validation ");
}
if (level<parameters.length -1){
	trainstacker=new fsmatrix(target.length, temp_class); // <- this is line 2871

The last line is the call to fsmatrix that throws the exception. From the logs I can see that
rows = 8474661 and temp_class = 893, so
rows * temp_class = 7 567 872 273, and this is big...

If the double array allocation expects an int (-2 147 483 648 to +2 147 483 647):

this.data=new double [rows*columns]

then we're out of bounds!
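
The overflow can be reproduced in isolation. The sketch below uses the row and column counts from the log; OverflowDemo and its methods are illustrative names, not part of StackNet, and Math.multiplyExact assumes Java 8 or later.

```java
class OverflowDemo {
    // Product in 32-bit int arithmetic, as in `new double[rows*columns]`.
    // Wraps around silently when the true result does not fit in an int.
    static int intProduct(int rows, int columns) {
        return rows * columns;
    }

    // Same product with one operand widened to long first, so no wrap-around.
    static long longProduct(int rows, int columns) {
        return (long) rows * columns;
    }

    // Throws ArithmeticException on overflow instead of returning a wrapped value.
    static int checkedProduct(int rows, int columns) {
        return Math.multiplyExact(rows, columns);
    }

    public static void main(String[] args) {
        int rows = 8474661;   // from the log
        int columns = 893;    // "dimensionality" from the log
        System.out.println("long product: " + longProduct(rows, columns));
        System.out.println("int product:  " + intProduct(rows, columns)); // negative
        try {
            checkedProduct(rows, columns);
        } catch (ArithmeticException e) {
            System.out.println("checked product failed: " + e.getMessage());
        }
    }
}
```

Widening to long shows the true size, but a Java array cannot hold more than Integer.MAX_VALUE elements, so a matrix this large would need a different layout rather than just a cast.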

I'm not a Java expert so we may need to wait for @kaz-Anova to check this out.

@kaz-Anova
Owner

@goldentom42 is right that the negative-size exception is caused by the size. However, my main problem is with temp_class = 893, where StackNet thinks paramsv1.txt contains 893 models in the first layer! @ajing, could you please send a few lines of the train file (nz_train_slim.csv) that replicate the problem, along with paramsv1.txt?

@kaz-Anova
Owner

Please send to kazanovassoftware@gmail.com

@goldentom42

@kaz-Anova, sure, I was surprised by the 893 as well but was focusing on the exception ;-)
In params there are 2 regressors and 9 classifiers, which means the program found 99 classes in the input file (9 * 99 + 2 = 893).
@ajing, anything suspicious in the first column of the input file?
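
The arithmetic behind 893 can be written down as a small check. This is only a guess at the counting rule, inferred from the numbers in this thread (each classifier contributes one column per class, or a single column for a binary target, and each regressor contributes one column); it is not the actual estimate_classes code, and LevelWidth is a made-up name.

```java
class LevelWidth {
    // Assumed rule: a classifier yields one output column per class, except
    // for a binary target where a single probability column is enough.
    // A regressor always yields one column.
    static int levelColumns(int classifiers, int regressors, int nClasses) {
        int perClassifier = (nClasses <= 2) ? 1 : nClasses;
        return classifiers * perClassifier + regressors;
    }

    public static void main(String[] args) {
        // 9 classifiers and 2 regressors, as counted in the params file.
        System.out.println("99 classes: " + levelColumns(9, 2, 99));
        System.out.println("binary:     " + levelColumns(9, 2, 2));
    }
}
```

Under this assumption a wrongly chosen target column with 99 distinct values gives 893 columns, while a true binary target gives 11, which matches the dimensionality reported further down once the target column was corrected.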

@ajing
Author

ajing commented Aug 14, 2017

@kaz-Anova @goldentom42 You guys are right. I was using the wrong column. Working on fixing it... Is there an easy way to estimate the training time?

@kaz-Anova
Owner

@ajing ... realistically speaking, it won't finish today :( I am afraid you won't have enough time before Instacart finishes.

@ajing
Author

ajing commented Aug 14, 2017

@kaz-Anova That's my guess too. Last time, I ran a smaller one on another data set, and it took about three days. But I still want to practice more with StackNet. Your current submission achieves a pretty high score. Is that solely based on StackNet?

@kaz-Anova
Owner

@ajing, you can see my approach here: https://www.kaggle.com/c/instacart-market-basket-analysis/discussion/38100

Stacking was not that important in this comp - but I would not have finished top 10 (not even top 20) without it.

@ajing
Author

ajing commented Aug 15, 2017

@kaz-Anova Congratulations! I am really amazed you have tried so many ideas in such a short period of time. You must have something to make your work time efficient.

After fixing the number-of-classes problem, I now get an out-of-memory error. But I guess it can be solved by adding more memory.

 Loaded File: /home/jlu/Experiments/Examples/Instacart/imba/data/nz_train_slim.csv
 Total rows in the file: 8474661
 Total columns in the file: 78
 Weighted variable : -1 counts: 0
 Int Id variable : -1 str id: -1 counts: 0
 Target Variables  : 1 values : [0]
 Actual columns number  : 77
 Number of Skipped rows   : 0
 Actual Rows (removing the skipped ones)  : 8474661
Loaded dense train data with 8474661 and columns 77
 loaded data in : 127.731000
 Level: 1 dimensionality: 11
 Starting cross validation
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at matrix.fsmatrix.makerowsubset(fsmatrix.java:103)
        at ml.stacknet.StackNetClassifier.fit(StackNetClassifier.java:2900)
        at stacknetrun.runstacknet.main(runstacknet.java:471)
        ... 5 more
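
A rough way to judge how much heap is needed is to count the doubles. The sketch below assumes dense storage at 8 bytes per value (consistent with the `new double[rows*columns]` line quoted earlier) and assumes that makerowsubset copies the training rows of each fold into a new matrix, which is a guess based on its name. HeapEstimate is an illustrative helper, and the fold count is uncertain because the command passes both folds=5 and folds=10.

```java
class HeapEstimate {
    static final long BYTES_PER_DOUBLE = 8L;

    // Bytes for a dense rows x columns matrix of doubles (array header ignored).
    static long denseBytes(long rows, long columns) {
        return rows * columns * BYTES_PER_DOUBLE;
    }

    // Rows left for training in one fold when 1/folds of the rows are held out.
    static long trainFoldRows(long rows, int folds) {
        return rows - rows / folds;
    }

    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        long rows = 8474661L;  // from the log
        long columns = 77L;    // from the log
        int folds = 10;        // assumption: the later folds= value wins
        long full = denseBytes(rows, columns);
        long subset = denseBytes(trainFoldRows(rows, folds), columns);
        System.out.println("full matrix (MiB):  " + toMiB(full));
        System.out.println("fold subset (MiB):  " + toMiB(subset));
        System.out.println("both at once (MiB): " + toMiB(full + subset));
        System.out.println("JVM max heap (MiB): " + toMiB(Runtime.getRuntime().maxMemory()));
    }
}
```

The full matrix alone is about 5.2 GB, so one extra copy of most of its rows already takes a large share of a 12144 MB heap before any model allocates its own working memory.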

@ajing ajing closed this as completed Aug 15, 2017