Exception using Matlab for LSA #26
Comments
I found the solution: the java.util.Scanner class is used to import the Matlab files, and its number parsing depends on the locale (language setting) of the Java environment. |
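To illustrate the locale dependence described above, here is a minimal sketch that parses the same row of numbers under two locales. The class and method names are made up for this example and are not part of S-Space.

```java
import java.util.Locale;
import java.util.Scanner;

public class ScannerLocaleDemo {
    /** Counts the leading tokens of a line that Scanner accepts as doubles under the given locale. */
    public static int countDoubles(String line, Locale locale) {
        Scanner s = new Scanner(line).useLocale(locale);
        int n = 0;
        while (s.hasNextDouble()) {
            s.nextDouble();
            n++;
        }
        s.close();
        return n;
    }

    public static void main(String[] args) {
        // A row written with '.' as the decimal separator, as in an ASCII matrix file.
        String row = "0.5 1.25 -3.0";
        System.out.println("Locale.US:      " + countDoubles(row, Locale.US));
        // In a German locale ',' is the decimal separator, so "0.5" is rejected
        // and counting stops at the first token.
        System.out.println("Locale.GERMANY: " + countDoubles(row, Locale.GERMANY));
    }
}
```

Under a locale that uses a decimal comma, the scanner counts no columns at all, which would be consistent with a log line reporting "0 cols". Pinning the locale with useLocale(Locale.US) makes the parse independent of the machine's language settings.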
Awesome find! If you want to send us a pull request, I'll be more than happy to merge this little fix :) |
I seem to be having a similar problem: Mar 23, 2013 2:32:56 PM edu.ucla.sspace.common.GenericTermDocumentVectorSpace pr I use Win7, MatLab2010a, and the current version of sspace (2.03). |
All three of the matlab output matrices remain empty. Does anyone know why this might be the case? |
I think we've tested with 2010a, but I don't think we saw this behavior. Thanks, On Sun, Mar 24, 2013 at 12:40 AM, ganonp notifications@github.com wrote:
|
It might well be that the corpus is too small. If you do not have at least somewhat more documents than dimensions, matlab cannot compute the SVD. |
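A guard along the following lines could catch the too-small-corpus case before matlab is ever started. It is only a sketch: the class and method names are hypothetical, and it checks the strict mathematical bound (a truncated SVD cannot have more dimensions than the smaller side of the matrix), whereas an iterative solver may need some extra margin, as the comment above suggests.

```java
public class SvdDimensionCheck {
    /** Largest number of singular values a rows x cols matrix can have. */
    public static int maxDimensions(int rows, int cols) {
        return Math.min(rows, cols);
    }

    /** Throws if the requested number of dimensions cannot be computed from the matrix. */
    public static void requireFeasible(int rows, int cols, int dimensions) {
        if (dimensions <= 0) {
            throw new IllegalArgumentException("dimensions must be positive: " + dimensions);
        }
        int max = maxDimensions(rows, cols);
        if (dimensions > max) {
            throw new IllegalArgumentException(
                "requested " + dimensions + " dimensions, but a " + rows + " x " + cols
                + " matrix supports at most " + max);
        }
    }

    public static void main(String[] args) {
        requireFeasible(15262, 500, 100);   // fine: 100 <= min(15262, 500)
        try {
            requireFeasible(15262, 50, 100); // too few documents for 100 dimensions
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```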
Thank you for your help! The problem originally occurred with a rather large corpus (several thousand lines in an ~150 megabyte .txt file); since then I've been using a smaller version and changing the code in an attempt to remedy the problem. I have actually taken the matrix file (via matrix.getAbsolutePath()) and run it through matlab manually, using the same code found in the matlab script in SingularValueDecompositionMatlab.java, and it has no problem producing any of the three uOutput, sOutput, or vOutput files correctly (at least they are not blank). I've never run a Java program that accessed matlab, so it's possible there is some issue in my environment variables? Every time I run the program, matlab opens up to a command window, no script is visibly executed, and the uOutput, sOutput, and vOutput files remain blank, so I believe it's an issue of actually getting matlab to run the script... |
Wow, that really helps track down the issue! My guess is that there is an We've tried hard to make sure all the code is platform independent, so it's Thanks, On Sun, Mar 24, 2013 at 4:52 PM, ganonp notifications@github.com wrote:
|
Hey, no problem! I'm thankful someone is putting this type of software out open source, it's going to be really helpful for some projects I'm working on. Let me know if there's anything I can do to help! Ganon |
Hi, |
Whoops, I thought we had integrated that pull request! I'll make sure that On Sun, Mar 24, 2013 at 8:06 PM, N2D2 notifications@github.com wrote:
|
See, that was my initial thought too, which is why I posted here. I can't imagine why I wouldn't have a US version of Java, but I went ahead and changed the Scanner objects in the readDenseTextMatrix method of the MatrixIO class to "Scanner s = new Scanner(line).useLocale(new Locale("en", "US"));" around line 897 and "Scanner scanner = new Scanner(matrix).useLocale(new Locale("en", "US"));" around line 928. After doing this I added: "System.out.println("reading in text matrix with " + rows + so I could evaluate whether it was actually counting anything, and it was not, i.e. it was returning 0 rows and -1 cols. From here I looked at what the file "matrix" actually refers to, and it appears to be the sOutput and the uOutput from the factorize method of the SingularValueDecompositionMatlab class. Now, these are the two files that are empty; however, they are not empty when I run the matlab code manually using the exact same MatrixFile used in the factorize method. This is what leads me to believe that it's a problem with the interface between matlab and Java/the command line. It is also these two files that the scanner uses to count rows and columns, unless I missed something. I've been trying to work out whether there is another step where the termdocumentvectorspace variable (from the generictermdocumentvectorspace class) is used to determine columns and rows, but I can't find one. Then again, I'm not that great at Java :p |
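One way to separate "the output file is empty" from "the output file is parsed wrongly" is to check the file before counting rows and columns. The following is a sketch only: DenseMatrixShape and shapeOf are hypothetical names, not S-Space code, and the file is assumed to hold one whitespace-separated row of numbers per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Scanner;

public class DenseMatrixShape {
    /** Returns {rows, cols} of a dense text matrix, or throws if the file is empty or ragged. */
    public static int[] shapeOf(Path file) throws IOException {
        if (Files.size(file) == 0) {
            throw new IOException("matrix file is empty (did the external script run?): " + file);
        }
        int rows = 0;
        int cols = -1;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                int n = 0;
                // Fixed locale so '.' is always the decimal separator.
                try (Scanner s = new Scanner(line).useLocale(Locale.US)) {
                    while (s.hasNextDouble()) {
                        s.nextDouble();
                        n++;
                    }
                }
                if (cols == -1) {
                    cols = n;
                } else if (n != cols) {
                    throw new IOException("row " + (rows + 1) + " has " + n
                        + " values, expected " + cols);
                }
                rows++;
            }
        }
        if (rows == 0 || cols <= 0) {
            throw new IOException("no numeric rows found in " + file);
        }
        return new int[] {rows, cols};
    }
}
```

With a check like this, an empty sOutput or uOutput file produces an error that names the file instead of a later "dimensions must be positive" exception.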
OK, so that means the term-document matrix is read in correctly? Then that is not the issue. I remember it was a struggle to set the environment variables for matlab. In the end I put an alias file for matlab in /usr/bin under Mac OS. Without this, I had the same problem. Have you added matlab to the PATH environment variable? As I recall, matlab started normally from the terminal, but a Java program run in the same terminal did not find it until I added the matlab alias. I may have done something more, e.g. set a path variable, but I do not remember my steps exactly. |
It appears to me that it is read in correctly, though as I said, I could be mistaken. I have set my path environment variable to C:\Program Files (x86)\MATLAB\R2010a Student\bin. I've done some googling on this and can't seem to find anything else. |
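To see whether the JVM itself can find matlab, a quick check of the PATH value as seen from Java may help. This is a hedged sketch: PathLookup and findOnPath are hypothetical names, and the list of suffixes tried is an assumption meant to cover common Windows launchers.

```java
import java.io.File;
import java.util.Optional;
import java.util.regex.Pattern;

public class PathLookup {
    /** Searches each directory in pathValue for a file called name (optionally with a launcher suffix). */
    public static Optional<File> findOnPath(String name, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        String[] suffixes = {"", ".exe", ".bat", ".cmd"}; // assumed launcher suffixes
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            for (String suffix : suffixes) {
                File candidate = new File(dir, name + suffix);
                if (candidate.isFile()) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // System.getenv shows the environment this JVM was started with,
        // which can differ from what an interactive shell sees.
        Optional<File> matlab = findOnPath("matlab", System.getenv("PATH"));
        System.out.println(matlab.map(File::getAbsolutePath)
            .orElse("matlab not found on the PATH seen by this JVM"));
    }
}
```

If this prints a path, the launcher is visible to Java and the problem more likely lies in how the script reaches the matlab process.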
I'll quote the things I changed. They probably have nothing to do with your concrete problem, but they may help in the future. "opts.maxit = 2000;\n" +
for (int s = 0; s < dimensions; ++s) with: double lastNotNull=0; 61 was the average decrease for my matrices. Remember that these are the smallest and least relevant singular values. If this code is reached, you are asking for too many dimensions from too small a corpus!
103 dataClasses.set(r, c, U.get(r, c) * singularValues[c]); and 124 classFeatures.set(r, c, V.get(r, c) * singularValues[r]); but at this point I am not sure that I am right. Maybe I do not understand the code completely, but only with these changes can I reproduce the LSI example from Landauer et al. Those are all my changes; with these adjustments you have a stable LSA implementation. ########### `Mrz 26, 2013 11:04:27 AM edu.ucla.sspace.mains.LSAMain verbose Mar 26, 2013 11:04:47 AM edu.ucla.sspace.matrix.factorization.SingularValueDecompositionMatlab factorize To get started, type one of these: helpwin, helpdesk, or demo.
Mar 26, 2013 11:05:10 AM edu.ucla.sspace.matrix.factorization.SingularValueDecompositionMatlab factorize |
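The scaling in the two quoted lines (103 and 124) can be sketched with plain arrays. This is an illustration, not the S-Space code: the names are hypothetical, and it assumes V is stored transposed, so that its rows index the reduced dimensions, which is what indexing singularValues by r would imply.

```java
public class SingularValueScaling {
    /** Returns U * diag(s): column c of u is multiplied by s[c]. */
    public static double[][] scaleColumns(double[][] u, double[] s) {
        double[][] out = new double[u.length][];
        for (int r = 0; r < u.length; r++) {
            out[r] = new double[u[r].length];
            for (int c = 0; c < u[r].length; c++) {
                out[r][c] = u[r][c] * s[c];
            }
        }
        return out;
    }

    /** Returns diag(s) * Vt: row r of vt is multiplied by s[r]. */
    public static double[][] scaleRows(double[][] vt, double[] s) {
        double[][] out = new double[vt.length][];
        for (int r = 0; r < vt.length; r++) {
            out[r] = new double[vt[r].length];
            for (int c = 0; c < vt[r].length; c++) {
                out[r][c] = vt[r][c] * s[r];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2}, {3, 4}};
        double[] s = {10, 0.5};
        System.out.println(java.util.Arrays.deepToString(scaleColumns(m, s)));
        System.out.println(java.util.Arrays.deepToString(scaleRows(m, s)));
    }
}
```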
I've encountered the same problem (IllegalArgumentException: dimensions must be positive) when running dimension reduction with SingularValueDecompositionMatlab or SingularValueDecompositionOctave on Windows 7. I have matlab installed. The problem is exactly the same as the one described by @ganonp. When I run the script (as below) in matlab directly, it works well. I suspect the issue lies in getting matlab to run the script when it is written to the matlab process's output stream (as the code at line 102 of SingularValueDecompositionMatlab.java does). I have tried versions 2.0.4 and 2.0.3 and neither works. Any ideas? Z=load('C:\Users\jerry\AppData\Local\Temp\matlab-input3613993774554135994.dat','-ascii'); |
Hi,
If I pass the -S MATLAB option, Java throws
IllegalArgumentException: dimensions must be positive
at edu.ucla.sspace.matrix.OnDiskMatrix.<init>(OnDiskMatrix.java:98)
at edu.ucla.sspace.matrix.Matrices.create(Matrices.java:216)
at edu.ucla.sspace.matrix.MatrixIO.readDenseTextMatrix(MatrixIO.java:924)
In Matrices.create(Matrices.java:216) I find the lines:
case SPARSE_ON_DISK:
//return new SparseOnDiskMatrix(rows, cols);
// REMDINER: implement me
return new OnDiskMatrix(rows, cols);
Is the MATLAB matrix format not implemented yet or is it a bug?
The output above the Exception suggests a general problem with reading the Matlab-Output:
Nov 09, 2012 11:21:52 AM edu.ucla.sspace.matrix.MatrixIO readDenseTextMatrix
FINE: reading in text matrix with 15262 rows and 0 cols
Nov 09, 2012 11:21:52 AM edu.ucla.sspace.matrix.MatrixIO readDenseTextMatrix
FINE: reading in text matrix with 100 rows and 0 cols
And Matlab gives the warning:
I use Mac 10.7, Matlab 2012a and the SSpace 2.0-Code (but this happened with earlier code, too)