TETRAD cmd search doesn't work for dataset with more than 100 nodes #122
Comments
There's no such limitation I know of. The error messages didn't come On Sunday, December 27, 2015, biotech25 notifications@github.com wrote:
Joseph D. Ramsey jsph.ramsey@gmail.com |
Hmm. I tested datasets with different numbers of nodes: 90, 100, 101, 105, 110, and 4136. I got the error messages only when the dataset had more than 100 nodes. I attached a dataset that has 502 nodes (columns). Could you test it when you have time? It won't work. Then cut the dataset down to 100 nodes (columns) or fewer and run it again; it will work. I can't send the error messages because innumerable messages scrolled by too quickly to capture.
That file loads fine for me in the interface; the method used is the same java -cp ... -Xmx4g ... On Sun, Dec 27, 2015 at 12:25 PM, biotech25 notifications@github.com
Joseph D. Ramsey jsph.ramsey@gmail.com |
Thank you for your advice. I followed it and tested several things. First, I tested the larger heap size (-cp -Xmx4g) on the dataset with 79 nodes and confirmed it worked. Then I applied the same heap setting to the dataset with 502 nodes:
java -cp -Xmx4096m -jar lib-tetrad-5.3.0-20151113.150857-1-tetradcmd.jar -data tgen_imputed_apoe_ChiSqaure_501SNPs_p0.001.txt -datatype discrete -algorithm pc -depth -1 -significance 0.01
When I set the heap size and ran command line TETRAD on the dataset with 502 nodes, I still got the same error messages scrolling by quickly, but this time I waited until the run ended. Even though I got error messages, surprisingly I also got TETRAD output. The output correctly detected the direct causes I expected. Next, I ran the TETRAD search without the heap size setting and again waited until it ended. I got the output files, but the output content (direct causes and the number of edge pairs) was different from the first output. That is strange. I then set the heap size again and ran the TETRAD search several times. Strangely, I got different output content on every run under the same conditions. I also tested this on a Linux workstation and got the same problem. I don't think the heap size setting is working, at least on my computer, because I still get the same error messages and different output content on each TETRAD run. I am going to test more tomorrow and will let you know if I find something.
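To pin down how two runs differ, rather than eyeballing the output files, the results can be compared line by line. This is a minimal sketch using Python's standard difflib; the function name and the file names in the usage comment are hypothetical, and it makes no assumption about TETRAD's output format beyond it being plain text.

```python
import difflib
from pathlib import Path


def diff_outputs(path_a, path_b):
    """Return unified-diff lines between two text files.

    An empty list means the two files have identical lines.
    """
    lines_a = Path(path_a).read_text().splitlines()
    lines_b = Path(path_b).read_text().splitlines()
    return list(
        difflib.unified_diff(
            lines_a,
            lines_b,
            fromfile=str(path_a),
            tofile=str(path_b),
            lineterm="",
        )
    )


# Hypothetical usage, with output files saved from two separate runs:
# for line in diff_outputs("run1_output.txt", "run2_output.txt"):
#     print(line)
```

If the diff is non-empty for two runs with identical settings, the differing lines are worth posting alongside the error messages.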
Sorry, you can leave out the -cp the way you have it. You already have a java -Xmx4096m -jar lib-tetrad-5.3.0-20151113.150857-1-tetradcmd.jar -data On Sun, Dec 27, 2015 at 11:25 PM, biotech25 notifications@github.com
Joseph D. Ramsey jsph.ramsey@gmail.com |
Thank you for your reply. I think you are right. When I left out -cp, I got this error message:
Invalid maximum heap size: -Xmx4096m
So I set just "-Xmx1024m" and ran it again. Then I got the same innumerable error messages as yesterday. When I set "-Xmx3000m" or "-Xmx2000m", I got a single error message like the one below:
Could not reserve enough space for 3072000KB object heap
Error occurred during initialization of VM
I googled for a solution; I may have to change a system environment variable so that the JVM can start and reserve enough heap. I am going to try that and will keep you updated. Thank you,
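The figure in the "Could not reserve enough space" message is the requested heap expressed in kilobytes, so 3072000KB corresponds to the -Xmx3000m run. The sketch below converts an -Xmx option to bytes to make that arithmetic explicit; the helper name is hypothetical and it handles only the k, m, and g suffixes. It also shows that -Xmx4096m is exactly 4 GiB, a size a 32-bit JVM cannot accept; whether that matters here depends on whether the installed JVM is 32-bit, which the thread does not say.

```python
import re

# Multipliers for the size suffixes handled by this sketch.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def xmx_to_bytes(option):
    """Convert a JVM option such as '-Xmx3000m' to a byte count."""
    match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", option)
    if match is None:
        raise ValueError(f"not a recognized -Xmx option: {option!r}")
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


# 3000 MiB expressed in KiB, matching the figure in the error message.
print(xmx_to_bytes("-Xmx3000m") // 1024)
# 4096 MiB compared with 4 GiB.
print(xmx_to_bytes("-Xmx4096m") == 4 * 1024 ** 3)
```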
You don't have enough memory on your machine, I don't think. On Mon, Dec 28, 2015 at 10:03 AM, biotech25 notifications@github.com
Joseph D. Ramsey jsph.ramsey@gmail.com |
I agree with you. I need to reboot the Windows server or test it on a machine that I can reboot. I will update you later. |
I rebooted the computer and re-ran it, and I also tried it on a Mac and on another Windows server. It still doesn't work. I got the same error messages plus "java.lang.NullPointerException". I don't understand why I would not have enough memory on every machine. I am going to test more.
Tell me about the null pointer exception. J On Mon, Dec 28, 2015 at 12:29 PM, biotech25 notifications@github.com
Joseph D. Ramsey jsph.ramsey@gmail.com |
I attached screenshots of the error messages. The null pointer exception doesn't tell me a lot. The first screenshot shows the first part of the error output, which appears as soon as I execute command line TETRAD. Those messages scroll by very fast for 1 to 2 seconds. Then, as the second screenshot shows, innumerable lines of 'java.lang.NullPointerException' scroll by for about 2 minutes, and they are still being written. Please let me know if there is more I can explain.
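Since the messages scroll by faster than they can be read, one option is to send everything the program prints to a file and read the first lines afterwards, which usually include the first exception. This is a minimal sketch using Python's standard subprocess module; the function names and the log file name are hypothetical, and the command in the usage comment is the one posted earlier in this thread without -cp, assuming java is on the PATH.

```python
import itertools
import subprocess


def run_and_log(command, log_path):
    """Run a command, write its stdout and stderr to log_path, return the exit code."""
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


def first_lines(log_path, count=40):
    """Return up to `count` lines from the start of the log file."""
    with open(log_path, errors="replace") as log:
        return [line.rstrip("\n") for line in itertools.islice(log, count)]


# Hypothetical usage:
# command = [
#     "java", "-Xmx4096m", "-jar",
#     "lib-tetrad-5.3.0-20151113.150857-1-tetradcmd.jar",
#     "-data", "tgen_imputed_apoe_ChiSqaure_501SNPs_p0.001.txt",
#     "-datatype", "discrete", "-algorithm", "pc",
#     "-depth", "-1", "-significance", "0.01",
# ]
# run_and_log(command, "tetrad_run.log")
# print("\n".join(first_lines("tetrad_run.log")))
```

The log file can then be attached to the issue as text, which is easier to search than a screenshot.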
Do you have any missing values in your data? On Mon, Dec 28, 2015 at 3:16 PM, biotech25 notifications@github.com wrote:
Joseph D. Ramsey jsph.ramsey@gmail.com |
Well, I had suspected that and tested it in several ways. When I open the txt file in Excel, I can find a missing value with the 'Find and Replace' function. (I verified this by deleting one value on purpose.) But I didn't find any missing values. So I took a dataset with 78 SNPs, which was working well, and duplicated columns to make a new dataset with more than 100 nodes; then I got the same problem, with innumerable error messages scrolling by. I made the dataset have 104 nodes: still the same problem. I deleted 4 nodes to bring it down to 100 nodes: then it worked well, without any error message.
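As a cross-check on the Excel search, the file can also be scanned programmatically for empty cells and for rows whose field count differs from the header. This is a minimal sketch; the function name is hypothetical, and both the tab delimiter and the list of missing-value tokens are assumptions that should be adjusted to match the actual file.

```python
def check_table(path, delimiter="\t", missing_tokens=("", "*", "NA")):
    """Return (number of columns, list of (row number, problem description)).

    The delimiter and the missing-value tokens are assumptions, not
    something taken from the TETRAD documentation.
    """
    problems = []
    with open(path, newline="") as handle:
        header = handle.readline().rstrip("\r\n").split(delimiter)
        for row_number, line in enumerate(handle, start=2):
            fields = line.rstrip("\r\n").split(delimiter)
            # A short or long row means a value was dropped or added somewhere.
            if len(fields) != len(header):
                problems.append(
                    (row_number, f"expected {len(header)} fields, found {len(fields)}")
                )
            # Flag cells that are empty or hold a missing-value token.
            for column, value in zip(header, fields):
                if value.strip() in missing_tokens:
                    problems.append(
                        (row_number, f"missing value in column {column!r}")
                    )
    return len(header), problems


# Hypothetical usage:
# columns, problems = check_table("tgen_imputed_apoe_ChiSqaure_501SNPs_p0.001.txt")
# print(columns, "columns;", len(problems), "problems")
```

The column count it reports also confirms how many nodes the file really contains, which is the quantity in question here.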
Let me try asking a different way. Can you add any other nodes to the On Mon, Dec 28, 2015 at 3:41 PM, biotech25 notifications@github.com wrote:
Joseph D. Ramsey jsph.ramsey@gmail.com |
Well, the 4 nodes I deleted were ones I had duplicated from a dataset that ran successfully. But I followed your advice and added other nodes instead, then tested it; still the same problem. Once I delete nodes to bring the total down to 100, it works well.
I wish I could dig further into this right now, but I'm busy with other J On Mon, Dec 28, 2015 at 3:54 PM, biotech25 notifications@github.com wrote:
Joseph D. Ramsey jsph.ramsey@gmail.com |
I am going to look at the code. I understand that you are busy and I appreciate your help! |
Hi Dr. Ramsey,
I have a question about running the PC search algorithm in command line TETRAD. I am running it on a dataset that has several hundred nodes, but it doesn't work. I found that command line TETRAD doesn't work if the dataset has more than 100 nodes (predictors and target); it works only for datasets with up to 100 nodes. The number of cases doesn't matter; it works well for a dataset with > 1000 cases, as long as the number of nodes does not exceed 100. Once the number of nodes exceeds 100, it fails. Innumerable error messages were produced and scrolled by in the Windows command prompt. I took a screenshot of the error messages and attached it here. Does command line TETRAD have a limit of 100 nodes per dataset, or something similar?
Sanghoon