h2o flow unresponsive? #16331
Comments
Thank you for reporting it. It doesn't seem to be working as expected. Is it possible it ran out of memory? Could you provide us with the backend logs (https://docs.h2o.ai/h2o/latest-stable/h2o-docs/logs.html)? They will likely suffice to find out what is wrong, but any additional information would make it easier for us. Would you be able to check whether it is a browser/serialization issue? I would start by right-clicking on the page -> "Inspect" and looking for errors in the "Console" tab. If there are none, could you rerun it with the "Network" tab open to see whether a long reply might be blocking the backend? |
h2o_127.0.0.1_54321-1-trace.log Thanks so much for your response! I uploaded some of the log files I found. It does seem like it was an OOM error. I admit to being out of my depth here - h2o was suggested to me because the PCA was too big for my computer but I suppose I don't know how to set up h2o properly. Is it possible to get enough memory to run this analysis? If so, can you please explain how? You can see in my screenshot of RStudio above I requested 20GB but it said 9.98 cluster memory. I had requested 9 days prior, so perhaps I need to clear something? I'm assuming I will actually need to have much more than even 20GB? Thanks again! |
In the RStudio screenshot, the h2o is already running and the I would also try to specify different @wendycwong do you have any other ideas? |
Hello, I tried specifying min_mem_size and max_mem_size as seen below. Knowing that 10G was probably not enough, I just went up to 100G to see what happened. I uploaded my data through R and then went to the GUI interface to build the model with the hope that any errors would be clearly displayed there. Now when I go to build the PCA model, not even run it, it says H2O is no longer connected. I had the same error on both Chrome and Firefox. What am I doing wrong? Thanks again! |
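For readers following along, here is a minimal Python sketch of how memory limits are typically passed when starting H2O. It is an illustration under stated assumptions, not the setup used in this thread: `init_kwargs` and `start_cluster` are hypothetical helper names, and the sizes are placeholders. Memory settings only apply when `h2o.init()` launches a new JVM. If a cluster is already running, `h2o.init()` attaches to it and the reported memory stays at whatever that cluster was started with, which may explain seeing 9.98 GB after requesting 20 GB.

```python
def init_kwargs(max_gb=None, min_gb=None, nthreads=-1):
    """Build keyword arguments for h2o.init(); sizes are given in gigabytes."""
    if max_gb is not None and min_gb is not None and min_gb > max_gb:
        raise ValueError("min_gb must not exceed max_gb")
    kwargs = {"nthreads": nthreads}  # -1 lets H2O use all available cores
    if max_gb is not None:
        kwargs["max_mem_size"] = f"{max_gb}G"
    if min_gb is not None:
        kwargs["min_mem_size"] = f"{min_gb}G"
    return kwargs


def start_cluster(max_gb=None, min_gb=None):
    """Start (or attach to) a local cluster and print its status (hypothetical helper)."""
    import h2o  # imported lazily so init_kwargs() is usable without h2o installed

    # If a cluster is already running, this attaches to it and the memory
    # arguments have no effect; stop it first with h2o.cluster().shutdown().
    h2o.init(**init_kwargs(max_gb, min_gb))
    h2o.cluster().show_status()  # compare the reported memory with the request
    return h2o.cluster()


# Example (placeholder value; choose one that fits the machine's physical RAM):
# start_cluster(max_gb=8)
```

Calling `start_cluster()` with no arguments reduces to a plain `h2o.init()`, which lets H2O pick its own defaults.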
This is baffling. Your dataset size is not that big and your memory allocation is fine. There really is no reason to see the failure. Is it possible to share your dataset so we can reproduce the error here locally and fix it? |
Of course! I'm sure there is some mistake on my end that I'm not expert enough to know to even mention here. It does have missing data if that helps troubleshoot, though I thought I picked the correct parameters to deal with that. I was going to test several combinations of the PCA methods (e.g., standardized or not) after I confirmed one of them worked. Additionally, I was and still am able to get a PCA working with a smaller dataset. Zipped data file is attached. Thanks so much for your help! It's very much appreciated. |
Thank you so much for providing me with the information. Will try it out and let you know. |
However, I do run into one problem. When I set pca_method="glrm" like here: fitModel = H2OPrincipalComponentAnalysisEstimator(k=4, pca_method="glrm", use_all_factor_levels=True) I run into an NPE error. I have opened an issue to resolve this: #16335 This is embarrassing. |
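To put the estimator call in context, here is a rough Python sketch of training PCA outside Flow. It is not the maintainers' reproduction script: `pca_params`, `fit_pca`, and the file path are hypothetical, and the parameter choices (imputing missing values, optional standardization) are assumptions based on what the reporter described. The default method avoids "glrm" because of the NPE mentioned above.

```python
# Values accepted by the Python client's pca_method parameter.
PCA_METHODS = ("gram_s_v_d", "power", "randomized", "glrm")


def pca_params(k, method="gram_s_v_d", standardize=True):
    """Assemble estimator arguments; raises on an unknown method name."""
    if method not in PCA_METHODS:
        raise ValueError(f"unknown pca_method: {method!r}")
    return {
        "k": k,
        "pca_method": method,
        "transform": "standardize" if standardize else "none",
        "impute_missing": True,         # assumption: the data has missing values
        "use_all_factor_levels": True,
    }


def fit_pca(path, **params):
    """Import a file and train PCA on every column (hypothetical helper)."""
    import h2o  # lazy import so pca_params() works without h2o installed
    from h2o.estimators import H2OPrincipalComponentAnalysisEstimator

    h2o.init()
    frame = h2o.import_file(path)
    model = H2OPrincipalComponentAnalysisEstimator(**params)
    model.train(training_frame=frame)
    return model


# Example with a placeholder path:
# model = fit_pca("my_data.csv", **pca_params(k=4))
```

Keeping the parameters in one helper makes it easy to try the standardized and non-standardized variants one after the other.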
That's great! Thank you for taking the time to try that out! I'm still having the same issue. Unless I'm missing something, I set up the PCA just like your screenshot above. And I get an unresponsive error almost immediately. The only difference this time is that the progress bar doesn't go up to 100% before it quits. I see this error in both Chrome and Firefox. Is it something about how I'm connecting to h2o? |
@blgodwin I would recommend trying it in R or Python. Flow gets much less attention than the clients for those languages, so it might have more bugs than the R or Python clients. One thing that is probably different is that we use macOS and Linux for development and testing, so it's possible the bug is related to the OS you use, or it might be due to the newer version of Java. IIRC from the logs, you use a Java version that's not yet officially supported by us. If you can run h2o on a different OS, it might help. If that would be too complicated, you might try a different (older) Java version; AFAIK we support Java 8 to 17. Or you might try running h2o in Windows Subsystem for Linux. @wendycwong knows more about our PCA implementation, so she might have more ideas about what to try if you feel uncomfortable with installing a different Java etc. |
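A quick way to check the Java point is to parse the output of `java -version`. The sketch below is only an illustration: the helper names are hypothetical, and the 8 to 17 range is taken from the comment above rather than from a verified support matrix.

```python
import re
import subprocess

SUPPORTED_MAJORS = range(8, 18)  # Java 8 to 17, as stated in the comment above


def java_major(version_text):
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_text)
    if match is None:
        raise ValueError("no Java version string found")
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_292" means Java 8.
    if major == 1 and match.group(2) is not None:
        major = int(match.group(2))
    return major


def is_supported(version_text):
    """True if the reported Java major version falls in the assumed range."""
    return java_major(version_text) in SUPPORTED_MAJORS


def installed_java_version_text():
    """Run `java -version`; the JVM prints the version banner to stderr."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return result.stderr


# Example: is_supported(installed_java_version_text())
```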
I got it to work if I did not specify any min_mem_size, max_mem_size, or threads! PCA ran with a simple "localh2o = h2o.init()" to connect |
That's great! Thank you for mentioning how you resolved that! |
Connecting to h2o through R v. 4.3.3
H2O cluster version: 3.46.0.4
I connect to h2o through R and then begin to run a PCA model with a large dataset (30k+ datapoints) through the GUI webpage interface. I import the data, parse it, view it, and then run the job. The job runs through all the iterations of alternating minimizations to where it says "100% progress." It takes about 1 minute 20 seconds. The progress bar does not change but the job status says "RUNNING." If I click the "View" action, a line in grey appears that says "getModel 'modelname'" but nothing more, and a yellow bar appears at the bottom saying "Requesting http://localhost:54321/...."
Then it just stays this way. It has not given me a termination error, but it also remains unresponsive. I am unsure if it is just taking a long time to run or if it is indeed broken and not working. At the time of this writing I have let it sit for about 2 days. If I try to investigate the status using R it is also unresponsive.
Is this working as intended or have I encountered an error?
Thanks!
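One way to tell a slow job from a hung backend is to query the H2O REST API directly with a timeout instead of waiting on Flow. The sketch below assumes the default address `http://localhost:54321` and the `/3/Cloud` status endpoint; `backend_responds` is a hypothetical helper, not part of the h2o client.

```python
import http.client
import json
import urllib.request


def backend_responds(base_url="http://localhost:54321", timeout=5.0):
    """Return True if /3/Cloud answers with JSON within `timeout` seconds."""
    try:
        with urllib.request.urlopen(f"{base_url}/3/Cloud", timeout=timeout) as reply:
            json.load(reply)  # a healthy node replies with a JSON status document
            return True
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return False


# Example: backend_responds(timeout=10)
```

If this returns False while the job still shows "RUNNING", the backend itself has most likely stopped answering, and the backend logs are the next place to look.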