Standalone R client #7782

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 17 comments

@exalate-issue-sync

Offer a standalone R client (the same way we have a Python one) that:

  • doesn't include the h2o jar
  • doesn't try to download an h2o jar
  • doesn't print the “use h2o.init” message when the library is loaded

This will be useful for Steam installs where h2o isn't running on the same machine as the client.

@exalate-issue-sync

Erin LeDell commented: [~accountid:5c355702a217aa69bce55831] I am not sure I understand what the advantage of not downloading the jar is. It’s not that big of a file and only takes a few seconds to download…?

This sounds like it would add complexity to our release process for little gain, or maybe it will make more sense when you explain the value.

The normal Python client includes the jar though, right? I am not aware that we have a Python equivalent of this.

@exalate-issue-sync

Joseph Granados commented: In the Hadoop downloads there are:

{noformat}123M Jul 21 11:16 h2o-3.30.0.7-py2.py3-none-any.whl
386K Jul 21 11:16 h2o_client-3.30.0.7-py2.py3-none-any.whl{noformat}

I wouldn’t say the whl with the jar is small. However, the main advantage is that it forces users to connect to an already running cluster on Hadoop instead of accidentally launching a local cluster. This is an issue our Hadoop R users run into.

@exalate-issue-sync

Erin LeDell commented: [~accountid:5c355702a217aa69bce55831] I think that second file will just download the H2O jar, just like R, no?

Thanks for clarifying the issue this is trying to solve (that users are not connecting to the Hadoop cluster). Even if we have a lightweight version which does not have a local jar, the default settings of {{h2o.init()}} will still try to connect to a local cluster. If there’s no jar to start, then it will give some error (which might not be very helpful if they don’t already know that they were supposed to connect to a remote Hadoop cluster in {{h2o.init()}}).

I wonder if there’s another way to solve this. I guess there’s no way we could automatically detect whether or not they would want to connect to a Hadoop cluster, right?

@exalate-issue-sync

Joseph Granados commented: The second whl doesn’t download anything.

If there’s an error, that’s okay. Another part of the issue is that when the h2o library is loaded in R, it immediately prints a message telling the user to run {{h2o.init()}}, which they usually do even if they have instructions otherwise. That is the wrong thing to do when connecting to a Hadoop cluster, where {{h2o.connect()}} is used instead of {{h2o.init()}}: http://docs.h2o.ai/enterprise-steam/latest-stable/docs/r-docs/articles/h2osteam.html
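For readers on the Python side, the difference between attaching to a running cluster and starting one can be sketched roughly as below. This is a minimal sketch under stated assumptions, not code from the project: `cluster_url` and `connect_to_remote` are hypothetical helper names, the host name in the usage note is a placeholder, and port 54321 is assumed as the default. The only h2o call used is `h2o.connect(url=...)`, and its import is deferred so the helpers load even where h2o is not installed.

```python
def cluster_url(host, port=54321, https=False):
    """Build the URL of an already running cluster (hypothetical helper)."""
    scheme = "https" if https else "http"
    return f"{scheme}://{host}:{port}"


def connect_to_remote(host, port=54321, https=False):
    """Attach to a remote cluster instead of starting a local one."""
    import h2o  # deferred: only needed when a connection is actually made

    # h2o.connect() attaches to an existing cluster; unlike h2o.init(),
    # it does not try to launch a local H2O instance.
    return h2o.connect(url=cluster_url(host, port, https))
```

A call such as `connect_to_remote("hadoop-edge.example.com")` would then target `http://hadoop-edge.example.com:54321`, assuming a cluster is reachable at that (placeholder) address.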

@exalate-issue-sync

Erin LeDell commented: [~accountid:5c355702a217aa69bce55831] Would it help if we updated the print-out from {{h2o.init()}} to say something like: “Your next step is to start H2O using h2o.init() or connect to an existing cluster using h2o.connect().”

It probably hasn’t been updated in years and could probably use a refresh. Here’s the current state:

!Screen Shot 2020-10-29 at 2.58.08 PM.png|width=678,height=615!

@exalate-issue-sync

Adam Valenta commented: The H2O distribution zip archive will also contain an h2o_client${PROJECT_VERSION}.tar.gz file under the R folder, with the client version of the R package.

Similarly to the client Python package, the client R package does not contain the full h2o.jar and suggests calling h2o.connect() instead of h2o.init().

@exalate-issue-sync

Joseph Granados commented: [~accountid:5f8e6929461cc40075215ee0] not a big deal, but could we add an underscore separating the name and project version like so: h2o_client_${PROJECT_VERSION}.tar.gz

@exalate-issue-sync

Joseph Granados commented: In Python, if you call {{h2o.init()}}, the following message appears instantly:
{{Error Output:}}
{{Exception in thread "main" java.lang.IllegalStateException: Client version of the library cannot be used to start a local H2O instance. Use h2o.connect() instead.}}
{{at water.H2OApp.main(H2OApp.java:6)}}
In R, it tries to connect for a while and then times out before showing the message. Can R be changed to show the message instantly?

@exalate-issue-sync

Adam Valenta commented: Yes, I’ll check it

@exalate-issue-sync

Joseph Granados commented: Also, maybe I’m doing something wrong but I don’t see the message when I import the library. Which I think is fine, just wanted to make sure that’s expected.

This is the message I don’t get anymore in RStudio:

!Screen Shot 2020-10-29 at 2.58.08 PM (012f8f10-6342-4732-96e9-09b1945382a3).png|width=678,height=615!

@exalate-issue-sync

Adam Valenta commented: That is weird. Did you install the client package into a clean workspace?

@exalate-issue-sync

Joseph Granados commented: I restarted R and the message is now there.

@exalate-issue-sync

Joseph Granados commented: Maybe “Your next step is to start H2O:” should be changed to “Your next step is to connect to H2O:”.
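The wording change suggested here amounts to choosing the load-time hint based on whether the package is the client-only build. A minimal sketch of that idea in Python follows; the function name `startup_hint`, the `client_only` flag, and the exact layout of the hint are assumptions made for illustration, and only the two quoted sentences come from this thread.

```python
START_MESSAGE = "Your next step is to start H2O:"
CONNECT_MESSAGE = "Your next step is to connect to H2O:"


def startup_hint(client_only):
    """Return the hint shown when the library is loaded (hypothetical helper)."""
    if client_only:
        # The client build ships without the jar, so point users at connect().
        return CONNECT_MESSAGE + "\n    > h2o.connect()"
    # The full build can launch a local cluster, so init() is a valid next step.
    return START_MESSAGE + "\n    > h2o.init()"
```

With this split, the client build never mentions `h2o.init()` in its hint.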

@exalate-issue-sync

Adam Valenta commented: Sure. Is it still trying to connect with a clean workspace?

@exalate-issue-sync

Joseph Granados commented: Yes.

@exalate-issue-sync

Adam Valenta commented: h2o_client_${PROJECT_VERSION}.tar.gz (/)
Maybe “Your next step is to start H2O:” should be changed to “Your next step is to connect to H2O:”. (/)

The connection actually worked analogously to Python; the difference is that R waits 60 s before telling you what is wrong. The current PR changes the behavior to prevent the start attempt and show the message directly. (/)

https://github.com//pull/5184
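The fail-fast behavior described in this comment can be illustrated with a short Python sketch. It is not the code from the linked PR: `ClientOnlyError`, `init`, and the `start_local_cluster` callback are hypothetical names, and only the error text is taken from the message quoted earlier in this thread. The point is that the client-only check runs before any launch or polling, so no timeout is involved.

```python
CLIENT_MESSAGE = (
    "Client version of the library cannot be used to start a "
    "local H2O instance. Use h2o.connect() instead."
)


class ClientOnlyError(RuntimeError):
    """Raised when a client-only build is asked to start a local cluster."""


def init(client_only, start_local_cluster):
    """Start a local cluster, or fail immediately in a client-only build."""
    if client_only:
        # Fail before any launch or connection polling is attempted,
        # so the user sees the hint right away instead of after a timeout.
        raise ClientOnlyError(CLIENT_MESSAGE)
    return start_local_cluster()
```

Passing the launcher as a callback keeps the sketch self-contained and makes it easy to check that nothing is started when the guard triggers.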

@h2o-ops

h2o-ops commented May 14, 2023

JIRA Issue Migration Info

Jira Issue: PUBDEV-7861
Assignee: Adam Valenta
Reporter: Joseph Granados
State: Resolved
Fix Version: 3.32.0.3
Attachments: Available (Count: 2)
Development PRs: Available

Linked PRs from JIRA

#5184
#5190
#5135

Attachments From Jira

Attachment Name: Screen Shot 2020-10-29 at 2.58.08 PM.png
Attached By: Erin LeDell
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-7861/Screen Shot 2020-10-29 at 2.58.08 PM.png

Attachment Name: Screen Shot 2020-10-29 at 2.58.08 PM (012f8f10-6342-4732-96e9-09b1945382a3).png
Attached By: Joseph Granados
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-7861/Screen Shot 2020-10-29 at 2.58.08 PM (012f8f10-6342-4732-96e9-09b1945382a3).png
