External XGBoost on Hadoop #8171

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 2 comments

Comments


exalate-issue-sync bot commented May 11, 2023

Finish the work that started as part of the PoC (PUBDEV-6862).

Execute XGBoost on an external cluster.

  • TODO SECURITY
    ◦ cluster with HTTPS
    ◦ cluster with authentication
  • TODO STEAM: a mechanism for deciding whether to use remote execution, and for starting the remote cluster (via Steam)
    ◦ it may not be possible to establish an H2O->Steam connection
    ◦ think about making this usable on k8s

Data exchange

  • ~~The H2O Frame is converted to a DMatrix on each node and the DMatrix is written to the file system; the execution cluster then loads one DMatrix part on each node.~~
    ◦ ~~TODO: think about transferring the DMatrix data directly over TCP/HTTP, maybe only for smaller matrices, but maybe we could do a 1:1 transfer from a regular node to an XGB node.~~
    ◦ ~~TODO: when loading the DMatrix on the executor cluster, we first load it all into memory, then dump it into a local file, and then load it into a DMatrix (native memory); it may be unnecessary to have it in memory twice.~~
  • ~~Single Schema classes are used for HTTP req/resp; the actual data is passed as Base64-encoded binary data in JSON.~~
  • ~~The binary data in JSON is either a Java serialized object or raw XGBoost booster bytes.~~
  • ~~TODO: the above is fine except when transmitting a large booster; this may lead to HTTP timeouts, and we might want to use a streaming req/resp and move away from the SchemaV3-based API.~~
  • ~~TODO: think about connection pooling to get faster HTTP turnaround (see the comment from Pavel on "[PUBDEV-6862] XGBoost off cluster POC" #4344, files#r401372608).~~
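
To illustrate the trade-off described in the list above, here is a minimal Python sketch (not the actual H2O implementation, which is Java). It shows binary data carried as Base64 text inside a JSON body, and a chunked copy of the kind a streaming req/resp could use to avoid holding a large booster fully in memory. The `data` field name, the function names, and the stand-in payload are all hypothetical.

```python
import base64
import io
import json
import shutil


def encode_payload(raw: bytes) -> str:
    """Wrap binary data in a JSON body as Base64 text.

    The "data" field name is an assumption made for this sketch.
    """
    return json.dumps({"data": base64.b64encode(raw).decode("ascii")})


def decode_payload(body: str) -> bytes:
    """Recover the original bytes from the JSON body."""
    return base64.b64decode(json.loads(body)["data"])


def stream_copy(src, dst, chunk_size: int = 64 * 1024) -> None:
    """Copy a binary stream in fixed-size chunks.

    Only one chunk is buffered at a time, so the full payload is never
    held in memory by this function.
    """
    shutil.copyfileobj(src, dst, chunk_size)


# Stand-in for raw booster bytes (hypothetical data, 3072 bytes).
booster_bytes = bytes(range(256)) * 12

# JSON + Base64 round trip: lossless, but the encoded text is larger.
body = encode_payload(booster_bytes)
assert decode_payload(body) == booster_bytes
print(f"raw: {len(booster_bytes)} bytes, JSON body: {len(body)} chars")

# Chunked alternative: in-memory streams stand in for a socket and a file.
source = io.BytesIO(booster_bytes)
sink = io.BytesIO()
stream_copy(source, sink, chunk_size=1024)
assert sink.getvalue() == booster_bytes
```

Base64 encodes every 3 input bytes as 4 output characters, so the JSON body is roughly a third larger than the raw bytes, and both sides have to materialise the whole string before parsing. That is harmless for small boosters and increasingly costly for large ones, which is the concern behind the streaming TODO.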
@exalate-issue-sync
Author

Jan Sterba commented: initial impl done

Collaborator

h2o-ops commented May 14, 2023

JIRA Issue Migration Info

Jira Issue: PUBDEV-7467
Assignee: Jan Sterba
Reporter: Jan Sterba
State: Closed
Fix Version: N/A
Attachments: N/A
Development PRs: Available

Linked PRs from JIRA

#4530
#4635
#4640
#4654
#4714
#4743
#4783
#4789
#4805
#4810
#4814

h2o-ops closed this as completed May 14, 2023