
Failed to Create H20Context #2838

Closed
tsengj opened this issue Nov 18, 2022 · 3 comments
tsengj commented Nov 18, 2022

I followed all the steps here but am unable to create an H2O context.

https://docs.h2o.ai/sparkling-water/3.3/latest-stable/doc/rsparkling.html#install-sparklyr

Clear libraries

# The following two commands remove any previously installed H2O packages for R.
if ("package:rsparkling" %in% search()) { detach("package:rsparkling", unload=TRUE) }
if ("rsparkling" %in% rownames(installed.packages())) { remove.packages("rsparkling") }

if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }

# Install packages H2O depends on
pkgs <- c("methods", "statmod", "stats", "graphics", "RCurl", "jsonlite", "tools", "utils")
for (pkg in pkgs) {
    if (! (pkg %in% rownames(installed.packages()))) { install.packages(pkg) }
}

Install libraries

if (!require("h2o", quietly = TRUE)) install.packages("h2o", type = "source", repos = "http://h2o-release.s3.amazonaws.com/h2o/rel-zygmund/2/R")
if (!require("rsparkling")) install.packages("rsparkling", type = "source", repos = "http://h2o-release.s3.amazonaws.com/sparkling-water/spark-3.3/3.38.0.2-1-3.3/R") 
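Since rsparkling is only the R front end for one specific Sparkling Water build, a quick version sanity check can rule out a package mismatch. A minimal sketch, with the expected version strings taken from the repository URLs above:

```r
# Sketch: confirm the installed R packages correspond to the Sparkling Water
# build referenced above (H2O 3.38.0.2, Sparkling Water 3.38.0.2-1-3.3).
# A mismatch between the h2o package and the H2O bundled inside Sparkling
# Water is a common cause of startup failures.
stopifnot("h2o" %in% rownames(installed.packages()),
          "rsparkling" %in% rownames(installed.packages()))
print(packageVersion("h2o"))
print(packageVersion("rsparkling"))
```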

library("rsparkling")
library("h2o")

Installed successfully:

* installing *source* package ‘h2o’ ...
** using staged installation
** R
** demo
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (h2o)
* installing *source* package ‘rsparkling’ ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
No man pages found in package  ‘rsparkling’ 
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (rsparkling)
Installing package into ‘/local_disk0/.ephemeral_nfs/envs/rEnv-fc8e4f75-68fe-4030-b830-1b5166d74880’
(as ‘lib’ is unspecified)
trying URL 'http://h2o-release.s3.amazonaws.com/h2o/rel-zygmund/2/R/src/contrib/h2o_3.38.0.2.tar.gz'
Content type 'application/x-tar' length 177414951 bytes (169.2 MB)
==================================================
downloaded 169.2 MB


The downloaded source packages are in
	‘/tmp/RtmpeQOg9g/downloaded_packages’
Loading required package: rsparkling
Warning in library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘rsparkling’
Installing package into ‘/local_disk0/.ephemeral_nfs/envs/rEnv-fc8e4f75-68fe-4030-b830-1b5166d74880’
(as ‘lib’ is unspecified)
trying URL 'http://h2o-release.s3.amazonaws.com/sparkling-water/spark-3.3/3.38.0.2-1-3.3/R/src/contrib/rsparkling_3.38.0.2-1-3.3.tar.gz'
Content type 'application/x-tar' length 161159310 bytes (153.7 MB)
==================================================
downloaded 153.7 MB


The downloaded source packages are in
	‘/tmp/RtmpeQOg9g/downloaded_packages’

----------------------------------------------------------------------

Your next step is to start H2O:
    > h2o.init()

For H2O package documentation, ask for help:
    > ??h2o

After starting H2O, you can use the Web UI at http://localhost:54321
For more information visit https://docs.h2o.ai

----------------------------------------------------------------------


Attaching package: ‘h2o’

The following objects are masked from ‘package:stats’:

    cor, sd, var

The following objects are masked from ‘package:base’:

    &&, %*%, %in%, ||, apply, as.factor, as.numeric, colnames,
    colnames<-, ifelse, is.character, is.factor, is.numeric, log,
    log10, log1p, log2, round, signif, trunc

Load Libraries and Establish Connection

#load libraries
library(sparklyr)
library(tidyverse)
library(lubridate)

# spark_home_set()
config <- spark_config()
# Memory
config["sparklyr.shell.driver-memory"] <- "64g"
# Cores
config["sparklyr.connect.cores.local"] <- 8

sc <- spark_connect(method = "databricks", config = config) #remotely spark_home = "c:/programdata/anaconda3/lib/site-packages/pyspark"
options(warn=-1) #suppress warning messages
h2o.init()

Create H2O context (fails here)

h2oConf <- H2OConf()

Error Log

Error : java.lang.ClassNotFoundException: ai.h2o.sparkling.H2OConf
	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at sparklyr.StreamHandler.handleMethodCall(stream.scala:111)
	at sparklyr.StreamHandler.read(stream.scala:62)
	at sparklyr.BackendHandler.$anonfun$channelRead0$1(handler.scala:60)
	at scala.util.control.Breaks.breakable(Breaks.scala:42)
	at sparklyr.BackendHandler.channelRead0(handler.scala:41)
	at sparklyr.BackendHandler.channelRead0(handler.scala:14)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:327)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:299)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:750)

Error: java.lang.ClassNotFoundException: ai.h2o.sparkling.H2OConf
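The ClassNotFoundException indicates that the Sparkling Water JVM classes are not on the Spark driver's classpath; installing the rsparkling R package alone does not put them there. A possible sketch, under the assumptions that Spark runs in local/standalone mode and that the `ai.h2o:sparkling-water-package_2.12` Maven coordinate matches the version installed above (on a Databricks cluster the Sparkling Water assembly JAR would instead need to be attached as a cluster library, since `spark_connect(method = "databricks")` joins an existing session):

```r
# Sketch (untested assumption): make the Sparkling Water assembly available
# to the Spark session before connecting. The coordinate's version must match
# the rsparkling build installed above (3.38.0.2-1-3.3 for Spark 3.3).
library(sparklyr)

config <- spark_config()
config$sparklyr.defaultPackages <- c(
  config$sparklyr.defaultPackages,
  "ai.h2o:sparkling-water-package_2.12:3.38.0.2-1-3.3"
)

sc <- spark_connect(master = "local", config = config)
```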

Session Info

R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.8.0           forcats_0.5.1            
 [3] stringr_1.4.0             dplyr_1.0.9              
 [5] purrr_0.3.4               readr_2.1.2              
 [7] tidyr_1.2.0               tibble_3.1.7             
 [9] ggplot2_3.3.6             tidyverse_1.3.1          
[11] sparklyr_1.7.5            h2o_3.38.0.2             
[13] rsparkling_3.38.0.2-1-3.3

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.2  forge_0.2.0       haven_2.5.0       colorspace_2.0-3 
 [5] vctrs_0.4.1       generics_0.1.2    htmltools_0.5.2   yaml_2.3.5       
 [9] base64enc_0.1-3   utf8_1.2.2        SparkR_3.3.0      rlang_1.0.2      
[13] pillar_1.7.0      withr_2.5.0       glue_1.6.2        DBI_1.1.2        
[17] Rserve_1.8-10     dbplyr_2.1.1      modelr_0.1.8      readxl_1.4.0     
[21] lifecycle_1.0.1   munsell_0.5.0     gtable_0.3.0      cellranger_1.1.0 
[25] rvest_1.0.2       htmlwidgets_1.5.4 tzdb_0.3.0        fastmap_1.1.0    
[29] curl_4.3.2        parallel_4.1.3    fansi_1.0.3       broom_0.8.0      
[33] r2d3_0.2.6        backports_1.4.1   scales_1.2.0      jsonlite_1.8.0   
[37] config_0.3.1      fs_1.5.2          hms_1.1.1         digest_0.6.29    
[41] stringi_1.7.6     rprojroot_2.0.3   grid_4.1.3        cli_3.3.0        
[45] tools_4.1.3       bitops_1.0-7      magrittr_2.0.3    RCurl_1.98-1.9   
[49] crayon_1.5.1      pkgconfig_2.0.3   ellipsis_0.3.2    xml2_1.3.3       
[53] reprex_2.0.1      assertthat_0.2.1  httr_1.4.3        rstudioapi_0.13  
[57] R6_2.5.1          compiler_4.1.3   
krasinski (Member) commented
@tsengj can you please describe what you're trying to do and what the environment looks like? Where does it run? Are you connecting from your local PC to Databricks? Which Databricks cluster is that?


tsengj commented Feb 1, 2023

As I understand it, I need an H2O context to move data frames between Spark and H2O, as in the snippet below. If I cannot establish the H2O context, how else can I convert a Spark data frame to an H2O frame?

Thanks,

mtcars_tbl <- copy_to(sc, mtcars, "mtcars",overwrite = TRUE)
mtcars_hf <- hc$asH2OFrame(mtcars_tbl)
mtcars_hf
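For reference, the flow described in the rsparkling documentation, once the context can be created, would look roughly like this sketch; `H2OContext.getOrCreate()` is the rsparkling entry point and `asH2OFrame()` performs the Spark-to-H2O conversion:

```r
library(sparklyr)
library(rsparkling)
library(h2o)

sc <- spark_connect(method = "databricks")

# Requires the Sparkling Water classes on the cluster classpath --
# their absence is what produces the ClassNotFoundException above.
hc <- H2OContext.getOrCreate()

mtcars_tbl <- copy_to(sc, mtcars, "mtcars", overwrite = TRUE)
mtcars_hf <- hc$asH2OFrame(mtcars_tbl)  # Spark DataFrame -> H2OFrame
head(mtcars_hf)
```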

krasinski (Member) commented

@tsengj I would need some more details about your use case to be able to reproduce this; I asked some questions in the comment above. The reason I asked is that not everything is clear: I see some local Windows paths, some Databricks mentions, etc.
