
"Failed to create Hive context" warning on EC2 instance (/tmp/hive not writable) #386

Closed
fereshtehRS opened this issue Dec 15, 2016 · 2 comments

@fereshtehRS

sparklyr version 0.4.41

Logging this for reference.

  • Brought up an EC2 cluster with Spark 1.6.1
  • After going through the setup (install packages, ...), connecting to Spark produced the following warning:
Warning messages:
1: In value[[3L]](cond) :
  java.lang.RuntimeException: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:204)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:238)
    at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:218)
    at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:208)
    at org.apache.spark.sql.hive.HiveContext.functionRegistry$lzycompute(HiveContext.scala:462)
    at org.apache.spark.sql.hive.HiveContext.functionRegistry(HiveContext.scala:461)
    at org.apache.spark.sql.UDFRegistration.<init>(UDFRegistration.scala:40)
    at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:330)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:90)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:101)
    at sparklyr.Backend$.getOrCreateHiveCont [... truncated]
2: In create_hive_context_v1(sc) :
  Failed to create Hive context, falling back to SQL. Some operations, like window-functions, will not work
  • The root cause is not clear from that warning alone, but running the following surfaces it:
ctx <- spark_context(sc)

# Attempt to create the HiveContext directly so the underlying exception surfaces
invoke_new(
  sc,
  "org.apache.spark.sql.hive.HiveContext",
  ctx
)

which is:

 java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwx--x--x
  • The HDFS directory mentioned in the error had these permissions and ownership:
drwx--x--x   - rstudio supergroup          0 2016-12-15 16:32 /tmp/hive
  • Changed the permissions and group as follows:
drwxrwxr-x   - rstudio hadoop          0 2016-12-15 16:32 /tmp/hive

but even this was not enough. The only thing that worked was full write permissions:

drwxrwxrwx   - rstudio supergroup          0 2016-12-15 16:32 /tmp/hive
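
For reference, a minimal sketch of applying that last workaround from the same R session (this assumes the hadoop binary is on the PATH and the session user may modify /tmp/hive; the "777" mode mirrors the drwxrwxrwx listing above):

# Make the Hive scratch dir world-writable on HDFS, then retry creating the HiveContext
system2("hadoop", c("fs", "-chmod", "777", "/tmp/hive"))
invoke_new(sc, "org.apache.spark.sql.hive.HiveContext", spark_context(sc))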
@edgararuiz-zz
Contributor

Hi @fereshtehRS, alternatively, the hadoop fs command can be used to update the folder's ownership and permissions.
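
Something like the following (a sketch, not tested here; it assumes hadoop is on the PATH and the current user may change /tmp/hive) would reset the owner and group from R:

# Restore the owner/group shown in the original listing
system2("hadoop", c("fs", "-chown", "rstudio:supergroup", "/tmp/hive"))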

@javierluraschi
Collaborator

We've been using EMR heavily and haven't hit this one; perhaps an older EMR version triggers it. I think we can close this at this point.
