Re #516 #533

Closed
thbeh opened this Issue Mar 14, 2017 · 5 comments

thbeh commented Mar 14, 2017

Hi hbhanawat,

So the compilation works, but I got a complaint about the Hive folder not being found (see towards the end of the output pasted below) -

[mapr@myspark ~]$ /opt/mapr/spark/spark-2.0.1/bin/spark-shell --master yarn --jars '/home/mapr/ext-jars/snappydata-core_2.11-0.7.jar' --conf spark.snappydata.store.locators=locator1:10334
17/03/06 20:27:50 WARN ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1070)
17/03/06 20:28:04 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at http://192.168.100.96:4040
Spark context available as 'sc' (master = yarn, app id = application_1485921759427_0065).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.1-mapr-1611
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_121)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import org.apache.spark.sql.{SnappySession, SparkSession}
import org.apache.spark.sql.{SnappySession, SparkSession}

scala> val snSession = new SnappySession(sc)
snSession: org.apache.spark.sql.SnappySession = org.apache.spark.sql.SnappySession@2ae0eb98

scala> val colTable = snSession.table("TestColumnTable")
java.lang.RuntimeException: java.io.FileNotFoundException: File /user/mapr/tmp/hive does not exist
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:189)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:247)
at org.apache.spark.sql.hive.HiveClientUtil.newClient(HiveClientUtil.scala:235)
at org.apache.spark.sql.hive.HiveClientUtil.<init>(HiveClientUtil.scala:129)
at org.apache.spark.sql.internal.SnappySharedState.metadataHive$lzycompute(SnappySharedState.scala:33)
at org.apache.spark.sql.internal.SnappySharedState.metadataHive(SnappySharedState.scala:33)
at org.apache.spark.sql.internal.SnappySharedState.externalCatalog$lzycompute(SnappySharedState.scala:37)
at org.apache.spark.sql.internal.SnappySharedState.externalCatalog(SnappySharedState.scala:36)
at org.apache.spark.sql.internal.SnappySessionState.catalog$lzycompute(SnappySessionState.scala:200)
at org.apache.spark.sql.internal.SnappySessionState.catalog(SnappySessionState.scala:199)
at org.apache.spark.sql.internal.SnappySessionState.catalog(SnappySessionState.scala:53)
at org.apache.spark.sql.SparkSession.table(SparkSession.scala:568)
at org.apache.spark.sql.SparkSession.table(SparkSession.scala:564)
... 48 elided
Caused by: java.io.FileNotFoundException: File /user/mapr/tmp/hive does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:607)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:877)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:597)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:602)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
... 61 more

scala> :q
17/03/06 20:30:55 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(1,WrappedArray())
[mapr@myspark ~]$ hadoop fs -ls /user/mapr/tmp/hive
Found 1 items
drwxr-xr-x - mapr mapr 1 2017-01-31 04:05 /user/mapr/tmp/hive/mapr
[mapr@myspark ~]$ hadoop fs -ls /user/mapr/tmp/hive/mapr
Found 1 items
drwxr-xr-x - mapr mapr 1 2017-01-31 04:05 /user/mapr/tmp/hive/mapr/9668005d-1bfe-4ce1-8513-f4a95ac3b8b3
[mapr@myspark ~]$ hadoop fs -ls /user/mapr/tmp/hive/mapr/9668005d-1bfe-4ce1-8513-f4a95ac3b8b3
Found 1 items
drwxr-xr-x - mapr mapr 0 2017-01-31 04:05 /user/mapr/tmp/hive/mapr/9668005d-1bfe-4ce1-8513-f4a95ac3b8b3/_tmp_space.db
[mapr@myspark ~]$
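As a side note, the stack trace goes through RawLocalFileSystem, so the path is being resolved on the node's local disk rather than on MapR-FS; a plain ls against the same path (a hypothetical check, not part of the session above) would show whether it exists locally:

[mapr@myspark ~]$ ls -ld /user/mapr/tmp/hive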

Cheers

hbhanawat commented Mar 15, 2017

SnappyData internally uses some Hive binaries. It looks like on MapR there is some scratch space that the Hive functionality is using. As you can see in the exception, it is trying to find the /user/mapr/tmp/hive folder on your local file system, not on Hadoop. Can you try explicitly specifying Hive's scratch directory somewhere on your local file system, as described at http://doc.mapr.com/display/MapR/Hive+Directories, and see if it works?
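A rough sketch of that suggestion (assuming the standard hive.exec.scratchdir property is the one to override and that Spark's spark.hadoop.* prefix propagates it to the embedded Hive client; the local path below is just an example):

# create a scratch dir on the local filesystem and point Hive's scratch space at it
[mapr@myspark ~]$ mkdir -p /tmp/hive-scratch
[mapr@myspark ~]$ /opt/mapr/spark/spark-2.0.1/bin/spark-shell --master yarn \
    --jars '/home/mapr/ext-jars/snappydata-core_2.11-0.7.jar' \
    --conf spark.snappydata.store.locators=locator1:10334 \
    --conf spark.hadoop.hive.exec.scratchdir=/tmp/hive-scratch

Setting hive.exec.scratchdir in hive-site.xml, as the MapR page describes, should have the same effect.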

thbeh commented Mar 17, 2017

thbeh commented Mar 19, 2017

Got it sorted. It seems that only --deploy-mode client works: it physically looks on the client filesystem for '/user/mapr/tmp/hive' instead of on HDFS. It still fails in cluster mode, though, and it is not clear from the logs what causes the failure.
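For reference, that corresponds to making the deploy mode explicit in the original invocation (the flag is the only addition; client is also spark-shell's default):

[mapr@myspark ~]$ /opt/mapr/spark/spark-2.0.1/bin/spark-shell --master yarn --deploy-mode client \
    --jars '/home/mapr/ext-jars/snappydata-core_2.11-0.7.jar' \
    --conf spark.snappydata.store.locators=locator1:10334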

sumwale commented Apr 22, 2017

@thbeh thanks for the input. @hbhanawat @rishitesh should this be documented in more detail? This Hive meta-store has been causing too much trouble with the myriad of distributions. Perhaps it can be replaced with a different catalog implementation (like the ConnectorCatalog) that uses the store catalog directly?

sumwale commented Sep 19, 2017

@thbeh The default Hive location (not used by SnappyData but created by Hive meta-store initialization) has been changed to the working directory in #809 to avoid these kinds of issues, so closing this. Please check with the latest release (the final 1.0 release will be out this week).

sumwale closed this Sep 19, 2017
