TINKERPOP-1023: Add a spark variable in SparkGremlinPlugin like we do hdfs for HadoopGremlinPlugin #173

Merged: 12 commits into master on Dec 9, 2015

Contributor

okram commented Dec 8, 2015

https://issues.apache.org/jira/browse/TINKERPOP-1023

Like hdfs, there is now spark, which allows the user to manage their persisted contexts. In essence, the Spark Server looks like a file system with (named) RDDs accessible. For instance, you can spark.ls(), spark.rm(), spark.describe(). I added a SparkGremlinPluginTest which ensures that all the proper imports/etc. work in the Console. I also added the information to the reference docs. I published the reference docs so people can see it in action:

http://tinkerpop.apache.org/docs/3.1.1-SNAPSHOT/reference/#sparkgraphcomputer (scroll down to "Using A Persisted Context" section)

VOTE +1. (mvn clean install and Spark integration tests)

okram added some commits Dec 8, 2015

added Spark persisted RDD utility that can be spark.ls(), spark.head(), spark.rm(), spark.describe(), etc. in the Console. Really cool. Added a SparkGremlinPluginTest that verifies everything works as expected. Updated docs explaining the new tool.

Contributor

okram commented Dec 8, 2015

Note that I also tested this with Spark Server and it works great. This is a really really cool thing.

okram added some commits Dec 8, 2015


Contributor

dkuppitz commented Dec 8, 2015

  • mvn clean install: worked
  • manual tests using the new spark object: failed
daniel@cube /projects/apache/test/incubator-tinkerpop/gremlin-console/target/apache-gremlin-console-3.1.1-SNAPSHOT-standalone (TINKERPOP-1023) $ HADOOP_GREMLIN_LIBS=`pwd`/ext/hadoop-gremlin/lib:`pwd`/ext/spark-gremlin/lib bin/gremlin.sh

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
INFO  org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph  - HADOOP_GREMLIN_LIBS is set to: /projects/apache/test/incubator-tinkerpop/gremlin-console/target/apache-gremlin-console-3.1.1-SNAPSHOT-standalone/ext/hadoop-gremlin/lib:/projects/apache/test/incubator-tinkerpop/gremlin-console/target/apache-gremlin-console-3.1.1-SNAPSHOT-standalone/ext/spark-gremlin/lib
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.tinkergraph
gremlin> spark
No such property: spark for class: groovysh_evaluate
Display stack trace? [yN] N
gremlin> spark.create("local[4]")
No such property: spark for class: groovysh_evaluate
Display stack trace? [yN] N
gremlin> 

What am I missing here?


Contributor

okram commented Dec 8, 2015

@dkuppitz -- did you clear your grapes?

BVLP problem with InputRDD was not a real problem. It had to do with workers=1 and local[4]. This is ticket TINKERPOP-1025. Will merge with this work given that it's so simple and all related.

Contributor

dkuppitz commented Dec 9, 2015

Yes, that was the issue.

Update:

  • mvn clean install: worked
  • manual tests using the new spark object: worked

VOTE: +1

provided a general way to ensure that the number of partitions is NOT larger than the number of workers. Moreover, RDD.coalesce() does not require a shuffle as it only handles partition reduction.
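
To illustrate the idea in the commit message above, here is a minimal toy sketch in plain Python. It is not Spark or TinkerPop code: the names `capped_partition_count` and `coalesce` are hypothetical, and the contiguous grouping is an assumption made for simplicity (Spark picks its own grouping). It only shows why reducing the partition count can be done by merging whole partitions, with no per-element redistribution.

```python
from typing import List, TypeVar

T = TypeVar("T")


def capped_partition_count(num_partitions: int, num_workers: int) -> int:
    """Return the partition count to use: never more than the worker count."""
    if num_partitions < 1 or num_workers < 1:
        raise ValueError("partition and worker counts must be positive")
    return min(num_partitions, num_workers)


def coalesce(partitions: List[List[T]], target: int) -> List[List[T]]:
    """Toy model of shuffle-free partition reduction.

    Each existing partition is appended, intact, to exactly one new
    partition, so no element is redistributed individually. Asking for
    at least as many partitions as already exist leaves them unchanged.
    """
    if target < 1:
        raise ValueError("target must be positive")
    if target >= len(partitions):
        return [list(p) for p in partitions]
    merged: List[List[T]] = [[] for _ in range(target)]
    for index, partition in enumerate(partitions):
        # Contiguous grouping (an assumption of this sketch): old partition
        # `index` is merged into new partition index * target // old_count.
        merged[index * target // len(partitions)].extend(partition)
    return merged


if __name__ == "__main__":
    partitions = [[1, 2], [3], [4, 5, 6], [7]]  # four input partitions
    workers = 2                                  # hypothetical worker count
    target = capped_partition_count(len(partitions), workers)
    print(coalesce(partitions, target))
```

With four partitions and two workers, the cap yields two partitions and the sketch merges the first two and the last two inputs. A request for more partitions than exist is a no-op here, which mirrors why a shuffle-free coalesce can only reduce the partition count.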

Contributor

spmallette commented Dec 9, 2015

Builds and tests nicely. Did some simple manual tests with spark object - worked.

Just a reminder that upgrade docs are lagging a bit behind all the spark work that's been done.

VOTE: +1

@asfgit asfgit merged commit aef9528 into master Dec 9, 2015

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed

@asfgit asfgit deleted the TINKERPOP-1023 branch Feb 8, 2016
