is it possible to use neo4j-spark-connector to write a Spark data frame to Neo4j database? #27

Open
simonegabbriellini opened this issue Oct 26, 2016 · 10 comments

@simonegabbriellini simonegabbriellini commented Oct 26, 2016

I am trying to write a Spark DataFrame to a Neo4j database and stumbled upon neo4j-spark-connector. I understand the examples that retrieve data from Neo4j for further processing in Spark, but does this connector also support the opposite workflow?

Member

@jexp jexp commented Oct 26, 2016

Hi @simonegabbriellini, yes, definitely.

There are some examples in the older API that support this; I just have to find a bit of time to add it to the new API.

See Neo4jDataFrame.mergeEdgeList for one example.

Author

@simonegabbriellini simonegabbriellini commented Oct 27, 2016

Thanks a lot! I asked on SO, but the only reply I got stated the opposite. I'll put this link in the answer to set the record straight.

Member

@jexp jexp commented Mar 7, 2017

Thanks a lot.

Member

@jexp jexp commented Mar 7, 2017

Were you successful with what you wanted to achieve?

Any other feedback on what I can improve in the connector, besides the API for writing and the documentation?


@syarram syarram commented Apr 26, 2017

Hi @jexp, can you please let us know whether we can write to Neo4j using Spark DataFrames or RDDs, or whether you are still working on it?

Member

@jexp jexp commented Jun 26, 2017

@syarram did you look at the examples in the README?
Did you try mergeEdgeList? https://github.com/neo4j-contrib/neo4j-spark-connector/blob/master/src/main/scala/org/neo4j/spark/Neo4jDataFrame.scala#L17

The new API should look like this. I have just worked on the code for saveGraph; the other data structures come next.

https://github.com/neo4j-contrib/neo4j-spark-connector/blob/master/src/main/scala/org/neo4j/spark/Neo4j.scala#L56


@ShahNewazKhan ShahNewazKhan commented Jan 11, 2018

Hi @jexp, do you have any such examples in Python?


@GreGGus GreGGus commented Mar 29, 2018

Looking for an example in Scala/Spark :)


@azdafirmansyah azdafirmansyah commented Jul 2, 2019

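You can use this:

import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{DataTypes, StructField, StructType}
import org.neo4j.spark.Neo4jDataFrame

// sc is an existing SparkContext (for example, the one provided by spark-shell)
val rows = sc.makeRDD(Seq(Row("Keanu Reeves", "Male", "The Matrix")))
val schema = StructType(
  Seq(
    StructField("name", DataTypes.StringType),
    StructField("gender", DataTypes.StringType),
    StructField("title", DataTypes.StringType)))
val dfx = new SQLContext(sc).createDataFrame(rows, schema)
// Merge each row as (Person)-[:ACTED_IN]->(Movie)
Neo4jDataFrame.mergeEdgeList(
  sc,
  dfx,
  ("Person",Seq("name","gender")),
  ("ACTED_IN",Seq.empty),
  ("Movie",Seq("title")))

This code works for me.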
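
To make the three `(label, properties)` arguments above easier to read, here is a rough sketch of the kind of Cypher statement such a call could translate into. This is an illustration only, not the connector's actual implementation: `MergeEdgeListSketch`, `mergeStatement`, the `$rows` parameter name, and the choice of the first listed property as the merge key are all assumptions made for this example. It runs in plain Scala, with no Spark or Neo4j dependency.

```scala
// Hypothetical helper, NOT part of neo4j-spark-connector.
// It only builds a Cypher string; it does not talk to Neo4j.
object MergeEdgeListSketch {
  // (node label or relationship type, property/column names)
  type Spec = (String, Seq[String])

  // Backtick-quote an identifier (assumes the name itself contains no backticks).
  private def q(name: String): String = "`" + name + "`"

  // Assumption: the first property of each node spec acts as the merge key,
  // and each row arrives as a map with "source", "target" and "relationship" entries.
  def mergeStatement(source: Spec, relationship: Spec, target: Spec): String = {
    val (srcLabel, srcProps) = source
    val (relType, _) = relationship
    val (tgtLabel, tgtProps) = target
    require(srcProps.nonEmpty && tgtProps.nonEmpty, "each node needs at least one key property")
    val srcKey = q(srcProps.head)
    val tgtKey = q(tgtProps.head)
    Seq(
      "UNWIND $rows AS row",
      s"MERGE (s:${q(srcLabel)} {$srcKey: row.source.$srcKey})",
      "  ON CREATE SET s += row.source",
      s"MERGE (t:${q(tgtLabel)} {$tgtKey: row.target.$tgtKey})",
      "  ON CREATE SET t += row.target",
      s"MERGE (s)-[r:${q(relType)}]->(t)",
      "  ON CREATE SET r += row.relationship"
    ).mkString("\n")
  }

  def main(args: Array[String]): Unit = {
    // Same specs as in the mergeEdgeList call above.
    println(mergeStatement(
      ("Person", Seq("name", "gender")),
      ("ACTED_IN", Seq.empty),
      ("Movie", Seq("title"))))
  }
}
```

Under these assumptions, `MERGE` on the key property keeps repeated names or titles from creating duplicate nodes, and `UNWIND` lets many rows share a single statement.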


@lansaloltd lansaloltd commented Jan 9, 2020

@jexp I don't think the mergeEdgeList example can realistically persist a DataFrame/Dataset in Neo4j. The reasons are explained in full here, but if you try it with a 1,000-row DataFrame (which should still be manageable locally), you should see the issue I'm referring to.
