
Support pyspark #19

Closed
melin opened this issue Dec 7, 2021 · 6 comments · Fixed by #51
Assignees
Labels
help wanted Community: does anyone want to work on it?

Comments


melin commented Dec 7, 2021

Supporting PySpark would be best. Algorithm engineers are familiar with Python, and it is easy to use.

@harryprince

same need +1

@Nicole00 Nicole00 added the help wanted Community: does anyone want to work on it? label Jan 28, 2022

wey-gu commented Apr 19, 2022

@melin @harryprince
@melin @harryprince
I spent some time today and figured out that PySpark is supported out of the box; I will write more in a doc/blog later.

```shell
/spark/bin/pyspark --driver-class-path nebula-spark-connector-3.0.0.jar \
    --jars nebula-spark-connector-3.0.0.jar
```

```python
df = (spark.read.format("com.vesoft.nebula.connector.NebulaDataSource")
      .option("type", "vertex")
      .option("spaceName", "basketballplayer")
      .option("label", "player")
      .option("returnCols", "name,age")
      .option("metaAddress", "metad0:9559")
      .option("partitionNumber", 1)
      .load())
```

```
>>> df.show(n=2)
+---------+--------------+---+
|_vertexId|          name|age|
+---------+--------------+---+
|player105|   Danny Green| 31|
|player109|Tiago Splitter| 34|
+---------+--------------+---+
only showing top 2 rows
```
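To avoid repeating the long `.option(...)` chain by hand, the same read can be driven from a plain dict. This is a sketch under the assumption that an edge scan takes the analogous options with `type` set to `"edge"`; the helper name `nebula_read_options` and the `follow`/`degree` names are illustrative, taken from the basketballplayer example space.

```python
def nebula_read_options(type_, space, label, meta_address,
                        return_cols=None, partitions=1):
    """Build the option map for a NebulaDataSource read (hypothetical helper)."""
    opts = {
        "type": type_,                       # "vertex" or "edge"
        "spaceName": space,
        "label": label,
        "metaAddress": meta_address,
        "partitionNumber": str(partitions),
    }
    if return_cols:                          # omit to fetch all props by default
        opts["returnCols"] = ",".join(return_cols)
    return opts

# Assumed usage for an edge scan (not run here, needs a live cluster):
# reader = spark.read.format("com.vesoft.nebula.connector.NebulaDataSource")
# for k, v in nebula_read_options("edge", "basketballplayer", "follow",
#                                 "metad0:9559", ["degree"]).items():
#     reader = reader.option(k, v)
# df = reader.load()
```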


wey-gu commented Apr 19, 2022

@harryprince

Could the schema information be detected automatically, the way Hive does with its metastore? Specifying the schema via options does not seem ideal when the column list is long.


wey-gu commented Apr 20, 2022

> Could the schema information be detected automatically, the way Hive does with its metastore? Specifying the schema via options does not seem ideal when the column list is long.

I think you could use the nebula-python client to fetch the meta/schema more easily (it should also be possible via the Spark connector, since py4j is under the hood, though I haven't tried that yet). Also, note that `returnCols` isn't mandatory; if it's omitted, all props are fetched by default.
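One way to follow that advice is to fetch the property names once (e.g. by running `DESCRIBE TAG player` through the nebula-python client) and join them into the `returnCols` string. The helper below is a sketch; the commented usage assumes nebula-python's result rows expose the property name in their first column, which I have not verified here.

```python
def props_to_return_cols(prop_names):
    """Join property names into the comma-separated returnCols option value."""
    return ",".join(prop_names)

# Hypothetical usage with the nebula-python client (not run here):
# with pool.session_context("root", "nebula") as session:
#     session.execute("USE basketballplayer")
#     rs = session.execute("DESCRIBE TAG player")
#     names = [row.values[0].get_sVal().decode() for row in rs.rows()]
#     cols = props_to_return_cols(names)   # e.g. "name,age"
```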


wey-gu commented Aug 23, 2022

Both write and read examples are now provided in #55.
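For readers who don't want to click through, a write call can mirror the read above by packing the options into a dict. This is a hedged sketch only: the option names `vidField`, `user`, `passwd`, and `graphAddress` are assumptions from memory, not verified against the canonical examples in #55, and the `.mode()` choice depends on your use case.

```python
def nebula_write_options(space, tag, vid_field, meta_address,
                         graph_address, user="root", passwd="nebula"):
    """Assemble assumed NebulaDataSource write options (hypothetical helper)."""
    return {
        "type": "vertex",
        "spaceName": space,
        "label": tag,
        "vidField": vid_field,        # DataFrame column used as the vertex id
        "metaAddress": meta_address,
        "graphAddress": graph_address,
        "user": user,
        "passwd": passwd,
    }

# Assumed usage (not run here, needs a live cluster):
# (df.write.format("com.vesoft.nebula.connector.NebulaDataSource")
#    .options(**nebula_write_options("basketballplayer", "player", "id",
#                                    "metad0:9559", "graphd:9669"))
#    .mode("append").save())
```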
