
kafka-value-parser #29

Merged
merged 1 commit into from
Nov 25, 2021

Conversation

sworduo
Contributor

@sworduo sworduo commented Nov 23, 2021

Issue #8: import data from Kafka into Nebula.

@codecov-commenter

codecov-commenter commented Nov 24, 2021

Codecov Report

Merging #29 (393136f) into master (389bead) will decrease coverage by 0.20%.
The diff coverage is 11.76%.


@@             Coverage Diff              @@
##             master      #29      +/-   ##
============================================
- Coverage     28.24%   28.04%   -0.21%     
  Complexity        6        6              
============================================
  Files            24       24              
  Lines          2078     2111      +33     
  Branches        388      396       +8     
============================================
+ Hits            587      592       +5     
- Misses         1389     1412      +23     
- Partials        102      107       +5     
Impacted Files Coverage Δ
...cala/com/vesoft/nebula/exchange/ErrorHandler.scala 0.00% <0.00%> (ø)
...in/scala/com/vesoft/nebula/exchange/Exchange.scala 2.07% <0.00%> (-0.05%) ⬇️
...t/nebula/exchange/reader/StreamingBaseReader.scala 0.00% <0.00%> (ø)
...la/com/vesoft/nebula/exchange/config/Configs.scala 63.65% <50.00%> (-0.54%) ⬇️
...cala/com/vesoft/nebula/exchange/MetaProvider.scala 30.15% <0.00%> (-1.33%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 389bead...393136f. Read the comment docs.

Before:
class KafkaReader(override val session: SparkSession, kafkaConfig: KafkaSourceConfigEntry)

After:
class KafkaReader(override val session: SparkSession,
                  kafkaConfig: KafkaSourceConfigEntry,
                  fields: List[String])
Contributor

Please deduplicate (distinct) the fields before using them.
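The suggestion above can be sketched in plain Scala. The helper name dedupFields is hypothetical; the point is that List.distinct keeps the first occurrence of each element and preserves order:

```scala
// Hypothetical helper: deduplicate the configured field names while
// preserving their original order, as the reviewer suggests.
def dedupFields(fields: List[String]): List[String] =
  fields.distinct

// Example: a duplicated field is dropped on its second occurrence.
val fields  = List("src", "dst", "src", "degree")
val deduped = dedupFields(fields)
println(deduped) // List(src, dst, degree)
```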

.selectExpr("CAST(value AS STRING)")
.as[(String)]
.withColumn("value", from_json(col("value"), jsonSchema))
.select("value.*")
Contributor

Don't we need to alias the DataFrame's column names to the names in fields?

Contributor Author

I printed the column names, and they are the names in fields. It works on my machine.

Contributor

Yeah, I tested it and the schema is the same as fields. Great work~

@@ -172,7 +173,8 @@ object Exchange {
LOG.info(s"field keys: ${fieldKeys.mkString(", ")}")
val nebulaKeys = edgeConfig.nebulaFields
LOG.info(s"nebula keys: ${nebulaKeys.mkString(", ")}")
val data = createDataSource(spark, edgeConfig.dataSourceConfigEntry)
val fields = edgeConfig.sourceField::edgeConfig.targetField::edgeConfig.fields
Contributor

edgeConfig.rankField should also be added.
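A minimal sketch of the suggested fix, assuming the config exposes the rank field as an Option. The EdgeConfig shape and field names here are illustrative stand-ins, not the project's actual Configs.scala:

```scala
// Hypothetical, simplified shape of the edge config; the real
// Configs.scala fields may differ.
case class EdgeConfig(sourceField: String,
                      targetField: String,
                      rankField: Option[String],
                      fields: List[String])

// Build the column list to read from Kafka: source, target, the
// optional rank field (if configured), then the property fields.
def kafkaFields(edgeConfig: EdgeConfig): List[String] =
  edgeConfig.sourceField :: edgeConfig.targetField ::
    (edgeConfig.rankField.toList ++ edgeConfig.fields)

val cfg = EdgeConfig("src", "dst", Some("rank"), List("degree"))
println(kafkaFields(cfg)) // List(src, dst, rank, degree)
```

Using Option.toList keeps the expression a single cons chain whether or not a rank field is configured.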

batch: 10
interval.seconds: 10
}
# {
Contributor

please roll back these changes

Contributor Author

A new check was added to the config parser that throws an exception if any other config is defined after kafka; see Configs.scala. However, there are two kafka sources defined in application.conf, so if I don't comment out this section, the test will not pass.

Contributor

OK, I'll split the Kafka config out and use a separate config file for Kafka later.
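A standalone Kafka config file could look roughly like the following. This is only a sketch: every key name besides batch and interval.seconds (which appear in the diff above) is an illustrative guess, not the project's actual schema:

```hocon
# Hypothetical standalone Kafka source config.
# Key names are illustrative; the real keys are defined by the
# project's application.conf and Configs.scala.
{
  source: kafka
  service: "127.0.0.1:9092"   # Kafka broker address (assumed key)
  topic: "edge-topic"         # topic to consume (assumed key)
  batch: 10
  interval.seconds: 10
}
```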

Contributor

@Nicole00 Nicole00 left a comment

Great PR ~

Labels
doc affected PR: improvements or additions to documentation
3 participants