Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support path in nGQL, optimized nGQL edge reader #133

Merged
merged 14 commits into from Dec 21, 2023

Conversation

wey-gu
Copy link
Contributor

@wey-gu wey-gu commented Nov 21, 2023

What type of PR is this?

  • bug
  • feature
  • enhancement

What problem(s) does this PR solve?

Issue(s) number:

close #132

Description:

We should not expect only edge or list of edges to be in resultSet for nGQL, this change fixes it.

How do you solve it?

  • Parse Relationship, too
  • Fix spark3 missing OperateType
  • Added examples and e2e tests for match/subgraph/find path with ngqlEdgeReader
  • Added non-edge in List handling subroutine, i.e. subgraph return list of nodes could trigger exceptions that were unexpected before.

@wey-gu wey-gu requested a review from Nicole00 November 21, 2023 11:05
} else {
LOG.error(s"Exception convert edge type ${valueType} ")
throw new RuntimeException(" convert value type failed");
LOG.warn(s"Unexpected edge type ${valueType}. Skipping.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connector is an API not a tool, why just skip but not throw exception?

Copy link
Contributor Author

@wey-gu wey-gu Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense, was doing so due to some of the users are not familiar, or experienced enough to inspect error logs properly.

Will bring back the exception.

But we may need to handle NULL/Empty properly in such cases.

@wey-gu wey-gu requested a review from Nicole00 November 22, 2023 05:37
@@ -83,6 +83,12 @@ object NebulaConnectionConfig {
/**
* set connectionRetry, connectionRetry is optional
*/
@deprecated("use withConnectionRetry instead", "3.7.0")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nicole00 could we do so to correct the typo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe put 4.0.0/5.0.0 here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update, after the rebase I realized I was doing this on the wrong base, but we should maintain the capability of the interface for a while, so keep this old typo for a while. Or it'll break things

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree! great idea.

@@ -70,12 +70,19 @@ class NebulaNgqlEdgePartitionReader extends InputPartitionReader[InternalRow] {
val list: mutable.Buffer[ValueWrapper] = value.asList()
edges.appendAll(
list.toStream
.filter(e => checkLabel(e.asRelationship()))
.filter(e => e.isEdge() && checkLabel(e.asRelationship()))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is helpful @Nicole00 to mitigate the case like NULL/EMPTY in List?

@wey-gu wey-gu changed the title feat: support path in nGQL, not raise exception for other types feat: support path in nGQL, optimized nGQL edge reader Nov 22, 2023
@wey-gu
Copy link
Contributor Author

wey-gu commented Nov 22, 2023

It's strange that after rebasing I kept got this error:

Screenshot 2023-11-22 at 17 10 56
- read edge from nGQL: MATCH ()-[e:friend]->() RETURN e LIMIT 1000 *** FAILED ***
  java.util.NoSuchElementException: key not found: operateType
  at scala.collection.MapLike$class.default(MapLike.scala:228)
  at org.apache.spark.sql.catalyst.util.CaseInsensitiveMap.default(CaseInsensitiveMap.scala:28)
  at scala.collection.MapLike$class.apply(MapLike.scala:141)
  at org.apache.spark.sql.catalyst.util.CaseInsensitiveMap.apply(CaseInsensitiveMap.scala:28)
  at com.vesoft.nebula.connector.NebulaOptions.<init>(NebulaOptions.scala:38)
  at com.vesoft.nebula.connector.NebulaDataSource.getNebulaOptions(NebulaDataSource.scala:147)
  at com.vesoft.nebula.connector.NebulaDataSource.createReader(NebulaDataSource.scala:40)
  at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$SourceHelpers.createReader(DataSourceV2Relation.scala:155)
  at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.create(DataSourceV2Relation.scala:172)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:204)
  ...

update: fixed by adding options...

@wey-gu wey-gu force-pushed the ngql_reader branch 6 times, most recently from 618b530 to e0bacf7 Compare November 23, 2023 11:54
@codecov-commenter
Copy link

codecov-commenter commented Nov 23, 2023

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (49306d0) 32.04% compared to head (e4b6ef5) 36.35%.

Files Patch % Lines
...nnector/reader/NebulaNgqlEdgePartitionReader.scala 30.00% 3 Missing and 4 partials ⚠️
...ala/com/vesoft/nebula/connector/NebulaConfig.scala 0.00% 2 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #133      +/-   ##
============================================
+ Coverage     32.04%   36.35%   +4.30%     
- Complexity       79       92      +13     
============================================
  Files            31       31              
  Lines          1504     1513       +9     
  Branches        226      230       +4     
============================================
+ Hits            482      550      +68     
+ Misses          944      868      -76     
- Partials         78       95      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@Nicole00 Nicole00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Nicole00 Nicole00 merged commit 4666866 into vesoft-inc:master Dec 21, 2023
2 checks passed
@wey-gu wey-gu deleted the ngql_reader branch December 22, 2023 03:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: ngql reader to support arbitrary query result parsing
3 participants