[Improvement] Add methods for Spark Reader and improve the performance #86

Closed
@lixueclaire

Description

Is your feature request related to a problem? Please describe.

  1. Support reading multiple property groups in VertexReader and EdgeReader.
  2. Optimize the Spark Reader when reading multiple property groups simultaneously, and preserve the row order in the resulting DataFrame.
  3. When reading multiple chunks simultaneously, add indices by default.
  4. Update the examples and related documentation.

Describe the solution you'd like

  1. Add methods to VertexReader and EdgeReader that accept a list of property groups and read the corresponding chunks.
  2. Currently, reading multiple property groups is done by adding an index column to each DataFrame and joining the DataFrames on it. We would like to concatenate the DataFrames row by row instead, without repartitioning or shuffling.
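
To make the difference between the two approaches concrete, here is a minimal sketch in plain Python. It is not the GraphAr or Spark API: lists of dicts stand in for the per-property-group DataFrames, and the function names `join_by_index` and `concat_row_by_row` as well as the sample data are hypothetical. It assumes every property group holds the same vertices in the same order, which is what makes lockstep concatenation possible.

```python
from typing import Dict, List

Row = Dict[str, object]


def join_by_index(groups: List[List[Row]]) -> List[Row]:
    """Baseline: tag every row with its index, then merge rows on that key.

    This mirrors the current "add indices and join" approach. In a
    distributed setting the join step is what triggers a shuffle.
    """
    merged: Dict[int, Row] = {}
    for rows in groups:
        for idx, row in enumerate(rows):
            merged.setdefault(idx, {}).update(row)
    # Sort by index to restore the original row order after the join.
    return [merged[i] for i in sorted(merged)]


def concat_row_by_row(groups: List[List[Row]]) -> List[Row]:
    """Proposed idea: walk the aligned groups in lockstep.

    No index column and no join are needed, and row order is preserved
    by construction. Assumes all groups have the same number of rows.
    """
    if len({len(rows) for rows in groups}) > 1:
        raise ValueError("property groups must have the same number of rows")
    result: List[Row] = []
    for aligned in zip(*groups):
        combined: Row = {}
        for part in aligned:
            combined.update(part)
        result.append(combined)
    return result


# Hypothetical data: two property groups describing the same three vertices.
id_name = [{"id": 0, "name": "a"}, {"id": 1, "name": "b"}, {"id": 2, "name": "c"}]
age = [{"age": 30}, {"age": 25}, {"age": 41}]

combined = concat_row_by_row([id_name, age])
```

Both functions return the same rows in the same order for aligned inputs; the second simply avoids building and matching on a key. In Spark, a comparable effect is usually only available when the inputs have identical partitioning and identical row counts per partition, so whether it applies here depends on how the chunks are laid out.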
