Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python][Client] Unify the graph level load_from and save to API #2919

Closed
wants to merge 7 commits into from

Conversation

acezen
Copy link
Collaborator

@acezen acezen commented Jun 19, 2023

This is a prototype implementation of the design of load_from and write_to Python API
related issue: #2836

Use examples:

import graphscope.Graph as Graph
# load from graphar file
g = Graph.load_from("graphar+file:///tmp/ldbc/ldbc_sample.graph.yml")
# load from oss file storage
g = Graph.load_from("graphar+oss://bucket/ldbc/ldbc_sample.graph.yml")
import graphscope
g = graphscope.load_ldbc()

# write to graphar format
g.save_to("/tmp/ldbc/", format="graphar")
# write to oss output
g.save_to("oss://bucket/ldbc", format="graphar")
# write to graphar format with configuration
g.save_to("/tmp/ldbc/", format="graphar", graphar_graph_name="ldbc", 
      graphar_vertex_block_size=1024, graphar_edge_block_size=2048,
      graphar_file_format="orc")

@acezen acezen changed the title [WIP][Do not merge] Prototype for load_from and write_to [WIP] Prototype for load_from and write_to Jun 20, 2023
@acezen acezen requested a review from yecol June 20, 2023 02:32
@acezen acezen changed the title [WIP] Prototype for load_from and write_to [Python][Client] Unify the graph level load_from and save to API Jun 26, 2023
@acezen acezen marked this pull request as ready for review June 26, 2023 01:38
@acezen acezen requested a review from siyuan0322 June 26, 2023 02:22
@codecov-commenter
Copy link

codecov-commenter commented Jun 26, 2023

Codecov Report

Merging #2919 (1ea86b3) into main (a6fd8ba) will decrease coverage by 6.55%.
Report is 1 commits behind head on main.
The diff coverage is 7.84%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2919      +/-   ##
==========================================
- Coverage   41.39%   34.84%   -6.55%     
==========================================
  Files         107      107              
  Lines       11203    11309     +106     
==========================================
- Hits         4637     3941     -696     
- Misses       6566     7368     +802     
Files Coverage Δ
python/graphscope/__init__.py 82.85% <ø> (-0.48%) ⬇️
python/graphscope/client/session.py 56.31% <ø> (-8.38%) ⬇️
python/graphscope/dataset/ldbc.py 35.29% <ø> (-35.30%) ⬇️
python/graphscope/framework/dag_utils.py 37.38% <100.00%> (-7.17%) ⬇️
python/graphscope/framework/graph_builder.py 87.17% <ø> (+24.67%) ⬆️
python/graphscope/tests/unittest/test_graphar.py 0.00% <0.00%> (ø)
python/graphscope/framework/graph.py 60.10% <12.82%> (-9.15%) ⬇️
python/graphscope/framework/utils.py 39.80% <4.21%> (-4.64%) ⬇️

... and 26 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c1713bc...1ea86b3. Read the comment docs.

python/graphscope/framework/graph.py Outdated Show resolved Hide resolved
python/graphscope/framework/graph.py Outdated Show resolved Hide resolved
Returns:
`Graph`: A new graph object. Schema and data is supposed to be
identical with the one that called serialized method.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this line.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

python/graphscope/framework/graph.py Outdated Show resolved Hide resolved
graphar_vertex_block_size=262144, # 2^18
graphar_edge_block_size=4194304, # 2^22
graphar_file_format="parquet",
graphar_version="v1",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more clear to pass graphar related parameters in a dict, not messed up with other parameters.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's our design discuss in issue #2957 , and the parameters are like Session init method's parameter.

python/graphscope/framework/graph.py Show resolved Hide resolved
python/graphscope/framework/graph.py Outdated Show resolved Hide resolved
@acezen acezen force-pushed the 2836-api-prototype branch 2 times, most recently from 80cf33b to d21e949 Compare July 3, 2023 09:25
siyuan0322
siyuan0322 previously approved these changes Jul 18, 2023
Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com>

Add doc

Update

Update

Unify the API

Runnable

Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com>

Committed-by: acezen from Dev container

Committed-by: acezen from Dev container

Committed-by: acezen from Dev container

Rebase

Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com>

Committed-by: acezen from Dev container

Format

Committed-by: acezen from Dev container

Fix

Committed-by: acezen from Dev container

Update

Committed-by: acezen from Dev container

Fix

Signed-off-by: acezen <qiaozi.zwb@alibaba-inc.com>

Committed-by: acezen from Dev container

Fix

Committed-by: acezen from Dev container

Committed-by: acezen from Dev container

Committed-by: acezen from Dev container

Committed-by: acezen from Dev container

Format

Committed-by: acezen from Dev container

Committed-by: acezen from Dev container

Update

Committed-by: acezen from Dev container

Committed-by: acezen from Dev container

Update

Format

Committed-by: acezen from Dev container
Committed-by: acezen from Dev container

Committed-by: acezen from Dev container
Committed-by: acezen from Dev container
Committed-by: acezen from Dev container

Committed-by: acezen from Dev container

Committed-by: acezen from Dev container
Committed-by: acezen from Dev container

Committed-by: acezen from Dev container
Committed-by: acezen from Dev container

Committed-by: acezen from Dev container
Committed-by: acezen from Dev container
@acezen
Copy link
Collaborator Author

acezen commented Mar 6, 2024

this PR is outdated, create a new pull request #3610

@acezen acezen closed this Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants