Allow users to provide their own edge IDS to PropertyGraph #2757

Merged
4 commits merged into rapidsai:branch-22.10 on Oct 4, 2022

Conversation

eriknw
Contributor

@eriknw eriknw commented Sep 29, 2022

Closes #2565

Should we try to avoid creating conflicting edge IDs (for example, when the user doesn't provide all edge IDs)? Should we expose last_edge_id?

@eriknw eriknw requested a review from a team as a code owner September 29, 2022 05:30
@BradReesWork BradReesWork added this to the 22.10 milestone Sep 29, 2022
@BradReesWork BradReesWork added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Sep 29, 2022
Member

@VibhuJawa VibhuJawa left a comment

The overall changes look good, but I think we should ask the user to always provide edge_ids if they provide them once, and we should also expose last_edge_id.

#2757 (comment)

@eriknw
Contributor Author

eriknw commented Sep 29, 2022

Case 1:

  • User adds data with edge ids
  • then user adds data w/o edge ids

Case 2:

  • User adds data w/o edge ids
  • then user adds data with edge ids

@VibhuJawa, do I understand your comment correctly that you want to support case 2, but not case 1?

We can handle case 1 by taking the max of the user-provided edge ids (as is currently done in this PR). Not supporting this lets us avoid taking the max.

We could support case 2 by trusting the user. Giving them the last (i.e., max) edge id could be a way to give them the tool to do this. Exposing this value once again requires taking the max of the user-provided edge ids.

So, basically, all options are available to us:

  • don't support case 1 or 2
    • no need to call max
  • support case 1 only (current state of PR)
    • calls max, probably no need to expose last_edge_id
  • support case 2 only
    • only call max if we also expose last edge id
  • support cases 1 and 2
    • calls max, can expose last_edge_id
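To make the "support case 1 only" option concrete, here is a minimal sketch in plain Python. It is not the code in this PR: `EdgeIdTracker` and its `add_edges` method are hypothetical names used only for illustration, and plain lists stand in for the dataframe columns. It shows why supporting case 1 requires taking the max of the user-provided edge ids: auto-generated ids must continue after the largest id seen so far.

```python
class EdgeIdTracker:
    """Hypothetical helper for illustration; not the cuGraph implementation."""

    def __init__(self):
        # None until at least one edge has been added.
        self.last_edge_id = None

    def add_edges(self, num_edges, edge_ids=None):
        """Return the edge ids assigned to a batch of `num_edges` edges."""
        if edge_ids is not None:
            # User-provided ids: trust them, but remember the largest one
            # so that later auto-generated ids do not collide (case 1).
            edge_ids = list(edge_ids)
            if len(edge_ids) != num_edges:
                raise ValueError("expected one edge id per edge")
            if edge_ids:
                biggest = max(edge_ids)  # the "max" call discussed above
                if self.last_edge_id is None or biggest > self.last_edge_id:
                    self.last_edge_id = biggest
            return edge_ids

        # Auto-generated ids: continue after the largest id seen so far.
        start = 0 if self.last_edge_id is None else self.last_edge_id + 1
        new_ids = list(range(start, start + num_edges))
        if new_ids:
            self.last_edge_id = new_ids[-1]
        return new_ids


tracker = EdgeIdTracker()
user_ids = tracker.add_edges(2, edge_ids=[10, 7])  # case 1, step 1
auto_ids = tracker.add_edges(3)                    # case 1, step 2: [11, 12, 13]
```

Note that this sketch does nothing to protect case 2: if ids are auto-generated first and the user later supplies ids that overlap them, the conflict goes undetected. Exposing `last_edge_id` would only give the user the information needed to avoid that themselves.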

@VibhuJawa
Member

support case 1 only (current state of PR)
calls max, probably no need to expose last_edge_id

I think this is good for now. The use case where I think this is useful is when we want to create new edges, such as adding self-loops.

Can we create a feature request for the other cases so that we can track and prioritize them as the need arises?

@codecov-commenter

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@a23caff). Click here to learn what that means.
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-22.10    #2757   +/-   ##
===============================================
  Coverage                ?   59.80%           
===============================================
  Files                   ?      111           
  Lines                   ?     6185           
  Branches                ?        0           
===============================================
  Hits                    ?     3699           
  Misses                  ?     2486           
  Partials                ?        0           

Member

@VibhuJawa VibhuJawa left a comment

LGTM

@rlratzel
Contributor

rlratzel commented Oct 3, 2022

We had further discussion about case 1/case 2 offline, captured here.

Contributor

@rlratzel rlratzel left a comment

Looks good, I just had one question, then I'll re-review after the changes from this conversation are done.

python/cugraph/cugraph/dask/structure/mg_property_graph.py (outdated, resolved)
If auto-generated, all edge IDs must be auto-generated.
Conversely, if user-provided, all edge IDs must be user-provided.
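The all-or-nothing rule described in the two lines above can be sketched as follows. This is an illustration in plain Python under stated assumptions, not the merged code: `StrictEdgeIdPolicy`, `add_edges`, and the error message are hypothetical. The first call fixes the mode, and any later call in the other mode is rejected, so no max over user-provided ids is ever needed.

```python
class StrictEdgeIdPolicy:
    """Hypothetical sketch: edge ids are either all auto-generated or all user-provided."""

    def __init__(self):
        self.mode = None       # becomes "auto" or "user" on the first call
        self.next_auto_id = 0  # only used in "auto" mode

    def add_edges(self, num_edges, edge_ids=None):
        mode = "auto" if edge_ids is None else "user"
        if self.mode is None:
            self.mode = mode
        elif self.mode != mode:
            # Mixing the two styles is exactly what the rule forbids.
            raise ValueError(
                f"edge ids were previously {self.mode}-provided; "
                f"cannot switch to {mode}-provided ids"
            )

        if mode == "user":
            ids = list(edge_ids)
            if len(ids) != num_edges:
                raise ValueError("expected one edge id per edge")
            return ids

        # Auto mode: hand out consecutive ids with a simple counter.
        ids = list(range(self.next_auto_id, self.next_auto_id + num_edges))
        self.next_auto_id += num_edges
        return ids
```

Under this policy neither case 1 nor case 2 from the earlier discussion is allowed, which matches the "don't support case 1 or 2, no need to call max" option.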
Contributor

@rlratzel rlratzel left a comment

LGTM, especially the added test coverage.

@rlratzel
Contributor

rlratzel commented Oct 4, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit f95ee18 into rapidsai:branch-22.10 Oct 4, 2022
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Support adding edge_ids to PG in add_edge_data
5 participants