Copy rdf graph #2602

Merged
merged 1 commit into from
Dec 21, 2023


andyfengHKU
Contributor

@andyfengHKU andyfengHKU commented Dec 20, 2023

This PR reworks the COPY statement for RDF graphs so that all internal tables are copied in a single plan.

The plan looks like the following (detailed operators such as IndexLookup are omitted):

                                CopyRDF
       CopyRel    CopyRel   CopyNode    CopyNode
       Scan       Scan      Scan        Scan

This is an out-of-memory solution: instead of keeping the scanned data in memory, we scan the file once for each Copy pipeline (4 times in total).
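To make the plan shape concrete, here is a minimal C++ sketch of one root operator driving four child pipelines, each of which re-scans the input. All types here (`TripleFile`, `CopyPipeline`, `CopyRDFPlan`) are hypothetical stand-ins and not the classes touched by this PR. Running the node pipelines before the rel pipelines is also an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one RDF triple read from the input file.
struct Triple {
    std::string subject, predicate, object;
};

// Hypothetical stand-in for the input file. Each scan() call models one full
// pass over the file; nothing is cached between pipelines.
struct TripleFile {
    std::vector<Triple> triples;
    std::size_t numScans = 0;

    const std::vector<Triple>& scan() {
        ++numScans;
        return triples;
    }
};

// One child pipeline under the root: Scan -> CopyNode or Scan -> CopyRel.
struct CopyPipeline {
    std::string name;
    std::size_t rowsCopied = 0;

    explicit CopyPipeline(std::string n) : name(std::move(n)) {}

    void execute(TripleFile& file) {
        for (const auto& triple : file.scan()) {
            (void)triple; // a real pipeline would write to its target table here
            ++rowsCopied;
        }
    }
};

// Root operator owning all four pipelines, so the copy runs as a single plan.
struct CopyRDFPlan {
    std::vector<CopyPipeline> children;

    CopyRDFPlan() {
        // Assumed order: node tables first, then rel tables.
        children.emplace_back("CopyNode");
        children.emplace_back("CopyNode");
        children.emplace_back("CopyRel");
        children.emplace_back("CopyRel");
    }

    void execute(TripleFile& file) {
        const std::size_t scansBefore = file.numScans;
        for (auto& child : children) {
            child.execute(file);
        }
        // Invariant of this sketch: exactly one file scan per pipeline.
        assert(file.numScans - scansBefore == children.size());
    }
};
```

The trade-off this models is repeated I/O (four passes) in exchange for not holding the parsed file in memory across pipelines.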

Note
CopyNodeSharedState is exposed to IndexLookup and Partition in this PR because:

  • We need to perform lookups before the hash index is persisted to disk (which happens in WAL).
  • We need to know the number of nodes before statistics are available (which happens in WAL).
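The two reasons above can be illustrated with a small C++ sketch in which a shared state object is read directly by a lookup operator and a partitioner while the copy is still in flight. The fields and methods below (`inMemoryIndex`, `insert`, `lookup`, `numPartitions`, `nodesPerPartition`) are hypothetical and simplified. Only the names `CopyNodeSharedState`, `IndexLookup`, and Partition come from this PR, and the partition-count formula is an assumption for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical, simplified stand-in for CopyNodeSharedState: it keeps the
// primary-key index and the node count in memory before anything is persisted.
struct CopyNodeSharedState {
    std::unordered_map<std::string, std::uint64_t> inMemoryIndex;
    std::uint64_t numNodes = 0;

    // Assigns the next node offset to a new key, or returns the existing
    // offset if the key was already inserted.
    std::uint64_t insert(const std::string& key) {
        auto result = inMemoryIndex.emplace(key, numNodes);
        if (result.second) {
            ++numNodes;
        }
        return result.first->second;
    }
};

// Lookup operator that reads the in-memory index through the shared state
// instead of waiting for an on-disk hash index.
struct IndexLookup {
    const CopyNodeSharedState* nodeState;

    bool lookup(const std::string& key, std::uint64_t& offset) const {
        auto it = nodeState->inMemoryIndex.find(key);
        if (it == nodeState->inMemoryIndex.end()) {
            return false;
        }
        offset = it->second;
        return true;
    }
};

// Partitioner that sizes its output from the shared node count instead of
// table statistics, which are not yet available at this point.
struct Partitioner {
    const CopyNodeSharedState* nodeState;
    std::uint64_t nodesPerPartition;

    std::uint64_t numPartitions() const {
        assert(nodesPerPartition > 0);
        // Ceiling division (assumed sizing rule for this sketch).
        return (nodeState->numNodes + nodesPerPartition - 1) / nodesPerPartition;
    }
};
```

Because both consumers hold a pointer to the same state, a rel pipeline in the same plan can resolve node keys and size its partitions without waiting for the node copy to be committed.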


codecov bot commented Dec 20, 2023

Codecov Report

Attention: 18 lines in your changes are missing coverage. Please review.

Comparison is base (d591bff) 93.26% compared to head (ef72884) 93.24%.

Files                                                    Patch %   Lines
src/processor/operator/index_lookup.cpp                  67.74%    10 Missing ⚠️
src/include/binder/copy/bound_copy_from.h                83.33%    2 Missing ⚠️
src/processor/operator/persistent/copy_node.cpp          85.71%    2 Missing ⚠️
...de/planner/operator/persistent/logical_copy_from.h    83.33%    1 Missing ⚠️
src/planner/operator/logical_operator.cpp                85.71%    1 Missing ⚠️
src/planner/plan/plan_copy.cpp                           97.43%    1 Missing ⚠️
...cessor/operator/persistent/reader/rdf/rdf_scan.cpp    96.42%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2602      +/-   ##
==========================================
- Coverage   93.26%   93.24%   -0.02%     
==========================================
  Files        1034     1037       +3     
  Lines       38664    38815     +151     
==========================================
+ Hits        36059    36193     +134     
- Misses       2605     2622      +17     


@andyfengHKU andyfengHKU merged commit dd0efd3 into master Dec 21, 2023
14 checks passed
@andyfengHKU andyfengHKU deleted the copy-rdf-graph branch December 21, 2023 08:39