[GIE Physical] Support LateProject Strategy when Fetching Properties. #2451

Merged 9 commits on Mar 1, 2023

Collaborator

@BingqingLyu BingqingLyu commented Feb 22, 2023

What do these changes do?

As titled.

Related issue number

Fixes #2460

@BingqingLyu BingqingLyu marked this pull request as draft February 22, 2023 12:27
@BingqingLyu BingqingLyu reopened this Feb 23, 2023
@BingqingLyu BingqingLyu marked this pull request as ready for review February 24, 2023 02:45
// This would be refined as:
// In Logical Plan: `Source + EdgeExpand(ExpandE) + GetV`
// In Physical Plan:
// 1. on distributed graph store, `Source + EdgeExpand(ExpandV) + Shuffle + Auxilia`
Collaborator

We no longer have Auxilia, do we? Why does the comment still call it Auxilia?

Collaborator Author

Updated to GetV.

merged_params.columns.push(column.clone());
}
}
fn process_tag_columns(builder: &mut JobBuilder, tag: Option<TagId>, columns_opt: ColumnsOpt) {
Collaborator

Is JobBuilder still used to build the Pegasus plan, or the physical plan?

Collaborator Author

Physical plan.

Collaborator

Then PhysicalBuilder is probably a better name than JobBuilder.

fn process_tag_columns(builder: &mut JobBuilder, tag: Option<TagId>, columns_opt: ColumnsOpt) {
if columns_opt.len() > 0 {
let tag_pb = tag.map(|tag_id| (tag_id as i32).into());
builder.shuffle(tag_pb.clone());
Collaborator

Why do we have to shuffle here? Don't we check whether it is single-threaded?

Collaborator Author

This function is called only when plan_meta.is_partition() is true.

Collaborator

Then why not put the plan_meta.is_partition() check inside this function?
That frees the caller from needing to know the context.
Also, please add a comment to this function to make it easier to understand.
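
A minimal sketch of that suggestion, assuming hypothetical stand-in types (`PlanMeta`, `Builder`, and their methods here are simplified illustrations, not the real ones in this PR). The partition check sits inside the function, so a caller can invoke it unconditionally:

```rust
// Hypothetical stand-ins for the plan metadata and builder types discussed above.
struct PlanMeta {
    partitioned: bool,
}

impl PlanMeta {
    fn is_partition(&self) -> bool {
        self.partitioned
    }
}

#[derive(Default)]
struct Builder {
    // Records the appended operators as strings so the effect is easy to inspect.
    ops: Vec<String>,
}

impl Builder {
    fn shuffle(&mut self, tag: Option<i32>) {
        self.ops.push(format!("shuffle({:?})", tag));
    }

    fn get_v(&mut self, tag: Option<i32>, columns: &[String]) {
        self.ops.push(format!("get_v({:?}, {:?})", tag, columns));
    }
}

/// Appends the operators that fetch `columns` for the vertex under `tag`.
/// Nothing is appended unless the graph is partitioned and there are columns to fetch,
/// so callers do not need to check `is_partition()` themselves.
fn process_tag_columns(
    builder: &mut Builder,
    plan_meta: &PlanMeta,
    tag: Option<i32>,
    columns: &[String],
) {
    if plan_meta.is_partition() && !columns.is_empty() {
        builder.shuffle(tag);
        builder.get_v(tag, columns);
    }
}
```

With this shape the precondition is documented and enforced in one place instead of at every call site.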

Collaborator Author

Done. Moved this into the post_process_vars() function.

let node_meta = plan_meta.get_curr_node_meta().unwrap();
let tag_columns = node_meta.get_tag_columns();
let len = tag_columns.len();
if len == 0 {
Collaborator

Better not to return in the middle. You can do this instead:

if len == 1 && !is_order_or_group {

} else if len != 0 {

} else {
   Ok(())
}
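
Filled in as a compilable sketch, the suggested shape could look like the following. The `FetchPlan` type and the meaning given to each branch are assumptions made only so the control flow is observable; the thread does not show the real branch bodies.

```rust
/// Hypothetical outcome type: which strategy a given input would select.
#[derive(Debug, PartialEq)]
enum FetchPlan {
    /// Exactly one tag with columns, not used by an order or group operator.
    SingleTag,
    /// Any other non-empty case; carries the number of tags.
    General(usize),
    /// No tag columns to fetch.
    Nothing,
}

/// Every path ends in the final expression of its branch, so no early `return` is needed.
fn plan_fetch(len: usize, is_order_or_group: bool) -> Result<FetchPlan, String> {
    if len == 1 && !is_order_or_group {
        Ok(FetchPlan::SingleTag)
    } else if len != 0 {
        Ok(FetchPlan::General(len))
    } else {
        Ok(FetchPlan::Nothing)
    }
}
```

Note that `len == 1` with `is_order_or_group == true` falls through to the second branch, which matches the condition order in the suggestion.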

Collaborator Author

Done

@Graph.OptOut(
method = "g_V_order_byXnameX_name",
test = "org.apache.tinkerpop.gremlin.process.traversal.step.map.OrderTest",
reason = "unsupoorted")
Collaborator

@longbinlai longbinlai Feb 24, 2023

The reason should not be "unsupported" (btw, "unsupoorted" is misspelled).

Collaborator Author

Updated to "project (with shuffle) after order occurs bug".

Collaborator

"Projection may introduce additional shuffling that can break the order."
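
A toy illustration of why a shuffle after an order can break it. This is a sketch under simplified assumptions, not the engine's actual shuffle: `shuffle_by_key` is a hypothetical helper that routes each value to a worker by `value % workers`, and `workers` is assumed to be greater than zero.

```rust
/// Routes each value to one of `workers` buckets by key, standing in for a shuffle.
/// Assumes `workers > 0`.
fn shuffle_by_key(sorted: &[u64], workers: usize) -> Vec<Vec<u64>> {
    let mut buckets = vec![Vec::new(); workers];
    for &v in sorted {
        buckets[(v as usize) % workers].push(v);
    }
    buckets
}

/// Returns true if `values` is in non-decreasing order.
fn is_sorted(values: &[u64]) -> bool {
    values.windows(2).all(|w| w[0] <= w[1])
}
```

Each bucket stays sorted on its own, but reading the buckets back worker by worker no longer yields the global order that the earlier sort established.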

sample_ratio: 1.0,
extra: Default::default(),
};
let auxilia = pb::GetV { tag: tag_pb.clone(), opt: 4, params: Some(params), alias: tag_pb.clone() };
Collaborator

Better to add a comment explaining what opt: 4 means.
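
One way to make the magic number self-describing is a named enum cast to the integer the message expects. The enum below is a hypothetical local mirror: the variant names and the reading of `4` as "the vertex itself" are assumptions that this thread does not confirm, and `GetV` here is a one-field stand-in rather than the real `pb::GetV`.

```rust
/// Hypothetical mirror of the GetV option values (names and numbering are assumptions).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(i32)]
enum VOpt {
    Start = 0,
    End = 1,
    Other = 2,
    Both = 3,
    /// Assumed meaning of `opt: 4`: operate on the current vertex itself.
    Itself = 4,
}

/// Simplified stand-in for the protobuf message; only the field under discussion.
struct GetV {
    opt: i32,
}

/// Builds the operator with a named option instead of a bare `4`.
fn fetch_properties_op() -> GetV {
    GetV { opt: VOpt::Itself as i32 }
}
```

If the generated protobuf code already exposes such an enum, using it directly would be preferable to a local copy.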

Collaborator Author

Done

merge_query_param_columns(merged_params, other_params.is_all_columns, &other_params.columns);
other_params.columns.clear();
other_params.is_all_columns = false;
// Fetch properties before used in select, order, dedup, group, join, and apply.
Collaborator

Can you explain more about how late project is processed, with some examples?

Collaborator Author

Added more comments with examples.
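
As a rough illustration of the idea (a toy in-memory model, not the PR's implementation), consider a query that tags a vertex as "a", expands several more hops, and only then selects "a" by "name". Under late project, the expansions carry only vertex ids and the property is looked up at the point of use. The `Graph` type and `select_a_name_late` function are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical toy graph: adjacency lists plus a `name` property per vertex.
struct Graph {
    out_edges: HashMap<u64, Vec<u64>>,
    names: HashMap<u64, String>,
}

/// Expands `hops` times from the vertices tagged "a", then selects `a.name`.
/// Each path carries only ids `(a, head)`; the property is fetched once, at the end.
fn select_a_name_late(graph: &Graph, tagged_a: &[u64], hops: usize) -> Vec<String> {
    let mut paths: Vec<(u64, u64)> = tagged_a.iter().map(|&v| (v, v)).collect();
    for _ in 0..hops {
        let mut next_paths = Vec::new();
        for &(a, head) in &paths {
            if let Some(neighbors) = graph.out_edges.get(&head) {
                for &next in neighbors {
                    next_paths.push((a, next));
                }
            }
        }
        paths = next_paths;
    }
    // Late project: look up the property only now, right before it is used.
    paths
        .iter()
        .filter_map(|&(a, _)| graph.names.get(&a).cloned())
        .collect()
}
```

On a partitioned store the final lookup is where a shuffle back to the partition owning "a" would be needed, which is the step the earlier review comments discuss.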

codecov-commenter commented Feb 27, 2023

Codecov Report

Merging #2451 (32442a1) into main (e9850d3) will decrease coverage by 33.27%.
The diff coverage is n/a.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #2451       +/-   ##
===========================================
- Coverage   73.22%   39.95%   -33.27%     
===========================================
  Files          88       88               
  Lines        9769     9769               
===========================================
- Hits         7153     3903     -3250     
- Misses       2616     5866     +3250     
Impacted Files Coverage Δ
python/graphscope/tests/unittest/test_java_app.py 0.00% <0.00%> (-100.00%) ⬇️
...ython/graphscope/tests/unittest/test_cython_ast.py 0.00% <0.00%> (-100.00%) ⬇️
...ython/graphscope/tests/unittest/test_serailaize.py 0.00% <0.00%> (-100.00%) ⬇️
python/graphscope/analytical/udf/patch.py 3.47% <0.00%> (-96.53%) ⬇️
python/graphscope/tests/unittest/test_app.py 0.00% <0.00%> (-96.32%) ⬇️
python/graphscope/tests/unittest/test_lazy.py 0.00% <0.00%> (-96.22%) ⬇️
...thon/graphscope/tests/unittest/test_scalability.py 0.00% <0.00%> (-96.16%) ⬇️
...hon/graphscope/tests/unittest/test_create_graph.py 0.00% <0.00%> (-92.89%) ⬇️
python/graphscope/tests/unittest/test_graph.py 0.00% <0.00%> (-85.92%) ⬇️
python/graphscope/tests/unittest/test_context.py 0.00% <0.00%> (-81.36%) ⬇️
... and 37 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c5d1338...32442a1.

other_params.is_all_columns = false;
// Fetch properties before used in Project, Select, Order, Dedup, Group, Join, and Apply.
// e.g.,
// Case 1: For singe property (except used in Order or GroupVal), e.g., g.V().out().as("a").out().out().out().select("a").by("name"),
Collaborator

"single" property. FYI, you can use ChatGPT to help check for typos.

@@ -28,7 +28,7 @@ use ir_physical_client::physical_builder::{JobBuilder, Plan};

use crate::error::{IrError, IrResult};
use crate::plan::logical::{LogicalPlan, NodeType};
-use crate::plan::meta::PlanMeta;
+use crate::plan::meta::{ColumnsOpt, PlanMeta, TagId};

/// A trait for building physical plan (pegasus) from the logical plan
pub trait AsPhysical {
Collaborator

This should not be pegasus's JobBuilder

@longbinlai longbinlai merged commit 1209e3e into alibaba:main Mar 1, 2023

Successfully merging this pull request may close these issues.

[GIE Physical] Introduce LateProject strategy when fetching properties
3 participants