New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Produce even better DOT graphs #3
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me in principle; I left some minor comments, but I think we should merge this quickly as it will be useful to everyone.
@@ -326,6 +327,35 @@ impl<Q, U, D> FlowGraph<Q, U, D> | |||
(&self.graph, self.source) | |||
} | |||
|
|||
/// Write a DOT representation of the graph to `f`. | |||
pub fn write_dot<W: io::Write>(&self, f: &mut W) -> io::Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a minor comment, but would it make sense to either
- put this implementation into
impl Display for FlowGraph<...>
(orDebug
rather thanDisplay
), such that one can simply print a flow graph to get the dot graph? (AIUI, your PR leaves the old behavior of printing a simplistic graph in place?); and/or - change the second argument here to be
fmt::Formatter
, rather than a generic type that implementsio::Write
?
I think (1) we should do for sure, but (2) is a minor stylistic point and your code may even be preferable as it is more general and e.g., supports direct writing to a file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to do what you think is best! I chose this method because I figured—at least for Debug—a dot graph might not be the most useful output format for immediate debug insight.
Regarding (2), how does one create an instance of fmt::Formatter
without calling format!
? It seems like you can only get one of those by implementing Display
or Debug
.
@@ -145,6 +155,12 @@ mod tests { | |||
} | |||
|
|||
#[test] | |||
fn it_describes() { | |||
let c = setup(true, true); | |||
assert_eq!(c.inner.description(), "Count γ[0, 2]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: might be good to test the SUM
case too here, since that produces a somewhat different output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
.flat_map(|(left, rs)| { | ||
rs.against | ||
.iter() | ||
.filter(move |&(right, _)| left < right) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this move
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The borrow checker complains otherwise
error: `left` does not live long enough
--> src/ops/join.rs:449:43
|
449 | .filter(|&(right, _)| left < right)
| ------------- ^^^^ does not live long enough
| |
| capture occurs here
...
457 | })
| - borrowed value only lives until here
458 | .collect::<Vec<_>>()
| - borrowed value needs to live until here
and I wasn't sure how else to fix it. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see -- yes, that seems fine. (I guess you could technically borrow left
in the inside closure, i.e., use &left
, but that wouldn't be much prettier.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is a somewhat annoying side-effect of using flat_map
. move
is perfectly acceptable to use here.
8a8ace7
to
5a3c950
Compare
Okay, I made this |
I don't think I currently have a strong preference for what As for the visualization, I'm not entirely sure I understand the top-right notation in each box, but that's probably just due to my own ignorance. Including a legend somewhere would be useful though. |
use ops::NodeOp; | ||
let indent = " "; | ||
try!(writeln!(f, "digraph {{")); | ||
try!(writeln!(f, "{}node [shape=record, fontsize=10]", indent)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since Rust 1.13, you can now use writeln!(..)?
instead if you think that reads better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent
seems unnecessary here, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent
helps produce more human-readable dot graphs, which is useful for debugging them ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe, though the indent seems pretty basic (you could get the same output in vim by typing gg=G
), and it makes the code harder to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reformatted the function to use an indent
lambda that should make the code easier to read?
@@ -65,4 +65,8 @@ impl NodeOp for Base { | |||
fn is_base(&self) -> bool { | |||
true | |||
} | |||
|
|||
fn description(&self) -> String { | |||
"𝓕".into() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason for the choice of 𝓕 to signify base nodes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if so, maybe include a comment somewhere pointing to a resource laying out the "known" operators
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F for fact. We could also do 𝓑 for base, or "Base"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer "B" for base. I don't think it should be stylized, because that makes me think that is some "well known" operator from relational algebra or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
fn description(&self) -> String { | ||
let op_string = match self.op { | ||
Aggregation::COUNT => "Count".into(), | ||
Aggregation::SUM => format!("Sum({})", self.over), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should "Sum" not be 𝛴 ? I don't know if there's an equivalent symbol for "Count"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, and I figured min and max are usually spelled out too, so I left it as "Sum" for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I suppose we could use ∧
for min and ∨
for max, and |*|
for count (magnitude bars). I don't have strong feelings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like |*|
for count. For min/max, I feel like T (top) and ⊥ (bottom) could work. We could also use ⌈x⌉ (ceiling) and ⌊x⌋ (floor) where x
is the column being computed over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Done!
}; | ||
let group_cols = self.group.iter().map(|g| g.to_string()) | ||
.collect::<Vec<_>>().join(", "); | ||
format!("{} γ[{}]", op_string, group_cols) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the γ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relational algebra for "group by", so that's how it reads in my head.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see, so the γ here actually does stem from some definition. A reference would be useful. It doesn't need to be in the diagram, but at least somewhere in the code. Maybe in the docstring for the description
method in the trait definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
let group_cols = group_cols.iter().map(|g| g.to_string()) | ||
.collect::<Vec<_>>().join(", "); | ||
|
||
format!("Concat({} / \"{}\") γ[{}]", fields, self.separator, group_cols) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think || is often used to signify concatenation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
.flat_map(|(left, rs)| { | ||
rs.against | ||
.iter() | ||
.filter(move |&(right, _)| left < right) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is a somewhat annoying side-effect of using flat_map
. move
is perfectly acceptable to use here.
let op = if rs.outer { "⋉" } else { "⋈" }; | ||
rs.fields.iter().map(move |&(li, ri)| { | ||
format!("{}:{} {} {}:{}", | ||
left.index(), li, op, right.index(), ri) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This syntax reads weirdly to me. Can we come up with something that's easier to visually scan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have a suggestion? I picked node:col
because it's consistent with unions: node:[col, col, col]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I meant more that I don't understand it. Is this showing which columns are emitted, or which columns are joined on? I feel like both should ideally be shown..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
fn description(&self) -> String { | ||
let key_cols = self.key.iter().map(|k| k.to_string()) | ||
.collect::<Vec<_>>().join(", "); | ||
format!("Latest γ[{}]", key_cols) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm tempted to have us use ⌈ ⌉ here, though its meaning would be somewhat overloaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is consistent with the other grouping operators, though I guess latest isn't considered a grouping operator as far as the implementation is concerned. You tell me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I'm fine with keeping the γ for grouping. Was more wondering if there's a more concise way to express "latest" rather than spelling it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about the hourglass? ⧖
@@ -315,7 +333,7 @@ impl Node { | |||
|
|||
impl Debug for Node { | |||
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { | |||
write!(f, "{:?}", self.name) | |||
write!(f, "{:?}({:#?})", self.name, *self.inner) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the *
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
¯_(ツ)_/¯ that's just what it used to be: f2c8b42#diff-52cb7ae98bea74840c4725039169373cL318
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird.. Does it compile without it?
format!("{}:[{}]", src.index(), cols) | ||
}) | ||
.collect::<Vec<_>>() | ||
.join(" ∪ ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be ⋃, not ∪ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah, sure
Yeah, @jonhoo, the original PR added a format!("{:dot}", graph)
impl FormatDot for Graph {
fn fmt() { /* ... */ }
} but alas. |
I think using Adding a legend in Dot is a bit of a pain although it can be done (http://stackoverflow.com/a/4752766). I prefer "B" for "Base" over "F"/"Fact", but merely due to consistency with the implementation names. |
Yeah, the ability to write custom formatting traits would be really neat. Maybe we should start an RFC for it :p |
Output DOT graphs without using petgraph's stock formatter so that we can output each node's configuration using a dense, human-readable, relational-algebra-like format.
@benesch Thanks, looks good! Let's merge this -- it's a big improvement, and we have bigger research fish to fry 😉 |
Yessir ;) |
Output DOT graphs without using petgraph's stock formatter so that we
can output each node's configuration using a dense, human-readable,
relational-algebra-like format.
Here's what the main/web graph looks like in this new world: