Add Variable statistics #21
(IIRC the CI failure is due to current infra nightly build issues, or in general rustfmt not being available on the latest nightly or similar)
(The CI issue was indeed temporary)
Nice idea! Sorry for being slow.
src/lib.rs
Outdated
@@ -437,6 +452,22 @@ impl<Tuple: Ord> VariableTrait for Variable<Tuple> {
        !self.recent.borrow().is_empty()
    }

    #[cfg(feature = "stats")]
    fn dump_stats(&self, round: u32) {
I think this should take a &mut Write to write to; I'd like to be able to dump it to a file or something.
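A minimal sketch of what the writer-taking version could look like. VariableCounts, its fields, and the CSV column order are hypothetical stand-ins, not the PR's actual types; only the idea of passing a &mut Write into dump_stats comes from the comment above.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for the crate's `Variable`: only the name and
/// the two tuple counts that the stats report are modelled here.
struct VariableCounts {
    name: String,
    stable: usize,
    recent: usize,
}

impl VariableCounts {
    /// Writes one CSV row (`round,name,stable,recent`) to whatever
    /// writer the caller supplies: a file, stdout, or an in-memory buffer.
    fn dump_stats(&self, round: u32, w: &mut dyn Write) -> io::Result<()> {
        writeln!(w, "{},{},{},{}", round, self.name, self.stable, self.recent)
    }
}
```

Because the method only sees a trait object, the caller can hand it a std::fs::File, std::io::stdout(), or a Vec<u8> without the method changing.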
Though what I might like best would be to accumulate the stats into some sort of data structure that you can dump later, but .. I'm not sure why. I guess I was thinking it might be nice to have the output in CSV format or something so you can pull it into a spreadsheet and graph it or whatever.
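One way the accumulate-then-dump idea might be sketched. RoundStats, StatsLog, and their method names are made up for illustration and are not part of the PR; the CSV writer does no quoting, so it assumes variable names contain no commas.

```rust
use std::io::{self, Write};

/// One row of statistics. All names here are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
struct RoundStats {
    round: u32,
    variable: String,
    stable: usize,
    recent: usize,
}

/// In-memory log that is filled during the computation and dumped later.
#[derive(Debug, Default)]
struct StatsLog {
    rows: Vec<RoundStats>,
}

impl StatsLog {
    /// Appends one row; nothing is written anywhere yet.
    fn record(&mut self, round: u32, variable: &str, stable: usize, recent: usize) {
        self.rows.push(RoundStats {
            round,
            variable: variable.to_string(),
            stable,
            recent,
        });
    }

    /// Dumps everything recorded so far as CSV with a header line.
    /// No escaping is done: variable names must not contain commas.
    fn write_csv<W: Write>(&self, mut w: W) -> io::Result<()> {
        writeln!(w, "round,variable,stable,recent")?;
        for r in &self.rows {
            writeln!(w, "{},{},{},{}", r.round, r.variable, r.stable, r.recent)?;
        }
        Ok(())
    }
}
```

Keeping the rows in memory also leaves room for other output formats later, since the data is not tied to CSV until write_csv is called.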
Maybe use https://crates.io/crates/text-tables ?
src/lib.rs
Outdated
@@ -162,6 +173,10 @@ impl Iteration {
trait VariableTrait {
    /// Reports whether the variable has changed since it was last asked.
    fn changed(&mut self) -> bool;

    #[cfg(feature = "stats")]
I don't think that the feature flag is worthwhile. It doesn't seem to add any cost if you don't invoke dump_stats.
I'm working on dumping the stats to a file, so I'll close this PR until then.
        let mut result = false;
        for variable in self.variables.iter_mut() {
            if variable.changed() {
                result = true;
            }

            if let Some(ref mut stats_writer) = self.debug_stats {
So the intended usage here is that we invoke iteration.record_stats_to(writer); and then it will dump out the data as it goes?
That seems pretty nice, yeah.
Usage-wise: exactly. Regarding the data dumping as it goes: there's also the other possibility of keeping the stats in memory (which would also be good in order to add stats about the per-variable join durations). Whichever you and @frankmcsherry prefer.
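A rough, self-contained sketch of the dump-as-it-goes usage discussed here. Only record_stats_to, debug_stats, and the shape of the changed loop come from the thread; Var, its to_add counter, the generic writer parameter W, and the row layout are simplifying assumptions rather than the crate's real implementation.

```rust
use std::io::Write;

/// Hypothetical, heavily simplified variable: tuple counts only.
struct Var {
    name: &'static str,
    stable: usize,
    recent: usize,
    to_add: usize,
}

impl Var {
    /// Simplified `changed`: fold `recent` into `stable`, promote the
    /// pending tuples to `recent`, and report whether `recent` is non-empty.
    fn changed(&mut self) -> bool {
        self.stable += self.recent;
        self.recent = self.to_add;
        self.to_add = 0;
        self.recent > 0
    }
}

/// Assumed layout: the writer is optional, so nothing is emitted
/// unless `record_stats_to` was called.
struct Iteration<W: Write> {
    variables: Vec<Var>,
    round: u32,
    debug_stats: Option<W>,
}

impl<W: Write> Iteration<W> {
    fn record_stats_to(&mut self, writer: W) {
        self.debug_stats = Some(writer);
    }

    /// Advances one round and, if a writer is set, emits one CSV row
    /// (`round,name,stable,recent`) per variable.
    fn changed(&mut self) -> bool {
        self.round += 1;
        let round = self.round;
        let mut result = false;
        for variable in self.variables.iter_mut() {
            if variable.changed() {
                result = true;
            }
            if let Some(ref mut stats_writer) = self.debug_stats {
                // Write errors are deliberately ignored in this sketch.
                let _ = writeln!(
                    stats_writer,
                    "{},{},{},{}",
                    round, variable.name, variable.stable, variable.recent
                );
            }
        }
        result
    }
}
```

Driving it with a loop such as while iteration.changed() {} after calling record_stats_to would emit the rows round by round. The in-memory alternative mentioned above would replace the writeln! call with a push into a collection that is dumped once at the end.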
This adds simple profiling statistics in the spirit of #5, but inspired by Soufflé's profiler: adding the statistics per identifiable round.
For each round, the stable and recent tuple counts for each variable are written as CSV to an io::Write of the user's choice.
We can talk about what precise data (or format) we'd like to see here. In the future, we could also add a self-profiling option to tally up the time each join operation took to create the tuples.
Here's how it looks right now