Add a "report" subcommand
jvns committed Feb 20, 2018
1 parent 8fe1ad5 commit 4b6015a
Showing 5 changed files with 185 additions and 16 deletions.
41 changes: 38 additions & 3 deletions README.md
@@ -47,7 +47,13 @@ sudo rbspy snapshot --pid $PID

### Record

Record records stack traces from your process and generates a flamegraph.
Record records stack traces from your process and saves them to disk.

`rbspy record` will save 2 files: a gzipped raw data file, and a visualization (by default a flamegraph; you
can configure the visualization format with `--format`). The raw data file contains every stack
trace that `rbspy` recorded, so that you can generate other visualizations later if you want. By
default, rbspy saves both files to `~/.cache/rbspy/records`, but you can customize where those are
stored with `--file` and `--raw-file`.

This is useful when you want to know what functions your program is spending most of its time in.

@@ -77,11 +83,40 @@ you're user `bork` and run `sudo rbspy record ruby script.rb`. You can disable t

#### Optional Arguments

These work regardless of how you started the recording.
These work regardless of how you started the recording.

* `--file`: Specifies where rbspy will save data. By default, rbspy saves to `~/.cache/rbspy/records`.
* `--rate`: Specifies the frequency at which stack traces are recorded. The interval between samples is `1000/rate` milliseconds. The default rate is 100 Hz.
* `--duration`: Specifies how long to run before stopping rbspy. This conflicts with running a subcommand (`rbspy record ruby myprogram.rb`).
* `--format`: Specifies what format to use to report profiling data. The options are:
* `flamegraph`: generates a flamegraph SVG that you can view in a browser
* `callgrind`: generates a callgrind-formatted file that you can view with a tool like
`kcachegrind`.
* `summary`: aggregates % self and % total times by function. Useful to get a basic overview
* `summary_by_line`: aggregates % self and % total times by line number. Especially useful when
there's 1 line in your program which is taking up all the time.
* `--file`: Specifies where rbspy will save formatted output.
* `--raw-file`: Specifies where rbspy will save raw data. Use a `.gz` extension because the file will be gzipped.
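
To make the `--rate` arithmetic concrete, here is a minimal sketch of turning a sampling rate into the delay between samples. `sample_interval` is a hypothetical helper written for illustration; it is not taken from rbspy's source.

```rust
use std::time::Duration;

/// Delay between two samples for a sampling rate given in Hz.
/// Hypothetical helper, not rbspy's implementation.
/// Returns `None` for a rate of 0, which has no meaningful interval.
fn sample_interval(rate_hz: u32) -> Option<Duration> {
    if rate_hz == 0 {
        return None;
    }
    // One second is 1_000_000_000 ns; working in nanoseconds keeps
    // precision for rates that do not divide 1000 evenly.
    Some(Duration::from_nanos(1_000_000_000 / u64::from(rate_hz)))
}
```

With the default rate, `sample_interval(100)` is 10 milliseconds, which matches `1000/rate`.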

## Reporting

If you have a raw rbspy data file that you've previously recorded, you can use `rbspy report` to
generate different kinds of visualizations from it (the flamegraph/callgrind/summary formats, as
documented above). This is useful because you can record raw data from a program and then decide how
you want to visualize it afterwards.

For example, here's what recording a simple program and then generating a summary report looks like:

```
$ sudo rbspy record --raw-file raw.gz ruby ci/ruby-programs/short_program.rb
$ rbspy report -f summary -i raw.gz -o summary.txt
$ cat summary.txt
% self % total name
100.00 100.00 <c function> - unknown
0.00 100.00 ccc - ci/ruby-programs/short_program.rb
0.00 100.00 bbb - ci/ruby-programs/short_program.rb
0.00 100.00 aaa - ci/ruby-programs/short_program.rb
0.00 100.00 <main> - ci/ruby-programs/short_program.rb
```
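
As a rough illustration of what the `% self` and `% total` columns mean, the sketch below aggregates them from a list of stack traces. It assumes that the first frame of each trace is the innermost (currently executing) one, and `summarize` is a hypothetical function, not rbspy's summary code.

```rust
use std::collections::HashMap;

/// Returns (function name, % self, % total) rows, sorted by % self.
/// Assumption: `trace[0]` is the innermost frame of each trace.
fn summarize(traces: &[Vec<&str>]) -> Vec<(String, f64, f64)> {
    // name -> (self samples, total samples)
    let mut counts: HashMap<&str, (u64, u64)> = HashMap::new();
    for trace in traces {
        // Self time: only the innermost frame was actually running.
        if let Some(top) = trace.first() {
            counts.entry(*top).or_insert((0, 0)).0 += 1;
        }
        // Total time: every function on the stack, counted once per
        // trace so that recursion is not double counted.
        let mut seen: Vec<&str> = Vec::new();
        for frame in trace {
            if !seen.contains(frame) {
                seen.push(*frame);
                counts.entry(*frame).or_insert((0, 0)).1 += 1;
            }
        }
    }
    let n = traces.len() as f64;
    let mut rows: Vec<(String, f64, f64)> = counts
        .into_iter()
        .map(|(name, (s, t))| (name.to_string(), 100.0 * s as f64 / n, 100.0 * t as f64 / n))
        .collect();
    // Highest % self first, then highest % total, then name for stability.
    rows.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap()
            .then(b.2.partial_cmp(&a.2).unwrap())
            .then(a.0.cmp(&b.0))
    });
    rows
}
```

In the output above, every sample was taken while a C function was running, so it gets 100% self time, while `ccc`, `bbb`, `aaa` and `<main>` were on the stack in every sample and so get 100% total time with 0% self time.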

## What's a flamegraph?

2 changes: 1 addition & 1 deletion src/core/initialize.rs
@@ -44,7 +44,7 @@ pub fn initialize(pid: pid_t) -> Result<StackTraceGetter, Error> {
})
}

#[derive(Debug, PartialEq, Eq, Hash, Clone, Serialize)]
#[derive(Debug, PartialEq, Eq, Hash, Clone, Serialize, Deserialize)]
pub struct StackFrame {
pub name: String,
pub relative_path: String,
34 changes: 32 additions & 2 deletions src/main.rs
@@ -100,6 +100,7 @@ enum SubCmd {
},
/// Capture and print a stacktrace snapshot of process `pid`.
Snapshot { pid: pid_t },
Report { format: OutputFormat, input: PathBuf, output: PathBuf, },
}
use SubCmd::*;

@@ -165,7 +166,8 @@ fn do_main() -> Result<(), Error> {
sample_rate,
maybe_duration,
)
}
},
Report{format, input, output} => report(format, input, output),
}
}

@@ -348,6 +350,17 @@ fn record(
Ok(())
}

fn report(format: OutputFormat, input: PathBuf, output: PathBuf) -> Result<(), Error>{
let input_file = File::open(input)?;
let stuff = storage::from_reader(input_file)?;
let mut outputter = format.outputter();
for trace in stuff {
outputter.record(&trace)?;
}
outputter.complete(File::create(output)?)?;
Ok(())
}

fn print_summary(summary_out: &ui::summary::Stats) -> Result<(), Error> {
let width = match term_size::dimensions() {
Some((w, _)) => Some(w as usize),
@@ -467,7 +480,7 @@ fn arg_parser() -> App<'static, 'static> {
Arg::from_usage("--format=[FORMAT] 'Output format to write'")
.possible_values(&OutputFormat::variants())
.case_insensitive(true)
.default_value("Flamegraph"),
.default_value("flamegraph"),
)
.arg(
Arg::from_usage(
@@ -477,6 +490,18 @@ fn arg_parser() -> App<'static, 'static> {
)
.arg(Arg::from_usage("<cmd>... 'command to run'").required(false)),
)
.subcommand(
SubCommand::with_name("report")
.about("Generate visualization from raw data recorded by `rbspy record`")
.arg(Arg::from_usage("-i --input=[FILE] 'Input raw data to use'"))
.arg(Arg::from_usage("-o --output=[FILE] 'Output file'"))
.arg(
Arg::from_usage("-f --format=[FORMAT] 'Output format to write'")
.possible_values(&OutputFormat::variants())
.case_insensitive(true)
.default_value("flamegraph"),
)
)
}

impl Args {
@@ -537,6 +562,11 @@ impl Args {
no_drop_root,
}
}
("report", Some(submatches)) => Report {
format: value_t!(submatches, "format", OutputFormat).unwrap(),
input: value_t!(submatches, "input", String).unwrap().into(),
output: value_t!(submatches, "output", String).unwrap().into(),
},
_ => panic!("this shouldn't happen, please report the command you ran!"),
};
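
The new `report` function feeds every stored trace to an outputter and then asks it to write its result. A minimal sketch of that record-then-complete pattern follows. The `Outputter` trait and the `FoldedStacks` type here are hypothetical stand-ins using only the standard library; rbspy's real outputter interface may differ. Frames are assumed to be ordered innermost first.

```rust
use std::collections::BTreeMap;
use std::io::{self, Write};

/// Hypothetical stand-in for the outputter interface used by `report`.
trait Outputter {
    fn record(&mut self, trace: &[String]) -> io::Result<()>;
    fn complete(&mut self, w: &mut dyn Write) -> io::Result<()>;
}

/// Counts how often each distinct stack occurs and writes one
/// "root;...;leaf count" line per stack (the folded-stack text format).
#[derive(Default)]
struct FoldedStacks {
    counts: BTreeMap<String, u64>,
}

impl Outputter for FoldedStacks {
    fn record(&mut self, trace: &[String]) -> io::Result<()> {
        // Assumption: innermost frame first, so reverse to put the root on the left.
        let key = trace
            .iter()
            .rev()
            .map(String::as_str)
            .collect::<Vec<_>>()
            .join(";");
        *self.counts.entry(key).or_insert(0) += 1;
        Ok(())
    }

    fn complete(&mut self, w: &mut dyn Write) -> io::Result<()> {
        for (stack, n) in &self.counts {
            writeln!(w, "{} {}", stack, n)?;
        }
        Ok(())
    }
}

/// Same shape as `report` above: stream the traces in, then finish.
fn report(traces: &[Vec<String>], out: &mut dyn Outputter, w: &mut dyn Write) -> io::Result<()> {
    for trace in traces {
        out.record(trace)?;
    }
    out.complete(w)
}
```

Because the outputter only sees one trace at a time, the same loop can drive any format that accumulates state in `record` and renders it in `complete`.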

95 changes: 85 additions & 10 deletions src/storage/mod.rs
@@ -6,11 +6,7 @@
///
/// b"rbspyXY\n"
///
/// Here, `XY` is [Julia: your choice of...]
/// - decimal number in [0-99]
/// - hex number in [0-255]
/// - base32 number in [0-1023]
/// - base64 number in [0-4096]
/// Here, `XY` is a decimal number in [0-99]
///
/// The use of b'\n' as a terminator effectively reserves a byte, and provides
/// flexibility to go to a different version encoding scheme if this format
@@ -20,34 +16,113 @@ extern crate flate2;
use std::io;
use std::io::prelude::*;
use std::fs::File;
use std::path::{Path, PathBuf};
use std::path::Path;

use core::initialize::StackFrame;

use self::flate2::write::GzEncoder;
use self::flate2::Compression;
use failure::Error;
use serde_json;

mod v0;

pub struct Store {
encoder: GzEncoder<File>,
encoder: flate2::write::GzEncoder<File>,
}

impl Store {
pub fn new(out_path: &Path) -> Result<Store, io::Error> {
let file = File::create(out_path)?;
let mut encoder = GzEncoder::new(file, Compression::default());
let mut encoder = flate2::write::GzEncoder::new(file, Compression::default());
encoder.write("rbspy00\n".as_bytes())?;
Ok(Store { encoder })
}

pub fn write(&mut self, trace: &Vec<StackFrame>) -> Result<(), Error> {
let json = serde_json::to_string(trace)?;
self.encoder.write(json.as_bytes())?;
writeln!(&mut self.encoder, "{}", json)?;
Ok(())
}

pub fn complete(self) {
drop(self.encoder)
}
}

#[derive(Clone, Debug, Copy, Eq, PartialEq, Ord, PartialOrd)]
pub(crate) struct Version(u64);

// Write impls like this for every storage version (no need for the internal
// one, since `impl<T> From<T> for T` exists).
//
// impl From<v7::Data> for JuliaData {
// fn from(d: v7::Data) -> JuliaData {
// unimplemented!();
// }
// }


impl ::std::fmt::Display for Version {
fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
write!(f, "{}", self.0)
}
}

impl Version {
/// Parse bytes to a version.
///
/// # Errors
/// Fails with `StorageError::Invalid` if the version tag is in an unknown
/// format.
fn try_from(b: &[u8]) -> Result<Version, StorageError> {
if &b[0..3] == "00\n".as_bytes() {
Ok(Version(0))
} else {
Err(StorageError::Invalid)
}
}
}

#[derive(Fail, Debug)]
pub(crate) enum StorageError {
/// The file doesn't begin with the magic tag `rbspy` + version number.
#[fail(display = "Invalid rbspy file")]
Invalid,
/// The version of the rbspy file can't be handled by this version of rbspy.
#[fail(display = "Cannot handle rbspy format {}", _0)]
UnknownVersion(Version),
/// An IO error occurred.
#[fail(display = "IO error {:?}", _0)]
Io(#[cause] io::Error),
}

/// Types that can be deserialized from an `io::Read` into something convertible
/// to the current internal form.
pub(crate) trait Storage: Into<v0::Data> {
fn from_reader<R: Read>(r: R) -> Result<Self, Error>;
fn version() -> Version;
}

fn read_version(r: &mut Read) -> Result<Version, StorageError> {
let mut buf = [0u8; 8];
// TODO: I don't know how to failure good, so this doesn't work.
// `read_exact` fails on a short read instead of leaving part of `buf` zeroed.
r.read_exact(&mut buf).map_err(StorageError::Io)?;
match &buf[..5] {
b"rbspy" => Ok(Version::try_from(&buf[5..])?),
_ => Err(StorageError::Invalid),
}
}

pub(crate) fn from_reader<R: Read>(r: R) -> Result<Vec<Vec<StackFrame>>, Error> {
// This will read 8 bytes, leaving the reader's cursor at the start of the
// "real" data.
let mut reader = flate2::read::GzDecoder::new(r);
let version = read_version(&mut reader)?;
match version {
Version(0) => {
let intermediate = v0::Data::from_reader(reader)?;
Ok(intermediate.into())
}
v => Err(StorageError::UnknownVersion(v).into()),
}
}
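
The header check described in the module comment (`b"rbspyXY\n"` with `XY` a decimal number in [0-99]) can be sketched with the standard library alone. This is an illustrative variant, not the code in this commit: `read_header_version` is a hypothetical name, it returns a plain `io::Error` instead of `StorageError`, and it accepts any two-digit version rather than only `00`.

```rust
use std::io::{self, Read};

/// Reads the 8-byte header `rbspyXY\n` and returns `XY` as a number.
/// Hypothetical helper for illustration only.
fn read_header_version<R: Read>(r: &mut R) -> io::Result<u8> {
    let mut buf = [0u8; 8];
    // Fails with `UnexpectedEof` if fewer than 8 bytes are available.
    r.read_exact(&mut buf)?;

    let invalid = || io::Error::new(io::ErrorKind::InvalidData, "not an rbspy header");

    // Magic tag at the front, newline terminator at the end.
    if !buf.starts_with(b"rbspy") || buf[7] != b'\n' {
        return Err(invalid());
    }
    // The two bytes in between must be ASCII decimal digits.
    let (tens, ones) = (buf[5], buf[6]);
    if !tens.is_ascii_digit() || !ones.is_ascii_digit() {
        return Err(invalid());
    }
    Ok((tens - b'0') * 10 + (ones - b'0'))
}
```

After a successful call the reader is positioned just past the header, at the start of the trace data, which is what `from_reader` relies on before handing the reader to the versioned parser.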
29 changes: 29 additions & 0 deletions src/storage/v0.rs
@@ -0,0 +1,29 @@
use std::io::prelude::*;
use std::io::BufReader;
use core::initialize::StackFrame;

use failure::Error;
use serde_json;

use super::*;

// The first version is just your internal representation.
pub(crate) type Data = Vec<Vec<StackFrame>>;

impl Storage for Data {
fn from_reader<R: Read>(r: R) -> Result<Data, Error> {
let reader = BufReader::new(r);
let mut result = Vec::new();
for line in reader.lines() {
let trace = serde_json::from_str(&line?)?;
result.push(trace);
}
Ok(result)
}
fn version() -> Version {
// The "version" for the internal representation is bumped every time
// you make a change to it, copying the old struct into a new file at
// `src/vblah`.
Version(0)
}
}
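
The v0 reader depends on each trace occupying exactly one line, which is why `Store::write` now uses `writeln!`. The sketch below shows the same one-record-per-line framing with the standard library only. To stay self-contained it joins frame names with `;` instead of serializing JSON, so `write_trace` and `read_traces` are hypothetical stand-ins and assume that frame names contain neither `;` nor newlines.

```rust
use std::io::{self, BufRead, BufReader, Read, Write};

/// Writes one trace per line. Joining with ';' is a stand-in for the
/// JSON encoding used by the real storage format.
fn write_trace<W: Write>(w: &mut W, trace: &[String]) -> io::Result<()> {
    writeln!(w, "{}", trace.join(";"))
}

/// Reads the traces back, one per line, in the order they were written.
fn read_traces<R: Read>(r: R) -> io::Result<Vec<Vec<String>>> {
    let mut result = Vec::new();
    for line in BufReader::new(r).lines() {
        // `lines()` yields io::Result<String> with the newline stripped.
        let line = line?;
        result.push(line.split(';').map(String::from).collect());
    }
    Ok(result)
}
```

Without the trailing newline after each record, all traces would run together on a single line and a line-based reader could not tell where one ends and the next begins.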
