Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Easily save log2 histograms #144

Closed
brendangregg opened this issue Aug 18, 2015 · 5 comments
Closed

Easily save log2 histograms #144

brendangregg opened this issue Aug 18, 2015 · 5 comments

Comments

@brendangregg
Copy link
Member

Power-of-2 histograms should be a common use case for analyzing distributions. Here is how they can be populated in C (code largely from https://github.com/torvalds/linux/blob/master/samples/bpf/tracex2_kern.c):

BPF_TABLE("array", int, u64, dist, 64);

static unsigned int log2(unsigned int v)
{
    unsigned int r;
    unsigned int shift;

    r = (v > 0xFFFF) << 4; v >>= r;
    shift = (v > 0xFF) << 3; v >>= shift; r |= shift;
    shift = (v > 0xF) << 2; v >>= shift; r |= shift;
    shift = (v > 0x3) << 1; v >>= shift; r |= shift;
    r |= (v >> 1);
    return r;
}

static unsigned int log2l(unsigned long v)
{
    unsigned int hi = v >> 32;
    if (hi)
        return log2(hi) + 32 + 1;
    else
        return log2(v) + 1;
}

[...]
    int index = log2l(req->__data_len / 1024);
    u64 *leaf = dist.lookup(&index);
    if (leaf) (*leaf)++;

Adding a BPF_LOG2_IDX macro for the above two C functions would be a big improvement, eg:

BPF_TABLE("array", int, u64, dist, 64);
[...]
    int index = BPF_LOG2_IDX(req->__data_len / 1024);
    u64 *leaf = dist.lookup(&index);
    if (leaf) (*leaf)++;

Even better would be a BPF_HIST macro, and a BPF helper:

BPF_HIST(dist, log2);
[...]
    dist.increment(req->__data_len / 1024);
@brendangregg
Copy link
Member Author

Something else to think about: support of histograms for each key. Imagine we want to save this log2 histogram per-PID, per-comm, per-device-name, or some combination?

@brendangregg
Copy link
Member Author

Could something like this be added in the meantime? ... I assume a final implementation will need bpf_log() functions anyway...

# git diff src/cc/export/helpers.h
diff --git a/src/cc/export/helpers.h b/src/cc/export/helpers.h
index 4571166..f0aaf9f 100644
--- a/src/cc/export/helpers.h
+++ b/src/cc/export/helpers.h
@@ -185,6 +185,28 @@ static inline void bpf_store_dword(void *skb, u64 off, u64 val) {
 #define MASK(_n) ((_n) < 64 ? (1ull << (_n)) - 1 : ((u64)-1LL))
 #define MASK128(_n) ((_n) < 128 ? ((unsigned __int128)1 << (_n)) - 1 : ((unsigned __int128)-1))

+static unsigned int bpf_log2(unsigned int v)
+{
+       unsigned int r;
+       unsigned int shift;
+
+       r = (v > 0xFFFF) << 4; v >>= r;
+       shift = (v > 0xFF) << 3; v >>= shift; r |= shift;
+       shift = (v > 0xF) << 2; v >>= shift; r |= shift;
+       shift = (v > 0x3) << 1; v >>= shift; r |= shift;
+       r |= (v >> 1);
+       return r;
+}
+
+static unsigned int bpf_log2l(unsigned long v)
+{
+       unsigned int hi = v >> 32;
+       if (hi)
+               return bpf_log2(hi) + 32 + 1;
+       else
+               return bpf_log2(v) + 1;
+}
+
 struct bpf_context;

 static inline __attribute__((always_inline))

@drzaeus77
Copy link
Collaborator

Sure, if you send that I will be happy to merge it. Otherwise I'm planning
to start churning through these issues after the meetup today.

On Mon, Sep 21, 2015 at 3:34 PM, Brendan Gregg notifications@github.com
wrote:

Could something like this be added in the meantime? ... I assume a final
implementation will need bpf_log() functions anyway...

git diff src/cc/export/helpers.h

diff --git a/src/cc/export/helpers.h b/src/cc/export/helpers.h
index 4571166..f0aaf9f 100644
--- a/src/cc/export/helpers.h
+++ b/src/cc/export/helpers.h
@@ -185,6 +185,28 @@ static inline void bpf_store_dword(void *skb, u64 off, u64 val) {
#define MASK(_n) ((_n) < 64 ? (1ull << (_n)) - 1 : ((u64)-1LL))
#define MASK128(_n) ((_n) < 128 ? ((unsigned __int128)1 << (_n)) - 1 : ((unsigned __int128)-1))

+static unsigned int bpf_log2(unsigned int v)
+{

  •   unsigned int r;
    
  •   unsigned int shift;
    
  •   r = (v > 0xFFFF) << 4; v >>= r;
    
  •   shift = (v > 0xFF) << 3; v >>= shift; r |= shift;
    
  •   shift = (v > 0xF) << 2; v >>= shift; r |= shift;
    
  •   shift = (v > 0x3) << 1; v >>= shift; r |= shift;
    
  •   r |= (v >> 1);
    
  •   return r;
    

    +}
    +
    +static unsigned int bpf_log2l(unsigned long v)
    +{

  •   unsigned int hi = v >> 32;
    
  •   if (hi)
    
  •           return bpf_log2(hi) + 32 + 1;
    
  •   else
    
  •           return bpf_log2(v) + 1;
    

    +}
    +
    struct bpf_context;

    static inline attribute((always_inline))


Reply to this email directly or view it on GitHub
#144 (comment).

@drzaeus77
Copy link
Collaborator

I have some (maybe) improvements, and am looking for feedback.

The table could be defined by BPF_HISTOGRAM(name) or BPF_HISTOGRAM(name, key_type). Default key_type is int. Leaf type is forced to be u64.

In the first scenario, a new table method is introduced, which just increments the slot based on integer argument. The underlying table type is an array.

BPF_HISTOGRAM(dist);
dist.increment(bpf_log2l(delta));

In the second scenario, the key type is definable, in this case it is a tuple of the bucket and the slot. The underlying table type is a hash.

typedef struct {
  int cpu;
  u32 slot;
} key_t;
BPF_HISTOGRAM(dist, key_t);
dist.increment((key_t){bpf_get_smp_processor_id(), bpf_log2l(delta)});

@brendangregg
Copy link
Member Author

Seems reasonable.

Just to expand on the biolatency example. Right now that's a global log2 histogram. Possible enhancements to that tool:

  • -P to print a latency histogram for each issuing PID
  • -I to print separate latency histograms for each I/O type: read, write
  • -D to print a latency histogram per-disk device

Doing multiple breakdowns -- eg, combining -P and -I, to get a breakdown per PID and per I/O type -- are less important given filter capabilities. Eg, imagine a workflow:

# ./biolatency -P          # show latency by-PID
# ./biolatency -p 181 -I   # given PID 181 has high latency, show PID 181 only by I/O type
# ./biolatency -rp 181 -D  # show PID 181 read latency by-device

So I'm accomplishing breakdowns of breakdowns, but by using filters. Being able to have multiple keys would be useful too. I'm just showing how important it might be (at least for this example).

drzaeus77 pushed a commit that referenced this issue Sep 24, 2015
This adds support for a specialized histogram type, which underneath
maps to an array or a hash table, depending on key type. With no
arguments, it takes on the type `u64 table[64];`. The other current
supported key type is `struct { int32|int64 bucket; int32|int64 slot }`.

To print these automatically, print_log2_hist is underneath split into
two types of printouts, one which prints the single histogram, and
another which prints a histogram for each unique `bucket` value.

See test_histogram.py for examples.

Fixes: #144
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
drzaeus77 pushed a commit that referenced this issue Sep 24, 2015
This adds support for a specialized histogram type, which underneath
maps to an array or a hash table, depending on key type. With no
arguments, it takes on the type `u64 table[64];`. The other current
supported key type is `struct { int32|int64 bucket; int32|int64 slot }`.

To print these automatically, print_log2_hist is underneath split into
two types of printouts, one which prints the single histogram, and
another which prints a histogram for each unique `bucket` value.

See test_histogram.py for examples.

Fixes: #144
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants