---
date: 2019-02-27T19:55:50.330Z
title: Conditions Are Power-Law Distributed - A JS Example
tags:
  - conditions
  - data-science
  - meta
  - power-law
cover: cover.png
coverAlt: Our own codebase follows a power law distribution of conditions
---

I read Kent Beck's "Conditions Are Power-Law Distributed: An Example" today, and it made perfect sense. In a codebase, many conditions are used only once and a few are used many times, so the number of conditions at each usage count follows a power law.

Kent wrote some Bash code to extract the distribution, but it targets Python ifs, so I adjusted it for our JavaScript/TypeScript codebase (currently 304K lines):

$ grep -R --include='*.js' 'if ' app | \
 perl -nle 'print $1 if /.*if\s*\((.*)\) /' | \
 sort | uniq -c | sort -n -r | \
 awk '{ print $1}' | sort -n | uniq -c
2696 1
 327 2
  85 3
  42 4
  22 5
   8 6
   8 7
   3 8
   5 9
   2 10
   4 12
   1 13
   4 15
   2 19
   2 21
   1 22
   1 33
   1 35
   1 155
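
Each row of that output reads as "this many distinct conditions appear that many times", so the first row says 2696 conditions show up exactly once. As a rough sketch, a small awk helper can express that first row as a share of the total. The function name `single_use_share` is made up for illustration, and it assumes the two-column format shown above arrives on standard input.

```shell
# Hypothetical helper: share of distinct conditions that appear exactly once.
# Input (stdin): rows of "<number of conditions> <appearances>".
single_use_share() {
  awk '
    { total += $1 }            # every row adds to the number of distinct conditions
    $2 == 1 { once += $1 }     # rows where the condition appears a single time
    END {
      if (total > 0)
        printf "%d of %d conditions appear once (%.1f%%)\n", once, total, 100 * once / total
    }'
}
```

Appending `| single_use_share` to the pipeline above prints the share for whatever directory is being scanned. For the counts listed above, that works out to 2696 of 3215 conditions, about 83.9%.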

My go-to tool for plotting is R (and RStudio), so let's save that output to a text file named data.txt.

After opening it in RStudio, I ran the following code (and isn't R amazing?):

library(ggplot2)

data <- read.table("data.txt", col.names = c("Count", "Appearances"))
ggplot(data, aes(Appearances, Count)) + geom_point() +
    scale_x_continuous(trans='log2') + scale_y_continuous(trans='log2')

This sets both axes to a log2 scale, and we get a distribution very similar to the power law in Kent Beck's results.

Our own codebase conditions are power law distributed

Awesome!
