# Analysis example for exploring power law distributions

In this notebook we explore power law distributions to gain familiarity with them.

See https://www.nature.com/articles/srep00812 for why a mixture of power laws does not itself yield a power law. Similarly, if you take only the bottom half of the data from a power law, the result does not follow a power law.

The original analysis is in this GitHub repository: https://github.com/carbocation/jupyter/blob/master/powerlaws.ipynb
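
One simple way to see the mixture claim is to look at the local slope of the survival function on log-log axes, which is constant for a pure power law. The sketch below is only an illustration under stated assumptions: it uses continuous power laws with lower bound 1, an equal-weight mixture, and exponents 2 and 4 chosen arbitrarily. The function names are hypothetical helpers, not part of any package.

```r
# Survival function P(X >= x) of a continuous power law with lower bound 1
PowerLawSurvival <- function(x, alpha) x^(-(alpha - 1))

# Hypothetical 50/50 mixture of exponents 2 and 4 (illustrative choices)
MixtureSurvival <- function(x) {
    0.5 * PowerLawSurvival(x, 2) + 0.5 * PowerLawSurvival(x, 4)
}

# Local log-log slope between neighbouring grid points
LocalSlopes <- function(survival, grid) {
    diff(log(survival(grid))) / diff(log(grid))
}

grid <- 10^seq(0, 3, by = 0.5)
pure_slopes <- LocalSlopes(function(x) PowerLawSurvival(x, 3), grid)
mixture_slopes <- LocalSlopes(MixtureSurvival, grid)

print(round(pure_slopes, 3))     # constant slope for the pure power law
print(round(mixture_slopes, 3))  # slope drifts as x grows
```

The mixture's slope moves from the steeper component toward the shallower one, so no single exponent describes the whole range.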

## Setup

In [None]:
# Install each required package only if it is missing
for (pkg in c('poweRlaw')) {
    if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
}

In [None]:
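# Return descending ranks (n, n-1, ..., 1) for a vector sorted in ascending order,
# so that the largest value receives rank 1
GetRanks <- function(x) {
    return(rev(seq_along(x)))
}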

# Generate a vector of 50,000 discrete power-law distributed values
# with lower bound xmin = 1 and scaling exponent alpha = 3
x <- poweRlaw::rpldis(50000, xmin = 1, alpha = 3, discrete_max = 10000)
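
To make the roles of the lower bound and the scaling exponent concrete, here is a minimal base-R sketch that draws from a continuous power law by inverse-transform sampling and recovers the exponent with the standard maximum-likelihood estimator. It is an illustration only: the notebook's `x` is discrete and comes from `poweRlaw::rpldis`, while `SampleContinuousPowerLaw` and `EstimateAlpha` are hypothetical helper names introduced here.

```r
# Hypothetical helper: draw n continuous power-law values with lower bound
# xmin and exponent alpha, using P(X >= x) = (x / xmin)^(-(alpha - 1))
SampleContinuousPowerLaw <- function(n, xmin, alpha) {
    u <- runif(n)
    xmin * (1 - u)^(-1 / (alpha - 1))
}

# Maximum-likelihood estimate of alpha for continuous data with known xmin
EstimateAlpha <- function(values, xmin) {
    1 + length(values) / sum(log(values / xmin))
}

set.seed(1)
y <- SampleContinuousPowerLaw(50000, xmin = 1, alpha = 3)
alpha_hat <- EstimateAlpha(y, xmin = 1)
print(alpha_hat)  # expected to land close to 3
```

With 50,000 draws the estimate should sit within a few hundredths of the true exponent, which is a quick sanity check that a sample behaves as intended.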

## Plot the full distribution

In [None]:
(function() {
    # Rank-value plot on log-log axes. GetRanks() gives rank 1 to the last element,
    # so the values must be sorted in ascending order before plotting.
    plot(GetRanks(x), sort(x), log="xy", xlab="Rank", ylab="Value", main="Log scale")
})()
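
A short derivation suggests why this plot should look roughly like a straight line. It assumes the continuous approximation $P(X \ge x) = (x / x_{\min})^{-(\alpha - 1)}$; the discrete values drawn above contain many ties, so they follow it only approximately. With $n$ values, the value $x_{(r)}$ at rank $r$ (rank 1 being the largest) satisfies $r \approx n \, P(X \ge x_{(r)})$, so

$$
r \approx n \left( \frac{x_{(r)}}{x_{\min}} \right)^{-(\alpha - 1)}
\quad\Longrightarrow\quad
\log x_{(r)} \approx \log x_{\min} + \frac{1}{\alpha - 1} \left( \log n - \log r \right).
$$

On log-log axes this is a line with slope $-1/(\alpha - 1)$, which is $-1/2$ for $\alpha = 3$.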

## Take the bottom 95% of the distribution and see if it still looks like a power law (no)

In [None]:
(function() {
    # Keep the smallest 95% of the values
    N <- floor(0.95 * length(x))
    partials <- sort(x)[1:N]
    plot(GetRanks(partials), sort(partials), log="xy", xlab="Rank", ylab="Value", main="Log scale")
})()
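
The bend has a simple explanation under a continuous approximation, offered here as a rough guide because the discrete sample is dominated by tied values. Suppose the $r$-th largest of $n$ values is roughly $x_{\min} (r/n)^{-1/(\alpha - 1)}$. Dropping the top $k = 0.05\,n$ values and re-ranking the rest assigns the new rank $r' = r - k$, so

$$
\log x_{(r')} \approx \log x_{\min} + \frac{1}{\alpha - 1} \left( \log n - \log (r' + k) \right),
\qquad
\frac{d \log x_{(r')}}{d \log r'} \approx -\frac{1}{\alpha - 1} \cdot \frac{r'}{r' + k}.
$$

The slope is close to zero while $r' \ll k$ and only approaches $-1/(\alpha - 1)$ once $r' \gg k$, so the curve is flat at low ranks and steepens later instead of forming a single straight line.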

## Take the top 5% of the distribution and see if it still looks like a power law (yes)

In [None]:
(function() {
    # Keep the largest 5% of the values, including the maximum
    N <- floor(0.05 * length(x))
    partials <- sort(x)[(length(x) - N + 1):length(x)]
    plot(GetRanks(partials), sort(partials), log="xy", xlab="Rank", ylab="Value", main="Log scale")
})()
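
The contrast between the two truncations can also be checked numerically. The sketch below is an assumption-laden illustration and not the notebook's data: it replaces the random sample with an idealized, noise-free continuous sample built from quantiles, and `LogLogSlope` is a hypothetical helper that fits a line to log(value) against log(rank) after re-ranking each subset.

```r
n <- 50000; alpha <- 3; xmin <- 1

# Idealized, noise-free sample (an assumption, not the notebook's random x):
# the r-th largest value is placed at the r / (n + 1) upper quantile.
ideal <- sort(xmin * (seq_len(n) / (n + 1))^(-1 / (alpha - 1)))

# Slope of log(value) against log(rank) over the rank positions in `keep`,
# after re-ranking the subset so that its largest value has rank 1.
LogLogSlope <- function(values, keep) {
    sorted_desc <- sort(values, decreasing = TRUE)
    ranks <- seq_along(sorted_desc)
    fit <- lm(log(sorted_desc[keep]) ~ log(ranks[keep]))
    unname(coef(fit)[2])
}

n_bottom <- round(0.95 * n)
n_top <- round(0.05 * n)
bottom <- ideal[1:n_bottom]          # smallest 95%
top <- ideal[(n - n_top + 1):n]      # largest 5%

top_head_slope <- LogLogSlope(top, 1:100)
bottom_head_slope <- LogLogSlope(bottom, 1:100)
bottom_tail_slope <- LogLogSlope(bottom, 40000:n_bottom)
print(c(top_head = top_head_slope,
        bottom_head = bottom_head_slope,
        bottom_tail = bottom_tail_slope))
```

In this idealized setting the top 5% keeps the slope of $-1/(\alpha - 1)$ at its lowest ranks, whereas the bottom 95% is nearly flat at low ranks and steep at high ranks, which matches the pattern in the two plots.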

## Provenance

In [None]:
devtools::session_info()

Copyright 2018 The Broad Institute, Inc., Verily Life Sciences, LLC All rights reserved.

This software may be modified and distributed under the terms of the BSD license. See the LICENSE file for details.