Skip to content
Dan edited this page May 23, 2013 · 17 revisions

The objective of this appendix is to describe the mechanics of information recovery from dynamic econometric models. All source code for our paper is open and written in Clojure, a new dialect of Lisp. This appendix will describe the estimation procedures through detailed documentation of the code.

Consider a time series of length T. Within the full time series there will be T - D + 1 sub-blocks of length D. The following illustration represents the D! ordinal patterns for sub-blocks when D = 3:

Note that the patterns do not reflect the values within the sub-blocks, but rather the ordinal relationships between the values in the sub-blocks. The permutation entropy method relies on the ordinal sequencing of sub-blocks, and the subsequent counts of these ordinal sub-sequences. The relative frequencies of sub-sequences reveals information about the underlying dynamics of the data series. Consider an N length series, independently and identically distributed standard normal. The following is a simple plot of a series with N = 100.

The functional programming framework is particularly well suited to the analysis of permutation entropy. The final procedure is built from smaller component functions. The first function returns a sequence of indices that indicate the rank of each value within the supplied sub-sequence sub-ts.

(defn- ordinal-idx
  "Returns a sequence of indices that rank the values of the supplied
  time series in ascending order.  If there are equal values, the
  lexicographic ordering kicks in and the order at which the values
  appear is used to order the indices."
  [sub-ts]
  (let [indexed-vector (map-indexed vector sub-ts)]
    (map first
         (sort-by second indexed-vector))))

For example, the value sub-sequence (9 4 6) supplied to ordinal-idx would return the ordinal sub-sequence (2 0 1), which ranks the values according to their magnitude. Each ordinal sub-sequence represents a unique bucket; and there are D! unique buckets. The permutation-count function maps the ordinal-idx function across partitions of length D of the original series. Each partition of the full series, then, is categorized as one of the possible D! ordinal sub-sequences. The function permutation-count accepts the D length and the full series ts and returns the frequencies of the ordinal sub-sequences.

(defn permutation-count
  "Returns a map of the ordinal sequences of length `D` and their
  count; note that the offset is fixed at 1"
  [D ts]
  (let [subs (partition D 1 ts)]
    (frequencies (map ordinal-idx subs))))

The following histograms tally the frequencies of the D! = 6 ordinal sub-sequences for D = 3 and T equal to 100, 10,000, and 1,000,000, respectively. It is clear that in the limit, no ordinal sub-sequence is more likely than any other, which is consistent with the data generating process.

Another metric to assess the distribution of the sub-sequence frequencies is the measure of permutation entropy. The permutation entropy measure ranges from zero to one, where higher values indicate greater entropy. A value of one, for example, indicates a uniform distribution, and a value of zero indicates a single, repeated value within a range of values. The code to calculate the permutation entropy is presented below.

(defn- log-fn [x]
  (* x (log2 x)))

(defn- to-freq
  "Returns the normalized frequency of the supplied column"
  [coll]
  (let [total (reduce + coll)]
    (map #(/ % total) coll)))

(defn permutation-entropy
  "Normalixed permutation entropy based on the Shannon entropy
  distribution"
  [D ts]
  (let [pi-seq (to-freq (vals (permutation-count D ts)))
        scale-by (* -1 (/ 1 (log2 (factorial D))))]
    (* scale-by
       (reduce + (map log-fn pi-seq)))))

The measure of permutation entropy for the full sequences of length T = 100, 10,000, 1,000,000 are, respectively, 0.985, 0.999, and 1 (to machine precision). The general principles of permutation entropy are fairly well understood. The empirical applications, especially within economics or broader social sciences, are relatively unexplored. Consider, now, an empirical distribution of residuals, plotted here, along with the sub-sequence frequencies for D = 3:

The permutation entropy value is 0.991 of the empirical residuals. The two sub-sequences that are significantly more likely than the others are (0 1 2) and (2 1 0), or the sub-sequences that are in some sense smooth. Perhaps this behavior indicates some level of autocorrelation, rather than a more random walk. The same pattern can be seen for D = 4. A final empirical application is the Dow Jones Industrial Average (DJIA) for each day between May 26, 1896 and March 4, 2013 for a total of 29,253 observations. The normalized permutation entropy with D = 4 is 0.924. The raw time series is plotted below.

Suppose, now, that we split the raw time series into 365-day blocks, offset by one day. There are 28,889 blocks of 365 days within this time series. We can then apply the same function permutation-entropy with D = 4 to each of these 28,889 partitions:

The code runs in about 5 seconds. The results provide some insight into broad trends in US fiscal policy. The turmoil in the stock market during the depression is apparent, as is the steady demand for manufactured products during World War I. The deregulation during the Reagan administration in the 1980s, and during the second Bush administration is also apparent.

Clone this wiki locally