Skip to content
Dan edited this page May 23, 2013 · 17 revisions

The objective of this appendix is to describe the mechanics of information recovery from dynamic econometric models. All source code for our paper is open and written in Clojure, a new dialect of Lisp. This appendix will describe the estimation procedures through detailed documentation of the code.

Consider a time series of length T. Within the full time series there will be T - D + 1 sub-blocks of length D. The following illustration represents the D! ordinal patterns for sub-blocks when D = 3:

Note that the patterns do not reflect the values within the sub-blocks, but rather the ordinal relationships between the values in the sub-blocks. The permutation entropy method relies on the ordinal sequencing of sub-blocks, and the subsequent counts of these ordinal sub-sequences. The relative frequencies of sub-sequences reveals information about the underlying dynamics of the data series. Consider an N length series, independently and identically distributed standard normal. The following is a simple plot of a series with N = 100.

The functional programming framework is particularly well suited to the analysis of permutation entropy. The final procedure is built from smaller component functions. The first function returns a sequence of indices that indicate the rank of each value within the supplied sub-sequence sub-ts.

(defn- ordinal-idx
  "Returns a sequence of indices that rank the values of the supplied
  time series in ascending order.  If there are equal values, the
  lexicographic ordering kicks in and the order at which the values
  appear is used to order the indices."
  [sub-ts]
  (let [indexed-vector (map-indexed vector sub-ts)]
    (map first
         (sort-by second indexed-vector))))

For example, the value sub-sequence (9 4 6) supplied to ordinal-idx would return the ordinal sub-sequence (2 0 1), which ranks the values according to their magnitude. Each ordinal sub-sequence represents a unique bucket; and there are D! unique buckets. The permutation-count function maps the ordinal-idx function across partitions of length D of the original series. Each partition of the full series, then, is categorized as one of the possible D! ordinal sub-sequences. The function permutation-count accepts the D length and the full series ts and returns the frequencies of the ordinal sub-sequences.

(defn permutation-count
  "Returns a map of the ordinal sequences of length `D` and their
  count; note that the offset is fixed at 1"
  [D ts]
  (let [subs (partition D 1 ts)]
    (frequencies (map ordinal-idx subs))))

The following histograms tally the frequencies of the D! = 6 ordinal sub-sequences for D = 3 and N equal to 100, 10,000, and 1,000,000, respectively. It is clear that in the limit, no ordinal sub-sequence is more likely than any other, which is consistent with the data generating process.

Another metric to assess the distribution of the sub-sequence frequencies is the measure of permutation entropy. The permutation entropy measure ranges from zero to one, where higher values indicate greater entropy. A value of one, for example, indicates a uniform distribution, and a value of zero indicates a single, repeated value within a range of values. The code to calculate the permutation entropy is presented below.

(defn- log-fn [x]
  (* x (i/log2 x)))

(defn- to-freq
  "Returns the normalized frequency of the supplied column"
  [coll]
  (let [total (reduce + coll)]
    (map #(/ % total) coll)))

(defn permutation-entropy
  "Normalixed permutation entropy based on the Shannon entropy
  distribution"
  [D ts]
  (let [pi-seq (to-freq (vals (permutation-count D ts)))
        scale-by (* -1 (/ 1 (i/log2 (i/factorial D))))]
    (* scale-by
       (reduce + (map log-fn pi-seq)))))

Clone this wiki locally