# Perturbation Complexity Index (PCI)

## Definition
This metric was introduced by Casali et al. in 2013. It is used with TMS (transcranial magnetic stimulation) to quantify the complexity of the brain's response to the stimulation.

The metric is defined as the Lempel-Ziv complexity of a stream of '0' and '1' symbols, scaled by the source entropy and the length of the sequence. This normalization brings the metric to approximately between 0 and 1, where 1 corresponds to a maximally complex stream.

$$\text{PCI} = \text{Lempel-Ziv Complexity} \times \frac{\log_2(\text{Length of stream})}{\text{Length of stream} \times \text{Source Entropy}}$$

$$\text{Source Entropy} = -p \log_2(p) - (1-p) \log_2(1-p)$$

where $p$ is the fraction of 1s in the stream.
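
As a quick sanity check on the normalization, the sketch below computes the source entropy and the scaling factor for a short stream. The function names `source_entropy` and `normalization_factor` are illustrative only and do not come from any library. Treating the entropy as 0 when the stream contains a single symbol is an assumption made here to avoid evaluating log(0).

```python
from math import log2


def source_entropy(stream):
    """Binary source entropy (in bits) of a string made of '0' and '1'."""
    p = stream.count("1") / len(stream)
    if p in (0.0, 1.0):
        # Assumption: a stream with a single symbol has zero entropy.
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def normalization_factor(stream):
    """log2(L) / (L * H): the factor that multiplies the Lempel-Ziv complexity."""
    length = len(stream)
    entropy = source_entropy(stream)
    if entropy == 0.0:
        raise ValueError("normalization is undefined when the source entropy is 0")
    return log2(length) / (length * entropy)


balanced = "01" * 8  # 16 symbols, half of them '1', so p = 0.5
print(source_entropy(balanced))        # 1.0
print(normalization_factor(balanced))  # 0.25
```

With p = 0.5 the entropy is 1 bit per symbol, so the factor reduces to log2(16)/16 = 0.25. A more skewed stream has lower entropy and therefore a larger factor.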

## Implementation
The core of the algorithm is the Lempel-Ziv complexity, so we need a solid implementation of it. We could use my previous code from the Lempel-Ziv Jupyter notebook; however, it's safer to use a battle-tested implementation. One good implementation I have found is https://github.com/Naereen/Lempel-Ziv_Complexity.
To install it, use pip as shown in its README.md: `pip install lempel_ziv_complexity`.

In [128]:
from lempel_ziv_complexity import lempel_ziv_complexity
stream = '1001111011000010100111101100001010011110110000101001111011000010'
complexity = lempel_ziv_complexity(stream)
print("Complexity = " + str(complexity))  # This should give us a complexity of around 7

Complexity = 7
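
For intuition about what a Lempel-Ziv style complexity counts, here is a minimal pure-Python sketch of one common variant: the stream is parsed from left to right, and a new phrase is recorded each time the current substring has not been seen before. The name `lz_phrase_count` is hypothetical, and this is not claimed to be the library's implementation. Lempel-Ziv variants differ in their parsing rules and can return different counts for the same stream.

```python
def lz_phrase_count(stream):
    """Count the distinct phrases found by a left-to-right dictionary parsing."""
    phrases = set()
    start, width = 0, 1
    while start + width <= len(stream):
        candidate = stream[start:start + width]
        if candidate in phrases:
            # Already seen: extend the candidate by one symbol.
            width += 1
        else:
            # New phrase: record it and restart just after it.
            phrases.add(candidate)
            start += width
            width = 1
    return len(phrases)


print(lz_phrase_count("0000"))              # 2 (phrases: 0, 00)
print(lz_phrase_count("1001111011000010"))  # 8 (phrases: 1, 0, 01, 11, 10, 110, 00, 010)
```

A highly repetitive stream yields few phrases because each new phrase has to be longer than the ones already seen, while an irregular stream keeps producing short new phrases.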


Next, we implement the PCI function, which uses the Lempel-Ziv complexity together with a few helper functions.

In [136]:
from math import log
import numpy as np
from lempel_ziv_complexity import lempel_ziv_complexity

# The input is assumed to be a 2-D matrix of shape (X, T): rows are the X features
# and columns are the T time points.
def matrix_to_string(matrix):
    # Flatten column by column ("F" = column-major order), so all features at the
    # first time point come first, then all features at the second, and so on.
    array = matrix.flatten("F")
    # Join the 0/1 entries into a single string such as "0110..."
    string_array = ''.join(str(n) for n in array)
    return string_array

# This function accepts a matrix of shape (row, col) where row = features X and col = time T
def perturbation_complexity_index(matrix):
    # Convert the matrix into a stream by concatenating it column-wise
    stream = matrix_to_string(matrix)

    # Variable initialization
    length_stream = len(stream)
    p = stream.count("1")/length_stream
    # The source entropy is 0 when the stream is all '0' or all '1'; the PCI is
    # undefined in that case (log(0) and a division by zero), so fail early.
    if p == 0 or p == 1:
        raise ValueError("PCI is undefined for a stream containing only one symbol")
    source_entropy = -p*log(p, 2) - (1-p)*log((1-p), 2)
    complexity = lempel_ziv_complexity(stream)  # Here we could use the cython version for speedup

    # PCI formula
    pci = complexity*(log(length_stream, 2)/(length_stream*source_entropy))
    return pci
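
To make the column-wise concatenation concrete, the small sketch below builds the stream for a 2 x 3 matrix using plain Python lists. The helper name `columns_to_stream` is illustrative only; it walks the matrix column by column, which is the same order NumPy's `flatten("F")` produces for a 2-D array.

```python
def columns_to_stream(rows):
    """Concatenate a 2-D list of 0/1 values column by column into a string."""
    # zip(*rows) yields one tuple per column (i.e. per time point).
    return "".join(str(value) for column in zip(*rows) for value in column)


rows = [
    [1, 0, 1],  # feature 0 at time points t0, t1, t2
    [0, 1, 1],  # feature 1 at time points t0, t1, t2
]
print(columns_to_stream(rows))  # 100111
```

Each pair of symbols in the output corresponds to one time point: `10` for t0, `01` for t1 and `11` for t2.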
    

In [137]:
matrix = np.around(np.random.rand(100,100)).astype(int)
pci = perturbation_complexity_index(matrix)
print("PCI for random matrix = " + str(pci))  # The PCI should be close to 1 (maximum complexity) because the input stream is random

PCI for random matrix = 1.0432829347892154
