# Tutorial: how to generate power-law surrogates


This notebook is designed very briefly to illustrate generation of different types of power-law surrogates from synthetic data and an empirical time series.

## Import source code and basic libraries

In [80]:
from constrained_power_law_surrogates import gen_typ_p_l_surrogate, gen_power_law_surr_list, ident_cut_off_const, estim_scale_exp
# 
import numpy as np
import sys

## Input sequence

Store your input sequence into the variable *seq*

Read it from data

In [67]:
seqtext = open("time-series/words.txt").read().split("\n")
seq = [int(x) for x in seqtext[:10]]

Generate from a power-law with chosen exponent and lower cut-off:

In [68]:
gamma = 2; x_min = 3; N = 10;
seq = gen_typ_p_l_surrogate(gamma, x_min, N)

## Constrained methods

### This is the list and small description of the methods to generate (surrogate) time series

In [69]:
surrogate_methods = ['obse','know','cons','typi','boot','shuf','mark','ordi']
method_description={
        'obse':'Original observation',
    'know':'Power-law surrogate with known scale exponent',
    'cons':'Constrained power-law surrogate',
    'typi':'Typical power-law surrogate',
    'boot':'Bootstrap surrogate',
    'shuf':'Shuffle surrogate',
    'mark':'Constrained Markov order power-law surrogate',
    'ordi':'Constrained ordinal pattern power-law surrogate',
    'unknown':'Unknown input surr_method surrogate'
}

You can choose one of the methods to generate the surrogate of the input sequence *seq* as:

In [70]:
gen_power_law_surr_list(seq,surr_method='cons')

[[25, 1, 27, 2, 7, 20, 1, 12, 28, 1]]

Here are surrogates from all the methods:

In [71]:
for method in surrogate_methods:
    print("Using method:",method," to generate:",method_description[method])
    surr_list = gen_power_law_surr_list(seq, surr_method='shuf', x_min=x_min, num_surr=2)
    print(surr_list,"\n")


Using method: obse  to generate: Original observation
[[3, 7, 4, 3, 5, 16, 14, 5, 3, 15], [16, 15, 14, 3, 3, 7, 3, 5, 4, 5]] 

Using method: know  to generate: Power-law surrogate with known scale exponent
[[4, 3, 14, 5, 7, 15, 5, 16, 3, 3], [5, 3, 5, 14, 16, 4, 3, 7, 3, 15]] 

Using method: cons  to generate: Constrained power-law surrogate
[[3, 3, 5, 14, 15, 3, 7, 5, 4, 16], [7, 5, 14, 4, 15, 16, 5, 3, 3, 3]] 

Using method: typi  to generate: Typical power-law surrogate
[[3, 15, 3, 14, 7, 3, 16, 5, 5, 4], [3, 5, 7, 3, 3, 15, 16, 14, 4, 5]] 

Using method: boot  to generate: Bootstrap surrogate
[[3, 3, 5, 3, 7, 16, 15, 14, 5, 4], [15, 16, 5, 3, 4, 5, 14, 3, 3, 7]] 

Using method: shuf  to generate: Shuffle surrogate
[[4, 5, 5, 7, 16, 3, 3, 15, 14, 3], [4, 5, 14, 15, 7, 16, 3, 3, 3, 5]] 

Using method: mark  to generate: Constrained Markov order power-law surrogate
[[16, 14, 5, 5, 4, 3, 7, 3, 15, 3], [4, 3, 15, 7, 16, 5, 14, 3, 3, 5]] 

Using method: ordi  to generate: Constrained ord

Here is how you plot the surrogates:

## Estimating the cut-offs

In [82]:
# Load empirical data from text file
# Generate surrogates from this empirical data using different methods
# Juxtapose each surrogate with original observation.
# 

NMax = 1024# Maximum length of time series

# Load observation from text file
# file_name_str = 'normed-flares.txt'; dist_str = 'flares'; #x_min = 1;, N = 1,711, NG = 1,711
file_name_str = 'time-series/energy.txt'; dist_str = 'time-series/earthquakes'; #x_min = 1;, N = 59,555, NG = 59,555

data = np.loadtxt(file_name_str)

#Estimate lower cut-off x_min by minimise KS-distance from maximum likelihood power-law
(x_min, _, _, _, _) = ident_cut_off_const(data)
seq = [int(np.round(val)) for val in data if val >= x_min]
N0 = len(seq)

N0 = len(data)
if (N0 > NMax):
    N = NMax
    start_point = np.random.randint(0, N0 - N)
    seq = seq[start_point:(start_point + N)]
else:
    N = N0
    
print(str(N0) + ' observations above estimated lower cut-off x_min = ' + str(x_min) + ', of which we will consider ' + str(N) + ' consecutive values starting from a randomly chosen point.');

#The list of surrogates to be considered, and their names
surr_method_list = ['obse', 'know', 'typi', 'cons', 'boot', 'shuf', 'mark', 'ordi']
surr_method_name_list = ['Observed', 'Known exponent', 'Typical', 'Constrained', 'Bootstrapped', 'Shuffled', 'Markov', 'Ordinal pattern']

scale_exp = 2#Known scale exponent (Known scale exponent "know" surrogates only)
o = 1#Markov order (Markov order surrogates "mark" only)
b = 3#Use bins of logarithmic width log(b) (Markov order surrogates "mark" only)
L = 16#Length of ordinal patterns (ordinal pattern surrogates "ordi" only)
num_trans = 10**5#Number of transitions (Markov order "mark" and ordinal pattern surrogates "ordi" only)

#Generate and plot realisation of each type of surrogate
num_surr_types = len(surr_method_list)
import matplotlib.pyplot as plt
fig, ax = plt.subplots(num_surr_types, 2, sharex='col', sharey='none', figsize=(2*6, 6*num_surr_types))
for i in range(num_surr_types):
    surr_method = surr_method_list[i]
    surr_method_name = surr_method_name_list[i]
    surr_val_seq_list = gen_power_law_surr_list(seq, surr_method=surr_method, x_min=x_min, num_surr=1, scale_exp=scale_exp, o=o, b=b, L=L, num_trans=num_trans)
    surr_val_seq = surr_val_seq_list[0]
    scale_exp_m_l = estim_scale_exp(surr_val_seq, x_min)#Find maximum likelihood scale exponent of surrogate realisation
    match_seq = [seq[ii] if surr_val_seq[ii] == seq[ii] else np.nan for ii in range(N)]
    ax_subplot = ax[i][0]
    ax_subplot.scatter(list(range(N)), seq)
    ax_subplot.scatter(list(range(N)), surr_val_seq)
    ax_subplot.scatter(list(range(N)), match_seq)
    ax_subplot.set_yscale('log')
    ax_subplot.legend(labels=['Observed', surr_method_name, 'Equal'])
    ax_subplot.set_ylabel('Value')
    ax_subplot = ax[i][1]
    ax_subplot.axis('off')
ax[-1][0].set_xlabel('Time')
plt.show()

KeyboardInterrupt: 