# Running simulation template

This notebook runs the template notebook `simulation_template.ipynb`, which performs the simulations for the nearest-neighbour (ssh1) and second-neighbour (ssh2) SSH systems.

## SSH1

In [1]:
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

template = TEMPLATE_NOTEBOOK
parameters = {
    "model_kw": {"criterion": "entropy"},
    "allowed_windings": [0, 1], "val_split": 0.9, "features_to_use": None, "shuffle_features": False,
    "n_experiments": 100, "start_n": 0, "fit_params": None, "shuffle_rows": True, "pred_params": None,
    "random_features": False, "store_in_lists": False, "save_eigenvector": True,
    "save_hamiltonian": True, "save_accuracy": True, "save_models": True,
}
kernel_name = KERNEL_NAME

CPU times: user 709 ms, sys: 271 ms, total: 980 ms
Wall time: 582 ms
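
Each scenario below overrides a few keys of this shared `parameters` dictionary in place before calling `pm.execute_notebook`, so a key set in one cell stays set for whichever cell runs next. A minimal sketch of an alternative that builds a fresh dictionary per scenario; `scenario_parameters` is a hypothetical helper, not part of `simulation`, and `base` is a shortened stand-in for the real dictionary:

```python
import copy


def scenario_parameters(base, **overrides):
    """Return a deep copy of `base` with scenario-specific keys overridden."""
    params = copy.deepcopy(base)
    params.update(overrides)
    return params


# Shortened stand-in for the `parameters` dictionary defined above.
base = {
    "model_kw": {"criterion": "entropy"},
    "allowed_windings": [0, 1],
    "features_to_use": None,
    "fourier_mode": None,
}

# Illustrative overrides mirroring a real-space scenario and a DCT scenario.
real_space = scenario_parameters(
    base, features_to_use=[0, 1, 2, 3, 49, 50], random_state=9427454
)
dct_run = scenario_parameters(base, fourier_mode="dct", random_state=2462394756)

print(real_space["features_to_use"])  # the site subset chosen for this run
print(dct_run["features_to_use"])     # None: nothing leaks in from the previous run
print(base["fourier_mode"])           # None: the shared defaults are untouched
```

With this pattern the order in which scenario cells are executed no longer affects the parameters each run receives.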


### periodic_100_6561

#### ZEROTH scenario: SSH1 Real space lattice sites 

In [2]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

### parameters
parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_0TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 83649369
### Fourier features
parameters["fourier_features_to_use"] = None
parameters["fourier_mode"] = None
parameters["fourier_real"] = None
parameters["fourier_normalize"]= None
parameters["fourier_fillna"] = None

output_file = SSH1_0TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True);





The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload




CPU times: user 4.94 s, sys: 108 ms, total: 5.05 s
Wall time: 16min 20s
CPU times: user 4.94 s, sys: 109 ms, total: 5.05 s
Wall time: 16min 20s


{'cells': [{'cell_type': 'markdown',
   'metadata': {'tags': [],
    'papermill': {'exception': False,
     'start_time': '2020-07-16T00:27:28.645139',
     'end_time': '2020-07-16T00:27:28.700637',
     'duration': 0.055498,
     'status': 'completed'}},
   'source': '# Simulation template \n\nIn this notebook we run the machine learning analysis of topological phase transitions occurring  in both nearest-neighbours SSH models (ssh1) and second neighbours models (ssh2) as decribed in the paper [Machine learning topological phases in real space](https://arxiv.org/abs/1901.01963). Here the simulation is run with features generated from fourier components in the first scenario. This scenario is characterized by using only the EVEN wavevector space eigenmodes, computed from ALL real space components.'},
  {'cell_type': 'markdown',
   'metadata': {'tags': [],
    'papermill': {'exception': False,
     'start_time': '2020-07-16T00:27:28.720714',
     'end_time': '2020-07-16T00:27:28.736607'

#### FIRST scenario: Best real space lattice sites in $S_1$

In the first scenario, we run simulations using the best real-space lattice sites, $S_1 = [0, 1, 2, 3, 5, 7, 19, 35, 45, 48, 49, 50]$, as determined by the information entropy signatures.



In [6]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

import json
import os

# Load the feature ranking written by the ZEROTH scenario.
json_dir = os.path.join(SSH1_PERIODIC_0TH_SCENARIO_100_6561_SIMULATION_DIR,"feature_importances")
filename = os.path.join(json_dir,"sorted_feature_importance.json")
with open(filename) as f:
    json_data = json.load(f)
# JSON keys are strings: convert them to integer feature indices, keeping the file order.
feature_importances = [int(k) for k in json_data]
print("feature_importances")
print(feature_importances)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
feature_importances
[0, 50, 51, 3, 1, 53, 99, 49, 55, 2, 5, 52, 98, 57, 48, 35, 45, 69, 19, 95, 9, 7, 97, 27, 77, 4, 59, 41, 85, 91, 79, 54, 11, 71, 46, 47, 61, 63, 96, 29, 43, 21, 87, 13, 31, 15, 23, 75, 39, 81, 73, 65, 56, 33, 6, 83, 93, 37, 25, 17, 89, 94, 44, 8, 67, 76, 92, 58, 42, 34, 90, 26, 24, 68, 84, 40, 16, 28, 88, 80, 70, 66, 32, 10, 30, 82, 12, 78, 18, 74, 20, 60, 86, 62, 38, 14, 72, 64, 22, 36]
CPU times: user 1.08 ms, sys: 40 µs, total: 1.12 ms
Wall time: 779 µs
CPU times: user 2.44 ms, sys: 92 µs, total: 2.53 ms
Wall time: 2.14 ms
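
The cells that follow pick the best-ranked lattice sites from the first half of the chain (index at most `N_half`) and sort them. A compact sketch of that selection, applied to the first entries of the ranking printed above; `best_sites` is a hypothetical name used here for illustration only:

```python
def best_sites(ranking, n_sites, n_half):
    """Return the first n_sites entries of ranking with index <= n_half, sorted."""
    selected = [s for s in ranking if s <= n_half][:n_sites]
    return sorted(selected)


# First entries of the ranking printed above (truncated for brevity).
ranking = [0, 50, 51, 3, 1, 53, 99, 49, 55, 2, 5, 52, 98, 57, 48, 35, 45, 69, 19]

print(best_sites(ranking, n_sites=6, n_half=50))  # [0, 1, 2, 3, 49, 50]
print(best_sites(ranking, n_sites=4, n_half=50))  # [0, 1, 3, 50]
```

The filter-then-slice form gives the same result as the explicit loop with `break`, because both stop at the first `n_sites` entries that pass the `<= n_half` test.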


In [8]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

### Collecting lattice sites
# Keep the N_sites best-ranked sites from the first half of the lattice (index <= N_half).
N_half = 50
N_sites = 6
S_1 = []
for s in feature_importances:
    if s <= N_half:
        S_1.append(s)
    if len(S_1) == N_sites:
        break
S_1 = sorted(S_1)

### parameters
parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_1ST_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_1 #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 9427454
### Fourier features
parameters["fourier_features_to_use"] = None
parameters["fourier_mode"] = None
parameters["fourier_real"] = None
parameters["fourier_normalize"]= None
parameters["fourier_fillna"] = None

output_file = SSH1_1ST_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True);





The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload




CPU times: user 3.96 s, sys: 51.8 ms, total: 4.01 s
Wall time: 2min 54s
CPU times: user 3.96 s, sys: 51.8 ms, total: 4.01 s
Wall time: 2min 54s



#### SECOND scenario: Best real space lattice sites in $S'_1$ 

In the second scenario, we run simulations using the best real-space lattice sites, $S'_1 = [0, 1, 2, 3, 49, 50]$, as determined by the entropy signatures.

In [9]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

import json
import os

# Load the feature ranking written by the ZEROTH scenario.
json_dir = os.path.join(SSH1_PERIODIC_0TH_SCENARIO_100_6561_SIMULATION_DIR,"feature_importances")
filename = os.path.join(json_dir,"sorted_feature_importance.json")
with open(filename) as f:
    json_data = json.load(f)
# JSON keys are strings: convert them to integer feature indices, keeping the file order.
feature_importances = [int(k) for k in json_data]
print("feature_importances")
print(feature_importances)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
feature_importances
[0, 50, 51, 3, 1, 53, 99, 49, 55, 2, 5, 52, 98, 57, 48, 35, 45, 69, 19, 95, 9, 7, 97, 27, 77, 4, 59, 41, 85, 91, 79, 54, 11, 71, 46, 47, 61, 63, 96, 29, 43, 21, 87, 13, 31, 15, 23, 75, 39, 81, 73, 65, 56, 33, 6, 83, 93, 37, 25, 17, 89, 94, 44, 8, 67, 76, 92, 58, 42, 34, 90, 26, 24, 68, 84, 40, 16, 28, 88, 80, 70, 66, 32, 10, 30, 82, 12, 78, 18, 74, 20, 60, 86, 62, 38, 14, 72, 64, 22, 36]
CPU times: user 1.28 ms, sys: 9 µs, total: 1.29 ms
Wall time: 867 µs
CPU times: user 4.48 ms, sys: 14 µs, total: 4.5 ms
Wall time: 4 ms


In [25]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

### Collecting lattice sites
# Keep the N_sites best-ranked sites from the first half of the lattice (index <= N_half).
N_half = 50
N_sites = 4

S_1_prime = []
for s in feature_importances:
    if s <= N_half:
        S_1_prime.append(s)
    if len(S_1_prime) == N_sites:
        break
S_1_prime = sorted(S_1_prime)

### parameters
parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_2ND_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_1_prime #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 5836934
### Fourier features
parameters["fourier_features_to_use"] = None
parameters["fourier_mode"] = None
parameters["fourier_real"] = None
parameters["fourier_normalize"]= None
parameters["fourier_fillna"] = None

output_file = SSH1_2ND_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True);





The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload




CPU times: user 4 s, sys: 55.9 ms, total: 4.06 s
Wall time: 2min 37s
CPU times: user 4.01 s, sys: 55.9 ms, total: 4.06 s
Wall time: 2min 37s



#### THIRD scenario: DCT of EVEN-symmetric REAL signal (51 features) 

The THIRD scenario uses as features the DCT computed under the assumption that the real-space vectors are REAL and EVEN-symmetric around 0 and $\frac{N}{2}$. The wavevector-space components are computed from the FIRST HALF of the real-space lattice sites, assuming EVEN symmetry as the DCT does. The number of resulting features is $\frac{N}{2}+1 = 51$.
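
The feature count follows from the even extension: a real signal on $N$ sites that is symmetric around 0 and $\frac{N}{2}$ is fully determined by sites $0,\dots,\frac{N}{2}$, and its DFT is real and equals a type-I DCT of those $\frac{N}{2}+1$ samples. A dependency-free sketch that checks this numerically; it assumes the template's `"dct"` mode corresponds to an unnormalised type-I DCT, which this notebook does not confirm, and the sample values are arbitrary:

```python
import cmath
import math


def dct_type1(half):
    """Unnormalised type-I DCT of the M + 1 samples x_0..x_M."""
    M = len(half) - 1
    return [
        half[0]
        + (-1) ** k * half[M]
        + 2 * sum(half[n] * math.cos(math.pi * k * n / M) for n in range(1, M))
        for k in range(M + 1)
    ]


def dft(x):
    """Plain O(N^2) discrete Fourier transform, used only as a reference."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


N = 100
# Arbitrary real samples standing in for lattice sites 0..N/2 (assumption).
half = [math.sin(0.3 * n) + 0.1 * n for n in range(N // 2 + 1)]
# Even extension to N sites: full[n] == full[N - n].
full = half + half[-2:0:-1]

features = dct_type1(half)
spectrum = dft(full)

print(len(full), len(features))
# The DFT of the extended signal is real and matches the DCT on modes 0..N/2.
max_diff = max(abs(features[k] - spectrum[k].real) for k in range(N // 2 + 1))
max_imag = max(abs(c.imag) for c in spectrum)
print(max_diff < 1e-8, max_imag < 1e-8)
```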

In [3]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_3RD_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 2462394756

### Fourier features
parameters["fourier_features_to_use"] = None#list(range(0,51,1))
parameters["fourier_mode"] = "dct"
parameters["fourier_real"] = None
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH1_3RD_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload




CPU times: user 4.96 s, sys: 96.9 ms, total: 5.06 s
Wall time: 17min 33s
CPU times: user 4.97 s, sys: 97 ms, total: 5.06 s
Wall time: 17min 33s



#### FOURTH scenario: DCT of EVEN-symmetric REAL signal, using only sites $S_1$ in FIRST HALF real space and eigenmodes $K_1$ in wavevector space

In [27]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

import json
import os

# Load the DCT feature ranking written by the THIRD scenario.
json_dir = os.path.join(SSH1_PERIODIC_3RD_SCENARIO_100_6561_SIMULATION_DIR,"feature_importances")
filename = os.path.join(json_dir,"sorted_feature_importance.json")
with open(filename) as f:
    json_data = json.load(f)
# JSON keys are strings: convert them to integer eigenmode indices, keeping the file order.
wavevector_feature_importances = [int(k) for k in json_data]
print("wavevector_feature_importances")
print(wavevector_feature_importances)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
wavevector_feature_importances
[2, 1, 49, 36, 48, 14, 30, 13, 37, 38, 16, 0, 34, 15, 12, 32, 18, 20, 3, 40, 10, 35, 31, 19, 39, 11, 50, 8, 24, 47, 28, 22, 17, 46, 26, 9, 7, 42, 4, 41, 33, 29, 23, 27, 43, 21, 5, 6, 44, 45, 25]
CPU times: user 1.29 ms, sys: 28 µs, total: 1.32 ms
Wall time: 848 µs
CPU times: user 2.69 ms, sys: 58 µs, total: 2.75 ms
Wall time: 2.26 ms
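
Unlike the lattice sites, the wavevector eigenmodes are not restricted to one half of the index range: the following cells simply take the leading entries of this ranking and sort them. A short sketch of that step; `top_modes` is a hypothetical helper name:

```python
def top_modes(ranking, n_modes):
    """Return the n_modes best-ranked eigenmode indices in ascending order."""
    return sorted(ranking[:n_modes])


# First entries of the DCT-feature ranking printed above (truncated for brevity).
wavevector_ranking = [2, 1, 49, 36, 48, 14, 30, 13, 37, 38]

print(top_modes(wavevector_ranking, 6))  # [1, 2, 14, 36, 48, 49]
print(top_modes(wavevector_ranking, 4))  # [1, 2, 36, 49]
```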


In [28]:
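### Collecting lattice sites
# Keep the N_sites best-ranked sites from the first half of the lattice (index <= N_half).
# `feature_importances` is the real-space ranking loaded from the ZEROTH scenario.

N_half = 50
N_sites = 6

S_1 = []
for s in feature_importances:
    if s <= N_half:
        S_1.append(s)
    if len(S_1) == N_sites:
        break
S_1 = sorted(S_1)

### Collecting wavevector eigenmodes
# Keep the N_sites best-ranked DCT eigenmodes; no first-half restriction is applied here.
# `wavevector_feature_importances` must hold the THIRD scenario (DCT) ranking loaded above.

K_1 = []
for s in wavevector_feature_importances:
    K_1.append(s)
    if len(K_1) == N_sites:
        break
K_1 = sorted(K_1)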

parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_4TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_1#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 253638464

### Fourier features
parameters["fourier_features_to_use"] = K_1
parameters["fourier_mode"] = "dct"
parameters["fourier_real"] = None
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH1_4TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)


#### FIFTH scenario: DCT of EVEN-symmetric REAL signal, using only sites $S'_1$ in FIRST HALF real space and eigenmodes $K'_1$ in wavevector space


In [35]:
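### Collecting lattice sites
# Keep the N_sites best-ranked sites from the first half of the lattice (index <= N_half).
# `feature_importances` is the real-space ranking loaded from the ZEROTH scenario.

N_half = 50
N_sites = 4

S_1_prime = []
for s in feature_importances:
    if s <= N_half:
        S_1_prime.append(s)
    if len(S_1_prime) == N_sites:
        break
S_1_prime = sorted(S_1_prime)

### Collecting wavevector eigenmodes
# Keep the N_sites best-ranked DCT eigenmodes; no first-half restriction is applied here.
# `wavevector_feature_importances` must hold the THIRD scenario (DCT) ranking when this cell runs.

K_1_prime = []
for s in wavevector_feature_importances:
    K_1_prime.append(s)
    if len(K_1_prime) == N_sites:
        break
K_1_prime = sorted(K_1_prime)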

parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_5TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_1_prime #[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 1846483

### Fourier features
parameters["fourier_features_to_use"] = K_1_prime
parameters["fourier_mode"] = "dct"
parameters["fourier_real"] = None
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH1_5TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)


#### SIXTH scenario: DST of ODD-symmetric REAL signal (49 features)
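
The 49 features can be understood from the odd extension: a real signal on $N = 100$ sites that is antisymmetric around 0 and $\frac{N}{2}$ vanishes at those two sites, so only sites $1,\dots,\frac{N}{2}-1$ are free, and its DFT is purely imaginary. A dependency-free sketch checking that a type-I DST of the 49 interior samples reproduces minus that imaginary part; mapping the template's `"dst"` mode and `fourier_real = "imag"` setting onto this construction is an assumption, and the sample values are arbitrary:

```python
import cmath
import math


def dst_type1(interior):
    """Unnormalised type-I DST of the M interior samples x_1..x_M."""
    M = len(interior)
    return [
        2 * sum(
            interior[n - 1] * math.sin(math.pi * k * n / (M + 1))
            for n in range(1, M + 1)
        )
        for k in range(1, M + 1)
    ]


def dft(x):
    """Plain O(N^2) discrete Fourier transform, used only as a reference."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


N = 100
# Arbitrary real samples standing in for lattice sites 1..N/2-1 (assumption).
interior = [math.cos(0.7 * n) + 0.05 * n for n in range(1, N // 2)]
# Odd extension to N sites: full[0] == full[N/2] == 0 and full[n] == -full[N - n].
full = [0.0] + interior + [0.0] + [-v for v in reversed(interior)]

features = dst_type1(interior)
spectrum = dft(full)

print(len(full), len(features))
# Mode k of the DST equals minus the imaginary part of DFT mode k, for k = 1..N/2-1.
max_diff = max(abs(features[k - 1] + spectrum[k].imag) for k in range(1, N // 2))
max_real = max(abs(c.real) for c in spectrum)
print(max_diff < 1e-8, max_real < 1e-8)
```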


In [4]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_6TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 324398476

### Fourier features
parameters["fourier_features_to_use"] = None
parameters["fourier_mode"] = "dst"
parameters["fourier_real"] = "imag"
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH1_6TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload




CPU times: user 5.26 s, sys: 82.1 ms, total: 5.34 s
Wall time: 24min 38s
CPU times: user 5.26 s, sys: 82.2 ms, total: 5.35 s
Wall time: 24min 38s



#### SEVENTH scenario: DST of ODD-symmetric REAL signal, using only sites $S_1$ in FIRST HALF real space and eigenmodes $K_1$ in wavevector space

In [29]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

import json
import os

# Load the DST feature ranking written by the SIXTH scenario.
json_dir = os.path.join(SSH1_PERIODIC_6TH_SCENARIO_100_6561_SIMULATION_DIR,"feature_importances")
filename = os.path.join(json_dir,"sorted_feature_importance.json")
with open(filename) as f:
    json_data = json.load(f)
# JSON keys are strings: convert them to integer eigenmode indices, keeping the file order.
wavevector_feature_importances = [int(k) for k in json_data]
print("wavevector_feature_importances")
print(wavevector_feature_importances)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
wavevector_feature_importances
[30, 18, 15, 28, 19, 35, 0, 33, 31, 1, 39, 17, 41, 9, 13, 29, 37, 11, 10, 7, 16, 22, 26, 47, 43, 5, 21, 20, 40, 14, 32, 23, 12, 3, 42, 25, 45, 36, 8, 27, 48, 2, 34, 6, 44, 38, 46, 4, 24]
CPU times: user 603 µs, sys: 13 µs, total: 616 µs
Wall time: 388 µs
CPU times: user 1.91 ms, sys: 40 µs, total: 1.95 ms
Wall time: 1.64 ms


In [30]:
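### Collecting lattice sites
# Keep the N_sites best-ranked sites from the first half of the lattice (index <= N_half).
# `feature_importances` is the real-space ranking loaded from the ZEROTH scenario.

N_half = 50
N_sites = 6

S_1 = []
for s in feature_importances:
    if s <= N_half:
        S_1.append(s)
    if len(S_1) == N_sites:
        break
S_1 = sorted(S_1)

### Collecting wavevector eigenmodes
# Keep the N_sites best-ranked DST eigenmodes; no first-half restriction is applied here.
# `wavevector_feature_importances` must hold the SIXTH scenario (DST) ranking loaded above.
K_1 = []
for s in wavevector_feature_importances:
    K_1.append(s)
    if len(K_1) == N_sites:
        break
K_1 = sorted(K_1)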

parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_7TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_1#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 34896

### Fourier features
parameters["fourier_features_to_use"] = K_1
parameters["fourier_mode"] = "dst"
parameters["fourier_real"] = "imag"
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH1_7TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)


#### EIGHTH scenario: DST of ODD-symmetric REAL signal, using only sites $S'_1$ in FIRST HALF real space and $K'_1$ in wavevector space

In [34]:
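### Collecting lattice sites
# Keep the N_sites best-ranked sites from the first half of the lattice (index <= N_half).
# `feature_importances` is the real-space ranking loaded from the ZEROTH scenario.
N_half = 50
N_sites = 4
S_1_prime = []
for s in feature_importances:
    if s <= N_half:
        S_1_prime.append(s)
    if len(S_1_prime) == N_sites:
        break
S_1_prime = sorted(S_1_prime)

### Collecting wavevector eigenmodes
# Keep the N_sites best-ranked DST eigenmodes; no first-half restriction is applied here.
# `wavevector_feature_importances` must hold the SIXTH scenario (DST) ranking when this cell runs.

K_1_prime = []
for s in wavevector_feature_importances:
    K_1_prime.append(s)
    if len(K_1_prime) == N_sites:
        break
K_1_prime = sorted(K_1_prime)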

parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_8TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_1_prime #[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 4303546

### Fourier features
parameters["fourier_features_to_use"] = K_1_prime
parameters["fourier_mode"] = "dst"
parameters["fourier_real"] = "imag"
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH1_8TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)


## SSH2

In [1]:
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

template = TEMPLATE_NOTEBOOK
parameters = {
    "model_kw": {"criterion": "entropy", "n_estimators": 25, "n_jobs": -1},
    "allowed_windings": [-1, 0, 1, 2], "val_split": 0.5, "features_to_use": None, "shuffle_features": False,
    "n_experiments": 100, "start_n": 0, "fit_params": None, "shuffle_rows": True, "pred_params": None,
    "random_features": False, "store_in_lists": False, "save_eigenvector": True,
    "save_hamiltonian": True, "save_accuracy": True, "save_models": True,
}
kernel_name = KERNEL_NAME

CPU times: user 736 ms, sys: 249 ms, total: 985 ms
Wall time: 588 ms


### periodic_100_6561


#### ZEROTH scenario: SSH2 Real space lattice sites 

In [None]:
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

### parameters
parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_0TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 83649369
### Fourier features
parameters["fourier_features_to_use"] = None
parameters["fourier_mode"] = None
parameters["fourier_real"] = None
parameters["fourier_normalize"]= None
parameters["fourier_fillna"] = None

output_file = SSH2_0TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True);


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload



#### FIRST scenario: Best real space lattice sites in $S_2$

In the first scenario, we run simulations using the best real-space lattice sites, $S_2 = [0, 1, 2, 3, 4, 5, 6, 7, 46, 48, 49, 50]$, as determined by the information entropy signatures.


In [None]:
import json
import os

# Load the feature ranking written by the ZEROTH scenario.
json_dir = os.path.join(SSH2_PERIODIC_0TH_SCENARIO_100_6561_SIMULATION_DIR,"feature_importances")
filename = os.path.join(json_dir,"sorted_feature_importance.json")
with open(filename) as f:
    json_data = json.load(f)
# Keep (site, importance) pairs: the next cell unpacks each entry as `s, _`.
feature_importances = [(int(k), v) for k, v in json_data.items()]
print("feature_importances")
print(feature_importances)

In [12]:
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

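### Collecting lattice sites
# Keep the N_sites best-ranked sites from the first half of the lattice (index <= N_half).
# Each ranking entry is unpacked as a (site, importance) pair.

N_half = 50
N_sites = 25

S_2 = []
for s, _ in feature_importances:
    if s <= N_half:
        S_2.append(s)
    if len(S_2) == N_sites:
        break
S_2 = sorted(S_2)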

### parameters
parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_1ST_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_2 #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 244854
### Fourier features
parameters["fourier_features_to_use"] = None
parameters["fourier_mode"] = None
parameters["fourier_real"] = None
parameters["fourier_normalize"]= None
parameters["fourier_fillna"] = None

output_file = SSH2_1ST_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)





The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload




CPU times: user 5.66 s, sys: 138 ms, total: 5.8 s
Wall time: 38min 43s
CPU times: user 5.67 s, sys: 138 ms, total: 5.81 s
Wall time: 38min 43s



#### SECOND scenario: Best real space lattice sites in $S'_2$ 

In the second scenario we run simulations using the best real space lattice sites $S'_2 = [0, 1, 2, 3, 4, 5]$, as determined by the entropy signatures.
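The site selection used in this section can be sketched as a small helper. The function name `top_sites` and its argument names are hypothetical. The data are the leading entries of the importance list from this section, rounded to four decimals, and the list is assumed to be sorted by decreasing importance.

```python
def top_sites(importances, n_half, n_sites):
    """Return the n_sites most important indices that are <= n_half, sorted."""
    kept = []
    for site, _ in importances:  # assumed sorted by decreasing importance
        if site <= n_half:
            kept.append(site)
        if len(kept) == n_sites:
            break
    return sorted(kept)

# Leading entries of the importance list used in this section (rounded).
leading = [(98, 0.0308), (1, 0.0307), (0, 0.0268), (99, 0.0262), (96, 0.0209),
           (3, 0.0205), (2, 0.0199), (97, 0.0195), (94, 0.0192), (5, 0.0190),
           (4, 0.0173), (95, 0.0170), (48, 0.0154), (51, 0.0149), (50, 0.0139),
           (92, 0.0137), (7, 0.0135), (49, 0.0134), (6, 0.0129), (93, 0.0128),
           (46, 0.0123)]

six = top_sites(leading, n_half=50, n_sites=6)
twelve = top_sites(leading, n_half=50, n_sites=12)
print(six)
print(twelve)
```

With `n_sites=6` the helper returns the six sites quoted above. With `n_sites=12` it additionally picks up sites 6 and 7 and four sites near the middle of the lattice (46, 48, 49, 50).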

In [13]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

### Collecting lattice sites
feature_importances = [(98, 0.030775680224168177), (1, 0.03069088538607333), (0, 0.026794700674389625), (99, 0.026249868968330863),
 (96, 0.020897799386158718), (3, 0.020489140983260747), (2, 0.01992793686700233), (97, 0.019545673056848806),
 (94, 0.01923075195671657), (5, 0.019045271320840478), (4, 0.017291096293802714), (95, 0.017048847481531043),
 (48, 0.015353109828878693), (51, 0.014934669606650726), (50, 0.013899164937009753), (92, 0.01371775078111256),
 (7, 0.013462982352684496), (49, 0.013421274946371955), (6, 0.012883612083351053), (93, 0.012769882886851382),
 (46, 0.012337897368774504), (53, 0.01212310608565037), (90, 0.010997653223096683), (9, 0.010834481547638681),
 (52, 0.010812653542591071), (47, 0.010655015434978536), (44, 0.010465047210191402), (55, 0.010281045213583478),
 (8, 0.009828027039565111), (91, 0.00959694782849967), (10, 0.009302976753229216), (88, 0.009278280526791806),
 (89, 0.00919392033505742), (11, 0.009179509784012994), (54, 0.009047708913725763), (45, 0.008918866304158874),
 (42, 0.008529473018171781), (57, 0.008364823778403564), (86, 0.008246572727366506), (56, 0.00821180818167621),
 (13, 0.00816970905230619), (12, 0.008114276523127217), (43, 0.008081824889517098), (87, 0.00794484045520985),
 (40, 0.007701200518589432), (59, 0.007612698497346249), (58, 0.0075668053024482256), (84, 0.007536839705708784),
 (14, 0.00747972131229634), (85, 0.007477795352594296), (41, 0.007466758344364884), (15, 0.007434672006973498),
 (16, 0.007302080712545534), (83, 0.007225210080693078), (38, 0.007221329558484434), (25, 0.007184232143171722),
 (61, 0.007168350645728535), (67, 0.007152534098465049), (34, 0.007124371893597504), (65, 0.007121552553277619),
 (32, 0.0071142212435522665), (60, 0.007104063693545338), (75, 0.007080742078305485), (82, 0.007060642189080975),
 (27, 0.007037596851734687), (73, 0.007036728229713947), (66, 0.007012846480636373), (39, 0.0070128347301864555),
 (62, 0.007012131745055074), (74, 0.007009007463465764), (72, 0.006999191033856658), (33, 0.00699226356888025),
 (37, 0.006972519992091061), (18, 0.006956490676749397), (17, 0.006945800270561406), (24, 0.0069234690872524915),
 (26, 0.006922955676325168), (22, 0.006918202675106495), (20, 0.006916923010632973), (64, 0.006914878448676203),
 (81, 0.00691250007864052), (77, 0.006906017355299236), (36, 0.006904235174864243), (79, 0.006892372418539275),
 (35, 0.006880046330414811), (63, 0.006861090174322749), (29, 0.006847900311648197), (68, 0.006802621205071987), 
 (31, 0.006783038952794761), (70, 0.006769967143038218), (78, 0.006723531131868591), (30, 0.006719894872152803),
 (69, 0.006715314101826717), (21, 0.0067062765156384666), (23, 0.006688320823666467), (28, 0.006683088644608946),
 (80, 0.006666799865989856), (71, 0.006626751430553458), (19, 0.006624407453970405), (76, 0.006619598384038603)]

N_half = 50
N_sites = 12

S_2_prime=[]
for s,_ in feature_importances:
    if s<=N_half:
        S_2_prime.append(s)
    if len(S_2_prime)==N_sites:
        break
S_2_prime = sorted(S_2_prime)

### parameters
parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_2ND_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_2_prime #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 93474
### Fourier features
parameters["fourier_features_to_use"] = None
parameters["fourier_mode"] = None
parameters["fourier_real"] = None
parameters["fourier_normalize"]= None
parameters["fourier_fillna"] = None

output_file = SSH2_2ND_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)





The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


CPU times: user 4.9 s, sys: 165 ms, total: 5.06 s
Wall time: 25min 8s
CPU times: user 4.91 s, sys: 166 ms, total: 5.07 s
Wall time: 25min 8s



#### THIRD scenario: DFT of REAL signal, computed from ALL lattice sites (51 features)

The third scenario uses as features the REAL part of the first HALF of the DFT components of the real space eigenvectors, computed from ALL real space lattice sites. This yields $\frac{N}{2}+1 = 51$ engineered features. Note that in this case ALL $N=100$ real space features are used to compute the DFT.
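As a rough illustration of this feature construction, the sketch below takes the real part of DFT components $k = 0, \dots, N/2$ of a length-100 vector. The input signal is a made-up cosine, not an eigenvector from the dataset, and the naive `dft` function is a stand-in (it follows the same sign convention as `numpy.fft.fft`). How the template implements `fourier_mode = "dft"` is not shown in this notebook, so this is an assumption about what that mode computes.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform with the exp(-2*pi*i*k*j/N) convention."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]

N = 100
# Hypothetical real-space vector: a real cosine with wavenumber 3.
x = [math.cos(2 * math.pi * 3 * j / N) for j in range(N)]

X = dft(x)
# Real part of the first half of the spectrum, k = 0..N/2.
features = [X[k].real for k in range(N // 2 + 1)]
print(len(features))
```

All 100 real space values enter every component, while only 51 components are kept as features.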

In [14]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_3RD_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 467

### Fourier features
parameters["fourier_features_to_use"] = list(range(0,51,1))
parameters["fourier_mode"] = "dft"
parameters["fourier_real"] = "real"
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH2_3RD_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


CPU times: user 5.55 s, sys: 171 ms, total: 5.72 s
Wall time: 44min 41s
CPU times: user 5.55 s, sys: 171 ms, total: 5.72 s
Wall time: 44min 41s



#### FOURTH scenario: DCT of EVEN-symmetric REAL signal (51 features) 

The fourth scenario uses as features the DCT computed assuming that the real space vectors are REAL and EVEN-symmetric around 0 and $\frac{N}{2}$. The wavevector space components are computed using all EVEN and ODD real lattice sites from the first half of the real space lattice. The number of resulting features is $\frac{N}{2}+1 = 51$.
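Even symmetry around both 0 and $\frac{N}{2}$ is the symmetry of the type-I DCT. The sketch below checks numerically that, for such a signal, an unnormalised type-I DCT of sites $0, \dots, \frac{N}{2}$ reproduces the real part of the full $N$-point DFT. The test signal and the helper names are made up, and whether the template's `"dct"` mode uses exactly this type and normalisation is an assumption.

```python
import cmath
import math

N = 100
M = N // 2  # mirror point N/2

# Hypothetical first half of a real-space vector, sites 0..N/2 (51 values).
half = [math.sin(0.37 * j) + 0.1 * j for j in range(M + 1)]
# Even-symmetric extension to N sites: x[N - j] = x[j].
x = half + half[-2:0:-1]

def dft_real(x, k):
    """Real part of component k of the N-point DFT."""
    n = len(x)
    return sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)).real

def dct1(h, k):
    """Unnormalised type-I DCT of the samples h[0..m]."""
    m = len(h) - 1
    inner = sum(h[j] * math.cos(math.pi * k * j / m) for j in range(1, m))
    return h[0] + (-1) ** k * h[m] + 2 * inner

# Largest difference between the two transforms over k = 0..N/2.
max_diff = max(abs(dft_real(x, k) - dct1(half, k)) for k in range(M + 1))
print(len(x), max_diff)
```

The difference stays at floating-point rounding level, so under this symmetry assumption the 51 sites of the first half carry the same information as the full lattice.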

In [15]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_4TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 548936312

### Fourier features
parameters["fourier_features_to_use"] = list(range(0,51,1))
parameters["fourier_mode"] = "dct"
parameters["fourier_real"] = None
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH2_4TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


CPU times: user 7.26 s, sys: 275 ms, total: 7.53 s
Wall time: 1h 4min 42s
CPU times: user 7.26 s, sys: 275 ms, total: 7.53 s
Wall time: 1h 4min 42s



#### FIFTH scenario: DCT of EVEN-symmetric REAL signal, using only EVEN components of DCT (26 features)

The fifth scenario uses as features only the EVEN components of the DCT computed assuming that the real space vectors are REAL and EVEN-symmetric around 0 and $\frac{N}{2}$. The wavevector space components are computed using all EVEN and ODD real lattice sites from the first half of the real space lattice. The resulting number of features is $\frac{N}{4}+1=26$.
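The feature count can be checked directly from the index ranges passed as `fourier_features_to_use` in the fourth and fifth scenarios. The variable names below are illustrative only.

```python
N = 100

# k = 0..N/2, the component range used in the fourth scenario.
all_components = list(range(0, N // 2 + 1, 1))
# k = 0, 2, ..., N/2, the even components used in the fifth scenario.
even_components = list(range(0, N // 2 + 1, 2))

print(len(all_components), len(even_components))
```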

In [16]:
%%time
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_5TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 34896

### Fourier features
parameters["fourier_features_to_use"] = list(range(0,51,2))
parameters["fourier_mode"] = "dct"
parameters["fourier_real"] = None
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH2_5TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


CPU times: user 5.43 s, sys: 177 ms, total: 5.6 s
Wall time: 45min 19s
CPU times: user 5.43 s, sys: 177 ms, total: 5.61 s
Wall time: 45min 19s



#### SIXTH scenario: DCT of EVEN-symmetric REAL signal, using only eigenmodes $K_2$ in wavevector space

The sixth scenario uses as features only the $K_2 = (0, 1, 2, 3, 4, 5, 6, 8, 42, 44, 46, 48)$ components of the DCT computed assuming that the real space vectors are REAL and EVEN-symmetric around 0 and $\frac{N}{2}$. The wavevector space components are computed using all EVEN and ODD real lattice sites from the first half of the real space lattice. The resulting number of features is $\sim\frac{N}{8} \approx 12$.
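The wavevector importance list in this section assigns identical values to mirror pairs such as components 2 and 98. One possible reading, offered here as an assumption rather than something the notebook states, is that the real part of the DFT of a real signal satisfies $\mathrm{Re}\,X_k = \mathrm{Re}\,X_{N-k}$, so components above $\frac{N}{2}$ duplicate those below it and only $k \le \frac{N}{2}$ needs to be kept. A minimal numerical check on a made-up real signal:

```python
import cmath
import math

N = 100
# Hypothetical real-valued signal standing in for a real-space eigenvector.
x = [math.sin(0.21 * j) + 0.5 * math.cos(1.3 * j) for j in range(N)]

def dft_real_part(x, k):
    """Real part of component k of the N-point DFT."""
    n = len(x)
    return sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)).real

# Largest mismatch between Re X[k] and Re X[N - k] over k = 1..N/2.
mirror_gap = max(abs(dft_real_part(x, k) - dft_real_part(x, N - k))
                 for k in range(1, N // 2 + 1))
print(mirror_gap)
```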

In [17]:
### Collecting wavevector lattice sites
wave_vector_feature_importances = [(0, 0.950806830424763), (2, 0.01331131304222723), (98, 0.01331131304222723), (1, 0.004476201323265753),
 (99, 0.004476201323265753), (4, 0.0035857537589848837), (96, 0.0035857537589848837), (3, 0.001257935893002076),
 (97, 0.001257935893002076), (6, 0.0007069831412851035), (94, 0.0007069831412851035), (46, 0.0002467272079133921),
 (54, 0.0002467272079133921), (8, 0.00018241203787459185), (92, 0.00018241203787459185), (48, 0.0001451951516648948),
 (52, 0.0001451951516648948), (44, 9.843040009282692e-05), (56, 9.843040009282692e-05), (5, 9.19011466551899e-05),
 (95, 9.19011466551899e-05), (42, 6.866192435836294e-05), (58, 6.866192435836294e-05), (18, 3.3323093081472305e-05),
 (82, 3.3323093081472305e-05), (12, 3.106487312033803e-05), (88, 3.106487312033803e-05), (47, 2.3816419839103435e-05),
 (53, 2.3816419839103435e-05), (34, 2.3241123361927566e-05), (66, 2.3241123361927566e-05), (16, 2.2323627288064983e-05),
 (84, 2.2323627288064983e-05), (32, 2.2124235763368948e-05), (68, 2.2124235763368948e-05), (30, 2.10288772451345e-05),
 (70, 2.10288772451345e-05), (10, 2.090529400303802e-05), (90, 2.090529400303802e-05), (38, 1.9410706597490878e-05), 
 (62, 1.9410706597490878e-05), (36, 1.875724022079967e-05), (64, 1.875724022079967e-05), (26, 1.825645649535029e-05),
 (74, 1.825645649535029e-05), (28, 1.7862830398443465e-05), (72, 1.7862830398443465e-05), (40, 1.6194565946892457e-05),
 (60, 1.6194565946892457e-05), (33, 1.5392780128594145e-05), (67, 1.5392780128594145e-05), (19, 1.4185213987116413e-05),
 (81, 1.4185213987116413e-05), (7, 1.356204999924156e-05), (93, 1.356204999924156e-05), (35, 1.1170423908057128e-05),
 (65, 1.1170423908057128e-05), (14, 1.112046577249778e-05), (86, 1.112046577249778e-05), (20, 9.752018202687417e-06),
 (80, 9.752018202687417e-06), (15, 9.367245953490434e-06), (85, 9.367245953490434e-06), (31, 8.822300302292681e-06),
 (69, 8.822300302292681e-06), (25, 6.643388933308398e-06), (75, 6.643388933308398e-06), (17, 6.609081046562858e-06),
 (83, 6.609081046562858e-06), (45, 5.544693390346116e-06), (55, 5.544693390346116e-06), (50, 5.207356471316938e-06),
 (29, 4.884862846567487e-06), (71, 4.884862846567487e-06), (27, 4.410013160570784e-06), (73, 4.410013160570784e-06),
 (24, 3.959576099120265e-06), (76, 3.959576099120265e-06), (49, 3.3599159617141762e-06), (51, 3.3599159617141762e-06),
 (9, 2.5089373221024502e-06), (91, 2.5089373221024502e-06), (21, 1.5004285162157062e-06), (79, 1.5004285162157062e-06),
 (43, 5.907772852569324e-07), (57, 5.907772852569324e-07), (37, 2.924375698681946e-07), (63, 2.924375698681946e-07),
 (23, 1.629456807410269e-07), (77, 1.629456807410269e-07), (41, 1.37098461515355e-07), (59, 1.37098461515355e-07),
 (11, 1.1995854099374903e-07), (89, 1.1995854099374903e-07), (22, 4.723512969971105e-08), (78, 4.723512969971105e-08),
 (39, 8.485226090679877e-09), (61, 8.485226090679877e-09), (13, 4.405272447523249e-09), (87, 4.405272447523249e-09)]

N_half = 50
N_sites = 25

K_2=[]
for s,_ in wave_vector_feature_importances:
    if s<=N_half:
        K_2.append(s)
    if len(K_2)==N_sites:
        break
K_2 = sorted(K_2)



parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_6TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 34896

### Fourier features
parameters["fourier_features_to_use"] = K_2
parameters["fourier_mode"] = "dct"
parameters["fourier_real"] = None
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH2_6TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)


#### SEVENTH scenario: DCT of EVEN-symmetric REAL signal, using only eigenmodes $K'_2$ in wavevector space

The seventh scenario uses as features only the $K'_2 = (0, 1, 2, 3, 4, 5)$ components of the DCT computed assuming that the real space vectors are REAL and EVEN-symmetric around 0 and $\frac{N}{2}$. The wavevector space components are computed using all EVEN and ODD real lattice sites from the first half of the real space lattice. The resulting number of features is $\sim\frac{N}{16} \approx 6$.

In [18]:
### Collecting wavevector lattice sites
wave_vector_feature_importances = [(0, 0.950806830424763), (2, 0.01331131304222723), (98, 0.01331131304222723), (1, 0.004476201323265753),
 (99, 0.004476201323265753), (4, 0.0035857537589848837), (96, 0.0035857537589848837), (3, 0.001257935893002076),
 (97, 0.001257935893002076), (6, 0.0007069831412851035), (94, 0.0007069831412851035), (46, 0.0002467272079133921),
 (54, 0.0002467272079133921), (8, 0.00018241203787459185), (92, 0.00018241203787459185), (48, 0.0001451951516648948),
 (52, 0.0001451951516648948), (44, 9.843040009282692e-05), (56, 9.843040009282692e-05), (5, 9.19011466551899e-05),
 (95, 9.19011466551899e-05), (42, 6.866192435836294e-05), (58, 6.866192435836294e-05), (18, 3.3323093081472305e-05),
 (82, 3.3323093081472305e-05), (12, 3.106487312033803e-05), (88, 3.106487312033803e-05), (47, 2.3816419839103435e-05),
 (53, 2.3816419839103435e-05), (34, 2.3241123361927566e-05), (66, 2.3241123361927566e-05), (16, 2.2323627288064983e-05),
 (84, 2.2323627288064983e-05), (32, 2.2124235763368948e-05), (68, 2.2124235763368948e-05), (30, 2.10288772451345e-05),
 (70, 2.10288772451345e-05), (10, 2.090529400303802e-05), (90, 2.090529400303802e-05), (38, 1.9410706597490878e-05), 
 (62, 1.9410706597490878e-05), (36, 1.875724022079967e-05), (64, 1.875724022079967e-05), (26, 1.825645649535029e-05),
 (74, 1.825645649535029e-05), (28, 1.7862830398443465e-05), (72, 1.7862830398443465e-05), (40, 1.6194565946892457e-05),
 (60, 1.6194565946892457e-05), (33, 1.5392780128594145e-05), (67, 1.5392780128594145e-05), (19, 1.4185213987116413e-05),
 (81, 1.4185213987116413e-05), (7, 1.356204999924156e-05), (93, 1.356204999924156e-05), (35, 1.1170423908057128e-05),
 (65, 1.1170423908057128e-05), (14, 1.112046577249778e-05), (86, 1.112046577249778e-05), (20, 9.752018202687417e-06),
 (80, 9.752018202687417e-06), (15, 9.367245953490434e-06), (85, 9.367245953490434e-06), (31, 8.822300302292681e-06),
 (69, 8.822300302292681e-06), (25, 6.643388933308398e-06), (75, 6.643388933308398e-06), (17, 6.609081046562858e-06),
 (83, 6.609081046562858e-06), (45, 5.544693390346116e-06), (55, 5.544693390346116e-06), (50, 5.207356471316938e-06),
 (29, 4.884862846567487e-06), (71, 4.884862846567487e-06), (27, 4.410013160570784e-06), (73, 4.410013160570784e-06),
 (24, 3.959576099120265e-06), (76, 3.959576099120265e-06), (49, 3.3599159617141762e-06), (51, 3.3599159617141762e-06),
 (9, 2.5089373221024502e-06), (91, 2.5089373221024502e-06), (21, 1.5004285162157062e-06), (79, 1.5004285162157062e-06),
 (43, 5.907772852569324e-07), (57, 5.907772852569324e-07), (37, 2.924375698681946e-07), (63, 2.924375698681946e-07),
 (23, 1.629456807410269e-07), (77, 1.629456807410269e-07), (41, 1.37098461515355e-07), (59, 1.37098461515355e-07),
 (11, 1.1995854099374903e-07), (89, 1.1995854099374903e-07), (22, 4.723512969971105e-08), (78, 4.723512969971105e-08),
 (39, 8.485226090679877e-09), (61, 8.485226090679877e-09), (13, 4.405272447523249e-09), (87, 4.405272447523249e-09)]

N_half = 50
N_sites = 12

K_2_prime=[]
for s,_ in wave_vector_feature_importances:
    if s<=N_half:
        K_2_prime.append(s)
    if len(K_2_prime)==N_sites:
        break
K_2_prime = sorted(K_2_prime)


parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_7TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 34896

### Fourier features
parameters["fourier_features_to_use"] = K_2_prime
parameters["fourier_mode"] = "dct"
parameters["fourier_real"] = None
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH2_7TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)


#### EIGHTH scenario: DCT of EVEN-symmetric REAL signal, using only sites $S_2$ in real space and eigenmodes $K_2$ in wavevector space

In [19]:
### Collecting lattice sites
feature_importances = [(98, 0.030775680224168177), (1, 0.03069088538607333), (0, 0.026794700674389625), (99, 0.026249868968330863),
 (96, 0.020897799386158718), (3, 0.020489140983260747), (2, 0.01992793686700233), (97, 0.019545673056848806),
 (94, 0.01923075195671657), (5, 0.019045271320840478), (4, 0.017291096293802714), (95, 0.017048847481531043),
 (48, 0.015353109828878693), (51, 0.014934669606650726), (50, 0.013899164937009753), (92, 0.01371775078111256),
 (7, 0.013462982352684496), (49, 0.013421274946371955), (6, 0.012883612083351053), (93, 0.012769882886851382),
 (46, 0.012337897368774504), (53, 0.01212310608565037), (90, 0.010997653223096683), (9, 0.010834481547638681),
 (52, 0.010812653542591071), (47, 0.010655015434978536), (44, 0.010465047210191402), (55, 0.010281045213583478),
 (8, 0.009828027039565111), (91, 0.00959694782849967), (10, 0.009302976753229216), (88, 0.009278280526791806),
 (89, 0.00919392033505742), (11, 0.009179509784012994), (54, 0.009047708913725763), (45, 0.008918866304158874),
 (42, 0.008529473018171781), (57, 0.008364823778403564), (86, 0.008246572727366506), (56, 0.00821180818167621),
 (13, 0.00816970905230619), (12, 0.008114276523127217), (43, 0.008081824889517098), (87, 0.00794484045520985),
 (40, 0.007701200518589432), (59, 0.007612698497346249), (58, 0.0075668053024482256), (84, 0.007536839705708784),
 (14, 0.00747972131229634), (85, 0.007477795352594296), (41, 0.007466758344364884), (15, 0.007434672006973498),
 (16, 0.007302080712545534), (83, 0.007225210080693078), (38, 0.007221329558484434), (25, 0.007184232143171722),
 (61, 0.007168350645728535), (67, 0.007152534098465049), (34, 0.007124371893597504), (65, 0.007121552553277619),
 (32, 0.0071142212435522665), (60, 0.007104063693545338), (75, 0.007080742078305485), (82, 0.007060642189080975),
 (27, 0.007037596851734687), (73, 0.007036728229713947), (66, 0.007012846480636373), (39, 0.0070128347301864555),
 (62, 0.007012131745055074), (74, 0.007009007463465764), (72, 0.006999191033856658), (33, 0.00699226356888025),
 (37, 0.006972519992091061), (18, 0.006956490676749397), (17, 0.006945800270561406), (24, 0.0069234690872524915),
 (26, 0.006922955676325168), (22, 0.006918202675106495), (20, 0.006916923010632973), (64, 0.006914878448676203),
 (81, 0.00691250007864052), (77, 0.006906017355299236), (36, 0.006904235174864243), (79, 0.006892372418539275),
 (35, 0.006880046330414811), (63, 0.006861090174322749), (29, 0.006847900311648197), (68, 0.006802621205071987), 
 (31, 0.006783038952794761), (70, 0.006769967143038218), (78, 0.006723531131868591), (30, 0.006719894872152803),
 (69, 0.006715314101826717), (21, 0.0067062765156384666), (23, 0.006688320823666467), (28, 0.006683088644608946),
 (80, 0.006666799865989856), (71, 0.006626751430553458), (19, 0.006624407453970405), (76, 0.006619598384038603)]

N_half = 50
N_sites = 25

S_2=[]
for s,_ in feature_importances:
    if s<=N_half:
        S_2.append(s)
    if len(S_2)==N_sites:
        break
S_2 = sorted(S_2)

### Collecting wavevector lattice sites
wave_vector_feature_importances = [(0, 0.950806830424763), (2, 0.01331131304222723), (98, 0.01331131304222723), (1, 0.004476201323265753),
 (99, 0.004476201323265753), (4, 0.0035857537589848837), (96, 0.0035857537589848837), (3, 0.001257935893002076),
 (97, 0.001257935893002076), (6, 0.0007069831412851035), (94, 0.0007069831412851035), (46, 0.0002467272079133921),
 (54, 0.0002467272079133921), (8, 0.00018241203787459185), (92, 0.00018241203787459185), (48, 0.0001451951516648948),
 (52, 0.0001451951516648948), (44, 9.843040009282692e-05), (56, 9.843040009282692e-05), (5, 9.19011466551899e-05),
 (95, 9.19011466551899e-05), (42, 6.866192435836294e-05), (58, 6.866192435836294e-05), (18, 3.3323093081472305e-05),
 (82, 3.3323093081472305e-05), (12, 3.106487312033803e-05), (88, 3.106487312033803e-05), (47, 2.3816419839103435e-05),
 (53, 2.3816419839103435e-05), (34, 2.3241123361927566e-05), (66, 2.3241123361927566e-05), (16, 2.2323627288064983e-05),
 (84, 2.2323627288064983e-05), (32, 2.2124235763368948e-05), (68, 2.2124235763368948e-05), (30, 2.10288772451345e-05),
 (70, 2.10288772451345e-05), (10, 2.090529400303802e-05), (90, 2.090529400303802e-05), (38, 1.9410706597490878e-05), 
 (62, 1.9410706597490878e-05), (36, 1.875724022079967e-05), (64, 1.875724022079967e-05), (26, 1.825645649535029e-05),
 (74, 1.825645649535029e-05), (28, 1.7862830398443465e-05), (72, 1.7862830398443465e-05), (40, 1.6194565946892457e-05),
 (60, 1.6194565946892457e-05), (33, 1.5392780128594145e-05), (67, 1.5392780128594145e-05), (19, 1.4185213987116413e-05),
 (81, 1.4185213987116413e-05), (7, 1.356204999924156e-05), (93, 1.356204999924156e-05), (35, 1.1170423908057128e-05),
 (65, 1.1170423908057128e-05), (14, 1.112046577249778e-05), (86, 1.112046577249778e-05), (20, 9.752018202687417e-06),
 (80, 9.752018202687417e-06), (15, 9.367245953490434e-06), (85, 9.367245953490434e-06), (31, 8.822300302292681e-06),
 (69, 8.822300302292681e-06), (25, 6.643388933308398e-06), (75, 6.643388933308398e-06), (17, 6.609081046562858e-06),
 (83, 6.609081046562858e-06), (45, 5.544693390346116e-06), (55, 5.544693390346116e-06), (50, 5.207356471316938e-06),
 (29, 4.884862846567487e-06), (71, 4.884862846567487e-06), (27, 4.410013160570784e-06), (73, 4.410013160570784e-06),
 (24, 3.959576099120265e-06), (76, 3.959576099120265e-06), (49, 3.3599159617141762e-06), (51, 3.3599159617141762e-06),
 (9, 2.5089373221024502e-06), (91, 2.5089373221024502e-06), (21, 1.5004285162157062e-06), (79, 1.5004285162157062e-06),
 (43, 5.907772852569324e-07), (57, 5.907772852569324e-07), (37, 2.924375698681946e-07), (63, 2.924375698681946e-07),
 (23, 1.629456807410269e-07), (77, 1.629456807410269e-07), (41, 1.37098461515355e-07), (59, 1.37098461515355e-07),
 (11, 1.1995854099374903e-07), (89, 1.1995854099374903e-07), (22, 4.723512969971105e-08), (78, 4.723512969971105e-08),
 (39, 8.485226090679877e-09), (61, 8.485226090679877e-09), (13, 4.405272447523249e-09), (87, 4.405272447523249e-09)]

N_half = 50
N_sites = 25

K_2=[]
for s,_ in wave_vector_feature_importances:
    if s<=N_half:
        K_2.append(s)
    if len(K_2)==N_sites:
        break
K_2 = sorted(K_2)



parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_8TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_2#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 34896

### Fourier features
parameters["fourier_features_to_use"] = K_2
parameters["fourier_mode"] = "dct"
parameters["fourier_real"] = None
parameters["fourier_normalize"]= False
parameters["fourier_fillna"] = False

output_file = SSH2_8TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)


#### NINTH scenario: DCT of EVEN-symmetric REAL signal, using only sites $S'_2$ in real space and eigenmodes $K'_2$ in wavevector space
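
For a real signal that is even-symmetric on a periodic lattice, x[n] = x[(N - n) mod N], the discrete Fourier transform is purely real and mirrored in k, and its first N/2 + 1 coefficients coincide with an unnormalised type-I DCT of the first N/2 + 1 samples. The sketch below checks this on a hypothetical 8-point signal using only the standard library. How the template implements `fourier_mode = "dct"` is not shown in this notebook, so `dft`, `dct_type1` and the sample values are illustrative assumptions rather than the template's code.

```python
import cmath
import math


def dft(signal):
    """Plain O(N^2) discrete Fourier transform."""
    n_points = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / n_points)
            for n, x in enumerate(signal))
        for k in range(n_points)
    ]


def dct_type1(half):
    """Unnormalised DCT-I of the samples x[0], ..., x[M-1]."""
    m = len(half)
    return [
        half[0] + (-1) ** k * half[-1]
        + 2 * sum(half[n] * math.cos(math.pi * k * n / (m - 1))
                  for n in range(1, m - 1))
        for k in range(m)
    ]


# Hypothetical samples x[0..4]; mirroring x[3], x[2], x[1] gives an
# even-symmetric signal of length 8 with x[n] == x[(8 - n) % 8].
half = [1.0, 0.5, -0.25, 2.0, 0.75]
signal = half + half[-2:0:-1]

spectrum = dft(signal)
cosine = dct_type1(half)

# The spectrum is real, mirrored in k, and equal to the DCT-I on k = 0..4
is_real = all(abs(c.imag) < 1e-9 for c in spectrum)
is_mirrored = all(abs(spectrum[k] - spectrum[-k]) < 1e-9 for k in range(1, 8))
matches_dct = all(abs(spectrum[k].real - cosine[k]) < 1e-9 for k in range(5))
print(is_real, is_mirrored, matches_dct)
```

The mirror property is why only half of the wavevector indices need to be considered when selecting eigenmodes.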

In [21]:
### Collecting lattice sites
feature_importances = [(98, 0.030775680224168177), (1, 0.03069088538607333), (0, 0.026794700674389625), (99, 0.026249868968330863),
 (96, 0.020897799386158718), (3, 0.020489140983260747), (2, 0.01992793686700233), (97, 0.019545673056848806),
 (94, 0.01923075195671657), (5, 0.019045271320840478), (4, 0.017291096293802714), (95, 0.017048847481531043),
 (48, 0.015353109828878693), (51, 0.014934669606650726), (50, 0.013899164937009753), (92, 0.01371775078111256),
 (7, 0.013462982352684496), (49, 0.013421274946371955), (6, 0.012883612083351053), (93, 0.012769882886851382),
 (46, 0.012337897368774504), (53, 0.01212310608565037), (90, 0.010997653223096683), (9, 0.010834481547638681),
 (52, 0.010812653542591071), (47, 0.010655015434978536), (44, 0.010465047210191402), (55, 0.010281045213583478),
 (8, 0.009828027039565111), (91, 0.00959694782849967), (10, 0.009302976753229216), (88, 0.009278280526791806),
 (89, 0.00919392033505742), (11, 0.009179509784012994), (54, 0.009047708913725763), (45, 0.008918866304158874),
 (42, 0.008529473018171781), (57, 0.008364823778403564), (86, 0.008246572727366506), (56, 0.00821180818167621),
 (13, 0.00816970905230619), (12, 0.008114276523127217), (43, 0.008081824889517098), (87, 0.00794484045520985),
 (40, 0.007701200518589432), (59, 0.007612698497346249), (58, 0.0075668053024482256), (84, 0.007536839705708784),
 (14, 0.00747972131229634), (85, 0.007477795352594296), (41, 0.007466758344364884), (15, 0.007434672006973498),
 (16, 0.007302080712545534), (83, 0.007225210080693078), (38, 0.007221329558484434), (25, 0.007184232143171722),
 (61, 0.007168350645728535), (67, 0.007152534098465049), (34, 0.007124371893597504), (65, 0.007121552553277619),
 (32, 0.0071142212435522665), (60, 0.007104063693545338), (75, 0.007080742078305485), (82, 0.007060642189080975),
 (27, 0.007037596851734687), (73, 0.007036728229713947), (66, 0.007012846480636373), (39, 0.0070128347301864555),
 (62, 0.007012131745055074), (74, 0.007009007463465764), (72, 0.006999191033856658), (33, 0.00699226356888025),
 (37, 0.006972519992091061), (18, 0.006956490676749397), (17, 0.006945800270561406), (24, 0.0069234690872524915),
 (26, 0.006922955676325168), (22, 0.006918202675106495), (20, 0.006916923010632973), (64, 0.006914878448676203),
 (81, 0.00691250007864052), (77, 0.006906017355299236), (36, 0.006904235174864243), (79, 0.006892372418539275),
 (35, 0.006880046330414811), (63, 0.006861090174322749), (29, 0.006847900311648197), (68, 0.006802621205071987), 
 (31, 0.006783038952794761), (70, 0.006769967143038218), (78, 0.006723531131868591), (30, 0.006719894872152803),
 (69, 0.006715314101826717), (21, 0.0067062765156384666), (23, 0.006688320823666467), (28, 0.006683088644608946),
 (80, 0.006666799865989856), (71, 0.006626751430553458), (19, 0.006624407453970405), (76, 0.006619598384038603)]

N_half = 50
N_sites = 12

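# Keep the N_sites highest-ranked lattice sites with index <= N_half
# (the list above is sorted by decreasing importance)
S_2_prime = []
for s, _ in feature_importances:
    if s <= N_half:
        S_2_prime.append(s)
    if len(S_2_prime) == N_sites:
        break
S_2_prime = sorted(S_2_prime)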

### Collecting wavevector lattice sites
wave_vector_feature_importances = [(0, 0.950806830424763), (2, 0.01331131304222723), (98, 0.01331131304222723), (1, 0.004476201323265753),
 (99, 0.004476201323265753), (4, 0.0035857537589848837), (96, 0.0035857537589848837), (3, 0.001257935893002076),
 (97, 0.001257935893002076), (6, 0.0007069831412851035), (94, 0.0007069831412851035), (46, 0.0002467272079133921),
 (54, 0.0002467272079133921), (8, 0.00018241203787459185), (92, 0.00018241203787459185), (48, 0.0001451951516648948),
 (52, 0.0001451951516648948), (44, 9.843040009282692e-05), (56, 9.843040009282692e-05), (5, 9.19011466551899e-05),
 (95, 9.19011466551899e-05), (42, 6.866192435836294e-05), (58, 6.866192435836294e-05), (18, 3.3323093081472305e-05),
 (82, 3.3323093081472305e-05), (12, 3.106487312033803e-05), (88, 3.106487312033803e-05), (47, 2.3816419839103435e-05),
 (53, 2.3816419839103435e-05), (34, 2.3241123361927566e-05), (66, 2.3241123361927566e-05), (16, 2.2323627288064983e-05),
 (84, 2.2323627288064983e-05), (32, 2.2124235763368948e-05), (68, 2.2124235763368948e-05), (30, 2.10288772451345e-05),
 (70, 2.10288772451345e-05), (10, 2.090529400303802e-05), (90, 2.090529400303802e-05), (38, 1.9410706597490878e-05), 
 (62, 1.9410706597490878e-05), (36, 1.875724022079967e-05), (64, 1.875724022079967e-05), (26, 1.825645649535029e-05),
 (74, 1.825645649535029e-05), (28, 1.7862830398443465e-05), (72, 1.7862830398443465e-05), (40, 1.6194565946892457e-05),
 (60, 1.6194565946892457e-05), (33, 1.5392780128594145e-05), (67, 1.5392780128594145e-05), (19, 1.4185213987116413e-05),
 (81, 1.4185213987116413e-05), (7, 1.356204999924156e-05), (93, 1.356204999924156e-05), (35, 1.1170423908057128e-05),
 (65, 1.1170423908057128e-05), (14, 1.112046577249778e-05), (86, 1.112046577249778e-05), (20, 9.752018202687417e-06),
 (80, 9.752018202687417e-06), (15, 9.367245953490434e-06), (85, 9.367245953490434e-06), (31, 8.822300302292681e-06),
 (69, 8.822300302292681e-06), (25, 6.643388933308398e-06), (75, 6.643388933308398e-06), (17, 6.609081046562858e-06),
 (83, 6.609081046562858e-06), (45, 5.544693390346116e-06), (55, 5.544693390346116e-06), (50, 5.207356471316938e-06),
 (29, 4.884862846567487e-06), (71, 4.884862846567487e-06), (27, 4.410013160570784e-06), (73, 4.410013160570784e-06),
 (24, 3.959576099120265e-06), (76, 3.959576099120265e-06), (49, 3.3599159617141762e-06), (51, 3.3599159617141762e-06),
 (9, 2.5089373221024502e-06), (91, 2.5089373221024502e-06), (21, 1.5004285162157062e-06), (79, 1.5004285162157062e-06),
 (43, 5.907772852569324e-07), (57, 5.907772852569324e-07), (37, 2.924375698681946e-07), (63, 2.924375698681946e-07),
 (23, 1.629456807410269e-07), (77, 1.629456807410269e-07), (41, 1.37098461515355e-07), (59, 1.37098461515355e-07),
 (11, 1.1995854099374903e-07), (89, 1.1995854099374903e-07), (22, 4.723512969971105e-08), (78, 4.723512969971105e-08),
 (39, 8.485226090679877e-09), (61, 8.485226090679877e-09), (13, 4.405272447523249e-09), (87, 4.405272447523249e-09)]

N_half = 50
N_sites = 12

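# Keep the N_sites highest-ranked eigenmodes with index <= N_half
# (the list above is sorted by decreasing importance)
K_2_prime = []
for s, _ in wave_vector_feature_importances:
    if s <= N_half:
        K_2_prime.append(s)
    if len(K_2_prime) == N_sites:
        break
K_2_prime = sorted(K_2_prime)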



parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_9TH_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = S_2_prime  # [0, 3, 50, 51] #[0, 1, 3, 50, 51, 53]
parameters["random_state"] = 34896

### Fourier features
parameters["fourier_features_to_use"] = K_2_prime
parameters["fourier_mode"] = "dct"
parameters["fourier_real"] = None
parameters["fourier_normalize"] = False
parameters["fourier_fillna"] = False

output_file = SSH2_9TH_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)
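
The same selection loop is used in this cell for both `S_2_prime` and `K_2_prime`. As a sketch, it could be factored into one helper; `top_indices` and its argument names are hypothetical and do not exist in the `simulation` module. The example runs it on the first eight entries of `feature_importances` copied from above.

```python
def top_indices(importances, n_keep, max_index):
    """Return, sorted, the first n_keep indices <= max_index.

    `importances` is a list of (index, importance) pairs that is assumed
    to be ordered by decreasing importance, as in the cell above.
    """
    selected = []
    for index, _ in importances:
        if index <= max_index:
            selected.append(index)
        if len(selected) == n_keep:
            break
    return sorted(selected)


# First eight entries of feature_importances, copied from the cell above
excerpt = [
    (98, 0.030775680224168177), (1, 0.03069088538607333),
    (0, 0.026794700674389625), (99, 0.026249868968330863),
    (96, 0.020897799386158718), (3, 0.020489140983260747),
    (2, 0.01992793686700233), (97, 0.019545673056848806),
]

selected_sites = top_indices(excerpt, n_keep=4, max_index=50)
print(selected_sites)
```

With such a helper, the two loops would reduce to `top_indices(feature_importances, N_sites, N_half)` and `top_indices(wave_vector_feature_importances, N_sites, N_half)`.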
