# Running simulation template

This notebook uses papermill to execute the template notebook `simulation_template.ipynb`, which runs simulations for the nearest-neighbour (ssh1) and second-neighbour (ssh2) SSH systems.

In [1]:
%%time
%load_ext autoreload
%autoreload 2
from simulation import *

CPU times: user 734 ms, sys: 259 ms, total: 992 ms
Wall time: 582 ms


## SSH1

In [2]:
template = TEMPLATE_NOTEBOOK
parameters = {"model_kw": {"criterion": "entropy"},
              "allowed_windings": [0, 1], "val_split": 0.9, "features_to_use": None, "shuffle_features": False,
              "n_experiments": 100, "start_n": 0, "fit_params": None, "shuffle_rows": True, "pred_params": None,
              "random_features": False, "store_in_lists": False, "save_eigenvector": True,
              "save_hamiltonian": True, "save_accuracy": True, "save_models": True,
              }
kernel_name = KERNEL_NAME
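
The scenario cells below mutate the shared `parameters` dictionary in place, so a key set for one scenario (for example `fourier_features`) carries over to the next unless it is overwritten. A minimal sketch of one way to keep scenarios isolated, assuming a hypothetical helper `scenario_parameters` that is not part of the `simulation` module:

```python
import copy

def scenario_parameters(base, **overrides):
    """Return a deep copy of `base` updated with scenario-specific keys."""
    params = copy.deepcopy(base)
    params.update(overrides)
    return params

# Illustrative base dictionary (a subset of the keys used in this notebook).
base = {"model_kw": {"criterion": "entropy"}, "val_split": 0.9, "features_to_use": None}

first = scenario_parameters(base, mode="dft", real=True, random_state=467)
second = scenario_parameters(base, mode="dct", real=False, random_state=548)

# Neither the base dictionary nor the first scenario is changed by the second one.
print(sorted(first))
print(sorted(second))
```

The deep copy matters because `model_kw` is itself a dictionary; a shallow copy would still share it between scenarios.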

### periodic_100_6561

#### First scenario: DFT of REAL signal, computed from ALL lattice sites (51 features)

The first scenario uses as features the REAL part of the first HALF of the DFT components of the real space eigenvectors, computed from ALL real space lattice sites. This leads to $\frac{N}{2}+1=51$ engineered features. Note that in this case we use ALL $N=100$ features in real space to compute the DFT.
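
As an illustration of this feature construction, the sketch below computes a naive DFT with the standard library and keeps the real part of components $k=0,\dots,\frac{N}{2}$. The input vector is a synthetic placeholder and the helper `dft` is hypothetical; the template notebook may compute these features differently.

```python
import cmath
import math

def dft(signal):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(signal))
            for k in range(n)]

N = 100
# Synthetic stand-in for one real space eigenvector (not taken from the data set).
eigenvector = [math.cos(2 * math.pi * 3 * m / N) for m in range(N)]

spectrum = dft(eigenvector)
# Keep the real part of components k = 0, ..., N/2.
features = [spectrum[k].real for k in range(N // 2 + 1)]

print(len(features))  # 51
```

With NumPy, `np.fft.rfft(x).real` returns the same $\frac{N}{2}+1$ values for even $N$.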

In [3]:
%%time
parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_1ST_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 467

### Engineered features
parameters["fourier_features"] = list(range(0,51,1))
parameters["mode"] = "dft"
parameters["real"] = True
parameters["normalize"]= False
parameters["fillna"] = False

output_file = SSH1_1ST_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)

CPU times: user 5.56 s, sys: 128 ms, total: 5.69 s
Wall time: 34min 5s


#### Second scenario: DCT of EVEN-symmetric REAL signal (51 features) 

The second scenario uses as features the DCT computed assuming that the real space vectors are REAL and EVEN-symmetric around 0 and $\frac{N}{2}$. The wavevector space components are computed using all EVEN and ODD real lattice sites from the first half of the real space lattice. The number of resulting features is $\frac{N}{2}+1 = 51$.
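
For a real signal with $x_m = x_{N-m}$, the DFT is real and coincides with the unnormalised DCT-I of the first $\frac{N}{2}+1$ samples. The sketch below checks this numerically. It assumes DCT type I, which matches the symmetry described above; the notebook does not state which DCT type the template uses. The helpers `dct1` and `dft_real` and the sample values are hypothetical.

```python
import cmath
import math

def dct1(x):
    """Unnormalised DCT-I of a sequence of length M >= 2."""
    last = len(x) - 1
    return [x[0] + (-1) ** k * x[last]
            + 2 * sum(x[m] * math.cos(math.pi * k * m / last) for m in range(1, last))
            for k in range(len(x))]

def dft_real(x):
    """Real part of the naive DFT of a sequence."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * m / n) for m, v in enumerate(x)).real
            for k in range(n)]

N = 100
half = [math.exp(-0.1 * m) for m in range(N // 2 + 1)]  # synthetic samples 0..N/2
full = half + half[-2:0:-1]  # mirror so that full[m] == full[N - m]

features = dct1(half)
reference = dft_real(full)[: N // 2 + 1]

# Number of features and largest deviation between the two transforms.
print(len(features), max(abs(a - b) for a, b in zip(features, reference)))
```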

In [4]:
%%time
parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_2ND_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 548

### Engineered features
parameters["fourier_features"] = list(range(0,51,1))
parameters["mode"] = "dct"
parameters["real"] = False
parameters["normalize"]= False
parameters["fillna"] = False

output_file = SSH1_2ND_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)

CPU times: user 4.66 s, sys: 95.5 ms, total: 4.75 s
Wall time: 17min 38s


#### Third scenario: DCT of EVEN-symmetric REAL signal, using only EVEN components of DCT (26 features)

The third scenario uses as features only the EVEN components of the DCT computed assuming that the real space vectors are REAL and EVEN-symmetric around 0 and $\frac{N}{2}$. The wavevector space components are computed using all EVEN and ODD real lattice sites from the first half of the real space lattice. The resulting number of features is $\frac{N}{4}+1=26$.
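
The feature count can be checked directly from the index ranges. This is a small worked example; the variable names are illustrative only.

```python
N = 100
all_components = list(range(0, N // 2 + 1))      # 0, 1, ..., 50 (second scenario)
even_components = list(range(0, N // 2 + 1, 2))  # 0, 2, ..., 50 (third scenario)

print(len(all_components), len(even_components))  # 51 26
```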

In [5]:
%%time
parameters["csv_path"] = SSH1_PERIODIC_100_6561_CSV 
parameters["model_name"] = "DecisionTreeClassifier"
parameters["simulation_dir"] = SSH1_PERIODIC_3RD_SCENARIO_100_6561_SIMULATION_DIR
parameters["features_to_use"] = None#[0, 3, 50, 51] #[0, 1, 3, 50, 51, 53] 
parameters["random_state"] = 34896

### Engineered features
parameters["fourier_features"] = list(range(0,51,2))
parameters["mode"] = "dct"
parameters["real"] = False
parameters["normalize"]= False
parameters["fillna"] = False

output_file = SSH1_3RD_SCENARIO_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)

PapermillExecutionError: 
---------------------------------------------------------------------------
Exception encountered at "In [41]":
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-41-133273929b96> in <module>
----> 1 print(simulation.fourier_matrix[:,50])

IndexError: index 50 is out of bounds for axis 1 with size 26
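
The traceback is consistent with the feature selection in this cell: `range(0, 51, 2)` keeps 26 components, so a matrix with one column per kept component has column positions 0 to 25 and position 50 does not exist. Below is a sketch of translating a component label into a column position. It assumes, without confirmation from the template, that the columns of `fourier_matrix` follow the order of `fourier_features`; the helper `column_of` is hypothetical.

```python
fourier_features = list(range(0, 51, 2))  # component labels kept in the third scenario

def column_of(label, features):
    """Return the column position of a component label, or None if it was not kept."""
    return features.index(label) if label in features else None

print(column_of(50, fourier_features))  # 25
print(column_of(1, fourier_features))   # None
```

Under that assumption, component 50 is the last column and could also be reached with a negative index such as `[:, -1]`.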


#### Fourth scenario: Best $\frac{N}{2} = 50$ real space lattice sites

In this scenario, we run simulations using the 50 best lattice sites in real space, as determined by the information entropy signatures.
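
Selecting the best sites amounts to ranking the lattice sites by a per-site score and keeping the top $\frac{N}{2}$ indices. The sketch below uses synthetic random scores as placeholders; how the information entropy signatures are computed is not shown in this notebook.

```python
import random

N = 100
rng = random.Random(0)
# Synthetic per-site scores standing in for the (unspecified) entropy signatures.
scores = [rng.random() for _ in range(N)]

# Rank sites by descending score and keep the N/2 best, sorted by site index.
ranked = sorted(range(N), key=lambda site: scores[site], reverse=True)
best_sites = sorted(ranked[: N // 2])

print(len(best_sites))  # 50
```

A list of site indices like `best_sites` has the same form as the lists passed to `parameters["features_to_use"]` elsewhere in this notebook.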

### periodic_140_6561

In [4]:
%%time
#parameters["csv_path"] = SSH1_PERIODIC_140_6561_CSV 
#parameters["model_name"] = "DecisionTreeClassifier"
#parameters["simulation_dir"] = SSH1_PERIODIC_LESS_140_6561_SIMULATION_DIR
#parameters["random_state"] = 147
#output_file = SSH1_PERIODIC_LESS_140_6561_OUTPUT_FILE
#pm.execute_notebook(template,
#                    output_file,
#                    parameters=parameters,
#                    kernel_name=kernel_name,
#                    nest_asyncio=True)

CPU times: user 4 µs, sys: 0 ns, total: 4 µs
Wall time: 5.01 µs


### periodic_180_6561

In [5]:
%%time
#parameters["csv_path"] = SSH1_PERIODIC_180_6561_CSV
#parameters["model_name"] = "DecisionTreeClassifier"
#parameters["simulation_dir"] = SSH1_PERIODIC_LESS_180_6561_SIMULATION_DIR
#parameters["random_state"] = 257
#output_file = SSH1_PERIODIC_LESS_180_6561_OUTPUT_FILE
#pm.execute_notebook(template,
#                    output_file,
#                    parameters=parameters,
#                    kernel_name=kernel_name,
#                    nest_asyncio=True)

CPU times: user 1e+03 ns, sys: 0 ns, total: 1e+03 ns
Wall time: 3.58 µs


### periodic_220_6561

In [6]:
%%time
#parameters["csv_path"] = SSH1_PERIODIC_220_6561_CSV 
#parameters["model_name"] = "DecisionTreeClassifier"
#parameters["simulation_dir"] = SSH1_PERIODIC_LESS_220_6561_SIMULATION_DIR
#parameters["random_state"] = 383
#output_file = SSH1_PERIODIC_LESS_220_6561_OUTPUT_FILE
#pm.execute_notebook(template,
#                    output_file,
#                    parameters=parameters,
#                    kernel_name=kernel_name,
#                    nest_asyncio=True)

CPU times: user 2 µs, sys: 1 µs, total: 3 µs
Wall time: 4.53 µs


## SSH2

In [7]:
template = TEMPLATE_NOTEBOOK
parameters = {"model_kw": {"criterion": "entropy", "n_estimators": 25, "n_jobs": -1},
              "allowed_windings": [-1, 0, 1, 2], "val_split": 0.5, "features_to_use": None, "shuffle_features": False,
              "n_experiments": 100, "start_n": 0, "fit_params": None, "shuffle_rows": True, "pred_params": None,
              "random_features": False, "store_in_lists": False, "save_eigenvector": True,
              "save_hamiltonian": True, "save_accuracy": True, "save_models": True,
              }
kernel_name = KERNEL_NAME

#### periodic_100_6561

In [8]:
SSH2_PERIODIC_LESS_100_6561_OUTPUT_FILE

'zzz_simulation_output_ssh2_periodic_less_100_6561.ipynb'

In [9]:
%%time
parameters["csv_path"] = SSH2_PERIODIC_100_6561_CSV 
parameters["model_name"] = "RandomForestClassifier"
parameters["simulation_dir"] = SSH2_PERIODIC_LESS_100_6561_SIMULATION_DIR
parameters["features_to_use"] = [0, 1, 3, 48, 50, 51, 96, 98, 99] #[0, 1, 48, 50, 98, 99] #[0, 1, 98, 99] 
parameters["random_state"] = 430
output_file = SSH2_PERIODIC_LESS_100_6561_OUTPUT_FILE
pm.execute_notebook(template,
                    output_file,
                    parameters=parameters,
                    kernel_name=kernel_name,
                    nest_asyncio=True)

CPU times: user 3.87 s, sys: 98.3 ms, total: 3.97 s
Wall time: 25min 3s


#### periodic_140_6561

In [10]:
%%time
#parameters["csv_path"] = SSH2_PERIODIC_LESS_140_6561_CSV 
#parameters["model_name"] = "RandomForestClassifier"
#parameters["simulation_dir"] = SSH2_PERIODIC_140_6561_SIMULATION_DIR
#parameters["random_state"] = 782
#output_file = SSH2_PERIODIC_LESS_140_6561_OUTPUT_FILE
#pm.execute_notebook(template,
#                    output_file,
#                    parameters=parameters,
#                    kernel_name=kernel_name,
#                    nest_asyncio=True)

CPU times: user 1 µs, sys: 0 ns, total: 1 µs
Wall time: 3.1 µs


#### periodic_180_6561

In [11]:
%%time
#parameters["csv_path"] = SSH2_PERIODIC_LESS_180_6561_CSV 
#parameters["model_name"] = "RandomForestClassifier"
#parameters["simulation_dir"] = SSH2_PERIODIC_180_6561_SIMULATION_DIR
#parameters["random_state"] = 11
#output_file = SSH2_PERIODIC_LESS_180_6561_OUTPUT_FILE
#pm.execute_notebook(template,
#                    output_file,
#                    parameters=parameters,
#                    kernel_name=kernel_name,
#                    nest_asyncio=True)

CPU times: user 2 µs, sys: 0 ns, total: 2 µs
Wall time: 3.34 µs


#### periodic_220_6561

In [12]:
%%time
#parameters["csv_path"] = SSH2_PERIODIC_LESS_220_6561_CSV 
#parameters["model_name"] = "RandomForestClassifier"
#parameters["simulation_dir"] = SSH2_PERIODIC_220_6561_SIMULATION_DIR
#parameters["random_state"] = 401
#output_file = SSH2_PERIODIC_LESS_220_6561_OUTPUT_FILE
#pm.execute_notebook(template,
#                    output_file,
#                    parameters=parameters,
#                    kernel_name=kernel_name,
#                    nest_asyncio=True)

CPU times: user 1 µs, sys: 0 ns, total: 1 µs
Wall time: 3.34 µs
