# 1 Generating and Fitting Noisy Data

This optional notebook generates the noisy data used in the manuscript. The output of this notebook is also provided directly as a download, so running it is not required.

The notebook generates and fits all the data required for the paper. Data are generated for noise cases 1 through 8 (the range `np.arange(1,9)` used in the cells of this notebook), where each case corresponds to a different noise level.

## Imports

In [None]:
%load_ext autoreload
%autoreload 2

from m3util.util.IO import download_and_unzip
from belearn.dataset.dataset import BE_Dataset
import numpy as np


## Loading data for SHO fitting


In [None]:
# Download the data file from Zenodo
url = 'https://zenodo.org/record/7774788/files/PZT_2080_raw_data.h5?download=1'

# Specify the filename and the path to save the file
filename = '/data_raw.h5'
save_path = './Data'

# download the file
download_and_unzip(filename, url, save_path)

In [None]:
data_path = save_path + '/' + filename

# instantiate the dataset object
dataset = BE_Dataset(data_path)

# print the contents of the file
dataset.print_be_tree()

## Generate Noisy Data

The `generate_noisy_data_records` method generates noisy records and saves each one as an h5_main dataset in the USID format, so that the data can be fit with the Pycroscopy SHO Fitter.

In [None]:
# calculates the standard deviation and uses that for the noise
noise_STD = np.std(dataset.get_original_data)

# prints the standard deviation
print(noise_STD)

In [None]:
dataset.generate_noisy_data_records(noise_levels=np.arange(1, 9),
                                    verbose=True,
                                    noise_STD=noise_STD)
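
As a rough illustration of how a noise level could translate into added noise, the sketch below adds complex Gaussian noise with standard deviation `noise_level * noise_std` to a synthetic spectrum. This scaling is an assumption made for illustration, not the documented behavior of `generate_noisy_data_records`; the helper `add_complex_noise` and the toy array `clean` are hypothetical.

```python
import numpy as np


def add_complex_noise(data, noise_level, noise_std, rng):
    """Hypothetical helper: add Gaussian noise to the real and imaginary parts.

    Assumes the noise standard deviation is noise_level * noise_std.
    """
    scale = noise_level * noise_std
    noise = rng.normal(0.0, scale, data.shape) + 1j * rng.normal(0.0, scale, data.shape)
    return data + noise


rng = np.random.default_rng(0)

# toy complex "spectrum" standing in for the raw data
clean = np.exp(1j * np.linspace(0, np.pi, 1000))
noise_std = np.std(clean)

# one noisy copy per noise level, 1 through 8
noisy = {level: add_complex_noise(clean, level, noise_std, rng) for level in range(1, 9)}
```

Under this assumption, the spread of `noisy[level] - clean` grows linearly with the noise level, which is what makes the cases progressively harder to fit.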

## SHO fits on all the datasets

Fitting all the datasets takes some time: each fit takes about 10 minutes to complete.

In [None]:
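# names of the datasets to fit: the eight noisy datasets plus the raw data
out = [f"Noisy_Data_{i}" for i in np.arange(1, 9)]
out.append("Raw_Data")

for data in out:
    print(f"Fitting {data}")
    # adjust max_mem and max_cores to match the available hardware
    dataset.SHO_Fitter(dataset=data, h5_sho_targ_grp=f"{data}_SHO_Fit",
                       max_mem=1024 * 64, max_cores=20)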
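
For context, a simple harmonic oscillator (SHO) fit estimates an amplitude, a resonance frequency, a quality factor, and a phase for each spectrum. The sketch below evaluates one commonly used complex form of the SHO response. The function name `sho_response` and all parameter values are illustrative assumptions, and the exact expression used by the fitter may differ.

```python
import numpy as np


def sho_response(freq, amp, w0, q, phase):
    """Complex SHO response (one common form; assumed, not taken from the fitter)."""
    return amp * np.exp(1j * phase) * w0**2 / (freq**2 - 1j * freq * w0 / q - w0**2)


# illustrative frequency band (Hz) and parameters
freq = np.linspace(1.30e6, 1.34e6, 401)
spectrum = sho_response(freq, amp=1e-3, w0=1.32e6, q=150.0, phase=0.5)

# the magnitude peaks close to the resonance frequency w0
peak_freq = freq[np.argmax(np.abs(spectrum))]
```

In this form the magnitude at resonance equals `amp * q`, so a higher quality factor gives a taller and narrower peak.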

### Check that the results were saved correctly

In [None]:
# print the contents of the file
dataset.print_be_tree()