<h1>Running and saving multiple simulations</h1>
<div class="text">
    In this notebook, I will explain how to create simulations of coeval cubes and save only the statistics of these coeval cubes.
    <br>
    There are two functions to run these simulations:
     <ul>
      <li>psbi.create_batched_data_2(), which saves the $\ell^1$ and $\ell^2$ summaries and the astrophysical parameters</li>
      <li>psbi.create_batched_data_0(), which saves the statistics PS, S1, S2, PS3d and the astrophysical parameters</li>
     </ul> 
</div>

In [1]:
import os
import sys
import numpy as np
sys.path.insert(1, os.path.abspath('../'))  # Note: this line is unnecessary if the package is installed with pip.
import pietrosbi as psbi



<h2>Function for the statistics PS, S1, S2, PS3d</h2>

In [2]:
# Number of batches for the simulations
n_batches = 2

# Number of simulations per batch
n_per_batch = 2

# Total number of simulations: n = n_batches * n_per_batch

# Redshift, number of pixels and dimension of the cube
z = 9
n_pixels = 32
dim = 300

# Number of bins for the statistics
bins = 6

# These three parameters are ignored here, since this function does not compute the ell summaries
l1 = True
l2 = True
wavelet_type = 'morl'

# Starting seed
start_seed = 1

# True: save the statistics. Remember to change the output directory to the one you want
# False: do not save the statistics
save_files = False

# This function doesn't return anything
# If you want to analyse the statistics, you have to save the files first
# The 21cmFAST cache is cleaned inside the function
psbi.create_batched_data_0(
    n_batches=n_batches,
    n_per_batch=n_per_batch,
    z=z,
    n_pixels=n_pixels,
    dim=dim,
    bins=bins,
    l1=l1,
    l2=l2,
    wavelet_type=wavelet_type,
    start_seed=start_seed,
    save_files=save_files,
)
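
<div class="text">
    The relation n = n_batches * n_per_batch, combined with a starting seed, can be illustrated with a small sketch. It assumes that seeds are assigned consecutively from <code>start_seed</code>; the notebook does not show how psbi.create_batched_data_0() assigns them, and <code>batch_seeds</code> is a hypothetical helper, not part of pietrosbi.
</div>

```python
def batch_seeds(n_batches, n_per_batch, start_seed):
    """Return one list of seeds per batch, with no seed used twice.

    Assumption: seeds increase by one per simulation, starting at start_seed.
    """
    return [
        [start_seed + b * n_per_batch + i for i in range(n_per_batch)]
        for b in range(n_batches)
    ]


seeds = batch_seeds(n_batches=2, n_per_batch=2, start_seed=1)
n_total = sum(len(batch) for batch in seeds)
print(seeds)    # [[1, 2], [3, 4]]
print(n_total)  # 4
```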




<h2>Importing the .npy files with the statistics PS, S1, S2 and PS3d</h2>

In [3]:
# Path to the folder with all the .npy files
path_to_folder = '/travail/pguidi/correct_data/'

# Loading the 'observation' and the true parameters
data_obs_S1 = np.load(path_to_folder + 'data_obs_S1_1000.npy')
print('Obs. S1 has been loaded', type(data_obs_S1), data_obs_S1.shape)
data_obs_PS = np.load(path_to_folder + 'data_obs_PS_1000.npy')
print('Obs. PS has been loaded', type(data_obs_PS), data_obs_PS.shape)
data_obs_PS3d = np.load(path_to_folder + 'data_obs_PS3d_1000.npy')
print('Obs. PS3d has been loaded', type(data_obs_PS3d), data_obs_PS3d.shape)
data_obs_S2 = np.load(path_to_folder + 'data_obs_S2_1000.npy')
print('Obs. S2 has been loaded', type(data_obs_S2), data_obs_S2.shape)
param_true = np.load(path_to_folder + 'param_true_1000.npy')[:2]
print('Obs. param. has been loaded', type(param_true), param_true.shape)

# Loading the simulated S1, PS, S2, PS3d and the list of parameters used in these simulations
data_S1 = psbi.import_data_test(path_to_folder + 'data_summary_S1_1000_[0-9].npy')
print('Simulated S1 have been loaded', type(data_S1), data_S1.shape)
data_PS = psbi.import_data_test(path_to_folder + 'data_summary_PS_1000_[0-9].npy')
print('Simulated PS have been loaded', type(data_PS), data_PS.shape)
data_PS3d = psbi.import_data_test(path_to_folder + 'data_summary_PS3d_1000_[0-9].npy')
print('Simulated PS3d have been loaded', type(data_PS3d), data_PS3d.shape)
data_S2 = psbi.import_data_test(path_to_folder + 'data_summary_S2_1000_[0-9].npy')
print('Simulated S2 have been loaded', type(data_S2), data_S2.shape)
param_list = psbi.import_data_test(path_to_folder + 'param_list_1000_[0-9].npy')[:,:2]
print('Simulated params have been loaded', type(param_list), param_list.shape)


Obs. S1 has been loaded <class 'numpy.ndarray'> (32, 6)
Obs. PS has been loaded <class 'numpy.ndarray'> (32, 6)
Obs. PS3d has been loaded <class 'numpy.ndarray'> (6,)
Obs. S2 has been loaded <class 'numpy.ndarray'> (32, 6, 6)
Obs. param. has been loaded <class 'numpy.ndarray'> (2,)
['/travail/pguidi/correct_data/data_summary_S1_1000_0.npy', '/travail/pguidi/correct_data/data_summary_S1_1000_1.npy', '/travail/pguidi/correct_data/data_summary_S1_1000_2.npy', '/travail/pguidi/correct_data/data_summary_S1_1000_3.npy', '/travail/pguidi/correct_data/data_summary_S1_1000_4.npy', '/travail/pguidi/correct_data/data_summary_S1_1000_5.npy', '/travail/pguidi/correct_data/data_summary_S1_1000_6.npy', '/travail/pguidi/correct_data/data_summary_S1_1000_7.npy', '/travail/pguidi/correct_data/data_summary_S1_1000_8.npy', '/travail/pguidi/correct_data/data_summary_S1_1000_9.npy']
Simulated S1 have been loaded <class 'numpy.ndarray'> (1000, 32, 6)
['/travail/pguidi/correct_data/data_summary_PS_1000_0.npy'
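
<div class="text">
    The output above shows ten batch files being listed and then combined into a single array of shape (1000, 32, 6). A minimal sketch of that pattern is given below, assuming the batches are concatenated along the first axis. <code>load_batches</code> is a hypothetical helper written for illustration; it is not the implementation of psbi.import_data_test().
</div>

```python
import glob
import os
import tempfile

import numpy as np


def load_batches(pattern):
    """Load every .npy file matching `pattern` and concatenate them along axis 0."""
    # glob.glob gives no ordering guarantee, so sort to keep the batch order deterministic
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"No file matches {pattern!r}")
    return np.concatenate([np.load(p) for p in paths], axis=0)


# Self-contained demonstration with three synthetic batches of four simulations each
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        batch = np.full((4, 32, 6), float(i))
        np.save(os.path.join(tmp, f"data_summary_S1_demo_{i}.npy"), batch)
    demo = load_batches(os.path.join(tmp, "data_summary_S1_demo_[0-9].npy"))

print(demo.shape)  # (12, 32, 6)
```

<div class="text">
    Note that the glob pattern <code>[0-9]</code> matches exactly one digit, so a pattern of this form only picks up batches numbered 0 to 9.
</div>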

In [4]:
# This function calculates the l1 and l2 summaries for the simulated PS, S1 and S2.
# It also converts the arrays from np.ndarray (float64) to torch.Tensor (float32);
# float32 is PyTorch's default floating-point dtype.
(l1l2_summary_PS, l1l2_summary_S1, l1l2_summary_S2,
 data_PS3d, param_list) = psbi.l1l2_and_torchify_data(
    data_PS,
    data_S1,
    data_S2,
    data_PS3d,
    param_list,
    n_pixels,
    wavelet_type,
    l1,
    l2,
)


# This function gives a runtime warning due to a division by 0.
# It happens during the normalization of slices where the brightness temperature is 0 everywhere:
# PS, S1 and S2 are all 0 there, so the normalization divisions (e.g. S1 / sqrt(PS)) give 0/0 = nan.
# Inside the function, every nan is replaced by 0.

  data_S2_new = data_S2 / data_S1_rev
  data_S1_new = data_S1 / np.sqrt(data_PS)
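
<div class="text">
    The handling of empty slices described above can be sketched as follows. This is an illustration of the idea only, based on the expression <code>data_S1 / np.sqrt(data_PS)</code> visible in the warning; <code>normalise_s1</code> is a hypothetical helper and not the pietrosbi code.
</div>

```python
import numpy as np


def normalise_s1(s1, ps):
    """Return s1 / sqrt(ps), with nan entries (from 0/0) replaced by 0."""
    # Silence the warnings locally rather than for the whole session
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = s1 / np.sqrt(ps)
    # Slices with zero signal give 0/0 = nan; set those entries to 0
    ratio[np.isnan(ratio)] = 0.0
    return ratio


ps = np.array([[4.0, 9.0], [0.0, 0.0]])  # the second slice is empty
s1 = np.array([[2.0, 6.0], [0.0, 0.0]])
s1_norm = normalise_s1(s1, ps)
print(s1_norm)
```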


In [5]:
# This function calculates the l1 and l2 summaries for the 'observed' PS, S1 and S2.
# It also converts the arrays from np.ndarray (float64) to torch.Tensor (float32);
# float32 is PyTorch's default floating-point dtype.
(l1l2_summary_obs_PS, l1l2_summary_obs_S1, l1l2_summary_obs_S2,
 data_obs_PS3d, param_true) = psbi.l1l2_and_torchify_obs_data(
    data_obs_PS,
    data_obs_S1,
    data_obs_S2,
    data_obs_PS3d,
    param_true,
    n_pixels,
    wavelet_type,
    l1,
    l2,
)

# This function sometimes gives a runtime warning due to a division by 0.
# It happens during the normalization of slices where the brightness temperature is 0 everywhere:
# PS, S1 and S2 are all 0 there, so the normalization divisions (e.g. S1 / sqrt(PS)) give 0/0 = nan.
# Inside the function, every nan is replaced by 0.
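
<div class="text">
    To give an intuition for what an $\ell^1$ / $\ell^2$ summary along the line of sight could look like, here is a rough sketch. It rests on several assumptions that the notebook does not confirm: that the statistic is filtered along the slice axis with a real Morlet wavelet $\exp(-t^2/2)\cos(5t)$ at dyadic scales, and that the summaries are the mean absolute value and the root mean square of the coefficients. <code>l1_l2_summary</code> and <code>n_scales</code> are hypothetical names, and this is not the pietrosbi implementation.
</div>

```python
import numpy as np


def morlet(t):
    """Real-valued Morlet wavelet, exp(-t^2 / 2) * cos(5 t)."""
    return np.exp(-0.5 * t**2) * np.cos(5.0 * t)


def l1_l2_summary(stat, n_scales):
    """Summarise stat of shape (n_slices, n_bins) into 2 * n_scales * n_bins values."""
    n_slices, n_bins = stat.shape
    t = np.arange(n_slices) - n_slices // 2
    l1, l2 = [], []
    for j in range(n_scales):
        scale = 2.0**j  # assumed dyadic scales
        kernel = morlet(t / scale) / np.sqrt(scale)
        # Filter each bin along the line of sight (axis 0)
        coeffs = np.stack(
            [np.convolve(stat[:, b], kernel, mode="same") for b in range(n_bins)],
            axis=1,
        )
        l1.append(np.abs(coeffs).mean(axis=0))        # l1-type summary per bin
        l2.append(np.sqrt((coeffs**2).mean(axis=0)))  # l2-type summary per bin
    return np.concatenate(l1 + l2).astype(np.float32)


rng = np.random.default_rng(0)
stat = rng.random((32, 6))  # 32 slices, 6 bins
summary = l1_l2_summary(stat, n_scales=5)
print(summary.shape, summary.dtype)
```

<div class="text">
    With 32 slices, 6 bins and 5 assumed scales, the sketch returns 2 * 5 * 6 = 60 values.
</div>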

<h2>Function for the $\ell^1$ and $\ell^2$ summaries</h2>

In [6]:
# Number of batches for the simulations
n_batches = 2

# Number of simulations per batch
n_per_batch = 2

# Total number of simulations: n = n_batches * n_per_batch

# Redshift, number of pixels and dimension of the cube
z = 9
n_pixels = 32
dim = 300

# Number of bins for the statistics
bins = 6

# True if you want the l1 summary, otherwise False
l1 = True

# True if you want the l2 summary, otherwise False
l2 = True

# Type of wavelet for the line-of-sight decomposition
wavelet_type = 'morl'

# Starting seed
start_seed = 1

# True: save the statistics. Remember to change the output directory to the one you want
# False: do not save the statistics
save_files = False

# This function doesn't return anything
# If you want to analyse the l1 and l2 summaries, you have to save the files first
# The 21cmFAST cache is cleaned inside the function
psbi.create_batched_data_2(
    n_batches=n_batches,
    n_per_batch=n_per_batch,
    z=z,
    n_pixels=n_pixels,
    dim=dim,
    bins=bins,
    l1=l1,
    l2=l2,
    wavelet_type=wavelet_type,
    start_seed=start_seed,
    save_files=save_files,
)


  self.S2 =self.S2 / S1_rev[:,:,np.newaxis]
  self.S1 = self.S1 / np.sqrt(self.PS)


<h2>Importing the .npy files with the $\ell^1$ and $\ell^2$ summaries</h2>

In [7]:
# Path to the folder with the .npy files of the l1 and l2 summaries
path_to_folder = '/travail/pguidi/data_for_sbi_l1l2/'

# Importing the .npy files of the 'observed' data and the true parameters
# This function also converts np.arrays with float64 to torch.tensor with float32
data_obs_S1 = psbi.import_obs_l1l2(path_to_folder + 'data_obs_S1_l1l2_1000.npy')
print('Obs. S1 has been loaded', type(data_obs_S1), data_obs_S1.shape)
data_obs_PS = psbi.import_obs_l1l2(path_to_folder + 'data_obs_PS_l1l2_1000.npy')
print('Obs. PS has been loaded', type(data_obs_PS), data_obs_PS.shape)
data_obs_PS3d = psbi.import_obs_l1l2(path_to_folder + 'data_obs_PS3d_1000.npy')
print('Obs. PS3d has been loaded', type(data_obs_PS3d), data_obs_PS3d.shape)
data_obs_S2 = psbi.import_obs_l1l2(path_to_folder + 'data_obs_S2_l1l2_1000.npy')
print('Obs. S2 has been loaded', type(data_obs_S2), data_obs_S2.shape)
param_true = np.load(path_to_folder + 'param_true_1000.npy')[:2]
param_true = psbi.torchify(param_true)
print('Obs. param. has been loaded', type(param_true), param_true.shape)

# Importing the .npy files of the simulated data and the list of parameters used for these simulations
# This function also converts np.arrays with float64 to torch.tensor with float32
data_S1 = psbi.import_data_l1l2(path_to_folder + 'data_summary_S1_*.npy')
print('Simulated S1 have been loaded', type(data_S1), data_S1.shape)
data_PS = psbi.import_data_l1l2(path_to_folder + 'data_summary_PS_*.npy')
print('Simulated PS have been loaded', type(data_PS), data_PS.shape)
data_PS3d = psbi.import_data_l1l2(path_to_folder + 'data_summary_PS3d_*.npy')
print('Simulated PS3d have been loaded', type(data_PS3d), data_PS3d.shape)
data_S2 = psbi.import_data_l1l2(path_to_folder + 'data_summary_S2_*.npy')
print('Simulated S2 have been loaded', type(data_S2), data_S2.shape)
param_list = psbi.import_data_l1l2(path_to_folder + 'param_list_*.npy')
print('Simulated params have been loaded', type(param_list), param_list.shape)


Obs. S1 has been loaded <class 'torch.Tensor'> torch.Size([60])
Obs. PS has been loaded <class 'torch.Tensor'> torch.Size([60])
Obs. PS3d has been loaded <class 'torch.Tensor'> torch.Size([6])
Obs. S2 has been loaded <class 'torch.Tensor'> torch.Size([360])
Obs. param. has been loaded <class 'torch.Tensor'> torch.Size([2])
Simulated S1 have been loaded <class 'torch.Tensor'> torch.Size([1000, 60])
Simulated PS have been loaded <class 'torch.Tensor'> torch.Size([1000, 60])
Simulated PS3d have been loaded <class 'torch.Tensor'> torch.Size([1000, 6])
Simulated S2 have been loaded <class 'torch.Tensor'> torch.Size([1000, 360])
Simulated params have been loaded <class 'torch.Tensor'> torch.Size([1000, 2])
