In [1]:
import numpy as np
import matplotlib.pyplot as plt

## Importing the data for each redshift

In [2]:
#z = 6.905
z = 7.305
# z = 8.397
# z = 10.110

n_igm = np.load('../dataset/rho_z%.3f.npy' %z)  # density of intergalactic medium
n_src = np.load('../dataset/nsrc_z%.3f.npy' %z) # number of sources per volume
xi = np.load('../dataset/xHII_z%.3f.npy' %z) # ionization fraction
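The three file names above follow one pattern per redshift, which can be sketched as a small helper. The function name `dataset_paths` and its `root` argument are hypothetical and not part of the original notebook; only the file-name pattern is taken from the cell above.

```python
from pathlib import Path


def dataset_paths(z, root="../dataset"):
    """Return the three .npy paths for redshift z (hypothetical helper)."""
    tag = "z%.3f" % z  # same formatting as in the loading cell
    return {
        "n_igm": Path(root) / ("rho_%s.npy" % tag),   # IGM density
        "n_src": Path(root) / ("nsrc_%s.npy" % tag),  # sources per volume
        "xi": Path(root) / ("xHII_%s.npy" % tag),     # ionization fraction
    }


paths = dataset_paths(7.305)
for name, path in paths.items():
    print(name, path)
```

Each entry could then be passed to `np.load`, which would make switching redshift a one-argument change.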

Definition of additional parameters:

$\color{red}{\text{Not completely sure about max_mfp, to ask}}$

In [3]:
box_res = 2.381 # length of one pixel edge, in cMpc
max_mfp = 57.14 # max distance covered by source ionization, in cMpc (24 pixels, i.e. a subvolume of side 24+1+24 pixels)
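As a quick consistency check on the two parameters, the sketch below converts `max_mfp` into pixels. It assumes that `max_mfp` is a radius in cMpc around a source, which is how the "24+1+24" in the comment reads but is not confirmed here; the names `radius_px`, `side_px` and `side_cmpc` are introduced only for illustration.

```python
box_res = 2.381  # cMpc per pixel edge (value from the notebook)
max_mfp = 57.14  # cMpc (value from the notebook; assumed to be a radius)

# Number of pixels spanned by max_mfp on each side of a source
radius_px = round(max_mfp / box_res)

# Side of the subvolume centred on the source: radius + central pixel + radius
side_px = 2 * radius_px + 1
side_cmpc = side_px * box_res

print("radius in pixels:", radius_px)
print("subvolume side in pixels:", side_px)
print("subvolume side in cMpc: %.2f" % side_cmpc)
```

Under this reading, `max_mfp` corresponds to 24 pixels and the subvolume has a side of 49 pixels.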

Shape of each array in the dataset:

In [4]:
print (n_igm.shape)
print (n_src.shape)
print (xi.shape)

pixel_per_edge = n_igm.shape[0] # = 300

(300, 300, 300)
(300, 300, 300)
(300, 300, 300)


All three quantities are therefore evaluated on a coeval cube of 300 pixels per dimension.

Before moving on to the normalization, recall that $n_{igm}$ and $n_{src}$ are expressed in different units: the former is a density in the CGS unit system, the latter a quantity per comoving megaparsec. It is therefore reasonable to convert them to the same unit system. Formally, using $1\,\mathrm{km} = 10^{5}\,\mathrm{cm}$ and $1\,\mathrm{pc} \approx 3.09 \cdot 10^{13}\,\mathrm{km}$:

\begin{equation}
[n_{igm}] = \frac{\mathrm{g}}{\mathrm{cm}^3} = 10^{15} \, \frac{\mathrm{g}}{\mathrm{km}^3} = 10^{15} \cdot 3.09^3 \cdot 10^{39} \, \frac{\mathrm{g}}{\mathrm{pc}^3} = 2.95 \cdot 10^{55} \, \frac{\mathrm{g}}{\mathrm{pc}^3}
\end{equation}

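The following transformation is then applied. Note that it gives $n_{igm}$ in g/pc$^3$; a further factor of $10^{18}$ would be needed to reach g per cubic megaparsec.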

In [5]:
n_igm = 2.95 * 10**55 * n_igm
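The conversion factor can be rebuilt from the two length ratios instead of being hard-coded. The sketch below is only a check of the arithmetic; the constant names are introduced here, and the parsec value is the rounded $3.09 \cdot 10^{13}$ km implied by the equation above.

```python
CM_PER_KM = 1.0e5    # exact
KM_PER_PC = 3.09e13  # rounded value implied by the equation above
PC_PER_MPC = 1.0e6   # exact

# A density in g/cm^3 is multiplied by (cm per pc)^3 to become g/pc^3
cm_per_pc = CM_PER_KM * KM_PER_PC
g_cm3_to_g_pc3 = cm_per_pc ** 3

# One more factor of (pc per Mpc)^3 would give g/Mpc^3
g_cm3_to_g_mpc3 = g_cm3_to_g_pc3 * PC_PER_MPC ** 3

print("g/cm^3 -> g/pc^3 : %.3e" % g_cm3_to_g_pc3)
print("g/cm^3 -> g/Mpc^3: %.3e" % g_cm3_to_g_mpc3)
```

The first factor reproduces the $2.95 \cdot 10^{55}$ used in the cell above. Since the feature is standardized later, a constant factor rescales the mean and standard deviation together and does not affect the normalized values.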

## Normalization

In [6]:
print('Chosen redshift:   ', z)
print('Mean Ionisation fraction:  ', np.mean(xi))

Chosen redshift:    7.305
Mean Ionisation fraction:   0.7533418720408674


We now look in more detail at what happens along each direction. As a first step, we inspect the two features and the output along the three orthogonal axes passing through the center of the cube.

In [7]:
x0 = pixel_per_edge // 2 # = 150, central pixel of the cube edge

n_igm_x = n_igm[:,x0,x0] 
n_src_x = n_src[:,x0,x0]
xi_x = xi[:,x0,x0]

n_igm_y = n_igm[x0,:,x0] 
n_src_y = n_src[x0,:,x0]
xi_y = xi[x0,:,x0]

n_igm_z = n_igm[x0,x0,:] 
n_src_z = n_src[x0,x0,:]
xi_z = xi[x0,x0,:]

print ('\n                                   MEAN VALUE ALONG EACH DIRECTION   \n')
print ('                   x                            y                               z')
print ('n_igm    ', np.mean(n_igm_x), '     ', np.mean(n_igm_y), '         ', np.mean(n_igm_z))
print ('n_src    ', np.mean(n_src_x), '         ', np.mean(n_src_y), '            ', np.mean(n_src_z))
print ('xi       ', np.mean(xi_x), '         ', np.mean(xi_y), '            ', np.mean(xi_z))


                                   MEAN VALUE ALONG EACH DIRECTION   

                   x                            y                               z
n_igm     1.1564589789595674e+25       1.1130535012706073e+25           1.1797878933858357e+25
n_src     20918.686235631307           17146.738710606893              19699.80898203532
xi        0.7047237362224594           0.6585446459674628              0.7675350039049594
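The slicing used above can be wrapped in a small function and tried on a toy cube. This is a minimal sketch: `central_lines` is a hypothetical helper, and the random 8-pixel cube stands in for the real data.

```python
import numpy as np


def central_lines(cube):
    """Return the three 1D lines through the central pixel of a cubic array."""
    c = cube.shape[0] // 2  # central index, as x0 in the notebook
    return cube[:, c, c], cube[c, :, c], cube[c, c, :]


rng = np.random.default_rng(0)
toy = rng.random((8, 8, 8))  # toy stand-in for n_igm, n_src or xi

line_x, line_y, line_z = central_lines(toy)
print(line_x.shape, line_y.shape, line_z.shape)
print(line_x.mean(), line_y.mean(), line_z.mean())
```

Each line of the real cube holds only 300 of the 27,000,000 voxels, so the per-axis means are noisy estimates and some scatter around the global mean (0.753 for `xi`) is expected.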


The two features $n_{igm}$ and $n_{src}$ clearly live on extremely different scales, so the preprocessing of the data must include a normalization. As there is no significant difference of scale among the three directions, a first attempt is to normalize each feature over the whole cube, using its global mean and standard deviation $\sigma$.

In [8]:
mean_n_igm = np.mean(n_igm)
std_n_igm = np.std(n_igm)
mean_n_src = np.mean(n_src)
std_n_src = np.std(n_src)
mean_n_xi = np.mean(xi)
std_n_xi = np.std(xi)

n_igm_norm = (n_igm - mean_n_igm) / std_n_igm
n_src_norm = (n_src - mean_n_src) / std_n_src
xi_norm = (xi - mean_n_xi) / std_n_xi


print ('\n              MIN AND MAX VALUES AFTER NORMALIZATION   \n')
print ('                  min                          max ')
print ('n_igm    ', n_igm_norm.min(), '         ', n_igm_norm.max())
print ('n_src    ', n_src_norm.min(), '        ', n_src_norm.max())
print ('xi       ', xi_norm.min(), '         ', xi_norm.max())


              MIN AND MAX VALUES AFTER NORMALIZATION   

                  min                          max 
n_igm     -2.168842398716399           15.120823603045926
n_src     -0.5434308937395814          156.6221826854396
xi        -2.523045976013896           0.9082910299731384
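The standardization above can be checked on synthetic data. The sketch below is illustrative only: `standardize` is a hypothetical helper and the log-normal toy cube is an assumption, chosen because it is skewed like a source-count field might be.

```python
import numpy as np


def standardize(a):
    """Subtract the global mean and divide by the global standard deviation."""
    return (a - a.mean()) / a.std()  # np.std uses ddof=0, as in the cell above


rng = np.random.default_rng(0)
toy = rng.lognormal(size=(16, 16, 16))  # skewed toy data, not the real cube

toy_norm = standardize(toy)
print("mean:", toy_norm.mean())
print("std: ", toy_norm.std())
print("min, max:", toy_norm.min(), toy_norm.max())
```

The result has zero mean and unit standard deviation up to floating-point error, but the shape of the distribution is unchanged. A skewed feature therefore stays skewed, which is consistent with the asymmetric range found above for `n_src_norm`.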
