Installing essential packages:

In [None]:
!pip install git+https://github.com/vafaei-ar/ccgpack.git
!pip install healpy

Importing essential packages:

In [None]:
import numpy as np
import healpy as hp
import ccgpack as ccg
import torch
from torch.utils.data import TensorDataset

In this step, I pick one of the images (in this example, alms15) and store the paths to its two files in 2 variables named "filename_l" and "filename_nl" (note that I've already downloaded the images and stored them in the alms folder). The alm_l (Gaussian part) and alm_nl (non-Gaussian part) are then read from these files:

In [None]:
filename_l = '/content/drive/MyDrive/alms/alms15/alm_l_0015_v3.fits/alm_l_0015_v3.fits' # replace this with the path to your alm_l file
filename_nl = '/content/drive/MyDrive/alms/alms15/alm_nl_0015_v3.fits/alm_nl_0015_v3.fits' # replace this with the path to your alm_nl file
alm_l  = hp.fitsfunc.read_alm(filename_l, hdu=1) # reading alm_l file using healpy
alm_nl = hp.fitsfunc.read_alm(filename_nl, hdu=1) # reading alm_nl file using healpy

f_NL is the desired level of non-Gaussianity. It can be any number (100 in this example).
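
Written out, the combination applied in the next cell is linear in the harmonic coefficients. Here $a_{\ell m}^{\mathrm{L}}$ and $a_{\ell m}^{\mathrm{NL}}$ stand for the contents of alm_l and alm_nl (the symbols are notation chosen for this note, not names taken from the files):

$$
a_{\ell m} = a_{\ell m}^{\mathrm{L}} + f_{\mathrm{NL}} \, a_{\ell m}^{\mathrm{NL}}
$$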

In [None]:
f_NL = 100
alm = alm_l + f_NL * alm_nl # the principal formula for adding non-Gaussianity
sky_map = hp.sphtfunc.alm2map(alm, nside=2048, lmax=1024, fwhm=0.0) # computes a HEALPix map from the alm (nside sets the resolution of the map)
sky_map = hp.reorder(sky_map, r2n=True) # converting from RING to NESTED ordering
sky_map -= np.min(sky_map) # shifting the minimum to zero
sky_map /= (np.max(sky_map)/255.) # rescaling so that the maximum is 255
patch = ccg.sky2patch(sky_map, 8) # cutting the map into square patches (8 x 8 per HEALPix base face)
print(patch.shape)



(768, 256, 256)
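
The printed shape can be checked by hand. The sketch below assumes that `ccg.sky2patch` tiles each of the 12 HEALPix base faces into an 8 x 8 grid, which is consistent with the output above but is not taken from the ccgpack documentation. The names `n_split`, `n_patches` and `patch_side` are only illustrative.

```python
# HEALPix covers the sphere with 12 base faces of nside x nside pixels each.
nside = 2048   # resolution passed to alm2map above
n_split = 8    # second argument passed to ccg.sky2patch above

n_pixels = 12 * nside ** 2        # total number of pixels in the map
n_patches = 12 * n_split ** 2     # 64 tiles per face, 12 faces
patch_side = nside // n_split     # side length of one tile in pixels

print(n_patches, patch_side, patch_side)  # 768 256 256

# The tiles account for every pixel of the map exactly once.
assert n_patches * patch_side ** 2 == n_pixels
```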


As you can see, we created 768 different patches with f_NL = 100 from alms15.
To create a complete data set, you have to gather all the patches generated for a given f_NL into a single array using the ```np.concatenate``` function and save it to a single file.
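
A minimal sketch of that step, using tiny synthetic arrays in place of the real `patch` arrays. The names `patches_a`, `patches_b` and the output file name are hypothetical, and calling the result `image` is an assumption made to match the variable used later in this notebook.

```python
import numpy as np

# Stand-ins for the `patch` arrays obtained from two different alm files
# (tiny shapes here; the real arrays are (768, 256, 256)).
patches_a = np.zeros((4, 8, 8), dtype=np.float32)
patches_b = np.ones((4, 8, 8), dtype=np.float32)

# Stack along the first axis so that each patch keeps its own index.
image = np.concatenate([patches_a, patches_b], axis=0)
print(image.shape)  # (8, 8, 8)

# Hypothetical file name; np.save appends ".npy" when it is missing.
# np.save('patches_fnl100', image)
```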





After that, I create labels using ```np.full```. For example:

In [None]:
lbl = np.full((768,1),1)
# if you have more than 768 patches, you have to change this number here!
# I choose the label of f_NL = 100 to be 1 (in PyTorch, the range of your labels depends on the number of the network's outputs)

At the end, I make a tensor-based data set using ```TensorDataset```:

In [None]:
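dataset = TensorDataset(torch.tensor(image, dtype=torch.float32).unsqueeze(1), torch.tensor(lbl).squeeze())
# Note: image is assumed to be the concatenated array of patches described above
# Note: CNN layers expect 4D input (batch, channel, height, width), so we have to unsqueeze the images
# Note: CrossEntropyLoss only accepts 1D labels, so I have to squeeze out the additional dimension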
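
To check that the shapes come out as the notes describe, the data set can be wrapped in a `DataLoader` and one batch inspected. This is only a sketch on synthetic data: the array sizes and the batch size are arbitrary, and the explicit `dtype` arguments are an assumption about what a typical CNN with `CrossEntropyLoss` expects (float32 inputs, integer class labels).

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-ins: 6 patches of 8 x 8 pixels, all with class label 1.
image = np.random.rand(6, 8, 8)
lbl = np.full((6, 1), 1)

dataset = TensorDataset(
    torch.tensor(image, dtype=torch.float32).unsqueeze(1),  # (N, 1, H, W)
    torch.tensor(lbl, dtype=torch.long).squeeze(1),         # (N,)
)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

# Draw one batch and look at its shapes.
x, y = next(iter(loader))
print(x.shape, y.shape)
```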

As explained above, our dataset has now been generated. But we could adopt another approach to create it. For example, we can save the images one by one:

In [None]:
for i in range(768):
  im = patch[i]
  np.save('/content/.../1{}'.format(i),im) # again, my label is 1, which corresponds to f_NL = 100 (you can use this trick to load the labels too)

This is helpful when you want to extract the label from the image's name. However, with this approach you must define a custom data loader.
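
One possible shape for such a loader is a small `Dataset` subclass that reads each `.npy` file and takes the label from the first character of its name. The class name `PatchFolder` is hypothetical, and the sketch assumes single-digit labels, as in the naming scheme above. It runs on synthetic patches written to a temporary folder.

```python
import os
import tempfile

import numpy as np
import torch
from torch.utils.data import Dataset


class PatchFolder(Dataset):
    """Hypothetical dataset: .npy patches whose file name starts with the label digit."""

    def __init__(self, folder):
        self.paths = sorted(
            os.path.join(folder, name)
            for name in os.listdir(folder)
            if name.endswith('.npy')
        )

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        path = self.paths[idx]
        # The first character of the file name is the class label, e.g. '17.npy' -> 1.
        label = int(os.path.basename(path)[0])
        # Add a channel dimension so that one item has shape (1, H, W).
        patch = torch.from_numpy(np.load(path)).float().unsqueeze(0)
        return patch, label


# Self-contained check with synthetic patches saved using the same naming scheme.
folder = tempfile.mkdtemp()
for i in range(3):
    np.save(os.path.join(folder, '1{}'.format(i)), np.zeros((8, 8)))

data = PatchFolder(folder)
patch0, label0 = data[0]
print(len(data), patch0.shape, label0)
```

An instance of this class can be passed to `torch.utils.data.DataLoader` in the same way as a `TensorDataset`.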