# Create Noise Set

Based on Python 3.
Required software:
1. pandas
2. numpy
3. sigproc

In this notebook we create the CSV file that lists the noise files. Ideally you have a set of real pulsar observations. If not, you will have to create them, for example with sigproc's fast_fake.
The files need to have the same parameters as the simulated files, which means that if you use downsampling in 'create_training_set.ipynb' you will also have to downsample your noise files in the same way.

If your set contains real pulsars these have to be manually labelled later.
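
The requirement that noise files match the simulated files can be made concrete with a little arithmetic. The sketch below computes the channel count and sampling time to expect after downsampling. The function name `expected_after_decimate` and the input values are hypothetical and only serve as an illustration. The divisibility check is an assumption of this sketch, not a documented requirement of sigproc.

```python
def expected_after_decimate(nchans, tsamp, t_downsample, f_downsample):
    """Return the (nchans, tsamp) expected after downsampling.

    nchans: number of frequency channels before downsampling
    tsamp: sampling time in seconds before downsampling
    """
    # Assumption of this sketch: channels are combined in whole groups.
    if nchans % f_downsample != 0:
        raise ValueError("nchans is not divisible by f_downsample")
    return nchans // f_downsample, tsamp * t_downsample


# Illustrative input values; replace them with the header values of your data.
new_nchans, new_tsamp = expected_after_decimate(
    nchans=96, tsamp=250e-6, t_downsample=10, f_downsample=2)
print(new_nchans, new_tsamp)
```

Comparing these numbers with the header values of the simulated files is a quick way to spot a mismatch before training.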

In [1]:
import pandas as pd
import numpy as np
import glob
import os

In [9]:
# Adjust the path and file mask according to your input data.
# Only the first 40 matches are used; glob returns files in arbitrary order.
raw_files = glob.glob('/data/lkuenkel/data/PMPS/1997AUGT/raw/*.sf')[:40]

In [3]:
output_path = '/data/lkuenkel/data/pipeline_test/noise_data/'
# Create the output directory; no error is raised if it already exists
os.makedirs(output_path, exist_ok=True)


In [4]:
# Parameters for downsampling
t_downsample = 10
f_downsample = 2
nbits = 8

dummy_path = '/data/lkuenkel/data/pipeline_test/dummy.fil'
set_name = 'noise_sample'

In [5]:
%%capture
new_files = []
for raw_file in raw_files:
    # Convert the raw file to a sigproc filterbank file.
    # No conversion is needed if the data is already in sigproc filterbank format.
    !filterbank {raw_file} > {dummy_path}
    file_name = os.path.basename(raw_file)
    out_path = os.path.join(output_path, os.path.splitext(file_name)[0] + '.fil')
    # Downsample in frequency (-c) and time (-t) and set the output bit depth (-n)
    !decimate -c {f_downsample} -t {t_downsample} -n {nbits} {dummy_path} > {out_path}
    new_files.append(out_path)
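
The `!` shell escapes in the cell above only work inside IPython or Jupyter. As a minimal sketch of how the same two steps could be run from a plain Python script, the block below builds the command lines with the flags used above and redirects stdout to a file through `subprocess`. The helper names `build_commands` and `run_to_file` are hypothetical, and the sketch assumes that `filterbank` and `decimate` are on the `PATH`.

```python
import os
import subprocess


def build_commands(raw_file, output_dir, dummy_path,
                   f_downsample, t_downsample, nbits):
    """Return the output path and the two command lines for one raw file."""
    stem = os.path.splitext(os.path.basename(raw_file))[0]
    out_path = os.path.join(output_dir, stem + '.fil')
    convert_cmd = ['filterbank', raw_file]
    decimate_cmd = ['decimate', '-c', str(f_downsample), '-t', str(t_downsample),
                    '-n', str(nbits), dummy_path]
    return out_path, convert_cmd, decimate_cmd


def run_to_file(cmd, target):
    """Run cmd and write its stdout to target, like `cmd > target` in a shell."""
    with open(target, 'wb') as handle:
        subprocess.run(cmd, stdout=handle, check=True)


def process_file(raw_file, output_dir, dummy_path,
                 f_downsample, t_downsample, nbits):
    """Convert one raw file and downsample it; returns the output path."""
    out_path, convert_cmd, decimate_cmd = build_commands(
        raw_file, output_dir, dummy_path, f_downsample, t_downsample, nbits)
    run_to_file(convert_cmd, dummy_path)
    run_to_file(decimate_cmd, out_path)
    return out_path
```

With `check=True` a failing sigproc call raises `subprocess.CalledProcessError`, so the loop stops instead of silently writing an empty file.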

In [6]:
psr_names = ['',] * len(new_files)
periods = [np.nan,] * len(new_files)
dms = [np.nan,] * len(new_files)
labels = [2,] * len(new_files)
snrs = [np.nan,] * len(new_files)

In [7]:
data_dict = {'JNAME': psr_names, 'P0': periods, 'DM': dms, 'Label': labels,
             'FileName': new_files, 'SNR': snrs}
df = pd.DataFrame(data=data_dict)

Label 2 indicates a real observation without a known pulsar. Known pulsars should be given the label 3.
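
Relabelling known pulsars by hand could look like the following minimal sketch. It uses a toy table with the same columns as `df`. The file names, the pulsar name `J0000+0000` and the `known_pulsars` mapping are hypothetical placeholders. Filling in `JNAME` is an assumption of this sketch; the text above only prescribes the label.

```python
import numpy as np
import pandas as pd

# Toy table with the same columns as the noise set (hypothetical file names)
toy_df = pd.DataFrame({
    'JNAME': ['', '', ''],
    'P0': [np.nan, np.nan, np.nan],
    'DM': [np.nan, np.nan, np.nan],
    'Label': [2, 2, 2],
    'FileName': ['/noise/obs_a.fil', '/noise/obs_b.fil', '/noise/obs_c.fil'],
    'SNR': [np.nan, np.nan, np.nan],
})

# Hypothetical mapping from file name to the known pulsar it contains
known_pulsars = {'obs_b.fil': 'J0000+0000'}

for file_name, jname in known_pulsars.items():
    # Select the rows whose path ends with the given file name
    mask = toy_df['FileName'].str.endswith(file_name)
    toy_df.loc[mask, 'Label'] = 3
    toy_df.loc[mask, 'JNAME'] = jname

print(toy_df[['FileName', 'Label', 'JNAME']])
```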

In [8]:
df.to_csv(f'../datasets/noiseset_{set_name}.csv')
print(f"Created: ../datasets/noiseset_{set_name}.csv")
print(f"To use the set use the option: --path_noise noiseset_{set_name}.csv")

Created: ../datasets/noiseset_noise_sample.csv
To use the set use the option: --path_noise noiseset_noise_sample.csv
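
As a quick sanity check, the written file can be read back with pandas. The sketch below does the round trip on a toy table through an in-memory buffer instead of the real file. Because `to_csv` is called without `index=False`, the row index is stored as the first column, so `index_col=0` is used when reading. Note that the empty `JNAME` strings come back as NaN with the default `read_csv` settings.

```python
import io

import numpy as np
import pandas as pd

# Toy table with the same columns as the noise set (hypothetical file names)
toy_df = pd.DataFrame({
    'JNAME': ['', ''],
    'P0': [np.nan, np.nan],
    'DM': [np.nan, np.nan],
    'Label': [2, 2],
    'FileName': ['a.fil', 'b.fil'],
    'SNR': [np.nan, np.nan],
})

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
toy_df.to_csv(buffer)
buffer.seek(0)

# The first column of the CSV is the saved row index
loaded = pd.read_csv(buffer, index_col=0)
print(loaded.dtypes)
```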
