# Open spectrum from root-files 

# Not an exercise for students! 
Author:

J. Angevaare // <j.angevaare@nikhef.nl> // 2020-05-25

Until now we have only dealt with small files that make it easy to see what is going on. At some point we may want more data, either from the stoomboot computing cluster or from the root-file included in this folder. This notebook shows how to get it, and we make an example coincidence plot for Ti-44 using much more data than in the previous tutorials.

Below we:
 - locate a file on the stoomboot computing cluster
 - open it using uproot
 - show a calibrated spectrum
 - show a Ti-44 coincidence plot

# (ONLY ON NIKHEF CLUSTER)
## Locating file:
Check where files live on stoomboot (NB: this only works on stoomboot, not on your own machine!)


In [1]:
import socket

In [52]:
def on_stoomboot():
    '''Raise an error unless this code runs on a Nikhef (stoomboot) host'''
    host = socket.gethostname()
    if 'nikhef' not in host:
        raise ValueError(f'You are not on stoomboot but on {host}. '
                         'You cannot do this operation! Perhaps continue below')
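
For cells that should be skipped rather than fail when run off-cluster, a non-raising variant of the check can be handy. The sketch below is an assumption and not part of the original notebook: `is_on_stoomboot` is a hypothetical helper, and the example hostnames are made up. It takes the hostname as an optional argument so it can be tried on any machine.

```python
import socket


def is_on_stoomboot(host=None):
    """Hypothetical helper: return True if the hostname looks like a Nikhef host.

    Uses the same 'nikhef' substring test as on_stoomboot(), but returns
    a bool instead of raising an error.
    """
    if host is None:
        host = socket.gethostname()
    return 'nikhef' in host


# Made-up hostnames, for illustration only
print(is_on_stoomboot('stbc-i1.nikhef.nl'))
print(is_on_stoomboot('my-laptop.local'))
```

A cell could then start with `if is_on_stoomboot():` and simply do nothing elsewhere.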
        

In [3]:
# This is where files are stored (let's "grep" something from January this year)
on_stoomboot()
!ls /data/modulation/Raw_Data/combined/ | grep 202001

mx_n_20200104_1055
mx_n_20200106_0821
mx_n_20200110_0847
mx_n_20200114_0815
mx_n_20200117_0916
mx_n_20200120_0644
mx_n_20200124_1141
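
The same listing can be made in plain Python instead of with `!ls | grep`, which is convenient when the result is needed as a list later on. A minimal sketch, assuming only the folder path shown above; `list_runs` is a hypothetical helper name and not part of the original notebook.

```python
import os


def list_runs(folder, pattern):
    """Hypothetical helper: sorted names in `folder` that contain `pattern`."""
    return sorted(name for name in os.listdir(folder) if pattern in name)


# Only try the real path if it exists (i.e. on stoomboot)
raw_folder = '/data/modulation/Raw_Data/combined/'
if os.path.isdir(raw_folder):
    print(list_runs(raw_folder, '202001'))
```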


In [128]:
# This is where processed root files live, let's look at the top folder of the list above
on_stoomboot()
!ls -lthr /dcache/xenon/tmons/Modulation/processed/mx_n_20200104_1055 | tail -5

-rw-r--r--.  1 jorana xenon  80M Jan  6 11:01 mx_n_20200104_1055_000051.root
-rw-r--r--.  1 jorana xenon  80M Jan  6 11:02 mx_n_20200104_1055_000052.root
-rw-r--r--.  1 jorana xenon  80M Jan  6 11:02 mx_n_20200104_1055_000053.root
-rw-r--r--.  1 jorana xenon  80M Jan  6 11:03 mx_n_20200104_1055_000054.root
-rw-r--r--.  1 jorana xenon  38M Jan  6 11:03 mx_n_20200104_1055_000055.root
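
Each file name ends in a zero-padded index just before the `.root` extension. It can be split off with plain string methods, as in this small sketch (`file_index` is a hypothetical helper name used only for illustration):

```python
def file_index(root_file):
    """Hypothetical helper: index part of a name like 'mx_n_20200104_1055_000055.root'."""
    # Take the part after the last underscore and drop the extension
    return root_file.split('_')[-1].split('.root')[0]


print(file_index('mx_n_20200104_1055_000055.root'))  # prints 000055
```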


In [129]:
# Let's copy that last one as it is not too big
on_stoomboot()
!cp '/dcache/xenon/tmons/Modulation/processed/mx_n_20200104_1055/mx_n_20200104_1055_000055.root' ../data/.

In [130]:
on_stoomboot()
!ls -lthr ../data

total 5.5G
-rw-r--r--. 1 jorana xenon 3.9M Jun  7 21:33 Co60_sample.csv
drwxr-xr-x. 2 jorana xenon 4.0K Jun  7 22:01 bg
drwxr-xr-x. 2 jorana xenon 4.0K Jun  7 22:01 ti44
drwxr-xr-x. 2 jorana xenon 4.0K Jun  7 22:01 co60
drwxr-xr-x. 2 jorana xenon 4.0K Jun  7 22:01 cs137
-rw-r--r--. 1 jorana xenon  64M Jun  7 22:01 bg_dat.zip
-rw-r--r--. 1 jorana xenon 3.3G Jun  7 22:02 ti44_dat.zip
-rw-r--r--. 1 jorana xenon 843M Jun  7 22:02 co60_dat.zip
-rw-r--r--. 1 jorana xenon 1.3G Jun  7 22:02 cs137_dat.zip
drwxr-xr-x. 2 jorana xenon   80 Jun  7 22:36 copy_bg
-rw-r--r--. 1 jorana xenon  58K Jun  7 22:39 copy_bg.zip
-rw-r--r--. 1 jorana xenon  38M Jun  7 22:54 mx_n_20200104_1055_000055.root


## (BONUS)
### Generate very large files
As you may have seen above, this is only one of the many files we have available (about 50 TB in total, so more than you want to imagine). Below, we show how to get more data. Be aware that this may not be required, takes more time and may make your computer unhappy.


In [47]:
# Load some packages
import os
import tqdm
import socket
import uproot  # needed below to open the root files

In [50]:
# Load all the data on this path
# NB: only works on stoomboot!
# NB: takes 20 minutes!
on_stoomboot()
root_folder = '/dcache/xenon/tmons/Modulation/processed/mx_n_20200104_1055'
store_columns = ['channel', 'integral', 'time']
sources = {'bg': [0, 1], 'ti44': [2, 3], 'co60': [4, 5], 'cs137': [6, 7]}
for root_file in tqdm.tqdm(os.listdir(root_folder)):
    if '.root' not in root_file:
        # This is not a root file
        continue
    # File index, e.g. '000055' for 'mx_n_20200104_1055_000055.root'
    idx = root_file.split('_')[-1].split('.root')[0]
    path = os.path.join(root_folder, root_file)
    file = uproot.open(path)
    tree = file['T;2']
    # NB: tree.pandas.df() is the uproot3-style way to get a DataFrame
    data = tree.pandas.df()
    for source, channels in sources.items():
        # The channel selection needs its own parentheses: & binds tighter than |
        mask = (
            ((data['channel'] == channels[0]) | (data['channel'] == channels[1]))
            & (data['istestpulse'] == 0)
            & (data['error'] == 0)
        )
        save_dir = f'../data/{source}'
        if not os.path.exists(save_dir):
            os.mkdir(save_dir)
        save_name = f'{save_dir}/{source}_chunck_{idx}.csv'
        data[mask][store_columns].to_csv(save_name, index=False)
    # Double check that we free up memory
    del data, mask

NameError: name 'on_stoomboot' is not defined
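
When combining conditions in a selection mask like the one in the loop above, keep in mind that `&` binds more tightly than `|` in Python. The channel selection therefore needs its own parentheses if the test-pulse and error cuts are meant to apply to both channels. The sketch below uses plain booleans with made-up values; pandas masks follow the same operator precedence.

```python
# Made-up values for a single event
in_first_channel = True    # event is in the first channel of the source
in_second_channel = False  # event is in the second channel of the source
is_clean = False           # event is a test pulse or has an error flag

# Evaluated as: in_first_channel | (in_second_channel & is_clean)
without_parentheses = in_first_channel | in_second_channel & is_clean

# Evaluated as intended: the quality cut applies to both channels
with_parentheses = (in_first_channel | in_second_channel) & is_clean

print(without_parentheses)  # prints True: the bad event slips through
print(with_parentheses)     # prints False: the bad event is rejected
```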

In [51]:
# Alright, let me zip this data for you. After that you can download it on your own laptop
import zipfile

In [56]:
# Let's zip all the data in these files
on_stoomboot()
for source in sources.keys():
    save_dir = f'../data/{source}'
    # The with-block closes the archive so that it is fully written to disk
    with zipfile.ZipFile(f'../data/{source}_dat.zip', 'w') as zipObj:
        for f in tqdm.tqdm(os.listdir(save_dir)):
            if '.csv' in f:
                path = os.path.join(save_dir, f)
                zipObj.write(path)

100%|██████████| 56/56 [00:00<00:00, 133.19it/s]
100%|██████████| 56/56 [00:20<00:00,  2.67it/s]
100%|██████████| 56/56 [00:05<00:00,  9.64it/s]
100%|██████████| 56/56 [00:08<00:00,  6.85it/s]
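
Before downloading an archive it can be worth checking that it contains what you expect. The standard-library `zipfile` module can list and verify the contents. A minimal sketch, using a temporary folder and made-up file names (`demo_chunck_*.csv`, `demo_dat.zip`) so that it runs on any machine:

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Made-up stand-ins for the csv chunks
    for i in range(3):
        with open(os.path.join(tmp, f'demo_chunck_{i}.csv'), 'w') as f:
            f.write('channel,integral,time\n')

    zip_path = os.path.join(tmp, 'demo_dat.zip')
    with zipfile.ZipFile(zip_path, 'w') as zip_obj:
        for name in sorted(os.listdir(tmp)):
            if name.endswith('.csv'):
                # arcname stores only the file name, not the full path
                zip_obj.write(os.path.join(tmp, name), arcname=name)

    # Re-open the archive for reading and inspect it
    with zipfile.ZipFile(zip_path) as zip_obj:
        archived = zip_obj.namelist()
        first_bad = zip_obj.testzip()  # None if all checksums are fine

print(archived)
print(first_bad)
```

The `arcname` argument is a design choice in this sketch: without it, the archive stores the path that was passed to `write`.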


In [63]:
# Great, our data is zipped and lives at:
!ls ../data | grep zip

bg_dat.zip
co60_dat.zip
cs137_dat.zip
ti44_dat.zip
