# Check what data we have

Tunnell, February 2016

This tutorial describes how to determine which datasets we have.

## Introduction

hax keeps track of the datasets that have been taken.  For XENON100, this is just a CSV file; for XENON1T, it will be a database.  Because we're having manpower issues getting XENON100 data reprocessed, not every dataset is available, so this notebook will give you an idea of which datasets were processed.

## Boilerplate startup

In [1]:
%run boiler_plate.py
%matplotlib inline 



## Specify path

You may want to set some special hax configuration options. For example, use:

    hax.config.CONFIG['main_data_paths'].append('/path/to/my/secret/data')
    hax.runs.update_datasets()

to add `/path/to/my/secret/data` to the paths hax searches for datasets. Note that the path is appended as a string; appending a list would nest it inside the list of paths.

In [5]:
## Specify your own data location
hax.config.CONFIG['main_data_paths'] = ['/tmp/data/good/']
hax.runs.update_datasets()
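
The two snippets above differ in one important way: `append` adds a search path, while assignment replaces all of them. The sketch below illustrates this with a plain dictionary standing in for `hax.config.CONFIG`; the dictionary and the `/default/data` entry are assumptions made for illustration, not hax's actual defaults.

```python
# Stand-in for hax.config.CONFIG: a plain dict used only for illustration.
# '/default/data' is a hypothetical pre-existing search path.
CONFIG = {'main_data_paths': ['/default/data']}

# Appending keeps the existing search paths and adds one more.
CONFIG['main_data_paths'].append('/path/to/my/secret/data')
appended = list(CONFIG['main_data_paths'])

# Assigning a new list discards every previously configured path.
CONFIG['main_data_paths'] = ['/tmp/data/good/']
replaced = list(CONFIG['main_data_paths'])

print(appended)
print(replaced)
```

In either case, the notebook calls `hax.runs.update_datasets()` afterwards so that the dataset list reflects the new paths.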

## Using standard data

We can query for datasets where we know the location of the ROOT file and the dataset is marked 'standard' (i.e. good for analysis).

In [11]:
query = 'location != "" & category == "standard"'
datasets = hax.runs.DATASETS.query(query)
datasets

Unnamed: 0,name,source,position,trigger,anode,cathode,shield,livetime,corrected_livetime,events,corrected_events,processed,category,comment,run,tpc,location
447,xe100_110207_1015,Cs137,red,S1,4.0,16,closed,1811.180,1811.2,50000,50000,0.4.5,standard,"Trigger 60/51, holdoff 1ms, PMT 177 Off, PMT89...",10,xenon100,/tmp/data/good/xe100_110207_1015.root
473,xe100_110210_1100,AmBe,other,S1,4.4,16,closed,202.160,202.2,5491,5491,0.4.5,standard,"Trg 60/51, holdoff 1ms, HE veto (2x10dB/4/100/40)",10,xenon100,/tmp/data/good/xe100_110210_1100.root
475,xe100_110210_1412,AmBe,other,S1,4.4,16,closed,2025.080,1172.0,47227,27961,0.4.5,standard,"Trg 60/51, holdoff 1ms, HE veto (2x10dB/4/100/40)",10,xenon100,/tmp/data/good/xe100_110210_1412.root
483,xe100_110211_0532,AmBe,other,S1,4.4,16,closed,2274.460,2274.5,51229,51229,0.4.5,standard,"Trg 60/51, holdoff 1ms, HE veto (2x10dB/4/100/40)",10,xenon100,/tmp/data/good/xe100_110211_0532.root
488,xe100_110211_1650,Cs137,red,S1,4.0,16,closed,1774.500,1774.5,50000,50000,0.4.5,standard,"Trigger 60/51, holdoff 1ms",10,xenon100,/tmp/data/good/xe100_110211_1650.root
496,xe100_110216_1055,Cs137,red,S1,2.2,16,closed,1909.350,1909.3,50000,50000,0.4.5,standard,"Trigger 60/51, holdoff 1ms",10,xenon100,/tmp/data/good/xe100_110216_1055.root
506,xe100_110223_1107,Cs137,red,S1,2.2,16,closed,1920.290,1920.3,50000,50000,0.4.5,standard,"Trigger 60/51, holdoff 1ms",10,xenon100,/tmp/data/good/xe100_110223_1107.root
517,xe100_110303_1040,Cs137,red,S1,4.4,16,closed,1390.710,1390.7,50000,50000,0.4.5,standard,"Trigger 60/51, holdoff 1ms",10,xenon100,/tmp/data/good/xe100_110303_1040.root
547,xe100_110315_1246,Cs137,red,S1,4.4,16,closed,1599.910,1599.9,50000,50000,0.4.5,standard,"Trigger 60/51, holdoff 1ms",10,xenon100,/tmp/data/good/xe100_110315_1246.root
555,xe100_110317_1132,Cs137,red,S1,4.4,16,closed,1524.240,817.2,50000,26410,0.4.5,standard,"Trigger 60/51, holdoff 1ms",10,xenon100,/tmp/data/good/xe100_110317_1132.root
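
The query string is ordinary pandas `DataFrame.query` syntax, so it can be tried without hax. The following is a minimal sketch in which a toy DataFrame stands in for `hax.runs.DATASETS`; the run names and rows are made up for illustration.

```python
import pandas as pd

# Toy stand-in for hax.runs.DATASETS; these rows are illustrative only.
toy = pd.DataFrame({
    'name': ['run_a', 'run_b', 'run_c'],
    'category': ['standard', 'standard', 'test'],
    'location': ['/tmp/data/good/run_a.root', '', '/tmp/data/good/run_c.root'],
})

# Same query string as in the cell above: the ROOT file location is known
# and the dataset is marked 'standard'.
query = 'location != "" & category == "standard"'
selected = toy.query(query)

print(selected['name'].tolist())
```

Only `run_a` passes both conditions: `run_b` has no known location and `run_c` is not marked 'standard'.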


## What data is there

Here you can see which sources the processed datasets cover.

In [12]:
datasets['source'].unique()

array(['Cs137', 'AmBe', 'Co60'], dtype=object)
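
Beyond listing the sources, it can be useful to know how many datasets each one has. A minimal sketch using pandas `value_counts`, with a toy Series standing in for `datasets['source']` (the values are illustrative, not the real bookkeeping):

```python
import pandas as pd

# Toy stand-in for datasets['source']; the entries are made up.
sources = pd.Series(
    ['Cs137', 'AmBe', 'AmBe', 'Cs137', 'Co60', 'Cs137'], name='source'
)

# Number of datasets per source, most frequent first.
counts = sources.value_counts()
print(counts)
```

With the real table, the same call would be `datasets['source'].value_counts()`.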

## AmBe

AmBe datasets are very important for analysis.  Let's make a run list:

In [15]:
datasets[datasets['source'] == 'AmBe']

Unnamed: 0,name,source,position,trigger,anode,cathode,shield,livetime,corrected_livetime,events,corrected_events,processed,category,comment,run,tpc,location
473,xe100_110210_1100,AmBe,other,S1,4.4,16,closed,202.16,202.2,5491,5491,0.4.5,standard,"Trg 60/51, holdoff 1ms, HE veto (2x10dB/4/100/40)",10,xenon100,/tmp/data/good/xe100_110210_1100.root
475,xe100_110210_1412,AmBe,other,S1,4.4,16,closed,2025.08,1172.0,47227,27961,0.4.5,standard,"Trg 60/51, holdoff 1ms, HE veto (2x10dB/4/100/40)",10,xenon100,/tmp/data/good/xe100_110210_1412.root
483,xe100_110211_0532,AmBe,other,S1,4.4,16,closed,2274.46,2274.5,51229,51229,0.4.5,standard,"Trg 60/51, holdoff 1ms, HE veto (2x10dB/4/100/40)",10,xenon100,/tmp/data/good/xe100_110211_0532.root
1614,xe100_120404_0804,AmBe,other,S1,4.4,16,closed,843.912,843.9,14032,17000,0.4.5,standard,,10,xenon100,/tmp/data/good/xe100_120404_0804.root
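
To turn this selection into an actual run list and get a feel for the available statistics, one could extract the names and sum a few columns. The sketch below rebuilds the four AmBe rows by hand from the table above so that it runs without hax; with the real DataFrame, `ambe` would simply be `datasets[datasets['source'] == 'AmBe']`. The units of the livetime columns are not stated in this notebook.

```python
import pandas as pd

# The four AmBe rows from the table above, copied by hand.
ambe = pd.DataFrame({
    'name': ['xe100_110210_1100', 'xe100_110210_1412',
             'xe100_110211_0532', 'xe100_120404_0804'],
    'livetime': [202.16, 2025.08, 2274.46, 843.912],
    'corrected_livetime': [202.2, 1172.0, 2274.5, 843.9],
    'events': [5491, 47227, 51229, 14032],
})

# Plain Python list of run names
run_list = ambe['name'].tolist()

# Totals over all AmBe runs
total_livetime = ambe['livetime'].sum()
total_corrected = ambe['corrected_livetime'].sum()
total_events = int(ambe['events'].sum())

print(run_list)
print(total_livetime, total_corrected, total_events)
```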


## Co60

Let's make a list of the datasets.

In [19]:
datasets_co60 = datasets[datasets['source'] == 'Co60']['name'].values
datasets_co60

array(['xe100_110323_1114'], dtype=object)

A `numpy` array is not always convenient to work with, so let's convert it to a plain Python list instead.

In [21]:
list(datasets_co60)

['xe100_110323_1114']
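
An alternative to `list(...)` is the array's own `tolist()` method. For an object array of strings the result is the same, but for numeric arrays `tolist()` also converts the elements to built-in Python types, whereas `list(...)` keeps them as numpy scalars. A small sketch, with the arrays built by hand so it runs without hax:

```python
import numpy as np

# Same content as datasets_co60 above, built by hand.
names = np.array(['xe100_110323_1114'], dtype=object)
as_list = names.tolist()
print(as_list)

# For numeric arrays the two conversions differ in element type.
numbers = np.array([1, 2, 3])
via_list = list(numbers)       # elements stay numpy integers
via_tolist = numbers.tolist()  # elements become built-in ints
print(type(via_list[0]), type(via_tolist[0]))
```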