# dunedata.ipynb
This notebook uses dunerun to search for some DUNE data. 

## Introduction

First connect to a Jupyter machine where DUNE cvmfs is available, e.g. <https://analytics-hub.fnal.gov>, open a terminal and follow the [instructions](https://github.com/dladams/dunerun#readme) to install *dunerun*.
Here we assume the package is installed at ~/proc/install/dev01/dunerun but any location is fine.  

Copy this notebook, dunedata.ipynb, from the installation area to a directory of your choosing and open the copy in Jupyter.
Since you are reading this, you may have already done so.  

Run the *dunerun* setup (change the location to match your installation) and import dunerun and other packages of interest:

In [1]:
!echo $(hostname): $(date)
%run ~/proc/install/dev01/dunerun/python/setup.py
import sys
import os
import dunerun
print(f"dunerun version is {dunerun.version()}")
dsw = None

help(dunerun.DuneRun)

jupyter-dladams: Wed Mar 23 14:15:30 UTC 2022
dunerun version is 1.2.0
Help on class DuneRun in module dunerun.src.dunerun:

class DuneRun(builtins.object)
 |  DuneRun(senv='', sopt='', shell=False, dbg=0, lev=2, precoms=[])
 |  
 |  Methods defined here:
 |  
 |  __del__(self)
 |  
 |  __init__(self, senv='', sopt='', shell=False, dbg=0, lev=2, precoms=[])
 |      Ctor for class that runs dune commands.
 |      senv = '' - Run in bash (no dune set up).
 |             'dune' - Set up dune env but no products.
 |             'dunesw' - Set up dunesw.
 |      sopt = string passed to setup, e.g. 'e20:prof' for dunesw.
 |      precoms = Commands run before environment setup.
 |      shell = If true, all calls to run use the same shell.
 |      lev = Output level for the executed commands:
 |            0 - All output is discarded
 |            1 - stderr only
 |            2 - stdout and stderr [default]
 |      dbg = Output level (in addition to stdout, stderr) for run commands:
 |       
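
The *shell* option in the help text above controls whether all calls to `run` share one shell. The sketch below illustrates why that matters, using only the Python standard library. It is not the dunerun implementation: `run_fresh` and `PersistentShell` are hypothetical names, and the sketch assumes a POSIX `sh` is on the path.

```python
import subprocess


def run_fresh(command):
    """Run a command in a brand-new shell; nothing persists between calls."""
    result = subprocess.run(["sh", "-c", command], capture_output=True, text=True)
    return result.stdout.strip()


class PersistentShell:
    """Keep one shell process alive so environment changes persist.

    Hypothetical helper for illustration only. It assumes each command
    ends its output with a newline, so the sentinel lands on its own line.
    """

    SENTINEL = "__END_OF_COMMAND__"

    def __init__(self):
        self.proc = subprocess.Popen(
            ["sh"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )

    def run(self, command):
        # Send the command followed by a marker that signals completion.
        self.proc.stdin.write(f"{command}\necho {self.SENTINEL}\n")
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            if not line or line.strip() == self.SENTINEL:
                break
            lines.append(line.rstrip("\n"))
        return "\n".join(lines)

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()


# Separate shells: the variable set in the first call is gone in the second.
run_fresh("export DUNERUN_DEMO_VAR=hello")
print(repr(run_fresh('echo "$DUNERUN_DEMO_VAR"')))

# One shared shell: the variable is still there.
shared = PersistentShell()
shared.run("export DUNERUN_DEMO_VAR=hello")
print(repr(shared.run('echo "$DUNERUN_DEMO_VAR"')))
shared.close()
```

With a shared shell, an expensive environment setup only has to be done once, which is presumably why this notebook passes `shell=True` when it sets up a release.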

## List releases

Uncomment the following if you want to see a list of available versions of dunesw.

In [2]:
#dunerun.DuneRun('dune').run('ups list -aK+ dunesw')

## VOMS proxy

Access to DUNE data requires a VOMS proxy, which can be obtained with Kerberos credentials. The following lines run system commands to check if we have such credentials and such a proxy. A benefit of running this notebook is that we are left with a VOMS proxy.

The following block checks for a valid proxy and stops if none is found. If the output level (*lev*) is increased, you may see a 'Setup AC' warning. It is annoying but harmless: it arises because kx509 is taken from cvmfs, as it is not installed on the FNAL analysis servers.

In [3]:
proxy = dunerun.DuneProxy(tmin=60)    # Proxy manager
if not proxy.check_proxy():
    # Stop here: the cells below need a valid proxy to reach the data.
    raise RuntimeError('>>>>> VOMS proxy not found.')
print(f"Proxy time remaining is {proxy.time()} sec")

Proxy time remaining is 41746 sec
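
A remaining time in seconds is hard to read at a glance. The following minimal sketch converts it to hours, minutes and seconds. It assumes the value is a plain number of seconds, as the output above suggests; `format_duration` is a hypothetical helper, not part of dunerun.

```python
def format_duration(seconds):
    """Format a number of seconds as H:MM:SS."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Value copied from the output above.
print(f"Proxy time remaining is {format_duration(41746)}")  # 11:35:46
```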


## Set up a release

Use *DuneRun* to start a process using a DUNE release.

In [4]:
if dsw is None:
    dsw = dunerun.DuneRun('dunesw', 'v09_46_00_00:e20:prof', shell=True)
    dsw.run('duneHelp')

Setting up dunesw v09_46_00_00 e20:prof


Welcome to dunetpc 
Some available commands:
              duneHelp - Display information about the current setup of dunetpc.
                   lar - Run the art/larsoft event looop e.g. to process event data.
  product_sizes_dumper - Display the products and size in an event data file.
               fcldump - Display the resolved configuration for a fcl file.
               liblist - List available plugin libraries.
        pdChannelRange - Display protoDUNE channel grops and ranges.
           duneRunData - Display run data for a run.
           duneTestFcl - Test some high-level fcl configurations.
Use option "-h" with any of these for more information.
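
The setup message above shows the release string split into a version and a set of qualifiers. The sketch below reproduces that split so the two parts can be handled separately. The split at the first colon is inferred from the printed message, not from the dunerun source, and `split_release` is a hypothetical name.

```python
def split_release(spec):
    """Split 'version:qual1:qual2' into (version, 'qual1:qual2')."""
    version, _, qualifiers = spec.partition(":")
    return version, qualifiers


version, qualifiers = split_release("v09_46_00_00:e20:prof")
print(f"dunesw {version} {qualifiers}")  # dunesw v09_46_00_00 e20:prof
```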


## Finding data

The samweb command can be used to find DUNE data. If enabled (change `if False` to `if True`), the first block shows some help messages.

The next blocks show how to list all the raw data files for a ProtoDUNE run, find the URL(s) for one of those files and then open the file with ROOT.

In [5]:
linesep = '-------------------------------------------------------------'
if False:
    dsw.run('samweb --help-commands')
    print(linesep)
    dsw.run('samweb list-files -h')
    print(linesep)
    dsw.run('samweb list-files --help-dimensions')
    print(linesep)

In [6]:
run = 5240
query = f"data_tier raw and DUNE_data.is_fake_data 0 and run_number {run}"
if True:
    print(f">>>>> Raw data files for run {run}")
    print(f">>>>> Query: {query}")
    dsw.run(f'samweb list-files "{query}"')

>>>>> Raw data files for run 5240
>>>>> Query: data_tier raw and DUNE_data.is_fake_data 0 and run_number 5240
np04_raw_run005240_0003_dl5.root
np04_raw_run005240_0004_dl7.root
np04_raw_run005240_0004_dl3.root
np04_raw_run005240_0003_dl11.root
np04_raw_run005240_0004_dl12.root
np04_raw_run005240_0004_dl10.root
np04_raw_run005240_0003_dl6.root
np04_raw_run005240_0001_dl7.root
np04_raw_run005240_0001_dl12.root
np04_raw_run005240_0003_dl7.root
np04_raw_run005240_0003_dl4.root
np04_raw_run005240_0003_dl12.root
np04_raw_run005240_0003_dl3.root
np04_raw_run005240_0003_dl1.root
np04_raw_run005240_0003_dl9.root
np04_raw_run005240_0003_dl10.root
np04_raw_run005240_0003_dl2.root
np04_raw_run005240_0003_dl8.root
np04_raw_run005240_0002_dl1.root
np04_raw_run005240_0002_dl4.root
np04_raw_run005240_0002_dl12.root
np04_raw_run005240_0002_dl3.root
np04_raw_run005240_0002_dl2.root
np04_raw_run005240_0002_dl11.root
np04_raw_run005240_0002_dl6.root
np04_raw_run005240_0002_dl9.root
np04_raw_run005240_0002_
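
The listing above is unordered, and a plain string sort would place `dl10` before `dl2`. The sketch below parses the numeric fields out of each name and sorts on them. It works on a few names pasted from the output rather than on a return value of `dsw.run`, since whether `run` returns its output is not shown here. Reading the fields as run number, file sequence number and `dl` index is an assumption based on the name pattern alone.

```python
import re

# Pattern inferred from the names listed above; field meanings are assumed.
FILE_PATTERN = re.compile(r"np04_raw_run(\d+)_(\d+)_dl(\d+)\.root$")


def parse_name(name):
    """Return (run, sequence, dl) as integers, or None if the name does not match."""
    match = FILE_PATTERN.match(name.strip())
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


names = [
    "np04_raw_run005240_0003_dl5.root",
    "np04_raw_run005240_0004_dl7.root",
    "np04_raw_run005240_0003_dl11.root",
    "np04_raw_run005240_0001_dl12.root",
    "np04_raw_run005240_0002_",  # truncated line: does not match and is skipped
]

complete = [name for name in names if parse_name(name) is not None]
for name in sorted(complete, key=parse_name):
    print(name)
```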

In [7]:
fnam = 'np04_raw_run005240_0001_dl1.root'
dsw.run(f"samweb get-file-access-url {fnam} --schema=root")

root://castorpublic.cern.ch//castor/cern.ch/neutplatform/protodune/rawdata/np04/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root
root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune/np04/beam/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root
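
Two URLs are returned, one at CERN and one at FNAL. The sketch below picks the one at a preferred site instead of copying it by hand. It uses only the standard library, and `pick_url` is a hypothetical helper.

```python
import posixpath
from urllib.parse import urlparse

# URLs copied from the output above.
urls = [
    "root://castorpublic.cern.ch//castor/cern.ch/neutplatform/protodune/rawdata/np04/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root",
    "root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune/np04/beam/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root",
]


def pick_url(candidates, domain):
    """Return the first URL whose host is in the given domain, else None."""
    for url in candidates:
        host = urlparse(url).hostname or ""
        if host == domain or host.endswith("." + domain):
            return url
    return None


furl = pick_url(urls, "fnal.gov")
parts = urlparse(furl)
print(parts.hostname, parts.port, posixpath.basename(parts.path))
```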


## Reading data

Finally we read the file to check that we really have access. First open the file in ROOT:

In [8]:
furl = 'root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune/np04/beam/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root'
print('>>>>> Opening file with Root.')
dsw.run(f"root.exe -q {furl}")

>>>>> Opening file with Root.
   ------------------------------------------------------------------
  | Welcome to ROOT 6.22/08                        https://root.cern |
  | (c) 1995-2020, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Mar 10 2021, 14:20:04                 |
  | From tags/v6-22-08@v6-22-08                                      |
  | Try '.help', '.demo', '.license', '.credits', '.quit'/'.q'       |
   ------------------------------------------------------------------


Attaching file root://fndca1.fnal.gov/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune/np04/beam/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root as _file0...
(TFile *) 0x49208a0


Then run the art utility that counts the events in the file:

In [9]:
print('>>>>> Counting events.')
dsw.run(f"count_events --hr {furl}")

>>>>> Counting events.
root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune/np04/beam/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root	1 run, 1 subrun, 107 events, and 0 results.
Counted events successfully for 1 specified files.
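
To use the counts in later cells, the summary line can be parsed into integers. The sketch below does this with a regular expression on a line pasted from the output above. Only the wording in that one line has been seen, so the optional plural "s" in the pattern is an assumption, and `parse_counts` is a hypothetical helper.

```python
import re

# Matches e.g. "1 run, 1 subrun, 107 events, and 0 results."
SUMMARY = re.compile(r"(\d+) runs?, (\d+) subruns?, (\d+) events?, and (\d+) results?")


def parse_counts(line):
    """Return a dict with the run, subrun, event and result counts."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"No count summary found in: {line!r}")
    keys = ("runs", "subruns", "events", "results")
    return dict(zip(keys, (int(group) for group in match.groups())))


furl = "root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune/np04/beam/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root"
line = furl + "\t1 run, 1 subrun, 107 events, and 0 results."

counts = parse_counts(line)
print(counts["events"])  # 107
```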


And look at the sizes of the file's data products:

In [10]:
print('>>>>> Dump data sizes.')
dsw.run(f"product_sizes_dumper -f 0 {furl}")

>>>>> Dump data sizes.

Size on disk for the file: root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune/np04/beam/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root
Total size on disk: 8243603809

     Size in bytes   Fraction TTree/TKey Name
        8216169210      0.997 Events
          26992712      0.003 RootFileDB
            185952      0.000 EventMetaData
             12825      0.000 MetaData
              9163      0.000 EventHistory
              7977      0.000 Runs
              5586      0.000 FileIndex
              1687      0.000 Parentage
              1142      0.000 RunMetaData
              1124      0.000 ResultsMetaData
              1124      0.000 SubRuns
              1116      0.000 SubRunMetaData
              1102      0.000 ResultsTree
------------------------------
        8243390720      1.000 Total

Details for each TTree that occupies more than the fraction 0 of the size on disk.


Details for branch: Events
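
The size table above can be cross-checked and combined with the event count from the previous cell. The sketch below parses the table from pasted text, confirms that the entries add up to the printed total, and estimates the mean size per event. `parse_sizes` is a hypothetical helper, and MB here means 10^6 bytes.

```python
# Table copied from the output above.
table = """\
     Size in bytes   Fraction TTree/TKey Name
        8216169210      0.997 Events
          26992712      0.003 RootFileDB
            185952      0.000 EventMetaData
             12825      0.000 MetaData
              9163      0.000 EventHistory
              7977      0.000 Runs
              5586      0.000 FileIndex
              1687      0.000 Parentage
              1142      0.000 RunMetaData
              1124      0.000 ResultsMetaData
              1124      0.000 SubRuns
              1116      0.000 SubRunMetaData
              1102      0.000 ResultsTree
------------------------------
        8243390720      1.000 Total
"""


def parse_sizes(text):
    """Map each name in the table to its size in bytes."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly three fields and start with an integer.
        if len(fields) == 3 and fields[0].isdigit():
            sizes[fields[2]] = int(fields[0])
    return sizes


sizes = parse_sizes(table)
total = sizes.pop("Total")
print("Entries sum to total:", sum(sizes.values()) == total)

n_events = 107  # from the count_events output earlier in this notebook
print(f"{sizes['Events'] / n_events / 1e6:.1f} MB per event")  # 76.8 MB per event
```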

    

In [11]:
!echo $(hostname): $(date)

jupyter-dladams: Wed Mar 23 14:15:44 UTC 2022
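
The first and last cells print a timestamp, so the run time of the notebook can be read off by comparing them. The sketch below does the subtraction on the two lines shown in this notebook. It assumes the `hostname: date` layout printed above, a UTC clock and English day and month names; `parse_stamp` is a hypothetical helper.

```python
from datetime import datetime

# Layout of the `date` output shown in this notebook.
STAMP_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def parse_stamp(line):
    """Parse a 'hostname: date' line into a datetime."""
    _, _, stamp = line.partition(": ")
    return datetime.strptime(stamp.strip(), STAMP_FORMAT)


start = parse_stamp("jupyter-dladams: Wed Mar 23 14:15:30 UTC 2022")
end = parse_stamp("jupyter-dladams: Wed Mar 23 14:15:44 UTC 2022")
print(f"Elapsed: {(end - start).total_seconds():.0f} s")  # Elapsed: 14 s
```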
