# dunedata.ipynb
This notebook uses dunerun to search for some DUNE data. 

## Introduction

First connect to a Jupyter machine where DUNE cvmfs is available, e.g. <https://analytics-hub.fnal.gov>, open a terminal, and follow the [instructions](https://github.com/dladams/dunerun#readme) to install *dunerun*.
Here we assume the package is installed at ~/proc/install/common/dunerun but any location is fine.  

Copy this notebook, dunedata.ipynb, from the installation area to a directory of your choosing and open the copy in Jupyter.
Since you are reading this, you may have already done so.  

Run the *dunerun* setup (change the location to match your installation) and import dunerun and other packages of interest:

In [1]:
%run ~/proc/install/common/dunerun/python/setup.py
import sys
import os
import dunerun
print(f"dunerun version is {dunerun.version()}")
dsw = None
check_proxy = True                  # Set this False to skip the proxy check and generation
proxy = dunerun.DuneProxy(nopassword_kinit=False, tmin=60)    # Proxy manager
print(f"Proxy time remaining is {proxy.time()} sec")

dunerun version is 1.1.1
Proxy time remaining is 38521 sec


## List releases

Uncomment the following if you want to see a list of available versions of dunesw.

In [2]:
#dunerun.DuneRun('dune').run('ups list -aK+ dunesw')

## VOMS proxy

Access to DUNE data requires a VOMS proxy, which can be obtained with Kerberos credentials. The following lines run system commands to check if we have such credentials and such a proxy. A benefit of running this notebook is that we are left with a VOMS proxy.

The following block generates the credentials and proxy if they are needed. If the output level (*lev*) is increased, you may see a 'Setup AC' warning. It is annoying but harmless. It follows from taking kx509 from cvmfs because it is not installed on the FNAL analysis servers.

In [3]:
linesep = '-------------------------------------------------------------'

if check_proxy:
    if not proxy.is_valid():
        print('>>>>> VOMS proxy not found.')
        if not proxy.have_kerberos_credentials():
            print('>>>>> Kerberos credentials needed and not found.')
            if not proxy.nopassword_kinit:
                raise Exception('>>>>> Run kinit in a terminal and restart notebook.')
            print('>>>>> Credentials will be obtained.')
    else:
        print('>>>>> Proxy is valid.')
else:
    print('>>>>> Proxy is not checked.')

>>>>> Proxy is valid.


In [4]:
if check_proxy and not proxy.is_valid():
    print('>>>>> Obtaining proxy.')
    proxy.get_proxy(lev=0)
    if not proxy.is_valid():
        raise Exception(">>>>> ERROR: Proxy is not valid.")
    print('>>>>> Proxy has been obtained')
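
The decision made by the two cells above can be summarised as a small pure function. This is only a sketch of the logic as it reads in this notebook: the function name `proxy_action` and its return strings are hypothetical and are not part of *dunerun*. The four flags correspond to `check_proxy`, `proxy.is_valid()`, `proxy.have_kerberos_credentials()` and `proxy.nopassword_kinit`.

```python
def proxy_action(check_proxy, proxy_valid, have_kerberos, nopassword_kinit):
    """Return the next step implied by the proxy cells (hypothetical helper).

    The return values are illustrative labels, not dunerun API:
      "skip"      - the proxy check is disabled
      "none"      - the proxy is already valid, nothing to do
      "run-kinit" - the user must run kinit in a terminal first
      "get-proxy" - the notebook would go on to request a proxy
    """
    if not check_proxy:
        return "skip"
    if proxy_valid:
        return "none"
    # No proxy: without Kerberos credentials and without password-free
    # kinit, the notebook raises and asks the user to run kinit.
    if not have_kerberos and not nopassword_kinit:
        return "run-kinit"
    return "get-proxy"


# Example: no proxy, no credentials, password-free kinit disabled.
print(proxy_action(True, False, False, False))
```

Keeping the decision separate from the calls that touch the system makes it possible to check each branch without needing credentials.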

## Set up a release

Use *DuneRun* to start a process using a DUNE release.

In [5]:
if dsw is None:
    dsw = dunerun.DuneRun('dunesw', 'v09_45_00_00:e20:prof', shell=True)
    dsw.run('duneHelp')

TypeError: __init__() missing 2 required positional arguments: 'args' and 'returncode'

## Finding data

The samweb command can be used to find DUNE data. If enabled (change `if False` to `if True`), the first block shows some help messages.

The next blocks show how to list all the raw data files for a ProtoDUNE run, find the URL(s) for one of those files and then open the file with ROOT.

In [None]:
if False:
    dsw.run('samweb --help-commands')
    print(linesep)
    dsw.run('samweb list-files -h')
    print(linesep)
    dsw.run('samweb list-files --help-dimensions')
    print(linesep)

In [None]:
run = 5240
query = f"data_tier raw and DUNE_data.is_fake_data 0 and run_number {run}"
if True:
    print(f">>>>> Raw data files for run {run}")
    print(f">>>>> Query: {query}")
    dsw.run(f'samweb list-files "{query}"')
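
If the same query is needed for several runs, it may help to build the query and the command line in one place. The sketch below reuses the query text from the cell above; the helper names `raw_data_query` and `samweb_list_command` are hypothetical and only produce strings, so they can be tried without a DUNE release.

```python
def raw_data_query(run):
    """Build the raw-data query used above for one run (hypothetical helper)."""
    run = int(run)  # reject anything that is not a run number
    return f"data_tier raw and DUNE_data.is_fake_data 0 and run_number {run}"


def samweb_list_command(query):
    """Wrap a query in the samweb list-files command line used above."""
    if '"' in query:
        # The query is passed inside double quotes, so it must not contain any.
        raise ValueError("query must not contain double quotes")
    return f'samweb list-files "{query}"'


for run in (5240, 5241):  # 5241 is only an illustrative second run number
    print(samweb_list_command(raw_data_query(run)))
```

Each resulting string could then be passed to `dsw.run(...)` in the same way as in the cell above.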

In [None]:
fnam = 'np04_raw_run005240_0001_dl1.root'
dsw.run(f"samweb get-file-access-url {fnam} --schema=root")
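
Before using a returned URL, it can be worth checking that it points at the expected server and file. The following sketch takes the URL that this notebook uses in the "Reading data" section and splits it with the standard library. The helper name `describe_url` is hypothetical, and reading the digits after `run` in the file name as the run number is an assumption based on this one example.

```python
import posixpath
import re
from urllib.parse import urlsplit


def describe_url(furl):
    """Split a file URL into host, port, file name and run number (sketch)."""
    parts = urlsplit(furl)
    fname = posixpath.basename(parts.path)
    # Assumption: the file name carries the run number as "_run<digits>_".
    match = re.search(r"_run(\d+)_", fname)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "file": fname,
        "run": int(match.group(1)) if match else None,
    }


# URL copied from the "Reading data" section of this notebook.
furl = ('root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/'
        'dunepro/protodune/np04/beam/detector/None/raw/06/68/39/48/'
        'np04_raw_run005240_0001_dl1.root')
info = describe_url(furl)
print(info)
```

Comparing `info["file"]` with `fnam` and `info["run"]` with `run` gives a quick consistency check before opening the file.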

## Reading data

Finally we read the file to check that we really have access. First open the file in ROOT:

In [None]:
furl = 'root://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune/np04/beam/detector/None/raw/06/68/39/48/np04_raw_run005240_0001_dl1.root'
print('>>>>> Opening file with Root.')
dsw.run(f"root.exe -q {furl}")

Then run the art utility that counts the events in the file:

In [None]:
print('>>>>> Counting events.')
dsw.run(f"count_events --hr {furl}")

And look at the sizes of the file's data products:

In [None]:
print('>>>>> Dump data sizes.')
dsw.run(f"product_sizes_dumper -f 0 {furl}")