# GRID_LRT Testbed Notebook

## 1. Setting up the Environment

The GRID LOFAR TOOLS have several infrastructure requirements. They are as follows:

1. ASTRON LOFAR staging credentials
2. PiCaS database access
3. Valid GRID proxy


Here, we'll test that all of the above are enabled and work:

In [8]:
import os
import GRID_LRT
print(GRID_LRT.__file__)
import subprocess
from GRID_LRT.get_picas_credentials import picas_cred
from GRID_LRT.Staging import stage_all_LTA
from GRID_LRT.Staging import state_all
from GRID_LRT.Staging import stager_access
from GRID_LRT.Staging.srmlist import srmlist
from GRID_LRT import Token
pc=picas_cred()

/home/apmechev/software/lib/python2.6/site-packages/GRID_LRT-0.2-py2.6.egg/GRID_LRT/__init__.pyc


This should confirm that your LOFAR ASTRON credentials were read properly:

`2017-12-04 17:15:29.097902 stager_access: Parsing user credentials from /home/apmechev/.awe/Environment.cfg
2017-12-04 17:15:29.097973 stager_access: Creating proxy`

Next, we check that your PiCaS user and database are set properly. You can also verify your password via `pc.password`.

In [4]:
print(pc.user)
print(pc.database)

apmechev
sksp_unittest
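
The third requirement, a valid GRID proxy, is not exercised by the cells above. A minimal sketch of such a check follows. It assumes that the `voms-proxy-info` client is installed and that `voms-proxy-info --timeleft` prints the remaining lifetime in seconds; `proxy_seconds_left` is a hypothetical helper, not part of GRID_LRT.

```python
import subprocess
import sys


def proxy_seconds_left(cmd=("voms-proxy-info", "--timeleft")):
    """Return the remaining proxy lifetime in seconds, or None if unknown.

    Assumption: `cmd` prints a single integer (seconds left) on stdout.
    """
    try:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
        out, _ = proc.communicate()
    except OSError:
        # The command is not installed on this machine.
        return None
    if proc.returncode != 0:
        # The client reported a problem, e.g. no proxy file was found.
        return None
    try:
        return int(out.decode().strip())
    except ValueError:
        # The output was not a plain number of seconds.
        return None


seconds = proxy_seconds_left()
if not seconds:
    # Covers both "could not be determined" (None) and "expired" (0).
    sys.stderr.write("No valid GRID proxy found\n")
else:
    print("GRID proxy valid for another %d seconds" % seconds)
```

The helper returns `None` rather than raising, so the same cell can run on a machine without the grid client tools.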


Next, we'll use the test srm.txt to show off our staging chops. Staging the file returns a StageID that you can use later to query the request:

In [5]:
test_srm_file='/home/apmechev/t/GRID_LRT/GRID_LRT/tests/srm_50_sara.txt'

assert os.path.exists(test_srm_file)  # fail early if the test file is missing
with open(test_srm_file,'r') as f:
    file_contents = f.read()
    print(file_contents.split()[0:3])
stageID=stage_all_LTA.main(test_srm_file) # NOTE! You'll get two emails every time you do this!
print(stageID)

['srm://srm.grid.sara.nl:8443/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB100_uv.dppp.MS_3d78b8f1.tar', 'srm://srm.grid.sara.nl:8443/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB101_uv.dppp.MS_acbb43a6.tar', 'srm://srm.grid.sara.nl:8443/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB102_uv.dppp.MS_69304702.tar']
files are on SARA
Setting up 51 srms to stage
staged with stageID  18534
18534


You can now re-run the cell below to check the current status of your staging request:

In [6]:
print(stage_all_LTA.get_stage_status(stageID)) #crashes (py2.7?)
#The code below can also show you a more detailed status
statuses=stager_access.get_progress()

print(statuses)

new
{'18530': {'Status': 'in progress', 'File count': '1', 'User id': '2331', 'Location': 'fz-juelich', 'Files done': '0', 'Flagged abort': 'false', 'Percent done': '0'}, '18534': {'Status': 'new', 'File count': '51', 'User id': '2331', 'Location': 'sara', 'Files done': '0', 'Flagged abort': 'false', 'Percent done': '0'}, '18529': {'Status': 'in progress', 'File count': '1', 'User id': '2331', 'Location': 'fz-juelich', 'Files done': '0', 'Flagged abort': 'false', 'Percent done': '0'}}
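
The raw dictionary is hard to read at a glance. The sketch below condenses one request into a single line. It assumes only the shape of the output printed above (string stage IDs as keys, string fields as values); `summarize_stage` is a hypothetical helper, not a GRID_LRT function.

```python
def summarize_stage(statuses, stage_id):
    """Return a one-line summary for one staging request.

    `statuses` is assumed to look like the dictionary printed above:
    keyed by stage ID (as a string), with dicts of string fields as values.
    """
    entry = statuses.get(str(stage_id))
    if entry is None:
        return "Stage ID %s is not in the staging database" % stage_id
    return "Stage ID %s at %s: %s, %s/%s files done (%s%%)" % (
        stage_id, entry["Location"], entry["Status"],
        entry["Files done"], entry["File count"], entry["Percent done"])


# Sample entry copied from the output above.
sample = {
    '18534': {'Status': 'new', 'File count': '51', 'User id': '2331',
              'Location': 'sara', 'Files done': '0',
              'Flagged abort': 'false', 'Percent done': '0'},
}
print(summarize_stage(sample, 18534))
print(summarize_stage(sample, 99999))
```

With a live connection you would pass the `statuses` returned by `stager_access.get_progress()` instead of `sample`.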


In [None]:
statuses=stage_all_LTA.get_stage_status(stageID)
## When the staging completes, your stageID magically disappears from the database
# Neat, huh?
if not statuses:
    print("Staging status no longer in LTA Database") #This happens because of bad programming
else:
    print("Staging request "+str(stageID)+" has status: "+str(statuses))

You can also check the status of the srms in two different ways (with srmls and with gfal).

In [7]:
print(state_all.__file__)
staged_status = state_all.main(test_srm_file) #Only works for Sara and Poznan files!

#You can also suppress the printing of statuses
staged_status1 = state_all.main(test_srm_file, printout=False)




/home/apmechev/software/lib/python2.6/site-packages/GRID_LRT-0.2-py2.6.egg/GRID_LRT/Staging/state_all.py
files are on SARA
0/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB100_uv.dppp.MS_3d78b8f1.tar ONLINE_AND_NEARLINE
1/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB101_uv.dppp.MS_acbb43a6.tar ONLINE_AND_NEARLINE
2/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB102_uv.dppp.MS_69304702.tar ONLINE_AND_NEARLINE
3/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB103_uv.dppp.MS_a47a629e.tar ONLINE_AND_NEARLINE
4/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB104_uv.dppp.MS_2da7d035.tar ONLINE_AND_NEARLINE
5/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB105_uv.dppp.MS_7ad084dd.tar ONLINE_AND_NEARLINE
6/pnfs/grid.sara.nl/data/lofar/ops/projects/lc2_038/229507/L229507_SB106_uv.dppp.MS_9c68322c.tar ONLINE_AND_NEARLINE

## 2. Srmlist()

A dedicated class exists to handle lists of srm files. This class is a child of the Python 'list' class and thus has all the capabilities of a list, with some bells and whistles.

It contains as properties the OBSID and LTA location of the files. 

Additionally, it can create generators that convert the srm:// links to gsiftp:// links, as well as staging links (ones that can be fed into the state_all.py script).

In [9]:
s_list=srmlist() #Empty list of srms
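
To make the idea concrete, here is a toy stand-in that mimics the behaviour described above: a `list` subclass with an OBSID property and a generator of gsiftp:// links. It is a sketch only and not the GRID_LRT implementation. The class name `ToySrmList`, the regular expression and the SARA GridFTP prefix are all assumptions made for illustration.

```python
import re


class ToySrmList(list):
    """Illustrative stand-in for srmlist; NOT the GRID_LRT implementation."""

    # Assumption: SARA's SRM endpoint maps onto this GridFTP endpoint.
    SRM_PREFIX = "srm://srm.grid.sara.nl:8443"
    GSIFTP_PREFIX = "gsiftp://gridftp.grid.sara.nl:2811"

    @property
    def obsid(self):
        """Observation ID (e.g. 'L229507') parsed from the first link, or None."""
        if not self:
            return None
        match = re.search(r"/(L\d+)_", self[0])
        return match.group(1) if match else None

    def gsiftp_links(self):
        """Lazily yield each srm:// link rewritten as a gsiftp:// link."""
        for link in self:
            yield link.replace(self.SRM_PREFIX, self.GSIFTP_PREFIX, 1)


links = ToySrmList()
links.append("srm://srm.grid.sara.nl:8443/pnfs/grid.sara.nl/data/lofar/ops/"
             "projects/lc2_038/229507/L229507_SB100_uv.dppp.MS_3d78b8f1.tar")
print(links.obsid)
print(next(links.gsiftp_links()))
```

Because the class inherits from `list`, `append`, slicing and `len` all work unchanged; only the extra property and generator are new.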

## 3. Tokens! 
### 3a) The manual way

Next we'll interface with PiCaS and start making tokens for our Observation:

Here we need a string to link all the tokens in one Observation. We'll use the string 'jupyter_demo_'+username in the sksp_dev database.

In [None]:
uname = os.environ['USER']
th = Token.Token_Handler(t_type="jupyter_demo_"+uname, uname=pc.user, pwd=pc.password, dbn='sksp_dev')

#Create the overview_view (has the number of todo, done, error, running, [...] tokens)
th.add_overview_view()

#Add the status views (By default 'todo', 'locked', 'done', 'error')
th.add_status_views()

#Manually create a token:
manual_keys = {'manual_key':'manual_value','manual_int':1024}
man_token_1 = th.create_token(keys=manual_keys, append="manual") #will return the id of the manual token
print('manual_token_ID = ' + man_token_1)

We can also manually create a Token with an automatic attachment:

In [None]:
manual_keys = {'manual_key':'manual_value','manual_int':0}
man_token_2 = th.create_token(keys=manual_keys, 
                            append="manual_with_attach",
                            attach=[open(test_srm_file),'srm_at_token_create.txt']) 

##We can also attach files after the token's been created:
th.add_attachment(man_token_2, open(test_srm_file), 'srm_added_later.txt')

#Double check that both files were attached. Returns a list of filenames:
man_2_attachies = th.list_attachments(man_token_2)
print("The two attached files are: "+str(man_2_attachies))

# We can also of course download attachments:
saved_attach=th.get_attachment(man_token_2,man_2_attachies[0],savename=man_2_attachies[0])
print("")
print('The attachment '+str(man_2_attachies[0])+" was saved at "+saved_attach)

assert(os.path.exists(saved_attach))
os.remove(saved_attach)
assert(not os.path.exists(saved_attach))


We can also list the views and the tokens from each view:

In [None]:
print(th.views.keys()) #the views member of th is a dictionary of views 
locked_tokens = th.list_tokens_from_view('locked')

print(type(locked_tokens)) #It's not a list!!
print("There are "+str(len(locked_tokens))+" 'locked' tokens")


todo_tokens = th.list_tokens_from_view('todo') 
# It's not a list because it procedurally pings CouchDB, ~generator
#Use the help below to browse how it works!!
##help(todo_tokens)

print("There are "+str(len(todo_tokens))+" 'todo' tokens")
print("")
print("They are:")
for i in todo_tokens:
    print("CouchDB token keys: "+str(i.keys()),"Token ID: "+i.id)


You can set all tokens in a view to a Status, say 'locked'. This automatically locks the tokens!!

In [None]:

print('Lock status of the token: '+str(th.db[man_token_2]['lock'])+".")
print('Scrub count of the token: '+str(th.db[man_token_2]['scrub_count'])+".")
print("There are "+str(len(th.list_tokens_from_view('todo')))+" 'todo' tokens")
print("There are "+str(len(th.list_tokens_from_view('locked')))+" 'locked' tokens")
print("")
print("Setting status to locked for all todo tokens")
th.set_view_to_status(view_name='todo',status='locked') #Sets all todo tokens to "locked"

todo_tokens = th.list_tokens_from_view('todo') 
print("")

print("There are "+str(len(todo_tokens))+" 'todo' tokens")
### No more todo tokens!


locked_tokens = th.list_tokens_from_view('locked')
print("There are "+str(len(locked_tokens))+" 'locked' tokens")
##Now they're all locked!

print('Lock status of the token: '+str(th.db[man_token_2]['lock'])+".")
#You can reset all tokens from a view back to 'todo'. This increments the scrub_count field


resetted_tokens=th.reset_tokens('locked')
print("")
print("Resetting the locked tokens")
print('Scrub count of the token: '+str(th.db[man_token_2]['scrub_count'])+".")
print("There are "+str(len(th.list_tokens_from_view('todo')))+" 'todo' tokens")
print("There are "+str(len(th.list_tokens_from_view('locked')))+" 'locked' tokens")



Finally, you can create your own view. Views collect tokens that satisfy a certain boolean expression (where the token is referenced as 'doc').

For example: 

The todo view satisfies: `'doc.lock ==  0 && doc.done == 0 '` 

The locked view satisfies: `'doc.lock > 0 && doc.done == 0 '`

The done view satisfies: `'doc.status == "done" '`
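
To see how these conditions sort tokens, the sketch below restates them as Python predicates and applies them to three made-up token documents. It assumes that `lock` and `done` hold numbers (0 or a timestamp); the documents and the helper `views_for` are hypothetical and do not touch the database.

```python
# Python equivalents of the three view conditions quoted above.
VIEW_CONDITIONS = {
    "todo": lambda doc: doc["lock"] == 0 and doc["done"] == 0,
    "locked": lambda doc: doc["lock"] > 0 and doc["done"] == 0,
    "done": lambda doc: doc.get("status") == "done",
}


def views_for(doc):
    """Return the sorted names of the views whose condition `doc` satisfies."""
    return sorted(name for name, cond in VIEW_CONDITIONS.items() if cond(doc))


# Hypothetical token documents; the timestamps are made-up values.
fresh = {"lock": 0, "done": 0, "status": "todo"}
running = {"lock": 1512404129, "done": 0, "status": "locked"}
finished = {"lock": 1512404129, "done": 1512407729, "status": "done"}

for name, doc in [("fresh", fresh), ("running", running), ("finished", finished)]:
    print("%s -> %s" % (name, views_for(doc)))
```

Each sample document lands in exactly one view, which is why setting a status moves a token from one view to another.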

In [None]:
th.add_view(v_name="demo_view",cond='doc.manual_int == 0 ') #Only one of our tokens has manual_int==0
print(th.views['demo_view']) #new view is here!

assert(len(th.list_tokens_from_view('demo_view'))==1)
print("There is "+str(len(th.list_tokens_from_view('demo_view')))+" token in the demo_view")

#Creating 2 more tokens for this view. If append isn't changed, the id is the same, so
#new tokens won't be created! But you can imagine a loop will make creation easy right?
_ = th.create_token(keys=manual_keys, 
                            append="manual_with_attach_1",  
                            attach=[open(test_srm_file),'srm_at_token_create.txt']) 
_ = th.create_token(keys=manual_keys, 
                            append="manual_with_attach_2",
                            attach=[open(test_srm_file),'srm_at_token_create.txt'])
print("There are "+str(len(th.list_tokens_from_view('demo_view')))+" tokens in the demo_view")
assert(len(th.list_tokens_from_view('demo_view'))==3)
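
The loop hinted at in the comment above could look like the sketch below. It relies only on the `create_token(keys=..., append=...)` call used in this notebook. `StubHandler` is a hypothetical stand-in so the example runs without CouchDB, and its ID scheme is invented.

```python
class StubHandler(object):
    """Stand-in for Token.Token_Handler so the sketch runs without CouchDB."""

    def __init__(self):
        self.created = {}

    def create_token(self, keys, append):
        # Hypothetical ID scheme; the real handler builds its own IDs.
        token_id = "token_" + append
        self.created[token_id] = dict(keys)
        return token_id


def create_numbered_tokens(handler, keys, prefix, count):
    """Create `count` tokens whose `append` suffixes differ; return their IDs."""
    return [handler.create_token(keys=keys, append="%s_%d" % (prefix, i))
            for i in range(1, count + 1)]


handler = StubHandler()  # with a live database, pass `th` instead
ids = create_numbered_tokens(handler, {"manual_int": 0}, "manual_loop", 3)
print(ids)
```

Giving every token a distinct `append` suffix is what keeps the IDs unique, so each call creates a new token.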


Now we can delete all tokens in this view easily!

In [None]:
th.delete_tokens('demo_view')
assert(len(th.list_tokens_from_view('demo_view'))==0)
print("There are "+str(len(th.list_tokens_from_view('demo_view')))+" tokens in the demo_view")




On the login node, you shouldn't lock tokens; that is the responsibility of the launcher script. After the jobs finish, you can iterate over the 'error' view and reset the tokens if you wish. This makes re-running failed jobs easy: you just have to re-submit the jdl to the Workload Manager!
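
A minimal sketch of that clean-up step follows. It uses only the `list_tokens_from_view` and `reset_tokens` calls shown earlier and assumes that `reset_tokens` accepts the 'error' view name the same way it accepts 'locked'. `StubHandler` and `reset_failed_tokens` are hypothetical names; the stub exists only so the example runs without a database.

```python
class StubHandler(object):
    """Stand-in for Token.Token_Handler so the sketch runs without CouchDB."""

    def __init__(self, views):
        self.views = views  # view name -> list of token IDs

    def list_tokens_from_view(self, view_name):
        return self.views.get(view_name, [])

    def reset_tokens(self, view_name):
        # Move everything from the given view back to 'todo'.
        moved = self.views.pop(view_name, [])
        self.views.setdefault("todo", []).extend(moved)
        return moved


def reset_failed_tokens(handler, view="error"):
    """Reset every token in `view` back to 'todo'; return how many there were."""
    failed = len(handler.list_tokens_from_view(view))
    if failed:
        handler.reset_tokens(view)
    return failed


demo = StubHandler({"error": ["token_a", "token_b"], "todo": []})
print("Reset %d failed tokens" % reset_failed_tokens(demo))
```

With a live database you would call `reset_failed_tokens(th)` and then re-submit the jdl.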

### 3b) The automatic way!


When you need to create tokens in bulk, you can do so using a .yaml file and a python dictionary.

Now introducing Token Sets: Just an easy way to create tokens from a dictionary using a yaml file!

In [None]:
ts=Token.TokenSet(th=th) #You need a Token_Handler object to create TokenSets 
                         #(Token_Handler manages the authentication, views and token_type selection)