# Yoda-Python Interface: Danger Zone

The python-irodsclient application programming interface (API; from here on we will use the word 'interface') is very useful for working directly with Yoda data and metadata from within Python. The interface gives you a lot of power and helps you avoid unnecessary copies of your data. However, there are some risks: if used incorrectly, you can cause **harm to your project's data and metadata.**
In this notebook, we show some of these risks and their possible implications. Use it to learn, not to cause harm to your data 😉

The following piece of code can be used to import the tools to start an iRODSSession (an active connection to Yoda).

In [1]:
import getpass
import os
import json
from pathlib import Path

from irods.session import iRODSSession

## Connecting to iRODS

### Login to Yoda

To connect to Yoda via Python you will need to create an environment file. First create a hidden folder called `.irods` in your home directory:

In [2]:
env_dir = str(Path.home()) + '/.irods/'
if not os.path.exists(env_dir):
    os.mkdir(env_dir)
    print("Folder %s created!" % env_dir)
else:
    print("Folder %s already exists" % env_dir)

Folder /Users/staig001/.irods/ already exists


Now create the environment file. The contents of the file are specific to your faculty and can be found [here (Step 2. Configuring iCommands)](https://www.uu.nl/en/research/yoda/guide-to-yoda/i-am-using-yoda/using-icommands-for-large-datasets). You can create the file manually or adapt the cell below. Copy and paste the info relevant to your faculty into the file (or the code cell below) and set `irods_user_name` to your email address.

In [4]:
env_file = env_dir + 'irods_environment.json'

if os.path.exists(env_file):
    print("File %s already exists" % env_file)
else:
    dictionary = {}
    # REPLACE THE EMPTY DICTIONARY ABOVE WITH THE INFO FOR YOUR FACULTY, FOR EXAMPLE:
    #    {
    #    "irods_host": "science.data.uu.nl",
    #    "irods_port": 1247,
    #    "irods_home": "/nluu6p/home",
    #    "irods_user_name": "exampleuser@uu.nl",
    #    "irods_zone_name": "nluu6p",
    #    "irods_authentication_scheme": "pam",
    #    "irods_encryption_algorithm": "AES-256-CBC",
    #    "irods_encryption_key_size": 32,
    #    "irods_encryption_num_hash_rounds": 16,
    #    "irods_encryption_salt_size": 8,
    #    "irods_client_server_policy": "CS_NEG_REQUIRE",
    #    "irods_client_server_negotiation": "request_server_negotiation"
    #    }
    with open(env_file, 'w') as outfile:  
        json.dump(dictionary, outfile)

File /Users/staig001/.irods/irods_environment.json already exists
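
An empty or incomplete environment file tends to surface only later, as a confusing connection error. As a minimal sketch, the hypothetical helper `missing_env_keys` below checks a dictionary for a few of the keys shown in the template above. Which keys your faculty's configuration really requires is an assumption here, so adjust `REQUIRED_KEYS` as needed.

```python
# Keys taken from the template above; treat this selection as an assumption
REQUIRED_KEYS = ("irods_host", "irods_port", "irods_user_name", "irods_zone_name")

def missing_env_keys(env):
    """Return the required keys that are absent or empty in the environment dict."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]

# The empty default dictionary from the cell above is missing every required key
print(missing_env_keys({}))
```

After loading your own file with `json.load`, an empty list from `missing_env_keys` suggests that the basic connection settings are present.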


You will also need to create a [Data Access Password](https://www.uu.nl/en/research/yoda/using-data-access-passwords). Once you have a data access password, run the cell below and enter it in the pop-up window that asks for your password:

In [None]:
passwd = getpass.getpass("Enter your Yoda data access password: ")

<div class="alert alert-block alert-danger"><b>Warning:</b> This password is your secret! You can enter it safely in the pop-up window, but don't put the actual password in this notebook, especially when you plan to put the notebook on e.g. GitHub.</div>

Now run the cell below to start your iRODS session (active connection to Yoda)!

In [None]:
with open(os.path.expanduser("~/.irods/irods_environment.json"), "r") as f:
    ienv = json.load(f)
session = iRODSSession(**ienv, password=passwd)

## Changing Yoda metadata and archiving

Before you can proceed, you have to perform a few steps in the Yoda portal:

1) Create a [collection](TODO: link to glossary) with two files
2) Fill in the metadata form in the web portal. (TODO, maybe specify what to fill)
3) Submit this collection to the vault (TODO is this explained somewhere? Insert a link)

*TODO, add instructions for the above part so the user can follow and run the code below with minimal instructions (make sure the collPath name is in these instructions)*

Now let's use the Python interface to have a look at the collection that we just created:

In [None]:
homeCollPath = '/' + session.zone + '/home/research-test-christine'
collPath = "blabla"
coll = session.collections.get(homeCollPath +'/'+ collPath)
print(coll)

We can access the individual metadata items and have a look at them:

In [None]:
for item in coll.metadata.items():
    print(item.name, item.value, item.units)

We see that there is a lot of metadata. Among the items that are printed you can find your metadata that you added in the steps above.
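
Before changing or removing any metadata, it can help to keep a plain copy of what is there. The following is a minimal sketch, not part of Yoda or the Python interface: the hypothetical helper `snapshot_metadata` converts items that have `name`, `value` and `units` attributes (as printed in the loop above) into dictionaries that can be saved as JSON. The `FakeMeta` tuple is only a stand-in so that the example runs without a Yoda connection.

```python
import json
from collections import namedtuple

def snapshot_metadata(items):
    """Turn metadata items (objects with name, value, units) into plain dicts."""
    return [{"name": i.name, "value": i.value, "units": i.units} for i in items]

# Stand-in for the objects returned by coll.metadata.items(), for illustration only
FakeMeta = namedtuple("FakeMeta", ["name", "value", "units"])
example = [FakeMeta("Family_Name", "Staiger", "usr_6_s")]

backup = snapshot_metadata(example)
print(json.dumps(backup, indent=2))
```

With a live session you would pass `coll.metadata.items()` instead of `example` and write the result to a file, so that you can look up the exact name, value and units of an item after it has been removed.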

What happens if we change that metadata via the Python interface?
First we will remove the item **Family_Name Staiger usr_6_s**.

*TODO, give instructions above so that we know what the user has to fill below and give very clear instructions for that*

In [None]:
coll.metadata.remove("Family_Name", "Staiger", "usr_6_s")

In [None]:
for item in coll.metadata.items():
    print(item.name, item.value, item.units)

Now open the web interface and check what has happened.

*TODO, add instructions where exactly to check the metadata*

Now let's try to add it again via the web portal.

![alt text](Yoda_metadata_web.png "Title")

If we now try to add the metadata item again via the Python interface, Yoda will not allow it.

In [None]:
coll.metadata.add("Family_Name", "Staiger", "usr_6_s")

**Note that only the metadata that is rendered in the web interface will be archived!** The metadata in the Yoda database will be ignored in this step.

## Changing Yoda metadata for workflows
One of the metadata items in the list above is actually not only metadata but also a trigger for certain actions:
`org_action_log ["1663576969", "submitted for vault", "c.staiger@uu.nl#nluu12p"] None`.
This metadata indicates that the data has been submitted to the vault and that your data manager has received a message.

![alt text](submitted_vault.png "Title")

What happens if we change this metadata?

In [None]:
# Assumes the action log is the first metadata item; check the printed output before removing it
m_item = coll.metadata.items()[0]
print(m_item)
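
Picking the item by position only works if the action log happens to come first. As a hedged sketch, the hypothetical helpers below select items by name instead and split the log value into its parts. It assumes that the value is a JSON-encoded list of timestamp, action and actor, which is inferred from the single example shown above. `FakeMeta` is a stand-in so that the code runs without a Yoda connection.

```python
import json
from collections import namedtuple
from datetime import datetime, timezone

def find_by_name(items, name):
    """Return all metadata items whose name matches, regardless of position."""
    return [item for item in items if item.name == name]

def parse_action_log(value):
    """Split an org_action_log value into (timestamp, action, actor)."""
    epoch, action, actor = json.loads(value)
    return datetime.fromtimestamp(int(epoch), tz=timezone.utc), action, actor

# Stand-in for the objects returned by coll.metadata.items(), for illustration only
FakeMeta = namedtuple("FakeMeta", ["name", "value", "units"])
items = [
    FakeMeta("Family_Name", "Staiger", "usr_6_s"),
    FakeMeta("org_action_log",
             '["1663576969", "submitted for vault", "c.staiger@uu.nl#nluu12p"]',
             None),
]

log_items = find_by_name(items, "org_action_log")
when, action, actor = parse_action_log(log_items[0].value)
print(when.isoformat(), action, actor)
```

With a live session, `find_by_name(coll.metadata.items(), "org_action_log")` would return the matching items, which makes it harder to remove the wrong one by accident.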

In [None]:
coll.metadata.remove(m_item)

Since the collection is no longer marked as "submitted", the data manager no longer gets the drop-down menu to accept or reject the collection for archiving, and the procedure is interrupted:
![alt text](resubmit.png "Title")

The same is true for the metadata tag that triggers the replication of your data to another data centre.

## Access Control Lists
Yoda uses access control lists (ACLs) to manage who can do what with the data.

In [None]:
objPath = "/nluu12p/home/research-test-christine/books/AdventuresSherlockHolmes.txt"
obj = session.data_objects.get(objPath)

In [None]:
[vars(p) for p in session.permissions.get(obj)]

The data object is owned by your research group, `research-test-christine` (the third entry in the output). This is set up by the Yoda group manager module. Let's see if we can retract the `own` right and hide our data object from the group:

In [None]:
from irods.access import iRODSAccess
# 'null' revokes all access rights of the group on this data object
acl = iRODSAccess('null', obj.path, 'research-test-christine', session.zone)

In [None]:
session.permissions.set(acl)
[vars(p) for p in session.permissions.get(obj)]

*Yes*, we can mess up Yoda's group ACL structure.
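
To see exactly what an ACL change did, you can compare a snapshot taken before `session.permissions.set` with one taken afterwards. The following is a minimal sketch with a hypothetical helper `acl_changes`. It assumes that each snapshot entry is a dictionary with the keys `user_name`, `user_zone` and `access_name`, as produced by `vars(p)` in the cells above. The sample values are made up for illustration.

```python
def acl_changes(before, after):
    """Compare two ACL snapshots and report entries whose access level differs.

    Returns a dict mapping (user_name, user_zone) to (old_access, new_access);
    None means the entry is absent from that snapshot.
    """
    def index(snapshot):
        return {(a["user_name"], a["user_zone"]): a["access_name"] for a in snapshot}

    old, new = index(before), index(after)
    changes = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes

# Made-up snapshots, for illustration only
before = [
    {"user_name": "research-test-christine", "user_zone": "nluu12p", "access_name": "own"},
    {"user_name": "exampleuser@uu.nl", "user_zone": "nluu12p", "access_name": "own"},
]
after = [
    {"user_name": "exampleuser@uu.nl", "user_zone": "nluu12p", "access_name": "own"},
]

print(acl_changes(before, after))
```

Keeping the `before` snapshot also gives you the information you need to ask your data manager to restore the original rights.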