# User-defined metadata

In addition to system metadata, iRODS lets you attach your own metadata to data objects and collections.

You can use this metadata to describe your data and to search for it later. It can also help you keep track of which data was the input for an analysis and which is the outcome.

Technically, iRODS stores metadata as key-value-units triples. Let's investigate this:

## Add metadata to data objects

As always, we first have to create an iRODS session:

In [1]:
from ibridges.interactive import interactive_auth
session = interactive_auth()

Auth without password


Now we can retrieve a data object and inspect its metadata.

In [5]:
from ibridges import get_dataobject
from ibridges.path import IrodsPath
from ibridges.meta import MetaData

irods_coll_path = IrodsPath(session, '~').joinpath('demo')
obj = get_dataobject(session, irods_coll_path.joinpath('demofile.txt'))

obj_metadata = MetaData(obj)
print(obj_metadata)




Most probably you will see no metadata in the above cell. **Note that system metadata and user-defined metadata are two different entities in a data object!**
With the command `MetaData(obj)` we retrieve only the user-defined metadata.

<img src="img/DataObject4.png" width="400">

Now we can add some metadata of our own. Each entry is a key-value-units triple:

In [6]:
obj_metadata.add('Key', 'Value', 'Units')
print(obj_metadata)

 - {name: Key, value: Value, units: Units}



Sometimes there are no meaningful `units`, so we can leave this part out:

In [9]:
obj_metadata.add('Author', 'Christine')
print(obj_metadata)

 - {name: Author, value: Christine, units: None}
 - {name: Key, value: Value, units: Units}



We can also add a second author:

In [10]:
obj_metadata.add('Author', 'Raoul')
print(obj_metadata)

 - {name: Author, value: Christine, units: None}
 - {name: Author, value: Raoul, units: None}
 - {name: Key, value: Value, units: Units}



As you can see, **one iRODS metadata key can have several values**. This differs from Python dictionaries, where each key has exactly one value. **How, then, do you overwrite a value?**
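
To make the contrast with dictionaries concrete, here is a small sketch in plain Python. It is only a toy model of the idea and does not use the iBridges API: the dictionary keeps one value per key, while a set of `(key, value, units)` tuples keeps both authors.

```python
# Toy model only: plain Python, not the ibridges API.

# A dictionary holds exactly one value per key.
as_dict = {}
as_dict["Author"] = "Christine"
as_dict["Author"] = "Raoul"  # silently overwrites the first author

# A set of (key, value, units) triples can hold the same key several times.
as_triples = set()
as_triples.add(("Author", "Christine", None))
as_triples.add(("Author", "Raoul", None))  # both entries are kept

authors = sorted(value for key, value, units in as_triples if key == "Author")

print(as_dict)  # {'Author': 'Raoul'}
print(authors)  # ['Christine', 'Raoul']
```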

## Overwrite metadata

If you wish to *overwrite* a value, you can first add the new key-value-units triple as above and then remove the old one. To remove an entry you need to specify the whole triple if it contains a units part. As you can see, the following command fails:

In [12]:
obj_metadata.delete('Key', 'Value')

KeyError: "Cannot delete metadata with key 'Key', value 'Value' and units 'None' since it does not exist."

While this one will succeed:

In [13]:
obj_metadata.delete('Key', 'Value', 'Units')
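
The add-then-delete recipe described above can be wrapped in a small helper. The sketch below is an illustration, not part of iBridges: `replace_triple` and the stand-in class `FakeMeta` are hypothetical names, and the helper only assumes an object with `add(key, value, units)` and `delete(key, value, units)` methods like the ones used in the cells above.

```python
def replace_triple(meta, key, old_value, new_value, units=None):
    """Add the new triple first, then delete the old one.

    The old entry is matched on the full (key, value, units) triple.
    """
    if old_value == new_value:
        return  # nothing to change; avoids deleting the entry we just kept
    meta.add(key, new_value, units)
    meta.delete(key, old_value, units)


class FakeMeta:
    """Hypothetical stand-in for a dry run; mimics exact-triple add/delete."""

    def __init__(self):
        self.triples = set()

    def add(self, key, value, units=None):
        self.triples.add((key, value, units))

    def delete(self, key, value, units=None):
        if (key, value, units) not in self.triples:
            raise KeyError(f"No metadata {key!r}, {value!r}, {units!r}")
        self.triples.remove((key, value, units))


demo = FakeMeta()
demo.add("Key", "Value", "Units")
replace_triple(demo, "Key", "Value", "NewValue", units="Units")
print(demo.triples)  # {('Key', 'NewValue', 'Units')}
```

Passing a real `MetaData` object instead of the stand-in should behave the same way, assuming its `add` and `delete` accept the units as a third positional argument, as the cells above suggest.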

You can also replace all existing values of a key with **one** new value:

In [16]:
print(obj_metadata)
obj_metadata.set('Author', 'Maarten')
print(obj_metadata)

 - {name: Author, value: Christine, units: None}
 - {name: Author, value: Raoul, units: None}

 - {name: Author, value: Maarten, units: None}
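
Conceptually, the behaviour shown above amounts to "drop every entry with this key, then add one new entry". The following plain-Python sketch models that on a list of tuples. The function name `set_value` and the `Status` entry are made up for illustration, and this is not how iBridges implements `set` internally.

```python
def set_value(triples, key, value, units=None):
    """Return a new list in which all entries for `key` are replaced by one triple."""
    kept = [t for t in triples if t[0] != key]  # entries with other keys stay untouched
    kept.append((key, value, units))
    return kept


before = [
    ("Author", "Christine", None),
    ("Author", "Raoul", None),
    ("Status", "draft", None),  # made-up entry with a different key
]
after = set_value(before, "Author", "Maarten")
print(after)  # [('Status', 'draft', None), ('Author', 'Maarten', None)]
```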



## Add metadata to collections

The same functionality is available for collections:

In [17]:
from ibridges import get_collection
coll = get_collection(session, irods_coll_path)
coll_metadata = MetaData(coll)
print(coll_metadata)




In [18]:
coll_metadata.add('TypeOfCollection', 'Results')
print(coll_metadata)

 - {name: TypeOfCollection, value: Results, units: None}



## Which metadata can help you keep an overview?

iRODS metadata can help you keep an overview while you are working with data, especially with many files that are related to each other. There are ontologies that define keywords and the links between them, such as the **[PROV-O ontology](https://www.w3.org/TR/prov-o/#prov-o-at-a-glance)**.

Let's see how we can annotate our test data, so that we know that it is test data.

In [19]:
from datetime import datetime
coll_metadata.add('prov:wasGeneratedBy', 'Christine')
coll_metadata.add('CollectionType', 'testcollection')
obj_metadata.add('prov:SoftwareAgent', 'iRODS jupyter Tutorial')
obj_metadata.add('DataType', 'testdata')

Now we have more descriptive metadata that hints at the context in which the data was created:

In [20]:
print(coll_metadata)
print(obj_metadata)

 - {name: CollectionType, value: testcollection, units: None}
 - {name: TypeOfCollection, value: Results, units: None}
 - {name: prov:wasGeneratedBy, value: Christine, units: None}

 - {name: Author, value: Maarten, units: None}
 - {name: DataType, value: testdata, units: None}
 - {name: prov:SoftwareAgent, value: iRODS jupyter Tutorial, units: None}
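
Provenance annotations often include a creation time as well. The sketch below prepares such triples in plain Python before anything is sent to iRODS. The helper `provenance_triples` is a hypothetical name, `prov:generatedAtTime` is a PROV-O property that is not used elsewhere in this tutorial, and the fixed date is an arbitrary value chosen so the output is reproducible.

```python
from datetime import datetime, timezone


def provenance_triples(agent, software, when=None):
    """Build (key, value, units) tuples describing who and what created the data."""
    when = when or datetime.now(timezone.utc)
    return [
        ("prov:wasGeneratedBy", agent, None),
        ("prov:SoftwareAgent", software, None),
        # ISO 8601 timestamps sort correctly as plain strings
        ("prov:generatedAtTime", when.isoformat(), None),
    ]


fixed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # arbitrary example date
triples = provenance_triples("Christine", "iRODS jupyter Tutorial", when=fixed)
for key, value, units in triples:
    print(key, value, units)
```

Each tuple could then be passed to `add` on a metadata object, in the same way as in the cells above. That step is left out here because it needs a live iRODS session.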

