# Real Example 1 - CREATION SCRIPT - Import data and create entities

At this point, having finished the whole tutorial, we know the possibilities that pyBIS offers and can work with most of the entity types present in the instance. The next logical step is to learn how to apply this knowledge to realistic situations.

To do so, we will look at **three** example scripts: one for creation and modification, one for getting information about the instance (number of objects, vocabularies, etc.), and one for maintenance. In this part of the workshop, we start with some base code and extend it step by step.

In this example, we work with a well-known dataset, the [**Iris Flower Classification Dataset**](https://en.wikipedia.org/wiki/Iris_flower_data_set), and learn how to handle it in openBIS, following the workflow of an experiment.

To begin, **run the cell below, replacing the "mmusterm" placeholder with your BAM user name**. The user name is used both for the connection and for selecting your space.

In [None]:
user = "mmusterm"
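
If the notebook is shared between participants, the hard-coded user name can be avoided. The following is a minimal sketch, assuming the user name is provided in an environment variable; the variable name `OPENBIS_USER` and the helper `resolve_user` are hypothetical and not part of pyBIS.

```python
import os

def resolve_user(default="mmusterm", env_var="OPENBIS_USER"):
    """Return the user name from the environment, or the default if unset or blank."""
    value = os.environ.get(env_var, "").strip()
    return value or default

user = resolve_user()
# The personal space code is the upper-cased user name, as used later in this notebook
space_code = user.upper()
```

With the variable unset, `user` keeps the placeholder value, so the rest of the notebook runs unchanged.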

### Connect to openBIS with pyBIS

In [None]:
import getpass
from pybis import Openbis
o = Openbis('https://schulung.datastore.bam.de')
o.login(user, getpass.getpass(), save_token=True)

### Create dummy data

In [None]:
space_code = user.upper()
project_code = 'IRIS_PROJECT'
collection_code = 'IRIS_EXPERIMENT'
object_code = 'TESTING_IRIS'

my_space = o.get_space(space_code)

try:
    my_project = my_space.get_project(project_code)
except ValueError:
    my_project = o.new_project(space=my_space, code=project_code)
    my_project.save()

try:
    my_collection = my_space.get_collection(collection_code)
except ValueError:
    my_collection = o.new_collection(project=project_code, code=collection_code, type='DEFAULT_EXPERIMENT')
    my_collection.save()

my_object = my_space.get_objects(code=object_code, project=project_code, collection=my_collection, type='EXPERIMENTAL_STEP')[0]
if not my_object:
    my_object = o.new_object(code=object_code, collection=my_collection, type='EXPERIMENTAL_STEP')
    my_object.save()
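
The project and collection above follow the same "get it, or create it if missing" pattern. A possible way to factor it out is sketched below; `get_or_create` is a hypothetical helper (not a pyBIS function), and it assumes, as the cell above does, that the getter raises `ValueError` when the entity does not exist.

```python
def get_or_create(getter, creator):
    """Return getter(); if it raises ValueError, build the entity with creator(), save it and return it."""
    try:
        return getter()
    except ValueError:
        entity = creator()
        entity.save()
        return entity

# Possible usage with the calls from the cell above (needs an active session `o`):
# my_project = get_or_create(
#     lambda: my_space.get_project(project_code),
#     lambda: o.new_project(space=my_space, code=project_code),
# )
```

Passing callables keeps the helper independent of the entity type, so the same function could serve projects and collections.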

## Upload a dataset and attach it to an experimental step

In [None]:
o.get_dataset_types() #list dataset types to select the desired one

In [None]:
my_space.get_collections() #list collections to check where we want to upload the dataset

In [None]:
my_space.get_objects() #list objects to check where we want to upload the dataset

In [None]:
iris_experiment = my_space.get_collection(collection_code) #save selected collection in a variable
testing_iris = my_space.get_objects(code=object_code, collection=my_collection)[0] #save selected object in a variable

In [None]:
iris_test = o.new_dataset(
    type = 'ATTACHMENT', #selected type for the dataset
    collection = iris_experiment, #selected collection
    object = testing_iris, #selected object
    files = ['datasets/iris.csv'] #iris dataset to upload
)
iris_test.save()
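
The paths in `files` are relative to the notebook's working directory, so it may be worth checking them before creating the dataset. A minimal sketch using only the standard library; `missing_files` is a hypothetical helper name.

```python
from pathlib import Path

def missing_files(paths):
    """Return the subset of paths that do not point to an existing regular file."""
    return [p for p in paths if not Path(p).is_file()]

files = ['datasets/iris.csv']
not_found = missing_files(files)
if not_found:
    print(f"Cannot upload, missing files: {not_found}")
```

An empty list means every file was found and the list can be passed on as the `files` argument.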

## Add a description for the experimental step (modify property)

In [None]:
#--------------------------NEW CODE------------------------------
testing_iris.props
testing_iris.props['experimental_step.experimental_description'] = 'Analyzing iris flower classification csv'
testing_iris.save() #persist the modified property in openBIS
#--------------------------NEW CODE------------------------------
iris_test = o.new_dataset(
    type = 'ATTACHMENT', #selected type for the dataset
    collection = iris_experiment, #selected collection
    object = testing_iris, #selected object
    files = ['datasets/iris.csv'] #iris dataset to upload
)
iris_test.save()

## Create the experimental step first

In [None]:
o.get_object_types()

In [None]:
#--------------------------NEW CODE------------------------------
obtaining_flowers = my_space.get_objects(project=project_code, collection=iris_experiment, type='EXPERIMENTAL_STEP', code="OBTAINING_FLOWERS")[0]
if not obtaining_flowers:
    obtaining_flowers = o.new_object(
        type = 'EXPERIMENTAL_STEP',
        project = project_code,
        collection = iris_experiment,
        code = "OBTAINING_FLOWERS"
    )
    
obtaining_flowers.props

obtaining_flowers.props['experimental_step.experimental_description'] = 'Obtaining iris flower samples'

obtaining_flowers.save()
#--------------------------NEW CODE------------------------------
testing_iris = my_space.get_objects(code=object_code, collection=my_collection)[0] #save selected object in a variable

testing_iris.props

testing_iris.props['experimental_step.experimental_description'] = 'Analyzing iris flower classification csv'

testing_iris.save() #persist the modified description

iris_samples = o.new_dataset(
    type = 'ATTACHMENT',
    collection = iris_experiment,
    object = obtaining_flowers,
    files = ['datasets/iris.csv']
)
iris_samples.save()

## Add experimental step before the analysis: obtaining the data

In [None]:
iris_experiment = my_space.get_collection(collection_code) #save selected collection in a variable

obtaining_flowers = my_space.get_objects(project=project_code, collection=iris_experiment, type='EXPERIMENTAL_STEP', code="OBTAINING_FLOWERS")[0]
if not obtaining_flowers:
    obtaining_flowers = o.new_object(
        type = 'EXPERIMENTAL_STEP',
        collection = iris_experiment,
        project = project_code,
        code = "OBTAINING_FLOWERS"
    )
    
obtaining_flowers.props

obtaining_flowers.props['experimental_step.experimental_description'] = 'Obtaining iris flower samples'

obtaining_flowers.save()

testing_iris = my_space.get_objects(code=object_code, collection=my_collection)[0] #save selected object in a variable

testing_iris.props

testing_iris.props['experimental_step.experimental_description'] = 'Classifying iris flower classification csv'

testing_iris.save()

#--------------------------NEW CODE------------------------------
testing_iris.parents = obtaining_flowers
testing_iris.save()
#--------------------------NEW CODE------------------------------
iris_samples = o.new_dataset(
    type = 'ATTACHMENT',
    collection = iris_experiment,
    object = obtaining_flowers,
    files = ['datasets/iris.csv']
)
iris_samples.save()

## What if we need another (custom) object type? -> change to use another type (add it as child of the first one, and upload a dataset with the results)

In [None]:
iris_experiment = my_space.get_collection(collection_code) #save selected collection in a variable
#--------------------------NEW CODE------------------------------
analyze_results = my_space.get_objects(project=project_code, collection=iris_experiment, type='DOCUMENT', code="ANALYZE_RESULTS")[0]
if not analyze_results:
    analyze_results = o.new_object(
        type = 'DOCUMENT',
        collection = iris_experiment,
        project = project_code,
        code = "ANALYZE_RESULTS"
    )

analyze_results.props

analyze_results.props['notes'] = 'Analyzing results from the experiment'

analyze_results.save()
#--------------------------NEW CODE------------------------------
obtaining_flowers = my_space.get_objects(project=project_code, type='EXPERIMENTAL_STEP', code="OBTAINING_FLOWERS")[0]
if not obtaining_flowers:
    obtaining_flowers = o.new_object(
        type = 'EXPERIMENTAL_STEP',
        collection = iris_experiment,
        code = "OBTAINING_FLOWERS"
    )
    
obtaining_flowers.props

obtaining_flowers.props['experimental_step.experimental_description'] = 'Obtaining iris flower samples'

obtaining_flowers.save()

testing_iris = my_space.get_objects(code=object_code, collection=my_collection)[0] #save selected object in a variable

testing_iris.props

testing_iris.props['experimental_step.experimental_description'] = 'Classifying iris flower classification csv'

testing_iris.save()

testing_iris.parents = obtaining_flowers
#--------------------------NEW CODE------------------------------
testing_iris.children = analyze_results
#--------------------------NEW CODE------------------------------
testing_iris.save()


iris_samples = o.new_dataset(
    type = 'ATTACHMENT',
    collection = iris_experiment,
    object = obtaining_flowers,
    files = ['datasets/iris.csv']
)
iris_samples.save()
#--------------------------NEW CODE------------------------------
iris_results = o.new_dataset(
    type = 'DOCUMENT',
    collection = iris_experiment,
    object = analyze_results,
    files = ['iris_results.txt']
)
iris_results.save()
#--------------------------NEW CODE------------------------------
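
The parent and child links set in this cell form a small chain of three objects. The sketch below models that chain in plain Python, without contacting openBIS, and derives an order in which each parent comes before its children. The dictionary is an illustration built from the object codes used above; `graphlib` is part of the standard library from Python 3.9 on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each object code maps to the set of its parent codes, mirroring
# `testing_iris.parents = obtaining_flowers` and
# `testing_iris.children = analyze_results` from the cell above.
parents_of = {
    "OBTAINING_FLOWERS": set(),
    "TESTING_IRIS": {"OBTAINING_FLOWERS"},
    "ANALYZE_RESULTS": {"TESTING_IRIS"},
}

# static_order() yields every node after all of its predecessors (here: its parents)
lineage = list(TopologicalSorter(parents_of).static_order())
print(" -> ".join(lineage))
```

The resulting order starts with the step that obtains the data and ends with the document holding the results.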

## What if the experiment needs to be created first? -> extend to create experiment first

In [None]:
o.get_collection_types()

In [None]:
o.get_projects()

In [None]:
#--------------------------NEW CODE------------------------------
try:
    new_iris_exp = my_space.get_collection("NEW_IRIS_EXPERIMENT")
except ValueError:
    new_iris_exp = o.new_collection(
        code = 'NEW_IRIS_EXPERIMENT', # the code for the collection. Like the name, but unique
        type = 'DEFAULT_EXPERIMENT', # type for the collection. Should be one of the available types on the type list
        project = project_code # code of the project created at the beginning of the notebook
    )
    new_iris_exp.save()
#--------------------------NEW CODE------------------------------

new_analyze_results = my_space.get_objects(collection = new_iris_exp, project = project_code, type='DOCUMENT', code="NEW_ANALYZE_RESULTS")[0]
if not new_analyze_results:
    new_analyze_results = o.new_object(
        type = 'DOCUMENT',
        project = project_code,
        collection = new_iris_exp,
        code = "NEW_ANALYZE_RESULTS"
    )


new_analyze_results.props

new_analyze_results.props['notes'] = 'Analyzing results from the experiment'

new_analyze_results.save()

new_obtaining_flowers = my_space.get_objects(project = project_code, collection=new_iris_exp, type='EXPERIMENTAL_STEP', code="NEW_OBTAINING_FLOWERS")[0]
if not new_obtaining_flowers:
    new_obtaining_flowers = o.new_object(
        type = 'EXPERIMENTAL_STEP',
        project = project_code,
        collection=new_iris_exp,
        code = "NEW_OBTAINING_FLOWERS"
    )
    
new_obtaining_flowers.props

new_obtaining_flowers.props['experimental_step.experimental_description'] = 'Obtaining iris flower samples'

new_obtaining_flowers.save()

new_testing_iris = my_space.get_objects(project = project_code, collection=new_iris_exp, type='EXPERIMENTAL_STEP', code="NEW_TESTING_IRIS")[0] #save selected object in a variable
if not new_testing_iris:
    new_testing_iris = o.new_object(
        type = 'EXPERIMENTAL_STEP',
        project = project_code,
        collection=new_iris_exp,
        code = "NEW_TESTING_IRIS"
    )

new_testing_iris.props

new_testing_iris.props['experimental_step.experimental_description'] = 'Classifying iris flower classification csv'

new_testing_iris.save()

new_testing_iris.add_parents(new_obtaining_flowers)
new_testing_iris.add_children(new_analyze_results)
new_testing_iris.save()


new_iris_samples = o.new_dataset(
    type = 'ATTACHMENT',
    collection = new_iris_exp,
    object = new_obtaining_flowers,
    files = ['datasets/iris.csv']
)
new_iris_samples.save()

new_iris_results = o.new_dataset(
    type = 'DOCUMENT',
    collection = new_iris_exp,
    object = new_analyze_results,
    files = ['iris_results.txt']
)
new_iris_results.save()
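
The notebook uploads `iris_results.txt` without showing how such a file could be produced. One possible way is sketched below: it counts the rows per species and formats them as text. The column name `species`, the helper names, and the tiny in-memory sample standing in for `datasets/iris.csv` are assumptions for illustration only.

```python
import csv
import io
from collections import Counter

def count_species(csv_text, column="species"):
    """Count how many rows of the CSV text belong to each value of `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)

def format_results(counts):
    """Render the counts as one 'name: count' line per species, sorted by name."""
    return "\n".join(f"{name}: {n}" for name, n in sorted(counts.items())) + "\n"

# Small sample standing in for datasets/iris.csv (assumed column names)
sample = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
    "6.3,3.3,6.0,2.5,virginica\n"
    "4.9,3.0,1.4,0.2,setosa\n"
)

report = format_results(count_species(sample))
print(report)
```

Writing `report` to a text file would give something that can be attached to the results object in the same way as `iris_results.txt`.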

## What if we want another (custom) collection type? -> change to use another type

In [None]:
try:
    col_iris_exp = my_space.get_collection("COL_IRIS_EXPERIMENT")
except ValueError:
    col_iris_exp = o.new_collection(
        code = 'COL_IRIS_EXPERIMENT', # the code for the collection. Like the name, but unique
        #--------------------------NEW CODE------------------------------
        type = 'COLLECTION', # type for the collection. Should be one of the available types on the type list
        #--------------------------NEW CODE------------------------------
        project = project_code # code of the project created at the beginning of the notebook
    )
    col_iris_exp.save()

col_analyze_results = my_space.get_objects(collection = col_iris_exp, project = project_code, type='DOCUMENT', code="COL_ANALYZE_RESULTS")[0]
if not col_analyze_results:
    col_analyze_results = o.new_object(
        type = 'DOCUMENT',
        project = project_code,
        collection = col_iris_exp,
        code = "COL_ANALYZE_RESULTS"
    )


col_analyze_results.props

col_analyze_results.props['notes'] = 'Analyzing results from the experiment'

col_analyze_results.save()

col_obtaining_flowers = my_space.get_objects(project = project_code, collection=col_iris_exp, type='EXPERIMENTAL_STEP', code="COL_OBTAINING_FLOWERS")[0]
if not col_obtaining_flowers:
    col_obtaining_flowers = o.new_object(
        type = 'EXPERIMENTAL_STEP',
        project = project_code,
        collection=col_iris_exp,
        code = "COL_OBTAINING_FLOWERS"
    )
    
col_obtaining_flowers.props

col_obtaining_flowers.props['experimental_step.experimental_description'] = 'Obtaining iris flower samples'

col_obtaining_flowers.save()

col_testing_iris = my_space.get_objects(project = project_code, collection=col_iris_exp, type='EXPERIMENTAL_STEP', code="COL_TESTING_IRIS")[0] #save selected object in a variable
if not col_testing_iris:
    col_testing_iris = o.new_object(
        type = 'EXPERIMENTAL_STEP',
        project = project_code,
        collection=col_iris_exp,
        code = "COL_TESTING_IRIS"
    )

col_testing_iris.props

col_testing_iris.props['experimental_step.experimental_description'] = 'Classifying iris flower classification csv'

col_testing_iris.save()

col_testing_iris.add_parents(col_obtaining_flowers)
col_testing_iris.add_children(col_analyze_results)
col_testing_iris.save()


col_iris_samples = o.new_dataset(
    type = 'ATTACHMENT',
    collection = col_iris_exp,
    object = col_obtaining_flowers,
    files = ['datasets/iris.csv']
)
col_iris_samples.save()

col_iris_results = o.new_dataset(
    type = 'DOCUMENT',
    collection = col_iris_exp,
    object = col_analyze_results,
    files = ['iris_results.txt']
)
col_iris_results.save()
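
The last cells differ mainly in the prefix of the entity codes (`NEW_`, `COL_`). As a sketch of how the repetition could be reduced, the codes can be generated from a prefix; `build_codes` and its dictionary keys are hypothetical names chosen for this example.

```python
def build_codes(prefix):
    """Return the entity codes for one variant of the workflow, e.g. prefix 'NEW' or 'COL'."""
    base = {
        "collection": "IRIS_EXPERIMENT",
        "obtaining": "OBTAINING_FLOWERS",
        "testing": "TESTING_IRIS",
        "results": "ANALYZE_RESULTS",
    }
    return {key: f"{prefix}_{code}" for key, code in base.items()}

codes = build_codes("COL")
for key, code in codes.items():
    print(f"{key}: {code}")
```

The generated values match the codes used in the cell above, so they could replace the string literals one by one.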

## What if we need to work in a different project? -> extend to create a project first

In [None]:
#--------------------------NEW CODE------------------------------
try:
    iris_classification = my_space.get_project("IRIS_CLASSIFICATION")
except ValueError:
    iris_classification = o.new_project(
        code = 'IRIS_CLASSIFICATION', #the code for the project. Like the name, but unique
        space = my_space, # the space retrieved at the beginning of the notebook
    )
    iris_classification.save()
#--------------------------NEW CODE------------------------------

try:
    clas_iris_exp = my_space.get_collection("CLAS_IRIS_EXPERIMENT")
except ValueError:
    clas_iris_exp = o.new_collection(
        code = 'CLAS_IRIS_EXPERIMENT', # the code for the collection. Like the name, but unique
        type = 'COLLECTION', # type for the collection. Should be one of the available types on the type list
        project = iris_classification # the project retrieved or created just above
    )
    clas_iris_exp.save()

clas_analyze_results = my_space.get_objects(collection = clas_iris_exp, project = iris_classification, type='DOCUMENT', code="CLAS_ANALYZE_RESULTS")[0]
if not clas_analyze_results:
    clas_analyze_results = o.new_object(
        type = 'DOCUMENT',
        project = iris_classification,
        collection = clas_iris_exp,
        code = "CLAS_ANALYZE_RESULTS"
    )


clas_analyze_results.props

clas_analyze_results.props['notes'] = 'Analyzing results from the experiment'

clas_analyze_results.save()

clas_obtaining_flowers = my_space.get_objects(project = iris_classification, collection=clas_iris_exp, type='EXPERIMENTAL_STEP', code="CLAS_OBTAINING_FLOWERS")[0]
if not clas_obtaining_flowers:
    clas_obtaining_flowers = o.new_object(
        type = 'EXPERIMENTAL_STEP',
        project = iris_classification,
        collection=clas_iris_exp,
        code = "CLAS_OBTAINING_FLOWERS"
    )
    
clas_obtaining_flowers.props

clas_obtaining_flowers.props['experimental_step.experimental_description'] = 'Obtaining iris flower samples'

clas_obtaining_flowers.save()

clas_testing_iris = my_space.get_objects(project = iris_classification, collection=clas_iris_exp, type='EXPERIMENTAL_STEP', code="CLAS_TESTING_IRIS")[0] #save selected object in a variable
if not clas_testing_iris:
    clas_testing_iris = o.new_object(
        type = 'EXPERIMENTAL_STEP',
        project = iris_classification,
        collection=clas_iris_exp,
        code = "CLAS_TESTING_IRIS"
    )

clas_testing_iris.props

clas_testing_iris.props['experimental_step.experimental_description'] = 'Classifying iris flower classification csv'

clas_testing_iris.save()

clas_testing_iris.parents = clas_obtaining_flowers
clas_testing_iris.children = clas_analyze_results
clas_testing_iris.save()


clas_iris_samples = o.new_dataset(
    type = 'ATTACHMENT',
    collection = clas_iris_exp,
    object = clas_obtaining_flowers,
    files = ['datasets/iris.csv']
)
clas_iris_samples.save()

clas_iris_results = o.new_dataset(
    type = 'DOCUMENT',
    collection = clas_iris_exp,
    object = clas_analyze_results,
    files = ['iris_results.txt']
)
clas_iris_results.save()
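
Once several projects exist in the same space, it can help to refer to a collection by its full identifier, which in openBIS has the form `/SPACE/PROJECT/COLLECTION`. The helper below is a hypothetical sketch that only builds the string; it does not contact the server.

```python
def collection_identifier(space_code, project_code, collection_code):
    """Build an identifier of the form /SPACE/PROJECT/COLLECTION from three codes."""
    parts = [space_code, project_code, collection_code]
    for part in parts:
        if not part or "/" in part:
            raise ValueError(f"invalid code: {part!r}")
    # Codes are written in upper case, as with `space_code = user.upper()` above
    return "/" + "/".join(part.upper() for part in parts)

identifier = collection_identifier("mmusterm", "IRIS_CLASSIFICATION", "CLAS_IRIS_EXPERIMENT")
print(identifier)
```

The resulting string should be usable wherever pyBIS accepts a collection identifier, for example `o.get_collection(identifier)`.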