# Create MoBIE Project

Create an example MoBIE project with the python mobie package.
See [the installation instructions](https://github.com/mobie/mobie-utils-python) to set up the python package.
For more details on MoBIE and the MoBIE project structure, see [the MoBIE README](https://github.com/mobie/mobie#data-storage).

The data used in this example is part of the publication [Seipin and Nem1 establish discrete ER subdomains to initiate yeast lipid droplet biogenesis](https://doi.org/10.1083/jcb.201910177) and can be downloaded from [here](https://oc.embl.de/index.php/s/IV1709ZlcUB1k99).

In [None]:
# general imports
import os
import imageio
import mobie
import mobie.metadata as metadata

# the location of the data
# adapt these paths to your system and the input data you are using

# location of the input data. 
# the example data used in this notebook is available via this link:
# https://oc.embl.de/index.php/s/IV1709ZlcUB1k99
example_input_data = '/home/pape/Work/data/mobie/mobie-example-data'

# the location of the mobie project that will be created
# note that mobie project folders should always have the structure <PROJECT_ROOT_FOLDER>/data
# the folder 'data' will contain the sub-folders for individual datasets
mobie_project_folder = '/home/pape/Work/data/mobie/mobie_example_project/data'

# name of the dataset that will be created.
# one project can contain multiple datasets
dataset_name = 'example-dataset'
dataset_folder = os.path.join(mobie_project_folder, dataset_name)

# the platform and number of jobs used for computation.
# choose 'local' to run computations on your machine.
# for large data, it is also possible to run computation on a cluster;
# for this purpose 'slurm' (for slurm cluster) and 'lsf' (for lsf cluster) are currently supported
target = 'local'
max_jobs = 4

## Initialize the dataset

First, we need to initialize the dataset. This step includes generating the top-level project folder (if it's not present already), the subfolders for the new dataset and adding the "default" image for this dataset.
All these steps are performed by the function `add_image`.

This function accepts input image data in different formats. The input is specified with two arguments:
`input_path`, the file path, and `input_key`, the internal path or search pattern. The supported input formats are:
- tif images (2d or 3d) - for this option set `input_key=''`
- folder with image files - for this option `input_key` needs to be the glob pattern for the image files, e.g. `input_key='*.tif'` to load all tif files
- hdf5 file - `input_key` needs to be the internal file path
- n5 or zarr file - `input_key` needs to be the internal file path
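
For the folder option, it can help to check which files a glob pattern selects before passing it as `input_key`. The following is a minimal sketch using only the Python standard library; the file names are made up for illustration, and processing the matched files in sorted order is an assumption, not something this notebook confirms about `add_image`.

```python
import glob
import os
import tempfile

# create a temporary folder with a few (empty) placeholder files
with tempfile.TemporaryDirectory() as folder:
    for fname in ["slice_000.tif", "slice_001.tif", "notes.txt"]:
        open(os.path.join(folder, fname), "w").close()

    # the same kind of pattern that would be passed as input_key
    pattern = "*.tif"
    matched = sorted(
        os.path.basename(path)
        for path in glob.glob(os.path.join(folder, pattern))
    )

# only the tif files match; the text file is skipped
print(matched)
```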

The input files will be copied into the project folder in the [bdv.n5 dataformat](https://github.com/bigdataviewer/bigdataviewer-core/blob/master/BDV%20N5%20format.md) and an image pyramid will be created through consecutive downsampling.

To efficiently process large files the inputs should be in hdf5, n5 or zarr format.
Note that all inputs need to be either 2d or 3d images (volumes).
Multi-channel images (volumes) should be separated into their channels, with each channel then added individually (see `Adding image data` below).

In [None]:
# The 'default' image for our example dataset is a 2d EM slice showing an overview of the dataset.
input_file = os.path.join(example_input_data, 'em_overview.tif')

# This is the name that will be given to the image source in mobie.
raw_name = 'em-raw'

# We need some metadata to create the n5-file in big-data-viewer format:
# - unit: the physical unit of the coordinate system
# - resolution: the size of one voxel in the physical unit, this needs to be a tuple/list of length 3,
#               specifying the size for each of the 3 spatial dimensions
# - chunks: the size of the chunks (in voxels) that are used to store the output file.
#           good choices are usually (1, 512, 512) for 2d data and (64, 64, 64) for 3d data
# - scale_factors: the scale factors used for downsampling the input when creating the image pyramid
#                  this needs to be a list, where each entry specifies the scale factors for the 3 axes.
# Note that axes are always listed in the order ZYX here (in the java implementation of mobie / big-data-viewer the axis convention is XYZ).
# Also note that the values for all three axes (ZYX) need to be specified. In the case of 2d data, the value
# for Z should be set to 1.
unit = 'nanometer'
resolution = (1., 10., 10.)
chunks = (1, 512, 512)
scale_factors = 4 * [[1, 2, 2]]

mobie.add_image(
    input_path=input_file, 
    input_key='',  # the input is a single tif image, so we leave input_key blank
    root=mobie_project_folder,
    dataset_name=dataset_name,
    image_name=raw_name,
    resolution=resolution,
    chunks=chunks,
    scale_factors=scale_factors,
    is_default_dataset=True,  # mark this dataset as the default dataset that will be loaded by mobie
    target=target,
    max_jobs=max_jobs,
    unit=unit
)
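
To get a feeling for what `scale_factors` does, the sketch below computes the image shape at each level of the pyramid. The helper `pyramid_shapes` is hypothetical and not part of the mobie package, the input shape is made up, and rounding up at each level is an assumption; the library may round differently.

```python
import math


def pyramid_shapes(shape, scale_factors):
    """Return the ZYX shape of every pyramid level, starting with the full resolution."""
    shapes = [tuple(shape)]
    for factor in scale_factors:
        previous = shapes[-1]
        # each level is downsampled relative to the previous one
        # (rounding up is an assumption made for this sketch)
        shapes.append(tuple(math.ceil(s / f) for s, f in zip(previous, factor)))
    return shapes


# hypothetical 2d image of 4096 x 4096 pixels, so the Z extent is 1
shapes = pyramid_shapes((1, 4096, 4096), 4 * [[1, 2, 2]])
for level, level_shape in enumerate(shapes):
    print(f"level {level}: {level_shape}")
```

With four scale factors of `[1, 2, 2]`, the pyramid has five levels, and Z is never downsampled because its factor is 1.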

## Adding image data

After a dataset is created, we can add additional images to the dataset with the `add_image` function.

In [None]:
# First, we add two EM tomograms that are available in the example dataset.
# These tomograms show small areas in higher detail and in 3d.

# These are the two file names for the tomograms.
tomo_names = ['27_tomogram.tif', '29_tomogram.tif']

# We choose chunks and scale factors for 3d data, taking
# into account that the tomograms have a larger extent in the
# XY plane than in Z
unit = 'nanometer'
resolution = [5., 5., 5.]
chunks = (32, 128, 128)
scale_factors = [[1, 2, 2], [1, 2, 2],
                 [1, 2, 2], [1, 2, 2],
                 [2, 2, 2]]

# The tomograms need to be placed at the correct position w.r.t.
# the 2d em overview. This is achieved via an affine transformation,
# that has been determined externally and will be applied on the fly by big-data-viewer.
# Each affine transformation contains 12 parameters.
transformations = [
    [5.098000335693359, 0.0, 0.0, 54413.567834472655,
     0.0, 5.098000335693359, 0.0, 51514.319843292236,
     0.0, 0.0, 5.098000335693359, 0.0],
    [5.098000335693359, 0.0, 0.0, 39024.47988128662,
     0.0, 5.098000335693359, 0.0, 44361.50386505127,
     0.0, 0.0, 5.098000335693359, 0.0]
]

# add the two tomograms
for name, trafo in zip(tomo_names, transformations):
    im_name = f"em-{os.path.splitext(name)[0]}"
    im_path = os.path.join(example_input_data, name)
    
    # we need to pass additional 'view' arguments for the tomograms.
    # view arguments can modify the viewer state for loading the image source
    # here, we adjust the contrast limits to load the tomograms with
    # the correct contrast already and we set the affine transformation
    # that will map the tomograms to the correct position via sourceTransforms
    im = imageio.volread(im_path)
    min_val, max_val = im.min(), im.max()
    view = metadata.get_default_view("image", im_name,
                                     source_transform={"parameters": trafo},
                                     contrastLimits=[min_val, max_val])
    mobie.add_image(
        input_path=im_path,
        input_key="",
        root=mobie_project_folder,
        dataset_name=dataset_name,
        image_name=im_name,
        resolution=resolution,
        scale_factors=scale_factors,
        transformation=trafo,
        chunks=chunks,
        target=target,
        max_jobs=max_jobs,
        view=view,
        unit=unit
    )
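
The 12 affine parameters used above can be read as a 3x4 matrix stored row by row, where the last entry of each row is the translation. The sketch below applies such a transformation to a point in plain Python. The helper `apply_affine` is hypothetical, and both the row-major layout and the XYZ axis order are assumptions based on the values above and the big-data-viewer convention mentioned earlier.

```python
def apply_affine(parameters, point):
    """Apply a 12-parameter affine transformation (assumed row-major 3x4) to a 3d point."""
    if len(parameters) != 12:
        raise ValueError("expected 12 affine parameters")
    rows = [parameters[0:4], parameters[4:8], parameters[8:12]]
    # each output coordinate is the dot product with the first three
    # entries of a row, plus the translation in the fourth entry
    return tuple(
        sum(row[i] * point[i] for i in range(3)) + row[3]
        for row in rows
    )


# parameters of the first tomogram from the cell above
trafo = [5.098000335693359, 0.0, 0.0, 54413.567834472655,
         0.0, 5.098000335693359, 0.0, 51514.319843292236,
         0.0, 0.0, 5.098000335693359, 0.0]

# the origin of the tomogram is moved by the translation part only
origin = apply_affine(trafo, (0.0, 0.0, 0.0))
print(origin)
```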

In [None]:
# Next, we add a fluorescence image that is also part of the example dataset.

input_path = os.path.join(example_input_data, 'fluorescence_downsampled.tif')

# The name of the image in mobie.
# Note that mobie will use the identifier in front of the first '-'
# to group images by name.
# So in this case we will have the two groups 'em' and 'lm'.
im_name = "lm-fluorescence"

# This is again a 2d image, so we set all values for Z to 1.
unit = 'nanometer'
resolution = [1., 100., 100.]
scale_factors = [[1, 2, 2], [1, 2, 2], [1, 2, 2]]
chunks = (1, 512, 512)

# we set the default display color to green.
view = metadata.get_default_view(
    "image", im_name,
    color="green"
)

mobie.add_image(
    input_path=input_path,
    input_key="",
    root=mobie_project_folder,
    dataset_name=dataset_name,
    image_name=im_name,
    resolution=resolution,
    scale_factors=scale_factors,
    view=view,
    chunks=chunks,
    target=target,
    max_jobs=max_jobs,
    unit=unit
)
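
The naming rule described in the comments above (the identifier in front of the first '-' defines the group) can be illustrated with a few lines of Python. The helper `group_by_prefix` is hypothetical and only mirrors the rule as stated here; it is not MoBIE's actual implementation.

```python
from collections import defaultdict


def group_by_prefix(names):
    """Group image names by the identifier in front of the first '-'."""
    groups = defaultdict(list)
    for name in names:
        prefix = name.split("-", 1)[0]
        groups[prefix].append(name)
    return dict(groups)


# the image names added to the dataset so far in this notebook
names = ["em-raw", "em-27_tomogram", "em-29_tomogram", "lm-fluorescence"]
groups = group_by_prefix(names)
print(groups)
```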

In [None]:
# as the last image, we add a binary mask for the foreground in the image
input_path = os.path.join(example_input_data, 'em_mask.tif')
mask_name = "em-mask"

# again, the mask is 2d
unit = "nanometer"
chunks = [1, 256, 256]
resolution = [1., 160., 160.]
scale_factors = [[1, 2, 2]]

mobie.add_image(
    input_path=input_path,
    input_key="",
    root=mobie_project_folder,
    dataset_name=dataset_name,
    image_name=mask_name,
    resolution=resolution,
    chunks=chunks,
    scale_factors=scale_factors,
    unit=unit
)

## Adding segmentation data

In addition to image data and masks, MoBIE supports segmentations, which contain label masks for different objects
(e.g. organs, cells, ultrastructure) in the volume. For segmentations, MoBIE also supports tables, which contain additional properties for the objects in the segmentation.
The function `add_segmentation` copies the input data and also generates the default table for the segmentation.

In [None]:
# we add a segmentation for several objects visible in the em-overview image
input_path = os.path.join(example_input_data, 'em_segmentation.tif')
segmentation_name = "em-segmentation"

unit = "nanometer"
resolution = [1., 30., 30.]
chunks = [1, 256, 256]
scale_factors = [[1, 2, 2], [1, 2, 2], [1, 2, 2], [1, 2, 2]]

mobie.add_segmentation(
    input_path=input_path,
    input_key="",
    root=mobie_project_folder,
    dataset_name=dataset_name,
    segmentation_name=segmentation_name,
    resolution=resolution,
    chunks=chunks,
    scale_factors=scale_factors,
    add_default_table=True  # add the default table with the properties mobie needs to interact with table and segmentation
)
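
To illustrate what a table with per-object properties can contain, the sketch below computes the pixel count and bounding box for every label in a tiny 2d label image. It is a toy example in plain Python: the helper `label_table` and the column names are hypothetical, treating label 0 as background is an assumption, and the actual columns of the default table written by `add_segmentation` are not shown in this notebook.

```python
def label_table(labels):
    """Compute pixel count and bounding box (in YX order) for every non-zero label."""
    table = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label == 0:  # label 0 is treated as background (assumption)
                continue
            entry = table.setdefault(
                label, {"n_pixels": 0, "bb_min": [y, x], "bb_max": [y, x]}
            )
            entry["n_pixels"] += 1
            entry["bb_min"] = [min(entry["bb_min"][0], y), min(entry["bb_min"][1], x)]
            entry["bb_max"] = [max(entry["bb_max"][0], y), max(entry["bb_max"][1], x)]
    return table


# made-up label image with two objects
labels = [
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 2, 2],
]
table = label_table(labels)
for label_id, properties in sorted(table.items()):
    print(label_id, properties)
```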

## Adding and updating bookmarks

TODO describe

In [None]:
# we update the default bookmark so that both the raw data 
# and the segmentation are loaded upon opening the dataset
source_list = [[raw_name], [segmentation_name]]
settings = [ 
    {"color": "white", "contrastLimits": [0., 255.]},
    {"color": "glasbey", "opacity": 0.75}
]
mobie.metadata.add_dataset_bookmark(dataset_folder, "default",
                                    sources=source_list, display_settings=settings,
                                    overwrite=True)

# TODO add a bookmark with affine transform and a grid bookmark for the tomograms

## Publishing the project 

The project created above will be located on the local filesystem at `mobie_project_folder`.
To share it with collaborators or make the data public, you can upload it to an
[AWS S3](https://aws.amazon.com/s3/) compatible object store, which MoBIE can read directly.
This requires some additional metadata, which can be generated via `add_remote_project_metadata`.

The data then needs to be uploaded to the s3 storage by some appropriate tool and the metadata needs to be uploaded to github to make it accessible for MoBIE.

In [None]:
from mobie.metadata import add_remote_project_metadata

# to generate the metadata for publishing the project, the
# following information is needed:
# - bucket_name: the name of the bucket in the object store
# - service_endpoint: the address of the service endpoint used.
#                     this allows specifying object stores that are different from aws
#                     here, we use the object store located at EMBL Heidelberg as service endpoint.
#                     to use an aws s3 endpoint, set it to https://s3.amazonaws.com 
bucket_name = 'my-test-bucket'

service_endpoint = 'https://s3.embl.de'

metadata.add_remote_project_metadata(
    mobie_project_folder,
    bucket_name,
    service_endpoint
)

# Once the metadata is generated, you can upload your project. 
# MoBIE can access projects directly from an s3 compatible object store.
# Optionally the metadata can be uploaded to github to have it under version control;
# the github repository can also be used as entry point for the MoBIE viewer.

# 1.) Upload the complete folder at "mobie_project_folder" to the s3 bucket.
# There are several tools available to achieve this, for example
# aws s3 sync (https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html)
# The sync command would look something like this (assuming the file paths used in this example)
# $ aws s3 sync /home/pape/Work/data/mobie/mobie_example_project/data s3://my-test-bucket --endpoint-url https://s3.embl.de

# 2.) (OPTIONAL!) Create a github repository for this project and upload the metadata to it:
# - Go to https://github.com/ and log into or create your account
# - Create a new empty (!) repository, e.g. called "my-mobie-project"
# - Go to /home/pape/Work/data/mobie/mobie_example_project in a terminal (again assuming the filepaths used in the example notebook)
# - Initialize git via 
#   $ git init
# - Add the repository you just created as remote via
#   $ git remote add origin https://github.com/<USERNAME>/my-mobie-project
# - Tell git to ignore the image data files (n5 files) by creating a file ".gitignore" and adding the line "*.n5"
#   This is very important, because otherwise we would add all the image data to git.
# - Add the metadata to git via
#   $ git add .
# - Commit the changes via
#   $ git commit -m "Add MoBIE project metadata"
# - Upload the metadata to github via
#   $ git push origin master
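
The `.gitignore` step can also be done from Python. The sketch below adds the `*.n5` pattern to a `.gitignore` file without duplicating it if it is already present. The helper `ensure_gitignore_entry` is hypothetical, and a temporary folder stands in for the project root folder.

```python
import os
import tempfile


def ensure_gitignore_entry(repo_folder, pattern="*.n5"):
    """Add 'pattern' to the .gitignore in 'repo_folder' unless it is already listed."""
    path = os.path.join(repo_folder, ".gitignore")
    lines = []
    if os.path.exists(path):
        with open(path) as f:
            lines = f.read().splitlines()
    if pattern not in lines:
        lines.append(pattern)
        with open(path, "w") as f:
            f.write("\n".join(lines) + "\n")
    return path


# a temporary folder stands in for the project root folder
with tempfile.TemporaryDirectory() as repo_folder:
    ensure_gitignore_entry(repo_folder)
    ensure_gitignore_entry(repo_folder)  # calling it twice does not duplicate the entry
    with open(os.path.join(repo_folder, ".gitignore")) as f:
        content = f.read()

print(repr(content))
```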

## Troubleshooting

TODO