# Purpose of this notebook
To use Neuroglancer, our data must be in a format it can read. Neuroglancer does not accept standard image formats such as TIFF, and for a good reason: the formats it does read are designed to be more efficient for very large volumes. One of these formats is called "precomputed", and that is the one we will use in this notebook. Fortunately, there is a Python pipeline for making precomputed data from TIFF files.

This notebook covers how to convert a volumetric dataset obtained from Princeton's light sheet microscope to precomputed format so that it can be viewed in Neuroglancer. 

This notebook also covers how to host the precomputed data on your local machine so that Neuroglancer can load it. 

## A quick note about Neuroglancer
Neuroglancer loads datasets in "layers". Layers come in different types. The two most common types are "image" layers (like what you would get as output from the light sheet microscope) and "segmentation" layers (like an atlas annotation volume). The naming is a little confusing because both layer types refer to volumes (3-d objects). In this notebook, we will only be using one layer, and it is of type "image".

# Setup
To run the code in this notebook, you will need a conda environment with Python 3 and some additional libraries. This environment, "ng_demo", can be set up in the following way:
In terminal:
- conda create -n ng_demo python=3.7.4 -y
- conda activate ng_demo (or "source activate ng_demo", depending on which version of conda you have)
- pip install cloud-volume
- pip install SimpleITK <br>

\# To enable you to use jupyter notebooks to work with this environment as a kernel:
- pip install --user ipykernel
- python -m ipykernel install --user --name=ng_demo

Once this is all installed, make sure to select this conda environment as the kernel when running this notebook (you might have to restart the notebook server)
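
Before running the cells below, it can help to confirm that the selected kernel actually sees the packages installed above. The following is a minimal sketch using only the standard library; the helper name `check_packages` is hypothetical and is not part of the original setup.

```python
import importlib.util

def check_packages(names):
    """Return {module_name: True/False} depending on whether each module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Top-level modules imported later in this notebook
status = check_packages(["numpy", "cloudvolume", "SimpleITK"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING - check the active kernel'}")
```

If any module is reported missing, the notebook is most likely running under a different kernel than "ng_demo".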

In [None]:
import os,csv,json
import numpy as np
from cloudvolume import CloudVolume
from cloudvolume.lib import mkdir, touch
import SimpleITK as sitk

from concurrent.futures import ProcessPoolExecutor

# An example whole mouse brain volume is here: 
registered_brain_vol_path = '/jukebox/LightSheetData/lightserv_testing/lightserv-test/two_channels/two_channels-001/imaging_request_1/output/processing_request_1/resolution_1.3x/elastix/result.1.tif'
# Decide on a folder name where your precomputed data for this layer is going to live. 
# In this example my layer folder will be 'my_registered_volume' saved on my local hard drive.
# The output will take up ~120 MB of space
ng_home_dir = '/home/ahoag/ngdemo/demo_bucket/ng_tutorial'
layer_dir = os.path.join(ng_home_dir,'my_registered_volume')
# Make the layer directory
mkdir(layer_dir) # won't overwrite if already exists, also won't complain if already exists
print(f"Using {layer_dir}")
    
# Finally, decide how many cpus you are willing and able to use for the parallelized conversion (see Step 2)
cpus_to_use = 8

# Step 1: Write the instructions ("info") file that will tell Neuroglancer about your volume
The info file is a required file for the precomputed data format. It is a JSON-formatted file (looks like a nested python dictionary) containing things like the shape and physical resolution of your volume.
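
To give a feel for what the info file contains, here is a rough sketch that builds a similar dictionary by hand. The field names reflect my understanding of the single-scale precomputed layout and should be treated as an assumption; the file that CloudVolume actually writes may contain additional fields. The function name `sketch_info` and the volume size are hypothetical.

```python
import json

def sketch_info(resolution_um, volume_size_xyz, chunk_size=(1024, 1024, 1)):
    """Build a dictionary resembling a single-scale precomputed "info" file.

    resolution_um:    voxel size (dx, dy, dz) in microns; converted to nanometers here
    volume_size_xyz:  number of voxels (Nx, Ny, Nz)
    """
    resolution_nm = [int(round(r * 1000)) for r in resolution_um]  # 1 micron = 1000 nm
    return {
        "type": "image",
        "data_type": "uint16",
        "num_channels": 1,
        "scales": [{
            "key": "_".join(str(r) for r in resolution_nm),
            "resolution": resolution_nm,
            "size": list(volume_size_xyz),
            "voxel_offset": [0, 0, 0],
            "chunk_sizes": [list(chunk_size)],
            "encoding": "raw",
        }],
    }

# Hypothetical volume size, for illustration only
info_sketch = sketch_info((20, 20, 20), (400, 500, 300))
print(json.dumps(info_sketch, indent=2))
```

Note that the resolution is stored in nanometers, which is why a 20 micron voxel appears as 20000.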

In [None]:
def make_info_file(resolution_xyz,volume_size_xyz,layer_dir):
    """ Make a JSON-formatted file called the "info" file
    for use with the precomputed data format. 
    Precomputed is one of the formats that Neuroglancer can read in.  
    --- parameters ---
    resolution_xyz:      A tuple representing the size of the pixels (dx,dy,dz) 
                         in nanometers, e.g. (20000,20000,5000) for 20 micron x 20 micron x 5 micron
    
    volume_size_xyz:     A tuple representing the number of pixels in each dimension (Nx,Ny,Nz)
    
                         
    layer_dir:           The directory where the precomputed data will be
                         saved
    """
    info = CloudVolume.create_new_info(
        num_channels = 1,
        layer_type = 'image', # 'image' or 'segmentation'
        data_type = 'uint16', # uint32 is only needed for atlases with more than 2^(16)-1 labels
        encoding = 'raw', # other options: 'jpeg', 'compressed_segmentation' (req. uint32 or uint64)
        resolution = resolution_xyz, # X,Y,Z voxel size in nanometers
        voxel_offset = [ 0, 0, 0 ], # X,Y,Z values in voxels
        chunk_size = [ 1024, 1024, 1 ], # chunk size X,Y,Z in voxels (each chunk is one z plane deep)
        volume_size = volume_size_xyz, # X,Y,Z size in voxels
    )

    vol = CloudVolume(f'file://{layer_dir}', info=info)
    vol.provenance.description = "A test info file" # can change this if you want a description
    vol.provenance.owners = [''] # list of contact email addresses
    # Saves the info and provenance files for the first time
    vol.commit_info() # generates info json file
    vol.commit_provenance() # generates provenance json file
    print("Created CloudVolume info file: ",vol.info_cloudpath)

    return vol

In [None]:
## Make the info file.

# The volume is registered to the Princeton Mouse Atlas, which has 20 micron isotropic resolution
resolution_xyz = (20000,20000,20000) # must be in nanometers
# Load the volume into memory and get its shape - can be slow if using VPN from off-campus. 
# Might just be faster to copy the file to your local disk and then change the registered_brain_vol_path 
# to point to the local copy
brain_vol = np.array(sitk.GetArrayFromImage(
    sitk.ReadImage(registered_brain_vol_path)),dtype=np.uint16,order='F')
z_dim,y_dim,x_dim = brain_vol.shape
volume_size_xyz = (x_dim,y_dim,z_dim)


# Write the info file
vol = make_info_file(
    resolution_xyz=resolution_xyz,
    volume_size_xyz=volume_size_xyz,
    layer_dir=layer_dir)


# Step 2: Convert volume to precomputed data format
First we create a directory (the "progress_dir") at the same level as the layer directory to keep track of the progress of the conversion. 
All the conversion does is copy the numpy array representing the 3d volume to a new object "vol". This is done one plane at a time (although it is parallelized). As each plane is converted, an empty file is created in the progress_dir with the name of the plane. By the end of the conversion, there should be as many files in this progress_dir as there are z planes. 
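
The idea described above can be sketched on a toy example. This is only an illustration under stated assumptions: a plain NumPy array `dest` stands in for the CloudVolume object, the volume dimensions are made up, and the helper `convert_plane` is hypothetical. It shows the two ingredients of the conversion: reordering each plane from (z, y, x) to (x, y, z), and using empty marker files so that an interrupted run can be resumed.

```python
import os
import tempfile
import numpy as np

# Hypothetical tiny volume in (z, y, x) order
z_dim, y_dim, x_dim = 4, 3, 2
src = np.arange(z_dim * y_dim * x_dim, dtype=np.uint16).reshape(z_dim, y_dim, x_dim)
# Plain array in (x, y, z) order standing in for the CloudVolume object
dest = np.zeros((x_dim, y_dim, z_dim), dtype=np.uint16)

def convert_plane(z, progress_dir):
    """Copy plane z into dest unless its marker file already exists."""
    marker = os.path.join(progress_dir, str(z))
    if os.path.exists(marker):
        return "skipped"
    plane = src[z].reshape((1, y_dim, x_dim)).T  # (1, y, x) -> (x, y, 1)
    dest[:, :, z:z + 1] = plane
    open(marker, "w").close()  # empty marker file named after the plane
    return "converted"

with tempfile.TemporaryDirectory() as demo_progress_dir:
    # Pretend the first run was interrupted after two planes
    first_pass = [convert_plane(z, demo_progress_dir) for z in (0, 1)]
    # The resumed run skips the planes that already have marker files
    second_pass = [convert_plane(z, demo_progress_dir) for z in range(z_dim)]
    n_markers = len(os.listdir(demo_progress_dir))

print(first_pass)
print(second_pass)
print(n_markers == z_dim)
```

After the second pass there is one marker file per z plane, and `dest[x, y, z]` equals `src[z, y, x]` for every voxel.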

In [None]:
layer_name = layer_dir.split('/')[-1]
parent_dir = '/'.join(layer_dir.split('/')[:-1])
# print(parent_dir)
progress_dir = mkdir(parent_dir+ f'/progress_{layer_name}') # unlike os.mkdir doesn't crash on preexisting 
print(f"created directory: {progress_dir}")

In [None]:
def process_slice(z):
    """ This function copies a 2d z-plane slice from the brain volume
    to the cloudvolume object, vol. We will run this in parallel over 
    all z planes
    ---parameters---
    z:    An integer representing the 0-indexed z plane to be converted
    """
    if os.path.exists(os.path.join(progress_dir, str(z))):
        print(f"Slice {z} already processed, skipping ")
        return
    if z >= z_dim: # z is zero indexed and runs from 0-(z_dim-1)
        print(f"Index {z} >= z_dim of volume, skipping")
        return
    print('Processing slice z=',z)
    array = brain_vol[z].reshape((1,y_dim,x_dim)).T
    vol[:,:, z] = array
    touch(os.path.join(progress_dir, str(z)))
    return "success"


In [None]:
# Run the conversion in parallel. It's not a huge amount of processing but the more cores the better

# First figure out if there are any planes that have already been converted 
# by checking the progress dir
done_files = set([ int(z) for z in os.listdir(progress_dir) ])
all_files = set(range(vol.bounds.minpt.z, vol.bounds.maxpt.z))
# Figure out the ones we still need to convert 
to_upload = [ int(z) for z in list(all_files.difference(done_files)) ]
to_upload.sort()
print(f"Have {len(to_upload)} planes to upload")
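with ProcessPoolExecutor(max_workers=cpus_to_use) as executor:
    # Exceptions raised in a worker are re-raised while iterating over the results,
    # so the try block has to wrap the loop itself
    try:
        for result in executor.map(process_slice,to_upload):
            print(result)
    except Exception as exc:
        print(f'generated an exception: {exc}')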

# Step 3: Host the precomputed data on your machine so that Neuroglancer can see it
This step is really easy! Note: Executing the cell below will cause your jupyter notebook to hang (expected). This is the last cell you need to run in the notebook, but if for some reason you want to use the same notebook for something else then run the code in the cell below elsewhere (e.g. a new terminal window). 

In [None]:
vol = CloudVolume(f'file://{layer_dir}')
vol.viewer(port=1338)
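
Conceptually, hosting the data means serving the files in the layer directory over HTTP with a header that allows a web page from another origin (the Neuroglancer client) to read them. The sketch below illustrates that idea with the standard library only. It is not CloudVolume's implementation and is not guaranteed to work as a drop-in replacement for `vol.viewer`, for example if the chunk files are stored compressed. The names `CORSRequestHandler` and `make_server` are hypothetical.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that lets a browser page on another origin read the files."""
    def end_headers(self):
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()

def make_server(directory, port=1338):
    """Create (but do not start) a server for `directory` on localhost."""
    handler = functools.partial(CORSRequestHandler, directory=directory)
    return ThreadingHTTPServer(("localhost", port), handler)

# Usage (blocks until interrupted with Ctrl-C, so run it in a separate terminal):
# make_server("/path/to/layer_dir").serve_forever()
```

Like the cell above, `serve_forever()` blocks until it is interrupted, which is why the notebook appears to hang while the data are being hosted.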

# Step 4: View your custom volume in Neuroglancer!
Step 3 hosts your data via HTTP on port 1338 of your local machine. When you run Neuroglancer in your browser, you can tell it to look for data hosted at a particular port. To do this, open the Braincogs Neuroglancer client: [https://braincogs00.pni.princeton.edu/nglancer_viewer](https://braincogs00.pni.princeton.edu/nglancer_viewer) (you must be using a Princeton VPN). Once the black screen loads, click the "+" in the upper left-hand corner of the screen. To load your data, type the following into the source text box:<br>
> precomputed://http://localhost:1338 <br>

Then hit tab and name your layer if you'd like. Hit enter or the "add layer" button and your layer should load into Neuroglancer. If it works, the first thing you will notice is that the image is all black within the yellow bounding box. Your data are there, but you need to change the contrast to see them. To do that, use the "d" and "f" keys until you have the contrast you like. The "i" key inverts the colormap.
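
If the layer does not load, one quick check is whether the info file from Step 1 is reachable from your browser. The small helper below derives the address to try from the source string. It assumes that the info file sits at the root of the hosted address, which is my understanding of how precomputed sources are laid out; the helper name `info_url` is hypothetical.

```python
def info_url(source):
    """Turn a Neuroglancer precomputed source string into the URL of its info file."""
    prefix = "precomputed://"
    if not source.startswith(prefix):
        raise ValueError(f"expected a source starting with {prefix!r}")
    return source[len(prefix):].rstrip("/") + "/info"

print(info_url("precomputed://http://localhost:1338"))
```

Opening the printed address in a browser tab while Step 3 is running should display the JSON contents of the info file.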