## Pipeline for microendoscopic data processing in CaImAn using the CNMF-E algorithm
This demo presents a complete pipeline for processing microendoscopic data using CaImAn. It includes:
- Motion Correction using the NoRMCorre algorithm
- Source extraction using the CNMF-E algorithm
- Deconvolution using the OASIS algorithm

Some basic visualization is also included. The demo illustrates how to use the `params`, `MotionCorrect` and `cnmf` objects for processing 1p microendoscopic data. For processing two-photon data, consult the related `demo_pipeline.ipynb` demo. For more information see the companion CaImAn paper.

In [None]:
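try:
    ipython = get_ipython()
    ipython.run_line_magic('load_ext', 'autoreload')
    ipython.run_line_magic('autoreload', '2')
    ipython.run_line_magic('matplotlib', 'qt')
except Exception:  # not running under IPython, or the Qt backend is unavailable
    pass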

import logging
import matplotlib.pyplot as plt
import numpy as np

logging.basicConfig(format=
                          "%(relativeCreated)12d [%(filename)s:%(funcName)20s():%(lineno)s] [%(process)d] %(message)s",
                    # filename="/tmp/caiman.log",
                    level=logging.DEBUG)

import caiman as cm
from caiman.source_extraction import cnmf
from caiman.utils.utils import download_demo
from caiman.utils.visualization import inspect_correlation_pnr, nb_inspect_correlation_pnr
from caiman.motion_correction import MotionCorrect
from caiman.source_extraction.cnmf import params as params
from caiman.utils.visualization import plot_contours, nb_view_patches, nb_plot_contour
import cv2

try:
    cv2.setNumThreads(0)  # disable OpenCV's own multithreading
except Exception:
    pass
import bokeh.plotting as bpl
import holoviews as hv
bpl.output_notebook()
hv.notebook_extension('bokeh')

### Select file(s) to be processed
The `download_demo` function will download the specified file for you and return the complete path to the file, which will be stored in your `caiman_data` directory. If you adapt this demo for your data, make sure to pass the complete path to your file(s). Remember to pass the `fnames` variable as a list. Note that the memory requirements of the CNMF-E algorithm are much higher than those of the standard CNMF algorithm. Test the limits of your system before trying to process very large amounts of data.
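
As a small illustration of building the `fnames` list for a numbered series of recordings, the sketch below uses a hypothetical helper `build_fnames` (not part of CaImAn) and a placeholder directory. `ntpath` is used so that the Windows-style separators behave the same on any operating system.

```python
import ntpath

def build_fnames(directory, prefix, first, last, ext='.tif'):
    """Return full paths for prefix<first>.ext ... prefix<last>.ext (last is inclusive)."""
    # range() excludes its end point, hence last + 1
    return [ntpath.join(directory, f'{prefix}{i}{ext}') for i in range(first, last + 1)]

# placeholder directory; replace it with the folder that holds your recordings
fnames_example = build_fnames('C:\\data\\session1', 'msCam', 0, 1)
print(fnames_example)
```

The result is always a list, even for a single file, which matches the requirement above.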

In [None]:
fnames = [f'C:\\ENTER_FILE_DIRECTORY_HERE\\msCam{i}.tif' for i in range(0, 2)]  # filename to be processed
# range() excludes its end value, so set it to the index of the last video + 1
#fnames = [download_demo(fnames[0])]
print(fnames)

In [None]:
print(fnames)

In [None]:
# directory of the first input file (everything before the last backslash)
fname_base = fnames[0].rsplit('\\', 1)[0]
print(fname_base)

### Setup a cluster
To enable parallel processing a (local) cluster needs to be set up. This is done in the cell below. The variable `backend` determines the type of cluster used. The default value `'local'` uses the multiprocessing package. The `ipyparallel` option is also available. More information on these choices can be found [here](https://github.com/flatironinstitute/CaImAn/blob/master/CLUSTER.md). The resulting variable `dview` expresses the cluster option. If you use `dview=dview` in the downstream analysis then parallel processing will be used. If you use `dview=None` then no parallel processing will be employed.
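
The `dview=None` versus `dview=dview` convention can be pictured with a minimal sketch. The helper `run_map` is hypothetical, and a standard-library `ThreadPool` stands in for a real cluster handle. It only assumes that the handle offers a pool-style `map` method, which may not hold for every backend.

```python
from multiprocessing.pool import ThreadPool

def run_map(func, items, dview=None):
    """Apply func to every item, in parallel when a pool-like dview is given."""
    if dview is None:
        return [func(item) for item in items]   # serial fallback
    return dview.map(func, items)               # pool-style parallel map

def square(x):
    return x * x

serial_result = run_map(square, [1, 2, 3])
with ThreadPool(2) as pool:                     # stand-in for a real cluster handle
    parallel_result = run_map(square, [1, 2, 3], dview=pool)
print(serial_result, parallel_result)
```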

In [None]:
#%% start a cluster for parallel processing (if a cluster already exists it will be closed and a new session will be opened)
if 'dview' in locals():
    cm.stop_server(dview=dview)
c, dview, n_processes = cm.cluster.setup_cluster(
    backend='local', n_processes=None, single_thread=False)


### Setup some parameters
We first set some parameters related to the data and motion correction and create a `params` object. We'll modify this object with additional settings later on. You can also set all the parameters at once as demonstrated in the `demo_pipeline.ipynb` notebook.

In [None]:
# dataset dependent parameters
frate = 20                       # movie frame rate
decay_time = 0.4                 # length of a typical transient in seconds

# motion correction parameters
motion_correct = True    # flag for performing motion correction
pw_rigid = False         # flag for performing piecewise-rigid motion correction (otherwise just rigid)
gSig_filt = (10, 10)     # size of high pass spatial filtering, used in 1p data
max_shifts = (80, 80)    # maximum allowed rigid shift
strides = (48, 48)       # start a new patch for pw-rigid motion correction every x pixels
overlaps = (24, 24)      # overlap between patches (size of patch strides+overlaps)
max_deviation_rigid = 2  # maximum deviation allowed for patch with respect to rigid shifts
border_nan = 'copy'      # replicate values along the boundaries
use_cuda = False         # flag for GPU (CUDA) acceleration of motion correction
mc_dict = {
    'fnames': fnames,
    'fr': frate,
    'decay_time': decay_time,
    'pw_rigid': pw_rigid,
    'max_shifts': max_shifts,
    'gSig_filt': gSig_filt,
    'strides': strides,
    'overlaps': overlaps,
    'max_deviation_rigid': max_deviation_rigid,
    'border_nan': border_nan,
    'use_cuda': use_cuda
}

opts = params.CNMFParams(params_dict=mc_dict)


### Motion Correction
The background signal in micro-endoscopic data is very strong and makes the motion correction challenging. 
As a first step the algorithm performs a high pass spatial filtering with a Gaussian kernel to remove the bulk of the background and enhance spatial landmarks. 
The size of the kernel is set by the parameter `gSig_filt`. If this is left at its default value of `None`, no spatial filtering is performed (the option used for 2p data).
After spatial filtering, the NoRMCorre algorithm is used to determine the motion in each frame. The inferred motion is then applied to the *original* data so no information is lost.

The motion corrected files are saved in memory mapped format. If no motion correction is being performed, then the file gets directly memory mapped.
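
To make the idea of estimating a shift and then applying it to the original frame concrete, here is a toy sketch. It is not the NoRMCorre implementation: it skips the high pass filtering step, handles only integer circular shifts, and all names (`estimate_rigid_shift`, `bord_px_example`) are illustrative.

```python
import numpy as np

def estimate_rigid_shift(template, frame):
    """Integer (dy, dx) roll that best aligns `frame` to `template` (FFT cross-correlation)."""
    xcorr = np.fft.ifft2(np.fft.fft2(template) * np.conj(np.fft.fft2(frame))).real
    dy, dx = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    h, w = template.shape
    # peaks past the midpoint correspond to negative shifts (circular wrap-around)
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

rng = np.random.default_rng(0)
template = rng.random((32, 32))
frame = np.roll(template, (3, -2), axis=(0, 1))         # simulated motion
shift = estimate_rigid_shift(template, frame)
corrected = np.roll(frame, shift, axis=(0, 1))          # apply the shift to the unfiltered frame
bord_px_example = int(np.ceil(np.max(np.abs(shift))))   # border width affected by the shift
print(shift, bord_px_example)
```

The largest absolute shift gives the width of the border that the correction may have contaminated, which is the role `bord_px` plays in the cell below.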

In [None]:
if motion_correct:
    # do motion correction (rigid, or piecewise-rigid if pw_rigid is True)
    mc = MotionCorrect(fnames, dview=dview, **opts.get_group('motion'))
    mc.motion_correct(save_movie=True)
    fname_mc = mc.fname_tot_els if pw_rigid else mc.fname_tot_rig
    if pw_rigid:
        bord_px = np.ceil(np.maximum(np.max(np.abs(mc.x_shifts_els)),
                                     np.max(np.abs(mc.y_shifts_els)))).astype(int)
    else:
        bord_px = np.ceil(np.max(np.abs(mc.shifts_rig))).astype(int)
        plt.subplot(1, 2, 1); plt.imshow(mc.total_template_rig)  # plot template
        plt.subplot(1, 2, 2); plt.plot(mc.shifts_rig)  # plot rigid shifts
        plt.legend(['x shifts', 'y shifts'])
        plt.xlabel('frames')
        plt.ylabel('pixels')

    # with border_nan='copy' the borders are already filled in, so nothing is zeroed
    bord_px = 0 if border_nan == 'copy' else bord_px
    fname_new = cm.save_memmap(fname_mc, base_name='memmap_', order='C',
                               border_to_0=bord_px)
else:  # if no motion correction just memory map the file
    fname_new = cm.save_memmap(fnames, base_name='memmap_',
                               order='C', border_to_0=0, dview=dview)

In [None]:
# Start here if you do not want to create any more motion corrected objects...

In [None]:
print(fname_new)

In [None]:
import os

Yr, dims, T = cm.load_memmap(fname_new)
images = Yr.T.reshape((T,) + dims, order='F')
images = np.array(images)  # load the memory-mapped movie fully into RAM

# output folder for the motion-corrected and DF/F movies
directory = fname_base + '\\DFF\\'
os.makedirs(directory, exist_ok=True)

In [None]:
import tifffile

# imwrite replaces the deprecated tifffile.imsave
tifffile.imwrite(directory + 'entire_motion_corrected.tif', images)

In [None]:
import os
from caiman.base import movies

directory = fname_base + '\\DFF\\'
os.makedirs(directory, exist_ok=True)

# cut the motion-corrected movie into chunks of at most 1000 frames
chunk_size = 1000
fnames = []
for i, start in enumerate(range(0, images.shape[0], chunk_size)):
    chunk_name = directory + 'mc' + str(i) + '.tif'
    tifffile.imwrite(chunk_name, images[start:start + chunk_size])
    fnames.append(chunk_name)

movie_glut = movies.load(fnames)
# set non-positive pixels to 1 so that the DF/F division is well defined
movie_glut[movie_glut <= 0] = 1
# computeDFF returns a tuple; its first element is the DF/F movie
dff = movie_glut.computeDFF(method='delta_f_over_f')[0]

In [None]:
dff.save(directory + 'dff_entire.tif')

In [None]:
print(np.min(movie_glut))

In [None]:
dff.shape

In [None]:
from caiman.base import movies

# reload the motion-corrected movie only if it is not already in memory
if 'movie_glut' not in locals():
    movie_glut = movies.load(fnames)

# cut the DF/F movie into chunks of at most 1000 frames;
# these chunks become the input files for the rest of the pipeline
chunk_size = 1000
fnames = []
for i, start in enumerate(range(0, dff.shape[0], chunk_size)):
    chunk_name = directory + 'dff' + str(i) + '.tif'
    dff[start:start + chunk_size].save(chunk_name)
    fnames.append(chunk_name)

### Load memory mapped file

In [None]:
print(fnames)

In [None]:
fname_new = cm.save_memmap(fnames, base_name='memmap_1',
                           order='C', border_to_0=0, dview=dview)

Yr, dims, T = cm.load_memmap(fname_new)
images = Yr.T.reshape((T,) + dims, order='F')

In [None]:
load_prev_mask = False

In [None]:
import pickle

In [None]:
prev_mask_path = '10-29-20-IL-10-ROIs.pkl'

In [None]:
if load_prev_mask:  # flag defined in a cell above
    with open(prev_mask_path, 'rb') as f:
        Ain = pickle.load(f)  # previously saved spatial masks used to seed the components
    rf = None
    only_init = False
    print('AIN LOADED')
else:
    Ain = None
    rf = 36
    #only_init = True


### Parameter setting for CNMF-E
We now define some parameters for the source extraction step using the CNMF-E algorithm. 
We construct a new dictionary and use it to modify the *existing* `params` object.
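
Two rules of thumb quoted in this section's parameter comments can be written down as a short sketch: the neuron diameter `gSiz` is in general `4*gSig+1`, and `rf` is the half-size of a patch. The helper names below are hypothetical and only restate those comments.

```python
def gsiz_from_gsig(gSig):
    """Rule of thumb from the parameter comments: neuron diameter ~ 4 * gSig + 1."""
    return tuple(4 * s + 1 for s in gSig)

def patch_size(rf):
    """Patches span 2 * rf pixels per side (rf is the half-size)."""
    return (2 * rf, 2 * rf)

print(gsiz_from_gsig((8, 8)))   # (33, 33)
print(patch_size(36))           # (72, 72)
```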

In [None]:
# parameters for source extraction and deconvolution
p = 1               # order of the autoregressive system
K = None            # upper bound on number of components per patch, in general None
gSig = (8, 8)       # gaussian width of a 2D gaussian kernel, which approximates a neuron
gSiz = (33, 33)     # average diameter of a neuron, in general 4*gSig+1
# Ain (predetermined binary masks used to seed the components) and rf (half-size
# of the patches in pixels, e.g. rf=40 gives 80x80 patches) are set in the cell above
merge_thr = 0.999   # merging threshold, max correlation allowed (alternative: 0.8)
stride_cnmf = 25    # amount of overlap between the patches in pixels
#                     (keep it at least as large as gSiz, i.e. 4 times the neuron size gSig)
tsub = 1            # downsampling factor in time for initialization,
#                     increase if you have memory problems (alternative: 2)
ssub = 2            # downsampling factor in space for initialization,
#                     increase if you have memory problems
low_rank_background = None  # None leaves background of each patch intact,
#                     True performs global low-rank approximation if gnb>0
gnb = -1            # number of background components (rank) if positive,
#                     else exact ring model with following settings
#                         gnb= 0: Return background as b and W
#                         gnb=-1: Return full rank background B
#                         gnb<-1: Don't return background
nb_patch = 0        # number of background components (rank) per patch if gnb>0,
#                     else it is set automatically
min_corr = 0.6      # min peak value from correlation image (alternative: 0.75)
min_pnr = 4         # min peak to noise ratio from PNR image; lower to 3 if neuron number too low
ssub_B = 2          # additional downsampling factor in space for background
ring_size_factor = 1  # radius of ring is gSiz*ring_size_factor

opts.change_params(params_dict={'method_init': 'corr_pnr',  # use this for 1 photon
                                'K': K,
                                'gSig': gSig,
                                'gSiz': gSiz,
                                'merge_thr': merge_thr,
                                'p': p,
                                'tsub': tsub,
                                'ssub': ssub,
                                #'rf': rf,
                                'stride': stride_cnmf,
                                #'only_init': only_init,    # set it to True to run CNMF-E
                                'nb': gnb,
                                'nb_patch': nb_patch,
                                'method_deconvolution': 'oasis',       # could use 'cvxpy' alternatively
                                #'method_deconvolution': 'cvxpy',       # could use 'cvxpy' alternatively
                                'low_rank_background': low_rank_background,
                                'update_background_components': True,  # sometimes setting to False improve the results
                                'min_corr': min_corr,
                                'min_pnr': min_pnr,
                                'normalize_init': False,               # just leave as is
                                'center_psf': True,                    # leave as is for 1 photon
                                'ssub_B': ssub_B,
                                'ring_size_factor': ring_size_factor,
                                'del_duplicates': True,                # whether to remove duplicates from initialization
                                #'use_cuda': True,
                                #'border_pix': bord_px                 # number of pixels to not consider in the borders
                                })

In [None]:
print(n_processes)

### Inspect summary images and set parameters
Check the optimal values of `min_corr` and `min_pnr` by moving the sliders in the figure that pops up. You can modify them in the `params` object.
Note that computing the correlation and PNR images can be computationally and memory demanding for large datasets. In that case you can compute them on a subset of the data, which should leave the results largely unchanged, for example by changing `images[::1]` to `images[::5]`.
The cell below computes the correlation and PNR images.
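
The effect of temporal subsampling can be tried on synthetic data. The sketch below uses a deliberately crude peak-to-noise measure (peak above the median, divided by a MAD-based noise estimate). It is not the computation performed by `correlation_pnr`, and the movie, the transient and the helper `crude_pnr` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
T, h, w = 1000, 4, 4
movie = rng.normal(0.0, 1.0, size=(T, h, w))
movie[300:340, 1, 2] += 20.0            # a 40-frame transient in one pixel

def crude_pnr(mov):
    """Peak above the median divided by a robust (MAD-based) noise estimate, per pixel."""
    med = np.median(mov, axis=0)
    noise = 1.4826 * np.median(np.abs(mov - med), axis=0)
    return (mov.max(axis=0) - med) / noise

pnr_full = crude_pnr(movie[::1])        # every frame
pnr_sub = crude_pnr(movie[::5])         # every 5th frame: 5x less data to process
print(movie[::5].shape, pnr_full[1, 2], pnr_sub[1, 2])
```

Because the transient lasts many frames, the active pixel still stands out after subsampling. Events much shorter than the subsampling step could be missed.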

In [None]:
# compute some summary images (correlation and peak to noise)
# change swap_dim if the output looks weird (this is a problem with tifffile)
cn_filter, pnr = cm.summary_images.correlation_pnr(images[::1], gSig=gSig[0], swap_dim=False)
# inspect the summary images and set the parameters
nb_inspect_correlation_pnr(cn_filter, pnr)



You can inspect the correlation and PNR images to select the threshold values for `min_corr` and `min_pnr`. The algorithm will look for components only in places where these values are above the specified thresholds. You can adjust the dynamic range in the plots shown above by choosing the selection tool (third button from the left) and selecting the desired region in the histogram plots on the right of each panel.
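
A minimal sketch of how two thresholds restrict the search region, using tiny made-up images. The helper `candidate_mask` is hypothetical, and the exact comparison used inside CaImAn may differ (for instance in how ties at the threshold are treated).

```python
import numpy as np

def candidate_mask(cn, pnr, min_corr, min_pnr):
    """Boolean mask of pixels whose correlation and PNR both reach the thresholds."""
    return (cn >= min_corr) & (pnr >= min_pnr)

cn_toy = np.array([[0.9, 0.5], [0.7, 0.65]])     # toy correlation image
pnr_toy = np.array([[10.0, 12.0], [3.0, 5.0]])   # toy PNR image
mask = candidate_mask(cn_toy, pnr_toy, min_corr=0.6, min_pnr=4)
print(mask)
```

Raising either threshold shrinks the mask, so fewer locations are considered as potential neuron centers.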

In [None]:
# print parameters set above, modify them if necessary based on summary images
print(min_corr) # min correlation of peak (from correlation image)
print(min_pnr)  # min peak to noise ratio

In [None]:
trace_information = [[],[],[],[],[]] #first row to last is sn, y_diff, b, ci_before, ci_after 

### Run the CNMF-E algorithm

In [None]:
# fit the model as in the original demo; after component evaluation,
# this version of the mask is saved below for later use
if load_prev_mask:
    # binarize the loaded mask; 0.095 is the (exclusive) upper limit for this threshold
    Ain_test = Ain.toarray() > 0.05
    cnm = cnmf.CNMF(n_processes=n_processes, dview=dview, params=opts)
    cnm.estimates.A = Ain_test
else:
    cnm = cnmf.CNMF(n_processes=n_processes, dview=dview, Ain=Ain, params=opts)
cnm.fit(images)

In [None]:
print(cnm.params.to_dict())

### Alternate way to run the pipeline at once
It is possible to run the combined steps of motion correction, memory mapping, and CNMF fitting in one step as shown below. The command is commented out since the analysis has already been performed. It is recommended that you familiarize yourself with the individual steps and their results before using it.

In [None]:
import pickle 
print(pickle.format_version)
print(images.size)

In [None]:
# cnm1 = cnmf.CNMF(n_processes, params=opts, dview=dview)
# cnm1.fit_file(motion_correct=motion_correct)
print(cnm.estimates.coordinates)

In [None]:
from platform import python_version

print(python_version())

## Component Evaluation

The processing in patches creates several spurious components. These are filtered out by evaluating each component using three different criteria:

- the shape of each component must be correlated with the data at the corresponding location within the FOV
- a minimum peak SNR is required over the length of a transient
- each shape passes a CNN based classifier

<img src="../../docs/img/evaluationcomponent.png"/>
After setting some parameters we again modify the existing `params` object.

In [None]:


#%% COMPONENT EVALUATION
# the components are evaluated in three ways:
#   a) the shape of each component must be correlated with the data
#   b) a minimum peak SNR is required over the length of a transient
#   c) each shape passes a CNN based classifier
cnn_thr = 0.99       # threshold for CNN based classifier; default 0.99
cnn_lowest = 0
min_SNR = 1          # adaptive way to set threshold on the transient size
r_values_min = 0.5   # threshold on space consistency (if you lower it, more components
#                      will be accepted, potentially with worse quality)
cnm.params.set('quality', {'min_SNR': min_SNR,
                           'rval_thr': r_values_min,
                           'use_cnn': True,
                           #'min_cnn_thr': cnn_thr,
                           #'cnn_lowest': cnn_lowest
                           })

cnm.estimates.dims = dims
cnm.estimates.evaluate_components(images, cnm.params, dview = dview)

print(' ***** ')
print('Number of total components: ', len(cnm.estimates.C))
print('Number of accepted components: ', len(cnm.estimates.idx_components))

### Do some plotting

In [None]:
#%% plot contour plots of accepted and rejected components
cnm.estimates.dims = dims
cnm.estimates.plot_contours_nb(img=cn_filter, idx=cnm.estimates.idx_components)
print(cnm.estimates.coordinates)

In [None]:
print(cnm.estimates.idx_components_bad)

In [None]:
# import caiman.source_extraction.cnmf.utilities
# cnm.utilities.computeDFF_traces()

In [None]:
# print(Yr)

In [None]:
cnm.estimates.F_dff = cnmf.utilities.computeDFF_traces(Yr,cnm.estimates.A,cnm.estimates.C,cnm.estimates.bl )

View traces of accepted and rejected components. Note that if you get a data rate error, you can start the Jupyter notebook using:
`jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10`

In [None]:
# accepted components
cnm.estimates.hv_view_components(img=cn_filter, idx=cnm.estimates.idx_components,
                                denoised_color='red', cmap='gray')

In [None]:
# rejected components
cnm.estimates.hv_view_components(img=cn_filter, idx=cnm.estimates.idx_components_bad,
                                denoised_color='red', cmap='gray')

In [None]:
print(cnm.estimates.coordinates)

In [None]:
cnm.estimates.nb_view_components(img=cn_filter, denoised_color='red')
print('you may need to change the data rate to generate this one: use jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10 before opening jupyter notebook')

In [None]:
# PICK GOOD ROIS AS good_idx

idx = list(range(0, 0))  # TODO - change the range to 0, max neuron number

#bad_idx = cnm.estimates.idx_components_bad

good_idx = []  # enter the 1-based numbers of the good ROIs here
good_idx[:] = [number - 1 for number in good_idx]  # convert to 0-based indices
print(good_idx)
bad_idx = np.delete(idx, good_idx)  # all remaining components

In [None]:
cnm_mod = cnm.estimates.select_components(good_idx)

In [None]:
import pickle

coords = cnm.estimates.coordinates
print(coords)
comps = {}
comps['C_array'] = cnm.estimates.C
#comps['F_df'] = F_df
#comps['C_df'] = C_df
comps['A_array'] = cnm.estimates.A
comps['S_array'] = cnm.estimates.S
comps['coords'] = coords
comps['accept'] = cnm.estimates.idx_components
comps['reject'] = cnm.estimates.idx_components_bad

# CHANGE PKL FILE NAME FOR EACH EXPERIMENT!!!
with open('C:\\ENTER_FILE_DIRECTORY_HERE-[0,1)' + '.pkl', 'wb') as f:
    pickle.dump(comps, f)

In [None]:
print(cnm.estimates.C)

In [None]:
# Can stop running code here

### Stop cluster

In [None]:
bad_idx

In [None]:
print(good_idx)

In [None]:
cnm_mod = cnm.estimates.select_components(good_idx)

In [None]:
print(merge_thr)

In [None]:
#%% Extract DF/F values
cnm.estimates.detrend_df_f(flag_auto=True)

In [None]:
cnm.estimates.computeDFF?

In [None]:
cnm.estimates.F_dff?

In [None]:
trace = cnm.estimates.F_dff[0, :]  # DF/F trace of the first component
plt.plot(trace)
plt.show()

In [None]:
def extract_DF_F(Yr, A, C, bl, quantileMin=8, frames_window=200, block_size=400, dview=None):
    """ Compute DFF function from cnmf output.

     Disclaimer: it might be memory inefficient

    Args:
        Yr: ndarray (2D)
            movie pixels X time

        A: scipy.sparse.coo_matrix
            spatial components (from cnmf cnm.A)

        C: ndarray
            temporal components (from cnmf cnm.C)

        bl: ndarray
            baseline for each component (from cnmf cnm.bl)

        quantileMin: float
            percentile used to estimate the baseline fluorescence

        frames_window: int
            number of frames for running quantile

    Returns:
        C_df:
            the computed calcium activity expressed as DF/F

    See Also:
        ..image::docs/img/onlycnmf.png
    """
    import numpy as np
    import scipy.ndimage  # needed for percentile_filter
    import scipy.sparse   # needed for coo_matrix
    from caiman.mmapping import parallel_dot_product
    nA = np.array(np.sqrt(A.power(2).sum(0)).T)
    A = scipy.sparse.coo_matrix(A / nA.T)
    C = C * nA
    bl = (bl * nA.T).squeeze()
    nA = np.array(np.sqrt(A.power(2).sum(0)).T)

    T = C.shape[-1]
    if 'memmap' in str(type(Yr)):
        if block_size >= 500:
            print('Forcing single thread for memory issues')
            dview_res = None
        else:
            print('Using thread. If memory issues set block_size larger than 500')
            dview_res = dview

        AY = parallel_dot_product(Yr, A, dview=dview_res, block_size=block_size,
                                  transpose=True).T
    else:
        AY = A.T.dot(Yr)
        
    print(AY)
    
    bas_val = bl[None, :]
    Bas = np.repeat(bas_val, T, 0).T
    AA = A.T.dot(A)
    AA.setdiag(0)
    Cf = (C - Bas) * (nA**2)
    C2 = AY - AA.dot(C)
    print(C2)
    if frames_window is None or frames_window > T:
        Df = np.percentile(C2, quantileMin, axis=1)
        C_df = Cf / Df[:, None]

    else:
        Df = scipy.ndimage.percentile_filter(
            C2, quantileMin, (frames_window, 1))
        C_df = Cf / Df
    print('printing C_df')
    print(C_df)
    return C_df

C_df = extract_DF_F(Yr, cnm.estimates.A, cnm.estimates.C, cnm.estimates.bl, quantileMin=8, frames_window=200)
import matplotlib.pyplot as plt
a = np.array(C_df[0,:])
plt.plot(a)


In [None]:
a = cnm.estimates.F_dff

# overlay the DF/F traces of all components
for i in range(a.shape[0]):
    plt.plot(a[i, :])
plt.show()

In [None]:
a = cnm.estimates.F_dff

good_roi = []  # enter the 1-based numbers of the good ROIs here
good_roi[:] = [j - 1 for j in good_roi]  # convert to 0-based indices
for i in good_roi:
    plt.plot(a[i, :])
plt.show()

In [None]:
a = cnm.estimates.C

# overlay the denoised temporal traces of all components
for i in range(a.shape[0]):
    plt.plot(a[i, :])
plt.show()

In [None]:
a = cnm.estimates.C

trace = a[0,:]
plt.plot(trace)
plt.show()

In [None]:
print(good_roi)

### Some instructive movies
Play the reconstructed movie alongside the original movie and the (amplified) residual

In [None]:
idx=cnm.estimates.idx_components
print(idx)

In [None]:
save_results = True
if save_results:
    cnm.save('03-10-22-SecondRecording-[31,38).hdf5')