# CNMF demo pipeline: Intro
A full pipeline for the analysis of a two-photon calcium imaging dataset using the CaImAn software package. It demonstrates how to use Caiman for the following analysis steps:

![cnmf pipeline full](https://raw.githubusercontent.com/EricThomson/image_sandbox/main/images/full_cnmf_workflow.jpg)
    
1) Apply the NoRMCorre (nonrigid motion correction) algorithm for motion correction.
2) Apply the constrained nonnegative matrix factorization (CNMF) source separation algorithm to extract an initial estimate of neuronal spatial footprint and calcium traces.  
3) Apply quality control metrics to evaluate the initial estimates, and narrow down to the final set of estimates.

The CNMF algorithm is best for data with relatively low background noise, like most two-photon data and *some* one photon data (e.g., certain light sheet data). For a demo analysis pipeline of a one-photon microendoscopic data set see `demo_pipeline_cnmfE.ipynb`.

<div class="alert alert-info">
    <h2 style="margin-top: 0;">Getting more help</h2>
    More detailed background information about CNMF can be found in the <a href="https://pubmed.ncbi.nlm.nih.gov/26774160/">original CNMF paper</a> and <a href="https://pubmed.ncbi.nlm.nih.gov/30652683/">the Caiman paper</a>. If you have specific questions about this demo or the underlying algorithms, you can ask them at <a href="https://github.com/flatironinstitute/CaImAn/discussions">GitHub Discussions</a>. If you find a bug or have a feature request, feel free to <a href="https://github.com/flatironinstitute/CaImAn/issues">open an issue at our GitHub repo</a>.
</div>

## Imports and general setup
We first need to import the Python libraries we will use in the rest of the notebook and tweak some general settings. Don't worry about these details now; we will explain the important things when they come up. Note that in the following we import `caiman` as `cm`, so when you see `cm` in the rest of the notebook, it just means you are using something from the Caiman package.

In [None]:
import bokeh.plotting as bpl
import cv2
import datetime
import glob
import holoviews as hv
from IPython import get_ipython
import logging
import matplotlib.pyplot as plt
import numpy as np
import os
import psutil
from pathlib import Path

try:
    cv2.setNumThreads(0)
except Exception:  # `except():` catches nothing, because an empty tuple matches no exception type
    pass

try:
    if __IPYTHON__:
        get_ipython().run_line_magic('load_ext', 'autoreload')
        get_ipython().run_line_magic('autoreload', '2')
except NameError:
    pass

import caiman as cm
from caiman.motion_correction import MotionCorrect
from caiman.source_extraction.cnmf import cnmf, params
from caiman.utils.utils import download_demo
from caiman.utils.visualization import plot_contours, nb_view_patches, nb_plot_contour
from caiman.utils.visualization import view_quilt

bpl.output_notebook()
hv.notebook_extension('bokeh')

Continuing with our basic setup, we will set up a logger, and also set some environment variables in case that wasn't done already in your shell. If you want to learn more about Caiman's logger functionality, or tweak your logger (e.g., to log to file instead of your console), see [our docs on logging](https://caiman.readthedocs.io/en/latest/Getting_Started.html#logging). 

In [None]:
# set up logging
logging.basicConfig(format="{asctime} - {levelname} - [{filename} {funcName}() {lineno}] - pid {process} - {message}",
                    filename=None, 
                    level=logging.WARNING, style="{") #logging level can be DEBUG, INFO, WARNING, ERROR, CRITICAL

# set env variables 
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["VECLIB_MAXIMUM_THREADS"] = "1"

## Specify data to be processed
For this demo, we will analyze the data file `Sue_2x_3000_40_-46.tif`. This 3000-frame movie, provided courtesy of Sue Koay and David Tank (Princeton University), is two-photon data from supragranular parietal cortex of a GCaMP6f-expressing mouse during a virtual reality task. It was collected at 30 Hz, and to save space, the demo data has been spatially cropped and downsampled by a factor of 2 compared to the original.

To get the data, we will use Caiman's built-in `download_demo()` function. It will download the data to  `~/caiman_data/example_movies/` where `~` is your home directory (the path format and home directory will depend on your operating system). If you already have the movie, it will just return the path to the movie. 

In [None]:
movie_path = download_demo('Sue_2x_3000_40_-46.tif')
print(f"Original movie for demo is in {movie_path}")

If you want to adapt this demo for your own data, just direct the `movie_path` variable to your own movie:

    movie_path = 'full/path/to/your/your_movie.extension'

While this demo uses a movie that has been stored in <i>tif</i> format, Caiman can handle movies in multiple common (and not so common) formats, including hdf5/h5, n5, zarr, avi, nwb, mat, and npz.  Please reach out if you have problems loading your data into Caiman and it is a common data format!

<div class="alert alert-info">
    <h2>Working with multiple files or sessions</h2>
     What if you have a recording that is broken up into multiple files? This is common for many acquisition systems, and it is easy to adapt the current demo for such cases. There are a couple of changes you would need to make. First, instead of downloading a single demo file, you would create a <i>list</i> of files for Caiman to handle. To try this out, you can use the main demo movie split into <em>Sue_Split1.tif</em> and <em>Sue_Split2.tif</em>:

    movie_path1 = download_demo('Sue_Split1.tif')
    movie_path2 = download_demo('Sue_Split2.tif')
    movie_paths = [movie_path1, movie_path2]

Then, when creating the movie object instead of using <em>cm.load()</em> you would use <em>cm.load_movie_chain()</em> which takes in a list as an argument:

    movie_orig = cm.load_movie_chain(movie_paths)
    
Also, what if you have <b>noncontiguous</b> recording sessions, for instance data in files from sessions separated by many days or weeks, and you need to register/match the neurons from these sessions? This is a different use case, and we have a demo for that: see <a href="./demo_multisession_registration.ipynb">demo_multisession_registration.ipynb</a>.
</div> 

## Load and visualize raw data
Caiman has a built-in movie class for movie-viewing (documentation [here](https://caiman.readthedocs.io/en/latest/Handling_Movies.html)). Once you have loaded a movie using `cm.load()`, you can view it using `movie.play()`. The `play()` function has multiple parameters: 

    gain: brightness 
    fr:  frame rate
    magnification: scale the size of the display  
    q_max, q_min: percentiles for setting vmax, vmin -- values below vmin are set to vmin, values above vmax are set to vmax
    plot_text (Bool): show the frame number
    do_loop (Bool): whether to loop the video 
    
The movie object also has a `resize()` method, which we use in the following to downsample the movie before playing.

Playing the movie uses the `OpenCV` library, so the following cell runs a blocking function (a function that blocks execution of all other code until it is stopped), opening a separate window which doesn't run in Jupyter. You will need to press `q` on that window to close it. 

In [None]:
# press q to close
movie_orig = cm.load(movie_path) 
downsampling_ratio = 0.2  # subsample 5x
movie_orig.resize(fz=downsampling_ratio).play(gain=1.3,
                                              q_max=99.5, 
                                              fr=30,
                                              plot_text=True,
                                              magnification=2,
                                              do_loop=False,
                                              backend='opencv')
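To make the `q_min`/`q_max` idea concrete, here is a minimal NumPy sketch of percentile-based display limits. It is not Caiman's internal code: the helpers `percentile_limits` and `clip_and_scale` are hypothetical, and the synthetic movie stands in for real data. The point is that a single very bright pixel does not compress the display range when the limits come from percentiles instead of the raw minimum and maximum.

```python
import numpy as np

def percentile_limits(stack, q_min=1.0, q_max=99.5):
    """Return (vmin, vmax) display limits taken from percentiles of the data."""
    vmin, vmax = np.percentile(stack, [q_min, q_max])
    return float(vmin), float(vmax)

def clip_and_scale(stack, vmin, vmax):
    """Clip values to [vmin, vmax] and rescale them to the range [0, 1]."""
    clipped = np.clip(stack, vmin, vmax)
    return (clipped - vmin) / (vmax - vmin)

# synthetic movie (frames x height x width) with one extreme outlier pixel
rng = np.random.default_rng(0)
stack = rng.normal(loc=100.0, scale=10.0, size=(50, 16, 16))
stack[0, 0, 0] = 10_000.0

vmin, vmax = percentile_limits(stack, q_min=1.0, q_max=99.5)
scaled = clip_and_scale(stack, vmin, vmax)
print(f"raw max: {stack.max():.0f}, display vmax: {vmax:.1f}")
```

The same percentile trick is used for `vmin` and `vmax` in the `imshow()` calls later in this notebook.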

<div class="alert alert-info">
    <h2>Displaying large files</h2>
Loading a movie with <em>cm.load()</em> pulls the data into memory, which is not always feasible. With extremely large files of your own, you may need to adapt the above code, and Caiman provides tools for this. First, you can load just <em>some</em> of your data into a movie object using the <em>subindices</em> argument to the <em>load()</em> function. For example, to load only the first 500 frames of a movie, pass <em>subindices=np.arange(0,500)</em>.

If you don't want to truncate your movie, there is a <em>play_movie()</em> function that behaves just like <em>movie.play()</em> but never loads the whole movie into memory. Instead, it takes the filename as an argument and iteratively loads frames from disk, showing them as needed. If you want to try it, import it using <em>from caiman.base.movies import play_movie</em> and read the documentation. We don't use it for the demo because the demo movie is small, and we do some calculations on the loaded movie array.

Another option for viewing very large movies is to use the <a href="https://github.com/fastplotlib/fastplotlib">fastplotlib library</a>, which leverages the GPU to provide interactive visualization within Jupyter notebooks (we discuss this more below).
</div>

Let's also create a couple of summary images of the movie, including a *maximum projection* (the maximum value of each pixel) and a *correlation image* (how correlated each pixel is with its neighbors). If a pixel comes from an active neural component it will tend to be highly correlated with its neighbors. 

In [None]:
max_projection_orig = np.max(movie_orig, axis=0)
correlation_image_orig = cm.summary_images.local_correlations_fft(movie_orig, swap_dim=False)
correlation_image_orig[np.isnan(correlation_image_orig)] = 0 # get rid of NaNs, if they exist
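To build intuition for what a correlation image measures, here is a small NumPy sketch that averages the Pearson correlation between each pixel's time course and its four nearest neighbors. It only illustrates the concept and is not Caiman's `local_correlations_fft()` implementation. The function name `local_correlation_4nn` and the synthetic movie are assumptions made for this example.

```python
import numpy as np

def local_correlation_4nn(movie):
    """Mean Pearson correlation of each pixel's trace with its 4 nearest neighbors.

    movie: array of shape (frames, height, width), with height and width >= 2.
    """
    movie = np.asarray(movie, dtype=float)
    # z-score every pixel's time course (population standard deviation)
    std = movie.std(axis=0)
    std[std == 0] = np.inf  # constant pixels end up with correlation 0
    z = (movie - movie.mean(axis=0)) / std

    corr_sum = np.zeros(movie.shape[1:])
    n_neighbors = np.zeros(movie.shape[1:])

    # correlation between vertically adjacent pixels
    c = (z[:, 1:, :] * z[:, :-1, :]).mean(axis=0)
    corr_sum[1:, :] += c
    corr_sum[:-1, :] += c
    n_neighbors[1:, :] += 1
    n_neighbors[:-1, :] += 1

    # correlation between horizontally adjacent pixels
    c = (z[:, :, 1:] * z[:, :, :-1]).mean(axis=0)
    corr_sum[:, 1:] += c
    corr_sum[:, :-1] += c
    n_neighbors[:, 1:] += 1
    n_neighbors[:, :-1] += 1

    return corr_sum / n_neighbors

# synthetic movie: independent noise plus one shared signal in a 2x2 "cell"
rng = np.random.default_rng(0)
toy_movie = rng.normal(size=(500, 8, 8))
shared_signal = rng.normal(size=500)
toy_movie[:, 2:4, 2:4] += 3 * shared_signal[:, None, None]

toy_corr_image = local_correlation_4nn(toy_movie)
print(np.round(toy_corr_image, 2))
```

Pixels inside the 2x2 block stand out against the background, where neighboring pixels carry only independent noise.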

In [None]:
f, (ax_max, ax_corr) = plt.subplots(1,2,figsize=(6,3))
ax_max.imshow(max_projection_orig, 
              cmap='viridis',
              vmin=np.percentile(np.ravel(max_projection_orig),50), 
              vmax=np.percentile(np.ravel(max_projection_orig),99.5));
ax_max.set_title("Max Projection Orig", fontsize=12);

ax_corr.imshow(correlation_image_orig, 
               cmap='viridis', 
               vmin=np.percentile(np.ravel(correlation_image_orig),50), 
               vmax=np.percentile(np.ravel(correlation_image_orig),99.5));
ax_corr.set_title('Correlation Image Orig', fontsize=12);

These images will not be particularly sharp yet, as there is still a good deal of motion in the movie. 

> Note the above will generate static images. To view them interactively you can generate them in qt mode by running the cell magic `%matplotlib qt` at the beginning of this Jupyter notebook. If you are in colab or other cloud services, you can wrap figures using the [mpld3 package](https://github.com/mpld3/mpld3). 

## Set initial parameters
In general in Caiman, estimators are first initialized with a set of parameters, and then they are fit against actual data in a separate step. In this section, we'll define a `parameters` object that will subsequently be used to initialize our different estimators. 

The parameters are divided into different categories. We will not discuss them in detail in this section, but will go over them when needed (and note that in this notebook we will mostly focus on CNMF and later steps):

In [None]:
# general dataset-dependent parameters
fr = 30                     # imaging rate in frames per second
decay_time = 0.4            # length of a typical transient in seconds
dxy = (2., 2.)              # spatial resolution in x and y in (um per pixel)

# motion correction parameters
strides = (48, 48)          # start a new patch for pw-rigid motion correction every x pixels
overlaps = (24, 24)         # overlap between patches (width of patch = strides+overlaps)
max_shifts = (6,6)          # maximum allowed rigid shifts (in pixels)
max_deviation_rigid = 3     # maximum shifts deviation allowed for patch with respect to rigid shifts
pw_rigid = True             # flag for performing non-rigid motion correction

# CNMF parameters for source extraction and deconvolution
p = 1                       # order of the autoregressive system (set p=2 if there is visible rise time in data)
gnb = 2                     # number of global background components (set to 1 or 2)
merge_thr = 0.85            # merging threshold, max correlation allowed
bas_nonneg = False          # enforce nonnegativity constraint on calcium traces (technically on baseline)
rf = 15                     # half-size of the patches in pixels (patch width is rf*2 + 1)
stride_cnmf = 6             # amount of overlap between the patches in pixels (overlap is stride_cnmf+1) 
K = 4                       # number of components per patch
gSig = np.array([4, 4])     # expected half-width of neurons in pixels 
gSiz = 2*gSig + 1           # half-width of bounding box created around neurons during initialization
method_init = 'greedy_roi'  # initialization method (if analyzing dendritic data see demo_dendritic.ipynb)
ssub = 1                    # spatial subsampling during initialization 
tsub = 1                    # temporal subsampling during initialization

# parameters for component evaluation
min_SNR = 2.0               # signal to noise ratio for accepting a component
rval_thr = 0.85             # space correlation threshold for accepting a component
cnn_thr = 0.99              # threshold for CNN based classifier
cnn_lowest = 0.1            # neurons with cnn probability lower than this value are rejected

We place the above parameter values in a dictionary that is passed to the `CNMFParams` class (those parameters *not* explicitly defined will assume default values):

In [None]:
parameter_dict = {'fnames': movie_path,
                  'fr': fr,
                  'dxy': dxy,
                  'decay_time': decay_time,
                  'strides': strides,
                  'overlaps': overlaps,
                  'max_shifts': max_shifts,
                  'max_deviation_rigid': max_deviation_rigid,
                  'pw_rigid': pw_rigid,
                  'p': p,
                  'nb': gnb,
                  'rf': rf,
                  'K': K, 
                  'gSig': gSig,
                  'gSiz': gSiz,
                  'stride': stride_cnmf,
                  'method_init': method_init,
                  'rolling_sum': True,
                  'only_init': True,
                  'ssub': ssub,
                  'tsub': tsub,
                  'merge_thr': merge_thr, 
                  'bas_nonneg': bas_nonneg,
                  'min_SNR': min_SNR,
                  'rval_thr': rval_thr,
                  'use_cnn': True,
                  'min_cnn_thr': cnn_thr,
                  'cnn_lowest': cnn_lowest}

parameters = params.CNMFParams(params_dict=parameter_dict) # CNMFParams is the parameters class

This parameters object (`parameters`) is basically a collection of dictionaries, each containing a different parameter category. These different dictionaries can be accessed using dot notation.  Some parameters are related to the dataset in general (`parameters.data`), while most are related to specific aspects of the workflow such as motion correction (`parameters.motion`) or quality evaluation (`parameters.quality`).
 
For instance, if you want to inspect the dataset-dependent parameters:

In [None]:
parameters.data

To access a particular parameter in this parameter field, you just need to get the value from the dictionary using the appropriate key. For instance, to get the frame rate:

In [None]:
parameters.data['fr'] 

<div class="alert alert-info" markdown="1">
    <h2 style="margin-top: 0;">To dig deeper into this design</h2>  
    To see more about the design of Caiman estimators and parameters, and their decoupling, see <a href="https://caiman.readthedocs.io/en/latest/Getting_Started.html#estimator-design">our docs on estimator design</a>.
</div>

## Setting up a cluster
Caiman is optimized for parallel computing, and distributes computations to multiple CPU cores for motion correction and CNMF (there is also an option to run motion correction on the GPU, but we will not focus on that here). Setting up the multicore processing is done with the `setup_cluster()` function below. 

First, let's see how many CPUs we have available, and set the number of processors we want to use. If you set `num_processors_to_use` to `None`, then `setup_cluster()` will set it to one less than the total number available:

In [None]:
print(f"You have {psutil.cpu_count()} CPUs available in your current environment")
num_processors_to_use = None
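As a plain-Python illustration of that default, the sketch below resolves a requested process count the way the text describes. The helper `resolve_n_processes` is hypothetical and is not part of Caiman. Clamping the result to at least one process is an assumption made here so the sketch stays sensible on a single-core machine.

```python
def resolve_n_processes(requested, total_cpus):
    """Return the number of worker processes to use (hypothetical helper).

    requested=None means "one less than the total", as described above.
    The floor of 1 is an assumption made for this sketch.
    """
    if requested is None:
        return max(total_cpus - 1, 1)
    return requested

for total in (1, 4, 16):
    print(f"{total} CPUs -> {resolve_n_processes(None, total)} processes")
```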

Set up a cluster of processors. If one has already been set up (the `cluster` variable is already in your namespace), then that cluster will be closed and a new one created.

In [None]:
if 'cluster' in locals():  # 'locals' contains list of current local variables
    print('Closing previous cluster')
    cm.stop_server(dview=cluster)
print("Setting up new cluster")
_, cluster, n_processes = cm.cluster.setup_cluster(backend='multiprocessing', 
                                                   n_processes=num_processors_to_use, 
                                                   ignore_preexisting=False)
print(f"Successfully set up cluster with {n_processes} processes")

We won't focus much on the outputs, except for `cluster`. This is the pool of processors (CPUs) that will be used in many of Caiman's subsequent processing steps. In these later steps, if you set the parameter `dview` to `cluster`, then parallel processing will be used. If instead you set `dview` to `None` then no parallel processing will be used. This latter option can be important when debugging, as the logger doesn't typically work for multithreaded operations.

For more details, please see [our documentation on cluster setup](https://caiman.readthedocs.io/en/latest/Getting_Started.html#cluster-setup-and-shutdown). 

<div class="alert alert-info" markdown="1">
    <h2 style="margin-top: 0;">Optimizing performance</h2>  
If you hit memory issues later, there are a few things you can do. First, you may want to lower the number of processors you are using. Each additional process uses more RAM, and on a workstation with many processors, you can sometimes get better performance by reducing <em>num_processors_to_use</em>. The best way to determine the optimal number is by trial and error. When you set the <em>num_processors_to_use</em> variable to <em>None</em>, it defaults to <i>one</i> less than the total number of CPU cores available (we don't automatically set it to the total number of cores because in practice this typically leads to worse performance).

Second, if your system has less than 32GB of RAM, and things are running slowly or you are running out of memory, then get more RAM. While you can sometimes get away with less, we recommend a *bare minimum* level of 16GB of RAM, but more is better. 32GB RAM is acceptable, 64GB or more is best. Obviously, this will depend on the size of your data sets. 

Third, if none of your memory optimizations work, you may just have too much data for offline CNMF. For this case, we also provide an online version of CNMF (OnACID), which uses a small number of frames to initialize the spatial and temporal components, and iteratively updates them with new data. This uses much less memory than the offline approach. The demo notebook for OnACID is found in <a href="./demo_OnACID_mesoscope.ipynb">demo_OnACID_mesoscope.ipynb</a>. See the <a href="https://pubmed.ncbi.nlm.nih.gov/30652683/">Caiman paper</a> for more discussion.
</div>
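A quick back-of-the-envelope calculation can tell you whether a movie is likely to fit in RAM. The sketch below assumes the movie is held as 32-bit floats (4 bytes per pixel), which is an assumption you should check against your own data. The helper `movie_memory_gb` and both sets of movie dimensions are hypothetical examples.

```python
def movie_memory_gb(n_frames, height, width, bytes_per_pixel=4):
    """Approximate in-memory size of a movie in GiB (hypothetical helper).

    bytes_per_pixel=4 assumes float32 storage.
    """
    return n_frames * height * width * bytes_per_pixel / 1024**3

# hypothetical small movie: 3000 frames of 170 x 170 pixels
small_gb = movie_memory_gb(3000, 170, 170)

# hypothetical large movie: 100,000 frames of 512 x 512 pixels
large_gb = movie_memory_gb(100_000, 512, 512)

print(f"small movie: {small_gb:.2f} GiB, large movie: {large_gb:.1f} GiB")
```

Keep in mind that this counts a single copy of the data. Processing typically needs extra working memory on top of that, and each parallel process adds its own overhead.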

## Motion Correction
The first substantive step in our analysis pipeline is to remove motion artifacts from the original movie:

![Txt](https://raw.githubusercontent.com/EricThomson/image_sandbox/main/images/normcorre_workflow.jpg)

It is *very* important to get rid of motion artifacts, as the subsequent CNMF source separation algorithm assumes that each pixel represents the same region of space in every frame.
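To illustrate the basic idea behind rigid registration, the sketch below estimates a whole-frame integer shift by locating the peak of the cross-correlation between a frame and a template, computed with FFTs. This is a deliberately simplified toy: it is not the NoRMCorre implementation, which also handles subpixel shifts and piecewise (patch-wise) motion. The function `estimate_rigid_shift` and the synthetic images are assumptions made for the example.

```python
import numpy as np

def estimate_rigid_shift(template, frame):
    """Estimate the integer (row, col) displacement of `frame` relative to `template`."""
    # circular cross-correlation computed in the Fourier domain
    xcorr = np.fft.ifft2(np.fft.fft2(frame) * np.conj(np.fft.fft2(template))).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # peaks past the midpoint correspond to negative shifts (wrap-around)
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, xcorr.shape))

# synthetic template, and a frame displaced by a known amount
rng = np.random.default_rng(0)
template = rng.random((64, 64))
true_shift = (3, -2)
frame = np.roll(template, true_shift, axis=(0, 1))

estimated_shift = estimate_rigid_shift(template, frame)
# undo the motion by rolling the frame back by the estimated shift
corrected = np.roll(frame, (-estimated_shift[0], -estimated_shift[1]), axis=(0, 1))
print(f"true shift: {true_shift}, estimated shift: {estimated_shift}")
```

After correction, each pixel in `corrected` again covers the same location as in `template`, which is exactly the property CNMF relies on.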

First, we initialize the motion correction estimator using the parameters that we set above:   

In [None]:
mot_correct = MotionCorrect(movie_path, dview=cluster, **parameters.motion)

This notebook focuses on CNMF, not motion correction, but let's consider a couple of the motion correction parameters:

- `pw_rigid=True` tells us that we are going to perform piecewise rigid motion correction using the nonrigid motion correction (NoRMCorre) algorithm (this is because the data seems to exhibit some non-uniform motion). If your data exhibits uniform motion across the field of view, set this to `False` for efficiency.
- NoRMCorre will split the movie into patches that repeat every 48 pixels (`strides`), and have 24 pixels of overlap (`overlaps`): the total patch width is 72 (the sum of stride and overlap).
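The stride/overlap arithmetic above can be sketched in a few lines of plain Python. The helper `patch_intervals` is hypothetical, and the 170-pixel axis length is just an example value. The exact way Caiman places patches near the image border may differ from this simplified tiling.

```python
def patch_intervals(n_pixels, stride, overlap):
    """Return (start, stop) pixel intervals of patches along one axis.

    A new patch starts every `stride` pixels and is `stride + overlap` wide.
    The last patch is truncated at the image border (a simplification).
    """
    if stride <= 0:
        raise ValueError("stride must be positive")
    width = stride + overlap
    intervals = []
    start = 0
    while True:
        stop = min(start + width, n_pixels)
        intervals.append((start, stop))
        if stop == n_pixels:
            break
        start += stride
    return intervals

# hypothetical 170-pixel axis with the demo's strides=48 and overlaps=24
intervals = patch_intervals(170, stride=48, overlap=24)
for start, stop in intervals:
    print(f"patch covers pixels {start} to {stop - 1} (width {stop - start})")
```

Consecutive full-size patches share 24 pixels, which is what lets the per-patch shifts be blended smoothly across patch boundaries.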

<div class="alert alert-info" markdown="1">
    <h2 style="margin-top: 0;">For more on motion correction</h2>  
For a detailed exploration of Caiman's motion correction pipeline, see the <a href="./demo_motion_correction.ipynb">demo_motion_correction.ipynb</a> demo.
</div>

The next step is to run the motion correction algorithm using the `motion_correct()` method. You may see some warnings about negative movie averages: you can ignore them.

In [None]:
%%time
#%% Run piecewise-rigid motion correction using NoRMCorre
mot_correct.motion_correct(save_movie=True);

Optionally, inspect the results by playing the original movie side by side with the motion-corrected movie. Note that we turn the gain up here to highlight motion.

In [None]:
#%% compare with original movie  : press q to quit
movie_orig = cm.load(movie_path) # in case it was not loaded earlier
movie_corrected = cm.load(mot_correct.mmap_file) # load motion corrected movie
ds_ratio = 0.2
cm.concatenate([movie_orig.resize(1, 1, ds_ratio) - mot_correct.min_mov*mot_correct.nonneg_movie,
                movie_corrected.resize(1, 1, ds_ratio)], 
                axis=2).play(fr=20, 
                             gain=2, 
                             magnification=2) 

Let's look at the max projection and correlation image of the motion-corrected movie. In movies that originally contained a lot of movement, these summary images will look quite different from the first ones: more "crisp", because they are no longer blurred by movement:

In [None]:
max_projection = np.max(movie_corrected, axis=0)
correlation_image = cm.summary_images.local_correlations_fft(movie_corrected, swap_dim=False)
correlation_image[np.isnan(correlation_image)] = 0 # get rid of NaNs, if they exist

In [None]:
f, ((ax_max_orig, ax_max), (ax_corr_orig, ax_corr)) = plt.subplots(2,2,figsize=(6,6), sharex=True, sharey=True)
# plot max projection
ax_max_orig.imshow(max_projection_orig, 
                   cmap='viridis', 
                   vmin=np.percentile(np.ravel(max_projection_orig),50), 
                   vmax=np.percentile(np.ravel(max_projection_orig),99.5));
ax_max_orig.set_title("Max Projection Orig", fontsize=12);
ax_max.imshow(max_projection, 
              cmap='viridis', 
              vmin=np.percentile(np.ravel(max_projection),50), 
              vmax=np.percentile(np.ravel(max_projection),99.5));
ax_max.set_title("Max Motion Corrected", fontsize=12);

# plot correlation image
ax_corr_orig.imshow(correlation_image_orig, 
                    cmap='viridis', 
                    vmin=np.percentile(np.ravel(correlation_image_orig),50), 
                    vmax=np.percentile(np.ravel(correlation_image_orig),99.5));
ax_corr_orig.set_title('Correlation Image Orig', fontsize=12);
ax_corr.imshow(correlation_image, 
               cmap='viridis', 
               vmin=np.percentile(np.ravel(correlation_image),50), 
               vmax=np.percentile(np.ravel(correlation_image),99.5));
ax_corr.set_title('Correlation Motion Corrected', fontsize=12);

plt.tight_layout()

<div class="alert alert-info" markdown="1">
    <h2 style="margin-top: 0;">If you don't need to run motion correction</h2>  
If you don't need to run motion correction (for instance, if your movie has no movement, which is often the case in slice preparations), you can directly memory map your file to prepare for subsequent processing:

    mc_memmapped_fname = cm.save_memmap(movie_path, base_name='memmap_',
                                         order='C', border_to_0=0, dview=cluster)

You would need to modify the rest of your code accordingly (e.g., don't run motion correction, don't run the following code cell that creates a memmapped file from the motion correction estimator object). 
</div>

## Creating and accessing memory mapped files
The motion-corrected data was saved as a memory mapped file in `mot_correct.mmap_file` (in F-order, which is optimized for motion correction). The cell below saves the same motion-corrected data in a second memory mapped file in C-order, which is optimized for CNMF. We then load the memmapped data with the built-in Caiman function `load_memmap()`. This lets us treat the memory-mapped data *as if* it were in memory while leaving it on disk (for more details about memory mapping, see [our documentation on memmapping](https://caiman.readthedocs.io/en/latest/Getting_Started.html#memory-mapping)).

In [None]:
border_to_0 = 0 if mot_correct.border_nan == 'copy' else mot_correct.border_to_0 # trim border against NaNs
mc_memmapped_fname = cm.save_memmap(mot_correct.mmap_file, 
                                        base_name='memmap_', 
                                        order='C',
                                        border_to_0=border_to_0,  # exclude borders, if that was done
                                        dview=cluster)

Yr, dims, num_frames = cm.load_memmap(mc_memmapped_fname)
images = np.reshape(Yr.T, [num_frames] + list(dims), order='F') #reshape frames in standard 3d format (T x X x Y)
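The reshape in the last line can be checked on a toy array. The sketch below assumes, as the reshape call implies, that `Yr` has shape (pixels, frames) with each frame flattened in column-major (F) order. The tiny dimensions and the variable `toy_movie` are made up for illustration.

```python
import numpy as np

# toy movie with 4 frames of 3 x 2 pixels, stored as (frames, d1, d2)
T, d1, d2 = 4, 3, 2
toy_movie = np.arange(T * d1 * d2, dtype=np.float32).reshape(T, d1, d2)

# build a (pixels x frames) matrix: each column is one frame flattened in F-order
Yr_toy = toy_movie.reshape((T, d1 * d2), order='F').T

# the same reshape used in the cell above recovers the (frames, d1, d2) movie
images_toy = np.reshape(Yr_toy.T, [T] + [d1, d2], order='F')

print(Yr_toy.shape, images_toy.shape)
```

Because `order='F'` is used consistently in both directions, the round trip returns the original movie exactly, and column `t` of `Yr_toy` is frame `t` flattened in column-major order.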

Restart the cluster to clean up memory in preparation for CNMF run.

In [None]:
cm.stop_server(dview=cluster)
_, cluster, n_processes = cm.cluster.setup_cluster(backend='multiprocessing', 
                                                   n_processes=num_processors_to_use, 
                                                   single_thread=False)

## Run CNMF on patches in parallel

Everything is now set up for running CNMF. This algorithm simultaneously extracts the *spatial footprint* and corresponding *calcium trace* for each component. 

![Txt](https://raw.githubusercontent.com/EricThomson/image_sandbox/main/images/cnmf_workflow.jpg)

It also performs *deconvolution*, providing an estimate of the spike count that generated the calcium signal in the movie. 
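To give a feel for what deconvolution undoes, here is a noise-free toy model in plain Python. It assumes a first-order autoregressive (AR(1)) calcium model, matching `p = 1`, in which each spike adds to a trace that then decays exponentially. Tying the decay factor `g` to `decay_time` and `fr` through `g = exp(-1 / (decay_time * fr))` is an assumption made for this sketch. Real deconvolution must also cope with noise and enforce nonnegative spike estimates, which this toy inversion does not attempt.

```python
import math

def simulate_ar1_calcium(spikes, fr, decay_time):
    """Generate a calcium trace c[t] = g * c[t-1] + s[t] from a spike train."""
    g = math.exp(-1.0 / (decay_time * fr))  # assumed per-frame decay factor
    trace = []
    c = 0.0
    for s in spikes:
        c = g * c + s
        trace.append(c)
    return trace, g

def deconvolve_ar1(trace, g):
    """Invert the noise-free AR(1) model: s[t] = c[t] - g * c[t-1]."""
    spikes = []
    previous = 0.0
    for c in trace:
        spikes.append(c - g * previous)
        previous = c
    return spikes

toy_spikes = [0, 0, 1, 0, 0, 0, 2, 0, 0, 1, 0, 0]
toy_trace, g = simulate_ar1_calcium(toy_spikes, fr=30, decay_time=0.4)
recovered = deconvolve_ar1(toy_trace, g)

print(f"decay factor g = {g:.3f}")
print([round(s, 6) for s in recovered])
```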

The algorithm is parallelized as illustrated here:

<img src="https://raw.githubusercontent.com/EricThomson/image_sandbox/main/images/cnmf_patches.jpg" alt="cnmf patch flow" width="500"/>

1) The movie field of view is split into overlapping patches.
2) These patches are processed in parallel by the CNMF algorithm. The degree of parallelization depends on your available computing power: if you have just one CPU then the patches will be processed sequentially. 
3) The results from all the patches are merged, with special focus on components in overlapping regions -- overlapping components are merged if their activity is highly correlated.
4) Results are refined with additional iterations of CNMF.

As discussed above, Caiman's main algorithms are run in two steps: first the estimators are *initialized* with a set of parameters, and then they are *fit* against actual data. Let's initialize our CNMF estimator object:

In [None]:
cnmf_model = cnmf.CNMF(n_processes, 
                       params=parameters, 
                       dview=cluster)

Once initialized, we could immediately run `cnmf_model.fit(images)` if we knew the parameters were good (the parameters in your CNMF estimator object are accessible in `cnmf_model.params`). However, before running the algorithm, let's discuss the more critical parameters that you will most likely need to tweak when running CNMF on your own data.

### Key parameters for CNMF

`rf (int)`: *patch half-width*

> `rf` ('receptive field') is the half-width of patches in pixels. The patch width is `2*rf + 1`. `rf` should be *at least* 3-4 times larger than the observed neuron diameter. The larger the patch size, the less parallelization will be used by Caiman. If `rf` is set to `None`, then CNMF will be run on the entire field of view. 

`stride_cnmf (int)`: *patch overlap*

> `stride_cnmf` is the overlap between patches in pixels (the overlap is `stride_cnmf + 1`). This should be at least the diameter of a neuron. The larger the overlap, the greater the computational load, but the results will be more accurate when stitching together results from different patches. 
 
`gSig (int, int)`: *half-width of neurons*

> `gSig` is roughly the half-width of neurons in your movie in pixels (height, width). It is the sigma parameter of a Gaussian filter run on all the images during initialization. If the filter is appropriately matched to your data, you will get a much better estimate. `gSig` is related to `gSiz`, which is typically set to `2*gSig + 1`. `gSiz` is the size (in pixels) of a bounding box created around each seed pixel when running CNMF -- if after running `refit()` your neurons end up looking square or artificially cut off, you may need to increase `gSiz`.
    
`K (int)`: *components per patch*

> `K` is the expected number of components per patch. You should adapt this to the density of components in your data, and the current `rf` parameter. We suggest you pick `K` based on the more dense patches in your movie so you don't miss neurons (we want to avoid false negatives).
    
`merge_thr (float)`: *merge threshold* 

> If two spatially overlapping components are correlated above `merge_thr`, they will be merged into one component. The correlation coefficient is calculated using their respective calcium traces.  

You typically will set `rf` and `stride` infrequently, so `K`, `gSig`, and `merge_thr` are the main parameters you will tweak when analyzing a given session. Note these are not the *only* important parameters. They just tend to be the *most* important, while many others tend to depend on your calcium indicator or other factors that don't vary within an experiment. These are listed above in the section on setting initial parameters and in [the docs](https://caiman.readthedocs.io/en/master/Getting_Started.html#parameters).
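To make the `merge_thr` criterion concrete, the sketch below computes the Pearson correlation between pairs of toy calcium traces and compares it with the threshold. It only illustrates the correlation test itself, and Caiman's full merging procedure involves more than this single comparison. The helper `pearson` and all three traces are made up for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

merge_thr = 0.85  # same value as in the parameter cell above

# toy traces: b closely follows a, while c is active at different times
trace_a = [0.0, 1.0, 4.0, 2.0, 1.0, 0.0, 0.0, 3.0, 1.5, 0.5]
trace_b = [0.1, 0.9, 3.8, 2.2, 0.9, 0.1, 0.0, 3.1, 1.4, 0.6]
trace_c = [2.0, 0.0, 0.5, 0.0, 3.0, 1.0, 0.0, 0.2, 0.0, 2.5]

corr_ab = pearson(trace_a, trace_b)
corr_ac = pearson(trace_a, trace_c)
merge_ab = corr_ab > merge_thr
merge_ac = corr_ac > merge_thr
print(f"a vs b: r = {corr_ab:.3f}, merge candidate: {merge_ab}")
print(f"a vs c: r = {corr_ac:.3f}, merge candidate: {merge_ac}")
```

Raising `merge_thr` makes merging more conservative, so only components with nearly identical activity are combined.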

It is useful to keep track of the key parameters, so let's create a helper function `key_params()` to return them whenever we want (we include `gSiz` just so that if you tweak `gSig`, you will remember to check on `gSiz` as well):

In [None]:
def key_params(cnmf_model):
    """
    Convenience function to return critical parameters given CNMF estimator object.
    Returns dictionary with values of rf, stride, gSig, gSiz, K, merge_thr
    
    Note: 
    gSiz is included because it depends on gSig and you want to make sure to change it when you change gSig.
    These are not set in stone: tweak for your own needs!
    """
    rf = cnmf_model.params.patch['rf']
    stride = cnmf_model.params.patch['stride']
    gSig = cnmf_model.params.init['gSig']
    gSiz = cnmf_model.params.init['gSiz']
    K = cnmf_model.params.init['K']
    merge_thr = cnmf_model.params.merging['merge_thr']
    
    key_params = {'rf': rf, 
                  'stride': stride,
                  'gSig': gSig,
                  'gSiz': gSiz,
                  'K': K,
                  'merge_thr': merge_thr}
    
    return key_params

In [None]:
print(f"Initial key params: {key_params(cnmf_model)}")

### How to pick parameters
How do you determine what values to use for the key parameters with your own data? The previous section provided some guidelines on how to select them via inspection of your data. Basically: look at your movie, or a summary image for your movie, and pick values close to those suggested by the guidelines above. Just keep in mind that there is typically no *perfect* value, and there will usually be some trial and error involved. Luckily, the CNMF model is fairly robust to small changes in these values.

It is helpful to use the `view_quilt()` function to see if our critical spatial parameters are in the right ballpark (we recommend running this viewer in interactive qt mode so you can interact with it and get a better feel for the parameters):

In [None]:
# calculate stride and overlap from parameters
cnmf_patch_width = cnmf_model.params.patch['rf']*2 + 1
cnmf_patch_overlap = cnmf_model.params.patch['stride'] + 1
cnmf_patch_stride = cnmf_patch_width - cnmf_patch_overlap
print(f'Patch width: {cnmf_patch_width} , Stride: {cnmf_patch_stride}, Overlap: {cnmf_patch_overlap}');

# plot the patches
patch_ax = view_quilt(correlation_image, 
                      cnmf_patch_stride, 
                      cnmf_patch_overlap, 
                      vmin=np.percentile(np.ravel(correlation_image),50), 
                      vmax=np.percentile(np.ravel(correlation_image),99.5),
                      figsize=(4,4));
patch_ax.set_title(f'CNMF Patches Width {cnmf_patch_width}, Overlap {cnmf_patch_overlap}');

#### Evaluate spatial parameters using the quilt plot
- Is the patch width at least three times the width of a neuron? Yes
- Do individual neurons fit in the overlap region (`stride`)? No, current value of 6 is a bit low, so let's bump `stride` to 9 or 10.
- When in interactive mode, you can zoom and inspect the average width of each neuron in pixels. Is `gSig` about half that? Yes. Each neuron is about 6-8 pixels wide, and `gSig` is 4.
- For `K`: how many neurons are in each patch? Remember you want an upper bound, not an average. The current value of 4 seems good.
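
The arithmetic behind the quilt plot can be checked without loading any data. The sketch below wraps the same formulas used in the quilt-plot cell above in a hypothetical helper, `patch_geometry()` (it is not part of Caiman). The value `rf=15` is assumed purely for illustration, while `stride=6` and `stride=10` are the values discussed in this section.

```python
def patch_geometry(rf, stride):
    """Return (width, overlap, step) in pixels for CNMF patches.

    Same formulas as the quilt-plot cell:
    width = 2*rf + 1, overlap = stride + 1, step = width - overlap.
    """
    width = 2 * rf + 1
    overlap = stride + 1
    step = width - overlap
    return width, overlap, step

# rf=15 is an assumed value, used for illustration only
for stride in (6, 10):
    width, overlap, step = patch_geometry(rf=15, stride=stride)
    print(f"stride={stride}: width={width}, overlap={overlap}, step={step}")
```

Under these assumptions the overlap grows from 7 to 11 pixels, so a neuron about 8 pixels wide fits inside the overlap region only with the larger `stride`.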

If you can't remember the current parameter values, remember you can always use the `key_params()` convenience function to print them out:

In [None]:
key_params(cnmf_model)

The only parameter we are going to change is `stride`. To *change* parameters, Caiman has a built-in `change_params()` function, and you can iteratively go through the following cell, tweaking params until you are happy with them based on the `rf/stride/K/gSig` combination for your data. 

Note that while you can just change the parameters and keep going, we recommend inspecting the resulting patches with `view_quilt()`: certain combinations of `rf` and `stride` produce counterintuitive edge effects that lead to a lot of redundant computation.

In [None]:
rf_new = rf  # unchanged
stride_new = 10
gsig_new = gSig # unchanged
gsiz_new = gsig_new*2 + 1
k_new = K  # unchanged
merge_thr_new = merge_thr  # unchanged

print(f"Before changing: {key_params(cnmf_model)}")
cnmf_new_params = {'rf': rf_new,
                   'stride': stride_new,
                   'gSig': gsig_new,
                   'gSiz': gsiz_new, 
                   'K': k_new,
                   'merge_thr': merge_thr_new}
cnmf_model.params.change_params(params_dict=cnmf_new_params)


# recalculate overlap, patch width, and stride from the updated parameters
cnmf_overlap = cnmf_model.params.patch['stride'] + 1
cnmf_patch_width = cnmf_model.params.patch['rf']*2 + 1
cnmf_stride = cnmf_patch_width - cnmf_overlap

# plot the patches
patch_ax = view_quilt(correlation_image, 
                      cnmf_stride, 
                      cnmf_overlap, 
                      vmin=np.percentile(np.ravel(correlation_image),50), 
                      vmax=np.percentile(np.ravel(correlation_image),99.5),
                      figsize=(4,4));
patch_ax.set_title(f'CNMF Patches Width {cnmf_patch_width}, Overlap {cnmf_overlap}');

print(f"After changing: {key_params(cnmf_model)}")

The new parameters look good, but ultimately the only way to really know is to fit the model with the parameters and see how it does: often you have to iteratively search a bit in parameter space. It's time to run CNMF! Note if you are ever unhappy with your parameters you can always run through an iterative process like the one above to improve them. 

<div class="alert alert-info" markdown="1">
    <h2 style="margin-top: 0;">Mesmerize your search!</h2>  
It is not always efficient to cobble together iterative parameter search from scratch, especially if you end up needing to do a grid search in a large parameter space. <a href="https://github.com/nel-lab/mesmerize-core">Mesmerize</a> is a great package built to make parameter exploration in Caiman faster and more convenient: it keeps track of all your different results, and includes powerful GPU-based tools (based on the <a href="https://github.com/fastplotlib/fastplotlib">fastplotlib library</a>) for visualization of results in Jupyter notebooks. 
</div>
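
For a small search, you can enumerate candidate settings with the standard library before reaching for extra tooling. The following is a minimal sketch: the candidate values and the `param_grid()` helper are hypothetical, and each resulting dictionary is only meant to be the kind of thing you would pass to `change_params()` before refitting.

```python
from itertools import product

def param_grid(**candidates):
    """Yield one dict per combination of the candidate parameter values."""
    names = list(candidates)
    for values in product(*(candidates[name] for name in names)):
        yield dict(zip(names, values))

# hypothetical candidate values, chosen only for illustration
grid = list(param_grid(K=[3, 4, 5], stride=[8, 10], merge_thr=[0.8, 0.85]))
print(f"{len(grid)} combinations; first: {grid[0]}")

# each entry could then be applied with something like
# cnmf_model.params.change_params(params_dict=grid[i]), followed by a new fit
```

The number of combinations grows multiplicatively with each parameter you add, which is why a dedicated tool becomes attractive for larger searches.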

### Run CNMF
Now that we are happy with our parameters, let's run the CNMF algorithm using the `fit()` method. 

In [None]:
%%time
cnmf_fit = cnmf_model.fit(images)

<div class="alert alert-info" markdown="1">
    <h3 style="margin-top: 0;">Run all of the above with one command</h3>  
It is possible to run the combined steps of motion correction, memory mapping, and CNMF fitting in one step using the <em>fit_file()</em> method. We recommend that you familiarize yourself with the individual steps first. Running everything at once can be useful when testing code, or when sending jobs to a cluster once you already know the parameter settings.

    cnmf_all = cnmf.CNMF(n_processes, 
                         params=parameters, 
                         dview=cluster)
    cnmf_all.fit_file(motion_correct=True)
</div>

### Inspect the initial estimates
Briefly inspect the results by plotting contours of identified components using `plot_contours_nb()`. 

You can interactively explore this plot in your notebook with the help of the buttons on the right-hand-side of the plot (it was made using the [Bokeh](https://bokeh.org/) library). They let you zoom, pan, reset, or save the image.

In [None]:
cnmf_fit.estimates.plot_contours_nb(img=correlation_image, cmap='gray');

Note this is just an initial result, and it is expected to contain many false positives. The main concern here is whether you have lots of false *negatives* (has the algorithm missed neurons?). False negatives are hard to fix later, so if you have an unacceptable number, be sure to go back and re-run CNMF with new parameters.

> If you get a data rate error with any notebook plotting commands, you can start your notebook using:  
`jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10`

### Re-run (seeded) CNMF on the full field of view
It is typically helpful to refine the initial estimates by re-running the CNMF algorithm seeded just on the spatial estimates from the previous step using the `refit()` method. 

In [None]:
%%time
cnmf_refit = cnmf_fit.refit(images, dview=cluster)

The spatial contours of the new estimates should now look cleaner and more canonically neuronal in shape:

In [None]:
cnmf_refit.estimates.plot_contours_nb(img=correlation_image);

While the estimates look better, at this step there will still be false positives. The next **component evaluation** stage of the pipeline will evaluate the initial set of estimates, and remove the bad ones. But first, let's discuss the estimates object.

> Note if you are using this notebook to get hints for how to run CNMFE, **do not** run this `refit()` step. It is not needed, and not implemented, for CNMFE. 

# The estimates class
The main point of the CNMF algorithm is to perform source separation: to extract the spatial footprints and temporal traces of calcium activity from neurons underlying the raw data. This information, and many useful methods for visualization and analysis, are contained in Caiman's `Estimates` class. In the above code, an `estimates` object was generated as an attribute of your CNMF model after running `fit()` (you can find it in `cnmf_refit.estimates`). 

The rest of this notebook, from component evaluation to calculation of DFoF, is effectively an exploration of the properties and methods of the `Estimates` class. The most important estimates generated are:

    C: denoised calcium traces (num components x num frames) -- the temporal traces
    A: spatial components (num pixels x num components) -- the spatial footprints
    YrA: residual for each calcium trace (num components x num frames)
    S: spike count estimate from deconvolution, if used (num components x num frames)
    F_dff: deltaF/F -- detrended and normalized raw calcium traces (num components x num frames)

To recover raw calcium traces, you can add the denoised calcium traces and their residuals (`C + YrA`). The `F_dff` calculation is not done automatically, so when you run `fit()` the `F_dff` field will initially be `None`. We will show how to populate that field below.

In [None]:
# see shape of A and C
cnmf_refit.estimates.A.shape, cnmf_refit.estimates.C.shape
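
To make the shapes concrete, here is a toy sketch with synthetic NumPy arrays standing in for `C` and `YrA`. The sizes and random values are made up; with real results you would use `cnmf_refit.estimates.C` and `cnmf_refit.estimates.YrA` instead. It reconstructs the raw traces as `C + YrA`:

```python
import numpy as np

rng = np.random.default_rng(0)
num_components, num_frames = 5, 100  # made-up sizes

# synthetic stand-ins for estimates.C and estimates.YrA
C = rng.random((num_components, num_frames))                    # denoised traces
YrA = 0.1 * rng.standard_normal((num_components, num_frames))   # residuals

raw_traces = C + YrA  # one raw trace per component (row)
print(raw_traces.shape)
```

Because both arrays are components x frames, row `i` of `raw_traces` is the raw trace for component `i`.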

<div class="alert alert-info" markdown="1">
    <h2 style="margin-top: 0;">More on Estimates</h2>  
The estimates object contains a great deal of information. The attributes are discussed in more detail <a href="https://caiman.readthedocs.io/en/latest/Getting_Started.html#result-interpretation">in the documentation</a>, but you might also find exploring the <a href="https://github.com/flatironinstitute/CaImAn/blob/main/caiman/source_extraction/cnmf/estimates.py">source code</a> rewarding. For instance, while most users initially care about the extracted calcium signals <em>C</em> and spatial footprints <em>A</em>, the <b>background model</b> is also very important. The background model is included in the estimates in fields <em>b</em> and <em>f</em> (which correspond to the spatial and temporal components of the low-rank background model, respectively). We discuss the background model more below.

Also, we realize that attribute names like <em>A</em> are not very informative or Pythonic. These names are rooted in mathematical conventions from the original papers in the literature, but we may add aliases that are more transparent in the future to make things more readable/understandable.
</div>

# Component Evaluation
As already mentioned, the initial estimates produced by CNMF contain many spurious components. Our next step is to do some quality control, cutting out some of the bad estimates to arrive at our final set of estimates:

![component evaluation image](https://raw.githubusercontent.com/EricThomson/image_sandbox/main/images/evaluation_workflow.jpg)


We will evaluate each component by applying the `evaluate_components()` method. The criteria Caiman uses to evaluate components are:  

- **Signal to noise ratio (SNR)**: a baseline noise estimate is extracted for each raw calcium trace, and SNR during calcium transients is calculated relative to this baseline. These values are stored in `estimates.SNR_comp`. Components with high SNR are higher quality, and less likely to be false positives.
- **Spatial correlation**: the extracted spatial footprints in `estimates.A` should be highly correlated with activity in the actual movie, at least on those frames when that component is active. These correlation coefficients are stored in `estimates.r_values`. 
- **CNN confidence**: Each spatial component in `estimates.A` is passed through a CNN-based classifier, trained on consensus data sets, that produces a confidence value between 0 and 1 that the shape is a real neuron. These are stored in `estimates.cnn_preds`.

The first two criteria are illustrated schematically here (see also Figure 2 of <a href="https://elifesciences.org/articles/38173">the Caiman paper</a>):

![component evaluation image](https://raw.githubusercontent.com/EricThomson/image_sandbox/main/images/component_evaluation.jpg)

The `evaluate_components()` method uses the above criteria to sort components into accepted and rejected components. For each criterion, there is a threshold value in the `quality` field of the parameters object -- the thresholds are `min_SNR`, `rval_thr`, and `min_cnn_thr`, respectively. If a component is below *all* of those threshold values, it will be rejected.

In [None]:
print("Thresholds to be used for evaluate_components()")
print(f"min_SNR = {cnmf_refit.params.quality['min_SNR']}")
print(f"rval_thr = {cnmf_refit.params.quality['rval_thr']}")
print(f"min_cnn_thr = {cnmf_refit.params.quality['min_cnn_thr']}")

Run `estimates.evaluate_components()`. This can take a few minutes if you have a very large number of components:

In [None]:
cnmf_refit.estimates.evaluate_components(images, cnmf_refit.params, dview=cluster);

This method fills in two arrays in the `estimates` object: `idx_components` (accepted) and `idx_components_bad` (rejected). 

In [None]:
print(f"Num accepted/rejected: {len(cnmf_refit.estimates.idx_components)}, {len(cnmf_refit.estimates.idx_components_bad)}")

<div class="alert alert-info" markdown="1">
    <h2 style="margin-top: 0;">More on component evaluation</h2>  
<p>In practice, SNR is the most important evaluation factor. The spatial correlation factors are less important. In particular, the CNN for spatial evaluation may be inaccurate if your neural components are not "canonically" shaped somata.</p>  
    
    
You may have noticed from the description above that when running <em>evaluate_components()</em> the three evaluation thresholds are applied <em>inclusively</em>: if a component is above <em>any</em> of the thresholds, it will pass muster. This was found in practice to be reasonable (e.g., a low SNR component that is very strongly neuronally shaped tends to not be an accident: it is just a very low SNR neuron). However, there is a second set of <b>absolute</b> threshold parameters set for each criterion. If a component is <em>below</em> this absolute threshold for any of the evaluation parameters, it will be discarded: these are <em>SNR_lowest</em>, <em>rval_lowest</em>, and <em>cnn_lowest</em>, respectively. 
</div>
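
The two-tier rule described above can be written out in a few lines. This is a sketch of the logic as stated in the text, not Caiman's actual implementation: the `is_accepted()` helper, the threshold values, and the handling of values exactly at a threshold are all assumptions made for illustration.

```python
def is_accepted(snr, r_value, cnn_pred, thresholds, lowest):
    """Apply the two-tier rule described above to a single component."""
    values = {'snr': snr, 'r_value': r_value, 'cnn_pred': cnn_pred}
    # tier 1: passing any one of the main thresholds is enough...
    passes_any = any(values[k] >= thresholds[k] for k in values)
    # tier 2: ...unless some criterion falls below its absolute floor
    fails_floor = any(values[k] < lowest[k] for k in values)
    return passes_any and not fails_floor

# illustrative values, not necessarily the ones in your parameters object
thresholds = {'snr': 2.0, 'r_value': 0.85, 'cnn_pred': 0.99}
lowest = {'snr': 0.5, 'r_value': -1.0, 'cnn_pred': 0.1}

components = {
    'high SNR only': (5.0, 0.30, 0.50),
    'below every threshold': (1.0, 0.50, 0.50),
    'good shape, SNR below floor': (0.2, 0.95, 0.995),
}
for label, (snr, r_value, cnn_pred) in components.items():
    print(label, '->', is_accepted(snr, r_value, cnn_pred, thresholds, lowest))
```

The first component is accepted on SNR alone, the second is rejected because it passes nothing, and the third is rejected despite its good shape because its SNR is below the absolute floor.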

## Visualizing results
We've sorted our components into accepted/rejected, so it's time to start looking at the results! Caiman provides many built-in `estimates` methods for visualizing results. 

### All contours
We have already used `plot_contours_nb()`, but if we provide it the `idx` keyword it will split the view into accepted and rejected components.

In [None]:
cnmf_refit.estimates.plot_contours_nb(img=correlation_image, 
                                      idx=cnmf_refit.estimates.idx_components);

### View individual spatial/temporal components
One of the most useful visualization tools is `nb_view_components()`, which lets you scroll through individual spatial and temporal components. This tool also displays the values of the three evaluation criteria for each component, which can be useful if you feel you need to change your evaluation criteria and re-run `evaluate_components()`. Perhaps you have too many false negatives and want to lower your SNR threshold. 

In [None]:
# view accepted components
cnmf_refit.estimates.nb_view_components(img=correlation_image, 
                                        idx=cnmf_refit.estimates.idx_components,
                                        cmap='gray');

The above shows the raw traces (`C+YrA`) by default, but you can superimpose the denoised traces from `C` if you add a color to the `denoised_color` parameter. As always in Jupyter, if you are unsure how a method works, you can enter `cnmf_refit.estimates.nb_view_components?` in a new cell to get the documentation for the method. 

We can also view the rejected components:

In [None]:
# rejected components
if len(cnmf_refit.estimates.idx_components_bad) > 0:
    cnmf_refit.estimates.nb_view_components(img=correlation_image, 
                                            idx=cnmf_refit.estimates.idx_components_bad, 
                                            cmap='gray',
                                            denoised_color='red')
else:
    print("No components were rejected.")

> One legacy from Matlab with these plotters is that they use one-based indexing when showing `Neuron number`. This can make it confusing when comparing to your own plotting results which will use zero-based indexing.

### Building your own visualizations
> This is a slightly more advanced section that you can safely skip your first time through. 

There are many custom visualizations you can build yourself based on the estimates generated from Caiman. For example, if you wanted to plot the spatial footprint of a neuron with the contour superimposed, this information is already contained in `estimates`.  The contours of the spatial footprints are in `estimates.coordinates`, which is a list of dictionaries corresponding to each component. The dictionary includes a `coordinates` field that contains the x,y coordinates of the contour. Here we'll show how to plot this superimposed on the corresponding spatial footprint in `A`. 

The following extracts all of the contours of the accepted components into a list from the estimates (it puts them in a list because they are not all the same size):

In [None]:
idx_accepted = cnmf_refit.estimates.idx_components
all_contour_coords = [cnmf_refit.estimates.coordinates[idx]['coordinates'] for idx in idx_accepted]

Each footprint in `A` is stored as a compressed sparse column array. We can convert it to a dense array with `toarray()`, and plot the contour and footprint together:

In [None]:
idx_to_plot = 30  # position within idx_accepted; must be < len(idx_accepted)
component_contour = all_contour_coords[idx_to_plot]
component_footprint = np.reshape(cnmf_refit.estimates.A[:, idx_accepted[idx_to_plot]].toarray(), dims, order='F')
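
The `order='F'` argument in the cell above matters because the pixels in each column of `A` are flattened in column-major (Fortran) order. The toy example below uses a made-up 3 x 4 image rather than real Caiman output to show that reshaping with the wrong order scrambles the footprint:

```python
import numpy as np

toy_dims = (3, 4)  # made-up (rows, columns) field of view
footprint = np.arange(12).reshape(toy_dims)    # stand-in for a 2D footprint

flat = footprint.flatten(order='F')            # column-major, like a column of A

restored = np.reshape(flat, toy_dims, order='F')   # same order used to flatten
scrambled = np.reshape(flat, toy_dims, order='C')  # wrong order

print(np.array_equal(restored, footprint))    # True
print(np.array_equal(scrambled, footprint))   # False
```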

In [None]:
plt.figure(); 
plt.imshow(component_footprint, cmap='gray');
plt.plot(component_contour[:, 0], 
         component_contour[:, 1], 
         color='pink', 
         linewidth=2)
plt.title('Footprint/Contour');

## How to save and load results (optional)
There is a built-in `save()` method for the `cnmf` object.

> Note: when you save, you are only saving what is contained in the `cnmf` object. If you want to save other things in your workspace at the same time, you can attach them to your `estimates` object. We'll show how to do this with the correlation image which is useful to load later for plotting.

In [None]:
save_results = True
if save_results:
    save_path =  r'demo_pipeline_results.hdf5'  # or add full/path/to/file.hdf5
    cnmf_refit.estimates.Cn = correlation_image # squirrel away correlation image with cnmf object
    cnmf_refit.save(save_path)

### Loading saved results
Once you have saved the results, use the `load_CNMF()` function to load them. 

In [None]:
load_results = True
if load_results:
    save_path =  r'demo_pipeline_results.hdf5'  # or add full/path/to/file.hdf5
    cnmf_refit = cnmf.load_CNMF(save_path, 
                                n_processes=num_processors_to_use, 
                                dview=cluster)
    correlation_image = cnmf_refit.estimates.Cn
    print("Successfully loaded data.")

# A few final things
We have extracted the calcium traces `C`, spatial footprints `A`, and estimated spike counts `S`, which is the main goal with CNMF. But there are a few important things remaining.

## Extract $\Delta F/F$ values
So far our calcium traces are in arbitrary units. In the literature, it is common to report the calcium fluorescence relative to some baseline value. In Caiman the baseline is a running percentile calculated over a `frames_window` moving window. You can calculate $\Delta F/F$ using the raw traces or the denoised traces in `C` (this is toggled using the `use_residuals` argument):

In [None]:
if cnmf_refit.estimates.F_dff is None:
    print('Calculating estimates.F_dff')
    cnmf_refit.estimates.detrend_df_f(quantileMin=8, 
                                      frames_window=250,
                                      use_residuals=False);  # use denoised data
else:
    print("estimates.F_dff already defined")

The estimates object will now have an `F_dff` field, which makes it easier to compare traces across neurons/sessions. 
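
To illustrate the idea of a running-percentile baseline, here is a deliberately simplified sketch in NumPy. It is not Caiman's `detrend_df_f()` implementation, which does more than this. The `running_percentile_dff()` helper and the synthetic trace are made up for illustration.

```python
import numpy as np

def running_percentile_dff(trace, quantile_min=8, frames_window=250):
    """Simplified dF/F: (F - F0) / F0, where F0 is a running percentile of F."""
    trace = np.asarray(trace, dtype=float)
    half = frames_window // 2
    baseline = np.empty_like(trace)
    for t in range(trace.size):
        # window centered on frame t, truncated at the edges of the trace
        start, stop = max(0, t - half), min(trace.size, t + half + 1)
        baseline[t] = np.percentile(trace[start:stop], quantile_min)
    return (trace - baseline) / baseline

# synthetic trace: constant baseline of 100 with one brief transient of +50
trace = np.full(1000, 100.0)
trace[500:510] += 50.0
dff = running_percentile_dff(trace)
print(dff.max())  # 0.5
```

Because the transient is brief relative to the window, the low percentile ignores it and the baseline stays at 100, so the transient shows up as a 50% change.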

## Select only accepted components (optional)
If you want to discard rejected components (`estimates.idx_components_bad`) from the `estimates` field, you can run `select_components()`. This can be useful if you are sure you only want to focus on the accepted components for downstream analysis (e.g., to share final results with colleagues).

<div class="alert alert-warning" markdown="1">
    <h4 style="margin-top: 0;">Warning: select_components() is a destructive operation</h4>  
If you run this command, the rejected components will be removed from your <em>estimates</em> field. If you think you might want them later, you can set the <em>save_discarded_components</em> parameter to <em>True</em>. That way you can retrieve them later with the <em>restore_discarded_components()</em> method. 
</div>

In [None]:
cnmf_refit.estimates.select_components(use_object=True);

The `use_object` parameter specifies that we want to select the accepted components in `estimates.idx_components` (and remove the components in `idx_components_bad`). Alternatively, you could specify the indices of the components to keep using the `idx_components` parameter. 

## Display final results
View the final refined set of remaining components.

In [None]:
cnmf_refit.estimates.nb_view_components(img=correlation_image, 
                                        denoised_color='red',
                                        cmap='gray');

## View different result movies
### An aside on the underlying model
To understand the next two visualizations, we need to discuss the model used by Caiman a little bit. In broad terms, it is relatively simple: the CNMF algorithm models the original movie as a sum of *neural activity*, *background activity*, and *noise* (or *residual*):

    original_movie = neural_activity + background + residual
    
In this model, `neural_activity` is the product of the matrices `AC`, the spatial and temporal components we have been exploring (`estimates.A` and `estimates.C`). It is our model of the neural bits that we care about.

`background` is the model's representation of all the background activity in our movie that we wish wasn't there -- this includes stray fluorescence from the neuropil and out-of-focus neural components. This background model is also broken up into spatial and temporal components, which are in `estimates.b` (background) and `estimates.f` (fluctuations), respectively. These components are positive but otherwise unconstrained, low-dimensional models (typically rank 1 or 2, controlled by the `gnb` parameter) that capture the fluctuating background activity at very large spatial scales in the movie. 

The "noise", or residual term, is by definition, everything else not captured by the model: 

    residual = original_movie - neural_activity - background

### View denoised movie
We can rearrange the equations above to yield a "denoised" movie, which is just the original movie with the residual removed:

    denoised_movie = original_movie - residual = neural_activity + background
    
Plugging in appropriate terms from the models of neural activity and background activity (`AC` and `bf`) yields:

In [None]:
# reconstruct denoised movie
neural_activity = cnmf_refit.estimates.A @ cnmf_refit.estimates.C  # AC
background = cnmf_refit.estimates.b @ cnmf_refit.estimates.f  # bf
denoised_movie = neural_activity + background  # AC + bf

# turn into a movie object
denoised_movie = cm.movie(denoised_movie).reshape(dims + (-1,), order='F').transpose([2, 0, 1])
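
If the matrix shapes are hard to keep straight, a toy example with synthetic arrays can help. Everything below is made up (the sizes, the random values, and `toy_dims`); it only mirrors the shapes of `A`, `C`, `b`, and `f` and the decomposition `original_movie = AC + bf + residual` described above:

```python
import numpy as np

rng = np.random.default_rng(0)
toy_dims = (6, 5)                     # made-up field of view (rows, columns)
num_pixels = toy_dims[0] * toy_dims[1]
num_frames, num_components, gnb = 20, 3, 1

# synthetic stand-ins with the same shapes as the estimates
A = rng.random((num_pixels, num_components))    # spatial footprints
C = rng.random((num_components, num_frames))    # temporal traces
b = rng.random((num_pixels, gnb))               # spatial background
f = rng.random((gnb, num_frames))               # temporal background
noise = 0.01 * rng.standard_normal((num_pixels, num_frames))

original = A @ C + b @ f + noise                # pixels x frames
denoised = A @ C + b @ f                        # AC + bf
residual = original - denoised                  # recovers the noise term

# reshape (pixels x frames) into (frames, rows, columns), as in the cell above
denoised_stack = denoised.reshape(toy_dims + (-1,), order='F').transpose([2, 0, 1])
print(denoised_stack.shape)
```

Each frame of `denoised_stack` is one column of the pixels x frames matrix, reshaped in column-major order back into an image.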

View the denoised movie:

In [None]:
# press q to quit
downsampling_ratio = 0.2
denoised_movie.resize(fz=downsampling_ratio).play(gain=0.8,
                                                  q_min=30,
                                                  q_max=99, 
                                                  fr=30,
                                                  plot_text=True,
                                                  magnification=3,
                                                  backend='opencv')

### Visualize data, predicted activity, and residual
For our final visualization, we will use a built-in method `play_movie()` that shows the original movie, the predicted movie (either `AC` or `AC + bf`), and the residual. 

Viewing the residuals (what the model doesn't explain) can be extremely useful: if you end up seeing lots of neural activity in the residual movie, that means your model is leaving something important out, and you might need to go tweak some parameters. In fact, in Caiman's online algorithms, this is how we decide whether to add new neurons after initialization. If neural activity is discovered in the residual buffer, then it's time to add a new neuron to `AC`! In the offline algorithms, it just means you might need to go check your parameters. No model is perfect: there is always some residual. You have to use your judgment about whether it is worth chasing with additional fitting.

> The `play_movie()` method has an option to include the background from the model (`bf`) or not (this is the `include_bck` Boolean parameter). If you set it to `False`, it will subtract `bf` from the original movie and will only include `AC` in the middle panel. 

In [None]:
# in case you are working from loaded data, recover the raw movie
Yr, dims, num_frames = cm.load_memmap(cnmf_refit.mmap_file)
images = np.reshape(Yr.T, [num_frames] + list(dims), order='F')

In [None]:
# press q to quit (can take a while to start running)
cnmf_refit.estimates.play_movie(images, 
                                q_max=99.9, 
                                gain_res=0.5,
                                magnification=2,
                                include_bck=True,
                                use_color=True,
                                thr=0); # set to 0.1 to see contours

If you ran `select_components()` to remove rejected components, the residual movie will contain activity from those rejected bits. This doesn't mean the model is bad, it is just showing false positives you already removed.

# Where next?
The goal of Caiman is to help you extract high-quality signals from your data, and we have achieved that goal with the above steps. The next step, doing actual *analysis* of these results, is where the most interesting neuroscience happens: what are the statistical features of these signals? How do they relate to other environmental, molecular, behavioral, and neuronal features that you are studying? What kinds of visualization and analysis tools can you build around this infrastructure? If you find/publish things that help, please share them with us, as we would love to hear about them! 

# Clean up open resources
We have left a few resources open that we should take care of.

## Shut down cluster 
To free up processing resources, let's shut down the cluster in case it is still open. 

In [None]:
cm.stop_server(dview=cluster)

## Shut down logger, optionally remove log files
If you set up your logger to log to files, and you don't want to preserve them, you can delete them with the following. If you have custom log filenames, you may have to change the `log_files` pattern for the following to work. 

In [None]:
# Shut down logger (otherwise will not be able to delete it)
logging.shutdown()

In [None]:
delete_logs = True
logging_dir = cm.paths.get_tempdir() 
if delete_logs:
    # os.path.join uses the correct path separator on any operating system
    log_files = glob.glob(os.path.join(logging_dir, 'demo_pipeline*.log'))
    for log_file in log_files:
        print(f"Deleting {log_file}")
        os.remove(log_file)
else:
    print(f"If you want to inspect your logs they are in {logging_dir}")
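
A portable way to build such a file pattern is to let `os.path.join()` insert the path separator. The sketch below demonstrates the idea in a throwaway temporary directory with made-up file names, so it does not touch your real logs:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # create made-up files: two that match the pattern and one that does not
    for name in ('demo_pipeline_a.log', 'demo_pipeline_b.log', 'other.log'):
        with open(os.path.join(tmp_dir, name), 'w') as handle:
            handle.write('example\n')

    # os.path.join picks the right separator on Windows, macOS, and Linux
    pattern = os.path.join(tmp_dir, 'demo_pipeline*.log')
    matched = sorted(os.path.basename(p) for p in glob.glob(pattern))
    for name in matched:
        os.remove(os.path.join(tmp_dir, name))

    remaining = os.listdir(tmp_dir)

print(matched, remaining)
```

Only the files matching the pattern are removed; the unrelated log file is left in place.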