<a href="https://colab.research.google.com/github/HelmchenLabSoftware/Cascade/blob/master/Demo%20scripts/Calibrated_spike_inference_with_Cascade_(batch_script).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CASCADE

## Calibrated spike inference from calcium imaging data using deep networks (batch script)
Written and maintained by [Peter Rupprecht](https://github.com/PTRRupprecht) from the [Helmchen Lab](https://www.hifo.uzh.ch/en/research/helmchen.html).
The project started as a collaboration of the Helmchen Lab and the [Friedrich Lab](https://www.fmi.ch/research-groups/groupleader.html?group=119). Feedback goes to [Peter Rupprecht](mailto:p.t.r.rupprecht+cascade@gmail.com).

---



This Colaboratory notebook runs on servers in the cloud. It uses a deep-network-based algorithm for spike inference (CASCADE, described in this **[Resource Article](https://www.nature.com/articles/s41593-021-00895-5)** published in Nature Neuroscience). Here, you can test and use the algorithm without installing anything on your computer. Just **press the play buttons ("Run cell")** to the left of each box, one after another, and the code will be executed.

* If you want to **see the algorithm in action**, just execute the cells without any modifications. Enjoy!

* If you want to **upload your own data**, make predictions and download the saved files, modify the variable names and follow the instructions. Usually, little or no modification of the code is required.

* If you want to integrate CASCADE into **your local data analysis pipeline**, we suggest you take a look at the [GitHub repository](https://github.com/HelmchenLabSoftware/Calibrated-inference-of-spiking).

## 1. Download repository into the Colab Notebook


In [None]:
#@markdown The GitHub repository with all custom functions, the ground truth datasets and the pretrained models is copied to the environment of this notebook. This takes a couple of seconds.

#@markdown *Note: You can check the code underlying each cell by double-clicking on it.*

import os

# If in Colab and not yet downloaded, download GitHub repository and change working directory
if os.getcwd() == '/content':
    !git clone https://github.com/HelmchenLabSoftware/Cascade
    os.chdir('Cascade')

# If executed as jupyter notebook on own computer, change to parent directory for imports
if os.path.basename( os.getcwd() ) == 'Demo scripts':
    %cd ..
    print('New working directory:', os.getcwd() )

## 2. Import required python packages


In [None]:
#@markdown Installs the required package from the public package index and imports the packages from Cascade.

%%capture
!pip install ruamel.yaml

# standard python packages
import os, warnings
import glob
import numpy as np
import scipy.io as sio
import matplotlib.pyplot as plt
import ruamel.yaml as yaml
yaml = yaml.YAML(typ='rt')

# cascade2p packages, imported from the downloaded GitHub repository
from cascade2p import checks
checks.check_packages()
from cascade2p import cascade # local folder
from cascade2p.utils import plot_dFF_traces, plot_noise_level_distribution, plot_noise_matched_ground_truth

## 3. Define the function to load ΔF/F traces


In [None]:
#@markdown ΔF/F traces must be saved as \*.npy-files (for Python) or \*.mat-files (for Matlab/Python) as a single large matrix named **`dF_traces`** (neurons x time). ΔF/F values of the input should be given as fractions, not in percent (e.g. 0.5 instead of 50%). For different input formats, the code in this box can be modified (it's not difficult).

def load_neurons_x_time(file_path):
    """Custom method to load data as 2d array with shape (neurons, nr_timepoints)"""

    if file_path.endswith('.mat'):
      traces = sio.loadmat(file_path)['dF_traces']
      # PLEASE NOTE: If you use mat73 to load a large *.mat file, be aware of potential numerical errors, see issue #67 (https://github.com/HelmchenLabSoftware/Cascade/issues/67)

    elif file_path.endswith('.npy'):
      traces = np.load(file_path, allow_pickle=True)
      # if saved data was a dictionary packed into a numpy array (MATLAB style): unpack
      if traces.shape == ():
        traces = traces.item()['dF_traces']

    else:
      raise Exception('This function only supports .mat or .npy files.')

    print('Traces standard deviation:', np.nanmean(np.nanstd(traces,axis=1)))
    if np.nanmedian(np.nanstd(traces,axis=1)) > 2:
      print('Fluctuations in dF/F are very large, probably dF/F is given in percent. Traces are divided by 100.')
      return traces/100
    else:
      return traces
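
To illustrate the input format described above, here is a minimal sketch that writes a matrix named `dF_traces` to both supported file types and reads it back the same way the loader does. The synthetic data, the temporary folder and the file name `my_recording` are assumptions made for illustration only; they are not part of Cascade.

```python
import os
import tempfile

import numpy as np
import scipy.io as sio

# Synthetic stand-in for real data: 5 neurons x 300 time points of dF/F,
# in fractional units (e.g. 0.5 rather than 50%).
rng = np.random.default_rng(0)
dF_traces = 0.05 * rng.standard_normal((5, 300))

out_dir = tempfile.mkdtemp()  # hypothetical output location

# Option A: plain 2D array stored in a .npy file
npy_path = os.path.join(out_dir, "my_recording.npy")
np.save(npy_path, dF_traces)

# Option B: .mat file with the matrix stored under the key 'dF_traces'
mat_path = os.path.join(out_dir, "my_recording.mat")
sio.savemat(mat_path, {"dF_traces": dF_traces})

# Read both files back with the same calls the loader above uses
from_npy = np.load(npy_path, allow_pickle=True)
from_mat = sio.loadmat(mat_path)["dF_traces"]

print(from_npy.shape, from_mat.shape)
```

Both arrays come back with shape (neurons, time points), which is the layout `load_neurons_x_time` expects.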





## 4. Batch process files

In [None]:
#@markdown If you are testing the script, you can leave everything unchanged. To apply the algorithm to your own data, first upload your files, formatted and named as described in the section above, by clicking on the **folder symbol ("Files")** on the left side of the Colaboratory notebook. Next, set **`example_folder`** to the folder containing the uploaded files and **`file_pattern`** to a pattern that matches them; the asterisk (\*) is a wildcard, and every file matching the pattern will be processed. Finally, set **`frame_rate`** to the sampling rate of your recordings (in Hz).

example_folder = "Example_datasets/Allen-Brain-Observatory-Visual-Coding-30Hz/" #@param {type:"string"}

file_pattern = "Experiment_55*.mat" #@param {type:"string"}

frame_rate = 30 #@param {type:"number"}


#@markdown Select and download the model that fits your dataset (frame rate, training data; see FAQ for more details) and assign its name to the variable **`model_name`**.
model_name = "Global_EXC_30Hz_smoothing25ms_causalkernel" #@param {type:"string"}
cascade.download_model( model_name,verbose = 1)


all_file_names = glob.glob(example_folder+file_pattern)

for file_index,example_file in enumerate(all_file_names):

  try:

    traces = load_neurons_x_time( example_file )
    print('Number of neurons in dataset:', traces.shape[0])
    print('Number of timepoints in dataset:', traces.shape[1])

  except Exception as e:

    print('\nSomething went wrong!\nEither the target file is missing, in this case please provide the correct location.\nOr your file is not yet completely uploaded, in this case wait until the upload is completed.\n')

    print('Error message: '+str(e))

    # Skip this file: without successfully loaded traces there is nothing to predict
    continue

  #@markdown If this takes too long, make sure that the GPU runtime is activated (*Menu > Runtime > Change Runtime Type*).

  total_array_size = traces.itemsize*traces.size*64/1e9

  # If the expected array size is too large for the Colab Notebook, split up for processing
  if total_array_size < 10:

    spike_prob = cascade.predict( model_name, traces, verbosity=1 )

  # Will only be used for large input arrays (long recordings or many neurons)
  else:

    print("Split analysis into chunks in order to fit into Colab memory.")

    # pre-allocate array for results
    spike_prob = np.zeros((traces.shape))
    # nb of neurons and nb of chunks
    nb_neurons = traces.shape[0]
    nb_chunks = int(np.ceil(total_array_size/10))

    chunks = np.array_split(range(nb_neurons), nb_chunks)
    # infer spike rates independently for each chunk
    for part_array in range(nb_chunks):
      spike_prob[chunks[part_array],:] = cascade.predict( model_name, traces[chunks[part_array],:] )


  #@markdown By default, the predictions are saved as variable **`spike_prob`** to both a \*.mat file and a \*.npy file. You can comment out the file format that you do not need or leave it as it is.

  folder = os.path.dirname(example_folder)
  file_name = 'predictions_' + os.path.splitext( os.path.basename(example_file))[0]
  save_path = os.path.join(folder, file_name)

  # save as mat file
  sio.savemat(save_path+'.mat', {'spike_prob':spike_prob})

  # save as numpy file
  np.save(save_path, spike_prob)
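
The chunking logic in the cell above can be tried out in isolation, without running any model. The following is a minimal sketch: the helper `plan_chunks` and its parameters are hypothetical names introduced here, and the function only mirrors the size estimate and the `np.array_split` call used in the cell.

```python
import numpy as np

def plan_chunks(n_neurons, n_timepoints, itemsize=8, limit_gb=10):
    """Return a list of neuron-index arrays, one per chunk (hypothetical helper)."""
    # Same estimate as in the cell above: input size in bytes times 64, in GB
    estimated_gb = itemsize * n_neurons * n_timepoints * 64 / 1e9
    if estimated_gb < limit_gb:
        # Small enough: process all neurons in a single call
        return [np.arange(n_neurons)]
    nb_chunks = int(np.ceil(estimated_gb / limit_gb))
    # array_split tolerates neuron counts that do not divide evenly
    return np.array_split(np.arange(n_neurons), nb_chunks)

small = plan_chunks(74, 6001)         # size of the example dataset
large = plan_chunks(2000, 100000)     # made-up large recording
print(len(small), len(large))
print([len(chunk) for chunk in large])
```

Because the split is done along the neuron axis, each chunk keeps its full time course, and the per-chunk predictions can be written back into the pre-allocated `spike_prob` array by row index.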

Downloading and extracting new model "Global_EXC_30Hz_smoothing25ms_causalkernel"...
Pretrained model was saved in folder "/content/Cascade/Pretrained_models/Global_EXC_30Hz_smoothing25ms_causalkernel"
Traces standard deviation: 0.079336
Number of neurons in dataset: 74
Number of timepoints in dataset: 6001

 
The selected model was trained on 18 datasets, with 5 ensembles for each noise level, at a sampling rate of 30Hz, with a resampled ground truth that was smoothed with a causal kernel of a standard deviation of 25 milliseconds. 
 

Loaded model was trained at frame rate 30 Hz
Given argument traces contains 74 neurons and 6001 frames.
Noise levels (mean, std; in standard units): 0.93, 0.17





Predictions for noise level 2:




	... ensemble 0
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step
	... ensemble 1
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step
	... ensemble 2
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step
	... ensemble 3
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step
	... ensemble 4
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step

Predictions for noise level 3:
	No neurons for this noise level

Predictions for noise level 4:
	No neurons for this noise level

Predictions for noise level 5:
	No neurons for this noise level

Predictions for noise level 6:
	No neurons for this noise level

Predictions for noise level 7:
	No neurons for this noise level

Predictions for noise level 8:
	No neurons for this noise level

Predictions for noise level 9:
	No neurons for this noise level
Spike rate inference done.
Traces standard deviation: 0.079336
Number of neurons in datase




Predictions for noise level 2:




	... ensemble 0
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step
	... ensemble 1
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step
	... ensemble 2
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step
	... ensemble 3
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step
	... ensemble 4
[1m434/434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step

Predictions for noise level 3:
	No neurons for this noise level

Predictions for noise level 4:
	No neurons for this noise level

Predictions for noise level 5:
	No neurons for this noise level

Predictions for noise level 6:
	No neurons for this noise level

Predictions for noise level 7:
	No neurons for this noise level

Predictions for noise level 8:
	No neurons for this noise level

Predictions for noise level 9:
	No neurons for this noise level
Spike rate inference done.
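
Once the batch has finished, the saved files can be read back for further analysis. The sketch below shows one possible way to do this with a synthetic array in place of real predictions. The file name `predictions_example`, the temporary folder and the NaN padding at the trace edges are assumptions made for illustration; inspect your own output before relying on them.

```python
import os
import tempfile

import numpy as np
import scipy.io as sio

# Synthetic stand-in for a prediction array (neurons x time). The NaN
# padding at both edges is an assumption made for this illustration.
spike_prob = np.full((3, 200), 0.02)
spike_prob[:, :32] = np.nan
spike_prob[:, -32:] = np.nan

# Save in both formats, as the batch cell does (hypothetical file name)
save_path = os.path.join(tempfile.mkdtemp(), "predictions_example")
sio.savemat(save_path + ".mat", {"spike_prob": spike_prob})
np.save(save_path, spike_prob)  # np.save appends '.npy' to the name

# Load the predictions back from either format
loaded_npy = np.load(save_path + ".npy")
loaded_mat = sio.loadmat(save_path + ".mat")["spike_prob"]

# NaN-aware per-neuron summary, so missing values do not propagate
mean_per_neuron = np.nanmean(loaded_npy, axis=1)
print(loaded_npy.shape, loaded_mat.shape, mean_per_neuron)
```

Using NaN-aware functions such as `np.nanmean` keeps the summary meaningful even if some time points carry no prediction.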
