# “Software” Requirement Specifications
The package evaluates the predictions of different connectivity models and plots evaluation measures. It should provide interfaces that let the user read and write data in MATLAB (.mat) format and in gifti files.

## Modules

### Data Integration Module
Should be able to work with MATLAB’s .mat files and gifti files.
* Read in MATLAB’s .mat file
    * Model’s input data used for prediction. Can be a structure with all the possible input data:
        * Activity profiles
        * Time series
            * Predicted
            * Residual
            * Raw
    * <font color = 'red'>Model’s output data structures for models created in MATLAB.<font color = 'black'>
* Read in gifti formats to create cortical and cerebellar maps.

#### How to load MATLAB structures (saved as .mat files) into Python?
* Use the MATLAB engine for Python. Follow the instructions at https://www.mathworks.com/help/matlab/matlab_external/install-the-matlab-engine-for-python.html
* The mat73 package seems to work for .mat files saved with '-v7.3'
    * Pros: Works well for -v7.3
    * Cons: Works __only__ for version 7.3
* scipy.io.loadmat reads .mat files saved with versions before -v7.3
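
The two readers listed above can be combined behind one function. The following is a minimal sketch, not the package's `matImport`: the names `mat_file_version` and `load_mat` are hypothetical, and it assumes the file version can be read from the text header that MAT-files from v5 onwards begin with (for example `MATLAB 7.3 MAT-file`). The third-party readers are imported lazily so that only the one that is needed has to be installed.

```python
from pathlib import Path


def mat_file_version(path):
    """Return 'v7.3' for HDF5-based MAT-files and 'pre-v7.3' otherwise.

    Assumption: the file starts with a text header such as
    'MATLAB 5.0 MAT-file' or 'MATLAB 7.3 MAT-file'.
    """
    with open(path, "rb") as f:
        header = f.read(128)
    return "v7.3" if header.startswith(b"MATLAB 7.3") else "pre-v7.3"


def load_mat(path):
    """Load a .mat file with whichever reader supports its version."""
    path = Path(path)
    if mat_file_version(path) == "v7.3":
        import mat73  # third-party package, needed only for v7.3 files
        return mat73.loadmat(str(path))
    from scipy.io import loadmat
    # These options return structs as objects with attribute access,
    # which makes nested structures easier to walk.
    return loadmat(str(path), squeeze_me=True, struct_as_record=False)
```

Checking the header first avoids relying on the error that `scipy.io.loadmat` raises when it meets a v7.3 file.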
    
#### List of functions:
<font color = 'green'>__matImport__:<font color = 'black'>
* Import mat files saved with v7.3 or before
* Cannot import SPM.mat files saved with v7.3
* Can handle nested structures
    
<font color = 'green'>__giftiImport__:<font color = 'black'>
* Import gifti files
* Uses nibabel
* Not yet clear whether we need it.
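
If gifti import turns out to be needed, reading a file with nibabel could look roughly like this. It is a sketch under assumptions: `gifti_to_matrix` and `gifti_import` are hypothetical names, and the file is assumed to hold one data array per map, all of the same length.

```python
import numpy as np


def gifti_to_matrix(img):
    """Stack the data arrays of a gifti image into a (vertices x maps) matrix."""
    return np.column_stack([darray.data for darray in img.darrays])


def gifti_import(path):
    """Read a gifti file and return its data arrays as one matrix."""
    import nibabel as nib  # third-party package, imported only when used
    return gifti_to_matrix(nib.load(path))
```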

### Preparation Module
Prepares input data for the modelling and evaluation pipelines.

Check the comment section of each function for a description of the nested dictionaries' hierarchy.

#### List of functions:
<font color = 'green'>__get_data__:<font color = 'black'>
* Gets and prepares data for modelling
* Returns a nested dictionary:
    * B_dict{experiment:{subject:{session:B_sess} }}
* Saves B_dict
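
To make the hierarchy concrete, here is a small sketch of such a nested dictionary and of saving and reloading it with pickle. The keys `"sc1"` and `"s02"`, the session numbers, the values, the file name, and the helper names `save_dict` and `load_dict` are all made up for illustration. Nested lists stand in for the per-session `B_sess` arrays.

```python
import pickle
from pathlib import Path

# Hypothetical contents: experiment -> subject -> session -> B_sess
B_dict = {
    "sc1": {
        "s02": {
            1: [[0.1, 0.2], [0.3, 0.4]],
            2: [[0.5, 0.6], [0.7, 0.8]],
        },
    },
}


def save_dict(obj, path):
    """Serialize obj to path with pickle."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_dict(path):
    """Load a pickled object from path."""
    with open(path, "rb") as f:
        return pickle.load(f)


out_file = Path("B_dict_demo.dat")
save_dict(B_dict, out_file)
restored = load_dict(out_file)
B_sess = restored["sc1"]["s02"][1]  # data for one session
out_file.unlink()  # remove the demo file again
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.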
    
<font color = 'green'>__get_wcon__:<font color = 'black'>
* Uses the data saved by get_data, together with the text file created, to prepare the data for modelling

##### Package dependencies:
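* pickle: for saving and loading the saved .dat files. 
    * pickle is part of the Python standard library, so no installation is needed. The pickle-mixin package on PyPI is a separate project and is not required for `import pickle`.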
    

### Modelling Module
Implements different models:
* Ridge regression
* Principal component regression
* Partial least squares (projection to latent structures) regression
* Simultaneous parameter learning and biclustering for multi-response models: https://www.frontiersin.org/articles/10.3389/fdata.2019.00027/full


#### List of functions:
<font color = 'green'>__connect_fit__:<font color = 'black'>
* Used for fitting different models
* Uses sklearn for now
* Also calculates R2, R, R2_vox, R_vox
* Still under construction. Things to do:
    * Add the simultaneous parameter estimation and biclustering method
    * Modify the PLS regression method to model "non-one-to-one" connections
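
The four fit measures could be computed along the following lines. The definitions are assumptions, since this document does not spell them out: R2 is taken as one minus the ratio of residual to total sum of squares, R as the Pearson correlation between observed and predicted data, and the `_vox` variants as the same quantities per column (voxel). The function name `prediction_metrics` is hypothetical.

```python
import numpy as np


def prediction_metrics(Y, Y_pred):
    """Return pooled and per-voxel (column-wise) R2 and R."""
    Yc = Y - Y.mean(axis=0)            # centred observations
    Pc = Y_pred - Y_pred.mean(axis=0)  # centred predictions
    sse = np.sum((Y - Y_pred) ** 2, axis=0)  # residual sum of squares per voxel
    sst = np.sum(Yc ** 2, axis=0)            # total sum of squares per voxel
    ssp = np.sum(Pc ** 2, axis=0)            # sum of squares of the predictions
    spy = np.sum(Yc * Pc, axis=0)            # cross-product per voxel
    return {
        "R2_vox": 1 - sse / sst,
        "R_vox": spy / np.sqrt(sst * ssp),
        "R2": 1 - sse.sum() / sst.sum(),
        "R": spy.sum() / np.sqrt(sst.sum() * ssp.sum()),
    }


# Small made-up example: 3 observations, 2 voxels.
Y = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 7.0]])
Y_pred = np.array([[1.2, 2.5], [1.9, 4.1], [2.8, 6.0]])
metrics = prediction_metrics(Y, Y_pred)
```

A voxel with constant data or constant predictions gives a zero denominator, so a real implementation would need to handle that case.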
    
<font color = 'green'>__model_fit__:<font color = 'black'>
* Uses connect_fit to fit models to data for each subject
* Saves a dictionary with all the model info
    
##### Package dependencies:
* sklearn

### Essentials Module
Will contain functions used in other modules (preparation, modelling, and evaluation). This list will be updated.

#### List of functions:
<font color = 'green'>__indicatorMatrix__:<font color = 'black'>
* Translated from MATLAB (indicatorMatrix.m) into Python
* Still under construction
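
As an illustration of what an indicator matrix is, the sketch below covers only the simplest case, a one-hot coding of a label vector. It is not a translation of indicatorMatrix.m, and the name `indicator_matrix` is hypothetical.

```python
import numpy as np


def indicator_matrix(labels):
    """Return an (N x K) one-hot matrix for N labels with K unique values.

    Columns follow the sorted order of the unique labels.
    """
    labels = np.asarray(labels).ravel()
    classes, idx = np.unique(labels, return_inverse=True)
    Z = np.zeros((labels.size, classes.size))
    Z[np.arange(labels.size), idx] = 1  # one entry per row
    return Z


Z = indicator_matrix([1, 2, 1, 3])
```

Each row of `Z` has a single 1 in the column of its label, so `Z.T @ Y` sums the rows of a data matrix `Y` per label.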

### Evaluation Module
Uses test data __(usually from sc2)__ and the model structure data to evaluate the model's predictions. 
Cross-validation: calculates R2cv, Rcv, R2, R, and the voxel-level measures, plus noise ceilings and sparsity measures.
* Cross-validation within each subject:
    * Sessions as folds
    * Studies as folds
* Cross-validation with each subject as a fold, in case we decide to do the modelling at the group level.
* Evaluation measures:
    * Using all the conditions:
        * Double cross-validated predictive correlation
        * Not double-cross-validated predictive correlation
        * Reliability of data 
        * Reliability of predictions
        * Sparseness measures
        * Upper noise ceiling
        * Lower noise ceiling
        * RDMs with the predicted data!
    * Using the shared conditions only:
        * Double cross-validated predictive correlation
        * Not double-cross-validated predictive correlation
        * Reliability of data 
        * Reliability of predictions
        * Sparseness measures
        * Upper noise ceiling
        * Lower noise ceiling
        * RDMs with the predicted data!
* Make an integrated data structure with the data for all the models and all the subjects
* Evaluation plots:
    * Model evaluation measures vs parameters
    * Cerebellar maps for voxel-wise measures.
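
The fold schemes listed above (sessions, studies, or subjects as folds) all hold out one group of observations at a time. A minimal pure-Python sketch of that idea follows. The function name and the session labels are hypothetical.

```python
def leave_one_group_out(groups):
    """Yield (held_out, train_idx, test_idx), holding out one group per fold."""
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical session label for each of six observations.
sessions = [1, 1, 1, 2, 2, 2]
folds = list(leave_one_group_out(sessions))
```

Passing study or subject labels instead of session labels gives the other schemes. sklearn's `LeaveOneGroupOut` in `sklearn.model_selection` provides the same kind of split.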
    
#### List of functions:
<font color = 'green'>__evaluate_model__:<font color = 'black'>
* Either crossed or uncrossed evaluation
* Still under construction. Things to do:
    * Use X'X to weight 
    * More flexible coding to use subsets of conditions
    * Plots and maps
    
<font color = 'green'>__evaluate_pipeline__:<font color = 'black'>
* Evaluates multiple models for multiple subjects
    
<font color = 'green'>__eval_df__:<font color = 'black'>
* Creates a dataframe of all the available modelling and evaluation info
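
One possible way to build such a dataframe is to flatten the nested results into one row per model and subject. This is a sketch under assumptions: the layout `{model: {subject: {measure: value}}}`, the function names, and the numbers are invented for illustration, and pandas is assumed as the dataframe library.

```python
def flatten_results(results):
    """Turn {model: {subject: {measure: value}}} into a list of flat rows."""
    rows = []
    for model, by_subject in results.items():
        for subject, measures in by_subject.items():
            rows.append({"model": model, "subject": subject, **measures})
    return rows


def to_dataframe(rows):
    """Convert the flat rows into a pandas DataFrame."""
    import pandas as pd  # third-party package, imported only when used
    return pd.DataFrame(rows)


# Hypothetical evaluation output for two models and one subject.
results = {
    "ridge": {"s02": {"R": 0.41, "R2": 0.17}},
    "pls": {"s02": {"R": 0.38, "R2": 0.14}},
}
rows = flatten_results(results)
```

A long format with one row per model and subject is convenient for plotting evaluation measures against model parameters.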
    
#### Package dependencies:
* sklearn
* pickle


# Data types for saving
In the pipeline, we will need to save some variables. 
* What is the proper type to save the variables? Dictionaries maybe? 
* What package do we need for saving?
    * pickle is one of the recommended packages. It ships with the Python standard library, so no pip install is needed.