
Implementation of reconstruction error? #37

Closed · emdupre opened this issue Dec 2, 2019 · 7 comments

@emdupre (Collaborator) commented Dec 2, 2019

Thanks again for making this toolkit!

To understand it a little better, I'm trying to reproduce Figure 4a from your recent paper introducing fmralign; you can see the implementation I have here, or launch it remotely with Binder.

I can't seem to get my distributions for ridge or scaled orthogonal alignment to match what you have in Figure 4a. I think the most likely explanation is simply that I'm not implementing the formula for reconstruction error appropriately.

Do you have an implementation of this already available?

@thomasbazeille (Collaborator) commented Dec 13, 2019

Hello Elizabeth, thanks a lot for your tutorials and for the good opportunity to reproduce the results of the paper, and sorry in advance for the long message. This answer includes:

  1. Feedback on your notebook code.
  2. A figure with the results I get with my reimplementation.
  3. The full replication code (sorry, I didn't have time to make it a proper tutorial).

Don't hesitate to ask if you have any questions, or if you have ideas for improving the documentation based on what seemed unclear to you.

Feedback after running your code:

  • The important difference is the experimental set-up. For one pair of subjects, we use all "ap" conditions to learn one common alignment that we then apply to every "pa" condition, whereas you tried to learn one alignment per condition. As a result, ridge fails (its cross-validation scheme requires at least as many images as folds), and the results you get with all methods are likely spurious.

  • The metrics are voxelwise and should be calculated across several samples (here, conditions). Your implementation of the reconstruction error and ratio matches mine at the average level, but importantly, both should be computed per voxel, across several samples.

  • For plotting, you should therefore get roughly 40,000 values for each subject. The violin plots we made are then over all the concatenated voxels of the target subjects, each point representing one of the 20 × 40,000 voxels (see the short sketch after this list).
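
To make the shapes concrete, here is a toy sketch of that scoring logic, using random data and made-up dimensions (these are illustrative stand-ins, not outputs from fmralign):

import numpy as np

# Made-up dimensions standing in for the real ones:
# 20 pairs, one sample per "pa" condition, ~40000 masked voxels.
n_pairs, n_conditions, n_voxels = 20, 5, 40000

per_pair_scores = []
for _ in range(n_pairs):
    y_true = np.random.rand(n_conditions, n_voxels)  # masked target data
    y_pred = np.random.rand(n_conditions, n_voxels)  # aligned prediction
    # one score per voxel, computed across the condition axis
    err = ((y_true - y_pred) ** 2).sum(axis=0) / (y_true ** 2).sum(axis=0)
    per_pair_scores.append(1 - err)  # shape: (n_voxels,)

# the violin plot is drawn over all concatenated voxels: 20 * 40000 values
all_scores = np.concatenate(per_pair_scores)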

Side notes:

  • Before using the BASC atlas to cluster, you forgot to resample it to the data, so maybe that's what caused you trouble (a resampling sketch follows these notes). For my implementation I stuck with automatic k-means clustering (even though the paper used hierarchical k-means).

  • If you provide the PairwiseAlignment class with a masker built from the mask that comes with the data, fmralign will avoid recalculating it.
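
For the resampling step, here is a minimal sketch, assuming the BASC atlas fetched through nilearn and an arbitrary functional image to define the target grid (the file path and atlas scale below are placeholders, not values from the notebook):

from nilearn.datasets import fetch_atlas_basc_multiscale_2015
from nilearn.image import load_img, resample_to_img

# placeholder: any functional image from the dataset defines the target grid
func_img = load_img("sub-01_ap.nii.gz")
# placeholder scale: the returned Bunch also exposes scale036, scale064, etc.
atlas_img = load_img(fetch_atlas_basc_multiscale_2015().scale122)

# labels must be resampled with nearest-neighbour interpolation so parcel
# ids stay integers instead of being blurred across parcel boundaries
atlas_resampled = resample_to_img(atlas_img, func_img, interpolation="nearest")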

With these changes I obtain results that are fairly comparable with the paper (code below).

[Screenshot of replication results, 2019-12-13]

The full implementation includes:

  • a reconstruction error function with a scikit-learn API;
  • the same parameters as the paper where possible.
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import r2_score
from nilearn.input_data import NiftiMasker
from fmralign.pairwise_alignment import PairwiseAlignment
from fmralign.fetch_example_data import fetch_ibc_subjects_contrasts


def score_table(loss, X_gt, X_pred, multioutput='raw_values'):
    if loss == "r2":
        score = r2_score(X_gt, X_pred, multioutput=multioutput)
    elif loss == "n_reconstruction_err":
        score = normalized_reconstruction_error(
            X_gt, X_pred, multioutput=multioutput)
    elif loss == "corr":
        score = np.array([pearsonr(X_gt[:, vox], X_pred[:, vox])[0]
                          for vox in range(X_pred.shape[1])])
    else:
        raise NameError(
            "Unknown loss, it should be 'r2', 'n_reconstruction_err' or 'corr'")
    # clip very poor scores at -1 so a few bad voxels do not dominate the plots
    return np.maximum(score, -1)


def normalized_reconstruction_error(y_true, y_pred, sample_weight=None, multioutput="uniform_average"):
    """Like r2_score, but normalized by the uncentered sum of squares
    of y_true rather than by its variance."""
    if y_true.ndim == 1:
        y_true = y_true.reshape((-1, 1))

    if y_pred.ndim == 1:
        y_pred = y_pred.reshape((-1, 1))

    if sample_weight is not None:
        weight = sample_weight[:, np.newaxis]
    else:
        weight = 1.

    numerator = (weight * (y_true - y_pred) ** 2).sum(axis=0, dtype=np.float64)
    denominator = (weight * (y_true) ** 2).sum(axis=0, dtype=np.float64)
    nonzero_denominator = denominator != 0
    nonzero_numerator = numerator != 0
    valid_score = nonzero_denominator & nonzero_numerator
    output_scores = np.ones([y_true.shape[1]])
    output_scores[valid_score] = 1 - (numerator[valid_score] /
                                      denominator[valid_score])
    output_scores[nonzero_numerator & ~nonzero_denominator] = 0

    if multioutput == 'raw_values':
        # return scores individually
        return output_scores
    elif multioutput == 'uniform_average':
        # passing None as weights results in a uniform mean
        avg_weights = None
    elif multioutput == 'variance_weighted':
        avg_weights = (weight * (y_true - np.average(y_true, axis=0,
                                                     weights=sample_weight)) ** 2).sum(axis=0, dtype=np.float64)
        # avoid fail on constant y or one-element arrays
        if not np.any(nonzero_denominator):
            if not np.any(nonzero_numerator):
                return 1.0
            else:
                return 0.0
    else:
        avg_weights = multioutput

    return np.average(output_scores, weights=avg_weights)


def reconstruction_ratio(aligned_error, identity_error):
    """
    Calculates the reconstruction ratio as
    defined by Bazeille and colleagues (2019):
    1 - (1 - aligned_error) / (1 - identity_error).

    A value greater than 0 indicates that
    voxels are predicted better by aligned data
    than by raw data.

    Parameters
    ----------
    aligned_error : float64
        The reconstruction error from a given
        functional alignment method
    identity_error : float64
        The reconstruction error from predicting
        the target subject as the source subject
    """
    num = 1 - aligned_error
    den = 1 - identity_error
    return 1 - (num / den)
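
# Quick sanity check with made-up numbers (not values from the paper):
# a normalized reconstruction score of 0.8 after alignment vs 0.6 for
# identity means the error term drops from 0.4 to 0.2, so the ratio is
# 1 - 0.2 / 0.4 = 0.5 > 0, i.e. alignment helps.
assert np.isclose(reconstruction_ratio(0.8, 0.6), 0.5)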


def align_score_one_pair(df, source_sub, target_sub, method, n_pieces, masker, n_jobs):
    # load the ap and pa conditions for this pair of subjects
    source_train = df[(df.subject == source_sub) &
                      (df.acquisition == 'ap')].path.values
    target_train = df[(df.subject == target_sub) &
                      (df.acquisition == 'ap')].path.values
    source_test = df[(df.subject == source_sub) &
                     (df.acquisition == 'pa')].path.values
    target_test = df[(df.subject == target_sub) &
                     (df.acquisition == 'pa')].path.values

    # define alignment
    alignment_estimator = PairwiseAlignment(
        alignment_method=method, n_pieces=n_pieces, mask=masker, n_jobs=n_jobs)
    # fit estimator and predict missing data
    alignment_estimator.fit(source_train, target_train)
    target_pred = alignment_estimator.transform(source_test)
    # score the prediction and the raw source data against the target
    reconstruction_err = score_table(
        "n_reconstruction_err", masker.transform(target_test),
        masker.transform(target_pred), multioutput='raw_values')
    id_err = score_table(
        "n_reconstruction_err", masker.transform(target_test),
        masker.transform(source_test), multioutput='raw_values')
    # calculate the reconstruction ratio
    ratio = reconstruction_ratio(reconstruction_err, id_err)
    return reconstruction_err, ratio

files, df, mask = fetch_ibc_subjects_contrasts(subjects="all")

# Pairs used for the paper
pairs_used = [(12, 8), (2, 5), (2, 8), (3, 11), (3, 1), (4, 6), (7, 10),
              (9, 1), (6, 3), (7, 9), (10, 1), (5, 12), (7, 4), (8, 6),
              (2, 11), (3, 5), (11, 7), (4, 12), (9, 10)]
subjects = df.subject.unique()
n_jobs = 10

# In the paper, data were clustered automatically with hierarchical k-means;
# here let's use kmeans, the closest algorithm available in fmralign, with 200 pieces

n_pieces = 200
clustering = "kmeans"
methods = ["scaled_orthogonal", "ridge_cv", "optimal_transport"]

# define masker
masker = NiftiMasker(mask_img=mask)
masker.fit()

# for each method, calculate the reconstruction error and reconstruction
# ratio after alignment for all pairs, and store them in scores
scores = {}
for method in methods:
    method_err, method_ratio = [], []
    for pair in pairs_used:
        source_sub, target_sub = subjects[pair[0]], subjects[pair[1]]
        reconstruction_err, ratio = align_score_one_pair(
            df, source_sub, target_sub, method, n_pieces, masker, n_jobs)
        method_err.append(reconstruction_err)
        method_ratio.append(ratio)
    scores["{}_reconstruction_err".format(method)] = method_err
    scores["{}_reconstruction_ratio".format(method)] = method_ratio

# recover all reconstruction ratios and flatten their values across subjects

reconstruction_ratios = []
for method in methods:
    reconstruction_ratios.append(np.asarray(
        scores["{}_reconstruction_ratio".format(method)]).flatten())

# plot the distribution of scores for each method
import matplotlib.pyplot as plt
plt.figure(figsize=(5, 5))
plt.violinplot(reconstruction_ratios, vert=False, showextrema=False,
               showmeans=False, showmedians=True, points=800, widths=0.8)
plt.tick_params(axis='y', which='both', labelleft='on',
                labelright='off', labelsize=20)
plt.xlim(left=-1, right=1)
plt.xlabel(r'$R_{\eta^2}$', fontsize=24, rotation="horizontal")
plt.axvline(x=0, color='dimgrey')
plt.tight_layout()
plt.yticks(range(1, len(methods) + 1), methods)

@emdupre (Collaborator, Author) commented Dec 19, 2019

This has been super helpful, thank you!! I've updated my code based on your feedback 😸 Very much appreciated!

I do still have one small problem I was hoping you could help me track down. I ran my adaptation of your code (as well as the exact code you included above), and I'm getting slightly different results for ridge_cv and very different results for optimal_transport:

[Figure: replication attempt with diverging ridge_cv and optimal_transport distributions]

I think these differences are likely coming from one of two places: either (1) my running without subjects 3 and 10 (since these aren't distributed on OSF), or (2) a difference in dependency versions. Here's what I'm currently running the replication with:

astroid==2.3.1
atomicwrites==1.3.0
attrs==19.3.0
autopep8==1.4.4
backcall==0.1.0
bleach==3.1.0
bokeh==1.4.0
brainiak==0.9.1
bz2file==0.98
certifi==2019.9.11
cycler==0.10.0
Cython==0.29.13
decorator==4.4.0
defusedxml==0.6.0
entrypoints==0.3
-e git+https://github.com/emdupre/fmralign@9d4884a31908e727caf15c8236c8bb84f880d327#egg=fmralign
h5py==2.10.0
importlib-metadata==0.23
ipdb==0.12.2
ipykernel==5.1.3
ipython==7.8.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
isort==4.3.21
jedi==0.15.1
Jinja2==2.10.3
joblib==0.13.2
jsonschema==3.1.1
jupyter==1.0.0
jupyter-client==5.3.3
jupyter-console==6.0.0
jupyter-core==4.5.0
jupytext==1.2.4
kiwisolver==1.1.0
lazy-object-proxy==1.4.2
Mako==1.1.0
MarkupSafe==1.1.1
matplotlib==3.1.1
mccabe==0.6.1
mistune==0.8.4
more-itertools==7.2.0
mpi4py==3.0.2
nb-conda==2.2.1
nb-conda-kernels==2.2.2
nbconvert==5.6.0
nbformat==4.4.0
networkx==2.4
nibabel==2.5.1
nilearn==0.6.0a0
nitime==0.8.1
notebook==6.0.1
numpy==1.16.4
packaging==19.2
pandas==0.25.2
pandocfilters==1.4.2
parso==0.5.1
patsy==0.5.1
pexpect==4.7.0
pickleshare==0.7.5
Pillow==6.2.1
pluggy==0.13.0
POT==0.6.0
prometheus-client==0.7.1
prompt-toolkit==2.0.10
psutil==5.6.3
ptvsd==4.3.2
ptyprocess==0.6.0
py==1.8.0
pybind11==2.4.2
pycodestyle==2.5.0
pydicom==1.3.0
Pygments==2.4.2
pygpu==0.7.6
pylint==2.4.2
pymanopt==0.2.4
pyparsing==2.4.2
pyrsistent==0.15.4
pytest==5.2.1
python-dateutil==2.8.0
pytz==2019.3
PyYAML==5.1.2
pyzmq==18.1.0
qtconsole==4.5.5
scikit-learn==0.21.3
scipy==1.2.1
seaborn==0.9.0
Send2Trash==1.5.0
six==1.12.0
statsmodels==0.10.1
terminado==0.8.2
testpath==0.4.2
Theano==1.0.4
tornado==6.0.3
traitlets==4.3.3
wcwidth==0.1.7
webencodings==0.5.1
widgetsnbextension==3.5.1
wrapt==1.11.2
zipp==0.6.0
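
If it helps to compare environments, here is a small sketch (nothing fmralign-specific, just standard importlib) that prints only the versions most likely to matter for this comparison; note that POT, which provides optimal transport, imports as ot:

import importlib

# packages most likely to change the alignment results
for pkg in ["numpy", "scipy", "sklearn", "nilearn", "ot", "fmralign"]:
    try:
        mod = importlib.import_module(pkg)
        print(pkg, getattr(mod, "__version__", "unknown"))
    except ImportError:
        print(pkg, "not installed")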

Does anything stand out to you?

EDIT: Or, option (3), I'm doing something else wrong entirely! Any and all feedback is welcome here.

Thanks so much again!

@thomasbazeille (Collaborator) commented Dec 19, 2019

This figure comes from the code above?! 😱

@emdupre (Collaborator, Author) commented Dec 19, 2019

This figure comes from the code above?! 😱

My running of it 😅

I did add two missing imports so it would run (from nilearn.input_data import NiftiMasker and from fmralign.pairwise_alignment import PairwiseAlignment); otherwise, I believe it's been run exactly as written.

emdupre mentioned this issue Jan 6, 2020
@emdupre (Collaborator, Author) commented Jan 6, 2020

OK! I can now get the correct figure:

[Figure: rerun of the replication, now matching the paper]

It's clearly something in my dependencies. I'll keep looking into this and report back, sorry!

@thomasbazeille (Collaborator) commented

Glad you found it, I was about to look into this!

It's a bit worrying that these results can change so much based on dependencies alone.

And it's mostly, but not only, optimal transport: all methods seem to yield different results, even though all the front-end libraries seemed to be up to date. I'm a bit out of ideas here. @bthirion, maybe?

@thomasbazeille (Collaborator) commented

OK, #40 was merged as a consequence of this issue. The dependency question stays open, but I'm not sure we'll debug that quickly; I'll get back to it when I have time. I'll close this in the meantime.

Thanks a lot for the PR!
