<a href="https://colab.research.google.com/github/yilewang/TVB_Demo/blob/master/The_contrast_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This is a demo of contrast analysis, a statistical tool for testing trends across groups. It builds on ANOVA but lets the researcher assign custom contrast weights to the groups. This demo covers only the post hoc (a posteriori) test of a contrast, using Scheffé's test. For details, see Dr. Abdi's paper: https://personal.utdallas.edu/~herve/abdi-contrasts2010-pretty.pdf
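
For reference, the quantities computed in this demo can be written out as below. The notation is an assumption chosen for this note: $M_a$, $s_a^2$, $n_a$, and $C_a$ are the mean, sample variance, size, and contrast weight of group $a$, $A$ is the number of groups, and $N$ is the total number of cases.

$$
\psi = \sum_{a=1}^{A} C_a M_a, \qquad
MS_{\psi} = \frac{\psi^2}{\sum_{a=1}^{A} C_a^2 / n_a}, \qquad
MS_{error} = \frac{\sum_{a=1}^{A} (n_a - 1)\, s_a^2}{N - A}, \qquad
F_{\psi} = \frac{MS_{\psi}}{MS_{error}}
$$

$$
F_{\text{critical, Scheff\'e}} = (A - 1)\, F_{\alpha;\, A-1,\, N-A}
$$

The second line is the standard form of the Scheffé criterion, which uses the omnibus critical value with $A - 1$ numerator degrees of freedom. The `contrast_analysis` function in this demo calls `scipy.stats.f.ppf` with `dfn=1`, so its threshold differs from this form.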

In [1]:
#!/usr/bin/env python

import numpy as np
import scipy.stats
import pandas as pd
"""
The contrast analysis used for group comparison

Author: Yile Wang
Date: 08/17/2021
"""


In [4]:
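def contrast_analysis(datatable, contrast, group_variable = "group"):
    """
    Run a contrast analysis on every feature column of a DataFrame.

    Args:
        datatable: pandas DataFrame with the group labels and the features
        contrast: contrast weights, one per group, in the sorted order of
            the group labels (the order that groupby returns)
        group_variable: name of the column that holds the group labels
    Returns:
        pandas DataFrame with the F value and the Scheffe's test result
        for each feature

    For this dataset, there should be four groups: SNC, NC, MCI, AD.
    """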

    # number of groups and number of cases in each group
    num_group = len(contrast)
    num_cases = datatable.groupby(group_variable).size().to_numpy()

    F_table = pd.DataFrame(columns=['features','F_value', 'P_value'])

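    # skip the first three columns (assumed to be non-feature columns)
    for col in datatable.columns[3:]:

        # group means and their weighted sum (the contrast value)
        # selecting the column first avoids averaging non-numeric columns
        mean_array = datatable.groupby(group_variable)[col].mean().to_numpy()
        meanNcontrast = np.dot(mean_array, contrast)
        contrast2 = np.square(contrast)

        # group variances (sample variance, ddof=1)
        var_array = datatable.groupby(group_variable)[col].var().to_numpy()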
        # error degrees of freedom: total number of cases minus number of groups
        denominator = sum(num_cases) - num_group
        # degrees of freedom within each group
        num_cases_df = num_cases - 1

        # error sum of squares and mean square error
        SSE = np.dot(var_array, num_cases_df)
        MSE = SSE/denominator
        tmp_ms_contrast = sum(contrast2/num_cases)

        # mean square of the contrast (1 degree of freedom) and its F ratio
        MS_contrast = (meanNcontrast**2) / tmp_ms_contrast
        F_value = MS_contrast/MSE

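        # critical F at alpha = 0.05
        F_critical = scipy.stats.f.ppf(q=1-0.05, dfn=1, dfd=denominator)

        # post hoc contrast: compare F against the Scheffe critical value
        # NOTE: the standard Scheffe criterion uses the omnibus critical F
        # (dfn = num_group - 1); this code uses dfn=1
        scheffe = F_critical * (num_group-1)
        if F_value >= scheffe:
            p = 0.05  # significant at alpha = 0.05 (a flag, not an exact p-value)
        else:
            p = 'NA'  # not significant

        print(f"The {col} contrast has F_value {F_value}, and the F_critical Scheffe's Test is {scheffe}")
        # DataFrame.append was removed in pandas 2.0; add the row with .loc instead
        F_table.loc[len(F_table)] = [col, F_value, p]
    return F_table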
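
The arithmetic inside the loop can be checked by hand on a tiny data set. The sketch below is not part of the original notebook: `contrast_f` and `toy_groups` are hypothetical names, the numbers are made up, and only the standard library is used so that each step stays visible.

```python
from statistics import mean, variance


def contrast_f(groups, weights):
    """Return the contrast F ratio for a list of groups (lists of numbers)."""
    if len(groups) != len(weights):
        raise ValueError("need exactly one weight per group")
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")

    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    variances = [variance(g) for g in groups]  # sample variance (n - 1)

    # contrast value: weighted sum of the group means
    psi = sum(w * m for w, m in zip(weights, means))
    # mean square of the contrast (1 degree of freedom)
    ms_contrast = psi ** 2 / sum(w ** 2 / n for w, n in zip(weights, sizes))
    # pooled error term with N - A degrees of freedom
    df_error = sum(sizes) - len(groups)
    ms_error = sum(v * (n - 1) for v, n in zip(variances, sizes)) / df_error
    return ms_contrast / ms_error


# made-up data: four groups of three cases, means 2, 3, 5, 7, variance 1 each
toy_groups = [[1, 2, 3], [2, 3, 4], [4, 5, 6], [6, 7, 8]]
linear = [-3, -1, 1, 3]
toy_f = contrast_f(toy_groups, linear)
print(toy_f)
```

With these numbers the contrast value is 17, the sum of squared weights over group sizes is 20/3, and the error mean square is 1, so the F ratio is 289 / (20/3), about 43.35.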

In [14]:
# The data set should be a pandas DataFrame, and the group labels should be in a column called 'group'
# (or pass a different column name as group_variable)
# e.g.

#G_table = pd.read_excel('./test.xlsx')
G_table = pd.read_excel("/home/yat-lok/workspace/data4project/lateralization/gc1sec_res/lateralization.xlsx")

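# weights follow the sorted order of the group labels (the order groupby returns)
contrast = [-3, -1, 1, 3] # linear trend
contrast2 = [1,-1,-1,1] # quadratic trend
contrast3 = [-1,3,-3,1] # cubic trend
contrast4 = [0,-2, 0,2] # second group vs. fourth group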
F_table = contrast_analysis(G_table, contrast4)
print(F_table)

The LI_freq_theta contrast has F_value 2.2575342668986402, and the F_critical Scheffe's Test is 11.933338178430578
The LI_amp_gamma contrast has F_value 1.5248223903865612, and the F_critical Scheffe's Test is 11.933338178430578
The LI_amp_theta contrast has F_value 1.8620327797914085, and the F_critical Scheffe's Test is 11.933338178430578
The LI_pac contrast has F_value 1.244134546342883, and the F_critical Scheffe's Test is 11.933338178430578
The LI_plv contrast has F_value 1.8497763949230595, and the F_critical Scheffe's Test is 11.933338178430578
        features   F_value P_value
0  LI_freq_theta  2.257534      NA
1   LI_amp_gamma  1.524822      NA
2   LI_amp_theta  1.862033      NA
3         LI_pac  1.244135      NA
4         LI_plv  1.849776      NA
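
The table above reports only a significance flag. A minimal sketch of the standard Scheffé criterion, with an adjusted p-value in place of the flag, could look like the following. It is not the author's implementation: `scheffe_test` is a hypothetical helper, `df_error=70` is an assumed value for illustration, and unlike `contrast_analysis` it uses the omnibus degrees of freedom (`num_group - 1`) for the critical F.

```python
from scipy import stats


def scheffe_test(f_contrast, num_group, df_error, alpha=0.05):
    """Return (critical value, adjusted p-value) for a post hoc contrast."""
    df_groups = num_group - 1
    # Scheffe critical value: (A - 1) times the omnibus critical F
    critical = df_groups * stats.f.ppf(1 - alpha, df_groups, df_error)
    # equivalent p-value: divide the contrast F by (A - 1), then use the
    # upper tail of the omnibus F distribution
    p_value = stats.f.sf(f_contrast / df_groups, df_groups, df_error)
    return critical, p_value


# F value taken from the LI_freq_theta row; df_error=70 is an assumption
critical, p_value = scheffe_test(f_contrast=2.2575, num_group=4, df_error=70)
print(f"Scheffe critical F = {critical:.3f}, adjusted p = {p_value:.3f}")
```

A contrast is significant under this criterion when its F value reaches the critical value, which is the same decision as the adjusted p-value falling at or below `alpha`.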


