<a href="https://colab.research.google.com/github/LMD-nat/melatonin/blob/main/Help_Calculating_SMDs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Manually calculate means and SDs for Standardized Mean Differences in a meta-analysis

Some articles do not report the means and standard deviations needed to calculate a standardized mean difference (SMD).

Instead, they may report t-test results, medians, or standard errors of the mean. From these measures, we can work backwards to estimate the mean and standard deviation of the score in each group.

I used formulas provided by David B. Wilson and Mark W. Lipsey (*Practical Meta-Analysis*, 2001) to estimate the standardized mean difference from the mean differences and p-values reported by the authors.
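
As a small illustration of working backwards, a standard error of the mean converts to a standard deviation through SD = SEM × √n, and the SMD can then be computed as the mean difference divided by the pooled SD (Cohen's d). The sketch below assumes that definition of the SMD; the function names `sd_from_sem` and `cohens_d` and the input numbers are made up for illustration and do not come from any included study.

```python
import math

def sd_from_sem(sem, n):
    """Convert a standard error of the mean into a standard deviation."""
    return sem * math.sqrt(n)

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Mean difference divided by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Hypothetical groups reported as mean and SEM, 16 participants each.
sd1 = sd_from_sem(0.5, 16)  # 0.5 * 4 = 2.0
sd2 = sd_from_sem(0.5, 16)  # 0.5 * 4 = 2.0
d = cohens_d(10, sd1, 16, 8, sd2, 16)
print(sd1, sd2, d)  # 2.0 2.0 1.0
```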

### In this first part, we obtain the standardized mean difference when only limited information (independent samples t-test results) is known.

In [None]:
import scipy.stats as stats
import math

In [None]:
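# Custom functions to get the t statistic and the SMD from the two-sided p-value
# of an independent-samples t-test.
# Assumption: df1 and df2 are the per-group degrees of freedom (n1 - 1 and n2 - 1),
# so df1 + df2 is the degrees of freedom of the test.

def calculate_t_test(p_val, df1, df2):
    # t value that corresponds to the two-sided p-value
    return stats.t.ppf(1 - p_val/2, df1 + df2)

def calculate_smd(p_val, df1, df2):
    t = calculate_t_test(p_val, df1, df2)
    # The SMD formula uses the group sample sizes, not the degrees of freedom
    n1, n2 = df1 + 1, df2 + 1
    return t * math.sqrt((n1 + n2) / (n1 * n2))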

**Example usage of the above function**

```
p_val = 0.07
df1 = 30
df2 = 30
smd_result = calculate_smd(p_val, df1, df2)
print(smd_result)
```

In [None]:
calculate_smd(0.07, 30, 30)

0.2581988897471611

In [None]:
calculate_t_test(0.06, 15, 15)

0.7137356199794702
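
One way to check this kind of back-calculation is a round trip: start from a known SMD, derive the t statistic and two-sided p-value it implies, and confirm that the p-value leads back to the same SMD. The sketch below is written in terms of group sample sizes `n1` and `n2` (so the test has `n1 + n2 - 2` degrees of freedom); `smd_from_p` is a hypothetical name and the numbers are illustrative.

```python
import math
from scipy import stats

def smd_from_p(p_two_sided, n1, n2):
    """Recover |SMD| from the two-sided p-value of an independent-samples t-test."""
    df = n1 + n2 - 2
    t = stats.t.ppf(1 - p_two_sided / 2, df)
    return t * math.sqrt((n1 + n2) / (n1 * n2))

# Round trip: known SMD -> t -> p -> recovered SMD.
n1, n2, d_true = 30, 30, 0.5
t_true = d_true / math.sqrt((n1 + n2) / (n1 * n2))
p = 2 * stats.t.sf(t_true, n1 + n2 - 2)
d_recovered = smd_from_p(p, n1, n2)
print(round(d_recovered, 6))
```

A p-value only carries the magnitude of the effect, so the sign of the SMD has to be taken from the direction of the reported mean difference.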

### In this second part, we obtain the mean and standard deviation when only the range and median are known.

This part uses the method described by [Hozo, Djulbegovic, & Hozo, 2005](https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-5-13), where `a` is the minimum, `m` is the median, and `b` is the maximum of the sample.

In [None]:
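from math import sqrt

# a = minimum, m = median, b = maximum of the sample

def calculate_variance(a, m, b):
    # Variance estimated from the range and the median
    return (1/12) * (((a - 2*m + b)**2)/4 + (b - a)**2)

def calculate_sd(a, m, b):
    # Standard deviation is the square root of the estimated variance
    variance = calculate_variance(a, m, b)
    return sqrt(variance)

def calculate_mean(a, m, b):
    # Mean estimated as a weighted average of the range limits and the median
    return (a + 2*m + b) / 4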

a_value = 28
median = 75
b_value = 92

calculate_variance(a_value, median, b_value)

360.0833333333333

In [None]:
calculate_sd(a_value, median, b_value)

18.97586186009303

In [None]:
calculate_mean(a_value, median, b_value)

67.5
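
The formulas above do not use the sample size. The sketch below assumes the sample-size guidance summarized in the Hozo et al. paper: the weighted formula for the mean in small samples and the median itself once n exceeds 25; for the SD, the variance formula up to n = 15, range/4 up to about n = 70, and range/6 beyond that. The function name `hozo_estimate` is hypothetical, and the cut-offs should be checked against the paper before use.

```python
from math import sqrt

def hozo_estimate(a, m, b, n):
    """Estimate (mean, SD) from minimum a, median m, maximum b, and sample size n."""
    # Mean: weighted formula for small samples, the median itself otherwise.
    mean = (a + 2 * m + b) / 4 if n <= 25 else m
    # SD: variance formula for very small samples, then range/4, then range/6.
    if n <= 15:
        sd = sqrt(((a - 2 * m + b) ** 2 / 4 + (b - a) ** 2) / 12)
    elif n <= 70:
        sd = (b - a) / 4
    else:
        sd = (b - a) / 6
    return mean, sd

# Same range and median as above, with three assumed sample sizes.
for n in (10, 50, 100):
    print(n, hozo_estimate(28, 75, 92, n))
```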

### In this third part, we obtain the mean and standard deviation when only the interquartile range and median are known.

This part uses the methods described by [Luo, Wan, Liu & Tong, 2018; Wan, Wang, Liu & Tong, 2014](https://www.math.hkbu.edu.hk/~tongt/papers/median2mean.html).

In [13]:
import math
from scipy.stats import norm

# Inverse of the standard normal cumulative distribution function
def norm_s_inv(p):
    return norm.ppf(p)

# Round to 4 decimal places
def rnd4(value):
    return round(value, 4)

# Checks whether the data look skewed, based on the range and the median
def nt1():
    n = float(input("Enter the sample size: "))
    a = float(input("Enter the lower limit of the range: "))
    m = float(input("Enter the median: "))
    b = float(input("Enter the upper limit of the range: "))

    # Require a <= m <= b and a non-zero range (avoids division by zero below)
    if a > m or m > b or a == b:
        print("Invalid data!")
        return

    if abs((a + b - 2 * m) / (b - a)) > 2.5 / (n + 1) + 1 / math.log(n + 9):
        print("The data are significantly skewed away from normality. Stop here.")
    else:
        print("There is no significant evidence to show that the data are skewed. Proceed.")

# Estimates the mean and standard deviation from the quartiles and the median
def calc2():
    n = float(input("Enter sample size: "))
    q1 = float(input("Enter the first quartile of the sample: "))
    m = float(input("Enter the median of the sample: "))
    q3 = float(input("Enter the third quartile of the sample: "))

    # Require q1 <= m <= q3
    if q1 > m or m > q3:
        print("Invalid data!")
        return

    if n < 1:
        n = 1

    # Weight given to the mid-quartile value; the remainder goes to the median
    weight = 0.70 + 0.39 / n

    mean = rnd4(weight * (q1 + q3) / 2 + (1 - weight) * m)
    sd = rnd4((q3 - q1) / (2 * norm_s_inv((0.75 * n - 0.125) / (n + 0.25))))

    print("Estimated mean:", mean)
    print("Estimated standard deviation:", sd)


In [14]:
# nt1() prompts for n, a, m, and b,
# then reports whether the data appear skewed
nt1()


Enter the sample size: 30
Enter the lower limit of the range: 2
Enter the median: 10
Enter the upper limit of the range: 17
There is no significant evidence to show that the data are skewed. Proceed.
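
For screening many studies at once, the same check can be written without prompts. This is a minimal sketch that reuses the threshold from `nt1()` as written in this notebook; `is_skewed` is a hypothetical name, and the second call uses made-up values.

```python
import math

def is_skewed(n, a, m, b):
    """Return True when the range-based check from nt1() flags skewness."""
    if not (a <= m <= b) or a == b:
        raise ValueError("Expected a <= m <= b with a < b")
    statistic = abs((a + b - 2 * m) / (b - a))
    threshold = 2.5 / (n + 1) + 1 / math.log(n + 9)
    return statistic > threshold

print(is_skewed(30, 2, 10, 17))   # False, matches the prompt-based run above
print(is_skewed(30, 0, 1, 100))   # True, the median sits very close to the minimum
```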


In [10]:
# Performs the actual calculations of M and SD
calc2()

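Enter sample size: 30
Enter the first quartile of the sample: 2
Enter the median of the sample: 10
Enter the third quartile of the sample: 17
Estimated mean: 9.6435
Estimated standard deviation: 11.6763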
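
When the estimates need to feed into later calculations, a version that returns values instead of printing them is easier to reuse. The sketch below applies the same two formulas as `calc2()`, but takes arguments directly and uses `statistics.NormalDist` from the standard library in place of `scipy.stats.norm`; the name `estimate_from_quartiles` is hypothetical.

```python
from statistics import NormalDist

def estimate_from_quartiles(n, q1, m, q3):
    """Estimate (mean, SD) from sample size, first quartile, median, and third quartile."""
    if not (q1 <= m <= q3):
        raise ValueError("Expected q1 <= m <= q3")
    # Weighted average of the mid-quartile value and the median.
    weight = 0.70 + 0.39 / n
    mean = weight * (q1 + q3) / 2 + (1 - weight) * m
    # Interquartile range scaled by a sample-size-dependent normal quantile.
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd

mean, sd = estimate_from_quartiles(30, 2, 10, 17)
print(round(mean, 4), round(sd, 4))  # 9.6435 11.6763
```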
