### Example 2 - Descriptive Statistics - Mean; Confidence Interval - Median; Inter-quartile range

### 2.1 What to do?
As an example, load the synthetic height and age data of example 2 (ex2.csv).
The goal is to report descriptive statistics: the mean with standard deviation or confidence interval, the median with inter-quartile range, or the pseudomedian with confidence interval based on the signed-rank distribution.
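To make these quantities concrete, the following is a minimal sketch of how they can be computed with numpy and scipy. It is not the statsmed implementation: the function name `describe`, the t-based confidence interval for the mean, and the example values are assumptions made for illustration. The pseudomedian is taken as the Hodges-Lehmann estimator (the median of all pairwise averages); its signed-rank confidence interval is left out of the sketch.

```python
import numpy as np
from scipy import stats


def describe(values, confidence=0.95):
    """Illustrative descriptive statistics for a 1-D sample (hypothetical helper)."""
    x = np.asarray(values, dtype=float)
    n = x.size

    mean = x.mean()
    sd = x.std(ddof=1)  # sample standard deviation
    # Assumption: t-based confidence interval for the mean
    half_width = stats.t.ppf((1 + confidence) / 2, df=n - 1) * sd / np.sqrt(n)

    q25, median, q75 = np.percentile(x, [25, 50, 75])

    # Pseudomedian (Hodges-Lehmann estimator): median of all pairwise
    # averages (x_i + x_j) / 2 with i <= j, the Walsh averages
    i, j = np.triu_indices(n)
    pseudomedian = np.median((x[i] + x[j]) / 2)

    return {
        "mean": mean,
        "sd": sd,
        "mean_ci": (mean - half_width, mean + half_width),
        "median": median,
        "iqr": (q25, q75),
        "pseudomedian": pseudomedian,
    }


# Made-up heights in metres, not taken from ex2.csv
heights = [1.70, 1.74, 1.78, 1.82, 1.86]
result = describe(heights)
print(result)
```

For symmetric data such as these made-up heights, mean, median and pseudomedian coincide; they drift apart as the data becomes skewed.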

The function statsmed.get_desc() takes as input the data (please exclude NaN or None values), the number of decimals to which the output is rounded, and the mode.
The output depends on mode:\
&emsp;&emsp;if mode = 'all', the function prints the mean with standard deviation and confidence interval\
&emsp;&emsp;&emsp;&emsp;as well as the median with inter-quartile range and the pseudomedian with confidence interval of the signed-rank distribution\
&emsp;&emsp;if mode = 'normal distribution', only the mean with standard deviation and confidence interval is given\
&emsp;&emsp;if mode = 'no normal distribution', only the median with inter-quartile range and the pseudomedian with confidence interval of the\
&emsp;&emsp;&emsp;&emsp;signed-rank distribution is given\
&emsp;&emsp;if anything else is given, the output depends on whether the data is normally distributed according to stdnorm_test

In every mode the output is rounded to the given number of decimals, and the function also returns a numpy array containing the values selected by mode.
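The automatic choice between the two distribution-specific outputs can be pictured with the sketch below. It is an illustration only, not the code of stdnorm_test: the helper name `pick_mode`, the 0.05 threshold as a parameter, and the choice to run the Kolmogorov-Smirnov test against a normal distribution with the sample's own mean and standard deviation are all assumptions. The rule that both tests must pass follows the printed messages in this example ("At least one test indicates no normal distribution").

```python
import numpy as np
from scipy import stats


def pick_mode(values, alpha=0.05):
    """Hypothetical helper: choose a mode string from two normality tests."""
    x = np.asarray(values, dtype=float)

    # Shapiro-Wilk test of normality
    p_shapiro = stats.shapiro(x).pvalue
    # Kolmogorov-Smirnov test against a normal distribution with the
    # sample's mean and standard deviation (assumption of this sketch)
    p_ks = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1))).pvalue

    # Treat the data as normal only if neither test rejects normality
    if p_shapiro >= alpha and p_ks >= alpha:
        return "normal distribution"
    return "no normal distribution"


# Made-up samples: idealised normal quantiles and a strongly right-skewed series
n = 50
normal_like = stats.norm.ppf((np.arange(1, n + 1) - 0.375) / (n + 0.25))
skewed = np.exp(np.linspace(0, 5, n))

print(pick_mode(normal_like))
print(pick_mode(skewed))
```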


When applying the function to your own data, check for NaN or None values and remove them first.
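One way to remove missing entries before calling the function is pandas' `dropna()`. The sketch below uses a small made-up series in place of a column from your own file.

```python
import numpy
import pandas

# Made-up column with two missing entries (None and NaN)
raw = pandas.Series([1.72, None, 1.80, numpy.nan, 1.76], name="height")

# dropna() removes both None and NaN entries
clean = raw.dropna()

print(len(raw), "values before,", len(clean), "values after removing missing entries")
```

The cleaned series can then be passed on in place of the original column.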

In [3]:
import pandas
from statsmed import statsmed
data = pandas.read_csv('ex2.csv',delimiter=',',on_bad_lines='skip')

print('Descriptive Statistic of the height-data with two decimals and all returns:')
print(statsmed.get_desc(data['height'],2,'all'))

Descriptive Statistic of the height-data with two decimals and all returns:
Shapiro-Wilk: Normal dsitribution (p-value = 0.80 
 	 - p-value >= 0.05 indicates no significant difference from normal distribution)
Kolmogorow-Smirnow: Normal dsitribution (p-value = 0.85 
 	 - p-value >= 0.05 indicates no significant difference from normal distribution)
Both tests do not indicate a significant difference from a normal distribution
The mean with standard deviation is: 1.78 ± 0.06
The mean with 95%-confidence interval is: 1.78 (CI: 1.77 - 1.79)
The median with interquartile range (IQR) from the 25th to 75th percentile is: 1.78 (IQR: 1.74 - 1.82)
The pseudomedian with 95%-confidence interval from the signed-rank distribution is: 1.78 (CI: 1.77 - 1.79)
[[1.78 0.06  nan]
 [1.78 1.77 1.79]
 [1.78 1.74 1.82]
 [1.78 1.77 1.79]]


If you only want the statistics that match the distribution of your data (mean with standard deviation and confidence interval for normally distributed data, otherwise median with inter-quartile range and pseudomedian with confidence interval of the signed-rank distribution), simply omit the mode parameter.

In [6]:
import pandas
from statsmed import statsmed
data = pandas.read_csv('ex2.csv',delimiter=',',on_bad_lines='skip')

print('Descriptive Statistic of the height-data with two decimals and distribution specific returns:')
print(statsmed.get_desc(data['height'],2))
print()
print('Descriptive Statistic of the age-data with zero decimals and distribution specific returns:')
print(statsmed.get_desc(data['age'],0))

Descriptive Statistic of the height-data with two decimals and distribution specific returns:
Shapiro-Wilk: Normal dsitribution (p-value = 0.80 
 	 - p-value >= 0.05 indicates no significant difference from normal distribution)
Kolmogorow-Smirnow: Normal dsitribution (p-value = 0.85 
 	 - p-value >= 0.05 indicates no significant difference from normal distribution)
Both tests do not indicate a significant difference from a normal distribution
The mean with standard deviation is: 1.78 ± 0.06
The mean with 95%-confidence interval is: 1.78 (CI: 1.77 - 1.79)
[[1.78 0.06  nan]
 [1.78 1.77 1.79]]

Descriptive Statistic of the age-data with zero decimals and distribution specific returns:
Shapiro-Wilk: No normal dsitribution (p-value = 0.0031)
Kolmogorow-Smirnow: Normal dsitribution (p-value = 0.40 
 	 - p-value >= 0.05 indicates no significant difference from normal distribution)
At least one test indicates no normal distribution
The median with interquartile range (IQR) from the 25th to 75t

If you have already tested whether your data differs significantly from a normal distribution (for example with statsmed.stdnorm_test()), you can pass the mode 'normal distribution' or 'no normal distribution' yourself.\
You can also set these modes without a prior test if that suits your analysis.

### 2.2 What to write?

In the statistical analysis section of a manuscript you may write:

"Continuous variables were reported as mean ± standard deviation (SD) when normally distributed and as median (interquartile range (IQR)) otherwise."\
or "Continuous variables were reported as mean (95%-confidence interval (CI)) when normally distributed and as pseudomedian (95%-confidence interval (CI)) otherwise."

In the results section of the manuscript you can now give the respective values.
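As a small convenience, the returned array can be turned into strings for the results section. The sketch below copies the array printed for mode 'all' in section 2.1 and assumes the row layout seen there (row 0: mean and standard deviation, row 2: median and IQR bounds); the variable names are illustrative.

```python
import numpy as np

# Array copied from the printed output of mode 'all' for the height data
desc = np.array([
    [1.78, 0.06, np.nan],  # mean, standard deviation
    [1.78, 1.77, 1.79],    # mean, 95%-CI lower, upper
    [1.78, 1.74, 1.82],    # median, 25th and 75th percentile
    [1.78, 1.77, 1.79],    # pseudomedian, 95%-CI lower, upper
])

mean_sd = f"{desc[0, 0]:.2f} ± {desc[0, 1]:.2f}"
median_iqr = f"{desc[2, 0]:.2f} (IQR: {desc[2, 1]:.2f} - {desc[2, 2]:.2f})"

print("Height (m), mean ± SD:", mean_sd)
print("Height (m), median (IQR):", median_iqr)
```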

