# FBR Identification paper

This document contains the details of the Family Business Report (FBR) Breslau screening paper analyses (apologies for any typos - there is no spellcheck!).

#### Abbreviations

- BSSS: Breslau short screen for PTSD
- GEM: Growth and empowerment scale
- K6: Kessler 6 psychological distress scale (+ 2 other items)
- PTCI: Post-traumatic cognitions inventory
- TSCL: Trauma symptom checklist

#### Aims
- to ascertain whether the BSSS is a good screening tool in this cohort (results suggest it is not).
- to test whether the GEM and Kessler scales improve upon the accuracy of the BSSS when identifying PTSD.
- to test whether including other trauma checklists (the PTCI and TSCL) improves predictive ability and, if so, which specific items predict PTSD and what this tells us about PTSD in this particular cohort.

Critically, we're not trying to develop a PTSD screener. Rather, we are trying to show that the BSSS may not be an appropriate screen and that certain other measures (GEM, K6) and items may be informative for future studies.
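
As a rough illustration of what "a good screening tool" means in the first aim, the sketch below computes sensitivity, specificity, PPV and NPV for a binary screen against a binary diagnosis. The function name `screening_accuracy` and the toy data are made up for illustration; this is not the analysis code used for the paper.

```python
def screening_accuracy(screen_positive, has_ptsd):
    """Return sensitivity, specificity, PPV and NPV for a binary screen.

    Both inputs are equal-length sequences of 0/1 (or False/True) values.
    Hypothetical helper, not part of the FBR code base.
    """
    pairs = list(zip(screen_positive, has_ptsd))
    tp = sum(1 for s, d in pairs if s and d)          # screen +, diagnosis +
    fp = sum(1 for s, d in pairs if s and not d)      # screen +, diagnosis -
    fn = sum(1 for s, d in pairs if not s and d)      # screen -, diagnosis +
    tn = sum(1 for s, d in pairs if not s and not d)  # screen -, diagnosis -

    def ratio(num, den):
        # Avoid a ZeroDivisionError when a margin of the 2x2 table is empty
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Made-up toy data, only to show the call
screen = [1, 1, 1, 0, 0, 1, 0, 0]
ptsd = [1, 1, 0, 0, 0, 1, 1, 0]
acc = screening_accuracy(screen, ptsd)
print(acc)
```

In the notebook, the natural inputs would be a dichotomised `SCREEN` and the recoded `PTSD` series, assuming both are coded 0/1 at that point.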

In [1]:
from IPython.display import HTML
HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
The raw code for this IPython notebook is by default hidden for easier reading.
To toggle on/off the raw code, click <a href="javascript:code_toggle()">here</a>.''')

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

file = 'C:\\Users\\HearneL\\AnacondaProjects\\FBR_Prevalence\\116_booklet_LJH.xlsx'
df  = pd.read_excel(file,'116_booklet_LJH')

ID = df['CODE']
PTSD = df['ICD_PTSD12m']
PTSD = PTSD-1  # shift the diagnosis coding down by one
SCREEN = df['PTSD_SCREEN']
# Select each scale's items by matching the scale name in the column labels
BSSS = df.filter(regex='ptsd_screen')
TSCL = df.filter(regex='TSCL')
PTCI = df.filter(regex='PTCI')
K6   = df.filter(regex='K6')
GEM  = df.filter(regex='GEM')
# CulT items (column names kept exactly as spelled in the spreadsheet)
CulT = df[["A_STOLENGEN","B_PARENTSTOLENGEN","C_SELFREMOVECIHLD",
           "D_SIBLINGREMOVE","E_CHILDRENTAKEGOVE","F_RACISMDISCRIM",
           "G_FAMILYNOTACCEPT","H_COMMNOTACCEPT"]]

### Floor/ceiling effects & missing data

In [3]:
data = pd.concat([ID,PTSD,SCREEN,BSSS,TSCL,PTCI,K6,GEM,CulT],axis=1)
print('Size of original data: ',data.shape,'[participants,items]')

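def floor_ceil(df,maxScale,minScale):
    # Ceiling effect: drop items whose 25th percentile already equals the
    # scale maximum (roughly 75% or more of responses are at the top).
    idx = df.quantile(0.25,axis=0)!=maxScale
    df = df.loc[:,idx]

    # Floor effect: drop items whose 75th percentile still equals the
    # scale minimum (roughly 75% or more of responses are at the bottom).
    idx = df.quantile(0.75,axis=0)!=minScale
    df = df.loc[:,idx]
    return df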

TSCL = floor_ceil(TSCL,3,0)
PTCI = floor_ceil(PTCI,7,1)
K6 = floor_ceil(K6,5,1)
GEM = floor_ceil(GEM,7,1)
CulT = floor_ceil(CulT,2,1)
data = pd.concat([ID,PTSD,SCREEN,BSSS,TSCL,PTCI,K6,GEM,CulT],axis=1)

print('Size after ceiling and floor effects: ',data.shape)

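# Drop items (columns) with fewer than 112 non-missing responses out of 116 participants
data = data.dropna(thresh=112,axis=1)
print('Size after missing item deletion: ',data.shape)

# Drop participants (rows) with any remaining missing value
data = data.dropna(how = 'any',axis=0)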
data.to_csv('Raw_data.csv')
print('Size after missing participant deletion: ',data.shape)

Size of original data:  (116, 111) [participants,items]
Size after ceiling and floor effects:  (116, 105)
Size after missing item deletion:  (116, 92)
Size after missing participant deletion:  (104, 92)
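
To make the floor/ceiling rule above concrete, here is a small self-contained sketch on made-up data. It applies the same quartile test as `floor_ceil` but returns the names of the flagged items instead of silently dropping them. The helper name `flag_floor_ceiling` and the toy items are assumptions for illustration only.

```python
import pandas as pd


def flag_floor_ceiling(items, scale_min, scale_max):
    """List items showing a ceiling or floor effect under the quartile rule.

    Hypothetical helper mirroring floor_ceil: an item is at ceiling when its
    25th percentile equals scale_max, and at floor when its 75th percentile
    equals scale_min.
    """
    ceiling = items.quantile(0.25, axis=0) == scale_max
    floor = items.quantile(0.75, axis=0) == scale_min
    return {
        "ceiling": ceiling[ceiling].index.tolist(),
        "floor": floor[floor].index.tolist(),
    }


# Made-up responses on a 0-3 scale (the range used for the TSCL above)
toy = pd.DataFrame({
    "item_ok":      [0, 1, 2, 3, 1, 2, 0, 3],
    "item_floor":   [0, 0, 0, 0, 0, 0, 0, 3],
    "item_ceiling": [3, 3, 3, 3, 3, 3, 3, 0],
})

flags = flag_floor_ceiling(toy, scale_min=0, scale_max=3)
kept = toy.drop(columns=flags["floor"] + flags["ceiling"])
print(flags)
print(list(kept.columns))
```

Returning the flagged names makes it easy to report which items were removed from each scale, which the shape printout alone does not show.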


## Demographics for paper
Explanation for this code can be found in the prevalence paper data...

In [4]:
import pandas as pd
import numpy as np
import csv  # csv.writer is used below; imported explicitly rather than relying on the star import
import sys
sys.path.insert(0, 'C:\\Users\\HearneL\\AnacondaProjects\\FBR_Prevalence\\')
from prevalence_functions import *

df  = pd.read_excel('C:\\Users\\HearneL\\AnacondaProjects\\FBR_Prevalence\\116_booklet_LJH.xlsx','116_booklet_LJH')

demo = df[['Age_abs','relationship_dummy','Edu<10','custody_status','Youth_det_dummy']]
var_labels = list(demo)
demo = demo.iloc[data.index,:]
demo = demo.values

filename = 'demo.csv'
head = ["Var", "N NoPTSD","% NoPTSD","N PTSD","% PTSD","N Total","% Total",
        "OR", "LCI","UCI","X2","p"]

with open(filename,'w') as newFile:
    newFileWriter = csv.writer(newFile)
    newFileWriter.writerow(head)

results_master(filename,demo,var_labels,data['ICD_PTSD12m'])

dataPrint = pd.read_csv(filename,header = 0) # load data
dataPrint.head(len(dataPrint))



| | Var | N NoPTSD | % NoPTSD | N PTSD | % PTSD | N Total | % Total | OR | LCI | UCI | X2 | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Age_abs | 10.0 | 19.23 | 16.0 | 30.77 | 26.0 | 25.0 | 1.74 | 0.66 | 4.6 | 2.0 | 0.368 |
| 1 | ['Age_abs', 1] | 25.0 | 48.08 | 23.0 | 44.23 | 48.0 | 46.15 | 1.74 | 0.66 | 4.6 | 2.0 | 0.368 |
| 2 | ['Age_abs', 2] | 17.0 | 32.69 | 13.0 | 25.0 | 30.0 | 28.85 | 1.74 | 0.66 | 4.6 | 2.0 | 0.368 |
| 3 | relationship_dummy | 32.0 | 61.54 | 31.0 | 59.62 | 63.0 | 60.58 | 0.92 | 0.42 | 2.03 | 0.0 | 1.0 |
| 4 | Edu<10 | 18.0 | 34.62 | 24.0 | 46.15 | 42.0 | 40.38 | 1.62 | 0.73 | 3.57 | 1.0 | 0.318 |
| 5 | custody_status | 27.0 | 52.94 | 17.0 | 34.0 | 44.0 | 43.56 | 0.39 | 0.16 | 0.95 | 4.34 | 0.114 |
| 6 | ['custody_status', 1] | 15.0 | 29.41 | 24.0 | 48.0 | 39.0 | 38.61 | 0.39 | 0.16 | 0.95 | 4.34 | 0.114 |
| 7 | ['custody_status', 2] | 9.0 | 17.65 | 9.0 | 18.0 | 18.0 | 17.82 | 0.39 | 0.16 | 0.95 | 4.34 | 0.114 |
| 8 | Youth_det_dummy | 15.0 | 28.85 | 24.0 | 46.15 | 39.0 | 37.5 | 2.11 | 0.94 | 4.76 | 2.63 | 0.105 |
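
For readers who want to see where an OR with LCI/UCI can come from, the sketch below computes an odds ratio with a Wald 95% confidence interval from a 2x2 table. The source of `results_master` is not shown here, so this is an assumption about the general approach, not a description of that function. The counts are back-calculated from the `Youth_det_dummy` row (15 of 52 without PTSD and 24 of 52 with PTSD have the indicator), which is itself an inference from the reported N and % columns.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: indicator present, PTSD        b: indicator present, no PTSD
    c: indicator absent,  PTSD        d: indicator absent,  no PTSD
    Hypothetical helper; assumes all four cells are non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts inferred from the Youth_det_dummy row: 24/52 PTSD, 15/52 no PTSD
or_, lci, uci = odds_ratio_wald(a=24, b=15, c=52 - 24, d=52 - 15)
print(round(or_, 2), round(lci, 2), round(uci, 2))  # rounds to 2.11 0.94 4.76
```

These values round to the OR, LCI and UCI reported for that row, which is consistent with a Wald interval but does not confirm how `results_master` computes them.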


In [5]:
# t-tests
demo = df[['Age_yr','numtimecustadult','DurAdultCustTotal']]
demo = demo.iloc[data.index,:]
var_labels = list(demo)
demo = demo.values

filename = 'demo_tTests.csv'
head = ["Var", "Mean","std","Mean","std","Mean","std","t","p","cohen's d"]

with open(filename,'w') as newFile:
    newFileWriter = csv.writer(newFile)
    newFileWriter.writerow(head)
    
ptsd_ttest(filename,demo,var_labels,data['ICD_PTSD12m'])
    
dataPrint = pd.read_csv(filename,header = 0) # load data
dataPrint.head(len(dataPrint))

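| | Var | Mean | std | Mean.1 | std.1 | Mean.2 | std.2 | t | p | cohen's d |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Age_yr | 31.92 | 8.17 | 29.62 | 7.75 | 30.77 | 8.05 | 1.46 | 0.15 | 0.29 |
| 1 | numtimecustadult | 4.49 | 5.61 | 3.96 | 3.68 | 4.23 | 4.77 | 0.55 | 0.58 | 0.11 |
| 2 | DurAdultCustTotal | 23.52 | 40.2 | 29.24 | 36.56 | 26.38 | 38.53 | -0.73 | 0.47 | -0.15 |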
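
The `t` and `cohen's d` columns can be related with a short sketch. The version below uses a pooled standard deviation and an equal-variance (Student) t statistic. Whether `ptsd_ttest` uses these exact definitions is an assumption, since its source is not shown here; the function names and toy data are made up.

```python
import math
from statistics import mean, stdev


def pooled_sd(group_a, group_b):
    """Pooled sample standard deviation of two independent groups."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2
                  + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return math.sqrt(pooled_var)


def cohens_d(group_a, group_b):
    """Standardised mean difference (group_a minus group_b)."""
    return (mean(group_a) - mean(group_b)) / pooled_sd(group_a, group_b)


def student_t(group_a, group_b):
    """Equal-variance two-sample t statistic."""
    na, nb = len(group_a), len(group_b)
    se = pooled_sd(group_a, group_b) * math.sqrt(1 / na + 1 / nb)
    return (mean(group_a) - mean(group_b)) / se


# Made-up toy data, only to show the calls
group_a = [2, 4, 6, 8]
group_b = [1, 3, 5, 7]
d = cohens_d(group_a, group_b)
t = student_t(group_a, group_b)
print(round(d, 3), round(t, 3))
```

Under these definitions the two statistics differ only by a sample-size factor, t = d * sqrt(na * nb / (na + nb)), which is why a small d can still sit alongside a non-trivial t in larger groups.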
