
Demographically Stratified Templates for MAGeTbrain

Gabriel A. Devenyi edited this page Apr 26, 2019 · 1 revision

While the MAGeTbrain algorithm has been shown to be insensitive to the choice of templates, referees will typically expect template selection to be stratified across age/sex and other factors.

The following example code takes a dataset with ID, Age and Sex columns (Sex is the M.F column in the code) and derives an age-stratified, sex-balanced template set from it, with templates allocated to each sex in proportion to its share of the data. If you also need to stratify by some group, that split would be done in the same way as the age split.
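
The script that follows expects a data frame named `filtered.data` holding only subjects that passed quality control. As a minimal sketch of that input, the block below builds a synthetic table and applies the QC filter. The `demographics` name, the exact layout of the `QC` column and every value are assumptions made up for illustration; substitute your own spreadsheet.

```r
# Synthetic demographics table; all values are invented for illustration
set.seed(1)
n <- 100
demographics <- data.frame(
  ID  = sprintf("sub-%03d", seq_len(n)),
  Age = round(runif(n, min = 18, max = 80), 1),
  M.F = sample(c("M", "F"), n, replace = TRUE),
  QC  = sample(c(0, 1), n, replace = TRUE, prob = c(0.1, 0.9)),
  stringsAsFactors = FALSE
)

# Keep only subjects that passed QC, as the script assumes
filtered.data <- subset(demographics, QC == 1)

# Quick checks before binning: subjects per sex and the age range of each sex
table(filtered.data$M.F)
tapply(filtered.data$Age, filtered.data$M.F, range)
```

Checking the per-sex counts and age ranges first is worthwhile because the age bins are computed separately for each sex, so a sex with very few subjects will produce mostly empty bins.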

#filtered.data is only valid subjects (QC == 1)

#Generate templates for MAGeT: split male/female, break into bins by age, pick one subject from each age bin
#Number of male and female templates, out of 21 total, in proportion to each sex's share of the data
#Note: each count is rounded independently, so the two may not sum to exactly 21
nummen = round(21 * nrow(subset(filtered.data, M.F == "M")) / nrow(filtered.data))
numwomen = round(21 * nrow(subset(filtered.data, M.F == "F")) / nrow(filtered.data))

#Sort subjects by age
sorted.filtered.data = filtered.data[order(filtered.data$Age),]

#Calculate bins in Age for Men and women
fembreaks=seq(min(subset(sorted.filtered.data, M.F=="F")$Age), max(subset(sorted.filtered.data, M.F=="F")$Age), length.out=numwomen+1)
manbreaks=seq(min(subset(sorted.filtered.data, M.F=="M")$Age), max(subset(sorted.filtered.data, M.F=="M")$Age), length.out=nummen+1)

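#Group subjects into bins based on the breaks calculated above
#include.lowest=TRUE keeps the subject at the minimum age, who would otherwise fall outside the first bin
femgroups = split(subset(sorted.filtered.data, M.F=="F"), cut(subset(sorted.filtered.data, M.F=="F")$Age, fembreaks, include.lowest=TRUE))
mangroups = split(subset(sorted.filtered.data, M.F=="M"), cut(subset(sorted.filtered.data, M.F=="M")$Age, manbreaks, include.lowest=TRUE))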

#Create an empty list
template = c()
#Add the first subject of the first age bin
template = append(template, as.character(mangroups[[1]]$ID[1]))
#Loop over the interior bins and choose the middle member of each, add to list
#(guarded because seq(2, nummen-1) counts downwards when nummen < 3; an empty bin adds nothing)
if (nummen > 2) {
  for (i in seq(2, nummen-1)) {
    template = append(template, as.character(mangroups[[i]]$ID[ceiling(length(mangroups[[i]]$ID)/2)]))
  }
}
#Add the last member of the last age bin
template = append(template, as.character(tail(mangroups[[nummen]]$ID, 1)))

#Do the same as above for the women
template = append(template, as.character(femgroups[[1]]$ID[1]))
#Same guard as for the men: skip the loop when there are no interior bins
if (numwomen > 2) {
  for (i in seq(2, numwomen-1)) {
    template = append(template, as.character(femgroups[[i]]$ID[ceiling(length(femgroups[[i]]$ID)/2)]))
  }
}
template = append(template, as.character(tail(femgroups[[numwomen]]$ID, 1)))

#Write out ID names to a file
write.table(template, file="templates.txt", quote=FALSE, row.names=FALSE, col.names=FALSE )
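
The group split mentioned at the top of this page can be sketched by treating every combination of sex and group as its own stratum and repeating the age binning inside each one. The block below is one possible way to do that, not the method used above: the function names `pick_by_age` and `pick_templates`, the `Group` column and the toy data are all hypothetical. Unlike the script above, it takes the middle subject of every bin, including the first and last.

```r
# Pick up to n IDs from one stratum: one subject per equal-width age bin.
# Assumes the stratum has ID and Age columns.
pick_by_age <- function(stratum, n) {
  stratum <- stratum[order(stratum$Age), ]
  n <- min(n, nrow(stratum))
  if (n < 1) return(character(0))
  # With a single bin, or no spread in age, binning is not possible:
  # fall back to the middle subject of the sorted stratum
  if (n == 1 || diff(range(stratum$Age)) == 0) {
    return(as.character(stratum$ID[ceiling(nrow(stratum) / 2)]))
  }
  breaks <- seq(min(stratum$Age), max(stratum$Age), length.out = n + 1)
  bins <- split(stratum, cut(stratum$Age, breaks, include.lowest = TRUE))
  picks <- vapply(bins, function(b) {
    if (nrow(b) == 0) return(NA_character_)  # empty bin contributes nothing
    as.character(b$ID[ceiling(nrow(b) / 2)])
  }, character(1))
  unname(picks[!is.na(picks)])
}

# Allocate n_total templates across strata in proportion to stratum size,
# then pick within each stratum by age. Counts are rounded per stratum,
# so the final total can differ slightly from n_total.
pick_templates <- function(data, strata_cols, n_total = 21) {
  f <- interaction(data[strata_cols], drop = TRUE)
  strata <- split(data, f)
  sizes <- vapply(strata, nrow, integer(1))
  n_each <- round(n_total * sizes / sum(sizes))
  unlist(Map(pick_by_age, strata, n_each), use.names = FALSE)
}

# Toy data; values are invented for illustration
set.seed(42)
toy <- data.frame(
  ID    = sprintf("sub-%03d", 1:120),
  Age   = round(runif(120, 18, 80), 1),
  M.F   = sample(c("M", "F"), 120, replace = TRUE),
  Group = sample(c("Control", "Patient"), 120, replace = TRUE),
  stringsAsFactors = FALSE
)
template <- pick_templates(toy, c("M.F", "Group"), n_total = 21)
length(template)  # can differ from 21 because of rounding and empty bins
```

Passing `"M.F"` alone as `strata_cols` reduces this to the sex-only case. Because the number of templates per stratum shrinks as strata are added, it is worth checking the length of the result and the age spread of the chosen subjects before writing the list to a file.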