# PPMI data preparation script

2018-09-10 Hirotaka Iwaki

Before starting the analysis, download the data from the PPMI LONI site.

**steps**
1. Create a one-to-one matching table of EVENT_ID and DATE for each participant.
2. Pull PD diagnosis information
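
Step 1 depends on turning the month/year visit strings into a number that can be sorted and compared. The following is a minimal sketch of that conversion in base R, assuming `INFODT` is stored as `MM/YYYY`; the example values are made up for illustration.

```r
# Hypothetical INFODT values in the assumed MM/YYYY form (not real PPMI data)
infodt <- c("03/2011", "11/2012", NA)

# Pad a day ("01") so that as.Date() can parse the string,
# then express the date as days since 1970-01-01
days <- as.numeric(as.Date(paste("01", infodt, sep = "/"), format = "%d/%m/%Y"))

# Missing INFODT stays NA, so such rows can be dropped with complete.cases()
print(days)
```

Because only month and year are available, every visit in the same month maps to the same number, so dates that differ by less than a month cannot be ordered.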

In [1]:
# Setting
library(data.table)
library(dplyr)
FOLDER = c("PPMI180910")
OUTPUT = c("out180910")


Attaching package: 'dplyr'

The following objects are masked from 'package:data.table':

    between, first, last

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union



## Step 1 PATNO-EVENT-DATE matching table

In [2]:
cat("Populate DATE and EVENT_ID from all the available files.\n
Skip 'Signature_Form.csv' because it contains many EVENT_ID which are not seen in other files.
Additionally, the following files will be skipped because they don't contain PATNO in the file;\n ")
FILES = dir(path = FOLDER, full.names = F, recursive = T)
TEMP=c() # accumulator; bind_rows() drops this empty starting value
for (i in seq_along(FILES)){
  if(FILES[i]=="Signature_Form.csv"){next} 
  # Read only the header first to check whether the file has PATNO
  CHECK = fread(paste(FOLDER, FILES[i], sep="/"), nrows = 0)
  if(!("PATNO" %in% names(CHECK))){cat(FILES[i], ", ");next}
  TEMP1 = fread(paste(FOLDER, FILES[i], sep="/"), na.strings = '', header = T, quote="\"", 
                colClasses = c("PATNO"="character")) 
  # Use only files that carry both EVENT_ID and INFODT
  if(all(c("EVENT_ID", "INFODT") %in% names(TEMP1))){
    TEMP2 = TEMP1 %>% select(PATNO, EVENT_ID, INFODT) %>% 
      mutate(DATE = as.numeric(as.Date(paste("01", INFODT, sep="/"), format="%d/%m/%Y"))) %>% # pad day 01 and count days from 1970-01-01
      arrange(DATE) %>% 
      distinct(PATNO, EVENT_ID, .keep_all = T) %>%
      .[complete.cases(.),]%>% 
      select(PATNO, EVENT_ID, DATE) %>% mutate(FILE = FILES[i]) %>% data.frame
    TEMP = bind_rows(TEMP, TEMP2)
  }
}


cat("\n \n \n Check how consistent the EVENT_ID-DATEs are. 
Calculate P: the proportion of files that record a given event of a participant on a given date\n")
temp = TEMP %>% arrange(DATE) %>% 
  group_by(PATNO, EVENT_ID) %>% mutate(NUM_E = n()) %>% data.frame %>%  
  group_by(PATNO, EVENT_ID, DATE) %>% mutate(NUM_E_D = n()) %>% data.frame %>% 
  mutate(P = NUM_E_D / NUM_E) %>% 
  distinct(PATNO, EVENT_ID, NUM_E_D, .keep_all = T) %>%
  arrange(PATNO, DATE) 

cat("\n\nFiles containing PATNO-EVENT_ID pairs that appear only once across all files. Counts are the number of such obs in each file.")
temp %>%  filter(NUM_E == 1) %>% with(table(FILE, NUM_E==1))
temp %>% filter(P<1) %>% nrow %>% cat("\n", ., "observations have different DATEs for the same EVENT_ID.
For these events, take the most frequent DATE")
temp2 = temp %>% arrange(PATNO, EVENT_ID, desc(P)) %>% distinct(PATNO, EVENT_ID, .keep_all = T) 
temp2 %>% filter(P == 0.5) %>% nrow %>% cat("\nFor", ., "observations, the most frequent DATE of the EVENT_ID has P = 0.5,
DATEs of those EVENT_IDs will be determined by the order of the filenames (because we cannot determine which is true).\n
Next, check the chronological order of EVENT_ID. 
EVENT_ID of scheduled site visits and telephone visits are used to detect inconsistency.\n")

# Code to see the names of EVENT_IDs
# unique(temp$EVENT_ID) %>% .[-grep("^T", .)] %>% paste(., collapse = '","')

Visits = c("SC", "BL", sprintf("%s%02d", "V", 1:20))
TVisits= c("TSC", "TBL", sprintf("%s%02d", "T", 1:99))
temp3.1 = temp2 %>% 
  filter(EVENT_ID %in% Visits) %>% 
  mutate(V = factor(EVENT_ID, levels = Visits)) %>% 
  arrange(PATNO, V) %>% 
  group_by(PATNO) %>% 
  mutate(DATEbef = lag(DATE, default = first(DATE)),
         DATEaft = lead(DATE, default = last(DATE))) %>% 
  data.frame %>% filter(DATE < DATEbef | DATE > DATEaft ) %>% select(-"V")
temp3.2 = temp2 %>% 
  filter(EVENT_ID %in% TVisits) %>% 
  mutate(V = factor(EVENT_ID, levels = TVisits)) %>% 
  arrange(PATNO, V) %>% 
  group_by(PATNO) %>% 
  mutate(DATEbef = lag(DATE, default = first(DATE)),
         DATEaft = lead(DATE, default = last(DATE))) %>% 
  data.frame %>% filter(DATE < DATEbef | DATE > DATEaft ) %>% select(-"V")
temp3 = bind_rows(temp3.1, temp3.2) # the list of events that are not chronologically consistent
nrow(temp2) %>% cat("\nAmong", ., "obs, ")
nrow(temp3) %>% cat(., " will be deleted because chronological order is inconsistent")
temp4 = anti_join(temp2, temp3, by=c("PATNO", "EVENT_ID"))
cat("\nAfter excluding these obs, check chronological orders again.")
temp4 %>% 
  filter(EVENT_ID %in% Visits) %>% 
  mutate(V = factor(EVENT_ID, levels = Visits)) %>% 
  arrange(PATNO, V) %>% 
  group_by(PATNO) %>% 
  mutate(DATEdiff = DATE - lag(DATE, default = first(DATE))) %>% 
  data.frame %>% filter(DATEdiff<0) %>% select(-"V") %>% nrow %>%
  cat("\n",., ": Number of obs still problematic. 0 is expected." )
temp4 %>% 
  filter(EVENT_ID %in% TVisits) %>% 
  mutate(V = factor(EVENT_ID, levels = TVisits)) %>% 
  arrange(PATNO, V) %>% 
  group_by(PATNO) %>% 
  mutate(DATEdiff = DATE - lag(DATE, default = first(DATE))) %>% 
  data.frame %>% filter(DATEdiff<0) %>% select(-"V")%>% nrow %>%
  cat("\n",., ": Number of obs still problematic. 0 is expected." )
cat("\nSave PATNO_EVENT_DATE matching file")
EDtable = temp4 %>% select(PATNO, EVENT_ID, DATE)
write.csv(EDtable, paste(OUTPUT, "PATNO_EVENTID_DATE.csv", sep="/"), row.names=F)

Populate DATE and EVENT_ID from all the available files.

Skip 'Signature_Form.csv' because it contains many EVENT_ID which are not seen in other files.
Additionally, the following files will be skipped because they don't contain PATNO in the file;
 AV-133_SBR_Results.csv , Code_List.csv , Data_Dictionary.csv , Derived_Variable_Definitions_and_Score_Calculations.csv , FOUND_RFQ_Alcohol.csv , FOUND_RFQ_Anti-Inflammatory_Meds.csv , FOUND_RFQ_Caffeine.csv , FOUND_RFQ_Calcium_Channel_Blockers.csv , FOUND_RFQ_Female_Reproductive_Health.csv , FOUND_RFQ_Head_Injury.csv , FOUND_RFQ_Height___Weight.csv , FOUND_RFQ_Occupation.csv , FOUND_RFQ_Pesticides_Non-Work.csv , FOUND_RFQ_Pesticides_at_Work.csv , FOUND_RFQ_Physical_Activity.csv , FOUND_RFQ_Residential_History.csv , FOUND_RFQ_Smoking_History.csv , FOUND_RFQ_Toxicant_History.csv , IUSM_BIOSPECIMEN_CELL_CATALOG.csv , IUSM_CATALOG.csv , MRI_Imaging_Data_Transfer_Information_Source_Document.csv , Olfactory_UPSIT.csv , Page_Descriptions.csv , SPE

                                       
FILE                                    TRUE
  AV-133_Imaging.csv                       3
  Conclusion_of_Study_Participation.csv  353
  Contact_Information_Brain_Bank.csv     267
  Contact_Information_FOUND.csv            9
  DNA_Sample_Collection.csv                2
  DaTscan_Imaging.csv                     42
  Features_of_REM_Behavior_Disorder.csv    5
  Florbetaben_Eligibility.csv             14
  Gait_Data___Arm_swing.csv                1
  General_Medical_History.csv              5
  Genetic_Testing_Results.csv           1119
  Inclusion_Exclusion.csv                  3
  Lumbar_Puncture_Sample_Collection.csv    5
  Magnetic_Resonance_Imaging.csv           2
  Research_Advance_Directive.csv           2
  Surgery_for_Parkinson_Disease.csv        7
  TAP-PD_Conclusion.csv                   12
  TAP-PD_OPDM_Assessment.csv               1
  TAP-PD_Subject_Eligibility.csv           3
  Use_of_PD_Medication.csv                 3
  Vital_Signs.c


 1544 observations have different DATEs for the same EVENT_ID.
For these events, take the most frequent DATE
For 4 observations, the most frequent DATE of the EVENT_ID has P = 0.5,
DATEs of those EVENT_IDs will be determined by the order of the filenames (because we cannot determine which is true).

Next, check the chronological order of EVENT_ID. 
EVENT_ID of scheduled site visits and telephone visits are used to detect inconsistency.

Among 15121 obs, 2  will be deleted because chronological order is inconsistent
After excluding these obs, check chronological orders again.
 0 : Number of obs still problematic. 0 is expected.
 0 : Number of obs still problematic. 0 is expected.
Save PATNO_EVENT_DATE matching file
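
The "take the most frequent DATE" rule used in this step can be illustrated on toy data. This is a minimal sketch, not the exact code of the cell above: the records are invented, and it resolves ties by keeping the earliest date, which is an assumption made here (the notebook itself lets the order of the filenames decide).

```r
library(dplyr)

# Toy records (made up): three files report BL for participant "1001",
# and two of them agree on the date
toy <- data.frame(
  PATNO    = c("1001", "1001", "1001", "1002"),
  EVENT_ID = c("BL", "BL", "BL", "BL"),
  DATE     = c(15034, 15034, 15065, 15100),
  stringsAsFactors = FALSE
)

modal <- toy %>%
  count(PATNO, EVENT_ID, DATE, name = "NUM_E_D") %>%  # records agreeing on each date
  group_by(PATNO, EVENT_ID) %>%
  mutate(P = NUM_E_D / sum(NUM_E_D)) %>%              # share of records with that date
  arrange(desc(P), .by_group = TRUE) %>%              # ties keep the earlier DATE (assumption)
  slice(1) %>%                                        # one row per PATNO-EVENT_ID
  ungroup()

print(modal)
```

Rows with `P < 1` in the result mark events whose source files disagreed, which is the same quantity the cell above reports before choosing a date.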

## Step 2 PD diagnosis

In [3]:
cat("Pull information from 'Patient_Status.csv'\n\n")
temp = fread(paste(FOLDER, 'Patient_Status.csv', sep='/'), 
                 colClasses = c("PATNO"="character")) %>%
  mutate(RECRUIT = RECRUITMENT_CAT,
         IMG = IMAGING_CAT,
         SUBCAT = DESCRP_CAT) %>% 
  select(PATNO, RECRUIT, IMG, SUBCAT, ENROLL_DATE, ENROLL_STATUS)
cat("Check ENROLL_STATUS by contingency table. 
Note: dup > 1 indicates duplicated IDs in the file, which need to be checked.\n")
temp %>% group_by(PATNO) %>% mutate(dup = n()) %>% with(table(dup, ENROLL_STATUS, useNA = "always"))
cat("\nContingency table for cohorts and image results. RECRUIT=PD has SWEDD category. GEN has image but not REG")
temp %>% with(table(RECRUIT, IMG, useNA = "always"))
cat("\nContingency table for RECRUIT cohort and Subcategories. RECRUIT=PRODROMA has HYP and RBD. RECRUIT=GEN/REG have genetics information")
temp %>% with(table(RECRUIT, SUBCAT, useNA = "always"))
STATUS = temp

Pull information from 'Patient_Status.csv'

Check ENROLL_STATUS by contingency table. 
Note: dup > 1 indicates duplicated IDs in the file, which need to be checked.


      ENROLL_STATUS
dup    Declined Enrolled Excluded Pending Withdrew <NA>
  1         111     1679      201      19      131    0
  <NA>        0        0        0       0        0    0


Contingency table for cohorts and image results. RECRUIT=PD has SWEDD category. GEN has image but not REG

          IMG
RECRUIT    GENPD GENUN  HC  PD PRODROMA REGPD REGUN SWEDD no image <NA>
  GENPD      209     0   0   0        0     0     0     0       60    0
  GENUN        0   335   0   0        0     0     0     0       28    0
  HC           0     0 241   0        0     0     0     0        0    0
  PD           0     0   0 450        0     0     0    79       41    0
  PRODROMA     0     0   0   0      194     0     0     0       22    0
  REGPD        0     0   0   0        0     4     0     0      209    0
  REGUN        0     0   0   0        0     0     1     0      268    0
  <NA>         0     0   0   0        0     0     0     0        0    0


Contingency table for RECRUIT cohort and Subcategories. RECRUIT=PRODROMA has HYP and RBD. RECRUIT=GEN/REG have genetics information

          SUBCAT
RECRUIT        GBA+ GBA- HYP LRRK2+ LRRK2- RBD SNCA+ SNCA- <NA>
  GENPD      0   82    0   0    165      0   0    22     0    0
  GENUN      0  149    1   0    204      2   0     7     0    0
  HC       241    0    0   0      0      0   0     0     0    0
  PD       570    0    0   0      0      0   0     0     0    0
  PRODROMA   1    0    0 119      0      0  96     0     0    0
  REGPD      0   52    0   0    156      0   0     5     0    0
  REGUN      0  102    2   0    144     12   0     7     2    0
  <NA>       0    0    0   0      0      0   0     0     0    0
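
The duplicate-ID check in the first table only shows counts. If `dup > 1` ever appeared, the affected participants would need to be listed. A minimal sketch of such a listing, assuming a data frame with a `PATNO` column; `status_toy` and its values are hypothetical.

```r
library(dplyr)

# Hypothetical status table in which participant "1002" appears twice
status_toy <- data.frame(
  PATNO         = c("1001", "1002", "1002", "1003"),
  ENROLL_STATUS = c("Enrolled", "Enrolled", "Withdrew", "Declined"),
  stringsAsFactors = FALSE
)

# Count rows per participant and keep only IDs that occur more than once
dups <- status_toy %>%
  count(PATNO, name = "dup") %>%
  filter(dup > 1)

print(dups)
```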

In [4]:
cat("Obtain the initial diagnosis and the latest diagnosis from 'Primary_Diagnosis.csv'.\n\n")
dxcode=data.frame(
    CODE = c(1:18, 23, 24, 97), 
    DX = c("iPD", "Alz", "FTD_Chr17", "CBD", "DLB", "Dysto_DopaR", "ET", "Hemi_PKS", "ARPD", "MNDwPKS",
            "MSA", "DrugPKS", "NPH", "PSP", "Psychogenic", "VasPKS", "NoDisease", "SCA", "PrdrNM", "PrdrM","OtherD"))
diag1=fread(paste(FOLDER, 'Primary_Diagnosis.csv', sep='/'), 
            colClasses = c("PATNO"="character")) %>%
  filter(!(is.na(PATNO))) %>% 
  mutate(DXCODE = as.numeric(PRIMDIAG)) %>% 
  select(PATNO, EVENT_ID, DXCODE) %>% 
  inner_join(., EDtable, by = c("PATNO", "EVENT_ID")) %>%
  arrange(desc(DATE)) %>% 
  distinct(PATNO, .keep_all = T) %>%
  left_join(., dxcode, by = c("DXCODE"="CODE")) %>% 
  rename(DATE_LASTDX = DATE, DX_LAST = DX, EVENT_LAST=EVENT_ID)
diag2=fread(paste(FOLDER, 'Primary_Diagnosis.csv', sep='/'), 
            colClasses = c("PATNO"="character")) %>%
  filter(!(is.na(PATNO))) %>% 
  mutate(DXCODE = as.numeric(PRIMDIAG)) %>% 
  select(PATNO, EVENT_ID, DXCODE) %>% 
  inner_join(., EDtable, by = c("PATNO", "EVENT_ID")) %>%
  arrange(DATE) %>% 
  distinct(PATNO, .keep_all = T) %>%
  left_join(., dxcode, by = c("DXCODE"="CODE")) %>% 
  rename(DATE_INITDX = DATE, DX_INIT = DX, EVENT_INIT=EVENT_ID) 
diag_PPMI = full_join(diag1, diag2, by = "PATNO") %>% select(-starts_with("DXCODE"))
cat("Contingency table for DX_LAST x RECRUIT cohort. The file only covers primary PPMI cohorts")
left_join(STATUS, diag_PPMI, by = "PATNO") %>% mutate(DX_LAST = as.character(DX_LAST)) %>% with(table(DX_LAST, RECRUIT, useNA = "always"))

cat("Further pull the initial and the last DX from 'Prodromal_Diagnostic_Questionnaire.csv'.")
diag1=fread(paste(FOLDER, 'Prodromal_Diagnostic_Questionnaire.csv', sep='/'), 
            colClasses = c("PATNO"="character")) %>%
  filter(!(is.na(PATNO))) %>% 
  mutate(DXCODE = as.numeric(PRIMDIAG)) %>% 
  select(PATNO, EVENT_ID, DXCODE) %>% 
  inner_join(., EDtable, by = c("PATNO", "EVENT_ID")) %>%
  arrange(desc(DATE)) %>% 
  distinct(PATNO, .keep_all = T) %>%
  left_join(., dxcode, by = c("DXCODE"="CODE")) %>% 
  rename(DATE_LASTDX = DATE, DX_LAST = DX, EVENT_LAST=EVENT_ID)
diag2=fread(paste(FOLDER, 'Prodromal_Diagnostic_Questionnaire.csv', sep='/'), 
            colClasses = c("PATNO"="character")) %>%
  filter(!(is.na(PATNO))) %>% 
  mutate(DXCODE = as.numeric(PRIMDIAG)) %>% 
  select(PATNO, EVENT_ID, DXCODE) %>% 
  inner_join(., EDtable, by = c("PATNO", "EVENT_ID")) %>%
  arrange(DATE) %>% 
  distinct(PATNO, .keep_all = T) %>%
  left_join(., dxcode, by = c("DXCODE"="CODE")) %>% 
  rename(DATE_INITDX = DATE, DX_INIT = DX, EVENT_INIT=EVENT_ID) 
diag_PROD = full_join(diag1, diag2, by = "PATNO") %>% select(-starts_with("DXCODE"))
left_join(STATUS, diag_PROD, by = "PATNO") %>%  mutate(DX_LAST = as.character(DX_LAST)) %>% with(table(DX_LAST, RECRUIT, useNA = "always"))

Obtain the initial diagnosis and the latest diagnosis from 'Primary_Diagnosis.csv'.

Contingency table for DX_LAST x RECRUIT cohort. The file only covers primary PPMI cohorts

             RECRUIT
DX_LAST       GENPD GENUN  HC  PD PRODROMA REGPD REGUN <NA>
  CBD             0     0   1   1        0     0     0    0
  DLB             0     0   0   3        0     0     0    0
  ET              0     0   2   4        0     0     0    0
  MSA             0     0   0   5        0     0     0    0
  NoDisease       0     0 222   2        0     0     0    0
  OtherD          0     0   6   6        0     0     0    0
  Psychogenic     0     0   0   2        0     0     0    0
  iPD             0     0   3 531        0     0     0    0
  <NA>          269   363   7  16      216   213   269    0

Further pull the initial and the last DX from 'Prodromal_Diagnostic_Questionnaire.csv'.

           RECRUIT
DX_LAST     GENPD GENUN  HC  PD PRODROMA REGPD REGUN <NA>
  DLB           0     0   0   0        2     0     0    0
  DrugPKS       0     1   0   0        0     0     1    0
  ET            0    10   0   0        1     0     2    0
  Hemi_PKS      0     1   0   0        0     0     0    0
  MSA           0     0   0   0        2     0     0    0
  NoDisease     0   305   0   0       55     0   242    0
  OtherD        0     2   0   0        5     0     8    0
  PrdrM         0    21   0   0        8     0     5    0
  PrdrNM        0     4   0   0      103     0     1    0
  iPD           0    11   0   0       12     0     3    0
  <NA>        269     8 241 570       28   213     7    0
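
The cell above derives the initial and the latest diagnosis with the same pattern: sort by DATE, then keep the first row per participant. A minimal sketch on invented data; `dx_toy` and its values are hypothetical, and the column names follow the ones used in the notebook.

```r
library(dplyr)

# Toy diagnosis history (made up): "1001" changes diagnosis over time,
# "1002" has a single record
dx_toy <- data.frame(
  PATNO    = c("1001", "1001", "1001", "1002"),
  EVENT_ID = c("SC", "V04", "V08", "SC"),
  DX       = c("NoDisease", "PrdrNM", "iPD", "iPD"),
  DATE     = c(15000, 15400, 15800, 15100),
  stringsAsFactors = FALSE
)

# Latest record per participant: sort descending, keep the first row per PATNO
last_dx <- dx_toy %>%
  arrange(desc(DATE)) %>%
  distinct(PATNO, .keep_all = TRUE) %>%
  rename(DATE_LASTDX = DATE, DX_LAST = DX, EVENT_LAST = EVENT_ID)

# Earliest record per participant: sort ascending, keep the first row per PATNO
init_dx <- dx_toy %>%
  arrange(DATE) %>%
  distinct(PATNO, .keep_all = TRUE) %>%
  rename(DATE_INITDX = DATE, DX_INIT = DX, EVENT_INIT = EVENT_ID)

both <- full_join(last_dx, init_dx, by = "PATNO") %>%
  mutate(DAYS_DXDIFF = DATE_LASTDX - DATE_INITDX)

print(both)
```

A participant with a single record gets the same row as both initial and latest diagnosis, so `DAYS_DXDIFF` is 0 for them.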

In [5]:
diag = bind_rows(diag_PPMI, diag_PROD) %>% mutate(DAYS_DXDIFF = DATE_LASTDX - DATE_INITDX)
cat("\nCompare the initial diagnosis and the last one. Note this contingency table includes people who only had an SC visit")
diag %>% mutate_at(vars(starts_with("DX")), as.character) %>% 
  with(table(DX_LAST, DX_INIT, useNA = "always"))
cat("\nCompare the initial diagnosis and the last one. Exclude people who only had an SC visit")
diag %>% mutate_at(vars(starts_with("DX")), as.character) %>% 
  filter(DAYS_DXDIFF>0) %>% 
  with(table(DX_LAST, DX_INIT, useNA = "always"))


Compare the initial diagnosis and the last one. Note this contingency table includes people who only had an SC visit

             DX_INIT
DX_LAST       DLB  ET Hemi_PKS NoDisease OtherD PrdrM PrdrNM Psychogenic iPD
  CBD           0   0        0         1      0     0      0           0   1
  DLB           1   0        0         0      0     0      2           0   2
  DrugPKS       0   0        0         2      0     0      0           0   0
  ET            0  11        0         6      0     0      0           0   2
  Hemi_PKS      0   0        1         0      0     0      0           0   0
  MSA           0   0        0         0      1     0      1           0   5
  NoDisease     0   2        0       821      1     0      0           0   2
  OtherD        0   0        0         5     16     0      1           0   5
  PrdrM         0   0        0        22      0     5      7           0   0
  PrdrNM        0   0        0         2      0     0    106           0   0
  Psychogenic   0   0        0         0      0     0      0           1   1
  iPD           0   0        0         7      1     2  


Compare the initial diagnosis and the last one. Exclude people who only had an SC visit

             DX_INIT
DX_LAST        ET NoDisease OtherD PrdrM PrdrNM iPD <NA>
  CBD           0         1      0     0      0   1    0
  DLB           0         0      0     0      2   2    0
  DrugPKS       0         2      0     0      0   0    0
  ET            3         6      0     0      0   2    0
  MSA           0         0      1     0      1   5    0
  NoDisease     2       495      1     0      0   2    0
  OtherD        0         5      3     0      1   5    0
  PrdrM         0        22      0     1      7   0    0
  PrdrNM        0         2      0     0     39   0    0
  Psychogenic   0         0      0     0      0   1    0
  iPD           0         7      1     2      8 414    0
  <NA>          0         0      0     0      0   0    0

In [6]:
temp = left_join(STATUS, diag, by = "PATNO")
cat("Now all enrolled people have a diagnosis. \n TOP : recruitment category and initial diagnosis \n BOTTOM : the same but with latest diagnosis. \n 13 recruited as PD were diagnosed differently")
temp %>% filter(ENROLL_STATUS == "Enrolled") %>% mutate(DX_INIT = as.character(DX_INIT)) %>% with(table(DX_INIT, RECRUIT, useNA = "always"))
temp %>% filter(ENROLL_STATUS == "Enrolled") %>% mutate(DX_LAST = as.character(DX_LAST)) %>% with(table(DX_LAST, RECRUIT, useNA = "always"))

Now all enrolled people have a diagnosis. 
 TOP : recruitment category and initial diagnosis 
 BOTTOM : the same but with latest diagnosis. 
 13 recruited as PD were diagnosed differently

           RECRUIT
DX_INIT     GENPD GENUN  HC  PD PRODROMA REGPD REGUN <NA>
  ET            0     6   1   0        0     0     2    0
  NoDisease     0   319 166   0        2     0   240    0
  OtherD        0     2   2   0        2     0     7    0
  PrdrM         0     2   0   0        1     0     3    0
  PrdrNM        0     1   0   0       55     0     1    0
  iPD           0     7   0 413        0     0     3    0
  <NA>        243     3   0   0        0   198     0    0

           RECRUIT
DX_LAST     GENPD GENUN  HC  PD PRODROMA REGPD REGUN <NA>
  CBD           0     0   1   1        0     0     0    0
  DLB           0     0   0   2        2     0     0    0
  DrugPKS       0     1   0   0        0     0     1    0
  ET            0     9   2   2        0     0     2    0
  MSA           0     0   0   4        2     0     0    0
  NoDisease     0   292 158   0        1     0   236    0
  OtherD        0     2   5   4        1     0     8    0
  PrdrM         0    20   0   0        6     0     5    0
  PrdrNM        0     3   0   0       37     0     1    0
  iPD           0    10   3 400       11     0     3    0
  <NA>        243     3   0   0        0   198     0    0

In [7]:
cat("Create New Status/Diagnosis/Image Indicator")
temp1= temp %>% filter(!is.na(DX_INIT)) %>%
  mutate(DIAG = paste(
  ifelse(ENROLL_STATUS=="Enrolled", "In_", "Out"),
  case_when(
    is.na(DX_LAST) ~ "XXX",
    DX_LAST == "iPD" ~ "iPD",
    DX_LAST == "NoDisease" ~ "CTR",
    DX_LAST == "PrdrM" | DX_LAST == "PrdrNM" ~ "PRD",
    DX_LAST == "OtherD" ~ "DFR",
    DX_LAST == "ET" ~ "EST",
    DX_LAST == "DrugPKS" ~ "DRG",
    DX_LAST == "Psychogenic" ~ "PSY",
    TRUE ~ as.character(DX_LAST)),
  case_when(
    IMG == "SWEDD" ~ "SWEDD",
    IMG == "no image" ~ "NoImg",
    TRUE ~ "YsImg"),
  ifelse(SUBCAT=="" & I(RECRUIT %in% c("PD", "HC")) , "PPMI", SUBCAT),
  sep = "_"
))
cat("STATUS of PPMI/HC cohorts.")
temp1 %>% .[grep("PPMI", .$DIAG), ] %>% 
  with(table(DIAG, RECRUIT, useNA = "always"))
cat("STATUS of other cohorts.")
temp1 %>% .[-grep("PPMI", .$DIAG), ] %>% 
  with(table(DIAG, RECRUIT, useNA = "always"))

Create New Status/Diagnosis/Image IndicatorSTATUS of PPMI/HC cohorts.

                    RECRUIT
DIAG                  HC  PD <NA>
  In__CBD_YsImg_PPMI   1   1    0
  In__CTR_YsImg_PPMI 158   0    0
  In__DFR_SWEDD_PPMI   0   4    0
  In__DFR_YsImg_PPMI   5   0    0
  In__DLB_YsImg_PPMI   0   2    0
  In__EST_SWEDD_PPMI   0   2    0
  In__EST_YsImg_PPMI   2   0    0
  In__MSA_YsImg_PPMI   0   4    0
  In__iPD_NoImg_PPMI   0   5    0
  In__iPD_SWEDD_PPMI   0  47    0
  In__iPD_YsImg_PPMI   3 348    0
  Out_CTR_SWEDD_PPMI   0   2    0
  Out_CTR_YsImg_PPMI  64   0    0
  Out_DFR_SWEDD_PPMI   0   2    0
  Out_DFR_YsImg_PPMI   1   0    0
  Out_DLB_NoImg_PPMI   0   1    0
  Out_EST_NoImg_PPMI   0   1    0
  Out_EST_YsImg_PPMI   0   1    0
  Out_MSA_YsImg_PPMI   0   1    0
  Out_PSY_NoImg_PPMI   0   1    0
  Out_PSY_SWEDD_PPMI   0   1    0
  Out_iPD_NoImg_PPMI   0  17    0
  Out_iPD_SWEDD_PPMI   0  21    0
  Out_iPD_YsImg_PPMI   0  93    0
  <NA>                 0   0    0

STATUS of other cohorts.

                         RECRUIT
DIAG                      GENUN PRODROMA REGUN <NA>
  In__CTR_NoImg_GBA+          5        0    89    0
  In__CTR_NoImg_GBA-          0        0     1    0
  In__CTR_NoImg_LRRK2+        7        0   133    0
  In__CTR_NoImg_LRRK2-        1        0    10    0
  In__CTR_NoImg_SNCA+         0        0     3    0
  In__CTR_YsImg_GBA+        124        0     0    0
  In__CTR_YsImg_HYP           0        1     0    0
  In__CTR_YsImg_LRRK2+      149        0     0    0
  In__CTR_YsImg_SNCA+         6        0     0    0
  In__DFR_NoImg_GBA+          0        0     5    0
  In__DFR_NoImg_LRRK2+        0        0     2    0
  In__DFR_NoImg_SNCA+         0        0     1    0
  In__DFR_YsImg_LRRK2+        2        0     0    0
  In__DFR_YsImg_RBD           0        1     0    0
  In__DLB_YsImg_RBD           0        2     0    0
  In__DRG_NoImg_LRRK2+        0        0     1    0
  In__DRG_YsImg_LRRK2+        1        0     0    0
  In__EST_NoImg_GBA+          0