# CRC - CMS subtypes v. Hypo-macp
```
pi:ababaian
files: ~/Crown/data2/cms_crc
start: 2019 07 26
complete : 2019 07 30
```
## Introduction

Consensus molecular subtypes (CMS) of colorectal cancer (CRC) are an integrated classification of this cancer that considers RNA-seq, DNA methylation and other clinical variables. This is the gold standard for classifying CRC into distinct subtypes [Guinney et al., 2015](https://www.ncbi.nlm.nih.gov/pubmed/26457759).

In particular, CMS Type 2 (CMS2) is marked by high somatic copy number alterations (SCNA; proliferation) and by WNT and MYC activation. The CMS classification has been applied to a subset of TCGA data and can be downloaded [here](https://www.synapse.org/#!Synapse:syn2634729).


## Objective / Hypothesis

There is an enrichment of the hypo-macp phenotype in TCGA-COAD / TCGA-READ patients classified as CMS2, relative to CMS1, CMS3 or CMS4.


## Materials and Methods


The `tcga.macp.RData` analysis file from `~/crown/data2/tcga2_gvcf/` (publication version at time of writing) was copied to `tcga.macp_190727.RData`. This data.frame will be used for macp measurements (A replicates).

`cms_tcga.txt` is the TCGA-sample subset of the `cms.label.txt` file from the original publication, downloaded [here](https://www.synapse.org/#!Synapse:syn2634729) on 190711.
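
A minimal sketch of how the CMS labels could be intersected with the macp table is given here. The column names (`barcode`, `sample`, `cms`), the toy barcodes and the `"unlabelled"` placeholder are all assumptions for illustration and are not taken from `tcga.macp_190727.RData` or `cms_tcga.txt`. It assumes the label file is keyed by the 12-character TCGA patient identifier, and keeps samples without a CMS call as a separate group.

```r
# Toy stand-ins for the two inputs; real column names may differ
macp_toy <- data.frame(
  barcode = c("TCGA-AA-0001-01A", "TCGA-AA-0002-01A", "TCGA-AG-0003-01A"),
  VAF     = c(0.31, 0.12, 0.40),
  stringsAsFactors = FALSE
)
cms_toy <- data.frame(
  sample = c("TCGA-AA-0001", "TCGA-AG-0003"),
  cms    = c("CMS2", "CMS4"),
  stringsAsFactors = FALSE
)

# Assumption: the patient identifier is the first 12 characters of the barcode
macp_toy$patient <- substr(macp_toy$barcode, 1, 12)

# Left join so that libraries without a CMS label are retained
merged <- merge(macp_toy, cms_toy,
                by.x = "patient", by.y = "sample", all.x = TRUE)

# Hypothetical placeholder for libraries with no CMS call
merged$cms[is.na(merged$cms)] <- "unlabelled"
```

A left join is used so that the number of macp libraries is unchanged by the merge; an inner join would silently drop unlabelled samples from any background rate computed afterwards.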

Data were imported into R and analyzed there. An exact binomial test will be used to test for over-representation of each CMS subtype among samples with a binary hypo-macp classification. The mean and variance of macp modification in each group will also be compared.
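
The tests named above can be sketched on toy data as follows. The data are simulated, the 0.25 cutoff is a hypothetical stand-in for the normal-derived threshold, and `bartlett.test` is only one possible choice for the variance comparison (it is an assumption, not something recorded in this entry).

```r
set.seed(1)  # toy data only; these are not the TCGA measurements

toy <- data.frame(
  cms = rep(c("CMS1", "CMS2", "CMS3", "CMS4"), times = c(20, 40, 20, 30)),
  VAF = runif(110, min = 0.05, max = 0.60)
)

# Hypothetical cutoff standing in for the normal-derived hypo-macp threshold
toy$hypo <- toy$VAF <= 0.25

# Background rate of hypo calls across all toy samples
background <- mean(toy$hypo)

# Exact binomial test per subtype against the background rate
binom_by_cms <- lapply(split(toy$hypo, toy$cms), function(h) {
  binom.test(sum(h), length(h), p = background)
})

# Compare mean VAF across subtypes, then CMS2 against all other samples
fit   <- aov(VAF ~ cms, data = toy)
welch <- t.test(toy$VAF[toy$cms != "CMS2"], toy$VAF[toy$cms == "CMS2"])

# One possible test for equal variance between groups (assumption)
bart <- bartlett.test(VAF ~ cms, data = toy)
```

`summary(fit)` prints the ANOVA table, and `binom_by_cms$CMS2` prints the test for a single subtype.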

cd ~/Crown/data2/cms_crc/

cat cms_crc.Rmd

---
title: "CRC_CMS"
output: html_document
---

Date: 190726

```{r setup, include=FALSE}
library(reshape2)
library(ggplot2)
```

# Import + Intersect Data

```{r}
# Load MACP df and parse
  load('tcga.macp_190727.RData')

# RAF / VAF
  MACP$RAF = MACP$T / MACP$DP
  MACP$VAF = 1 - MACP$RAF
  
# Subset data for analysis
# COAD and READ cohorts only
  Normals = which( grepl( '11', MACP$'lib.code') & ( grepl('TCGA-COAD', MACP$cohort) | grepl('TCGA-READ', MACP$cohort)))
  Cancers = which(!grepl( '11', MACP$'lib.code') & ( grepl('TCGA-COAD', MACP$cohort) | grepl('TCGA-READ', MACP$cohort)))

# Calculate global "Normal" lower-tail VAF thresholds
# (5% and 1% quantiles; mean - 2 SD and mean - 3 SD)
  q95 = quantile( MACP$VAF[ Normals ], 0.05)
  q99 = quantile( MACP$VAF[ Normals ], 0.01)
  sd2 = mean(MACP$VAF[Normals]) - 2*sd(MACP$VAF[Normals])
  sd3 = mean(MACP$VAF[Normals]) - 3*sd(MACP$VAF[Normals])
  #sd3 = 0.10

# Define hypo-mod libraries
  MACP$hypo.macp = ".normo"
  MACP$hypo.macp[which(MACP$VAF <= sd3
```

## Results


```
Exact binomial test

data:  length(which(MACP$cms.hypo == "CMS1.hypo")) and length(which(MACP$cms == "CMS1"))
number of successes = 28, number of trials = 76, p-value = 0.4181
alternative hypothesis: true probability of success is not equal to 0.4157895
95 percent confidence interval:
 0.2605756 0.4868564
sample estimates:
probability of success 
             0.3684211 


	Exact binomial test

data:  length(which(MACP$cms.hypo == "CMS2.hypo")) and length(which(MACP$cms == "CMS2"))
number of successes = 99, number of trials = 219, p-value = 0.3038
alternative hypothesis: true probability of success is not equal to 0.4157895
95 percent confidence interval:
 0.3849047 0.5205204
sample estimates:
probability of success 
             0.4520548 


	Exact binomial test

data:  length(which(MACP$cms.hypo == "CMS3.hypo")) and length(which(MACP$cms == "CMS3"))
number of successes = 29, number of trials = 72, p-value = 0.905
alternative hypothesis: true probability of success is not equal to 0.4157895
95 percent confidence interval:
 0.2887920 0.5250201
sample estimates:
probability of success 
             0.4027778 


	Exact binomial test

data:  length(which(MACP$cms.hypo == "CMS4.hypo")) and length(which(MACP$cms == "CMS4"))
number of successes = 55, number of trials = 143, p-value = 0.4975
alternative hypothesis: true probability of success is not equal to 0.4157895
95 percent confidence interval:
 0.3045443 0.4695715
sample estimates:
probability of success 
             0.3846154 

             Df Sum Sq Mean Sq F value Pr(>F)
cms           4  0.094 0.02341   1.104  0.354
Residuals   565 11.983 0.02121               

	Fisher's Exact Test for Count Data

data:  
p-value = 0.2277
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.3964906 1.2471432
sample estimates:
odds ratio 
  0.707898 


	Fisher's Exact Test for Count Data

data:  
p-value = 0.7363
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.4227787 1.7692961
sample estimates:
odds ratio 
 0.8657943 


	Fisher's Exact Test for Count Data

data:  
p-value = 0.884
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.5019211 1.7205283
sample estimates:
odds ratio 
 0.9335966 


	Fisher's Exact Test for Count Data

data:  
p-value = 0.496
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.6893966 2.1893830
sample estimates:
odds ratio 
  1.222431 


	Fisher's Exact Test for Count Data

data:  
p-value = 0.232
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.8408619 2.0773990
sample estimates:
odds ratio 
  1.318979 


	Fisher's Exact Test for Count Data

data:  
p-value = 0.8824
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.5781096 1.9995404
sample estimates:
odds ratio 
  1.078686 


	Welch Two Sample t-test

data:  MACP$VAF[which(MACP$cms != "CMS2")] and (MACP$VAF[which(MACP$cms == "CMS2")])
t = 1.801, df = 460.8, p-value = 0.07236
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.002057673  0.047213560
sample estimates:
mean of x mean of y 
0.3380758 0.3154979 

```


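There is no significant association between CMS classification and either hypo-macp status or macp VAF level. CMS2 has the highest hypo-macp fraction (99/219, 45.2%, against the 41.6% rate used as the null), but this is not significant (exact binomial test, p = 0.30). The mean VAF in CMS2 samples is lower than in the pooled non-CMS2 samples (0.315 vs. 0.338), but this difference is also not significant (Welch t-test, p = 0.072).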
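
The Fisher's exact test outputs carry no `data:` label. A sketch that rebuilds the 2x2 tables from the hypo counts and group sizes reported by the binomial tests is given here; the variable names are hypothetical. The sample odds ratios from these counts follow the same order as the six reported outputs (CMS1 vs CMS2, 1 vs 3, 1 vs 4, 2 vs 3, 2 vs 4, 3 vs 4), which suggests, but does not confirm, that the outputs are the pairwise subtype comparisons in that order.

```r
# Counts taken from the exact binomial test outputs in Results
hypo  <- c(CMS1 = 28, CMS2 = 99,  CMS3 = 29, CMS4 = 55)
total <- c(CMS1 = 76, CMS2 = 219, CMS3 = 72, CMS4 = 143)

# All pairs of subtypes, one pair per column
pairs <- combn(names(hypo), 2)

pairwise <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(i) {
  p   <- pairs[, i]
  # Rows: hypo / normo; columns: the two subtypes being compared
  tab <- rbind(hypo = hypo[p], normo = total[p] - hypo[p])
  ft  <- fisher.test(tab)
  data.frame(pair       = paste(p, collapse = " vs "),
             odds.ratio = unname(ft$estimate),
             p.value    = ft$p.value)
}))

# The null probability 0.4157895 is numerically 237/570; 570 matches the
# ANOVA sample count (4 + 565 + 1), assuming both use the same sample set
cms2_binom <- binom.test(hypo[["CMS2"]], total[["CMS2"]], p = 237 / 570)

print(pairwise)
```

If the attribution is right, a multiple-testing correction such as `p.adjust(pairwise$p.value, method = "BH")` would not change the conclusion, since none of the unadjusted p-values is below 0.05.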

## Discussion
