# KRAS Analysis Demo
## Motivation
In the paper ["Tumor RAS Gene Expression Levels Are Influenced by the Mutational Status of RAS Genes and Both Upstream and Downstream RAS Pathway Genes"](https://journals.sagepub.com/doi/pdf/10.1177/1176935117711944), the authors studied the relationship between RAS gene mutational status and messenger RNA expression. For several cancer subtypes, they saw higher KRAS expression in samples with a KRAS mutation than in samples without one. The analysis was conducted using patient data from [The Cancer Genome Atlas (TCGA)](https://www.cancer.gov/ccg/research/genome-sequencing/tcga) project, and *we are curious whether similar trends hold in cell line models.*
![KRAS expression is elevated in KRAS-mutant samples from lung, pancreatic, and colon adenocarcinomas relative to wild-type (WT) samples.](https://raw.githubusercontent.com/fhdsl/S1_Intro_to_R/main/images/kras.png){alt="KRAS expression is elevated in KRAS-mutant samples from lung, pancreatic, and colon adenocarcinomas relative to wild-type (WT) samples." width="450"}
The cell line models we use are from the [Dependency Map Project](https://depmap.org/portal/home/) (DepMap), where over a thousand cancer cell lines were profiled for various genomic features, including mutational status and RNA expression. Below is the code to re-create the analysis:
## Analysis
### Load the analysis package and DepMap data
```{r, message=FALSE}
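# Start from a clean workspace
rm(list = ls())
library(tidyverse)
# Load the metadata, expression, and mutation data frames used below
load(url("https://github.com/fhdsl/Intro_to_R/raw/main/classroom_data/CCLE.RData"))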
```
### Examine the number of cell lines profiled for `metadata`, `expression`, `mutation`
```{r}
dim(metadata)
dim(expression)
dim(mutation)
```
### Examine the frequency of cancer subtypes in `metadata`
```{r}
knitr::kable(table(metadata$OncotreePrimaryDisease))
```
### Filter rows for cancer subtype in metadata, then join datasets together
```{r}
# Keep the cancer types from the paper's figure (lung label assumed; check the table above)
metadata_filtered = metadata %>%
  filter(OncotreePrimaryDisease %in% c("Non-Small Cell Lung Cancer", "Colorectal Adenocarcinoma", "Pancreatic Adenocarcinoma"))
analysis = full_join(metadata_filtered, expression, by = "ModelID")
analysis = full_join(analysis, mutation, by = "ModelID")
```
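
To see why the later `drop_na()` call matters, here is a minimal sketch on made-up data. The `toy_*` tables and their values are hypothetical and are not DepMap data. `full_join()` keeps every `ModelID` found in either table, so cell lines that were filtered out of the metadata come back with `NA` in the metadata columns.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-ins for metadata_filtered and expression
toy_metadata <- data.frame(
  ModelID = c("A", "B"),
  OncotreePrimaryDisease = c("Colorectal Adenocarcinoma", "Pancreatic Adenocarcinoma")
)
toy_expression <- data.frame(
  ModelID = c("A", "B", "C"),
  KRAS_Exp = c(4.2, 5.1, 3.3)
)

# full_join keeps "C" even though it is absent from toy_metadata
toy_full <- full_join(toy_metadata, toy_expression, by = "ModelID")

# drop_na removes rows containing any NA, leaving only "A" and "B"
toy_complete <- drop_na(toy_full)

# inner_join reaches the same rows without the intermediate NAs
toy_inner <- inner_join(toy_metadata, toy_expression, by = "ModelID")
```

In this toy case the two routes give the same rows. On the real tables, `drop_na()` also removes rows with a missing value in any other column, so the two approaches are not guaranteed to match.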
### Create figure
```{r}
ggplot(drop_na(analysis), aes(x = KRAS_Mut, y = KRAS_Exp)) +
  geom_boxplot() +
  facet_wrap(~OncotreePrimaryDisease) +
  theme_bw()
```
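
A boxplot is easier to read alongside group sizes and medians. The sketch below shows one way to tabulate them, using a hypothetical toy table with the same three columns as the figure. The values are invented for illustration, and `KRAS_Mut` is assumed to be a logical column here.

```r
library(dplyr)

# Hypothetical stand-in for the joined analysis table
toy_analysis <- data.frame(
  OncotreePrimaryDisease = rep(
    c("Colorectal Adenocarcinoma", "Pancreatic Adenocarcinoma"),
    each = 4
  ),
  KRAS_Mut = rep(c(TRUE, TRUE, FALSE, FALSE), times = 2),
  KRAS_Exp = c(5.0, 6.0, 3.0, 4.0, 7.0, 8.0, 4.0, 5.0)
)

# Count cell lines and take the median expression per subtype and mutation status
toy_summary <- toy_analysis %>%
  group_by(OncotreePrimaryDisease, KRAS_Mut) %>%
  summarise(n = n(), median_exp = median(KRAS_Exp), .groups = "drop")
```

Replacing `toy_analysis` with `drop_na(analysis)` would summarise the same rows that the figure plots.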