---
title: "initial_map_qc"
author: "Briana Mittleman"
date: "5/26/2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
This analysis looks at initial mapping QC for the two mappers I am using.
```{r}
library(workflowr)
library(ggplot2)
library(tidyr)
library(reshape2)
library(dplyr)
```
I created a CSV with the number of reads, the number of mapped reads, and the proportion of reads mapped for each library.
```{r}
# Load per-library read counts; line and fraction are categorical.
subj_map <- read.csv("../data/reads_mapped_three_prime_seq.csv", header = TRUE, stringsAsFactors = FALSE)
subj_map$line <- as.factor(subj_map$line)
subj_map$fraction <- as.factor(subj_map$fraction)
```
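As a quick sanity check on a table like this, the sketch below confirms that `prop_mapped` agrees with `mapped / reads` for every library and that no library has more mapped reads than total reads. It assumes the column names used in this analysis (`line`, `fraction`, `reads`, `mapped`, `prop_mapped`). The `toy_map` data frame, its line and fraction labels, and all of its values are hypothetical and are not taken from the real CSV.

```r
# Hypothetical stand-in for subj_map; labels and values are invented for illustration.
toy_map <- data.frame(
  line        = factor(c("A", "A", "B", "B")),
  fraction    = factor(c("f1", "f2", "f1", "f2")),
  reads       = c(1000, 2000, 1500, 800),
  mapped      = c(700, 1500, 900, 600),
  prop_mapped = c(0.70, 0.75, 0.60, 0.75)
)

# Recompute the proportion and compare it with the stored column,
# allowing a small tolerance for rounding in the CSV.
recomputed <- toy_map$mapped / toy_map$reads
prop_ok <- all(abs(recomputed - toy_map$prop_mapped) < 1e-6)

# Mapped reads can never exceed total reads.
counts_ok <- all(toy_map$mapped <= toy_map$reads)

prop_ok
counts_ok
```

The same two checks could be run on `subj_map` directly. The tolerance may need loosening if the proportions in the CSV were rounded to only a few decimal places.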
Summary statistics for each column:
```{r}
summary(subj_map$reads)
summary(subj_map$mapped)
summary(subj_map$prop_mapped)
```
Look at this graphically:
```{r}
subj_melt <- melt(subj_map, id.vars = c("line", "fraction"), measure.vars = c("reads", "mapped", "prop_mapped"))
```
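To make the reshaping step concrete, here is a minimal sketch of what `reshape2::melt` does to a table with this layout. The `toy_wide` data frame and its values are hypothetical. Only the column names match the ones used in this analysis.

```r
library(reshape2)

# Hypothetical two-library table in wide format (one row per library).
toy_wide <- data.frame(
  line        = factor(c("A", "B")),
  fraction    = factor(c("f1", "f1")),
  reads       = c(1000, 1500),
  mapped      = c(700, 900),
  prop_mapped = c(0.70, 0.60)
)

# melt stacks the measure columns into variable/value pairs,
# giving one row per library per measure.
toy_long <- melt(
  toy_wide,
  id.vars      = c("line", "fraction"),
  measure.vars = c("reads", "mapped", "prop_mapped")
)

# A single measure can then be pulled out by filtering on `variable`.
toy_prop <- toy_long[toy_long$variable == "prop_mapped", ]

names(toy_long)
nrow(toy_long)
```

With two libraries and three measure columns, the long table has six rows, and `toy_prop` holds the two `prop_mapped` rows. Having everything in long format makes it easy to plot any one measure by filtering on `variable`.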
```{r}
# Keep only the proportion-mapped rows and compute their mean once.
subj_prop_mapped <- subj_melt %>% filter(variable == "prop_mapped")
mean_prop <- mean(subj_prop_mapped$value)
ggplot(subj_prop_mapped, aes(y = value, x = line, fill = fraction)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Proportion of reads mapped with Subjunc", y = "Proportion mapped") +
  geom_hline(yintercept = mean_prop) +
  # Label the mean line with the computed value rather than a hard-coded one.
  annotate("text", x = 4, y = mean_prop - 0.1, vjust = -1,
           label = paste0("Mean mapping proportion = ", round(mean_prop, 3)))
```