Plotting Lysis Curves in R

Cody Martin 6/18/2021

Installation

You can install R here: https://cran.r-project.org/mirrors.html

Just choose the location closest to you

You can install R studio, a fantastic interactive coding environment for R, here: https://www.rstudio.com/products/rstudio/download/

Then you can open the README.Rmd file from the GitHub repository, and use the code below.

More R help

Need more R help? Checkout my Intro to R document: https://github.com/cody-mar10/intro_R/blob/main/README.md

Scroll to the end to have a copy & paste template to plot code.

Load packages

First load your packages. Normally, you would need to install any packages you don’t have loaded, but I am forcing the code to install packages for you if they are not already.

pkgs <- c("tidyverse", "ggprism", "ggrepel") # note tidyverse includes ggplot2 and dplyr
# Check if packages are installed
for (p in pkgs) {
  if(! p %in% installed.packages()){
    install.packages(p, dependencies = TRUE)
  }
}

# Load packages
invisible(lapply(pkgs, library, character.only=T))

# Note normally you can just do:
# library(tidyverse) for example

Read in data

Here, I read in my wide-formatted lysis curve data. Although I have simple names, you should keep your names useful for coding. Here is an example:

pRE gp60-63 = BAD!!! Don’t ever use spaces
pRE_gp60-63 = GOOD Use _ instead of spaces

Be consistent in your sample naming. Furthermore, you first column must be the time column.

file <- "data/simpledata.csv"
data <- read_csv(file)

# change first column name to "Time"
colnames(data)[1] <- "Time"

data

Time	MG1655	N4	A	B	C	D	E
0	0.223	0.239	0.237	0.231	0.228	0.235	0.231
30	0.523	0.488	0.489	0.505	0.510	0.506	0.504
35	0.620	0.558	0.441	0.505	0.550	0.503	0.510
40	0.685	0.612	0.212	0.354	0.517	0.332	0.379
45	0.772	0.683	0.110	0.222	0.428	0.176	0.244
50	0.844	0.718	0.089	0.137	0.371	0.118	0.174
55	0.888	0.789	0.067	0.111	0.354	0.098	0.136
60	0.968	0.815	0.068	0.105	0.346	0.095	0.126
70	1.390	0.859	0.051	0.091	0.331	0.108	0.111
80	1.544	0.962	0.053	0.068	0.151	0.086	0.074
93	2.010	1.224	0.048	0.070	0.125	0.064	0.084

Wide to Long formatting

To proceed, we need to take your wide-formatted data and convert it to long-formatted data. This way each row will be a singular observation. This is how computational software likes to work with data since the software can use efficient vectorized operations.

If you only have one variable to test in your lysis curve like different genetic backgrounds, you will only need separately colored lines to plot for the visual difference. This will be a “simple” plot, and we can re-format your data easily.

However, if you have multiple varibles in your lysis curve like genetic background and +/- some chemical like DNP, nalidixic acid, chloramphenicol, etc. then READ CAREFULLY!!!

You should name your sample columns with ODs with all variable levels separated by _ UNDERSCORES ONLY!!!

Example: suppose I have genotype A and B, and +/- DNP, I would name my columns like this:

A_- B_- A_+ B_+

DO NOT, and I repeat DO NOT!!! use _ anywhere else. This is because the long formatting code will separate the column name into separate columns at the _ delimiter, so the above columns would separate into columns like this:

Genotype	DNP
A	-
B	-
A	+
B	+

I’ve included an example of both simple and multivariable data sets:

data/simpledata.csv
data/complexdata.csv

Convert data to long format

The gather function from dplyr will convert data from wide to long, turning all numeric data into a single stacked column.

wideToLong <- function(data, variables=c()) {
  # check if any columns have the _ delimiter specifying multiple varibles
  var_check <- grepl("_", colnames(data))
  
  # if any columns have the _ delimiter
  if (TRUE %in% var_check && length(variables) != 0) { 
    data_long <- data %>% 
      gather(key="Sample", value = "OD", -Time) %>% 
      separate(Sample, sep="_", remove=F, into=variables)
  } else {
    data_long <- data %>% 
      gather(key="Sample", value = "OD", -Time)
  }
  
  return(data_long)
}

data_long <- wideToLong(data=data)
data_long %>% head(10)

Time	Sample	OD
0	MG1655	0.223
30	MG1655	0.523
35	MG1655	0.620
40	MG1655	0.685
45	MG1655	0.772
50	MG1655	0.844
55	MG1655	0.888
60	MG1655	0.968
70	MG1655	1.390
80	MG1655	1.544

Plotting

I’ve made extensive use of the ggplot2, ggprism, and ggrepel packages to make the best plots you will ever set your gaze on. This code is mostly fullproof. If you only have simple plots, it should be fine.

Simple plots

Here is how I plot my graphs using my customizations:

# define custom offset to move line labels away from axis
offset <- max(data_long$Time)*0.035

# ggprism has default colors to use, but I want to reorder them
cols = ggprism_data$colour_palettes$colors[c(6,1:5,7:20)]

# make ggplot object
simpleplot <- function(data_long) {
  # set minor ticks based on max OD
  # y-axis needs to look different if OD values rise above 1 due to log scale
  if (max(data_long$OD, na.rm = T) > 1) {
    max_yax = 10
    y_minor = c(rep(1:9, 3)*(10^rep(-2:0, each=9)), 10) # minor ticks
  } else {
    max_yax = 1
    y_minor = c(rep(1:9, 2)*(10^rep(c(-2, -1), each=9)), 1) # minor ticks
  }
  
  # makeplot
  g <- data_long %>% 
      ggplot(aes(x=Time, y=OD)) +
      geom_line(aes(color=Sample), size=1.25) +
      geom_point(aes(shape=Sample), color="black", size=3.5) +
      geom_text_repel(data=subset(data_long, Time == max(data_long$Time)), # labels next to lines
                      aes(label=Sample, 
                          color=Sample, 
                          x=Inf, # put label off plot
                          y=OD), # put label at same height as last data point
                      direction="y",
                      xlim=c(max(data_long$Time)+offset, Inf), # offset labels
                      min.segment.length=Inf, # won't draw lines
                      hjust=0, # left justify
                      size=5,
                      fontface="bold") +
      scale_shape_prism(palette = "complete") + # change prism bullet shape palette
      scale_color_manual(values=cols) +
      scale_y_log10(limit=c(0.01,max_yax), # put y on log10 scale
                         minor_breaks=y_minor,
                         guide=guide_prism_minor(),
                         expand=c(0,0)) + 
      scale_x_continuous(minor_breaks=seq(0,max(data_long$Time),by=10),
                         guide=guide_prism_minor(),
                         expand=c(0,0)) + 
      labs(x="Time (min)",
           y="A550") +
      theme_prism(border=T) + # theme like prism plot
      coord_cartesian(clip="off") +
      theme(aspect.ratio=1/1, 
            legend.position = "none",
            plot.margin=unit(c(1,5,1,1), "lines"))
  return(g)
}

simpleplot(data_long)

Complex plots

Let’s suppose you did a lysis curve testing more than just one variable like in data/complexdata.csv. Let’s first take a look at the data:

Complex input

file <- "data/complexdata.csv"
data <- read_csv(file)
colnames(data)[1] <- "Time"

data

Time	pRE_-N4_None	gp60-63_-N4_None	pRE_+N4_t=20	gp60-63_+N4_t=20	pRE_+N4_t=25	gp60-63_+N4_t=25	pRE_+N4_t=30	gp60-63_+N4_t=30
0	0.200	0.201	0.219	0.208	0.217	0.204	0.219	0.204
15	0.337	0.307	0.331	0.295	0.335	0.314	0.335	0.312
20	0.366	0.326	0.363	0.377	0.365	0.357	0.374	0.363
25	0.391	0.374	0.420	0.378	0.415	0.406	0.403	0.396
30	0.440	0.421	0.467	0.414	0.465	0.393	0.449	0.444
35	0.501	0.395	0.538	0.374	0.519	0.338	0.511	0.268
40	0.552	0.334	0.572	0.247	0.588	0.191	0.571	0.193
45	0.607	0.105	0.616	0.159	0.624	0.094	0.614	0.063
50	0.662	0.056	0.692	0.125	0.683	0.072	0.690	0.056
55	0.725	0.050	0.730	0.128	0.766	0.070	0.779	0.049
60	0.816	0.061	0.813	0.126	0.860	0.075	0.792	0.059

Wide to Long

If you take a look at the function that converts the data from wide to long, you will notice you can input variable names. This tells the code to split your sample columns into n columns for each variable you have using the separate function from dplyr.

wideToLong <- function(data, variables=c()) {
  # check if any columns have the _ delimiter specifying multiple varibles
  var_check <- grepl("_", colnames(data))
  
  # if any columns have the _ delimiter
  if (TRUE %in% var_check && length(variables) != 0) { 
    data_long <- data %>% 
      gather(key="Sample", value = "OD", -Time) %>% 
      separate(Sample, sep="_", remove=F, into=variables)
  } else {
    data_long <- data %>% 
      gather(key="Sample", value = "OD", -Time)
  }
  
  return(data_long)
}

In this dataset, the variables are genotype, N4 addition, and Time of addition, so I code the variable names as Genotype, N4_add, and Time_add.

# define your variables
var <- c("Genotype", "N4_add", "Time_add")
data_long <- wideToLong(data, var)
data_long %>% head(10)

Time	Sample	Genotype	N4_add	Time_add	OD
0	pRE_-N4_None	pRE	-N4	None	0.200
15	pRE_-N4_None	pRE	-N4	None	0.337
20	pRE_-N4_None	pRE	-N4	None	0.366
25	pRE_-N4_None	pRE	-N4	None	0.391
30	pRE_-N4_None	pRE	-N4	None	0.440
35	pRE_-N4_None	pRE	-N4	None	0.501
40	pRE_-N4_None	pRE	-N4	None	0.552
45	pRE_-N4_None	pRE	-N4	None	0.607
50	pRE_-N4_None	pRE	-N4	None	0.662
55	pRE_-N4_None	pRE	-N4	None	0.725

See how there are extra columns in our long-formatted data frame based on the variables we input! We can use these extra columns to better distinguish all our curves in the lysis curve.

Complex Plotting

Currently, you are limited to only 3 different variables. Since there are only 9 possible spots in the shaker bath, I don’t think it is possible to have more than 3 different total conditions. You can modify the code below if you somehow have 4 or more different experimental conditions tested.

# ggprism has default colors to use, but I want to reorder them
cols = ggprism_data$colour_palettes$colors[c(6,1:5,7:20)]

complexplot <- function(data_long, variables) {
  if (max(data_long$OD, na.rm = T) > 1) {
    max_yax = 10
    y_minor = c(rep(1:9, 3)*(10^rep(-2:0, each=9)), 10) # minor ticks
  } else {
    max_yax = 1
    y_minor = c(rep(1:9, 2)*(10^rep(c(-2, -1), each=9)), 1) # minor ticks
  }
  
  variables = rep(variables, 2)
  g <- data_long %>% 
    ggplot(aes_string(x="Time", y="OD", 
                      color=variables[1], linetype=variables[2],
                      shape=variables[3]
                      )) +
    geom_line(size=1.25) +
    geom_point(color="black", fill="black", size=3.5) +
    scale_shape_prism(palette = "complete") + # change prism bullet shape palette
    scale_color_manual(values=cols) +
    scale_y_log10(limit=c(0.01,max_yax), # put y on log10 scale
                  minor_breaks=y_minor,
                  guide=guide_prism_minor(),
                  expand=c(0,0)) + 
    scale_x_continuous(breaks=seq(0,max(data_long$Time),by=10),
                       guide=guide_prism_minor(),
                       expand=c(0,0)) + 
    labs(x="Time (min)",
         y="A550",
         color=variables[1], 
         linetype=variables[2],
         shape=variables[3]
         ) +
    theme_prism(border=T) + # theme like prism plot
    coord_cartesian(clip="off") +
    theme(aspect.ratio=1/1, 
          legend.title = element_text(),
          plot.margin=unit(c(1,5,1,1), "lines"))
  return(g)
}

complexplot(data_long, var)

Saving output

To save your plots as png images, you can just use this simple code:

save = paste0(strsplit(basename(file), ".csv")[[1]], ".png")
png(save, width=7.5, height=7.5, units="in", res=200)
## MAKE PLOT IN HERE
## IE do this
simpleplot(data_long)
dev.off()

Final example

This dataset has two different experimental variables: strain and addition of N4. Everything works exactly as described, but just to show the generality of my complexplot code:

file <- "data/complexdata2.csv"
data <- read_csv(file)
colnames(data)[1] <- "Time"
data

Time	pRE_-N4	gp60-63_-N4	gp63-T65I_-N4	gp63-T71A_-N4	pRE_+N4	gp60-63_+N4	gp63-T65I_+N4	gp63-T71A_+N4
0	0.208	0.176	0.179	0.167	0.190	0.167	0.176	0.168
15	0.300	0.267	0.260	0.251	0.299	0.262	0.268	0.250
20	0.326	0.302	0.279	0.265	0.326	0.296	0.272	0.291
25	0.380	0.321	0.224	0.305	0.383	0.337	0.178	0.318
30	0.441	0.382	0.102	0.263	0.475	0.380	0.080	0.268
35	0.485	0.390	0.038	0.181	0.505	0.368	0.040	0.195
40	0.550	0.273	0.030	0.066	0.566	0.308	0.044	0.089
45	0.571	0.111	0.032	0.036	0.574	0.233	0.032	0.052
50	0.671	0.078	0.031	0.033	0.637	0.212	0.031	0.039
55	0.702	0.081	0.027	0.036	0.711	0.216	0.035	0.036
60	0.745	0.076	0.037	0.037	0.777	0.223	0.040	0.037

var <- c("Strain", "N4_addition")
data_long <- wideToLong(data, var)
data_long %>% head(10)

Time	Sample	Strain	N4_addition	OD
0	pRE_-N4	pRE	-N4	0.208
15	pRE_-N4	pRE	-N4	0.300
20	pRE_-N4	pRE	-N4	0.326
25	pRE_-N4	pRE	-N4	0.380
30	pRE_-N4	pRE	-N4	0.441
35	pRE_-N4	pRE	-N4	0.485
40	pRE_-N4	pRE	-N4	0.550
45	pRE_-N4	pRE	-N4	0.571
50	pRE_-N4	pRE	-N4	0.671
55	pRE_-N4	pRE	-N4	0.702

complexplot(data_long, var)

pkgs <- c("tidyverse", "ggprism", "ggrepel") # note tidyverse includes ggplot2 and dplyr
# Check if packages are installed
for (p in pkgs) {
  if(! p %in% installed.packages()){
    install.packages(p, dependencies = TRUE)
  }
}

# Load packages
invisible(lapply(pkgs, library, character.only=T))

# Read in WIDE FORMATED data

### USER INPUT - CHANGE THIS LINE ###
file <- "data/simpledata.csv"
data <- read_csv(file)

### USER INPUT - CHANGE THIS LINE ###
### Input your variable names in quotes followed by ,
### like this c("var1", "var2")
### If you only have one variable like strain/genotype,
### you can leave this line UNCHANGED.
var <- c()

# Reformat data into long format
# Rename first column with time to be Time
# first column MUSTTTTT BE TIME
colnames(data)[1] <- "Time"

wideToLong <- function(data, variables=c()) {
  # check if any columns have the _ delimiter specifying multiple varibles
  var_check <- grepl("_", colnames(data))
  
  if (TRUE %in% var_check && length(variables) != 0) { # if any columns have the _ delimiter
    data_long <- data %>% 
      gather(key="Sample", value = "OD", -Time) %>% 
      separate(Sample, sep="_", remove=F, into=variables)
  } else {
    data_long <- data %>% 
      gather(key="Sample", value = "OD", -Time)
  }
  
  return(data_long)
}

data_long <- wideToLong(data=data, var)

# define custom offset to move line labels away from axis
offset <- max(data_long$Time)*0.035

# ggprism has default colors to use, but I want to reorder them
cols = ggprism_data$colour_palettes$colors[c(6,1:5,7:20)]

# make ggplot object
simpleplot <- function(data_long) {
  if (max(data_long$OD, na.rm = T) > 1) {
    max_yax = 10
    y_minor = c(rep(1:9, 3)*(10^rep(-2:0, each=9)), 10) # minor ticks
  } else {
    max_yax = 1
    y_minor = c(rep(1:9, 2)*(10^rep(c(-2, -1), each=9)), 1) # minor ticks
  }
  
  g <- data_long %>% 
      ggplot(aes(x=Time, y=OD)) +
      geom_line(aes(color=Sample), size=1.25) +
      geom_point(aes(shape=Sample), fill="black", size=3.5) +
      geom_text_repel(data=subset(data_long, Time == max(data_long$Time)), # labels next to lines
                      aes(label=Sample, 
                          color=Sample, 
                          x=Inf, # put label off plot
                          y=OD), # put label at same height as last data point
                      direction="y",
                      xlim=c(max(data_long$Time)+offset, Inf), # offset labels
                      min.segment.length=Inf, # won't draw lines
                      hjust=0, # left justify
                      size=5,
                      fontface="bold") +
      scale_shape_prism(palette = "complete") + # change prism bullet shape palette
      scale_color_manual(values=cols) +
      scale_y_log10(limit=c(0.01,max_yax), # put y on log10 scale
                         minor_breaks=y_minor,
                         guide=guide_prism_minor(),
                         expand=c(0,0)) + 
      scale_x_continuous(minor_breaks=seq(0,max(data_long$Time),by=10),
                         guide=guide_prism_minor(),
                         expand=c(0,0)) + 
      labs(x="Time (min)",
           y="A550") +
      theme_prism(border=T) + # theme like prism plot
      coord_cartesian(clip="off") +
      theme(aspect.ratio=1/1, 
            legend.position = "none",
            plot.margin=unit(c(1,5,1,1), "lines"))
  return(g)
}

complexplot <- function(data_long, variables) {
  if (max(data_long$OD, na.rm = T) > 1) {
    max_yax = 10
    y_minor = c(rep(1:9, 3)*(10^rep(-2:0, each=9)), 10) # minor ticks
  } else {
    max_yax = 1
    y_minor = c(rep(1:9, 2)*(10^rep(c(-2, -1), each=9)), 1) # minor ticks
  }
  
  variables = rep(variables, 2)
  g <- data_long %>% 
    ggplot(aes_string(x="Time", y="OD", 
                      color=variables[1], linetype=variables[2],
                      shape=variables[3]
                      )) +
    geom_line(size=1.25) +
    geom_point(color="black", fill="black", size=3.5) +
    scale_shape_prism(palette = "complete") + # change prism bullet shape palette
    scale_color_manual(values=cols) +
    scale_y_log10(limit=c(0.01,max_yax), # put y on log10 scale
                  minor_breaks=y_minor,
                  guide=guide_prism_minor(),
                  expand=c(0,0)) + 
    scale_x_continuous(breaks=seq(0,max(data_long$Time),by=10),
                       guide=guide_prism_minor(),
                       expand=c(0,0)) + 
    labs(x="Time (min)",
         y="A550",
         color=variables[1], 
         linetype=variables[2],
         shape=variables[3]
         ) +
    theme_prism(border=T) + # theme like prism plot
    coord_cartesian(clip="off") +
    theme(aspect.ratio=1/1, 
          legend.title = element_text(),
          plot.margin=unit(c(1,5,1,1), "lines"))
  return(g)
}

# save plot as .png
save = paste0(strsplit(basename(file), ".csv")[[1]], ".png")
png(save, width=7.5, height=7.5, units="in", res=200)
if (ncol(data_long) == 3) {
  simpleplot(data_long=data_long)
} else {
  complexplot(data_long=data_long, variables=var)
}
dev.off()

## or if you want to save as a svg file for making full figures in inkscape
save = paste0(strsplit(basename(file), ".csv")[[1]], ".svg")
svg(save, width=7.5, height=7.5)
if (ncol(data_long) == 3) {
  simpleplot(data_long=data_long)
} else {
  complexplot(data_long=data_long, variables=var)
}
dev.off()

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
README_files/figure-gfm		README_files/figure-gfm
data		data
README.Rmd		README.Rmd
README.md		README.md
complexdata.png		complexdata.png
plot_lysiscurves.R		plot_lysiscurves.R
simpledata.png		simpledata.png
simpledata.svg		simpledata.svg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README_files/figure-gfm

README_files/figure-gfm

data

data

README.Rmd

README.Rmd

README.md

README.md

complexdata.png

complexdata.png

plot_lysiscurves.R

plot_lysiscurves.R

simpledata.png

simpledata.png

simpledata.svg

simpledata.svg

Repository files navigation

Plotting Lysis Curves in R

Installation

More R help

Load packages

Read in data

Wide to Long formatting

Convert data to long format

Plotting

Simple plots

Complex plots

Complex input

Wide to Long

Complex Plotting

Saving output

Final example

About

Releases

Packages

Languages

cody-mar10/lysis_curves

Folders and files

Latest commit

History

Repository files navigation

Plotting Lysis Curves in R

Installation

More R help

Load packages

Read in data

Wide to Long formatting

Convert data to long format

Plotting

Simple plots

Complex plots

Complex input

Wide to Long

Complex Plotting

Saving output

Final example

About

Resources

Stars

Watchers

Forks

Languages