Merge pull request #40 from stemangiola/fix_plot_sizes
Change figure size from 40 to 70%
mblue9 committed Jul 27, 2020
2 parents 227c4a1 + 0f73c04 commit a1191a9
Showing 3 changed files with 21 additions and 21 deletions.
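
Each of the 21 changed lines makes the same substitution in a knitr chunk header. As a sketch of an alternative that is not part of this commit, the default width could be set once per vignette with `knitr::opts_chunk$set()`, assuming each vignette runs a chunk like this before its first figure:

```r
# Hypothetical setup code, not taken from the repository.
# Sets the default displayed width for every later figure chunk in the vignette.
knitr::opts_chunk$set(out.width = "70%")
```

Any chunk that sets `out.width` in its own header keeps that local value, so the global default only applies to chunks that leave the option unset.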
12 changes: 6 additions & 6 deletions vignettes/solutions.Rmd
@@ -17,7 +17,7 @@ Questions:

Suggested answers are below. You might have some different code e.g. to customise the volcano plot as you like. Feel free to comment on any of these solutions in the workshop website as described [here](https://github.com/stemangiola/bioc_2020_tidytranscriptomics/blob/master/CONTRIBUTING.md).

-```{r out.width = "40%", message=FALSE, warning=FALSE}
+```{r out.width = "70%", message=FALSE, warning=FALSE}
# load libraries
# tidyverse core packages
@@ -67,7 +67,7 @@ Answer: PC1: 47%, PC2: 25%

What do PC1 and PC2 represent?

-```{r out.width = "40%"}
+```{r out.width = "70%"}
counts_scal_PCA %>%
pivot_sample() %>%
ggplot(aes(x=PC1, y=PC2, colour=condition, shape=type)) +
@@ -113,7 +113,7 @@ Answer: FBgn0025111

3. What code can generate a heatmap of variable genes (starting from count_scaled)?

-```{r out.width = "40%"}
+```{r out.width = "70%"}
counts_scaled %>%
# filter lowly abundant
@@ -134,7 +134,7 @@ counts_scaled %>%

4. What code can you use to visualise expression of the pasilla gene (gene id: FBgn0261552)

-```{r out.width = "40%"}
+```{r out.width = "70%"}
counts_scaled %>%
# extract counts for pasilla gene
@@ -150,7 +150,7 @@ counts_scaled %>%

5. What code can generate an interactive volcano plot that has gene ids showing on hover?

-```{r out.width = "40%"}
+```{r out.width = "70%"}
p <- counts_de %>%
pivot_transcript() %>%
@@ -177,7 +177,7 @@ Tip: You can use "text" instead of "label" if you don't want the column name to

6. What code can generate a heatmap of the top 100 DE genes?

-```{r out.width = "40%"}
+```{r out.width = "70%"}
top100 <-
counts_de %>%
pivot_transcript() %>%
14 changes: 7 additions & 7 deletions vignettes/supplementary.Rmd
@@ -67,7 +67,7 @@ counts_tt %>%

We can also check how many counts we have for each sample by making a bar plot. This helps us see whether there are any major discrepancies between the samples more easily.

-```{r out.width = "40%"}
+```{r out.width = "70%"}
ggplot(counts_tt, aes(x=sample, weight=counts, fill=sample)) +
geom_bar() +
theme_bw()
@@ -77,14 +77,14 @@ As we are using ggplot2, we can also easily view by any other variable that's a

We can colour by dex treatment.

-```{r out.width = "40%"}
+```{r out.width = "70%"}
ggplot(counts_tt, aes(x=sample, weight=counts, fill=dex)) +
geom_bar() +
theme_bw()
```
We can colour by cell line.

-```{r out.width = "40%"}
+```{r out.width = "70%"}
ggplot(counts_tt, aes(x=sample, weight=counts, fill=cell)) +
geom_bar() +
theme_bw()
@@ -93,7 +93,7 @@ ggplot(counts_tt, aes(x=sample, weight=counts, fill=cell)) +

## How to examine normalised counts with boxplots

-```{r out.width = "40%"}
+```{r out.width = "70%"}
# scale counts
counts_scaled <- counts_tt %>% scale_abundance(factor_of_interest = dex)
@@ -111,7 +111,7 @@ counts_scaled %>%

## How to create MDS plot

-```{r out.width = "40%"}
+```{r out.width = "70%"}
airway %>%
tidybulk() %>%
scale_abundance(factor_of_interest=dex) %>%
@@ -126,7 +126,7 @@ airway %>%

MA plots enable us to visualise amount of expression (logCPM) versus logFC. Highly expressed genes are towards the right of the plot. We can also colour significant genes (e.g. genes with FDR < 0.05)

-```{r out.width = "40%"}
+```{r out.width = "70%"}
# perform differential testing
counts_de <-
counts_tt %>%
@@ -147,7 +147,7 @@ counts_de %>%

A more informative MA plot, integrating some of the packages in tidyverse.

-```{r out.width = "40%", warning=FALSE}
+```{r out.width = "70%", warning=FALSE}
counts_de %>%
pivot_transcript() %>%
16 changes: 8 additions & 8 deletions vignettes/tidytranscriptomics.Rmd
@@ -94,7 +94,7 @@ Measuring gene expression on a genome-wide scale has become common practice over

There are many steps involved in analysing an RNA sequencing dataset. The main steps for a differential expression analysis are shown in the figure below. Sequenced reads are aligned to a reference genome, then the number of reads mapped to each gene can be counted. This results in a table of counts, which is what we perform statistical analyses on in R. While mapping and counting are important and necessary tasks, today we will be starting from the count data and showing how differential expression analysis can be performed in a friendly way using tidybulk.

-```{r, echo=FALSE, out.width = "40%"}
+```{r, echo=FALSE, out.width = "70%"}
knitr::include_graphics("../inst/vignettes/bioc2020tidybulkpipeline-01.png")
```

@@ -202,7 +202,7 @@ After we run `scale_abundance` we should see some columns have been added at the

We can visualise the difference of abundance densities before and after scaling. As tidybulk output is compatible with tidyverse, we can simply pipe it into standard tidyverse functions such as `filter`, `pivot_longer` and `ggplot`. We can also take advantage of ggplot's `facet_wrap` to easily create multiple plots.

-```{r out.width = "40%"}
+```{r out.width = "70%"}
counts_scaled %>%
filter(!lowly_abundant) %>%
pivot_longer(cols = c("counts", "counts_scaled"), names_to = "source", values_to = "abundance") %>%
@@ -302,7 +302,7 @@ counts_scal_PCA %>% pivot_sample()

We can now plot the reduced dimensions.

-```{r out.width = "40%"}
+```{r out.width = "70%"}
# PCA plot
counts_scal_PCA %>%
pivot_sample() %>%
@@ -319,7 +319,7 @@ The samples separate by treatment on PC1 which is what we hope to see. PC2 separ

An alternative to principal component analysis for examining relationships between samples is using hierarchical clustering. Heatmaps are a nice visualisation to examine hierarchical clustering of your samples. tidybulk has a simple function we can use, `keep_variable`, to extract the most variable genes which we can then plot with tidyHeatmap.

-```{r out.width = "40%"}
+```{r out.width = "70%"}
counts_scaled %>%
# filter lowly abundant
@@ -461,7 +461,7 @@ topgenes_symbols

Volcano plots are a useful genome-wide plot for checking that the analysis looks good. Volcano plots enable us to visualise the significance of change (p-value) versus the fold change (logFC). Highly significant genes are towards the top of the plot. We can also colour significant genes (e.g. genes with false-discovery rate < 0.05)

-```{r out.width = "40%"}
+```{r out.width = "70%"}
# volcano plot, minimal
counts_de %>%
filter(!lowly_abundant) %>%
@@ -474,7 +474,7 @@ counts_de %>%

A more informative plot, integrating some of the packages in tidyverse.

-```{r out.width = "40%", warning=FALSE}
+```{r out.width = "70%", warning=FALSE}
counts_de %>%
pivot_transcript() %>%
@@ -501,7 +501,7 @@ Before following up on the differentially expressed genes with further lab work,

With stripcharts we can see if replicates tend to group together and how the expression compares to the other groups. We'll also add a box plot to show the distribution.

-```{r out.width = "40%"}
+```{r out.width = "70%"}
strip_chart <-
counts_scaled %>%
@@ -525,7 +525,7 @@ A really nice feature of using tidyverse and ggplot2 is that we can make interac

We can also specify which parameters from the `aes` we want to show up when we hover over the plot with `tooltip`.

-```{r, out.width = "40%", warning=FALSE}
+```{r, out.width = "70%", warning=FALSE}
strip_chart %>% ggplotly(tooltip = c("label", "y"))
```
