update the course website
Former-commit-id: f597bbf
wikiselev committed Oct 30, 2017
1 parent 9f48740 commit f77d10e
Showing 43 changed files with 38,765 additions and 188 deletions.
Binary file added deng/deng-reads.rds
Binary file not shown.
1 change: 0 additions & 1 deletion deng/deng-reads.rds.REMOVED.git-id

This file was deleted.

4 changes: 2 additions & 2 deletions docs/07-exprs-qc.md
Expand Up @@ -521,10 +521,10 @@ Perform exactly the same QC analysis with read counts of the same Blischak data.
## [27] shinydashboard_0.6.1 shiny_1.0.5
## [29] rrcov_1.4-3 compiler_3.4.2
## [31] backports_1.1.1 assertthat_0.2.0
## [33] Matrix_1.2-7.1 lazyeval_0.2.0
## [33] Matrix_1.2-7.1 lazyeval_0.2.1
## [35] htmltools_0.3.6 quantreg_5.34
## [37] tools_3.4.2 bindrcpp_0.2
## [39] gtable_0.2.0 glue_1.1.1
## [39] gtable_0.2.0 glue_1.2.0
## [41] GenomeInfoDbData_0.99.0 reshape2_1.4.2
## [43] dplyr_0.7.4 Rcpp_0.12.13
## [45] trimcluster_0.1-2 sgeostat_1.0-27
4 changes: 2 additions & 2 deletions docs/08-exprs-qc-reads.md
Expand Up @@ -428,10 +428,10 @@ sessionInfo()
## [27] shinydashboard_0.6.1 shiny_1.0.5
## [29] rrcov_1.4-3 compiler_3.4.2
## [31] backports_1.1.1 assertthat_0.2.0
## [33] Matrix_1.2-7.1 lazyeval_0.2.0
## [33] Matrix_1.2-7.1 lazyeval_0.2.1
## [35] htmltools_0.3.6 quantreg_5.34
## [37] tools_3.4.2 bindrcpp_0.2
## [39] gtable_0.2.0 glue_1.1.1
## [39] gtable_0.2.0 glue_1.2.0
## [41] GenomeInfoDbData_0.99.0 reshape2_1.4.2
## [43] dplyr_0.7.4 Rcpp_0.12.13
## [45] trimcluster_0.1-2 sgeostat_1.0-27
4 changes: 2 additions & 2 deletions docs/09-exprs-overview.md
Expand Up @@ -265,7 +265,7 @@ Perform the same analysis with read counts of the Blischak data. Use `tung/reads
## [7] blob_1.1.0 GenomeInfoDbData_0.99.0
## [9] vipor_0.4.5 yaml_2.1.14
## [11] RSQLite_2.0 backports_1.1.1
## [13] lattice_0.20-34 glue_1.1.1
## [13] lattice_0.20-34 glue_1.2.0
## [15] limma_3.32.10 digest_0.6.12
## [17] XVector_0.16.0 colorspace_1.3-2
## [19] cowplot_0.8.0 htmltools_0.3.6
Expand All @@ -275,7 +275,7 @@ Perform the same analysis with read counts of the Blischak data. Use `tung/reads
## [27] bookdown_0.5 zlibbioc_1.22.0
## [29] xtable_1.8-2 scales_0.5.0
## [31] Rtsne_0.13 tibble_1.3.4
## [33] lazyeval_0.2.0 magrittr_1.5
## [33] lazyeval_0.2.1 magrittr_1.5
## [35] mime_0.5 memoise_1.1.0
## [37] evaluate_0.10.1 beeswarm_0.2.3
## [39] shinydashboard_0.6.1 tools_3.4.2
4 changes: 2 additions & 2 deletions docs/10-exprs-overview-reads.md
Expand Up @@ -178,7 +178,7 @@ sessionInfo()
## [7] blob_1.1.0 GenomeInfoDbData_0.99.0
## [9] vipor_0.4.5 yaml_2.1.14
## [11] RSQLite_2.0 backports_1.1.1
## [13] lattice_0.20-34 glue_1.1.1
## [13] lattice_0.20-34 glue_1.2.0
## [15] limma_3.32.10 digest_0.6.12
## [17] XVector_0.16.0 colorspace_1.3-2
## [19] cowplot_0.8.0 htmltools_0.3.6
Expand All @@ -188,7 +188,7 @@ sessionInfo()
## [27] bookdown_0.5 zlibbioc_1.22.0
## [29] xtable_1.8-2 scales_0.5.0
## [31] Rtsne_0.13 tibble_1.3.4
## [33] lazyeval_0.2.0 magrittr_1.5
## [33] lazyeval_0.2.1 magrittr_1.5
## [35] mime_0.5 memoise_1.1.0
## [37] evaluate_0.10.1 beeswarm_0.2.3
## [39] shinydashboard_0.6.1 tools_3.4.2
4 changes: 2 additions & 2 deletions docs/11-confounders.md
Expand Up @@ -158,7 +158,7 @@ Perform the same analysis with read counts of the Blischak data. Use `tung/reads
## [7] blob_1.1.0 GenomeInfoDbData_0.99.0
## [9] vipor_0.4.5 yaml_2.1.14
## [11] RSQLite_2.0 backports_1.1.1
## [13] lattice_0.20-34 glue_1.1.1
## [13] lattice_0.20-34 glue_1.2.0
## [15] limma_3.32.10 digest_0.6.12
## [17] XVector_0.16.0 colorspace_1.3-2
## [19] cowplot_0.8.0 htmltools_0.3.6
Expand All @@ -167,7 +167,7 @@ Perform the same analysis with read counts of the Blischak data. Use `tung/reads
## [25] pkgconfig_2.0.1 biomaRt_2.32.1
## [27] bookdown_0.5 zlibbioc_1.22.0
## [29] xtable_1.8-2 scales_0.5.0
## [31] tibble_1.3.4 lazyeval_0.2.0
## [31] tibble_1.3.4 lazyeval_0.2.1
## [33] magrittr_1.5 mime_0.5
## [35] memoise_1.1.0 evaluate_0.10.1
## [37] beeswarm_0.2.3 shinydashboard_0.6.1
4 changes: 2 additions & 2 deletions docs/12-confounders-reads.md
Expand Up @@ -70,7 +70,7 @@ knit: bookdown::preview_chapter
## [7] blob_1.1.0 GenomeInfoDbData_0.99.0
## [9] vipor_0.4.5 yaml_2.1.14
## [11] RSQLite_2.0 backports_1.1.1
## [13] lattice_0.20-34 glue_1.1.1
## [13] lattice_0.20-34 glue_1.2.0
## [15] limma_3.32.10 digest_0.6.12
## [17] XVector_0.16.0 colorspace_1.3-2
## [19] cowplot_0.8.0 htmltools_0.3.6
Expand All @@ -79,7 +79,7 @@ knit: bookdown::preview_chapter
## [25] pkgconfig_2.0.1 biomaRt_2.32.1
## [27] bookdown_0.5 zlibbioc_1.22.0
## [29] xtable_1.8-2 scales_0.5.0
## [31] tibble_1.3.4 lazyeval_0.2.0
## [31] tibble_1.3.4 lazyeval_0.2.1
## [33] magrittr_1.5 mime_0.5
## [35] memoise_1.1.0 evaluate_0.10.1
## [37] beeswarm_0.2.3 shinydashboard_0.6.1
4 changes: 2 additions & 2 deletions docs/13-exprs-norm.md
Expand Up @@ -664,7 +664,7 @@ Perform the same analysis with read counts of the `tung` data. Use `tung/reads.r
## [5] tools_3.4.2 backports_1.1.1
## [7] DT_0.2 R6_2.2.2
## [9] hypergeo_1.2-13 vipor_0.4.5
## [11] DBI_0.7 lazyeval_0.2.0
## [11] DBI_0.7 lazyeval_0.2.1
## [13] colorspace_1.3-2 gridExtra_2.3
## [15] moments_0.14 bit_1.1-12
## [17] compiler_3.4.2 orthopolynom_1.0-5
Expand All @@ -691,7 +691,7 @@ Perform the same analysis with read counts of the `tung` data. Use `tung/reads.r
## [59] splines_3.4.2 locfit_1.5-9.1
## [61] igraph_1.1.2 rjson_0.2.15
## [63] reshape2_1.4.2 biomaRt_2.32.1
## [65] XML_3.98-1.9 glue_1.1.1
## [65] XML_3.98-1.9 glue_1.2.0
## [67] evaluate_0.10.1 data.table_1.10.4-3
## [69] deSolve_1.20 httpuv_1.3.5
## [71] gtable_0.2.0 assertthat_0.2.0
4 changes: 2 additions & 2 deletions docs/14-exprs-norm-reads.md
Expand Up @@ -211,7 +211,7 @@ output: html_document
## [5] tools_3.4.2 backports_1.1.1
## [7] DT_0.2 R6_2.2.2
## [9] hypergeo_1.2-13 vipor_0.4.5
## [11] DBI_0.7 lazyeval_0.2.0
## [11] DBI_0.7 lazyeval_0.2.1
## [13] colorspace_1.3-2 gridExtra_2.3
## [15] moments_0.14 bit_1.1-12
## [17] compiler_3.4.2 orthopolynom_1.0-5
Expand All @@ -238,7 +238,7 @@ output: html_document
## [59] splines_3.4.2 locfit_1.5-9.1
## [61] igraph_1.1.2 rjson_0.2.15
## [63] reshape2_1.4.2 biomaRt_2.32.1
## [65] XML_3.98-1.9 glue_1.1.1
## [65] XML_3.98-1.9 glue_1.2.0
## [67] evaluate_0.10.1 data.table_1.10.4-3
## [69] deSolve_1.20 httpuv_1.3.5
## [71] gtable_0.2.0 assertthat_0.2.0
4 changes: 2 additions & 2 deletions docs/15-remove-conf.md
Expand Up @@ -558,10 +558,10 @@ Perform the same analysis with read counts of the `tung` data. Use `tung/reads.r
## [19] shinydashboard_0.6.1 shiny_1.0.5
## [21] compiler_3.4.2 backports_1.1.1
## [23] assertthat_0.2.0 Matrix_1.2-7.1
## [25] lazyeval_0.2.0 htmltools_0.3.6
## [25] lazyeval_0.2.1 htmltools_0.3.6
## [27] tools_3.4.2 igraph_1.1.2
## [29] bindrcpp_0.2 gtable_0.2.0
## [31] glue_1.1.1 GenomeInfoDbData_0.99.0
## [31] glue_1.2.0 GenomeInfoDbData_0.99.0
## [33] dplyr_0.7.4 Rcpp_0.12.13
## [35] rtracklayer_1.36.6 stringr_1.2.0
## [37] mime_0.5 hypergeo_1.2-13
4 changes: 2 additions & 2 deletions docs/16-remove-conf-reads.md
Expand Up @@ -415,10 +415,10 @@ ggplot(dod, aes(Normalisation, Individual, fill=kBET)) +
## [19] shinydashboard_0.6.1 shiny_1.0.5
## [21] compiler_3.4.2 backports_1.1.1
## [23] assertthat_0.2.0 Matrix_1.2-7.1
## [25] lazyeval_0.2.0 htmltools_0.3.6
## [25] lazyeval_0.2.1 htmltools_0.3.6
## [27] tools_3.4.2 igraph_1.1.2
## [29] bindrcpp_0.2 gtable_0.2.0
## [31] glue_1.1.1 GenomeInfoDbData_0.99.0
## [31] glue_1.2.0 GenomeInfoDbData_0.99.0
## [33] dplyr_0.7.4 Rcpp_0.12.13
## [35] rtracklayer_1.36.6 stringr_1.2.0
## [37] mime_0.5 hypergeo_1.2-13
11 changes: 7 additions & 4 deletions docs/18-clustering.md
Expand Up @@ -67,6 +67,7 @@ plotPCA(deng, colour_by = "cell_type2")


\begin{center}\includegraphics{18-clustering_files/figure-latex/unnamed-chunk-5-1} \end{center}
As you can see, the early cell types separate quite well, but the three blastocyst timepoints are more difficult to distinguish.

### SC3

Expand Down Expand Up @@ -196,14 +197,16 @@ adjustedRandIndex(colData(deng)$cell_type2, colData(deng)$sc3_10_clusters)
## [1] 0.7705208
```
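
The adjusted Rand index reported above can be reproduced by hand from the contingency table of the two labelings. The following is a minimal base-R sketch, not the implementation the chapter calls; the function name `adjusted_rand_index` and the toy label vectors are assumptions made for illustration.

```r
# Minimal sketch of the adjusted Rand index (ARI) in base R.
# `adjusted_rand_index` is a hypothetical helper, not a packaged function.
adjusted_rand_index <- function(x, y) {
  stopifnot(length(x) == length(y))
  tab <- table(x, y)                      # contingency table of the two labelings
  pairs <- function(n) n * (n - 1) / 2    # number of unordered pairs among n items
  index <- sum(pairs(tab))                # pairs placed together in both labelings
  row_pairs <- sum(pairs(rowSums(tab)))   # pairs placed together in x
  col_pairs <- sum(pairs(colSums(tab)))   # pairs placed together in y
  expected <- row_pairs * col_pairs / pairs(length(x))
  max_index <- (row_pairs + col_pairs) / 2
  (index - expected) / (max_index - expected)
}

# Toy labels (illustrative only): identical partitions under different names
ari_same <- adjusted_rand_index(c(1, 1, 2, 2, 3, 3), c("a", "a", "b", "b", "c", "c"))
# Two labelings that cross each other completely
ari_crossed <- adjusted_rand_index(c(1, 1, 2, 2), c(1, 2, 1, 2))
```

Identical partitions give an index of 1 regardless of how the clusters are named, while the crossed labelings give a negative value, meaning agreement below what chance would produce. The sketch returns `NaN` when both labelings put every cell in a single cluster.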

Note, that one can also run `SC3` in an interactive `Shiny` session:
__Note__ `SC3` can also be run in an interactive `Shiny` session:

```r
sc3_interactive(deng)
```

This command will open `SC3` in a web browser.

__Note__ Due to direct calculation of distances, `SC3` becomes very slow when the number of cells is $>5000$. For large datasets containing up to $10^5$ cells we recommend using `Seurat` (see chapter \@ref(seurat-chapter)).
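
To give a rough sense of why pairwise distances become a bottleneck, the arithmetic below counts cell pairs and estimates the memory a dense distance matrix would need. This is a back-of-the-envelope sketch: it assumes a full $n \times n$ matrix of 8-byte doubles, which is an assumption about storage and not a statement about how `SC3` stores its distances internally.

```r
# Back-of-the-envelope cost of all pairwise cell-cell distances.
# Assumption: one dense n x n matrix of 8-byte doubles.
n_cells <- c(5000, 1e5)
n_pairs <- n_cells * (n_cells - 1) / 2   # unordered pairs of cells
dense_gb <- n_cells^2 * 8 / 1e9          # matrix size in gigabytes (10^9 bytes)

data.frame(n_cells = n_cells, n_pairs = n_pairs, dense_gb = dense_gb)
```

Under these assumptions, going from 5000 to $10^5$ cells multiplies both the number of pairs and the memory footprint by roughly 400.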

* __Exercise 1__: Run `SC3` for $k$ from 8 to 12 and explore different clustering solutions in your web browser.

* __Exercise 2__: Which clusters are the most stable when $k$ is changed from 8 to 12? (Look at the "Stability" tab)
Expand Down Expand Up @@ -258,7 +261,7 @@ __Our solution__:
\caption{Clustering solutions of pcaReduce method for $k=2$.}(\#fig:clust-pca-reduce2)
\end{figure}

__Exercise 6__: Compare the results between `SC3` and the original publication cell types for $k=10$.
__Exercise 6__: Compare the results between `pcaReduce` and the original publication cell types for $k=10$.

__Our solution__:

Expand Down Expand Up @@ -490,10 +493,10 @@ __Exercise 11__: Is using the singleton cluster criteria for finding __k__ a goo
## [17] shinydashboard_0.6.1 shiny_1.0.5
## [19] rrcov_1.4-3 compiler_3.4.2
## [21] backports_1.1.1 assertthat_0.2.0
## [23] Matrix_1.2-7.1 lazyeval_0.2.0
## [23] Matrix_1.2-7.1 lazyeval_0.2.1
## [25] limma_3.32.10 htmltools_0.3.6
## [27] tools_3.4.2 bindrcpp_0.2
## [29] gtable_0.2.0 glue_1.1.1
## [29] gtable_0.2.0 glue_1.2.0
## [31] GenomeInfoDbData_0.99.0 reshape2_1.4.2
## [33] dplyr_0.7.4 doRNG_1.6.6
## [35] Rcpp_0.12.13 gdata_2.18.0
2 changes: 1 addition & 1 deletion docs/19-dropouts.md
Expand Up @@ -332,7 +332,7 @@ Plot the expression of the features for each of the other methods. Which appear
## [27] gdata_2.18.0 magrittr_1.5
## [29] backports_1.1.1 gplots_3.0.1
## [31] htmltools_0.3.6 MASS_7.3-45
## [33] bbmle_1.0.19 KernSmooth_2.23-15
## [33] bbmle_1.0.20 KernSmooth_2.23-15
## [35] stringi_1.1.5 RCurl_1.95-4.8
## [37] hypergeo_1.2-13
```
6 changes: 3 additions & 3 deletions docs/20-pseudotime.md
Expand Up @@ -549,7 +549,7 @@ __Exercise 7__: Repeat the exercise using a subset of the genes, e.g. the set of
## loaded via a namespace (and not attached):
## [1] backports_1.1.1 Hmisc_4.0-3
## [3] RcppEigen_0.3.3.3.0 plyr_1.8.4
## [5] igraph_1.1.2 lazyeval_0.2.0
## [5] igraph_1.1.2 lazyeval_0.2.1
## [7] sp_1.2-5 densityClust_0.3
## [9] fastICA_1.2-1 digest_0.6.12
## [11] htmltools_0.3.6 gdata_2.18.0
Expand All @@ -562,7 +562,7 @@ __Exercise 7__: Repeat the exercise using a subset of the genes, e.g. the set of
## [25] lme4_1.1-14 spatstat_1.53-2
## [27] spatstat.data_1.1-1 bindr_0.1
## [29] survival_2.40-1 zoo_1.8-0
## [31] glue_1.1.1 polyclip_1.6-1
## [31] glue_1.2.0 polyclip_1.6-1
## [33] gtable_0.2.0 zlibbioc_1.22.0
## [35] XVector_0.16.0 MatrixModels_0.4-1
## [37] car_2.1-5 DEoptimR_1.0-8
Expand Down Expand Up @@ -605,6 +605,6 @@ __Exercise 7__: Repeat the exercise using a subset of the genes, e.g. the set of
## [111] grid_3.4.2 rpart_4.1-10
## [113] class_7.3-14 minqa_1.2.4
## [115] rmarkdown_1.6 Rtsne_0.13
## [117] TTR_0.23-2 bbmle_1.0.19
## [117] TTR_0.23-2 bbmle_1.0.20
## [119] shiny_1.0.5 base64enc_0.1-3
```
8 changes: 4 additions & 4 deletions docs/21-imputation.md
Expand Up @@ -22,9 +22,9 @@ As discussed previously, one of the main challenges when analyzing scRNA-seq dat
* The gene was expressed, but for some reason the transcripts were lost somewhere prior to sequencing
* The gene was expressed and transcripts were captured and turned into cDNA, but the sequencing depth was not sufficient to produce any reads.

Thus, dropouts could be the result of experimental shortcomings, and if this is the case then we would like to provide computational corrections. One possible solution is to impute the dropouts in the expression matrix. To be able to impute gene expression values, one must have an underlying model. However, since we do not know which dropout events are technical artefacts and which correspond to the transcript being truly absent, imputation is a difficult challenges.
Thus, dropouts could be the result of experimental shortcomings, and if this is the case then we would like to provide computational corrections. One possible solution is to impute the dropouts in the expression matrix. To be able to impute gene expression values, one must have an underlying model. However, since we do not know which dropout events are technical artefacts and which correspond to the transcript being truly absent, imputation is a difficult challenge.

To the best of our knowledge, there are currently two different imputation methods available: MAGIC [@Van_Dijk2017-bh] and scImpute [@Li2017-tz]. [MAGIC](https://github.com/pkathail/magic) is only available for Python or Matlab, but we will run it from within R.
To the best of our knowledge, there are currently two different imputation methods available: [MAGIC](https://github.com/pkathail/magic) [@Van_Dijk2017-bh] and [scImpute](https://github.com/Vivianstats/scImpute) [@Li2017-tz]. MAGIC is only available for Python or Matlab, but we will run it from within R.

### scImpute

## [5] tools_3.4.2 backports_1.1.1
## [7] doRNG_1.6.6 R6_2.2.2
## [9] vipor_0.4.5 KernSmooth_2.23-15
## [11] DBI_0.7 lazyeval_0.2.0
## [11] DBI_0.7 lazyeval_0.2.1
## [13] colorspace_1.3-2 gridExtra_2.3
## [15] bit_1.1-12 compiler_3.4.2
## [17] pkgmaker_0.22 labeling_0.3
Expand Down Expand Up @@ -271,7 +271,7 @@ __Exercise:__ What is the difference between `scImpute` and `MAGIC` based on the
## [61] splines_3.4.2 locfit_1.5-9.1
## [63] rjson_0.2.15 rngtools_1.2.4
## [65] reshape2_1.4.2 codetools_0.2-15
## [67] biomaRt_2.32.1 glue_1.1.1
## [67] biomaRt_2.32.1 glue_1.2.0
## [69] XML_3.98-1.9 evaluate_0.10.1
## [71] data.table_1.10.4-3 httpuv_1.3.5
## [73] gtable_0.2.0 assertthat_0.2.0
6 changes: 3 additions & 3 deletions docs/22-de-intro.md
Expand Up @@ -8,7 +8,7 @@ knit: bookdown::preview_chapter

### Bulk RNA-seq

One of the most common types of analyses when analyzing bulk RNA-seq
One of the most common types of analyses when working with bulk RNA-seq
data is to identify differentially expressed genes. By comparing the
genes that change between two conditions, e.g. mutant and wild-type or
stimulated and unstimulated, it is possible to characterize the
Expand All @@ -21,7 +21,7 @@ have been developed for bulk RNA-seq. Moreover, there are also
extensive
[datasets](http://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-9-r95)
available where the RNA-seq data has been validated using
RT-qPCR. These data can be used to benchmark DE finding algorithms.
RT-qPCR. These data can be used to benchmark DE finding algorithms and the available evidence suggests that the algorithms are performing quite well.

### Single cell RNA-seq

Expand All @@ -37,7 +37,7 @@ Unlike bulk RNA-seq, we generally have a large number of samples (i.e. cells) fo

There are two main approaches to comparing distributions. Firstly, we can use existing statistical models/distributions and fit the same type of model to the expression in each group, then test for differences in the parameters for each model, or test whether the model fits better if a particular parameter is allowed to be different according to group. For instance, in Chapter \@ref(dealing-with-confounders) we used edgeR to test whether allowing mean expression to be different in different batches significantly improved the fit of a negative binomial model of the data.
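
The model-comparison idea can be sketched for a single gene with a likelihood-ratio test between two nested negative binomial fits. This is an illustration only: it uses `MASS::glm.nb` as a stand-in and not the edgeR workflow from the earlier chapter, and the simulated counts, batch labels and parameter values are all assumptions.

```r
library(MASS)  # glm.nb(); MASS is a recommended package shipped with R

set.seed(42)
n_per_batch <- 100
# Simulated counts for one hypothetical gene in two batches (illustrative values)
batch <- factor(rep(c("batch1", "batch2"), each = n_per_batch))
counts <- c(rnbinom(n_per_batch, mu = 5, size = 2),
            rnbinom(n_per_batch, mu = 15, size = 2))

fit_null <- glm.nb(counts ~ 1)      # one mean shared by both batches
fit_alt <- glm.nb(counts ~ batch)   # mean allowed to differ between batches

# Likelihood-ratio statistic, compared to a chi-squared distribution with
# 1 degree of freedom (the one extra mean parameter in the larger model)
lr_stat <- as.numeric(2 * (logLik(fit_alt) - logLik(fit_null)))
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
```

A small `p_value` indicates that letting the mean differ by batch improves the fit by more than would be expected by chance.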

Alternatively, we can use a non-parametric test which does not assume expression values follow any particular distribution, e.g. the [Kolmogorov-Smirnov test (KS-test)](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test). Non-parametric tests generally convert observed expression values to ranks and test whether the distribution of ranks for one group is significantly different from the distribution of ranks for the other group. However, some non-parametric methods fail in the presence of a large number of tied values, such as the case for dropouts (zeros) in single-cell RNA-seq expression data. Moreover, if the conditions for a parametric test hold, then it will typically be more powerful than a non-parametric test.
Alternatively, we can use a non-parametric test which does not assume that expression values follow any particular distribution, e.g. the [Kolmogorov-Smirnov test (KS-test)](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test). Non-parametric tests generally convert observed expression values to ranks and test whether the distribution of ranks for one group is significantly different from the distribution of ranks for the other group. However, some non-parametric methods fail in the presence of a large number of tied values, such as the case for dropouts (zeros) in single-cell RNA-seq expression data. Moreover, if the conditions for a parametric test hold, then it will typically be more powerful than a non-parametric test.
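
As a small illustration of the non-parametric route, the sketch below applies the KS-test and the rank-based Wilcoxon test to simulated counts that contain many tied zeros. The helper `simulate_group`, the dropout rate and the means are hypothetical values chosen for the example, not taken from any dataset in the course.

```r
set.seed(1)
n <- 200

# Hypothetical helper: negative binomial counts with extra zeros mimicking dropouts
simulate_group <- function(n, mu, dropout) {
  counts <- rnbinom(n, mu = mu, size = 2)
  counts[runif(n) < dropout] <- 0
  counts
}

group_a <- simulate_group(n, mu = 5, dropout = 0.3)
group_b <- simulate_group(n, mu = 10, dropout = 0.3)

# Share of zeros in each group, i.e. how heavily tied the data are
zero_fraction <- c(a = mean(group_a == 0), b = mean(group_b == 0))

# ks.test() may warn about ties on data like these; the warning is
# suppressed here only to keep the example output clean
ks_res <- suppressWarnings(ks.test(group_a, group_b))
wilcox_res <- wilcox.test(group_a, group_b)

c(ks = ks_res$p.value, wilcoxon = wilcox_res$p.value)
```

Inspecting `zero_fraction` alongside the two p-values is a quick way to see how much of the data is tied before trusting either test.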

### Models of single-cell RNASeq data
