Commit

Update documentation
cangermueller committed Apr 14, 2017
1 parent 093958d commit 1ce800c
Showing 1 changed file with 12 additions and 14 deletions.
26 changes: 12 additions & 14 deletions examples/notebooks/stats/index.ipynb
@@ -20,17 +20,6 @@
"This tutorial describes how to predict inter-cell statistics such as the mean methylation rate or variance across cells."
]
},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"## Table of Contents\n",
-"* [Initialization](#Initialization)\n",
-"* [Creating DeepCpG data files](#Creating-DeepCpG-data-files)\n",
-"* [Model training](#Model-training)\n",
-"* [Model evaluation](#Model-evaluation)"
-]
-},
{
"cell_type": "markdown",
"metadata": {
@@ -43,7 +32,10 @@
},
{
"cell_type": "markdown",
-"metadata": {},
+"metadata": {
+"deletable": true,
+"editable": true
+},
"source": [
"We first initialize some variables that will be used throughout the tutorial. `test_mode=1` should be used for testing purposes, which speeds up computations by only using a subset of the data. For real applications, `test_mode=0` should be used."
]
@@ -85,7 +77,10 @@
},
{
"cell_type": "markdown",
-"metadata": {},
+"metadata": {
+"deletable": true,
+"editable": true
+},
"source": [
"`dcpg_data.py` provides the arguments `--cpg_stats` and `--win_stats` to compute statistics across cells for single CpG sites or in windows of length `--win_stats_wlen` centred on CpG sites, respectively. Supported statistics are described in the [documentation](http://deepcpg.readthedocs.io/en/latest/data.html#predicting-statistics) and include the mean methylation rate (`mean`), variance (`var`), and whether a CpG site is differentially methylated (`diff`). With `--cpg_stats_cov`, per-CpG statistics will be computed only for CpG sites that are covered by at least the specified number of cells. If this number is too low, estimated statistics might be unreliable in regions with low coverage. We will compute the mean methylation rate and variance across cells in windows of different lengths, and whether CpG sites with at least three observations are differentially methylated."
]
@@ -168,7 +163,10 @@
},
{
"cell_type": "markdown",
-"metadata": {},
+"metadata": {
+"deletable": true,
+"editable": true
+},
"source": [
"We will train a DNA model to predict mean methylation rates, cell-to-cell variance, and differentially methylated CpG sites from the DNA sequence alone. However, you could train a CpG model or Joint model to also use neighboring CpG sites for making predictions. To predict all per-CpG and window-based statistics computed by `dcpg_data.py` instead of methylation states, we run `dcpg_train.py` with `--output_names 'cpg_stats/.*' 'win_stats/.*'`. You could use `--output_names '.*'` to predict both methylation states and statistics."
]
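For readers of this diff, the statistics named in the notebook text (`mean` and `var`, computed per CpG site with a minimum coverage, or in windows centred on CpG sites) can be illustrated with a minimal NumPy sketch. This is not DeepCpG's implementation: the function names `cpg_stats` and `win_stats`, the NaN encoding of unobserved sites, and the choice to average within each cell before taking statistics across cells are all assumptions made for illustration.

```python
import numpy as np


def cpg_stats(states, min_cov=3):
    """Per-site mean and variance across cells (hypothetical helper).

    states: array of shape (cells, sites) with 1/0 methylation states and
    NaN where a site is unobserved in a cell (assumed encoding).
    Sites observed in fewer than min_cov cells get NaN statistics.
    """
    states = np.asarray(states, dtype=float)
    cov = np.sum(~np.isnan(states), axis=0)  # number of cells covering each site
    mean = np.full(states.shape[1], np.nan)
    var = np.full(states.shape[1], np.nan)
    ok = cov >= min_cov
    mean[ok] = np.nanmean(states[:, ok], axis=0)
    var[ok] = np.nanvar(states[:, ok], axis=0)
    return mean, var


def win_stats(states, pos, wlen):
    """Mean and variance across cells in windows centred on each site
    (hypothetical helper).

    For each site, every cell's methylation rate is averaged over the sites
    within wlen // 2 bases, then the mean and variance of these per-cell
    rates are taken across cells that have at least one observation.
    """
    states = np.asarray(states, dtype=float)
    pos = np.asarray(pos)
    half = wlen // 2
    mean = np.full(len(pos), np.nan)
    var = np.full(len(pos), np.nan)
    for i, p in enumerate(pos):
        win = states[:, np.abs(pos - p) <= half]
        observed = ~np.isnan(win).all(axis=1)  # cells with data in this window
        if observed.any():
            rates = np.nanmean(win[observed], axis=1)  # per-cell window rate
            mean[i] = rates.mean()
            var[i] = rates.var()
    return mean, var


# Toy input: 4 cells x 3 CpG sites, NaN marks an unobserved site.
states = [[1, 0, np.nan],
          [1, np.nan, 0],
          [0, 0, 1],
          [1, 1, np.nan]]
pos = [100, 150, 1000]
site_mean, site_var = cpg_stats(states, min_cov=3)
win_mean, win_var = win_stats(states, pos, wlen=200)
```

In this toy input the third site is observed in only two cells, so with `min_cov=3` its per-site statistics are left as NaN, which mirrors the role the notebook text gives to `--cpg_stats_cov`.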

0 comments on commit 1ce800c