minor wording/renaming tweaks in tutorial (#252)
* minor wording/renaming tweaks

* Update dandelion_singularity.ipynb

---------

Co-authored-by: zktuong <kt16@sanger.ac.uk>
ktpolanski and zktuong committed Feb 15, 2023
1 parent 878c1a0 commit 32868b5
Showing 6 changed files with 129 additions and 135 deletions.
2 changes: 1 addition & 1 deletion container/dandelion_singularity.ipynb
@@ -902,7 +902,7 @@
"source": [
"# Post-processing - V(D)J analysis\n",
"\n",
"We will now switch to the pro-processing tutorial. Let's follow the rest of the [original tutorial](https://sc-dandelion.readthedocs.io/en/latest/notebooks/Q2-analysis.html)."
"We will now switch to the post-processing tutorial. Let's follow the rest of the [original tutorial](https://sc-dandelion.readthedocs.io/en/latest/notebooks/Q2-analysis.html)."
],
"metadata": {
"id": "WNAZ6p9xXJlz"
17 changes: 5 additions & 12 deletions docs/notebooks/Q1-singularity-preprocessing.ipynb
@@ -19,11 +19,11 @@
"\n",
"## Setup and running\n",
"\n",
"Once you have [Singularity installed](https://sylabs.io/guides/3.0/user-guide/installation.html), you can download the Dandelion container. This command will create `sc-dandelion_latest.sif`, note its location.\n",
"Once you have [Singularity installed](https://sylabs.io/guides/3.0/user-guide/installation.html), you can download the [Dandelion container](https://cloud.sylabs.io/library/kt16/default/sc-dandelion). This command will create `sc-dandelion_latest.sif`, note its location.\n",
"\n",
" singularity pull library://kt16/default/sc-dandelion:latest\n",
"\n",
"In order to prepare your BCR data for ingestion, create a folder for each sample you'd like to analyse, name it with your sample ID, and store the Cell Ranger `all_contig_annotations.csv` and `all_contig.fasta` output files inside.\n",
"In order to prepare your VDJ data for ingestion, create a folder for each sample you'd like to analyse, name it with your sample ID, and store the Cell Ranger `all_contig_annotations.csv` and `all_contig.fasta` output files inside.\n",
"\n",
" 5841STDY7998693\n",
" ├── all_contig_annotations.csv\n",
@@ -45,7 +45,7 @@
"\n",
"## Recommended parameterisation\n",
"\n",
"If in possession of gene expression data that the BCR data will be integrated with, the following parameterisation is likely to yield the best results:\n",
"If in possession of gene expression data that the VDJ data will be integrated with, the following parameterisation is likely to yield the best results:\n",
"\n",
"```bash\n",
"singularity run -B $PWD /path/to/sc-dandelion_latest.sif dandelion-preprocess \\\n",
@@ -59,7 +59,7 @@
"By default, this workflow will analyse all provided IG samples jointly with TIgGER to maximise inference power, and in the event of multiple input folders will prepend the sample IDs to the cell barcodes to avoid erroneously merging barcodes overlapping between samples at this stage. TIgGER should be ran on a per-individual level. If running the workflow on multiple individuals' worth of data at once, or wanting to flag the cell barcodes in a non-default manner, information can be provided to the script in the form of a CSV file passed through the `--meta` argument:\n",
"\n",
"1. The first row of the CSV needs to be a header identifying the information in the columns, and the first column needs to contain sample IDs.\n",
"2. Barcode flagging can be controlled by an optional `prefix`/`suffix` column. The pipeline will then add the specified prefixes/suffixes to the barcodes of the samples. This may be desirable, as corresponding gene expression samples are likely to have different IDs, and providing the matched ID will pre-format the BCR output to match the GEX nomenclature.\n",
"2. Barcode flagging can be controlled by an optional `prefix`/`suffix` column. The pipeline will then add the specified prefixes/suffixes to the barcodes of the samples. This may be desirable, as corresponding gene expression samples are likely to have different IDs, and providing the matched ID will pre-format the VDJ output to match the GEX nomenclature.\n",
"3. Individual information for TIgGER can be specified in an optional `individual` column. If specified, TIgGER will be ran for each unique value present in the column, pooling the corresponding samples.\n",
"\n",
"It's possible to just pass a prefix/suffix or individual information. An excerpt of a sample CSV file that could be used on input:\n",
@@ -85,15 +85,8 @@
"\n",
"The plots showing the impact of TIgGER are in `<tigger>/<tigger>_reassign_alleles.pdf`, for each TIgGER folder (one per unique individual if using `--meta`, `tigger` otherwise). The impact of C gene reannotation is shown in `dandelion/data/assign_isotype.pdf` for each sample.\n",
"\n",
"If you're interested in more detail about the pre-processing this offers, or wish to use the workflow in a more advanced manner (e.g. by using your own databases), proceed to the pre-processing section of the tutorial."
"If you're interested in more detail about the pre-processing this offers, or wish to use the workflow in a more advanced manner (e.g. by using your own databases), proceed to the pre-processing section of the advanced guide."
]
},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": []
-}
],
"metadata": {
38 changes: 19 additions & 19 deletions docs/notebooks/Q2-object-prep.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "handmade-pulse",
"id": "provincial-toronto",
"metadata": {},
"source": [
"# Loading data for analysis"
@@ -11,7 +11,7 @@
{
"cell_type": "code",
"execution_count": 1,
"id": "relevant-exhaust",
"id": "vocational-cambridge",
"metadata": {},
"outputs": [],
"source": [
@@ -25,7 +25,7 @@
},
{
"cell_type": "markdown",
"id": "stainless-helena",
"id": "regular-consumer",
"metadata": {},
"source": [
"This notebook shows how to prepare both GEX and VDJ data for Dandelion analysis. You don't have to run it if you don't want to, the resulting objects are downloaded at the start of the subsequent notebook. However, you're likely to find the provided syntax for loading and concatenating multiple samples with Dandelion useful.\n",
@@ -37,11 +37,11 @@
"<details>\n",
" <summary>Download commands</summary>\n",
"\n",
" mkdir dandelion_tutorial\n",
" mkdir -p dandelion_tutorial/vdj_nextgem_hs_pbmc3\n",
" mkdir -p dandelion_tutorial/vdj_v1_hs_pbmc3\n",
" mkdir -p dandelion_tutorial/sc5p_v2_hs_PBMC_10k\n",
" mkdir -p dandelion_tutorial/sc5p_v2_hs_PBMC_1k\n",
" mkdir dandelion_tutorial;\n",
" mkdir -p dandelion_tutorial/vdj_nextgem_hs_pbmc3;\n",
" mkdir -p dandelion_tutorial/vdj_v1_hs_pbmc3;\n",
" mkdir -p dandelion_tutorial/sc5p_v2_hs_PBMC_10k;\n",
" mkdir -p dandelion_tutorial/sc5p_v2_hs_PBMC_1k;\n",
" cd dandelion_tutorial/vdj_v1_hs_pbmc3;\n",
" wget -O filtered_feature_bc_matrix.h5 https://cf.10xgenomics.com/samples/cell-vdj/3.1.0/vdj_v1_hs_pbmc3/vdj_v1_hs_pbmc3_filtered_feature_bc_matrix.h5;\n",
" wget -O filtered_contig_annotations.csv https://cf.10xgenomics.com/samples/cell-vdj/3.1.0/vdj_v1_hs_pbmc3/vdj_v1_hs_pbmc3_b_filtered_contig_annotations.csv;\n",
@@ -85,7 +85,7 @@
{
"cell_type": "code",
"execution_count": 2,
"id": "skilled-sunday",
"id": "downtown-outreach",
"metadata": {},
"outputs": [],
"source": [
@@ -97,7 +97,7 @@
},
{
"cell_type": "markdown",
"id": "heated-coordination",
"id": "speaking-luxembourg",
"metadata": {},
"source": [
"The folder features the following samples."
@@ -106,7 +106,7 @@
{
"cell_type": "code",
"execution_count": 3,
"id": "specific-specialist",
"id": "mysterious-geology",
"metadata": {},
"outputs": [],
"source": [
@@ -115,7 +115,7 @@
},
{
"cell_type": "markdown",
"id": "incoming-vancouver",
"id": "better-wesley",
"metadata": {},
"source": [
"Import the GEX data and combine it into a single object. Prepend the sample name to each cell barcode, separated with `_`."
@@ -124,7 +124,7 @@
{
"cell_type": "code",
"execution_count": 4,
"id": "amateur-swaziland",
"id": "basic-radio",
"metadata": {},
"outputs": [],
"source": [
@@ -142,7 +142,7 @@
},
{
"cell_type": "markdown",
"id": "authorized-hammer",
"id": "disabled-brake",
"metadata": {},
"source": [
"Import the Dandelion preprocessing output, and then combine that into a matching single object as well. We don't need to modify the cell names here, as they've already got the sample ID prepended to them by specifying the `prefix` in `meta.csv`."
@@ -151,7 +151,7 @@
{
"cell_type": "code",
"execution_count": 5,
"id": "rural-shape",
"id": "harmful-davis",
"metadata": {},
"outputs": [],
"source": [
@@ -165,7 +165,7 @@
},
{
"cell_type": "markdown",
"id": "patent-maple",
"id": "judicial-brown",
"metadata": {},
"source": [
"Do standard GEX processing via Scanpy."
@@ -174,7 +174,7 @@
{
"cell_type": "code",
"execution_count": 6,
"id": "interested-investigation",
"id": "surprised-paintball",
"metadata": {},
"outputs": [],
"source": [
@@ -193,7 +193,7 @@
},
{
"cell_type": "markdown",
"id": "focal-hughes",
"id": "alien-thought",
"metadata": {},
"source": [
"And that's it! Save the objects."
@@ -202,7 +202,7 @@
{
"cell_type": "code",
"execution_count": 7,
"id": "optimum-skill",
"id": "horizontal-celebration",
"metadata": {},
"outputs": [],
"source": [
