Commit

Merge pull request #85 from BodenmillerGroup/devel
Devel
nilseling committed Nov 28, 2023
2 parents 36ab967 + bb9b77a commit a869169
Showing 6 changed files with 256 additions and 22 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -18,3 +18,4 @@ outputs/*
!publication/README.md
!publication/protocol.md
!CHANGELOG.md
!DEVELOPMENT.md
87 changes: 69 additions & 18 deletions 11-spatial_analysis.Rmd
@@ -334,20 +334,83 @@ plotSpatial(spe,
scale_color_brewer(palette = "Set3")
```

The next code chunk visualizes the cell type compositions of the
detected cellular neighborhoods (CN).
There are several ways to visualize the cell type composition
of the detected cellular neighborhoods (CN). First, we can look at the total
number of cells per cell type and CN.

```{r}
for_plot <- prop.table(table(spe$cn_celltypes, spe$celltype),
margin = 1)
for_plot <- table(as.character(spe$cn_celltypes), spe$celltype)
pheatmap(for_plot,
color = viridis(100), display_numbers = TRUE,
number_color = "white", number_format = "%.0f")
```

Next, we can visualize, for each cell type, the fraction of its cells that is
assigned to each CN.

```{r}
for_plot <- prop.table(table(as.character(spe$cn_celltypes), spe$celltype), margin = 2)
pheatmap(for_plot,
color = viridis(100), display_numbers = TRUE,
number_color = "white", number_format = "%.2f")
```

Similarly, we can visualize the fraction of each CN made up of each cell type.

```{r}
for_plot <- prop.table(table(as.character(spe$cn_celltypes), spe$celltype), margin = 1)
pheatmap(for_plot,
color = viridis(100), display_numbers = TRUE,
number_color = "white", number_format = "%.2f")
```

This visualization can also be scaled by column to account for the relative
cell type abundance.

```{r}
pheatmap(for_plot,
color = colorRampPalette(c("dark blue", "white", "dark red"))(100),
scale = "column")
```
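
The `margin` argument of `prop.table` and the column scaling used in the chunks
above can be illustrated on a small toy table. This is only a sketch in base R:
the vectors `cn` and `celltype` are made-up stand-ins for `spe$cn_celltypes` and
`spe$celltype`, and base R's `scale` is used here as an approximation of what
`pheatmap(..., scale = "column")` does (centering and scaling each column).

```r
# Hypothetical toy data: CN assignment and cell type of eight cells
cn <- c("1", "1", "1", "2", "2", "3", "3", "3")
celltype <- c("Tumor", "Tumor", "Bcell", "Bcell",
              "Tcell", "Tumor", "Tcell", "Tcell")

# Rows are CNs, columns are cell types
counts <- table(cn, celltype)

# margin = 1: each row sums to 1 (fraction of each CN made up of each cell type)
frac_per_cn <- prop.table(counts, margin = 1)

# margin = 2: each column sums to 1 (fraction of each cell type found in each CN)
frac_per_celltype <- prop.table(counts, margin = 2)

# Column-wise centering and scaling, so that abundant and rare cell types
# become comparable across CNs
scaled <- scale(unclass(frac_per_cn))

print(rowSums(frac_per_cn))
print(colSums(frac_per_celltype))
print(round(scaled, 2))
```

With `margin = 1` the rows of the heatmap are directly comparable, whereas with
`margin = 2` the columns are.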

CN 1 and CN 6 are mainly composed of tumor cells with CN 6 forming the
tumor/stroma border. CN 3 is mainly composed of B and BnT cells
Lastly, we can visualize the enrichment of cell types within cellular neighborhoods
using the `regionMap` function of the `lisaClust` package.

```{r}
library(lisaClust)
regionMap(spe,
cellType = "celltype",
region = "cn_celltypes")
```

It is also recommended to visualize some images to confirm the interpretation of
cellular neighborhoods. For this we can use either the `lisaClust::hatchingPlot` or
the `imcRtools::plotSpatial` function:

```{r}
# hatchingPlot
cur_spe <- spe[,spe$sample_id == "Patient1_003"]
cur_sce <- as(cur_spe, "SingleCellExperiment")
cur_sce$x <- spatialCoords(cur_spe)[,1]
cur_sce$y <- spatialCoords(cur_spe)[,2]
cur_sce$region <- as.character(cur_sce$cn_celltypes)
hatchingPlot(cur_sce, region = "region", cellType = "celltype") +
scale_color_manual(values = metadata(spe)$color_vectors$celltype)
```

```{r, fig.height=8, fig.width=10}
# plotSpatial
plotSpatial(spe[,spe$sample_id == "Patient1_003"],
img_id = "cn_celltypes", node_color_by = "celltype", node_size_fix = 0.7) +
scale_color_manual(values = metadata(spe)$color_vectors$celltype)
```

CN 1 and CN 6 are mainly enriched for tumor cells, with CN 6 forming the
tumor/stroma border. CN 3 is mainly enriched for B and BnT cells,
indicating TLS. CN 5 is composed of aggregated plasma cells and most T
cells.

@@ -408,15 +471,12 @@ derive numeric vectors for each cell which can then again be clustered
using kmeans. All steps are supported by the `lisaClust` function which
can be applied to a `SingleCellExperiment` and `SpatialExperiment` object.


In the following example, we calculate the LISA curves within a 10µm, 20µm and
50µm neighborhood around each cell. Increasing these radii will lead to broader
and smoother spatial clusters. However, a number of parameter settings should be
tested to estimate the robustness of the results.

```{r lisaClust, fig.height=12, fig.width=12, message=FALSE}
library(lisaClust)
set.seed(220705)
spe <- lisaClust(spe,
k = 6,
@@ -448,15 +508,6 @@ In this case, CN 1 and 4 contain tumor cells but no CN is forming the
tumor/stroma interface. CN 3 represents TLS. CN 2 indicates T cell
subtypes and plasma cells are aggregated to CN 5.

As an alternative way of visualizing the enrichment of cell types within the
detected CNs, the `lisaClust` package provides the `regionMap` function.

```{r}
regionMap(spe,
cellType = "celltype",
region = "region")
```

## Spatial context analysis

Downstream of CN assignments, we will analyze the spatial context (SC)
7 changes: 6 additions & 1 deletion CHANGELOG.md
@@ -4,4 +4,9 @@

**Version 1.0.1** [2023-10-19]

- Added seed before `predict` call after training a classifier
- Added seed before `predict` call after training a classifier

**Version 1.0.2** [2023-11-27]

- Added developers documentation
- Added more ways to visualize cell type composition per CN
134 changes: 134 additions & 0 deletions DEVELOPMENT.md
@@ -0,0 +1,134 @@
# Useful information when developing this book

This document guides future developers in maintaining and extending the IMC
data analysis book.

## General setup

* The IMC data analysis book is written in [bookdown](https://bookdown.org/).
* Each section is stored in its own `.Rmd` file with `index.Rmd` building the landing page
* References are stored in `book.bib`
* At the end of each `.Rmd` file a number of unit tests are executed. These
unit tests are always executed but their results are not shown in the book.
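
As a rough illustration of what such a hidden test could look like, the sketch
below uses base R's `stopifnot`. The object `cell_counts` and the expected
values are hypothetical, and the book itself may rely on a dedicated testing
package instead. In an `.Rmd` file, the knitr chunk option `include = FALSE`
runs a chunk while hiding its code and output.

```r
# Hypothetical stand-in for a result computed earlier in a section
cell_counts <- table(c("Tumor", "Tumor", "Bcell", "Tcell"))

# Each condition must hold; otherwise an error is raised and rendering stops
stopifnot(
  sum(cell_counts) == 4,
  identical(names(cell_counts), c("Bcell", "Tcell", "Tumor")),
  cell_counts[["Tumor"]] == 2
)
```

Checks of this kind fail as soon as a result no longer matches the value that
the text of a section relies on.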

### Continuous integration/continuous deployment

* CI/CD is executed based on the workflow [here](https://github.com/BodenmillerGroup/IMCDataAnalysis/blob/main/.github/workflows/build.yml).
* On the first of each month, a new Docker image is built based on the [Dockerfile](https://github.com/BodenmillerGroup/IMCDataAnalysis/blob/main/Dockerfile). We do this so that the workflow is always tested against the newest software versions.
* The Docker image is pushed to the GitHub Container Registry [here](https://github.com/BodenmillerGroup/IMCDataAnalysis/pkgs/container/imcdataanalysis).
* The Docker image is date tagged and `latest` always refers to the newest build.
* Once the Docker image is built, the IMC data analysis book is executed within the
newest Docker image. This will also run all unit tests.

**Of note:** Sometimes the calculation of the UMAP produces slightly different
results. If that happens, the workflow run can be re-executed by clicking the `Re-run jobs` button of the workflow run.
This test could also be excluded in the long run.

* When pushing to `main` (either directly or via a PR), the CI/CD workflow is
executed.
* If the Dockerfile changed (e.g., if you want to add a new package), a new Docker image is built and the workflow is executed within the new Docker image.
* If the Dockerfile did not change, the workflow is executed within the most recent Docker image.

## Updating the book

This section describes how to update the book. You may want to do this to add new content,
fix bugs or adjust unit tests.

### Work on the devel branch

It is recommended to work on the `devel` branch of the GitHub repository when adding
new changes.

### Work within the newest Docker container

It is also recommended to always work within a Docker container based on the newest
Docker image available:

1. After installing [Docker](https://docs.docker.com/get-docker/), first pull the image via:

```
docker pull ghcr.io/bodenmillergroup/imcdataanalysis:yyyy-mm-dd
```

and then run the container:

```
docker run -v /path/to/IMCDataAnalysis:/home/rstudio/IMCDataAnalysis \
-e PASSWORD=bioc -p 8787:8787 \
ghcr.io/bodenmillergroup/imcdataanalysis:yyyy-mm-dd
```

2. An RStudio server session can be accessed via a browser at `localhost:8787` using `Username: rstudio` and `Password: bioc`.
3. Navigate to `IMCDataAnalysis` and open the `IMCDataAnalysis.Rproj` file.
4. Code in the individual files can now be executed or the whole workflow can be built by entering `bookdown::render_book()`.

### Adding new packages

If you need to add new packages to the workflow, make sure to add them to the
[software requirements](https://bodenmillergroup.github.io/IMCDataAnalysis/prerequisites.html#software-requirements)
section and to the Dockerfile.

### Opening a pull request

Now you can change the content of the book.
Once you have added all changes, push them to `devel` and open a pull request
to `main`. Once all checks have passed, you can merge the PR.

### Add changes to CHANGELOG.md

Please track the changes that you are making in the [CHANGELOG.md](CHANGELOG.md) file.

### Trigger a new release

Once you have added the changes to the CHANGELOG, merged the pull request and
the workflow has been executed on CI/CD, you can trigger a new release.

* Go to the [releases page](https://github.com/BodenmillerGroup/IMCDataAnalysis/releases) and click on `Draft a new release` at the top of the page.
* Under `Choose a tag` create a new tag and give details on the release.
* With each release the corresponding [Zenodo repository](https://zenodo.org/records/10209942) is updated.

## Updating the data

For new `steinbock` releases and specifically if the Mesmer version changes, the
example data should be updated. The example data are stored on Central NAS
and are hosted on Zenodo.

### Re-analyse the example data

* You can find the raw data on [Zenodo](https://zenodo.org/records/7575859).
* On Central NAS under projects/IMCWorkflow/zenodo create a new folder called `steinbock_0.x.y` where x denotes the new major version and y the new minor version.
* Copy the `steinbock.sh` script from the folder of the previous version to the folder of the newest version.
* Change the steinbock version number in the `steinbock.sh` script and execute it.
* It should generate all relevant files and zip all folders.

### Upload data to Zenodo

* On [Zenodo](https://zenodo.org/records/7624451), click on `New version` and replace all files with the newer versions. There is no need to upload the raw data to Zenodo as they are hosted in a different repository.

### Adjust the book

* Work in the most recent Docker container and on the devel branch.
* Manually go through each section and update the links in the [Prerequisites](https://bodenmillergroup.github.io/IMCDataAnalysis/prerequisites.html#download-data) section
* Make sure to check and adjust the unit tests at the end of each file
* Make sure that the text (e.g. clustering) still matches the results

*Important:* as we are training a random forest classifier on manually gated cells, these gated cells won't match the newest version of the data if the Mesmer version changed. For this, we have the `code/transfer_labels.R` script that automatically re-gates cells in the new SPE object.

* Go through all sections until `Cell phenotyping`
* Based on the old `gated_cells` and the new SPE object, execute the `code/transfer_labels.R` script
* Zip the new `gated_cells` and upload them to a new version on [Zenodo](https://zenodo.org/records/8095133)
* Adjust the link to the new gated cells in the [Prerequisites](https://bodenmillergroup.github.io/IMCDataAnalysis/prerequisites.html#download-data) section
* Make sure that the new classification results closely match the previous results

* Continue going through the book

### Execute the book

* When you are done working through the book, within the Docker container open the RProject file and execute `bookdown::render_book()` to make sure that it can be executed from beginning to end.
* Under `data/CellTypeValidation` have a look at the PNGs to check if cell types were correctly detected.

### Add changes to CHANGELOG.md

Finally, add all the recent changes to the CHANGELOG, create and merge a PR and create a new release (see above).


25 changes: 23 additions & 2 deletions README.md
@@ -10,7 +10,6 @@ R workflow highlighting analyses approaches for multiplexed imaging data.

## Scope


This workflow explains the use of common R/Bioconductor packages to pre-process and analyse single-cell data obtained from segmented multichannel images.
While we use imaging mass cytometry (IMC) data as an example, the concepts presented here can be applied to images obtained by other technologies (e.g. CODEX, MIBI, mIF, CyCIF, etc.).
The workflow can be largely divided into the following parts:
@@ -23,6 +22,13 @@ The workflow can be largely divided into the following parts:
6. Image visualization
7. Spatial analyses

## Update freeze

This workflow was actively developed until December 2023. At that time,
we used the most recent (`v.0.16.0`) version of `steinbock` to process the
example data. If you encounter issues when using newer versions of `steinbock`,
please open an issue [here](https://github.com/BodenmillerGroup/IMCDataAnalysis/issues).

## Usage

To reproduce the analysis displayed at [https://bodenmillergroup.github.io/IMCDataAnalysis/](https://bodenmillergroup.github.io/IMCDataAnalysis/) clone the repository via:
@@ -58,6 +64,20 @@ docker pull ghcr.io/bodenmillergroup/imcdataanalysis:<year-month-date>
3. Navigate to `IMCDataAnalysis` and open the `IMCDataAnalysis.Rproj` file.
4. Code in the individual files can now be executed or the whole workflow can be built by entering `bookdown::render_book()`.

## Feedback

We provide the workflow as an open-source resource. This does not mean that
the workflow has been tested on all possible datasets or biological questions, and
there are multiple ways of analysing data. We therefore recommend that you
check the results and question their biological interpretation.

If you notice an issue or missing information, please report it
[here](https://github.com/BodenmillerGroup/IMCDataAnalysis/issues). We also
welcome contributions in the form of pull requests or feature requests in the form of
issues. Have a look at the source code at:

[https://github.com/BodenmillerGroup/IMCDataAnalysis](https://github.com/BodenmillerGroup/IMCDataAnalysis)

## Contributing guidelines

For feature requests and bug reports, please raise an issue [here](https://github.com/BodenmillerGroup/IMCDataAnalysis/issues).
@@ -68,10 +88,11 @@ To add new libraries to the container please add them to the [Dockerfile](Docker

## Maintainer

[Nils Eling](https://github.com/nilseling)
[Daniel Schulz](https://github.com/SchulzDan)

## Contributors

[Nils Eling](https://github.com/nilseling)
[Vito Zanotelli](https://github.com/votti)
[Daniel Schulz](https://github.com/SchulzDan)
[Jonas Windhager](https://github.com/jwindhager)
24 changes: 23 additions & 1 deletion index.Rmd
@@ -45,10 +45,19 @@ spatial analysis and the user will need to become familiar with the general
framework to efficiently analyse data obtained from multiplexed imaging
technologies.

## Update freeze

This workflow was actively developed until December 2023. At that time,
we used the most recent (`v.0.16.0`) version of `steinbock` to process the
example data. If you encounter issues when using newer versions of `steinbock`,
please open an issue [here](https://github.com/BodenmillerGroup/IMCDataAnalysis/issues).

## Feedback and contributing

We provide the workflow as an open-source resource. It does not mean that
this workflow is tested on all possible datasets or biological questions.
this workflow has been tested on all possible datasets or biological questions, and
there are multiple ways of analysing data. We therefore recommend that you
check the results and question their biological interpretation.

If you notice an issue or missing information, please report an issue
[here](https://github.com/BodenmillerGroup/IMCDataAnalysis/issues). We also
@@ -57,6 +66,19 @@ issues. Have a look at the source code at:

[https://github.com/BodenmillerGroup/IMCDataAnalysis](https://github.com/BodenmillerGroup/IMCDataAnalysis)

## Maintainer

[Daniel Schulz](https://github.com/SchulzDan)

## Contributors

[Nils Eling](https://github.com/nilseling)
[Vito Zanotelli](https://github.com/votti)
[Daniel Schulz](https://github.com/SchulzDan)
[Jonas Windhager](https://github.com/jwindhager)
[Michelle Daniel](https://github.com/michdaniel)
[Lasse Meyer](https://github.com/lassedochreden)

## Citation

The workflow has been published in