---
title: "Load example images"
author: "Nicolas Damond"
date: "`r Sys.Date()`"
output: workflowr::wflow_html
editor_options:
  chunk_output_type: console
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

This script downloads 100 example images and masks from the pancreas IMC dataset, which is available [here](http://dx.doi.org/10.17632/cydmwsfztj.2).

The dataset is associated with the following publication:

[Damond et al. A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry. Cell Metabolism. 2019 Mar 5;29(3):755-768](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6821395)

The images and masks were created using the [imctools](https://github.com/BodenmillerGroup/imctools) package and the [IMC segmentation pipeline](https://github.com/BodenmillerGroup/ImcSegmentationPipeline).

We will use the [cytomapper](https://www.bioconductor.org/packages/release/bioc/html/cytomapper.html) package to read in the images and create `CytoImageList` objects.

# Download and read in images

Here, a subset of 100 images from the pancreas IMC dataset is downloaded.
We use the `loadImages` function of the `cytomapper` package to read them into a `CytoImageList` object.

```{r load-images, message=FALSE}
library(cytomapper)
# Download the zipped image folder and unzip it
url.images <- "https://data.mendeley.com/public-files/datasets/cydmwsfztj/files/b37054d2-d5d0-4c48-a001-81ff77136f41/file_downloaded"
# mode = "wb" writes the archive in binary mode, which keeps it intact on Windows
download.file(url.images, destfile = "data/PancreasData/ImageSubset.zip", mode = "wb")
unzip("data/PancreasData/ImageSubset.zip", exdir = "data/PancreasData/")
file.remove("data/PancreasData/ImageSubset.zip")
# Load the images as a CytoImageList object
images <- loadImages("data/PancreasData/", pattern="_full_clean.tiff")
images
```
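
The `pattern` argument selects the files to read by name. As a minimal base R sketch of that selection, assuming the pattern is interpreted as a regular expression in the same way as by `list.files`, the following uses made-up file names (not actual files from the dataset) to show which names a pattern such as `_full_clean.tiff` picks up:

```r
# Hypothetical file names, for illustration only
files <- c("A01_a0_full_clean.tiff",
           "A01_a0_full_mask.tiff",
           "A02_a0_full_clean.tiff",
           "readme.txt")

# Keep the names that contain the image stack suffix
image.files <- files[grepl("_full_clean.tiff", files)]
image.files

# In a regular expression "." matches any character;
# escaping it and anchoring the end gives a stricter pattern
strict.files <- files[grepl("_full_clean\\.tiff$", files)]
identical(image.files, strict.files)
```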
We also download the associated segmentation masks and read them into a `CytoImageList` object.
```{r load-masks}
# Download the zipped mask folder and unzip it
url.masks <- "https://data.mendeley.com/public-files/datasets/cydmwsfztj/files/13679a61-e9b4-4820-9f09-a5bbc697647c/file_downloaded"
# mode = "wb" writes the archive in binary mode, which keeps it intact on Windows
download.file(url.masks, destfile = "data/PancreasData/Masks.zip", mode = "wb")
unzip("data/PancreasData/Masks.zip", exdir = "data/PancreasData/")
file.remove("data/PancreasData/Masks.zip")
# Load the masks as a CytoImageList object
masks <- loadImages("data/PancreasData/", pattern="_full_mask.tiff")
masks
```
Here, we remove the downloaded image and mask files again.
```{r clean-up-2, message = FALSE}
# Remove image stacks
images.del <- list.files("data/PancreasData/", pattern="_full_clean.tiff", full.names = TRUE)
file.remove(images.del)
# Remove masks
masks.del <- list.files("data/PancreasData/", pattern="_full_mask.tiff", full.names = TRUE)
file.remove(masks.del)
```
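
The list-then-remove idiom above can be tried safely on throwaway files. The sketch below works in a temporary directory with made-up file names, so nothing in `data/PancreasData/` is touched:

```r
# Work in a temporary directory so no real data are touched
tmp.dir <- file.path(tempdir(), "cleanup-demo")
dir.create(tmp.dir, showWarnings = FALSE)

# Hypothetical file names, for illustration only
demo.files <- file.path(tmp.dir, c("A01_a0_full_clean.tiff",
                                   "A01_a0_full_mask.tiff",
                                   "keep.csv"))
invisible(file.create(demo.files))

# full.names = TRUE returns paths that file.remove() can use directly
to.delete <- list.files(tmp.dir, pattern = "_full_clean.tiff", full.names = TRUE)
removed <- file.remove(to.delete)

# file.remove() returns one logical value per file
all(removed)
remaining <- list.files(tmp.dir)
remaining

# Remove the temporary directory again
unlink(tmp.dir, recursive = TRUE)
```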

# Load panel data

Here, we download the panel information, which contains antibody-related metadata.
For some datasets, the order of the channels in the image stacks does not match the order of the entries in the panel.
For this reason, the channel-mass file is used to match the panel information to the image stack slices.
This will be important later to set the `channelNames` of the `CytoImageList` objects.

```{r load-panel}
# Import panel
url.panel <- "https://data.mendeley.com/public-files/datasets/cydmwsfztj/files/2f9fecfc-b98f-4937-bc38-ae1b959bd74d/file_downloaded"
download.file(url.panel, destfile = "data/PancreasData/panel.csv")
panel <- read.csv("data/PancreasData/panel.csv")
# Import channel-mass file
url.channelmass <- "https://data.mendeley.com/public-files/datasets/cydmwsfztj/files/704312eb-377c-42e2-8227-44bb9aca0fb3/file_downloaded"
download.file(url.channelmass, destfile = "data/PancreasData/ChannelMass.csv")
channel.mass <- read.csv("data/PancreasData/ChannelMass.csv", header = FALSE)
```
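
The channel-mass file is read with `header = FALSE`. A small sketch of what that option changes, assuming the file is a single column of metal tags without a header row (the tags below are made up for illustration):

```r
# Hypothetical one-column file content, for illustration only
csv.text <- "In113\nIn115\nPr141"

# With header = FALSE the first line is read as data and
# the column receives the default name V1
demo.mass <- read.csv(text = csv.text, header = FALSE)
demo.mass
nrow(demo.mass)

# With the default header = TRUE the first tag is consumed as a column name
nrow(read.csv(text = csv.text))
```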

# Process images and masks

We now need to process the images and masks to make them compatible with `cytomapper`.
The masks are 16-bit images and need to be re-scaled in order to obtain integer cell IDs.

```{r scale-masks}
# Before scaling
masks[[1]]
masks <- scaleImages(masks, value = (2 ^ 16) - 1)
# After scaling
masks[[1]]
```
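
A short arithmetic sketch of why the scaling factor is `2^16 - 1`. It assumes that a 16-bit pixel value `k` is represented as `k / 65535` after reading (intensities normalised to the range 0 to 1) and that `scaleImages` multiplies each pixel by `value`. The cell IDs are made up for illustration:

```r
# Hypothetical integer cell IDs as stored in a 16-bit mask
cell.ids <- c(0, 1, 2, 357, 65535)

# Assumed representation after reading: intensities normalised to [0, 1]
max.16bit <- 2^16 - 1
as.read <- cell.ids / max.16bit
range(as.read)

# Multiplying by the 16-bit maximum recovers the IDs
recovered <- as.read * max.16bit
all.equal(recovered, cell.ids)
```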
Next, we will add the `ImageName` to the images and masks objects.
This information is stored in the metadata columns of the `CytoImageList` objects
and is used by `cytomapper` to match single-cell data, images, and masks.
```{r add-image-names}
mcols(images)$ImageName <- gsub("_a0_full_clean", "", names(images))
mcols(masks)$ImageName <- gsub("_a0_full_mask", "", names(masks))
```
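
To illustrate the renaming step, the sketch below applies the same two `gsub` calls to made-up object names (not actual names from the dataset). Removing the type-specific suffix leaves an identifier that is shared between an image and its mask:

```r
# Hypothetical object names, for illustration only
image.names <- c("A02_a0_full_clean", "B07_a0_full_clean")
mask.names  <- c("A02_a0_full_mask", "B07_a0_full_mask")

# Stripping the type-specific suffix leaves a shared identifier
image.ids <- gsub("_a0_full_clean", "", image.names)
mask.ids  <- gsub("_a0_full_mask", "", mask.names)

image.ids
identical(image.ids, mask.ids)
```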
We downloaded the full set of segmentation masks, but only a subset of the images.
To match the segmentation masks to the corresponding images, we will subset the masks.
As a safety check, we will make sure that the `ImageName`s of the masks are identical to those of the images.
```{r subset-masks}
masks <- masks[mcols(masks)$ImageName %in% mcols(images)$ImageName]
identical(mcols(masks)$ImageName, mcols(images)$ImageName)
```
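
The check above is sensitive to order: `%in%` keeps the masks in their own order, and `identical` returns `FALSE` if that order differs from the order of the images. A minimal sketch with made-up identifiers, including one possible way to realign with `match` should the check fail:

```r
# Hypothetical identifiers, for illustration only
image.ids <- c("A02", "B07", "C11")
mask.ids  <- c("A01", "A02", "B07", "B09", "C11")

# %in% keeps the masks that have an image, in the masks' own order
kept <- mask.ids[mask.ids %in% image.ids]
identical(kept, image.ids)

# identical() is order-sensitive, so a shuffled set fails the check
shuffled <- rev(kept)
identical(shuffled, image.ids)

# match() puts the masks in the order of the images
reordered <- shuffled[match(image.ids, shuffled)]
identical(reordered, image.ids)
```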
Finally, we will use the protein short name as `channelNames`.
Again, we need to make sure that the names match the correct order of the channels.
```{r add-channel-names}
# Match panel and stack slice information
panel <- panel[panel$full == 1,]
panel <- panel[match(channel.mass[,1], panel$MetalTag),]
# Add channel names to the image stacks CytoImageList object
channelNames(images) <- panel$shortname
```
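
The reordering with `match` can be followed on a toy panel. The column names `MetalTag`, `shortname` and `full` are those used above; all values, including the metal and marker pairings, are made up, and the sketch assumes that `full == 1` flags the channels that are part of the full image stack:

```r
# Hypothetical panel, for illustration only
demo.panel <- data.frame(
  MetalTag  = c("Pr141", "In113", "In115", "Nd142"),
  shortname = c("INS", "H3", "SMA", "CD38"),
  full      = c(1, 1, 1, 0)
)

# Hypothetical channel order of the image stack
demo.mass <- data.frame(V1 = c("In113", "In115", "Pr141"))

# Keep the flagged channels, then reorder the rows to the stack order
demo.panel <- demo.panel[demo.panel$full == 1, ]
demo.panel <- demo.panel[match(demo.mass[, 1], demo.panel$MetalTag), ]

# An NA here would mean that a stack channel is missing from the panel
anyNA(demo.panel$MetalTag)
demo.panel$shortname
```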

# Save the CytoImageList objects

Here, we will save the generated `CytoImageList` objects for convenient access later on.

```{r save}
saveRDS(images, "data/PancreasData/pancreas_images.rds")
saveRDS(masks, "data/PancreasData/pancreas_masks.rds")
```
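
For later access, the saved objects can be read back with `readRDS`. The round trip is sketched below on a small stand-in list written to a temporary file; the real objects are `CytoImageList` instances stored under the paths used above:

```r
# A small stand-in object, for illustration only
demo.object <- list(ImageName = c("A02", "B07"), n.channels = 3L)

# Write to a temporary file so no project data are touched
rds.path <- tempfile(fileext = ".rds")
saveRDS(demo.object, rds.path)

# readRDS() returns the object, which can be assigned to any name
restored <- readRDS(rds.path)
identical(restored, demo.object)

# Remove the temporary file again
unlink(rds.path)
```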

# Clean up

Finally, we delete the panel and channel-mass files, which are no longer needed.

```{r clean-up, message = FALSE}
file.remove("data/PancreasData/panel.csv", "data/PancreasData/ChannelMass.csv")
```