-
Notifications
You must be signed in to change notification settings - Fork 4
/
Getting-Started.Rmd
154 lines (113 loc) · 3.64 KB
/
Getting-Started.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
---
title: "Getting started with MuData for MultiAssayExperiment"
author:
- name: "Danila Bredikhin"
affiliation: "European Molecular Biology Laboratory, Heidelberg, Germany"
email: "danila.bredikhin@embl.de"
- name: "Ilia Kats"
affiliation: "German Cancer Research Center, Heidelberg, Germany"
email: "i.kats@dkfz-heidelberg.de"
date: "`r Sys.Date()`"
output:
BiocStyle::html_document:
toc_float: true
vignette: >
%\VignetteIndexEntry{Getting started with MuDataMae}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
## Introduction
Multimodal data format — [MuData](https://mudata.readthedocs.io/) —
[has been introduced](https://www.biorxiv.org/content/10.1101/2021.06.01.445670v1)
to address the need for cross-platform standard for sharing
large-scale multimodal omics data. Importantly, it develops ideas of and is
compatible with [AnnData](https://anndata.readthedocs.io/) standard for storing
raw and derived data for unimodal datasets.
In Bioconductor, multimodal datasets have been stored in
[MultiAssayExperiment](https://bioconductor.org/packages/MultiAssayExperiment)
(MAE) objects. This `MuData` package provides functionality to read data from
MuData files into MAE objects as well as to save MAE objects into H5MU files.
## Installation
The most recent dev build can be installed from GitHub:
```{r, eval = FALSE}
library(remotes)
remotes::install_github("ilia-kats/MuData")
```
Stable version of `MuData` will be available in future bioconductor versions.
## Loading libraries
```{r setup, message = FALSE}
library(MuData)
library(MultiAssayExperiment)
library(rhdf5)
```
## Writing H5MU files
We'll use a simple MAE object from the `MultiAssayExperiment` package that
we'll then save in a H5MU file.
```{r}
data(miniACC)
miniACC
```
We will now write its contents into an H5MU file with `WriteH5MU`:
```{r}
writeH5MU(miniACC, "miniacc.h5mu")
```
## Reading H5MU files
We can manually check the top level of the structure of the file:
```{r}
rhdf5::h5ls("miniacc.h5mu", recursive = FALSE)
```
Or dig deeper into the file:
```{r}
h5 <- rhdf5::H5Fopen("miniacc.h5mu")
h5&'mod'
rhdf5::H5close()
```
### Creating MAE objects from H5MU files
This package provides `ReadH5MU` to create an object with data from an H5MU
file. Since H5MU structure has been designed to accommodate more structured
information than MAE, only some data will be read. For instance, MAE has no
support for loading multimodal embeddings or pairwise graphs.
```{r}
acc <- readH5MU("miniacc.h5mu")
acc
```
Importantly, we recover the information from the original MAE object:
```{r}
head(colData(miniACC)[,1:4])
head(colData(acc)[,1:4])
```
Features metadata is also recovered:
```{r}
head(rowData(miniACC[["gistict"]]))
head(rowData(acc[["gistict"]]))
```
#### Backed objects
It is possible to read H5MU files while keeping matrices (both `.X` and
`.layers`) on disk.
```{r}
acc_b <- readH5MU("miniacc.h5mu", backed = TRUE)
assay(acc_b, "RNASeq2GeneNorm")[1:5,1:3]
```
The data in the assay is a `DelayedMatrix` object:
```{r}
class(assay(acc_b, "RNASeq2GeneNorm"))
```
This is in contrast to the `acc` object that has matrices in memory:
```{r}
assay(acc, "RNASeq2GeneNorm")[1:5,1:3]
class(assay(acc, "RNASeq2GeneNorm"))
```
## References
- [Muon: multimodal omics analysis framework](https://www.biorxiv.org/content/10.1101/2021.06.01.445670) preprint
- [mudata](https://mudata.readthedocs.io/) (Python) documentation
- muon [documentation](https://muon.readthedocs.io/) and [web page](https://gtca.github.io/muon/)
## Session Info
```{r}
sessionInfo()
```