---
title: "Getting started"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
knitr::opts_chunk$set(eval = reticulate::py_module_available("anndata"))
```
The API of anndata for R is very similar to its Python counterpart.
Check out `?anndata` for a full list of the functions provided by this package.

`AnnData()` stores a data matrix `X` together with annotations
of observations `obs` (`obsm`, `obsp`), variables `var` (`varm`, `varp`),
and unstructured annotations `uns`.

Here is an example of how to create an AnnData object with 2 observations and 3 variables.
```{r}
library(anndata)
ad <- AnnData(
  X = matrix(1:6, nrow = 2),
  obs = data.frame(group = c("a", "b"), row.names = c("s1", "s2")),
  var = data.frame(type = c(1L, 2L, 3L), row.names = c("var1", "var2", "var3")),
  layers = list(
    spliced = matrix(4:9, nrow = 2),
    unspliced = matrix(8:13, nrow = 2)
  ),
  obsm = list(
    ones = matrix(rep(1L, 10), nrow = 2),
    rand = matrix(rnorm(6), nrow = 2),
    zeros = matrix(rep(0L, 10), nrow = 2)
  ),
  varm = list(
    ones = matrix(rep(1L, 12), nrow = 3),
    rand = matrix(rnorm(6), nrow = 3),
    zeros = matrix(rep(0L, 12), nrow = 3)
  ),
  uns = list(
    a = 1,
    b = data.frame(i = 1:3, j = 4:6, value = runif(3)),
    c = list(c.a = 3, c.b = 4)
  )
)
ad
```
You can read the information back out using the `$` notation.
```{r}
ad$X
ad$obs
ad$obsm[["ones"]]
ad$layers[["spliced"]]
ad$uns[["b"]]
```
### Reading / writing AnnData objects
Read from h5ad format:
```{r, eval = FALSE}
read_h5ad("pbmc_1k_protein_v3_processed.h5ad")
```
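Writing works the same way in the other direction. The following is a minimal round-trip sketch, assuming the package's `write_h5ad()` function. The object `ad_small` and the temporary path are purely illustrative. It is not evaluated here because it touches the file system.

```{r, eval = FALSE}
library(anndata)

# Small illustrative object; the values are arbitrary.
ad_small <- AnnData(
  X = matrix(1:6, nrow = 2),
  obs = data.frame(group = c("a", "b"), row.names = c("s1", "s2")),
  var = data.frame(type = c(1L, 2L, 3L), row.names = c("var1", "var2", "var3"))
)

# Hypothetical temporary path used only for this example.
path <- tempfile(fileext = ".h5ad")

# Write the object to disk, then read it back in.
write_h5ad(ad_small, path)
ad_back <- read_h5ad(path)

# The round trip should preserve the observation and variable names.
ad_back$obs_names
ad_back$var_names
```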
### Creating a view
You can use any of the regular R indexing methods to subset the `AnnData` object.
This will result in a 'View' of the underlying data without needing to store
the same data twice.
```{r}
view <- ad[, 2]
view
view$is_view
ad[, c("var1", "var2")]
ad[-1, ]
```
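If an independent object is needed rather than a view, one option is to copy the view. The sketch below assumes that calling `$copy()` on a view materialises the subset as a standalone object, as it does in Python anndata. The names `ad_tiny`, `sub_view`, and `sub` are illustrative.

```{r}
library(anndata)

# Small illustrative object; the values are arbitrary.
ad_tiny <- AnnData(
  X = matrix(1:6, nrow = 2),
  obs = data.frame(group = c("a", "b"), row.names = c("s1", "s2")),
  var = data.frame(type = c(1L, 2L, 3L), row.names = c("var1", "var2", "var3"))
)

# Subsetting yields a view onto ad_tiny.
sub_view <- ad_tiny[, c("var1", "var2")]
sub_view$is_view

# Copying the view is assumed to produce a standalone AnnData object.
sub <- sub_view$copy()
sub$is_view
```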
### AnnData as a matrix
The `X` attribute can be used as an R matrix:
```{r}
ad$X[, c("var1", "var2")]
ad$X[-1, , drop = FALSE]
ad$X[, 2] <- 10
```
You can access a different layer matrix as follows:
```{r}
ad$layers["unspliced"]
ad$layers["unspliced"][, c("var2", "var3")]
```
### Note on state
If you assign an AnnData object to another variable and
modify either, both will be modified, because both variables refer to the same underlying object:
```{r}
ad2 <- ad
# Use a value that was not assigned earlier, so the shared change is visible
ad$X[, 2] <- 20
list(ad = ad$X, ad2 = ad2$X)
```
This is standard behaviour in Python but not in R, where modifying one
variable normally leaves the other untouched. To obtain two separate
copies of an AnnData object, use the `$copy()` method:
```{r}
ad3 <- ad$copy()
ad$X[, 2] <- c(3, 4)
list(ad = ad$X, ad3 = ad3$X)
```
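For readers used to R's usual copy semantics, the contrast can be sketched in base R alone. The example below does not use anndata. It compares an ordinary vector (copied on modification) with an environment (shared by reference), as a rough analogy for how an AnnData object behaves. The names `v1`, `v2`, `e1`, and `e2` are illustrative.

```{r}
# Ordinary R values: modifying the second variable leaves the first intact.
v1 <- c(1, 2, 3)
v2 <- v1
v2[2] <- 10
v1 # still 1 2 3

# Environments have reference semantics: both names point to the same object.
e1 <- new.env()
e1$x <- c(1, 2, 3)
e2 <- e1
e2$x[2] <- 10
e1$x # 1 10 3, changed through e2
```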