-
Notifications
You must be signed in to change notification settings - Fork 26
/
neurons-as-graph.Rmd
344 lines (274 loc) · 11.3 KB
/
neurons-as-graph.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
---
title: "Working with individual neurons as graph structures"
author: "Gregory Jefferis"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Neurons as Graphs}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
## Introduction
Neurons can helpfully be treated as graphs (branching trees) in which nodes connected by edges
define the morphology of the neuron. The **nat** package provides a number of
built-in functions that allow you to analyse the branching structure of graphs
https://natverse.org/nat/reference/index.html
or use aspects of the branching structure to manipulate graphs. Examples of
such functions include the `strahler_order` function which calculates the
Strahler branch order for each node or segment in the neuron or the `spine`
function that extracts the longest path across the neuron.
More sophisticated analysis and manipulations can be carried out by converting
`neuron` objects into `ngraph` objects.
## Neurons as Graphs
A neuron will typically be a graph in the form of a binary tree. We follow
the convention of the Matlab [trees toolbox](https://www.treestoolbox.org/) by
Hermann Cuntz and colleagues in treating the root (typically the soma) as the
origin of the graph and then having directed edges leaving the root.
![neuron-as-graph](fig/neuron-graph.svg)
Nodes will be one of a:
* root
* branch point
* end point
* continuation point
Each node will have a numeric identifier or label (an arbitrary integer stored the in the
neuron's `PointNo` field that may have come from an external source) as well
an index (an integer starting at 1 and increasing without gaps). Although these
two identifiers may often be the same, code should never rely on this being the
case. In the wild, one frequently encounters cases where e.g. the numeric labels
* have gaps because a neuron was edited
* are not in the same order as the vertices
* have some other significance e.g. are globally unique across all neurons in a
database and therefore have large values (which may be challenging for R to
to represent give its maxint of 2^31 - 1=`r 2^31 - 1`)
```{r, message=FALSE}
library(nat)
n=Cell07PNs[[1]]
summary(n)
```
We can extract the points as follows:
```{r}
rootpoints(n)
branchpoints(n)
endpoints(n)
```
Segments are unbranched connected sequences of nodes that terminate in a branch
point or end point.
## Built-in neuron graph functions
We will give a few examples of the use of the built-in functions that treat
neurons as graphs.
### Strahler Order
The branching structure of a neuron is commonly summarised by calculating the
[Strahler Order](https://en.wikipedia.org/wiki/Strahler_number).
```{r, fig.width=6}
n=Cell07PNs[[1]]
so=strahler_order(n)
orders=1:max(so$points)
for (i in orders) {
plot(subset(n, so$points==i), col=i, add = i!=1, boundingbox = boundingbox(n))
}
```
Note the use of multiple calls to `plot.neuron` using the `add=TRUE` argument
for all but the first plot. Note also the use of the `boundingbox` argument/function
in order to ensure that the plot is set up with appropriate axes for the whole
neuron even if only part of it is plotted in the first call to `plot`.
### Spine
You can find the longest path across a neuron using the spine function.
```{r, fig.width=6}
n=Cell07PNs[[1]]
sp=spine(n)
plot(n, col='grey')
plot(sp, add=T, col='blue')
```
`spine` has a variety of options that you can use to control the results.
### Segment graph
You can use the `segmentgraph` function to make a simplified representation of
the branching structure of the neuron. In this object (which has class `igraph`
associated with the powerful `igraph` package) each unbranched segment in the
original neuron (which might have contained many vertices) is collapsed to a
single edge joining the branch points (which are retained).
```{r, fig.width=5, fig.height=5}
sg=segmentgraph(Cell07PNs[[1]])
plot(sg)
```
It can be useful to plot the graph with a tree layout:
```{r, fig.width=6, fig.height=6}
plot(sg, layout=igraph::layout_as_tree, edge.arrow.size=.3, vertex.size=15)
```
Note that the root of the neuron is placed at the top of the plot (point number
1 in the graph above) and that successive branching orders and leaves are
placed on levels further down the plot. Note also that the labels on the plot
correspond to the identifiers of the points in the original neuron (aka the
`PointNo` field, see [first section](#neurons-as-graphs)).
If you need to work with the original identifiers of the points in the
`segmentgraph` object, they are stored as `igraph` node attributes. You can
access them like this:
```{r}
igraph::V(sg)$label
igraph::V(sg)$vid
```
`label` encodes the `PointNo` column and `vid` the raw integer index of the
point in the node array. If and only if the `PointNo` identifier is a
sequentially increasing integer starting at 1, then these will be identical. The
SWC format is a little vague about whether they should indeed be the same, but
is normally understood to imply it. However there are many SWC files in the wild
that violate this assumption.
You can also use the `endpoints()` and `rootpoints()` functions with the
`segmentgraph` objects as well as any function that expects an `igraph` object.
We can use this to find the branchpoints upstream of all end points using the
`igraph::adjacent_vertices()` function. We use `mode="all"` to handle the
situation where the root node is also an end point (i.e. a leaf node).
```{r}
endpoints(sg)
ups=unlist(igraph::adjacent_vertices(sg, endpoints(sg), mode='all'))
# this maps the segmentgraph node indices back to indices for the neuron
igraph::V(sg)$vid[ups]
```
Here we plot those _terminal_ branchpoints in red while internal branches are
displayed in blue.
```{r, fig.height=4, fig.width=5}
plot(n, WithNodes = F)
terminal_branches=igraph::V(sg)$vid[ups]
other_branches=setdiff(branchpoints(n), terminal_branches)
points(xyzmatrix(n)[terminal_branches,1:2], col='red')
points(xyzmatrix(n)[other_branches,1:2], col='blue')
```
## ngraph objects
The **nat** package provides a bridge for neurons to the rich cross-platform **igraph** library. We provide a class
[`ngraph`](https://natverse.org/nat/reference/ngraph.html)
that is a thin wrapper for the `igraph` class. This looks after things
that we might need to know about a neuron (like the 3D coordinates of each node)
while still giving access to all of the graph functions in the igraph package.
```{r}
g=as.ngraph(Cell07PNs[[1]])
class(g)
g
```
You can use functions such as
```{r}
igraph::diameter(g)
```
to find the length of the longest path across the neuron. This is defined in
terms of the number of intervening nodes. You can also make a graph in which
the edge weights are the euclidean distance between the connected 3D nodes:
```{r}
gw=as.ngraph(Cell07PNs[[1]], weights=TRUE)
igraph::diameter(gw)
```
This gives you the longest path length (geodesic) across the graph in units of
µm in this case.
Note that although you can do `library(igraph)`, it adds a lot of functions to
the search path, some of which have name clashes, so I often just use the
package name (`igraph::`) prepended to the function that I want to call.
### Walking along ngraph objects
You can use the graph representation of neurons e.g. to find the path between nodes.
```{r}
g=as.ngraph(Cell07PNs[[1]], weights=TRUE)
eg=endpoints(g)
p=igraph::shortest_paths(g, from=1, to=180)
p$vpath[[1]]
# fails
p2=igraph::shortest_paths(g, from=180, to=1)
p2$vpath[[1]]
# mode all will find the path irrespective of direction of links, which are
# directed from the soma
p3=igraph::shortest_paths(g, from=180, to=1, mode = 'all')
p3$vpath[[1]]
all.equal(p$vpath[[1]], rev(p3$vpath[[1]]))
# just the distances - by default uses mode=all
igraph::distances(g, v=1, to=180)
igraph::distances(g, v=180, to=1)
```
### Node identifiers and indices
Note that in the previous code block, nodes are identified by their index (i.e.
an integer starting at 1). As already discussed, some neurons have an arbitrary
numeric identifier for each node (this can be a large integer from a database
table e.g. for CATMAID neurons). You access this identifier in all of the above
calls by quoting it. For example:
```{r}
# using raw indices
igraph::distances(g, v=180, to=1)
# using node identifiers
igraph::distances(g, v='180', to='1')
igraph::shortest_paths(g, from='1', to='180')$vpath[[1]]
```
In this instance, the results are identical since the node identifiers are the
same as the raw indices. If we manipulate the node identifiers to add 1000 to
each
```{r}
# make a copy of `ngraph` object and add 1000 to each identifier
g2=g
igraph::V(g2)$name <- igraph::V(g2)$name+1000
# make a neuron with thoose identifiers to see what happened to its structure:
n2=as.neuron(g2)
head(n2$d)
```
then find path by identifier:
```{r}
p2=igraph::shortest_paths(g2, from='1001', to='1180')$vpath[[1]]
p2
```
Note that when the path is printed it shows the node identifiers. But when using
the path, it may be necessary to convert to integers. This results in raw indices
again.
```{r}
as.integer(p2)
names(p2)
```
### Downstream nodes
You can also ask for nodes upstream or downstream of a given starting node. For
example the neurons in the `Cell07PNs` set have a tag called `AxonLHEP` that
defines the entry point of the axon into the lateral horn neuropil of the fly
brain. Here we defined
```{r}
n=Cell07PNs[[1]]
g=as.ngraph(n)
# find the nodes distal to this point
# nb you must set unreachable=F if you only want to get downstream nodes
igraph::dfs(g, mode='out', unreachable = FALSE, root=n$AxonLHEP)
# the proximal nodes back to the soma (including any branches)
igraph::dfs(g, mode='in', unreachable = FALSE, root=n$AxonLHEP)
```
Note that `dfs` (depth first search) provides a good way to visit all the nodes
of the neuron
Let's use this to make a function that prunes neurons downstream of this axon
entry point:
```{r, fig.width=6}
prune_from_lhep <- function(n, ...) {
g=as.ngraph(n)
downstream_indices=igraph::dfs(g, root = n$AxonLHEP, unreachable = FALSE)$order
prune_vertices(n, verticestoprune = downstream_indices, invert = TRUE)
}
pruned=nlapply(Cell07PNs[1:3], prune_from_lhep)
plot(Cell07PNs[1:3], col='grey')
plot(pruned, lwd=2, add = T)
```
The pruned neurons show up in red, green, and blue in the above plot.
### Distal nodes
As of nat v1.10.0 there is a `distal_to()` function to provide a simpler
access to nodes defined by graph position.
```{r}
n=Cell07PNs[[1]]
distal_to(n, node.idx = n$AxonLHEP)
```
One can find the complement (i.e. all the nodes that are not in the distal set)
by comparing with the indices of all vertices.
```{r}
setdiff(seq_len(nvertices(n)), distal_to(n, node.idx = n$AxonLHEP))
```
`distal_to()` allows you to write a slightly simpler version of the function
above:
```{r}
prune_from_lhep2 <- function(n, ...) {
downstream_indices=distal_to(n, node.idx = n$AxonLHEP)
prune_vertices(n, verticestoprune = downstream_indices, invert = TRUE)
}
all.equal(nlapply(Cell07PNs[1:3], prune_from_lhep2), pruned)
```
Notice that `distal_to()` will help with the situation where you need to find
nodes using their identifiers rather than indices. For example tags in neurons
loaded from the CATMAID reconstruction tool are defined by ids not indices.
```{r}
tokeep=distal_to(dl1neuron, node.pointno = dl1neuron$tags$SCHLEGEL_LH)
plot(dl1neuron, WithNodes = F, soma=2000)
plot(prune_vertices(dl1neuron, tokeep, invert = T), add=T, WithNodes = F, col='red', lwd=2)
```