---
title: "Timings"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 2
    fig_width: 7
    fig_height: 4
    fig_caption: true
    includes:
      in_header: "../../inst/_x3d_header.html"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Timings}
  %\VignetteEncoding{UTF-8}
---
```{r demo, message=FALSE, echo=FALSE, warning=FALSE}
library(knitr)
library(ggplot2) # theme_set() below is a ggplot2 function
suppressPackageStartupMessages(require(terms))
options(cli.num_colors = 1)
theme_set(theme_grey())
opts_chunk$set(echo = FALSE, warning = FALSE,
               message = FALSE, comment = '',
               tidy = FALSE, cache = FALSE)
do.call(read_chunk, list(path = "_run.R"))
```
## Objective
This example examines how the computation time scales with several key parameters:

- number of particles
- number of multipoles
- number of incidence angles
- solution scheme

The number of wavelengths simply scales the total computation time for any combination of the other parameters, since everything must be recalculated at each wavelength (assuming input/output time is negligible compared to the calculations).
## Parameters
```{r params}
```
The structure consists of a helix of gold spheres, with a varying number of particles.
```{r show_cluster, echo=FALSE, results='asis', message=FALSE}
suppressPackageStartupMessages(require(terms))
cat('<style>
.x3d_scene {
  float: right;
  margin: 0em;
}
</style>')
ge <- get_geometry('runs/input_10_1_16_1_1')
cl <- cluster_geometry(ge)
terms::x3d_scene(cl, viewpoint = c(0, 0, 300), width = "200px", height = "150px")
```
```{r tpl}
```
The timings are collected from the log files and stored with the corresponding subroutine name (and, where applicable, a detail label).
```{r read}
```
## Total timings
We first look at summaries of the total run time, as a function of the number of particles $N_{part}$, maximum multipolar order $N_{max}$, and number of incidence directions $N_{inc}$.
```{r fulltimings, fig.height=5, fig.width=10}
```
## Detailed timings (subroutines)
At `Verbosity > 0` we can also access timings within subroutines, which break down the calculation and help identify the most time-consuming steps.
```{r info1}
str(combined)
unique(combined$routine)
```
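As a base-R sketch of how such a data frame can be summarised (using a toy stand-in for `combined`, since the column holding the elapsed time is an assumption here), the per-routine totals could be computed like this:

```r
# Toy stand-in for the `combined` timings data frame; the actual column
# names produced by the log parsing may differ (values are made up).
combined_toy <- data.frame(
  routine = c("solve", "solve", "spectrumFF", "calcTIJStout"),
  detail  = c("solLinSys", "balancing", NA, NA),
  time    = c(1.2, 0.3, 2.5, 0.8)  # elapsed seconds (illustrative)
)

# Total time per routine, sorted from slowest to fastest
totals <- aggregate(time ~ routine, data = combined_toy, FUN = sum)
totals <- totals[order(-totals$time), ]
totals
```

The same grouping by `detail` within a routine would isolate the individual steps plotted in the figures below.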
Some subroutines are broken down in more detail:
```{r info2}
unique(combined$detail)
```
### Varying number of particles
```{r detailedtimings_Npart, fig.height=6, fig.width=10}
```
Details of the `solve` subroutine are shown below.
```{r detailedtimings_Npart_2, fig.height=6, fig.width=10}
```
We can make some general remarks:

- The Stout scheme (`Scheme = 2`) solves the multiple scattering problem recursively, starting from a dimer and adding one particle at a time. This is evident in the routine `calcTIJStout`, whose runtime grows steadily with the number of particles.
- Mackowski's scheme (`Scheme = 3`) performs very well across the board; in fact, it is about as fast as solving the original coupled system without seeking the collective T-matrix, with the added advantage that it also calculates orientation-averaged properties.
### Varying maximum multipole order
```{r detailedtimings_Nmax, fig.height=6, fig.width=10}
```
Details of the `solve` subroutine are shown below.
```{r detailedtimings_Nmax_2, fig.height=6, fig.width=10}
```
The computational cost of all schemes and subroutines scales similarly with the maximum multipolar order.
### Varying number of incidence directions
```{r detailedtimings_Ninc, fig.height=6, fig.width=10}
```
Details of the `spectrumFF` subroutine are shown below.
```{r detailedtimings_Ninc_3, fig.height=6, fig.width=10}
```
Details of the `solve` subroutine are shown below.
```{r detailedtimings_Ninc_2, fig.height=6, fig.width=15}
```
Solving the linear system for multiple incidence directions (multiple right-hand sides) differs between `Scheme = 0` (direct solve) and the other schemes, which seek a collective T-matrix. This is visible in the routine `solLinSys`, whose cost scales linearly with the number of incidences for `Scheme = 0`, while Schemes 1, 2, and 3 show almost constant cost.
Curiously, the total timings rise sharply between 1 and 50 incidence angles for some schemes, suggesting that another subroutine is performing substantial computations there. This may point to a possible optimisation in the future.
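The difference can be mimicked with plain linear algebra (a generic illustration, unrelated to the actual TERMS routines): solving anew for every right-hand side costs time proportional to their number, whereas a one-off inverse, playing the role of the collective T-matrix, turns each additional incidence direction into a cheap matrix product:

```r
set.seed(1)
n <- 200                                   # system size
A <- diag(n) * n + matrix(rnorm(n^2), n)   # well-conditioned dense system
B <- matrix(rnorm(n * 100), n)             # 100 right-hand sides ("incidences")

# Per-incidence solve (analogous to Scheme = 0): one solve per column,
# so the cost grows linearly with the number of right-hand sides
X_direct <- sapply(seq_len(ncol(B)), function(j) solve(A, B[, j]))

# One-off inverse (analogous to precomputing a collective T-matrix),
# reused for every right-hand side at negligible extra cost
A_inv   <- solve(A)
X_reuse <- A_inv %*% B
```

Both approaches give the same solutions, but only the first has a per-incidence cost, matching the linear scaling seen for `solLinSys` in `Scheme = 0`.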
-----
```{r cleanup}
```
_Last run: `r format(Sys.time(), '%d %B, %Y')`_