-
Notifications
You must be signed in to change notification settings - Fork 0
/
multi_processing.Rmd
55 lines (45 loc) · 2.06 KB
/
multi_processing.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
---
title: "Multi-core processing"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = F, warning = F, error = F, fig.width = 9)
library(evsim)
library(dplyr)
library(lubridate)
```
The time-series calculations done by functions `get_demand` and `get_n_connections` can be done by multiple time windows in parallel. At the same time, in computer science, a common practice to reduce computation times is parallel processing when the computer has multiple CPUs (i.e. multi-core processing). In parallel processing, a task is divided into multiple sub-tasks and assigned to multiple cores to make de calculations in parallel. For parallel processing in R we make use of the R base package `parallel`.
We can discover the number of physical CPUs in our computer with function `parallel::detectCores`:
```{r}
n_cores <- parallel::detectCores(logical = FALSE)
print(n_cores)
```
We set the parameter `logical = FALSE` because it is important to know the **physical** CPUs, not the logical (or virtual/thread) ones seen by the OS. More information about this parameter can be found in `parallel::detectCores` documentation.
Below, an example of calculating the demand of 10.000 EV charging sessions with different number of cores:
```{r sessions, echo = T, eval = F}
cores_time <- tibble(
cores = 1:n_cores,
time = 0
)
for (mcc in cores_time$cores) {
results <- system.time(
evsim::california_ev_sessions %>%
sample_n(10000) %>%
mutate(Profile = "All") %>%
get_demand(by = "Profile", resolution = 60, mc.cores = mcc)
)
cores_time$time[mcc] <- as.numeric(results[3])
}
```
```{r saved data, echo=F, eval = T}
# saveRDS(cores_time, "multi_processing_data/cores_time.RDS")
cores_time <- readRDS("multi_processing_data/cores_time.RDS")
```
And below there's the plot of the results, were we can see that, for this computer, the optimal number of cores to use is 4:
```{r}
library(ggplot2)
cores_time %>%
ggplot(aes(x = cores, y = time)) +
geom_col() +
labs(x = "Number of cores", y = "Computation time (seconds)")
```