---
title: "Getting started with selenider"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started with selenider}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
available <- selenider::selenider_available()
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = available
)
```
```{r, eval = !available, include = FALSE}
message("Selenider is not available")
```
This vignette introduces you to the basics of automating a web browser using selenider.
```{r setup}
library(selenider)
```
## Starting the session
To use selenider, first start a session with `selenider_session()`. If you don't, one is created
for you automatically, but creating it yourself lets you change some of the options from their
defaults (the backend, for example). Here, we use chromote as the backend (the default) and set the
timeout to 10 seconds (the default is 4).
```{r}
session <- selenider_session(
  "chromote",
  timeout = 10
)
```
The session, once created, is set as the *local session* for the current environment. In this
case, that means it can be accessed anywhere in this script and is closed automatically when the
script finishes running.

Keep in mind that a session started inside a function is closed automatically when that function
finishes running. To use the session outside the function, pass the `.env` argument. For example,
suppose we want a wrapper function around `selenider_session()` that always uses selenium:
```{r}
# Bad (unless you only need to use the session inside the function)
my_selenider_session <- function(...) {
  selenider_session("selenium", ...)
  # The session will be closed here
}

# Good - the session will be open in the caller environment/function
my_selenider_session <- function(..., .env = rlang::caller_env()) {
  selenider_session("selenium", ..., .env = .env)
}
```
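The difference between the two wrappers comes down to where cleanup is registered. As a rough analogy only (this is not selenider's implementation), base R's `on.exit()` shows the same effect: cleanup registered inside a function runs as soon as that function returns. The `new_fake_session()` helper is hypothetical and simply stands in for a browser session.

```{r}
# Hypothetical stand-in for a browser session: an environment with an `open` flag
new_fake_session <- function() {
  session <- new.env()
  session$open <- TRUE
  session
}

# Cleanup is registered in the wrapper's own frame, so it runs when the wrapper returns
bad_wrapper <- function() {
  session <- new_fake_session()
  on.exit(session$open <- FALSE, add = TRUE)
  session
}

session_from_wrapper <- bad_wrapper()
session_from_wrapper$open # FALSE: already "closed" when the caller receives it
```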
Use `open_url()` to navigate to a website. selenider also provides `back()` and `forward()` to move
through the browser history, and `reload()` to reload the current page.
```{r}
open_url("https://www.r-project.org/")
open_url("https://www.tidyverse.org/")
back()
forward()
reload()
```
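When stepping back and forth through the history, it can help to confirm which page the browser ended up on. The sketch below assumes that your installed version of selenider exports `current_url()`; the helper `navigate_and_report()` and its `report` argument are hypothetical names introduced here so the helper can also be tried without a browser.

```{r}
# Hypothetical helper: run one navigation step, then report the resulting URL.
# `report` defaults to selenider's current_url() (assumed to be available);
# the default is only evaluated when no replacement is supplied.
navigate_and_report <- function(step, report = selenider::current_url) {
  step()
  report()
}

# With a running session:
# navigate_and_report(\() selenider::open_url("https://www.r-project.org/"))
# navigate_and_report(selenider::back)
```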
## Selecting elements
Use `s()` to select an element. By default, CSS selectors are used, but other options are available.
```{r}
header <- s("#rStudioHeader")
header
```
For example, an XPath can be used instead. XPaths can be useful for more complex selectors, and are
not limited to selecting from the descendants of the current element. However, they can be difficult
to read.
```{r}
s(xpath = "//div/a")
```
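Because an XPath can step upwards as well as downwards, it can reach elements that do not sit inside the starting point. A minimal sketch: the hypothetical helper `xpath_parent_of_id()` builds an XPath that selects the parent of the element with a given id, using `..` to step up one level.

```{r}
# Hypothetical helper: build an XPath selecting the parent of the element
# with the given id (`..` moves from the matched element to its parent)
xpath_parent_of_id <- function(id) {
  sprintf("//*[@id='%s']/..", id)
}

menu_parent_xpath <- xpath_parent_of_id("menuItems")
menu_parent_xpath

# With a running session: s(xpath = menu_parent_xpath)
```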
Use `ss()` to select multiple elements.
```{r}
all_links <- ss("a")
all_links
```
Use `find_element()` and `find_elements()` to find descendants of an existing element. These can be
chained with the pipe operator (`|>`) to specify paths to elements. Just like `s()` and `ss()`, they
accept a variety of selector types, with CSS selectors as the default.
```{r}
tidyverse_title <- s("#rStudioHeader") |>
  find_element("div") |>
  find_element(".productName")

tidyverse_title

menu_items <- s("#rStudioHeader") |>
  find_element("#menu") |>
  find_elements(".menuItem")

menu_items
```
Use `elem_children()` and friends to find elements by their position relative to another element.
```{r}
s("#menuItems") |>
  elem_children()

s("#menuItems") |>
  elem_ancestors()
```
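The other members of this family follow the same pattern. As a sketch, the hypothetical helper `relatives_of()` gathers several of them in one named list; it relies on `elem_parent()`, `elem_siblings()` and `elem_descendants()` alongside `elem_children()`, and needs a running session when it is actually called.

```{r}
# Hypothetical helper: collect the relatives of an element in one named list
relatives_of <- function(x) {
  list(
    parent = selenider::elem_parent(x),
    siblings = selenider::elem_siblings(x),
    children = selenider::elem_children(x),
    descendants = selenider::elem_descendants(x)
  )
}

# With a running session: relatives_of(s("#menuItems"))$siblings
```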
You can use `elem_filter()` and `elem_find()` to filter collections of elements using a custom function.
`elem_find()` returns the first matching element, while `elem_filter()` returns all matching elements.
These functions use the same interface as `elem_expect()`: see the "Expectations" section below.
```{r}
# Find the blog item in the menu
menu_items |>
  elem_find(has_text("Blog"))

# Find the hex badges on the second row
s(".hexBadges") |>
  find_elements("img") |>
  elem_filter(
    \(x) substring(elem_attr(x, "class"), 1, 2) == "r2"
  )
```
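The distinction between `elem_filter()` and `elem_find()` mirrors base R's `Filter()` and `Find()`. The sketch below runs without a browser; the class names are made up for illustration and stand in for the `class` attributes of the badge images.

```{r}
# Made-up class attributes standing in for a collection of elements
badge_classes <- c("r1-dplyr", "r2-ggplot2", "r2-tidyr")

# Same kind of predicate as above: does the class start with "r2"?
on_second_row <- function(x) substring(x, 1, 2) == "r2"

all_matches <- Filter(on_second_row, badge_classes) # every match, like elem_filter()
first_match <- Find(on_second_row, badge_classes) # first match only, like elem_find()

all_matches
first_match
```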
## Interacting with an element
selenider elements are *lazy*, meaning that when you specify the path to an element or group of
elements, they are not actually located in the DOM until you *do* something with them.
There are three types of functions that force an element to be collected:

* actions (e.g. `elem_click()`)
* properties (e.g. `elem_text()`)
* conditions (e.g. `is_visible()`)

Most functions that act on elements use the `elem_` prefix.
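
Laziness can be pictured with a small base R sketch. This is a conceptual illustration only, not how selenider is implemented: `fake_locate()` is a hypothetical stand-in for searching the page, and a counter records how often it is called.

```{r}
# Conceptual sketch only: count how many times the (fake) page is searched
lookups <- 0
fake_locate <- function(selector) {
  lookups <<- lookups + 1
  paste("node matching", selector)
}

# "Selecting" just records the selector; nothing is searched yet
lazy_elem <- list(selector = "#rStudioHeader")
lookups_before_use <- lookups

# Using the element is what triggers the search
node <- fake_locate(lazy_elem$selector)
lookups_after_use <- lookups

c(before = lookups_before_use, after = lookups_after_use)
```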
## Actions
There are various ways to interact with an HTML element.
Use `elem_click()`, `elem_right_click()`, or `elem_double_click()` to click on an element, and
`elem_hover()` to hover over an element. Use `elem_scroll_to()` to scroll to an element
before clicking it, which is useful if the element is not currently in view.
```{r}
s(".blurb") |>
  find_element("a") |> # List of packages
  elem_scroll_to() |>
  elem_click()
```
Some links open their content in a new tab, so clicking them will not take the current session to
the new page. In that case, read the link's `href` attribute and pass it to `open_url()` yourself.
This approach is recommended over `elem_click()`, as it is more reliable.
```{r}
s(".packages") |>
  find_elements("a") |>
  elem_find(has_text("dplyr")) |> # Find the link to the dplyr documentation
  elem_attr("href") |> # Get the URL
  open_url()
```
Use `elem_set_value()` to set the value of an input element, and `elem_clear_value()` to clear the value.
```{r}
s("input[type='search']") |>
  elem_set_value("filter")

# Go back to the main page
back()
back()
```
selenider also provides an `elem_submit()` function, allowing you to submit an HTML form using any
element inside the form.
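
As a sketch of how setting a value and submitting might be combined, the hypothetical helper `submit_query()` types into an input and then submits the form that contains it. It assumes the matched input sits inside a `<form>`, and it needs a running session when called.

```{r}
# Hypothetical helper: type `query` into an input and submit its enclosing form.
# Assumes the input matched by `input_css` sits inside a <form>.
submit_query <- function(query, input_css = "input[type='search']") {
  selenider::s(input_css) |>
    selenider::elem_set_value(query) |>
    selenider::elem_submit()
}

# With a running session on a page that has a search form:
# submit_query("dplyr")
```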
## Properties
HTML elements have a number of accessible properties.
```{r}
# Get the tag name
s("#appTidyverseSite") |>
  elem_name()

# Get the text inside the element
s(".tagline") |>
  elem_text()

# Get an attribute
s(".hexBadges") |>
  find_element("img") |>
  elem_attr("alt")

# Get every attribute
s(".hexBadges") |>
  find_element("img") |>
  elem_attrs()

# Get the 'value' attribute (`NULL` in this case)
s("#homeContent") |>
  elem_value()

# Get a CSS property
s(".tagline") |>
  elem_css_property("font-size")
```
## Conditions
Conditions are predicate functions on HTML elements. Unlike all other functions in selenider, they
do not wait for the element to exist or for the condition to be met: they return `TRUE` or `FALSE`
(or throw an error) instantly. For this reason, they are designed to be used with `elem_expect()`
and `elem_wait_until()`, which will automatically wait for conditions to be met.

There is a wide range of conditions, many of which do the same thing. Each HTML property has a
corresponding condition, and selenider also provides conditions for basic checks like `is_present()`,
`is_visible()` and `is_enabled()`. The "See Also" section in the documentation for any condition
lists all the other conditions.
```{r}
s(".hexBadges") |>
  is_present()
```
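To see why an instant check and a waiting check are different things, here is a conceptual sketch of polling in base R. It is not selenider's implementation: `poll_until()` and the fake condition are hypothetical, and the default of 4 seconds simply echoes the session default mentioned earlier.

```{r}
# Conceptual sketch: poll `condition` until it returns TRUE or `timeout`
# seconds have passed
poll_until <- function(condition, timeout = 4, interval = 0.1) {
  deadline <- Sys.time() + timeout
  repeat {
    if (isTRUE(condition())) {
      return(TRUE)
    }
    if (Sys.time() >= deadline) {
      return(FALSE)
    }
    Sys.sleep(interval)
  }
}

# A fake condition that only becomes TRUE on the third check
checks <- 0
ready_on_third_check <- function() {
  checks <<- checks + 1
  checks >= 3
}

became_ready <- poll_until(ready_on_third_check, timeout = 2)
became_ready
```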
## Expectations
selenider provides a concise testing interface using the `elem_expect()` function. Provide an element,
and one or more conditions, and the function will wait until all the conditions are met. Conditions
can be functions or simple calls (e.g. `has_text("text")` will be turned into
`has_text(<THE ELEMENT>, "text")`). `elem_expect()` tends to work well with R's lambda function
syntax.
```{r}
s(".tagline") |>
  elem_expect(is_present) |>
  elem_expect(has_text("data science"))

s(".hexBadges") |>
  find_element("a") |>
  elem_expect(is_visible, is_enabled)

s("#menu") |>
  find_element("#menuItems") |>
  elem_children() |>
  elem_expect(has_at_least(4))

s(".productName") |>
  elem_expect(
    \(x) substring(elem_text(x), 1, 1) == "T" # Tidyverse starts with T
  )
```
Errors try to give as much information as possible. Since we know this condition is going to fail,
we'll set the timeout to a lower value so we don't have to wait for too long.
```{r, error = TRUE}
s(".band.first") |>
  find_element(".blurb") |>
  find_element("code") |>
  elem_expect(has_text('install.packages("selenider")'), timeout = 1)
```
The operators `&&` (and), `||` (or) and `!` (not) can be used as if the conditions were logical
values. Additionally, you can omit the first argument to `elem_expect()` (but in this case, all
conditions must be calls).
```{r}
s(".random-class") |>
  elem_expect(!is_present)

s(".innards") |>
  elem_expect(is_visible || is_enabled)

elem_1 <- s(".random-class")
elem_2 <- s("#main")

# Test that either the first or second element exists
elem_expect(is_present(elem_1) || is_present(elem_2))
```
Use `elem_wait_until()` if you don't want an error to be thrown when a condition is not met.
`elem_wait_until()` does exactly the same thing as `elem_expect()`, but returns `TRUE` or `FALSE`
instead of throwing an error.
```{r}
elem_wait_until(is_present(elem_1) || is_present(elem_2))
```
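Because the result is a plain logical value, it can drive ordinary control flow. A minimal sketch, assuming a running session when it is called: the hypothetical helper `click_if_present()` clicks an element only if it appears in time, and the `.cookie-banner button` selector in the usage comment is made up for illustration.

```{r}
# Hypothetical helper: click `x` if it appears within `timeout` seconds,
# otherwise carry on without an error. Returns whether a click happened.
click_if_present <- function(x, timeout = 2) {
  found <- selenider::elem_wait_until(
    x,
    \(elem) selenider::is_present(elem),
    timeout = timeout
  )
  if (found) {
    selenider::elem_click(x)
  }
  invisible(found)
}

# With a running session: click_if_present(s(".cookie-banner button"))
```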
The syntax used for `elem_expect()` and `elem_wait_until()` can also be used in `elem_filter()`
and `elem_find()` to filter element collections. Additionally, selenider provides
`elem_expect_all()` and `elem_wait_until_all()` to test a condition on every element
in a collection.
```{r}
s(".hexBadges") |>
  find_elements("a") |>
  elem_expect_all(is_visible)
```
Once we are done, we do not need to close the session; it is closed for us automatically!