/
basic.Rmd
69 lines (56 loc) 路 2.77 KB
/
basic.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
---
title: "basic"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{basic}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
This vignette provides a quick introduction to basic usage of the
{tweetbotornot2} package. It introduces the two key functions for generating
estimates from the built-in bot classifier.
```{r setup}
## load package
library(tweetbotornot2)
```
## Bot Classifier
The main use case for {tweetbotornot} is application of the built-in Twitter bot
classifier. There are two key functions for getting bot-probability estimates
for one or more Twitter accounts. Those functions are `predict_bot()` and
`predict_bot_score()`. Both of these functions return estimates generated via
same classifier, but `predict_bot()` returns a `data.table` with **one row for
each unique user** and **three columns containing the `user_id`, `screen_name`,
and `prob_bot` (the model estimate)**. While `predict_bot_score()` returns a
numeric vector of estimates matched to the input order of users.
```{r, eval=FALSE}
## users
users <- c("netflix_bot", "PatrickMahomes", "PatrickMahomes", "netflix_bot")
## returns a data.table with 2 rows
predict_bot(users)
## returns a numeric vector with 4 estimates
predict_bot_score(users)
```
Of course, one thing missing from the above code are the actual Twitter data
(or features) used to make the predictions. That's because when screen names
(or user IDs) are provided, {tweetbotornot2} does that work with the help of
[rtweet](https://rtweet.info) behind the scenes. But for that to work, users must
obtain the proper authorization for accessing Twitter's REST API. Fortunately,
for most users this process should now be a breeze鈥揺.g., running the code above
may require one-click authorizing {rtweet}'s embedded **`rstats2twitter`**
application but will otherwise *just work*. However, for users working in the
cloud or trying to generate estimates for thousands or millions of uers,
additional consideration may be warranted in terms of token-handling (see the
`"tokens"` vignette) and rate limit-management (see the `"bulk"` vignette).
<sup>1</sup> Unless you've already collected user timeline data, **[`{tweetbotornot2}`](tweetbotornot2.mikewk.com)**
relies on [`{rtweet}`](https://rtweet.info) to pull data from [Twitter's REST API](https://developer.twitter.com/en/docs).
<sup>2</sup> The first time you make a request [from your computer] to Twitter's API's, a
browser should pop-up, asking you to authorize the app. After that, the
authorization token is stored and remembered behind the scenes (you will only
have to reauthorize the app if your Twitter settings or token credentials get
reset or if you use a different machine).