Downloading YouTube Subtitle Transcription in a Tidy Tibble Data_Frame in R


License: GPL v3


Although some R packages target the YouTube API (e.g., ‘tuber’), downloading a YouTube video's subtitles (i.e., captions) in a tidy form has never been easy. Using the ‘youtube-transcript-api’ Python package under the hood, this R package gives users a convenient way to parse a YouTube caption and convert it into a handy tibble data_frame object. Users can also save the caption data as a tidy Excel file without an advanced programming background.
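To make the "tidy form" concrete, here is a minimal sketch of how a list of caption segments could be flattened into one row per segment. It assumes each segment carries `text`, `start`, and `duration` fields, which is how youtube-transcript-api describes a transcript entry. The helper `segments_to_df()`, the sample values, and the `"VIDEO_ID"` placeholder are hypothetical, and the package's actual internals may differ.

```r
# Made-up segments shaped like transcript entries (text, start, duration).
segments <- list(
  list(text = "first caption line",  start = 0.0, duration = 2.5),
  list(text = "second caption line", start = 2.5, duration = 3.0),
  list(text = "third caption line",  start = 5.5, duration = 1.5)
)

# Hypothetical helper (not part of youtubecaption):
# flatten the segments into one row per caption line.
segments_to_df <- function(segments, vid) {
  data.frame(
    segment_id = seq_along(segments),
    text       = vapply(segments, function(s) s$text, character(1)),
    start      = vapply(segments, function(s) s$start, numeric(1)),
    duration   = vapply(segments, function(s) s$duration, numeric(1)),
    vid        = rep(vid, length(segments)),
    stringsAsFactors = FALSE
  )
}

caption_df <- segments_to_df(segments, vid = "VIDEO_ID")
print(caption_df)
```

The sketch uses a base R data frame to stay dependency-free; `tibble::as_tibble(caption_df)` would give the tibble form.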


Python Dependencies

youtubecaption requires an Anaconda Python environment on your system PATH.

If you have not installed a Conda environment on your system, please download and install Anaconda (Python 3.6 or later is recommended).

For this package, I call the youtube-transcript-api Python module from R using reticulate.
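If you are unsure whether the Python side is set up, a quick check along these lines may help. This is only a sketch: `check_python_dependency()` is a hypothetical helper, not a function of youtubecaption, and it assumes the Python module is importable as `youtube_transcript_api`. It relies on `reticulate::conda_list()` and `reticulate::py_module_available()`.

```r
# Hypothetical helper (not part of youtubecaption): report whether the
# Python side of the dependency chain looks usable from this R session.
check_python_dependency <- function(module = "youtube_transcript_api") {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(c(reticulate = FALSE, conda = FALSE, module = FALSE))
  }
  # conda_list() signals an error when no Conda installation is found.
  conda_ok <- tryCatch(
    nrow(reticulate::conda_list()) > 0,
    error = function(e) FALSE
  )
  # py_module_available() reports whether the module can be imported.
  module_ok <- tryCatch(
    isTRUE(reticulate::py_module_available(module)),
    error = function(e) FALSE
  )
  c(reticulate = TRUE, conda = conda_ok, module = module_ok)
}

status <- check_python_dependency()
print(status)
```

If `module` comes back `FALSE`, installing the module with `reticulate::py_install("youtube-transcript-api", pip = TRUE)` is one option, assuming it is not already installed for you.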

R Package Installation

Development Version

You can install the latest development version as follows:

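if(!require(remotes)) {
  install.packages("remotes")
}
# Then install from the package's GitHub repository with remotes::install_github().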


Stable Version

You can install the released version of youtubecaption from CRAN with:

install.packages("youtubecaption")

Usage

Load the youtubecaption package, then call get_caption():

library(youtubecaption)

# Get the caption of Hadley Wickham's talk "You can't do data science in a GUI":
url <- ""  # paste the full YouTube video URL here
caption <- get_caption(url)
caption
#> # A tibble: 1,420 x 5
#>    segment_id text                                start duration vid       
#>         <int> <chr>                               <dbl>    <dbl> <chr>     
#>  1          1 thank you for coming to a meeting ~  7.13     8.32 cpbtcsGE0~
#>  2          2 in regards to data science GUI with 10.7      8.44 cpbtcsGE0~
#>  3          3 happy with chief data scientist in~ 15.4      7.11 cpbtcsGE0~
#>  4          4 studio as well as the member of th~ 19.1      7.23 cpbtcsGE0~
#>  5          5 Foundation and an attempt professo~ 22.6      6    cpbtcsGE0~
#>  6          6 Stanford and at the University of   26.4      6.48 cpbtcsGE0~
#>  7          7 Auckland he builds both computatio~ 28.6      7.17 cpbtcsGE0~
#>  8          8 and cognitive tools to make data s~ 32.8      7.5  cpbtcsGE0~
#>  9          9 easier faster and more times his w~ 35.7      7.01 cpbtcsGE0~
#> 10         10 includes various packages as well ~ 40.4      6.21 cpbtcsGE0~
#> # ... with 1,410 more rows

# Save the caption as an Excel file and open it right away:
get_caption(url = url, savexl = TRUE, openxl = TRUE)
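
Once the caption is in a data frame with the columns shown above (segment_id, text, start, duration, vid), ordinary R tools apply. The sketch below uses a small made-up stand-in instead of a real get_caption() result, so the values and the "VIDEO_ID" placeholder are assumptions for illustration only.

```r
# Stand-in for the result of get_caption(); all values are made up.
caption <- data.frame(
  segment_id = 1:3,
  text       = c("first caption line", "second caption line", "third caption line"),
  start      = c(0.0, 2.5, 5.5),
  duration   = c(2.5, 3.0, 1.5),
  vid        = "VIDEO_ID",
  stringsAsFactors = FALSE
)

# End time of each segment, in seconds.
caption$end <- caption$start + caption$duration

# Collapse all segments into a single transcript string.
transcript <- paste(caption$text, collapse = " ")

# Keep only the segments that start within the first five seconds.
early <- caption[caption$start < 5, ]

print(transcript)
print(early)
```

In real captions the segments can overlap, as in the output above where a segment's start plus duration exceeds the next start, so the computed end times should not be read as hard boundaries.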
