Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
153 lines (111 sloc) 4.58 KB
---
title: "Dealing with multiple files"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Dealing with multiple files}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
can_decrypt <- gargle:::secret_can_decrypt("googledrive")
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
cache = TRUE,
error = TRUE,
purl = can_decrypt,
eval = can_decrypt
)
if (can_decrypt) {
json <- gargle:::secret_read("googledrive", "googledrive-docs.json")
googledrive::drive_auth(path = rawToChar(json))
}
options(tidyverse.quiet = TRUE)
```
```{r eval = !can_decrypt, echo = FALSE, comment = NA}
message("No token available. Code chunks will not be evaluated.")
```
Some googledrive functions are built to naturally handle multiple files, while others operate on a single file.
Functions that expect a single file:
* `drive_browse()`
* `drive_cp()`
* `drive_download()`
* `drive_ls()`
* `drive_mkdir()`
* `drive_mv()`
* `drive_rename()`
* `drive_update()`
* `drive_upload()`
Functions that allow multiple files:
* `drive_publish()`
* `drive_reveal()`
* `drive_rm()`
* `drive_share()`
* `drive_trash()`
In general, the principle is: if there are multiple parameters that are likely to vary across multiple files, the function is designed to take a single input. In order to use these function with multiple inputs, use them together with your favorite approach for iteration in R. Below is a worked example, focusing on tools in the tidyverse, namely the `map()` functions in purrr.
## Upload multiple files, then rename them
Scenario: we have multiple local files we want to upload into a folder on Drive. Then we regret their original names and want to rename them.
Load packages.
```{r}
library(googledrive)
library(glue)
library(tidyverse)
```
### Upload
Use the example files that ship with googledrive. This looks a bit odd, but the first call returns their names and the second returns full paths on the local system.
```{r}
local_files <- drive_example() %>%
drive_example()
basename(local_files)
```
Create a folder on your Drive and upload all files into this folder by iterating over the `local_files` using `purrr::map()`.
```{r}
folder <- drive_mkdir("upload-into-me-article-demo")
files <- map(local_files, ~ drive_upload(.x, path = folder, verbose = FALSE))
```
First, let's confirm that we uploaded the files into the new folder.
```{r}
drive_ls(folder)
```
Now let's reflect on the `files` object returned by this operation. `files` is a list of **dribbles**, one per uploaded file.
```{r}
str(files, max.level = 1)
```
This would be a favorable data structure if you've got more `map()`ing to do, as you'll see below.
But what if not? You can always row bind individual dribbles into one big dribble yourself with, e.g., `dplyr::bind_rows()`.
```{r}
bind_rows(files)
```
Below we show another way to finesse this by using a variant of `purrr::map()` that does this for us, namely `map_df()`.
### Rename
Imagine that we now wish these file names had a date prefix. First, form the new names. We use `glue::glue()` for string interpolation but you could also use `paste()`. Second, we map over two inputs: the list of dribbles from above and the vector of new names.
```{r}
(new_names <- glue("{Sys.Date()}_{basename(local_files)}"))
files_dribble <- map2_df(files, new_names, drive_rename)
```
We use `purrr::map2_df()` to work through `files`, the list of dribbles (= Drive files), and `new_names`, the vector of new names, and row bind the returned dribbles into a single dribble holding all files.
Let's check on the contents of this folder again to confirm the new names:
```{r}
drive_ls(folder)
```
Let's confirm that, by using `map2_df()` instead of `map2()`, we got a single dribble back, instead of a list of one-row dribbles:
```{r}
files_dribble
```
What if you wanted to get a list back, because your downstream operations include yet more `map()`ing? Then you would use `map2()`.
```{r eval = FALSE}
files_list <- map2(files, new_names, drive_rename)
```
### Clean up
Our trashing function, `drive_trash()` is vectorized and can therefore operate on a multi-file dribble. We could trash these files like so:
```{r eval = FALSE}
drive_trash(files_dribble)
```
If you're absolutely sure of yourself and happy to do something irreversible, you could truly delete these files with `drive_rm()`, which is also vectorized:
```{r eval = FALSE}
drive_rm(files_dribble)
```
Finally -- and this is the code we will actually execute -- the easiest way to delete these files is to delete their enclosing folder.
```{r}
drive_rm(folder)
```
You can’t perform that action at this time.