# How to run shell commands in R

In R, both system() and system2() let you run terminal commands from within your R scripts, but they work differently. Here’s a simple breakdown of when and how to use each:

**system() – The Simple (But Risky) Way**

How it works: You pass the entire command as a single string.

Best for: Quick, one-off tasks where you trust the inputs.

- Pros:
  - Easy to use (like typing in a terminal).
  - Supports shell features (wildcards *, pipes |, etc.).
- Cons:
  - Security risk if inputs aren’t sanitized (e.g., pasting user_input <- "malicious; rm -rf /" into a command would run rm -rf / as a second command).
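To make the risk concrete, here is a minimal sketch that uses a harmless echo in place of a destructive command. It assumes a Unix-like shell, and user_input is a hypothetical value standing in for anything that arrives from outside the script.

```r
# Hypothetical untrusted value; "echo INJECTED" stands in for a harmful command
user_input <- "hello; echo INJECTED"

# Unsafe: the shell sees two commands separated by ";"
unsafe <- system(paste("echo", user_input), intern = TRUE)
print(unsafe)  # "hello" "INJECTED"

# Safer: shQuote() wraps the value so the shell treats it as one argument
safe <- system(paste("echo", shQuote(user_input)), intern = TRUE)
print(safe)    # "hello; echo INJECTED"
```

The second call prints the whole string as text because the semicolon never reaches the shell as a command separator.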

**system2() – The Safer (More Controlled) Way**

How it works: You split the command and its arguments into separate parts.

Best for: Scripts where security and control matter.

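- Pros:
  - Safer when combined with shQuote(): the command and each argument are kept separate, so every dynamic value can be quoted individually. On Unix-alikes, system2() still joins the arguments with spaces and runs the result through a shell, so unquoted arguments remain open to injection.
  - More control: stdout and stderr can be captured or redirected separately, and environment variables can be set with env.
- Cons:
  - Slightly more verbose.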
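The sketch below shows how system2() treats a dynamic argument with and without shQuote(). It assumes a Unix-like system, where system2() pastes the arguments into a single command line for the shell; user_input is again a hypothetical value.

```r
# Hypothetical untrusted value
user_input <- "hello; echo INJECTED"

# Unquoted: the args are joined with spaces and interpreted by the shell,
# so the text after ";" runs as a second command
raw <- system2("echo", args = user_input, stdout = TRUE)
print(raw)     # "hello" "INJECTED"

# Quoted: shQuote() keeps the value as one literal argument
quoted <- system2("echo", args = shQuote(user_input), stdout = TRUE)
print(quoted)  # "hello; echo INJECTED"
```

Quoting each argument is what provides the protection here. Splitting the command from its arguments mainly makes that quoting easy to apply consistently.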

**Which Should You Use?**

Rule of Thumb

"If you’re pasting inputs into a command, use system2() and wrap each input in shQuote(). If you’re typing a fixed string, system() is fine."
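One way to apply this rule consistently is a small wrapper. The following is a sketch only: run_cmd is a hypothetical helper name, not part of base R, and it assumes a Unix-like system. It quotes every argument and turns a non-zero exit status into an R error.

```r
# Hypothetical helper: run a command with every argument quoted,
# and stop with an error if the command exits with a non-zero status
run_cmd <- function(command, args = character()) {
  out <- suppressWarnings(
    system2(command, args = shQuote(args), stdout = TRUE, stderr = TRUE)
  )
  # system2() attaches a "status" attribute only when the exit status is non-zero
  status <- attr(out, "status")
  if (!is.null(status) && status != 0) {
    stop(sprintf("'%s' exited with status %d:\n%s",
                 command, status, paste(out, collapse = "\n")))
  }
  out
}

# Arguments containing spaces stay intact
run_cmd("echo", c("two words", "end"))
```

Because every argument is quoted, wildcards such as *.csv are passed literally instead of being expanded by the shell, so this wrapper suits commands whose arguments are exact names or paths.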

## How to get environment (env) variables

In [None]:
my_bucket <- Sys.getenv('WORKSPACE_BUCKET')
my_bucket

In [None]:
DATASET <- Sys.getenv('WORKSPACE_CDR')
DATASET
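Sys.getenv() returns an empty string for a variable that is not set, which can lead to confusing failures later (for example, a gsutil path that starts with "/data/"). A minimal sketch of a stricter lookup follows; get_required_env and DEMO_BUCKET are hypothetical names used only for illustration.

```r
# Hypothetical helper: fail early if an environment variable is missing or empty
get_required_env <- function(name) {
  value <- Sys.getenv(name, unset = NA)  # NA when the variable is not set
  if (is.na(value) || !nzchar(value)) {
    stop(sprintf("Environment variable '%s' is not set", name))
  }
  value
}

# Demo with a placeholder variable so the example runs anywhere
Sys.setenv(DEMO_BUCKET = "gs://example-bucket")
demo_bucket <- get_required_env("DEMO_BUCKET")
print(demo_bucket)
```

In the workspace, the same helper could be called with 'WORKSPACE_BUCKET' or 'WORKSPACE_CDR' in place of the placeholder.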

## How to run shell commands in R

**List files on the local disk**

In [None]:
# check files in the current working directory
system("ls *.csv", intern = TRUE)

In [None]:
# write the first lines of manifest.csv to test.txt (output is redirected, so nothing is returned)
system("head manifest.csv > test.txt", intern = TRUE)

In [None]:
# check files in the current working directory
system("ls *.txt", intern = TRUE)

In [None]:
system2(
  command = "ls",  # The command to run
  args = "*.txt",  # Wildcard is expanded by the shell on Unix-alikes (no leading space needed)
  stdout = TRUE,  # Capture output (like intern = TRUE in system())
  stderr = TRUE   # Capture errors (optional)
)
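The wildcard in the call above is expanded by the shell. If you would rather not rely on the shell for that, one option is to expand the pattern in R with Sys.glob() and pass the quoted file names to system2(). The sketch below assumes a Unix-like system and creates a throwaway file in a temporary directory so that it runs anywhere.

```r
# Create a demo directory with one file whose name contains spaces
demo_dir <- tempfile("glob_demo_")
dir.create(demo_dir)
writeLines("example", file.path(demo_dir, "notes with space.txt"))

# Expand the wildcard in R instead of in the shell
txt_files <- Sys.glob(file.path(demo_dir, "*.txt"))

# Quote each path so spaces and special characters are preserved
listing <- system2("ls", args = shQuote(txt_files), stdout = TRUE, stderr = TRUE)
print(basename(listing))  # "notes with space.txt"
```

When no file matches, Sys.glob() returns an empty vector, so check length(txt_files) before calling the command.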

**Use gsutil**

Copy files from the genomic bucket to the local disk

In [None]:
system("gsutil -u $GOOGLE_PROJECT cp gs://fc-aou-datasets-controlled/v8/wgs/cram/manifest.csv .", intern = TRUE)

In [None]:
# check files in the current working directory
system("ls *.csv", intern = TRUE)

In [None]:
system2(
  command = "gsutil",  # Base command
  args = c(
    "-u", Sys.getenv("GOOGLE_PROJECT"),  # Project to bill for the request
    "cp",
    "gs://fc-aou-datasets-controlled/v8/wgs/cram/manifest.csv",
    "manifest2.csv"
  ),
  stdout = TRUE,  # Capture output
  stderr = TRUE   # Capture errors
)

In [None]:
system2(
  command = "ls",  # The command to run
  args = "*.csv",  # Wildcard is expanded by the shell on Unix-alikes (no leading space needed)
  stdout = TRUE,  # Capture output (like intern = TRUE in system())
  stderr = TRUE   # Capture errors (optional)
)

In [None]:
# Copy the file from the bucket to the current working directory, with a different file name
system(paste0("gsutil cp ", my_bucket, "/data/readme.txt", " readme2.txt"), intern = TRUE)

In [None]:
# check files in the current working directory
system("ls *.txt", intern = TRUE)

Copy files from the local disk to the bucket

In [None]:
# Copy readme.txt from the current working directory to the data/ folder in the bucket
system(paste0("gsutil cp readme.txt ", my_bucket, "/data/"), intern = TRUE)

In [None]:
system2(
  command = "gsutil",
  args = c("cp", "readme.txt", paste0(my_bucket, "/data/readme3.txt")),
  stdout = TRUE,  # Capture output
  stderr = TRUE   # Capture errors
)

In [None]:
# Check if file is in the bucket
system(paste0("gsutil ls ", my_bucket, "/data/readme.txt"), intern = TRUE)

In [None]:
system2(
  command = "gsutil",
  args = c("ls", paste0(my_bucket, "/data/*.txt")),  # No space after "ls"!
  stdout = TRUE,  # Capture output
  stderr = TRUE   # Capture errors
)
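The gsutil calls above build bucket paths by pasting strings together. A small helper can keep that construction in one place. This is a sketch under stated assumptions: bucket_path is a hypothetical helper, gs://example-bucket is a placeholder used only when WORKSPACE_BUCKET is not set, and the copy runs only when both the workspace variable and gsutil are available.

```r
# Hypothetical helper: join a bucket URL and path parts with single slashes
bucket_path <- function(bucket, ...) {
  paste(sub("/+$", "", bucket), ..., sep = "/")
}

# Placeholder bucket is used only when WORKSPACE_BUCKET is not set
bucket <- Sys.getenv("WORKSPACE_BUCKET", unset = "gs://example-bucket")
dest <- bucket_path(bucket, "data", "readme3.txt")

# Quote the dynamic parts before handing them to system2()
cp_args <- c("cp", shQuote("readme.txt"), shQuote(dest))
print(cp_args)

# Run the copy only inside a workspace where gsutil and the bucket exist
if (nzchar(Sys.getenv("WORKSPACE_BUCKET")) && nzchar(Sys.which("gsutil"))) {
  system2("gsutil", args = cp_args, stdout = TRUE, stderr = TRUE)
}
```

Building the argument vector separately also makes it easy to print and inspect the exact command before it runs.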