---
title: "Complete Guide for Seven Bridges API R Client"
date: "`r Sys.Date()`"
output:
rmarkdown::html_document:
toc: true
toc_float: true
toc_depth: 4
number_sections: true
theme: "flatly"
highlight: "textmate"
css: "sevenbridges.css"
vignette: >
%\VignetteEngine{knitr::rmarkdown}
%\VignetteIndexEntry{Complete Guide for Seven Bridges API R Client}
---
<a name="top"></a>
```{r include=FALSE}
knitr::opts_chunk$set(eval = FALSE)
```
# Introduction
`sevenbridges` is an R/Bioconductor package that provides an interface to the Seven Bridges public API. The supported platforms include the [Seven Bridges Platform](https://igor.sbgenomics.com/), [Cancer Genomics Cloud (CGC)](https://www.cancergenomicscloud.org), and [Cavatica](https://cavatica.sbgenomics.com).
Learn more from our documentation on the [Seven Bridges Platform](https://docs.sevenbridges.com/page/api) and the [Cancer Genomics Cloud (CGC)](http://docs.cancergenomicscloud.org/v1.0/page/the-cgc-api).
## R Client for Seven Bridges API
The `sevenbridges` package only supports v2+ versions of the API, since versions prior to v2 are not compatible with the Common Workflow Language (CWL). This package provides a simple interface for accessing and trying out various methods.
There are two ways of constructing API calls. The first is to use low-level API calls, which take arguments such as `path`, `query`, and `body`. These are documented in the API reference libraries for the [Seven Bridges Platform](https://docs.sevenbridges.com/reference#list-all-api-paths) and the [CGC](http://docs.cancergenomicscloud.org/docs/new-1). An example of a low-level request to "list all projects" is shown below. In this request, you can also pass `query` and `body` as a list.
```{r}
library("sevenbridges")
a <- Auth(token = "your_token", platform = "aws-us")
a$api(path = "projects", method = "GET")
```
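The second way of constructing an API request is to use the higher-level methods that the package provides on objects such as `Auth`, which build the request for you. The call below lists projects, like the low-level request above. ***(Advanced user option)*** You can also make your API calls directly with the `httr` package.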
```{r}
a$project()
```
## API General Information
Before we start, keep in mind the following:
__`offset` and `limit`__
Every API call accepts two arguments named `offset` and `limit`.
- `offset` defines where the retrieved items start.
- `limit` defines the number of items you want to get.
By default, `offset` is set to `0` and `limit` is set to `100`. As such, your API request returns the __first 100 items__ when you list items or search for items by name. To search and list all items, use `complete = TRUE` in your API request.
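To make the paging behaviour concrete, here is a minimal base R sketch that walks through a collection page by page. The `items` vector and the `fetch_page()` helper are hypothetical stand-ins for a listing request and are not part of the `sevenbridges` package. With `complete = TRUE` you ask the package for all items, so you do not need to write such a loop yourself.

```{r}
# Hypothetical collection of 250 items standing in for a server-side listing
items <- sprintf("item-%03d", 1:250)

# Hypothetical helper: return at most `limit` items, skipping the first `offset`
fetch_page <- function(offset = 0, limit = 100) {
  if (offset >= length(items)) {
    return(character(0))
  }
  items[seq(offset + 1, min(offset + limit, length(items)))]
}

# Keep requesting pages until an empty page signals the end
all_items <- character(0)
offset <- 0
repeat {
  page <- fetch_page(offset = offset, limit = 100)
  if (length(page) == 0) break
  all_items <- c(all_items, page)
  offset <- offset + length(page)
}

length(fetch_page()) # size of the page returned with the default arguments
length(all_items) # number of items collected across all pages
```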
__Search by ID__
When searching by ID, your request returns exactly one resource, because each ID is unique. As such, you do not have to set `offset` and `limit` manually. It is a good practice to find your resources by their ID and pass this ID as an input to your task. You can find a resource's ID in the final part of the URL on the visual interface or via the API requests to list resources or get a resource's details.
__Search by name__
Search by name returns all partial matches unless you specify `exact = TRUE`.
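The difference between a partial and an exact name match can be illustrated in plain R. The project names below are made up, and the matching rules applied by the server (for example, case sensitivity) may differ from this sketch.

```{r}
# Hypothetical project names
project_names <- c("demo", "demo-rnaseq", "my demo project", "api testing")

# Partial match: every name containing the search term
partial <- project_names[grepl("demo", project_names, fixed = TRUE)]

# Exact match: only names identical to the search term
exact <- project_names[project_names == "demo"]

partial
exact
```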
## Installation
The `sevenbridges` package is available from both the `release` and `devel` branches of Bioconductor.
To install it from the `release` branch, use:
```{r}
install.packages("BiocManager")
BiocManager::install("sevenbridges")
```
To install it from the `devel` branch, use:
```{r}
install.packages("BiocManager")
BiocManager::install("sevenbridges", version = "devel")
```
Since we are constantly improving our API and client libraries, please also visit our [GitHub repository](https://github.com/sbg/sevenbridges-r) for the most recent news and the latest version of the package.
**If you do not have `devtools`**
Installing from GitHub requires the `devtools` package. If you do not have this package, you can install it from CRAN.
```{r}
install.packages("devtools")
```
You may get an error about missing system dependencies such as `curl` and `openssl`. On Ubuntu, for example, you will probably need to install the following system packages first, both to install `devtools` and to build the vignettes (which requires `pandoc`).
```bash
apt-get update
apt-get install libcurl4-gnutls-dev libssl-dev pandoc pandoc-citeproc
```
**If `devtools` is already installed**
Install the latest version of `sevenbridges` from GitHub with the following:
```{r}
install.packages("BiocManager")
install.packages("readr")
devtools::install_github(
"sbg/sevenbridges-r",
repos = BiocManager::repositories(),
build_vignettes = TRUE, dependencies = TRUE
)
```
If you have trouble with `pandoc` and do not want to install it, set `build_vignettes = FALSE` to avoid the vignettes build.
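Before installing from GitHub, it can help to check which of the R packages used above are not yet available. The helper below is a small sketch built on base R's `requireNamespace()`; the name `missing_pkgs()` is hypothetical and it installs nothing.

```{r}
# Return the subset of `pkgs` that cannot be loaded from the current library
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages referenced in the installation steps above
missing_pkgs(c("BiocManager", "devtools", "readr"))
```

An empty result means all three packages are already installed.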
# Quickstart
For more details about how to use the API client in R, please consult the [Seven Bridges API Reference](#reference) section below for a complete guide.
## Create `Auth` Object
Before you can access your account via the API, you have to provide your credentials. You can obtain your credentials in the form of an ["authentication token"](https://docs.sevenbridges.com/v1.0/docs/get-your-authentication-token) from the **Developer Tab** under **Account Settings** on the visual interface. Once you've obtained this, create an `Auth` object, so it remembers your authentication token and the path for the API. All subsequent requests will draw upon these two pieces of information.
Let's load the package first:
```{r, eval = TRUE, message = FALSE}
library("sevenbridges")
```
You have three different ways to provide your token. Choose one of the methods below:
1. [Direct authentication.](#method1) This explicitly and temporarily sets up your token and platform type (or alternatively, API base URL) in the function call arguments to `Auth()`.
2. [Authentication via system environment variables.](#method2) This will read the credential information from two system environment variables: `SB_API_ENDPOINT` and `SB_AUTH_TOKEN`.
3. [Authentication via the user configuration file.](#method3) This file, by default `$HOME/.sevenbridges/credentials`, provides an organized way to collect and manage all your API authentication information for Seven Bridges platforms.
<div style="margin-left:3em"><a name="method1"/>**Method 1: Direct authentication**
This is the most common method to construct the `Auth` object. For example:
```{r}
(a <- Auth(platform = "cgc", token = "your_token"))
```
```
Using platform: cgc
== Auth ==
url : https://cgc-api.sbgenomics.com/v2/
token : <your_token>
```
<a name="method2"/>**Method 2: Environment variables**
To set the two environment variables in your system, you could use
the function `sbg_set_env()`. For example:
```{r}
sbg_set_env("https://cgc-api.sbgenomics.com/v2", "your_token")
```
Note that this change might be only temporary. To make it persistent,
use the standard method for setting environment variables
on your operating system.
Create an `Auth` object:
```{r}
a <- Auth(from = "env")
```
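As a quick sanity check, the same two variables can be set and inspected with base R alone. This is only a sketch with placeholder values. Variables set through `Sys.setenv()` last for the current R session only; one standard way to make them persistent in R is a `~/.Renviron` file.

```{r}
# Set the two variables for the current R session only (placeholder values)
Sys.setenv(
  SB_API_ENDPOINT = "https://cgc-api.sbgenomics.com/v2",
  SB_AUTH_TOKEN = "your_token"
)

# Confirm that both variables are non-empty before relying on them
vars <- Sys.getenv(c("SB_API_ENDPOINT", "SB_AUTH_TOKEN"))
all(nzchar(vars))
```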
<a name="method3"/>**Method 3: User configuration file**
Assume we have already created the configuration file named
`credentials` under the directory `$HOME/.sevenbridges/`:
```
[aws-us-rfranklin]
api_endpoint = https://api.sbgenomics.com/v2
auth_token = token_for_this_user
# This is a comment:
# another user on the same platform
[aws-us-rosalind-franklin]
api_endpoint = https://api.sbgenomics.com/v2
auth_token = token_for_this_user
[default]
api_endpoint = https://cgc-api.sbgenomics.com/v2
auth_token = token_for_this_user
[gcp]
api_endpoint = https://gcp-api.sbgenomics.com/v2
auth_token = token_for_this_user
```
To load the user profile `aws-us-rfranklin` from this configuration file, simply use:
```{r}
a <- Auth(from = "file", profile_name = "aws-us-rfranklin")
```
If `profile_name` is not specified, we will try to load the profile named `[default]`:
```{r}
a <- Auth(from = "file")
```
***Note:*** API paths (base URLs) differ for each Seven Bridges environment. Be sure to provide the correct path for the environment you are using. API paths for some of the environments are:
+-------------------------------------------+---------------------------------------------------+---------------+
| Platform Name | API Base URL | Short Name |
+===========================================+===================================================+===============+
| Seven Bridges Platform (US) | `https://api.sbgenomics.com/v2` | `"aws-us"` |
+-------------------------------------------+---------------------------------------------------+---------------+
| Seven Bridges Platform (EU) | `https://eu-api.sbgenomics.com/v2` | `"aws-eu"` |
+-------------------------------------------+---------------------------------------------------+---------------+
| Seven Bridges Platform (China) | `https://api.sevenbridges.cn/v2` | `"ali-cn"` |
+-------------------------------------------+---------------------------------------------------+---------------+
| Cancer Genomics Cloud (CGC) | `https://cgc-api.sbgenomics.com/v2` | `"cgc"` |
+-------------------------------------------+---------------------------------------------------+---------------+
| Cavatica | `https://cavatica-api.sbgenomics.com/v2` | `"cavatica"` |
+-------------------------------------------+---------------------------------------------------+---------------+
| BioData Catalyst Powered by Seven Bridges | `https://api.sb.biodatacatalyst.nhlbi.nih.gov/v2` | `"f4c"` |
+-------------------------------------------+---------------------------------------------------+---------------+
Please refer to the [API reference section](#auth-reference) for more usage and technical details about the three authentication methods.
</div>
<div align="right"><a href="#top">top</a></div>
## Get User Information
<a name="youruser"/>**Get your own information**
This call returns information about your account.
```{r}
a$user()
```
```
== User ==
href : https://cgc-api.sbgenomics.com/v2/users/RFranklin
username : RFranklin
email : rosalind.franklin@sbgenomics.com
first_name : Rosalind
last_name : Franklin
affiliation : Seven Bridges Genomics
country : United States
```
<div align="right"><a href="#top">top</a></div>
**Get information about a user**
This call returns information about the specified user. Note that currently you can view only your own user information, so this call is equivalent to the [call to get information about your account](#youruser).
```{r}
a$user(username = "RFranklin")
```
<div align="right"><a href="#top">top</a></div>
## Rate Limit
This call returns information about your current rate limit. This is the number of API calls you can make in one hour.
```{r}
a$rate_limit()
```
```
== Rate Limit ==
limit : 1000
remaining : 993
reset : 1457980957
```
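The fields in this output can be turned into something easier to read. The sketch below copies the values from the sample output and assumes that `reset` is a Unix timestamp in seconds, which the output suggests but which should be confirmed against the API documentation.

```{r}
# Values copied from the sample output above
limit <- 1000
remaining <- 993
reset <- 1457980957

# Calls already used in the current window
used <- limit - remaining

# Interpret `reset` as seconds since 1970-01-01 UTC (an assumption)
reset_time <- as.POSIXct(reset, origin = "1970-01-01", tz = "UTC")

used
format(reset_time, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```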
<div align="right"><a href="#top">top</a></div>
## Show Billing Information
Each project must have a Billing Group associated with it. This Billing Group pays for the storage and computation in the project.
For example, your first project(s) were created with the free funds from the Pilot Funds Billing Group assigned to each user at sign-up. To get information about billing:
```{r}
# check your billing info
a$billing()
a$invoice()
```
For more information, use `breakdown = TRUE`.
```{r}
a$billing(id = "your_billing_id", breakdown = TRUE)
```
<div align="right"><a href="#top">top</a></div>
## Create Project
Projects are the core building blocks of the platform. Each project corresponds to a distinct scientific investigation, serving as a container for its data, analysis tools, results, and collaborators.
Create a new project called "api testing" with the billing group `id` obtained above.
```{r}
# get billing group id
bid <- a$billing()$id
# create new project
(p <- a$project_new(name = "api testing", bid, description = "Just a test"))
```
```
== Project ==
id : RFranklin/api-testing
name : api testing
description : Just a test
billing_group_id : <your_bid>
type : v2
-- Permission --
```
<div align="right"><a href="#top">top</a></div>
## Get Details about Existing Project
```{r}
# list first 100
a$project()
# list all
a$project(complete = TRUE)
# return all projects whose names match "demo"
a$project(name = "demo", complete = TRUE)
# get the project you want by id
p <- a$project(id = "RFranklin/api-tutorial")
```
<div align="right"><a href="#top">top</a></div>
## Copy Public Apps into Your Project
Seven Bridges maintains workflows and tools available to all of its users in the Public Apps repository.
To find out more about public apps, you can do the following:
- Browse them online. Check out the [tutorial](http://docs.cancergenomicscloud.org/docs/) for the "Find apps" section.
- Use the `sevenbridges` package to find them, as shown below.
```{r}
# search by name matching, complete = TRUE search all apps,
# not limited by offset or limit.
a$public_app(name = "STAR", complete = TRUE)
# search by id is accurate
a$public_app(id = "admin/sbg-public-data/rna-seq-alignment-star/5")
# you can also get everything
a$public_app(complete = TRUE)
# default limit = 100, offset = 0 which means the first 100
a$public_app()
```
Now, using your `Auth` object, copy an app into your project by passing the app `id`, the target `project` id, and a new `name`, as shown below.
```{r}
# copy
a$copy_app(
id = "admin/sbg-public-data/rna-seq-alignment-star/5",
project = "RFranklin/api-testing", name = "new copy of star"
)
# check if it is copied
p <- a$project(id = "RFranklin/api-testing")
# list the apps in your project
p$app()
```
The short name is changed to `newcopyofstar`.
```
== App ==
id : RFranklin/api-testing/newcopyofstar/0
name : RNA-seq Alignment - STAR
project : RFranklin/api-testing-2
revision : 0
```
Alternatively, you can copy it from the `app` object.
```{r}
app <- a$public_app(id = "admin/sbg-public-data/rna-seq-alignment-star")
app$copy_to(
project = "RFranklin/api-testing",
name = "copy of star"
)
```
<div align="right"><a href="#top">top</a></div>
## Import CWL App and Run a Task
You can also upload your own Common Workflow Language JSON file which describes your app to your project.
***Note:*** Alternatively, you can directly describe your CWL tool in R with this package. Please read the vignette on "Describe CWL Tools/Workflows in R and Execution".
```{r}
# add a CWL file to your project
f.star <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")
app <- p$app_add("starlocal", f.star)
(aid <- app$id)
```
You will get an app `id`, like the one below:
```
"RFranklin/api-testing/starlocal/0"
```
It's composed of the following elements:
1. __project id__ : `RFranklin/api-testing`
2. __app short name__ : `starlocal`
3. __revision__ : `0`
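The three elements can be pulled apart with base R. This is a minimal sketch that assumes an app `id` always has the four slash-separated parts seen in the example (owner, project, app short name, revision) and that none of them contains a slash.

```{r}
app_id <- "RFranklin/api-testing/starlocal/0"

# Split on "/": owner, project, app short name, revision
parts <- strsplit(app_id, "/", fixed = TRUE)[[1]]

# The project id is the owner and project name joined back together
project_id <- paste(parts[1:2], collapse = "/")
short_name <- parts[3]
revision <- as.integer(parts[4])

project_id
short_name
revision
```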
Alternatively, you can describe tools in R directly, as shown below:
```{r comment = "", eval = TRUE}
fl <- system.file("docker", "sevenbridges/rabix/generator.R",
package = "sevenbridges"
)
cat(readLines(fl), sep = "\n")
```
Then, you can add it like this:
```{r}
# rbx is the object returned by `Tool` function
app <- p$app_add("runif", rbx)
(aid <- app$id)
```
Please consult another tutorial `vignette("apps", "sevenbridges")` about how to describe tools and flows in R.
<div align="right"><a href="#top">top</a></div>
## Execute a New Task
### Find your app inputs
Once you have copied the public app `admin/sbg-public-data/rna-seq-alignment-star/5` into your project, `username/api-testing`, the app `id` in your current project is `username/api-testing/newcopyofstar`. Alternatively, you can use another app you already have in your project for this Quickstart.
To draft a new task, you need to specify the following:
- The name of the task
- An optional description
- The app `id` of the workflow you are executing
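- The inputs for your workflow. These depend on the app: the `runif` tool mentioned above, for example, accepts four parameters (number, min, max, and seed), whereas the inputs of the STAR workflow are looked up in the steps below.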
You can always check the App details on the visual interface for task input requirements. To find the required inputs with R, you need to get an `App` object first.
```{r}
app <- a$app(id = "RFranklin/api-testing-2/newcopyofstar")
# get input matrix
app$input_matrix()
app$input_matrix(c("id", "label", "type"))
# get required node only
app$input_matrix(c("id", "label", "type"), required = TRUE)
```
Alternatively, you can load the app from a CWL JSON file and convert it into an R object first, as shown below.
```{r, eval = TRUE}
f1 <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")
app <- convert_app(f1)
# get input matrix
app$input_matrix()
app$input_matrix(c("id", "label", "type"))
app$input_matrix(c("id", "label", "type"), required = TRUE)
```
Note that `input_matrix` and `output_matrix` are useful accessors for `Tool` and `Flow` objects. You can also call these two functions directly on a JSON file.
```{r}
tool.in <- system.file("extdata/app", "tool_unpack_fastq.json", package = "sevenbridges")
flow.in <- system.file("extdata/app", "flow_star.json", package = "sevenbridges")
input_matrix(tool.in)
input_matrix(tool.in, required = TRUE)
input_matrix(flow.in)
input_matrix(flow.in, c("id", "type"))
input_matrix(flow.in, required = TRUE)
output_matrix(tool.in)
output_matrix(flow.in)
```
In the response body, locate the names of the required inputs. Note that task inputs need to match the expected data type and name. In the above example, we see two required fields:
- **fastq:** This input takes a file array as indicated by "File..."
- **genomeFastaFiles:** This is a single file as indicated by "File".
We also want to provide a gene feature file:
- **sjdbGTFfile:** A single file as indicated by "File".
You can find a list of possible input types below:
- **number, character or integer:** You can pass these to the input parameter directly, as they are.
- **enum type:** Pass this value to the input parameter.
- **file:** This input is a file. However, while some inputs accept only a single file (`File`), other inputs take more than one file (`File` arrays, `FilesList`, or '`File...`'). This input requires you to pass a `Files` object (for a single file input), a `FilesList` object (for inputs which accept more than one file), or simply a list of `Files` objects. You can search for your file by `id` or by `name` with an exact match (`exact = TRUE`), as shown in the example below.
<div align="right"><a href="#top">top</a></div>
### Get your input files ready
```{r}
fastqs <- c("SRR1039508_1.fastq", "SRR1039508_2.fastq")
# get all 2 exact files
fastq_in <- p$file(name = fastqs, exact = TRUE)
# get a single file
fasta_in <- p$file(
name = "Homo_sapiens.GRCh38.dna.primary_assembly.fa",
exact = TRUE
)
# get a single file
gtf_in <- p$file(
name = "Homo_sapiens.GRCh38.84.gtf",
exact = TRUE
)
```
<div align="right"><a href="#top">top</a></div>
### Create a new draft task
```{r}
# add new tasks
taskName <- paste0("RFranklin-star-alignment ", date())
tsk <- p$task_add(
name = taskName,
description = "star test",
app = "RFranklin/api-testing-2/newcopyofstar/0",
inputs = list(
sjdbGTFfile = gtf_in,
fastq = fastq_in,
genomeFastaFiles = fasta_in
)
)
```
Remember the `fastq` input expects a list of files. You can also do something as follows:
```{r}
f1 <- p$file(name = "SRR1039508_1.fastq", exact = TRUE)
f2 <- p$file(name = "SRR1039508_2.fastq", exact = TRUE)
# get all 2 exact files
fastq_in <- list(f1, f2)
# or if you know you only have 2 files whose names match SRR1039508*.fastq
fastq_in <- p$file(name = "SRR1039508*.fastq", complete = TRUE)
```
Use `complete = TRUE` when the number of items is over 100.
### Draft a batch task
Now let's do a batch with 8 files in 4 groups, which is batched by metadata `sample_id` and `library_id`. We will assume each file has these two metadata fields entered. Since these files can be evenly grouped into 4, we will have a single parent batch task with 4 child tasks.
```{r}
fastqs <- c(
"SRR1039508_1.fastq", "SRR1039508_2.fastq", "SRR1039509_1.fastq",
"SRR1039509_2.fastq", "SRR1039512_1.fastq", "SRR1039512_2.fastq",
"SRR1039513_1.fastq", "SRR1039513_2.fastq"
)
# get all 8 files
fastq_in <- p$file(name = fastqs, exact = TRUE)
# you can also try to return all SRR*.fastq files
# fastq_in <- p$file(name= "SRR*.fastq", complete = TRUE)
tsk <- p$task_add(
name = taskName,
description = "Batch Star Test",
app = "RFranklin/api-testing-2/newcopyofstar/0",
batch = batch(
input = "fastq",
criteria = c("metadata.sample_id", "metadata.library_id")
),
inputs = list(
sjdbGTFfile = gtf_in,
fastq = fastq_in,
genomeFastaFiles = fasta_in
)
)
```
Now you have a draft batch task. Please check it out in the visual interface. Your response body should inform you of any errors or warnings.
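To see why 8 files end up in 4 child tasks, the grouping can be imitated locally. The metadata values below are invented for illustration, and the real batching is performed by the platform, so treat this only as a sketch of the grouping idea.

```{r}
# Hypothetical metadata for the 8 FASTQ files used above
files <- data.frame(
  name = c(
    "SRR1039508_1.fastq", "SRR1039508_2.fastq", "SRR1039509_1.fastq",
    "SRR1039509_2.fastq", "SRR1039512_1.fastq", "SRR1039512_2.fastq",
    "SRR1039513_1.fastq", "SRR1039513_2.fastq"
  ),
  sample_id = rep(c("s508", "s509", "s512", "s513"), each = 2),
  library_id = "lib1",
  stringsAsFactors = FALSE
)

# Group file names by the combination of the two metadata fields
groups <- split(files$name, list(files$sample_id, files$library_id), drop = TRUE)

length(groups) # one group per child task
lengths(groups) # number of files in each group
```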
Note: you can also pass file IDs or file names directly as character strings in the `inputs` list.
The package first guesses whether each string is a file ID (a 24-character hexadecimal string)
or a name, then converts it to a `Files` or `FilesList` object. However, as a good practice,
we recommend that you construct your file objects first (e.g. `p$file(id = ..., name = ...)`),
check their values, and then pass them to the `task_add` inputs. This is a safer approach.
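A check of this kind could look like the sketch below. It assumes that a file ID is exactly 24 hexadecimal characters, and the example ID is made up. The helper `looks_like_file_id()` is hypothetical and is not the heuristic the package itself uses.

```{r}
# Assumption: a file ID is exactly 24 hexadecimal characters
looks_like_file_id <- function(x) {
  grepl("^[0-9a-fA-F]{24}$", x)
}

# A made-up ID followed by a file name
candidates <- c("5772b6f0507c175267448700", "SRR1039508_1.fastq")
looks_like_file_id(candidates)
```

A file whose name happens to be 24 hexadecimal characters would be misclassified by such a guess, which is one reason to construct the file objects explicitly.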
<div align="right"><a href="#top">top</a></div>
## Run a Task
Now, we are ready to run our task.
```{r}
# run your task
tsk$run()
```
Before you run your task, you can adjust your draft task if you have any final modifications. Alternatively, you can delete the draft task if you no longer wish to run it.
```{r}
# # not run
# tsk$delete()
```
After you run a task, you can abort it.
```{r}
# abort your task
tsk$abort()
```
If you want to update your task and then re-run it, follow the example below.
```{r}
tsk$getInputs()
# update only the sjdbGTFfile input
tsk$update(inputs = list(sjdbGTFfile = "some new file"))
# double check
tsk$getInputs()
```
<div align="right"><a href="#top">top</a></div>
## Run tasks using spot instances
Running tasks with [spot instances](https://docs.sevenbridges.com/docs/about-spot-instances)
can [considerably reduce computational cost](https://www.sevenbridges.com/spot-instances-cost-reduction/).
This option can be controlled on the project level or the task level on Seven Bridges platforms.
Our package follows the same [logic](https://docs.sevenbridges.com/docs/use-spot-instances)
as our platform's web interface (the current default setting for spot instances is **on**).
For example, when we create a project using `project_new()`,
we can set `use_interruptible = FALSE` to use on-demand instances
(non-interruptible but more expensive) instead of the spot instances
(interruptible but cheaper):
```{r}
p <- a$project_new(
name = "spot-disabled-project", bid, description = "spot disabled project",
use_interruptible = FALSE
)
```
Then all the new tasks created under this project will use on-demand instances
to run **by default**, unless an argument `use_interruptible_instances`
is specifically set to `TRUE` when drafting the new task using `task_add()`.
For example, if `p` is the above spot disabled project, to draft
a task that will use spot instances to run:
```{r}
tsk <- p$task_add(
name = paste0("spot enabled task in a spot disabled project"),
description = "spot enabled task",
app = ...,
inputs = list(...),
use_interruptible_instances = TRUE
)
```
Conversely, you can have a spot instance enabled project,
but draft and run specific tasks using on-demand instances,
by setting `use_interruptible_instances = FALSE` in `task_add()` explicitly.
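The precedence between the project-level and task-level settings can be summarised in a few lines of R. The helper `uses_spot()` is hypothetical and only restates the rule described above: a task uses the project default unless it sets its own value.

```{r}
# Hypothetical helper: decide whether a task runs on spot instances.
# `task_setting = NULL` means the task does not override the project.
uses_spot <- function(project_setting, task_setting = NULL) {
  if (is.null(task_setting)) project_setting else task_setting
}

uses_spot(project_setting = FALSE) # inherits the project default
uses_spot(project_setting = FALSE, task_setting = TRUE) # task override
uses_spot(project_setting = TRUE, task_setting = FALSE) # task override
```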
## Execution hints per task run
During workflow development and benchmarking, sometimes we need to view and make adjustments to the computational resources needed for a task to run more efficiently. Also, if a task fails due to resource deficiency, we often want to define a larger instance for the task re-run without editing the app itself. This is particularly important in cases where there is not enough disk space.
The Seven Bridges API allows setting specific task execution parameters by using `execution_settings`. It includes the instance type (`instance_type`) and the maximum number of parallel instances (`max_parallel_instances`):
```{r}
tsk <- p$task_add(
...,
execution_settings = list(
instance_type = "c4.2xlarge;ebs-gp2;2000",
max_parallel_instances = 2
)
)
```
For details about `execution_settings`, please check [create a new draft task](https://docs.sevenbridges.com/v1.0/reference#create-a-new-task).
## Task Monitoring
To monitor your task as it runs, you can always request a task `update` to ask your task to report its status. Alternatively, you can monitor a running task with a hook function, which is triggered when the task reaches a given status, such as "completed". See the [task hook details](#hooktask) below.
```{r}
tsk$update()
```
By default, your task alerts you by email when it has been completed.
```{r}
# Monitor your task (skip this part)
# tsk$monitor()
```
Use the following to download all files from a completed task.
```{r}
tsk$download("~/Downloads")
```
<a name="hooktask"/>Instead of the default task monitor action, you can use `setTaskHook` to connect a function call to the status of a task. When you run `tsk$monitor(time = 30)` it will check your task every 30 seconds to see if the current task status matches one of the following statuses: "queued", "draft", "running", "completed", "aborted", and "failed". When it finds a match for the task status, `getTaskHook` returns the function call for the specific status.
```{r, eval = TRUE}
getTaskHook("completed")
```
To customize the monitor function, your hook function must meet one requirement: it must return `TRUE` or `FALSE` at the end. If it returns `TRUE` (or a non-logical value), monitoring stops once the matching status is found and the function has executed, such as when the task is completed. If it returns `FALSE`, monitoring continues with the next check; for example, while the status is "running", you want to keep tracking.
Follow the example below to set a new function to monitor the status "completed". Then, when the task is completed, it will download all task output files to the local folder.
```{r}
setTaskHook("completed", function() {
tsk$download("~/Downloads")
return(TRUE)
})
tsk$monitor()
```
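The `TRUE`/`FALSE` convention can be tried out without a real task. The sketch below simulates a sequence of statuses and a list of hook functions in plain R. The `statuses` vector and `hooks` list are hypothetical and do not use the package's `setTaskHook()` machinery.

```{r}
# Simulated sequence of statuses a task might report on successive checks
statuses <- c("queued", "running", "running", "completed")

# One hook per status; each returns TRUE to stop or FALSE to keep polling
hooks <- list(
  queued = function() FALSE,
  running = function() FALSE,
  completed = function() {
    message("task finished, downloading outputs would go here")
    TRUE
  }
)

checks <- 0
for (status in statuses) {
  checks <- checks + 1
  stop_now <- hooks[[status]]()
  # Treat anything other than FALSE as a signal to stop
  if (!identical(stop_now, FALSE)) break
  # A real monitor would wait between checks, e.g. Sys.sleep(30)
}

checks
status
```

Monitoring ends at the first hook that returns anything other than `FALSE`, which here is the "completed" hook.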
<div align="right"><a href="#top">top</a></div>
<a name="reference"/>
# Seven Bridges API Reference
The `sevenbridges` package provides a user-friendly interface, so you do not have to combine several `api()` calls and constantly reference the API documentation to issue API requests.
<div align="right"><a href="#top">top</a></div>
<a name="auth-reference"/>
## Authentication
Before you can interact with the API, you need to construct an `Auth` object which stores the following information:
- Your authentication token. This is used to authenticate your credentials with the API. Learn more about obtaining your authentication token on the [Seven Bridges Platform](https://docs.sevenbridges.com/v1.0/docs/get-your-authentication-token) and the [Cancer Genomics Cloud](http://docs.cancergenomicscloud.org/v1.0/docs/get-your-authentication-token). The approach for obtaining the authentication token also applies to the other Seven Bridges platforms.
- The path for the API (base URL).
- The platform name you are using. This is an optional field, as the base URL of the API ultimately decides where the API calls will be sent to. This field will only be blank when the URL was directly provided, and the platform name could not be inferred from that URL.
The general authentication logic for `Auth()` is as follows:
1. The package will use the direct authentication method if `from` is not specified explicitly or specified as `from = "direct"`.
2. The package will load the authentication information from environment variables when `from = "env"`, or user configuration file when `from = "file"`.
### Direct authentication
To use direct authentication, users need to specify one of `platform` or `url`,
with the corresponding `token`. Examples of direct authentication:
```{r}
a <- Auth(
token = "your_token",
platform = "aws-us"
)
```
The above will use the Seven Bridges Platform on AWS (US).
```{r}
a <- Auth(
token = "your_token",
url = "https://gcp-api.sbgenomics.com/v2"
)
```
The above will use the specified `url` as the base URL for the API calls. In this example, the `url` points to the Seven Bridges Platform on Google Cloud Platform (GCP).
```{r}
a <- Auth(token = "your_token")
```
The above will use the Cancer Genomics Cloud environment, since neither `platform` nor `url` was explicitly specified (not recommended).
***Note:*** `platform` and `url` should not be specified at the same time.
<div align="right"><a href="#top">top</a></div>
### Authentication via system environment variables
The R API client supports reading authentication information stored in
system environment variables.
To set the two environment variables (`SB_API_ENDPOINT` and `SB_AUTH_TOKEN`)
in your system, you can use the function `sbg_set_env()`. For example:
```{r}
sbg_set_env(
url = "https://cgc-api.sbgenomics.com/v2",
token = "your_token"
)
```
Check that the environment variables are set correctly:
```{r}
sbg_get_env("SB_API_ENDPOINT")
## "https://cgc-api.sbgenomics.com/v2"
sbg_get_env("SB_AUTH_TOKEN")
## "your_token"
```
To create an `Auth` object:
```{r}
a <- Auth(from = "env")
```
To unset the two environment variables:
```{r}
Sys.unsetenv("SB_API_ENDPOINT")
Sys.unsetenv("SB_AUTH_TOKEN")
```
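In a non-interactive script, it may help to confirm that both variables are present before calling `Auth(from = "env")`. The following is a minimal sketch using only base R. `missing_sb_env()` is a hypothetical helper, not part of the package; the variable names are the two used above.

```{r}
# Hypothetical helper, not part of sevenbridges: returns the names of the
# variables that are unset or empty in the current R session.
missing_sb_env <- function(vars = c("SB_API_ENDPOINT", "SB_AUTH_TOKEN")) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

# Report any missing variable instead of failing later during a request
missing <- missing_sb_env()
if (length(missing) > 0) {
  message("Not set: ", paste(missing, collapse = ", "))
}
```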
<div align="right"><a href="#top">top</a></div>
### Authentication via user configuration file
You can create an ini-like file named `credentials` under the folder `$HOME/.sevenbridges/` and maintain your credentials for multiple accounts across various Seven Bridges environments. An example:
```
[aws-us-rfranklin]
api_endpoint = https://api.sbgenomics.com/v2
auth_token = token_for_this_user
# This is a comment:
# another user on the same platform
[aws-us-rosalind-franklin]
api_endpoint = https://api.sbgenomics.com/v2
auth_token = token_for_this_user
[default]
api_endpoint = https://cgc-api.sbgenomics.com/v2
auth_token = token_for_this_user
[gcp]
api_endpoint = https://gcp-api.sbgenomics.com/v2
auth_token = token_for_this_user
```
Please make sure that each profile has two fields named **exactly** `api_endpoint` and `auth_token`.
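A quick way to catch a misspelled field name is to scan the file before authenticating. The sketch below is a simplified base R reader for the layout shown above (section headers, `key = value` lines, `#` comments). `check_profiles()` is a hypothetical helper and does not reproduce how the package itself parses the file.

```{r}
# Hypothetical helper, not part of sevenbridges: for every [profile],
# report whether both required fields are present.
check_profiles <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  is_header <- grepl("^\\[.+\\]$", lines)
  section <- cumsum(is_header) # 0 for lines before any header
  profiles <- sub("^\\[(.+)\\]$", "\\1", lines[is_header])
  required <- c("api_endpoint", "auth_token")
  ok <- vapply(seq_along(profiles), function(i) {
    body <- lines[section == i & !is_header]
    keys <- trimws(sub("=.*$", "", body)) # text before the first "="
    all(required %in% keys)
  }, logical(1))
  names(ok) <- profiles
  ok
}

# Example with a temporary file; the second profile misspells a field
path <- tempfile()
writeLines(c(
  "[default]",
  "api_endpoint = https://cgc-api.sbgenomics.com/v2",
  "auth_token = token_for_this_user",
  "# a profile with a misspelled field name",
  "[gcp]",
  "api_endpoint = https://gcp-api.sbgenomics.com/v2",
  "token = token_for_this_user"
), path)
check_profiles(path)
```

Profiles reported as `FALSE` are missing at least one of the two required fields.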
To load the default profile (named `[default]`) from the default user
configuration file (`$HOME/.sevenbridges/credentials`), please use:
```{r}
a <- Auth(from = "file")
```
To load the user profile `aws-us-rfranklin` from this configuration file,
change the `profile_name`:
```{r}
a <- Auth(from = "file", profile_name = "aws-us-rfranklin")
```
To use a user configuration file from other locations (not recommended),
please specify the file path using the argument `config_file`. For example:
```{r}
a <- Auth(
from = "file", config_file = "~/sevenbridges.cfg",
profile_name = "aws-us-rfranklin"
)
```
***Note:*** If you edit the `credentials` file, please call `Auth()` again to re-authenticate.
<div align="right"><a href="#top">top</a></div>
## List All API Calls
If you call `api()` from an `Auth` object without passing any parameters, it lists all API calls. Any parameters you provide are passed on to the `api()` function, but you do not have to pass your token and path (base URL) again, since the `Auth` object already has this information. The following call from the `Auth` object also checks the response.
```{r}
a$api()
```
<div align="right"><a href="#top">top</a></div>
## Offset, Limit, Search, and Advance Access Features
### `offset` and `limit`
Every API call accepts two arguments named `offset` and `limit`.
- `offset` defines the position at which the retrieved items start.
- `limit` defines the number of items you want to get.
By default, `offset` is set to `0` and `limit` is set to `100`. As such, your API request returns the __first 100 items__ when you list items or search for items by name. To search and list all items, use `complete = TRUE` in your API request.
```{r}
getOption("sevenbridges")$offset
getOption("sevenbridges")$limit
```
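To make the interplay of `offset` and `limit` concrete, here is a minimal sketch that pages through a local vector instead of calling the API. Both `get_page()` and `get_all()` are hypothetical stand-ins; the way the package implements `complete = TRUE` internally may differ.

```{r}
# Hypothetical stand-in for a paged endpoint: returns at most `limit`
# items, skipping the first `offset` items.
get_page <- function(items, offset = 0, limit = 100) {
  stopifnot(limit >= 1, offset >= 0)
  if (offset >= length(items)) {
    return(items[0])
  }
  items[seq(offset + 1, min(offset + limit, length(items)))]
}

# Request page after page until a short (or empty) page signals the end
get_all <- function(items, limit = 100) {
  out <- items[0]
  offset <- 0
  repeat {
    page <- get_page(items, offset, limit)
    out <- c(out, page)
    if (length(page) < limit) break
    offset <- offset + limit
  }
  out
}

apps <- paste0("app-", 1:272)
length(get_page(apps)) # default offset and limit: first page only
length(get_page(apps, offset = 200)) # the remaining items after 200
length(get_all(apps)) # every item, collected page by page
```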
<div align="right"><a href="#top">top</a></div>
### Search by ID
When searching by ID, your request returns exactly the resource you asked for, because each ID is unique. As such, you do not have to set `offset` and `limit` manually. It is good practice to find your resources by their ID and pass this ID as an input to your task. You can find a resource's ID in the final part of the URL on the visual interface or via the API requests to list resources or get a resource's details.
<div align="right"><a href="#top">top</a></div>
### Search by name
Search by name returns all partial matches unless you specify `exact = TRUE`. This type of search only covers the items that have already been pulled, so use `complete = TRUE` if you want to search across everything.
For example, to list all public apps, use the `visibility` argument together with `complete = TRUE` to show everything. The `complete` argument generally works for items like "App", "Project", "Task", "File", etc.
```{r}
# searching by ID is fast and returns exactly one app
x <- a$app(visibility = "public", id = "admin/sbg-public-data/sbg-ucsc-b37-bed-converter/1")
# show the first 100 public apps
x <- a$app(visibility = "public")
length(x) # 100
# show all public apps
x <- a$app(visibility = "public", complete = TRUE)
length(x) # 272 by Nov 2016
# this returns nothing, because the app is not among the first 100 returned names
a$app(visibility = "public", name = "bed converter")
# this returns an app, because it pulls *all* app names before searching
a$app(visibility = "public", name = "bed converter", complete = TRUE)
```
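The behavior above can be reproduced offline with made-up data. In this sketch, `match_name()` is a hypothetical, simplified matcher; treating partial matches as case-insensitive substring matches is an assumption for illustration, not a description of the package's internals.

```{r}
# Hypothetical data: 272 app names, with the app of interest at position 150
app_names <- paste("tool", 1:272)
app_names[150] <- "SBG UCSC B37 Bed Converter"

# Simplified matcher (an assumption about the semantics, not package code)
match_name <- function(names, query, exact = FALSE) {
  if (exact) {
    names[names == query]
  } else {
    names[grepl(tolower(query), tolower(names), fixed = TRUE)]
  }
}

# Searching only the first page misses the app
first_page <- head(app_names, 100)
match_name(first_page, "bed converter")

# Searching the complete list finds it
match_name(app_names, "bed converter")

# Partial matching returns many names; exact matching returns one
length(match_name(app_names, "tool 1"))
match_name(app_names, "tool 1", exact = TRUE)
```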
<div align="right"><a href="#top">top</a></div>
### Experiment with Advance Access features
Similar to `offset` and `limit`, every API call accepts an argument named `advance_access`. This argument was first introduced in August 2017 and controls whether a special field is sent in the HTTP request header, which enables access to the "Advance Access" features in the Seven Bridges API. Note that the Advance Access features in the API are **not officially released yet**, so their usage is subject to change. Please use them with discretion.
In addition to modifying each API call that uses Advance Access features, the option can also be set globally at the beginning of your API script. This offers a one-button switch for users who want to experiment with the Advance Access features. The option is disabled by default:
```{r}
library("sevenbridges")
getOption("sevenbridges")$advance_access
```
```
## [1] FALSE
```
For example, the Markers API is in Advance Access as of November 2018. If we try to use the Markers API to list markers available on a BAM file with the `advance_access` option disabled, it will return an error message:
```{r}
req <- api(
token = "your_token", path = "genome/markers?file={bam_file_id}",
method = "GET"
)
httr::content(req)$"message"
```
```
## [1] "Advance access feature needs X-SBG-Advance-Access: advance header."
```
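The error message names the header that the `advance_access` flag is expected to add. As a rough sketch of that mapping, the hypothetical helper below builds a named vector of request headers. The Advance Access header name and value are taken from the message above; the token header name `X-SBG-Auth-Token` is an assumption, and none of this is the package's actual code.

```{r}
# Hypothetical helper, not part of sevenbridges. The token header name
# is an assumption; the Advance Access header comes from the error message.
build_headers <- function(token, advance_access = FALSE) {
  headers <- c(
    "X-SBG-Auth-Token" = token,
    "Content-Type" = "application/json"
  )
  # Add the extra header only when the flag is switched on
  if (isTRUE(advance_access)) {
    headers <- c(headers, "X-SBG-Advance-Access" = "advance")
  }
  headers
}

names(build_headers("your_token"))
names(build_headers("your_token", advance_access = TRUE))
```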
To enable the Advance Access features, you can use
```{r}
opt <- getOption("sevenbridges")
opt$advance_access <- TRUE
options(sevenbridges = opt)
```
at the beginning of your script. Let's check if the option has been enabled:
```{r}
getOption("sevenbridges")$advance_access
```
```
## [1] TRUE