# Upload a datafile to a Dataset with an upload link

The script in this notebook uploads a datafile to a Dataset using a user-provided link to that datafile. Datasets can only be added or replaced in private Collections (including private revisions of published Collections).

To use this script, you need a Curation API key (available from the upper-right dropdown menu in the CZ CELLxGENE Discover data portal after logging in).

### Import dependencies

In [None]:
library("readr")
library("httr")
library("stringr")
library("rjson")

#### <font color='#bc00b0'>Please fill in the required values:</font>

<font color='#bc00b0'>(Required) Provide the path to your API key file</font>

In [None]:
api_key_file_path <- "path/to/api-key-file"

<font color='#bc00b0'>(Required) Provide the link to the h5ad datafile to upload</font>

In [None]:
datafile_link <- "https://some.download.link"

<font color='#bc00b0'>(Required) Enter the id of the Collection to which you wish to add this datafile as a Dataset</font>

_The Collection id can be found in the URL path shown in the address bar when viewing your Collection in the CZ CELLxGENE Discover data portal: `/collections/{collection_id}`. You can only add/replace Datasets in private Collections or private revisions of published Collections. To edit a published Collection, you must first create a revision of that Collection._

In [None]:
collection_id <- "01234567-89ab-cdef-0123-456789abcdef"

<font color='#bc00b0'>(Required) Enter the id of the Dataset to which you wish to upload your datafile</font>

_The Dataset id can be found by querying the `/collections/{collection_id}` endpoint and filtering for the Dataset of interest, OR by looking at the URL path in the address bar when viewing your Dataset in the CZ CELLxGENE Explorer browser tool: `/e/{dataset_id}.cxg/`. See the heading of create_dataset_from_local_file.ipynb for rules about adding vs updating Datasets._

In [None]:
dataset_id <- "abcdef01-2345-6789-abcd-ef0123456789"
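Both ids can also be pulled out of a copied URL rather than typed by hand. The following is a minimal sketch using base R regular expressions. The helper names `first_match`, `extract_collection_id` and `extract_dataset_id` and the example URLs are hypothetical, and it assumes ids are lowercase UUID strings like the placeholder values above.

```r
# Hypothetical helpers (not part of the Curation API or this notebook).
# Assumption: ids are lowercase hex UUIDs, as in the placeholder values above.
uuid_pattern <- "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

# Return the first match of `pattern` in the single string `x`, or NA if none.
first_match <- function(pattern, x) {
  m <- regmatches(x, regexpr(pattern, x, perl = TRUE))
  if (length(m) == 0) NA_character_ else m
}

# Collection id from a portal URL of the form /collections/{collection_id}
extract_collection_id <- function(url) {
  first_match(paste0("(?<=/collections/)", uuid_pattern), url)
}

# Dataset id from an Explorer URL of the form /e/{dataset_id}.cxg/
extract_dataset_id <- function(url) {
  first_match(paste0("(?<=/e/)", uuid_pattern, "(?=\\.cxg)"), url)
}

# Example URLs are made up for illustration.
example_collection_id <- extract_collection_id(
  "https://cellxgene.cziscience.com/collections/01234567-89ab-cdef-0123-456789abcdef"
)
example_dataset_id <- extract_dataset_id(
  "https://cellxgene.cziscience.com/e/abcdef01-2345-6789-abcd-ef0123456789.cxg/"
)
```

An `NA` result means the URL did not match the expected pattern, which is a cue to check the copied address before sending any request.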

### Specify domain (and API URL)

In [None]:
domain_name <- "cellxgene.cziscience.com"
site_url <- str_interp("https://${domain_name}")
api_url_base <- str_interp("https://api.${domain_name}")

### Use API key to obtain a temporary access token

In [None]:
api_key <- str_trim(read_file(api_key_file_path))  # strip whitespace such as a trailing newline in the key file
access_token_path <- "/curation/v1/auth/token"
access_token_url <- str_interp("${api_url_base}${access_token_path}")
res <- POST(url=access_token_url, add_headers(`x-api-key`=api_key))
stop_for_status(res)
access_token <- content(res)$access_token
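If the response parses but carries no token, the upload request may fail later in a less obvious way. A small guard can surface the problem at this step instead. This is a sketch, assuming only that the parsed body is a list with an `access_token` field as used above; `require_access_token` is a hypothetical helper name, not part of httr or the Curation API.

```r
# Hypothetical helper: fail fast when the parsed token response is unusable.
# `parsed` stands for the list returned by content(res) in the cell above.
require_access_token <- function(parsed) {
  token <- parsed$access_token
  if (!is.character(token) || length(token) != 1 || is.na(token) || !nzchar(token)) {
    stop("Token response did not contain a non-empty `access_token`")
  }
  token
}

# Example with a stand-in list instead of a live response:
access_token_example <- require_access_token(list(access_token = "example-token"))
```

In the notebook this could replace the direct field access, for example `access_token <- require_access_token(content(res))`.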

##### (optional, debug) verify status code of response

In [None]:
print(res$status_code)

### Formulate the request and upload the datafile from the link

In [None]:
upload_path <- str_interp("/curation/v1/collections/${collection_id}/datasets/${dataset_id}")
body <- list("link"=datafile_link)
bearer_token <- str_interp("Bearer ${access_token}")
url <- str_interp("${api_url_base}${upload_path}")
res <- PUT(url=url, body=toJSON(body), add_headers(`Authorization`=bearer_token, `Content-Type`="application/json"))
stop_for_status(res)
print(res$status_code)
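The request above has three parts: the Dataset URL, a JSON body with a single `link` field, and a bearer token header. The sketch below assembles the first two as plain R values so they can be inspected before anything is sent. `build_upload_request` is a hypothetical helper, and the argument values are the placeholders used earlier in this notebook.

```r
# Hypothetical helper: assemble the URL and body used by the PUT call above
# without sending anything, so both can be inspected first.
build_upload_request <- function(api_url_base, collection_id, dataset_id, datafile_link) {
  path <- sprintf("/curation/v1/collections/%s/datasets/%s", collection_id, dataset_id)
  list(
    url = paste0(api_url_base, path),
    body = list(link = datafile_link)  # serialized with toJSON() when sent
  )
}

# Placeholder values taken from the cells above.
req <- build_upload_request(
  "https://api.cellxgene.cziscience.com",
  "01234567-89ab-cdef-0123-456789abcdef",
  "abcdef01-2345-6789-abcd-ef0123456789",
  "https://some.download.link"
)
str(req)
```

Printing `req` before calling `PUT` makes it easier to spot a swapped Collection and Dataset id or an empty link.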