# Create a Dataset

The script in this notebook creates a new, empty Dataset in the specified Collection. After creation, you can use the returned Dataset `id` with either the `upload_local_datafile_to_dataset.ipynb` notebook or the `upload_datafile_from_link_to_dataset.ipynb` notebook to populate the newly created Dataset with the contents of your datafile.

To use this script, you must have a Curation API key (obtained from the dropdown menu in the upper-right corner of the CZ CELLxGENE Discover data portal after logging in).

### Import dependencies

In [None]:
import requests

#### <font color='#bc00b0'>Please fill in the required values:</font>

<font color='#bc00b0'>(Required) Provide the path to your API key file</font>

In [None]:
api_key_file_path = "path/to/api-key-file"

<font color='#bc00b0'>(Required) Enter the id of the Collection in which you would like to create a new Dataset</font>

_The Collection id can be found in the URL path in the address bar when viewing your Collection in the CZ CELLxGENE Discover data portal: `/collections/{collection_id}`._

In [None]:
collection_id = "01234567-89ab-cdef-0123-456789abcdef"
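A malformed id only surfaces later as an HTTP error from the API, so it can help to check the value before sending anything. The following is a minimal sketch that assumes Collection ids are hyphenated UUID strings, as the placeholder above suggests; `is_canonical_uuid` is a hypothetical helper, not part of the Curation API.

```python
import uuid


def is_canonical_uuid(value: str) -> bool:
    """Return True if value is a hyphenated UUID string (hypothetical helper)."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # uuid.UUID also accepts braces, "urn:uuid:" prefixes and unhyphenated hex,
    # so compare against the canonical hyphenated form.
    return str(parsed) == value.lower()


collection_id = "01234567-89ab-cdef-0123-456789abcdef"  # placeholder value
if not is_canonical_uuid(collection_id):
    raise ValueError(f"collection_id does not look like a UUID: {collection_id!r}")
```

If the portal ever uses a different id format, this check would need to be relaxed or removed.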

### Specify domain (and API url)

In [None]:
domain_name = "cellxgene.cziscience.com"
site_url = f"https://{domain_name}"
api_url_base = f"https://api.{domain_name}"

### Use API key to obtain a temporary access token

In [None]:
with open(api_key_file_path) as f:
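    # Strip surrounding whitespace (e.g. a trailing newline), which is not valid in a header value
    api_key = f.read().strip()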
access_token_path = "/curation/v1/auth/token"
access_token_url = f"{api_url_base}{access_token_path}"
res = requests.post(access_token_url, headers={"x-api-key": api_key})
res.raise_for_status()
access_token = res.json().get("access_token")

##### (optional, debug) verify status code of response

In [None]:
print(res.status_code)
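Note that `res.json().get("access_token")` returns `None` when the key is absent, which would later produce the header value `Bearer None`. As a sketch, assuming the token response is a JSON object carrying an `access_token` string (as the code above implies), a small hypothetical helper can fail early instead:

```python
def extract_access_token(payload: dict) -> str:
    """Return the access token from a parsed token response (hypothetical helper)."""
    token = payload.get("access_token")
    if not isinstance(token, str) or not token:
        raise ValueError("token response did not contain an 'access_token' string")
    return token


# Example with a stand-in payload; in the notebook this would be res.json().
example_payload = {"access_token": "example-token"}
example_bearer = f"Bearer {extract_access_token(example_payload)}"
print(example_bearer)
```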

### Formulate request and create a Dataset

In [None]:
dataset_path = f"/curation/v1/collections/{collection_id}/datasets"
bearer_token = f"Bearer {access_token}"
url = f"{api_url_base}{dataset_path}"
res = requests.post(url=url, headers={"Authorization": bearer_token})
res.raise_for_status()
res_content = res.json()
print(res_content)
dataset_id = res_content["id"]
print(f"New empty Dataset with id {dataset_id} in Collection at url:")
print(f"{site_url}/collections/{collection_id}")
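The request above consists of a URL built from the Collection id and an `Authorization` header carrying the bearer token. As a sketch of that composition, the hypothetical helper `build_create_dataset_request` below assembles both pieces without sending anything, which makes them easy to inspect before calling `requests.post`. The token value is a stand-in.

```python
def build_create_dataset_request(api_url_base: str, collection_id: str, access_token: str):
    """Return (url, headers) for the Dataset-creation POST (hypothetical helper)."""
    url = f"{api_url_base}/curation/v1/collections/{collection_id}/datasets"
    headers = {"Authorization": f"Bearer {access_token}"}
    return url, headers


example_url, example_headers = build_create_dataset_request(
    "https://api.cellxgene.cziscience.com",
    "01234567-89ab-cdef-0123-456789abcdef",  # placeholder Collection id
    "example-token",  # stand-in; use the token obtained earlier
)
print(example_url)
```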