# Fetch full metadata for a Dataset Version

The script in this notebook retrieves full metadata for a given Dataset Version.

Fetching a Dataset Version requires only the Dataset Version id; it does not require an API key or access token.

### Import dependencies

In [None]:
import requests

#### <font color='#bc00b0'>Please fill in the required values:</font>

<font color='#bc00b0'>(Required) Enter the id of the Dataset Version</font>

_A Dataset Version id can be found in one of two ways: by calling the `/datasets/{dataset_id}/versions` or `/collections/{collection_id}/versions` endpoint and filtering for the Dataset Version of interest, or by reading the URL path in the browser's address bar while viewing your Dataset Version in the CZ CELLxGENE Explorer browser tool: `/e/{dataset_version_id}.cxg/`._
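If you are starting from an Explorer URL, the id can be pulled out of the path programmatically. The following is a minimal sketch, assuming the URL follows the `/e/{dataset_version_id}.cxg/` pattern described above; the helper name `extract_dataset_version_id` is hypothetical and not part of any CELLxGENE library.

```python
import re
from urllib.parse import urlparse

# Hypothetical helper: matches an Explorer path of the form
# /e/<dataset_version_id>.cxg/ (the trailing slash is optional here).
_EXPLORER_PATH = re.compile(r"^/e/(?P<id>[^/]+)\.cxg/?$")


def extract_dataset_version_id(explorer_url: str) -> str:
    """Return the id embedded in an Explorer URL path, or raise ValueError."""
    path = urlparse(explorer_url).path
    match = _EXPLORER_PATH.match(path)
    if match is None:
        raise ValueError(f"Not an Explorer URL: {explorer_url!r}")
    return match.group("id")


# Placeholder URL built from the example id used in this notebook.
example_url = "https://cellxgene.cziscience.com/e/abcdef01-2345-6789-abcd-ef0123456789.cxg/"
print(extract_dataset_version_id(example_url))
```

The returned string can be assigned directly to `dataset_version_id` in the cell that follows.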

In [None]:
dataset_version_id = "abcdef01-2345-6789-abcd-ef0123456789"

### Specify domain (and API url)

In [None]:
domain_name = "cellxgene.cziscience.com"
site_url = f"https://{domain_name}"
api_url_base = f"https://api.{domain_name}"

### Formulate request and fetch a Dataset Version's metadata

In [None]:
dataset_version_path = f"/curation/v1/dataset_versions/{dataset_version_id}"
url = f"{api_url_base}{dataset_version_path}"
res = requests.get(url=url)
res.raise_for_status()
res_content = res.json()
print(res_content)
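Printing the parsed response directly produces a single-line `dict` representation, which can be hard to scan. As an optional alternative, the sketch below renders the same data as indented JSON using the standard library. The helper name `format_metadata` is hypothetical, and `sample` is a stand-in for `res_content` that shows only fields referenced elsewhere in this notebook.

```python
import json


def format_metadata(metadata: dict) -> str:
    """Render a JSON-compatible dict as indented, key-sorted text."""
    return json.dumps(metadata, indent=2, sort_keys=True)


# Stand-in for `res_content`; a real response contains many more fields.
sample = {
    "dataset_version_id": "abcdef01-2345-6789-abcd-ef0123456789",
    "assets": [],
}
print(format_metadata(sample))
```

In the notebook itself, `print(format_metadata(res_content))` would take the place of `print(res_content)`.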

### Download Dataset Assets

The Dataset Version metadata includes a download URL for every asset associated with this particular Dataset Version.

These download URLs are permalinks to the assets for this Dataset Version.
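If only one asset is needed, the `assets` list can be indexed by file type before downloading. The sketch below assumes each asset entry carries the `filetype` and `url` keys used by the download cell; the helper name `asset_urls_by_filetype` is hypothetical, and the file type values and URLs in `sample` are illustrative placeholders rather than real API output.

```python
def asset_urls_by_filetype(metadata: dict) -> dict:
    """Map each asset's filetype to its download URL.

    If two assets share a filetype, the later entry wins.
    """
    return {asset["filetype"]: asset["url"] for asset in metadata["assets"]}


# Illustrative stand-in for `res_content`; values are placeholders.
sample = {
    "dataset_version_id": "abcdef01-2345-6789-abcd-ef0123456789",
    "assets": [
        {"filetype": "H5AD", "url": "https://example.org/sample.h5ad"},
        {"filetype": "RDS", "url": "https://example.org/sample.rds"},
    ],
}

urls = asset_urls_by_filetype(sample)
print(urls["H5AD"])
```

Calling `asset_urls_by_filetype(res_content)` on the real response would give a lookup table from which a single URL can be passed to `requests.get`.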

In [None]:
assets = res_content["assets"]
# The response field holds the Dataset Version id, so name the variable accordingly.
version_id = res_content["dataset_version_id"]
for asset in assets:
    download_filename = f"{version_id}.{asset['filetype']}"
    print(f"\nDownloading {download_filename}... ")
    with requests.get(asset["url"], stream=True) as res:
        res.raise_for_status()
        filesize = int(res.headers["Content-Length"])
        with open(download_filename, "wb") as df:
            total_bytes_received = 0
            for chunk in res.iter_content(chunk_size=1024 * 1024):
                df.write(chunk)
                total_bytes_received += len(chunk)
                # Progress as a percentage of the expected file size, to one decimal place.
                percent_downloaded = round(total_bytes_received / filesize * 100, 1)
                # Switch to green only once every expected byte has arrived.
                color = "\033[38;5;10m" if total_bytes_received >= filesize else ""
                print(f"\033[1m{color}{percent_downloaded}% downloaded\033[0m\r", end="")
print("\n\nDone downloading assets")
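The download loop above reads `Content-Length` directly from the response headers, which raises a `KeyError` if a server omits that header (for example, when a response uses chunked transfer encoding). The following is a minimal sketch of a progress formatter that tolerates a missing total; the helper name `format_progress` is hypothetical, and whether the asset servers ever omit the header is not something this notebook establishes.

```python
from typing import Optional


def format_progress(bytes_received: int, total_bytes: Optional[int]) -> str:
    """Describe download progress, with or without a known total size."""
    if total_bytes:
        percent = bytes_received / total_bytes * 100
        return f"{percent:.1f}% downloaded"
    # No usable Content-Length: report the raw amount received instead.
    return f"{bytes_received / (1024 * 1024):.1f} MiB downloaded"


print(format_progress(512, 1024))
print(format_progress(3 * 1024 * 1024, None))
```

To use it in the loop, the total could be read with `res.headers.get("Content-Length")` and converted with `int(...)` only when a value is present, passing `None` otherwise.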