## Basic Bundle and File Operations

Here are some examples of basic operations with the HCA DSS: getting the metadata and contents of bundles and files. We'll illustrate these with direct calls to the API endpoints using [httpie](https://httpie.org).

First, install httpie so we can make some requests.

In [1]:
import sys
!{sys.executable} -m pip install httpie



Now, we're going to get the "manifest" of a bundle. This is metadata about a bundle and its contents. We'll construct a URL from the bundle's UUID and version.

In [2]:
bundle_uuid = "ead66505-a78b-44ee-81f6-418be859ab65"
bundle_version = "2018-12-06T043139.806469Z"

import urllib.parse
base_url = "https://dss.data.humancellatlas.org/v1/"
bundle_url = base_url + "bundles/" + bundle_uuid + '?' + urllib.parse.urlencode({"replica": "aws", "version": bundle_version})
print(f"Bundle url is: {bundle_url}")

Bundle url is: https://dss.data.humancellatlas.org/v1/bundles/ead66505-a78b-44ee-81f6-418be859ab65?replica=aws&version=2018-12-06T043139.806469Z


Now, we'll just GET that URL.

In [3]:
!http GET "$bundle_url"

HTTP/1.1 200 OK
Access-Control-Allow-Headers: Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Length: 8968
Content-Type: application/json
Date: Tue, 11 Dec 2018 23:03:29 GMT
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Amzn-Trace-Id: Root=1-5c104241-50719294ff227f303b62fde2;Sampled=0
x-amz-apigw-id: Rw9KNEq6IAMFgZw=
x-amzn-RequestId: f9456103-fd98-11e8-ba5c-f53ff065ea82

{
    "bundle": {
        "creator_uid": 8008,
        "files": [
            {
                "content-type": "application/json; dcp-type=\"metadata/biomaterial\""

And there are the contents of that bundle, along with some metadata. We can get metadata for a particular file by issuing a HEAD request to a different URL:

In [4]:
file_uuid = "9adf9f89-f546-4889-86ff-b430e3123c8b"
file_version = "2018-12-04T191256.554000Z"
file_url = base_url + "files/" + file_uuid + '?' + urllib.parse.urlencode({"replica": "aws", "version": file_version})
print(f"File url is: {file_url}")

File url is: https://dss.data.humancellatlas.org/v1/files/9adf9f89-f546-4889-86ff-b430e3123c8b?replica=aws&version=2018-12-04T191256.554000Z
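
The bundle and file URLs above share one pattern: the base URL, a resource name (`bundles` or `files`), the UUID, and a query string carrying `replica` and `version`. As a minimal sketch, that pattern could be wrapped in a small function. The name `dss_url` and its parameters are hypothetical, introduced here for illustration; they are not part of any DSS client library.

```python
import urllib.parse

BASE_URL = "https://dss.data.humancellatlas.org/v1/"


def dss_url(resource, uuid, version, replica="aws", base_url=BASE_URL):
    """Build a DSS URL for a bundle or a file.

    Hypothetical helper: it only mirrors the string concatenation
    done by hand in the cells above.
    """
    if resource not in ("bundles", "files"):
        raise ValueError(f"unexpected resource: {resource!r}")
    # urlencode escapes the query values and joins them with '&'.
    query = urllib.parse.urlencode({"replica": replica, "version": version})
    return f"{base_url}{resource}/{uuid}?{query}"


# Rebuild the two URLs used in this walkthrough.
helper_bundle_url = dss_url(
    "bundles",
    "ead66505-a78b-44ee-81f6-418be859ab65",
    "2018-12-06T043139.806469Z",
)
helper_file_url = dss_url(
    "files",
    "9adf9f89-f546-4889-86ff-b430e3123c8b",
    "2018-12-04T191256.554000Z",
)
print(helper_bundle_url)
print(helper_file_url)
```

The two printed URLs should be identical to the ones built by hand above, so either form can be passed to `http`.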


In [5]:
!http HEAD "$file_url"

HTTP/1.1 200 OK
Access-Control-Allow-Headers: Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Length: 0
Content-Type: text/html; charset=utf-8
Date: Tue, 11 Dec 2018 23:03:30 GMT
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Amzn-Trace-Id: Root=1-5c104242-e86cdbe51a3f134449d6993d;Sampled=0
X-DSS-CONTENT-TYPE: application/json; dcp-type="metadata/process"
X-DSS-CRC32C: 3c8c3cc5
X-DSS-CREATOR-UID: 8008
X-DSS-S3-ETAG: 003989f2df2aa8f39723f3925e5fb851
X-DSS-SHA1: 9bff1801fb12b3302b7f7d7dd57a3a3cb6070914
X-DSS-SHA256: 7a967973684ffa8ac3fe5004177e2fc7d8f3a2c822bbede536dcd58d9af7bdb5

This gives a response with metadata about the file in the `X-DSS-*` headers and an empty body. If we want the actual contents of the file, we GET rather than HEAD that URL:

In [6]:
!http GET "$file_url"

HTTP/1.1 302 Found
Access-Control-Allow-Headers: Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Length: 1687
Content-Type: text/html; charset=utf-8
Date: Tue, 11 Dec 2018 23:03:32 GMT
Location: https://org-hca-dss-checkout-prod.s3.amazonaws.com/blobs/7a967973684ffa8ac3fe5004177e2fc7d8f3a2c822bbede536dcd58d9af7bdb5.9bff1801fb12b3302b7f7d7dd57a3a3cb6070914.003989f2df2aa8f39723f3925e5fb851.3c8c3cc5?AWSAccessKeyId=ASIARSZHKI4KOVDXHRWK&Signature=KW1YGr44t5ob%2F%2FgN%2BWawwd9WWIA%3D&x-amz-security-token=FQoGZXIvYXdzEIf%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDGhmzBCT03o%2FJU6qGSLgAXSrzriN7DR2k04v4yE931hYaSm6niBbLcV9F8qvNahb8ab7FyjuoSfpUqYRvcq%2F%2Ft1bBBz9ORIkzBA6cCI%2BWFRr1qN6j7CdymmOhg9iS89icSROOtPDxV4xWAagzl3xUHpLT7OKuMlWiCosNXQaEU1UEV

Okay, that didn't quite get us the file. The server responded with a 302 and a "Location" header pointing at the file's actual location. If we want the file itself, we need to follow that redirect:

In [7]:
!http --follow GET "$file_url"

HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 408
Content-Type: application/json; dcp-type="metadata/process"
Date: Tue, 11 Dec 2018 23:03:35 GMT
ETag: "003989f2df2aa8f39723f3925e5fb851"
Last-Modified: Tue, 11 Dec 2018 23:00:17 GMT
Server: AmazonS3
x-amz-expiration: expiry-date="Wed, 19 Dec 2018 00:00:00 GMT", rule-id="dss_checkout_expiration"
x-amz-id-2: o2x3gUqcf/HnTpgUgcT9UeI8Wicag+L9R7XGfw3efs6Z4WAmiLp/mhOWe32hS78amKWF2IW+U4c=
x-amz-request-id: 0EF4D3BBBDB4B564
x-amz-tagging-count: 4

{
    "describedBy": "https://schema.humancellatlas.org/type/process/6.0.2/process",
    "process_core": {
        "process_id": "E18_20160930_Neurons_Sample_71_S0

This follows the redirect and gives us the contents of the (small) file in the response body.
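
The HEAD response earlier reported SHA-1 and SHA-256 checksums in its `X-DSS-*` headers, so a downloaded body can be checked against them. The following is a minimal sketch of that check using only the Python standard library. The function names (`extract_dss_metadata`, `parse_content_type`, `verify_body`) are hypothetical, and the body used in the demonstration is a stand-in, not the real file contents. The CRC32C and S3 ETag values are deliberately not checked here: the standard library has no CRC32C implementation, and this sketch makes no assumption about how the ETag was computed.

```python
import hashlib
from email.message import Message


def extract_dss_metadata(headers):
    """Collect X-DSS-* headers into a dict keyed by the lower-cased suffix."""
    prefix = "x-dss-"
    return {
        name.lower()[len(prefix):]: value
        for name, value in headers.items()
        if name.lower().startswith(prefix)
    }


def parse_content_type(value):
    """Split a content type into (media type, dcp-type parameter or None)."""
    msg = Message()
    msg["content-type"] = value
    return msg.get_content_type(), msg.get_param("dcp-type")


def verify_body(body, metadata):
    """Return the names of the checksums that do NOT match the body."""
    mismatches = []
    for algo in ("sha1", "sha256"):
        expected = metadata.get(algo)
        if expected is None:
            continue  # nothing to compare against
        if hashlib.new(algo, body).hexdigest() != expected.lower():
            mismatches.append(algo)
    return mismatches


# Headers copied from the HEAD response shown earlier.
head_headers = {
    "X-DSS-CONTENT-TYPE": 'application/json; dcp-type="metadata/process"',
    "X-DSS-CRC32C": "3c8c3cc5",
    "X-DSS-CREATOR-UID": "8008",
    "X-DSS-S3-ETAG": "003989f2df2aa8f39723f3925e5fb851",
    "X-DSS-SHA1": "9bff1801fb12b3302b7f7d7dd57a3a3cb6070914",
    "X-DSS-SHA256": "7a967973684ffa8ac3fe5004177e2fc7d8f3a2c822bbede536dcd58d9af7bdb5",
}
meta = extract_dss_metadata(head_headers)
print(parse_content_type(meta["content-type"]))

# Stand-in body (hypothetical): in practice this would be the bytes
# returned by the followed GET request.
sample_body = b'{"hypothetical": "stand-in body"}'
sample_meta = {
    "sha1": hashlib.sha1(sample_body).hexdigest(),
    "sha256": hashlib.sha256(sample_body).hexdigest(),
}
print(verify_body(sample_body, sample_meta))  # matching checksums
print(verify_body(sample_body, meta))         # stand-in body vs. real checksums
```

An empty list from `verify_body` means both checksums matched. The last call reports mismatches because the stand-in body is not the real file.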