## Basic Bundle and File Operations with the DCP CLI

Here are some examples of basic operations with the Human Cell Atlas (HCA) Data Storage System (DSS): getting bundle and file metadata and contents. We'll illustrate them using the HCA's Data Coordination Platform (DCP) CLI.

First, install the CLI so we can make some requests.

In [1]:
import sys
!{sys.executable} -m pip install hca



Now, we're going to get the "manifest" of a bundle. This is metadata about a bundle and its contents. We'll make a request with the CLI using the bundle's UUID and version.

In [2]:
bundle_uuid = "ead66505-a78b-44ee-81f6-418be859ab65"
bundle_version = "2018-12-06T043139.806469Z"

The `hca dss get-bundle` command retrieves the manifest:

In [3]:
!hca dss get-bundle --uuid "$bundle_uuid" --version "$bundle_version" --replica aws

{
  "bundle": {
    "creator_uid": 8008,
    "files": [
      {
        "content-type": "application/json; dcp-type=\"metadata/biomaterial\"",
        "crc32c": "ec0d3d14",
        "indexed": true,
        "name": "cell_suspension_0.json",
        "s3_etag": "fee4d8354468476052a1c0da67fb1f03",
        "sha1": "6f77abbdd0892a99e6fae5c920daaaf3018b24fd",
        "sha256": "5d41934e1af911f510be04506d41b59ebd91d1678f42df759819b223a898b1da",
        "size": 1356,
        "uuid": "3e4e6f8e-0c67-470c-8059-950e0bd1ebff",
        "version": "2018-12-04T190346.297000Z"
      },
      {
        "content-type": "application/json; dcp-type=\"metadata/biomaterial\"",
        "crc32c": "73fc7859",
        "indexed": true,
        "name": "specimen_from_organism_0.json",
        "s3_etag": "5ba0e2a0506f3fab2d69c5d19227c5c3",
        "sha1": "53275ddd170d3d6dd31239674bff86ede607ab2f",
        "sha256": "3b76d6f98b4912fd9fc233054cd47c321bb2411e5a56c656efc5c7e9327f2a32",
        "
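
Once a manifest like this is parsed into a Python dictionary (for example with `json.loads`), it is easy to summarise. The following is a minimal sketch under that assumption. The helper name `list_bundle_files` is hypothetical, and the sample manifest holds only the first file entry shown above.

```python
def list_bundle_files(manifest):
    """Return (name, size, uuid) for each file listed in a bundle manifest."""
    return [(f["name"], f["size"], f["uuid"]) for f in manifest["bundle"]["files"]]


# First file entry copied from the manifest output above.
manifest = {
    "bundle": {
        "creator_uid": 8008,
        "files": [
            {
                "content-type": 'application/json; dcp-type="metadata/biomaterial"',
                "crc32c": "ec0d3d14",
                "indexed": True,
                "name": "cell_suspension_0.json",
                "s3_etag": "fee4d8354468476052a1c0da67fb1f03",
                "sha1": "6f77abbdd0892a99e6fae5c920daaaf3018b24fd",
                "sha256": "5d41934e1af911f510be04506d41b59ebd91d1678f42df759819b223a898b1da",
                "size": 1356,
                "uuid": "3e4e6f8e-0c67-470c-8059-950e0bd1ebff",
                "version": "2018-12-04T190346.297000Z",
            }
        ],
    }
}

# Print one line per file: name, size in bytes, and file UUID.
for name, size, uuid in list_bundle_files(manifest):
    print(f"{name}\t{size} bytes\t{uuid}")
```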

And there are the contents of that bundle, along with some metadata. We can check for the existence of a file using `hca dss head-file`:

In [4]:
file_uuid = "9adf9f89-f546-4889-86ff-b430e3123c8b"
file_version = "2018-12-04T191256.554000Z"

In [5]:
!hca dss head-file --uuid "$file_uuid" --version "$file_version" --replica aws

"<Response [200]>"


Not very helpful. The 200 status tells us the file exists, but the file's metadata is carried in the response headers, which the CLI doesn't display. If we want to get the file contents, we can use `hca dss get-file`:

In [6]:
!hca dss get-file --uuid "$file_uuid" --version "$file_version" --replica aws

{
  "process_core": {
    "process_id": "E18_20160930_Neurons_Sample_71_S068_L007_006"
  },
  "schema_type": "process",
  "describedBy": "https://schema.humancellatlas.org/type/process/6.0.2/process",
  "provenance": {
    "document_id": "9adf9f89-f546-4889-86ff-b430e3123c8b",
    "submission_date": "2018-12-04T18:59:53.218Z",
    "update_date": "2018-12-04T19:12:56.554Z"
  }
}


And there are the contents of the file.
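
In this example, the `update_date` in the file's provenance and the `file_version` we requested appear to encode the same instant in two different timestamp layouts. Here is a minimal sketch that parses both and compares them. It assumes the two layouts seen here hold in general, which the output above does not confirm, and the helper names are hypothetical.

```python
from datetime import datetime, timezone


def parse_dss_version(version):
    """Parse a version string such as '2018-12-04T191256.554000Z' (no colons in the time)."""
    parsed = datetime.strptime(version, "%Y-%m-%dT%H%M%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def parse_provenance_date(date):
    """Parse a provenance timestamp such as '2018-12-04T19:12:56.554Z'."""
    parsed = datetime.strptime(date, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


# Values copied from the cells and output above.
file_version = "2018-12-04T191256.554000Z"
update_date = "2018-12-04T19:12:56.554Z"

same_instant = parse_dss_version(file_version) == parse_provenance_date(update_date)
print(same_instant)
```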

### Downloading Entire Bundles

The CLI also exposes another command, `hca dss download`, which downloads an entire bundle to disk in one step:

In [7]:
!hca dss download --bundle-uuid "$bundle_uuid" --version "$bundle_version" --replica aws

INFO:hca:File cell_suspension_0.json: Retrieving...
INFO:hca:File cell_suspension_0.json: GET SUCCEEDED. Stored at ead66505-a78b-44ee-81f6-418be859ab65/cell_suspension_0.json.
INFO:hca:File specimen_from_organism_0.json: Retrieving...
INFO:hca:File specimen_from_organism_0.json: GET SUCCEEDED. Stored at ead66505-a78b-44ee-81f6-418be859ab65/specimen_from_organism_0.json.
INFO:hca:File donor_organism_0.json: Retrieving...
INFO:hca:File donor_organism_0.json: GET SUCCEEDED. Stored at ead66505-a78b-44ee-81f6-418be859ab65/donor_organism_0.json.
INFO:hca:File sequence_file_0.json: Retrieving...
INFO:hca:File sequence_file_0.json: GET SUCCEEDED. Stored at ead66505-a78b-44ee-81f6-418be859ab65/sequence_file_0.json.
INFO:hca:File sequence_file_1.json: Retrieving...
INFO:hca:File sequence_file_1.json: GET SUCCEEDED. Stored at ead66505-a78b-44ee-81f6-418be859ab65/sequence_file_1.json.
INFO:hca:File sequence_file_2.json: Retrieving...
INFO:hca:File sequence_file_2.json: GET SUCCEEDED. Stored at ead

In [9]:
!ls -l "$bundle_uuid"

total 516280
-rw-r--r-- 1 jovyan users      1356 Dec 12 03:08 cell_suspension_0.json
-rw-r--r-- 1 jovyan users      1191 Dec 12 03:08 dissociation_protocol_0.json
-rw-r--r-- 1 jovyan users      1430 Dec 12 03:08 donor_organism_0.json
-rw-r--r-- 1 jovyan users  38942279 Dec 12 03:08 E18_20160930_Neurons_Sample_71_S068_L007_I1_006.fastq.gz
-rw-r--r-- 1 jovyan users  94751979 Dec 12 03:08 E18_20160930_Neurons_Sample_71_S068_L007_R1_006.fastq.gz
-rw-r--r-- 1 jovyan users 386122988 Dec 12 03:09 E18_20160930_Neurons_Sample_71_S068_L007_R2_006.fastq.gz
-rw-r--r-- 1 jovyan users      1187 Dec 12 03:08 library_preparation_protocol_0.json
-rw-r--r-- 1 jovyan users      1968 Dec 12 03:08 links.json
-rw-r--r-- 1 jovyan users       408 Dec 12 03:08 process_0.json
-rw-r--r-- 1 jovyan users       377 Dec 12 03:08 process_1.json
-rw-r--r-- 1 jovyan users       376 Dec 12 03:08 process_2.json
-rw-r--r-- 1 jovyan users      3960 Dec 12 03:08 project_0.json
-rw-r--r-- 1 jovyan users       68
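
The sizes in this listing can be checked against the manifest. For example, `cell_suspension_0.json` is 1356 bytes in both. Below is a minimal sketch of such a check, assuming the manifest's `files` list is available as Python dictionaries and the bundle was downloaded into a directory named after its UUID. `verify_download` is a hypothetical helper, not part of the `hca` package.

```python
import hashlib
import os


def verify_download(files, directory):
    """Return (name, problem) pairs for files that fail a presence, size, or SHA-256 check."""
    problems = []
    for entry in files:
        path = os.path.join(directory, entry["name"])
        if not os.path.isfile(path):
            problems.append((entry["name"], "missing"))
            continue
        if os.path.getsize(path) != entry["size"]:
            problems.append((entry["name"], "size mismatch"))
            continue
        # Hash in 1 MiB chunks so large FASTQ files are not read into memory at once.
        digest = hashlib.sha256()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != entry["sha256"]:
            problems.append((entry["name"], "sha256 mismatch"))
    return problems
```

With the manifest parsed into a dictionary named `manifest` (a hypothetical variable), calling `verify_download(manifest["bundle"]["files"], bundle_uuid)` would return an empty list when every file is present and matches its recorded size and checksum.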