# Accessing Azure Blob Storage
This notebook demonstrates how to use the Azure CLI to fetch the data required by the other notebooks.

Prerequisites:
- The Azure CLI (`az`) must be installed. You can install it via the [`environment.yml`](../../environment.yml).
- A `blob-key.json` file containing the `account_name` and `key` fields must be stored in the `../secret` folder (or any other folder, as long as you update the path below).

### Find out what containers are available in Azure Blob Storage

In [None]:
import os, json

In [None]:
with open("../secret/blob-key.json") as f:
    az_account = json.load(f)
account_name = az_account["account_name"]
key = az_account["key"]
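
A missing or empty field in `blob-key.json` only surfaces later as a confusing CLI error, so it can help to validate the file when loading it. The following is a minimal sketch; the helper name `load_blob_credentials` is hypothetical, and it assumes the file holds exactly the two fields used above (`account_name` and `key`).

```python
import json
from pathlib import Path

# Fields this notebook reads from the key file
REQUIRED_FIELDS = ("account_name", "key")


def load_blob_credentials(path):
    """Load the storage credentials and fail early if a field is missing or empty."""
    with Path(path).open() as f:
        creds = json.load(f)
    missing = [field for field in REQUIRED_FIELDS if not creds.get(field)]
    if missing:
        raise KeyError(f"{path} is missing required field(s): {', '.join(missing)}")
    return creds["account_name"], creds["key"]
```

It could then replace the cell above with `account_name, key = load_blob_credentials("../secret/blob-key.json")`.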

In [None]:
# Get info on all containers in the Azure Blob Storage account
os.makedirs("response", exist_ok=True)  # create the output folder if needed
! az storage container list --account-key {key} --account-name {account_name} > response/{account_name}.json

In [None]:
# This will print the name of all containers available in our azure blob storage account
with open(f"response/{account_name}.json") as f:
    containers = json.load(f)
[c["name"] for c in containers]

We can see from the response that there are 5 different containers available in our account.

### Get item list in a container
If you want to know what items are available inside a container, you can do the following:

In [None]:
query = "processed"  # name of the container to inspect
# List all items in the container
! az storage blob list --container-name {query} --account-name {account_name} \
    --account-key {key} > response/{account_name}-{query}.json

In [None]:
with open(f"response/{account_name}-{query}.json") as f:
    query_response = json.load(f)

In [None]:
item_list = [q["name"] for q in query_response]

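`item_list` contains the names of all blob items in the container. Note that the folder-like prefixes in these names are pseudo-directories: in reality, the container is a single flat layer, and the `/` characters are simply part of each blob name.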
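
Because the names are flat, any folder view has to be rebuilt by splitting on `/`. As an illustration, the sketch below groups blob names by their first path segment. The helper name `group_by_top_level` is hypothetical, and the example names are taken from the download list used later in this notebook.

```python
from collections import defaultdict


def group_by_top_level(blob_names):
    """Group flat blob names by the text before their first '/'."""
    groups = defaultdict(list)
    for name in blob_names:
        prefix, sep, rest = name.partition("/")
        if sep:
            groups[prefix].append(rest)
        else:
            # No '/' in the name: the blob sits at the container root
            groups[""].append(name)
    return dict(groups)


example_names = [
    "bigslice_query/reports.zip",
    "p__Myxococcota_all/bigscape/p__Myxococcota_all_antismash_6.0.1.zip",
]
grouped = group_by_top_level(example_names)
print(sorted(grouped))
```

Calling `group_by_top_level(item_list)` gives a quick overview of the top-level pseudo-directories in the container.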

### Download required items for downstream analysis

In [None]:
blob_items = ["bigslice_query/reports.zip",
              "p__Myxococcota_all/bigscape/p__Myxococcota_all_antismash_6.0.1.zip",
              "p__Nitrospirota_all/bigscape/p__Nitrospirota_all_antismash_6.0.1.zip"
             ]

In [None]:
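# TODO: replace the CLI call with the Azure Python SDK
download_dir = "../data"
os.makedirs(download_dir, exist_ok=True)  # make sure the target folder exists
for item in blob_items:
    # Save each blob under its base name, dropping the pseudo-directory prefix
    fn = os.path.join(download_dir, os.path.basename(item))
    os_command = f"az storage blob download --account-key {key} --account-name {account_name} --container-name {query} --file {fn} --name {item}"
    ! {os_command}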
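
The same download could be done from Python without shelling out. The sketch below is one possible version, not the method used in this notebook. It assumes the `azure-storage-blob` (v12) package is installed, which is not confirmed by the prerequisites above, and that the account uses the standard public endpoint `https://<account_name>.blob.core.windows.net`. The function names `local_path_for` and `download_blobs` are hypothetical.

```python
import os


def local_path_for(blob_name, download_dir="../data"):
    """Mirror the cell above: keep only the file name, drop the pseudo-directories."""
    return os.path.join(download_dir, os.path.basename(blob_name))


def download_blobs(account_name, key, container, blob_names, download_dir="../data"):
    """Download each named blob from the container into download_dir."""
    # Imported lazily so local_path_for works even without the SDK installed
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(
        # Assumption: standard public-cloud endpoint for the storage account
        account_url=f"https://{account_name}.blob.core.windows.net",
        credential=key,
    )
    container_client = service.get_container_client(container)
    os.makedirs(download_dir, exist_ok=True)
    for name in blob_names:
        with open(local_path_for(name, download_dir), "wb") as f:
            # Write the blob content directly into the local file
            container_client.download_blob(name).readinto(f)
```

With the variables defined earlier, the call would be `download_blobs(account_name, key, query, blob_items)`. Passing the key as an argument also keeps it out of the shell command line.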

In [None]:
# Download a batch of items matching a pattern (uncomment to run)
#! az storage blob download-batch --account-key {key} --account-name {account_name} \
#    --source {query} --destination . --pattern "tables/*.csv"