# Data Management Sample
This sample shows you how to use Azure Quantum with external data sources such as Azure Blob Storage.

## Work with data in Azure Blob Storage
This section will show you how to work with JSON and other file types in Blob Storage.

Prerequisites:
- Storage Account

You must have an Azure Storage account deployed to use Blob Storage for your data. If you are accessing this sample through Azure Quantum hosted notebooks, you should already have a storage account set up for your [Quantum Workspace](https://docs.microsoft.com/azure/quantum/how-to-create-workspace?tabs=tabid-quick).
You can find details for this storage account by navigating to your Azure Quantum Workspace in the portal. The storage account is shown in the 'Essentials' section at the top; click its name to view it in the portal.

If you wish to use a different storage account, you can absolutely do so. More information on setting up a storage account can be found [here](https://docs.microsoft.com/azure/storage/common/storage-account-create?tabs=azure-portal).


### Setup
You can find more information about the Blob Storage SDK [here](https://docs.microsoft.com/python/api/overview/azure/storage-blob-readme?view=azure-python).

In [None]:
# Import Azure Storage Python SDK & json utils
from azure.storage.blob import BlobServiceClient
import json

In [None]:
# Log in to the Azure CLI (only need to do once per session)
!az login

### Connect to Blob Storage

You can choose to set up a Blob Storage container for anonymous access, which means no authentication is required. In that case you do not need to provide any credentials and can simply use the URL to refer directly to your file (see the `Download data from URL` section for details on how to do this).
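
For a container with anonymous read access, a blob is normally reachable at `https://<account>.blob.core.windows.net/<container>/<blob>`. The sketch below builds such a URL. The helper name `public_blob_url` and the account name `mystorageaccount` are placeholders for illustration, and the default endpoint suffix assumes the Azure public cloud.

```python
from urllib.parse import quote


def public_blob_url(account_name, container, blob_name, endpoint_suffix="core.windows.net"):
    """Build the HTTPS URL of a blob (hypothetical helper, not part of the SDK)."""
    # Percent-encode the blob name but keep "/" so virtual folders survive
    encoded_blob = quote(blob_name, safe="/")
    return f"https://{account_name}.blob.{endpoint_suffix}/{container}/{encoded_blob}"


# "mystorageaccount" is a placeholder; replace it with your storage account name
url = public_blob_url("mystorageaccount", "data", "results/ozone results.csv")
print(url)
```

The resulting URL can be passed to `requests.get` in the same way as any other public URL.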

There are several authentication options available; feel free to choose one from the selection shared [here](https://docs.microsoft.com/python/api/overview/azure/storage-blob-readme?view=azure-python#types-of-credentials). For the purposes of this sample, we will use the [connection string method](https://docs.microsoft.com/python/api/overview/azure/storage-blob-readme?view=azure-python#creating-the-client-from-a-connection-string).

You can use the storage account attached to your Azure Quantum Workspace or any other storage account that you like. For further information on the Azure Blob Storage service, please refer to the docs [here](https://docs.microsoft.com/azure/storage/blobs/storage-blobs-introduction).

> Don't forget to update the details in the cell below to match those for your storage account and resource group. 

In [None]:
# Get the connection string for your storage account by specifying the name of the storage account as well as the resource group you deployed it into
# Update the last two parameters to the names for your storage account and resource group
az_cli_output = !az storage account show-connection-string -g my-resource-group-name -n my-storage-account-name
az_cli_output = json.loads("".join(az_cli_output))

# Extract connection string from CLI output
connection_string = az_cli_output["connectionString"]

# Create blob service client to connect with the Blob Storage service 
service_client = BlobServiceClient.from_connection_string(conn_str=connection_string)
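
The `!az ...` call returns the CLI output as a list of lines, which is why the cell above joins them before parsing. If the command fails, that output may not be valid JSON. The following is a minimal sketch of a more defensive version, assuming the usual `Key=Value;Key=Value` connection string layout. The helper names `extract_connection_string` and `connection_string_parts` are hypothetical, and the sample output is made up.

```python
import json


def extract_connection_string(cli_output_lines):
    """Parse `az storage account show-connection-string` output (hypothetical helper)."""
    text = "".join(cli_output_lines)
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        # The output may be a plain-text message rather than JSON if the command failed
        raise ValueError(f"Unexpected Azure CLI output: {text!r}") from err
    if "connectionString" not in parsed:
        raise ValueError("Azure CLI output did not contain 'connectionString'")
    return parsed["connectionString"]


def connection_string_parts(connection_string):
    """Split a 'Key=Value;Key=Value' connection string into a dict."""
    parts = {}
    for segment in connection_string.split(";"):
        if segment:
            # Split on the first "=" only: values may end in "=" (base64 padding)
            key, _, value = segment.partition("=")
            parts[key] = value
    return parts


# Made-up sample output for illustration only
sample_lines = [
    "{",
    '  "connectionString": "DefaultEndpointsProtocol=https;AccountName=demo;AccountKey=abc==;EndpointSuffix=core.windows.net"',
    "}",
]
parts = connection_string_parts(extract_connection_string(sample_lines))
print(parts["AccountName"])  # avoid printing AccountKey, it is a secret
```

Checking the account name this way confirms you are connected to the intended storage account without echoing the key into the notebook output.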

### Create a Blob Storage container
If you don't already have a container set up in your Blob Storage account to store your files, you can use the following code snippet to create one.

In [None]:
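from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import ContainerClient

# Create container client using the connection string for your storage account
# Replace "data" with your choice of container name or keep "data" for ease of use during the rest of this sample
container_client = ContainerClient.from_connection_string(conn_str=connection_string, container_name="data")

# create_container raises ResourceExistsError if the container already exists
try:
    container_client.create_container()
except ResourceExistsError:
    print("Container already exists.")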

### List files in a Blob Storage container

The following code lists the files you have already stored in the container. You can also view these files [in the portal](https://docs.microsoft.com/azure/storage/blobs/storage-quickstart-blobs-portal) or through the [Storage Explorer desktop application](https://docs.microsoft.com/azure/storage/blobs/quickstart-storage-explorer).

In [None]:
from azure.storage.blob import ContainerClient

# Create container client using the connection string for your storage account
container = ContainerClient.from_connection_string(conn_str=connection_string, container_name="data")

# Fetch list of blobs in the container
blob_list = container.list_blobs()

# Print list of available blobs
for blob in blob_list:
    print(blob.name)
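
`list_blobs` also accepts a `name_starts_with` argument to restrict the results to a prefix, and each returned item carries properties such as `name` and `size`. The sketch below shows one way to use both. `summarize_blobs` is a hypothetical helper, and `FakeContainer` is only a stand-in so the snippet runs without a storage account; in the notebook you would pass the `container` object created above.

```python
from types import SimpleNamespace


def summarize_blobs(container_client, prefix=None):
    """Return sorted (name, size) pairs for blobs whose names start with `prefix`."""
    blobs = container_client.list_blobs(name_starts_with=prefix)
    return sorted((blob.name, blob.size) for blob in blobs)


class FakeContainer:
    """Stand-in for ContainerClient so this sketch runs offline (illustration only)."""

    def __init__(self, blobs):
        self._blobs = blobs

    def list_blobs(self, name_starts_with=None):
        prefix = name_starts_with or ""
        return [b for b in self._blobs if b.name.startswith(prefix)]


fake = FakeContainer([
    SimpleNamespace(name="results/ozone_results.csv", size=120),
    SimpleNamespace(name="file.json", size=42),
])
for name, size in summarize_blobs(fake, prefix="results/"):
    print(f"{name}: {size} bytes")
```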

### Download JSON data

In [None]:
# Create blob client for your file
blob_client = service_client.get_blob_client(container="data", blob="file.json")

# Download file from blob storage & parse JSON into input_data object
download_stream = blob_client.download_blob()
input_data = json.loads(download_stream.readall())

### Download text/other data

In [None]:
# Create blob client for your file
blob_client = service_client.get_blob_client(container="data", blob="file.txt")

# Download file from blob storage & assign to input_data bytes object
download_stream = blob_client.download_blob()
input_data = download_stream.readall()
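
`readall()` returns raw bytes, so text formats need to be decoded before use. As a sketch, assuming the file is UTF-8 encoded CSV with a header row, the bytes can be turned into a list of dictionaries as follows. The helper name `rows_from_csv_bytes` and the sample bytes are illustrative only.

```python
import csv
import io


def rows_from_csv_bytes(raw_bytes, encoding="utf-8"):
    """Decode downloaded bytes and parse them as CSV with a header row."""
    text = raw_bytes.decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Sample bytes standing in for `input_data` from the cell above
sample = b"Type,one-norm\ntermA,1.5\ntermB,2.25\n"
rows = rows_from_csv_bytes(sample)
print(rows[0]["Type"], float(rows[1]["one-norm"]))
```

All values come back as strings, so convert numeric columns explicitly (as with `float` above).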

### Upload JSON data

In [None]:
# Define object for upload
json_data_for_upload = {
    "data": {
        "type": "demo",
        "value": 123
    }
}

# Create blob client for the new file
blob_client = service_client.get_blob_client(container="data", blob="uploaded_json_file.json")

# Upload JSON data
# Enable/disable overwrite to specify behaviour when a blob already exists with that name
blob_client.upload_blob(json.dumps(json_data_for_upload), overwrite=True)

### Upload text/other data

In [None]:
# Define text for upload
text_data_for_upload = "Some text to upload to a file. File could be .csv, .txt or any other format you choose."

# Create blob client for the new file (make sure to choose the appropriate extension e.g. .txt or .csv)
blob_client = service_client.get_blob_client(container="data", blob="uploaded_text_file.txt")

# Upload data
# Enable/disable overwrite to specify behaviour when a blob already exists with that name
blob_client.upload_blob(text_data_for_upload, overwrite=True)

## Work with data from an external URL
The following code snippet shows you how to load data from a URL (with no authentication). Once downloaded, the data can be saved to a file in Blob Storage using the upload methods shown earlier in this sample.

### Download data from URL

In [None]:
# Import requests module
import requests

# Paste the URL for your file here
url = "url-to-data"

# Get data from URL and raise an error if the request failed (e.g. 404)
r = requests.get(url, timeout=30)
r.raise_for_status()

# Work with the data - e.g. could upload to Blob Storage after processing
# See https://www.w3schools.com/PYTHON/ref_requests_response.asp for further properties of the requests.Response object which you can use for processing
data_bytes = r.content

## Save data to local storage (temporary storage)
You can use the following code snippets to write data to local storage. Local storage is treated as a temp storage location as it is hard to locate local files afterwards and the storage may not persist across sessions. It is therefore recommended that you only use this for temporary file operations and use the Blob Storage access methods provided in this sample to more permanently store your data.

### Save JSON data to file in local storage

In [None]:
# Write JSON to local storage
# input_data is the object parsed in the `Download JSON data` section
with open("file.json", 'w') as f:
    json.dump(input_data, f)

### Save text/other data to file in local storage

In [None]:
# Write text to local storage
with open("file.txt", 'w') as f:
    f.write("Text to write to file.")

### Save bytes to file in local storage

In [None]:
# Write file bytes to local storage
with open("file.extension", 'wb') as f:
    f.write(r.content) # Bytes content from requests.Response object previously returned from get request
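
Because local files are only meant to be temporary, Python's standard `tempfile` module is one option for keeping them out of the working directory and cleaning them up automatically. This is a minimal sketch and not part of the original sample; the `data` dictionary mirrors the demo object used in the upload section.

```python
import json
import os
import tempfile

data = {"data": {"type": "demo", "value": 123}}

# TemporaryDirectory removes the directory and its contents when the block exits
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file.json")
    with open(path, "w") as f:
        json.dump(data, f)
    with open(path) as f:
        round_trip = json.load(f)
    existed_inside = os.path.exists(path)

# The file is gone once the with-block has finished
print(round_trip == data, existed_inside, os.path.exists(path))
```

Anything you want to keep should be uploaded to Blob Storage before the `with` block ends.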

## End-to-end example

The sample below shows how to load a spin-orbital Hamiltonian from a file containing orbital integrals. Features of the Hamiltonian are then computed and the results are saved to a .csv file in Blob Storage. The input data is loaded from a publicly accessible GitHub URL.

The code for this example is taken directly from the [chemistry section of the samples repo](https://github.com/microsoft/Quantum/tree/main/samples/chemistry). This specific example computes the features for Ozone:

> Ozone is formed from dioxygen by the action of ultraviolet (UV) light and electrical discharges within the Earth's atmosphere. It is present in very low concentrations throughout the latter, with its highest concentration high in the ozone layer of the stratosphere, which absorbs most of the Sun's UV radiation. Quantum mechanical excited-state studies, including localization of avoided crossings and conical intersections, play a critical role in understanding its role in Earth's atmosphere.

### Setup
You only need to run the two cells below if you have not already run the cells from the `Setup` and `Connect to Blob Storage` sections at the start of this sample.

In [None]:
# Log in to the Azure CLI (only need to do once per session)
!az login

In [None]:
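# Import the modules used below (needed if you skipped the `Setup` section)
from azure.storage.blob import BlobServiceClient
import json

# Get the connection string for your storage account by specifying the name of the storage account as well as the resource group you deployed it into
# Update the last two parameters to the names for your storage account and resource group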
az_cli_output = !az storage account show-connection-string -g my-resource-group-name -n my-storage-account-name
az_cli_output = json.loads("".join(az_cli_output))

# Extract connection string from CLI output
connection_string = az_cli_output["connectionString"]

# Create blob service client to connect with the Blob Storage service 
service_client = BlobServiceClient.from_connection_string(conn_str=connection_string)

### Compute Hamiltonian features for the Ozone molecule
You can find further orbital integral data in the samples repo [here](https://github.com/microsoft/Quantum/tree/main/samples/chemistry/IntegralData/YAML).

To load data from a GitHub URL using the `requests` module, you need to use the 'raw' content URL (it should have `raw.githubusercontent.com` as the domain).

You can find this URL by navigating to the file you would like to use and selecting 'Raw' from the panel at the top of the document view. This should reload the page - use the URL for the page that just loaded.
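
The raw URL can also be derived from the regular file page URL, which has the form `https://github.com/<owner>/<repo>/blob/<branch>/<path>`. The sketch below performs that conversion. `to_raw_github_url` is a hypothetical helper, and the page URL shown is inferred from the raw URL used in this sample.

```python
from urllib.parse import urlsplit


def to_raw_github_url(url):
    """Convert a github.com file URL (.../blob/<branch>/<path>) to its raw URL."""
    parts = urlsplit(url)
    if parts.netloc == "raw.githubusercontent.com":
        return url  # already a raw URL
    segments = parts.path.strip("/").split("/")
    # Expected shape: <owner>/<repo>/blob/<branch>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"Not a GitHub file URL: {url}")
    owner, repo = segments[0], segments[1]
    rest = "/".join(segments[3:])  # branch followed by the file path
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{rest}"


page_url = "https://github.com/microsoft/Quantum/blob/main/samples/chemistry/IntegralData/YAML/O3_ccpvtz/o3_13_6_6_90deg_ccvtz.yaml"
print(to_raw_github_url(page_url))
```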

In [None]:
import numpy as np
from numpy import linalg as LA
from qsharp.chemistry import load_broombridge, load_fermion_hamiltonian, IndexConvention
import requests

# Fetch data from URL and raise an error if the download failed
# See note above for instructions on how to test with other files
r = requests.get("https://raw.githubusercontent.com/microsoft/Quantum/main/samples/chemistry/IntegralData/YAML/O3_ccpvtz/o3_13_6_6_90deg_ccvtz.yaml")
r.raise_for_status()

ozone_filename = "o3_13_6_6_90deg_ccvtz.yaml"

# Write file to local storage (temp storage)
with open(ozone_filename, 'wb') as f:
    f.write(r.content) 

print(f"Processing the following file: {ozone_filename}\n")

# Load Broombridge schema for Ozone
broombridge = load_broombridge(ozone_filename)

# Load Hamiltonian data
general_hamiltonian = broombridge.problem_description[0].load_fermion_hamiltonian(index_convention=IndexConvention.UpDown)

# Calculate one-norms and save results to output string
output = "Type,one-norm\n"
print("End of file. Computing One-norms:")
for term, matrix in general_hamiltonian.terms:
    one_norm = LA.norm(np.asarray([v for k, v in matrix], dtype=np.float32), ord=1)
    output += f"{term},{one_norm}\n"
    print(f"\tOne-norm for term type {term}: {one_norm}")

print()

# Create blob client for the new .csv output file
blob_client = service_client.get_blob_client(container="data", blob="ozone_results.csv")

# Upload data to Blob Storage
# Enable/disable overwrite to specify behaviour when a blob already exists with that name
blob_client.upload_blob(output, overwrite=True)
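
For a one-dimensional array, `LA.norm(..., ord=1)` is the sum of the absolute values of the entries. The sketch below reproduces that calculation in plain Python and builds the CSV text with the standard `csv` module instead of string concatenation. The helper names and the coefficients are made up for illustration and are not taken from the ozone data.

```python
import csv
import io


def one_norm(values):
    """One-norm of a vector: the sum of the absolute values of its entries."""
    return sum(abs(v) for v in values)


def results_to_csv(results):
    """Serialize (term type, one-norm) pairs to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Type", "one-norm"])
    writer.writerows(results)
    return buffer.getvalue()


# Made-up coefficients for illustration; they are not taken from the ozone data
coefficients = {"termA": [0.5, -1.5, 2.0], "termB": [-0.25, 0.25]}
results = [(name, one_norm(values)) for name, values in coefficients.items()]
csv_text = results_to_csv(results)
print(csv_text)
```

Using `csv.writer` also takes care of quoting if a term label ever contains a comma.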
