# Loading to Blob Storage

## Useful Resources

* https://docs.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-python


## Naming

Azure Blob Storage has the same concepts as the AWS S3 service but uses a different naming convention. The 1:1 mappings from AWS to Azure are:

* S3 to Blob storage
* Bucket to Container
* Key to Blob Name
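
To make the mapping concrete, here is a minimal sketch that turns an S3 URI into the matching blob URL. The helper `s3_uri_to_blob_url` and the account name `myaccount` are hypothetical. The sketch assumes the container has the same name as the bucket and that the storage account uses the default public endpoint `blob.core.windows.net`.

```python
from urllib.parse import quote, urlparse

# AWS S3 term -> Azure Blob Storage term, as listed above
S3_TO_AZURE = {"S3": "Blob storage", "Bucket": "Container", "Key": "Blob Name"}


def s3_uri_to_blob_url(s3_uri, account_name):
    """Map s3://<bucket>/<key> to https://<account>.blob.core.windows.net/<container>/<blob name>."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("expected a URI of the form s3://<bucket>/<key>")
    container = parsed.netloc            # Bucket -> Container
    blob_name = parsed.path.lstrip("/")  # Key -> Blob Name
    # quote() percent-encodes unsafe characters but keeps "/" as the folder separator
    return f"https://{account_name}.blob.core.windows.net/{container}/{quote(blob_name)}"


print(s3_uri_to_blob_url("s3://my-bucket/test/data.parquet", "myaccount"))
```

As in S3, the "folders" in a blob name are virtual: the name is a single string and `/` is only a naming convention.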

In [None]:
import uuid
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, __version__


try:
    print("Azure Blob Storage v" + __version__ + " - Python quickstart sample")

    # Quick start code goes here

except Exception as ex:
    print('Exception:')
    print(ex)

## Set Environment Variables

### On Windows

```powershell
setx AZURE_STORAGE_CONNECTION_STRING "<yourconnectionstring>"
```

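You can fetch the connection string from the Azure portal: open the storage account resource, type "connection" in the settings search box, and copy the connection string listed under key1. Note that `setx` only affects terminals opened afterwards, so restart the console (and the notebook server) before reading the variable.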
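
Before creating a client it can help to confirm that the variable is set and looks like an account-key connection string. The sketch below uses only the standard library. The helper names `parse_connection_string` and `load_connection_string` are hypothetical, and the `AccountName`/`AccountKey` check assumes the key1 style of connection string rather than a SAS-based one.

```python
import os


def parse_connection_string(conn_str):
    """Split 'Key=Value;Key=Value' pairs into a dict."""
    parts = {}
    for segment in conn_str.split(";"):
        if not segment.strip():
            continue
        # Split on the first "=" only: account keys are base64 and may end in "="
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"malformed segment: {segment!r}")
        parts[key.strip()] = value
    return parts


def load_connection_string(env_var="AZURE_STORAGE_CONNECTION_STRING"):
    """Return the connection string from the environment, failing early if unusable."""
    conn_str = os.getenv(env_var)
    if not conn_str:
        raise RuntimeError(f"{env_var} is not set; open a new terminal after running setx")
    parts = parse_connection_string(conn_str)
    missing = [k for k in ("AccountName", "AccountKey") if k not in parts]
    if missing:
        raise RuntimeError(f"connection string is missing: {', '.join(missing)}")
    return conn_str
```

The returned string can then be passed to `BlobServiceClient.from_connection_string`, which otherwise fails with a less obvious error when the variable is missing.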

In [None]:
import os
import shutil
import pandas

connect_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
container_name = "testing"

# Create a local directory to hold blob data
local_path = "./data"
try:
    os.mkdir(local_path)
except FileExistsError:
    shutil.rmtree(local_path)
    os.mkdir(local_path)

# Create a file in the local data directory to upload and download
local_file_name = "alessio" + ".txt"
upload_file_path = os.path.join(local_path, local_file_name)

# Write text to the file
with open(upload_file_path, 'w') as file:
    file.write("Hello, World!")

# Create a blob client with the blob name inside the virtual folder "/test"
# (the trailing slash keeps the folder and the file name separate)
blob_name = "/test/" + local_file_name
blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)

print("\nUploading to Azure Storage as blob:\n\t" + local_file_name)

# Upload the created file
with open(upload_file_path, "rb") as data:
    blob_client.upload_blob(data, overwrite=True)

# Now upload a Parquet file, data.parquet
print("Uploading the pandas parquet file")
# Create the file (to_parquet needs a Parquet engine such as pyarrow installed)
parquet_path = os.path.join(local_path, "data.parquet")
data = {"country": ["italy", "germany", "spain"], "sales": [100, 40, 20]}
df = pandas.DataFrame(data)
df.to_parquet(parquet_path)
# We need a new client as we are using a new blob_name
blob_name = "/test/" + "data.parquet"
blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)

# overwrite=True lets the cell be re-run without failing on an existing blob
with open(parquet_path, "rb") as data:
    blob_client.upload_blob(data, overwrite=True)
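
The cell above assumes that the container `testing` already exists, and it does not confirm that the blobs arrived. A minimal sketch of both checks follows. The helpers `ensure_container` and `list_blob_names` are hypothetical. They rely on `get_container_client`, `exists`, `create_container`, and `list_blobs(name_starts_with=...)`, assuming a recent azure-storage-blob 12.x release.

```python
def ensure_container(service_client, name):
    """Return a container client, creating the container first if it is missing."""
    container_client = service_client.get_container_client(name)
    if not container_client.exists():
        container_client.create_container()
    return container_client


def list_blob_names(container_client, prefix):
    """Return the sorted names of all blobs whose name starts with the prefix."""
    return sorted(
        blob.name for blob in container_client.list_blobs(name_starts_with=prefix)
    )
```

With the objects from the cell above, `ensure_container(blob_service_client, container_name)` would run before the uploads. Calling `list_blob_names(container_client, "/test")` afterwards should return the names of the uploaded blobs.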