# Day 2: Azure Environment Setup & Data Upload

This notebook connects to Azure Data Lake Storage Gen2 (ADLS Gen2) and uploads a sample COVID-19 data file using the Azure SDK for Python.

## 📦 Prerequisites
- Azure subscription
- ADLS Gen2 account and container created
- An Azure RBAC role that grants blob data access (for example, Storage Blob Data Contributor) assigned to your identity
- `azure-storage-blob` and `azure-identity` packages installed

In [None]:
# Install packages (uncomment if needed)
# %pip install azure-storage-blob azure-identity
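Before running the later cells, it can help to confirm that both packages are importable. The following is a minimal sketch using only the standard library; the helper name `find_missing_modules` is made up for this example and is not part of the Azure SDK.

```python
import importlib.util

# Import names of the two packages listed in the prerequisites.
REQUIRED_MODULES = ["azure.identity", "azure.storage.blob"]


def find_missing_modules(module_names):
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. "azure") is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If anything is reported as missing, uncomment the `%pip install` line above and re-run that cell.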

In [None]:
# Import required libraries
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient
import os

## 🔐 Azure Configuration (Update placeholders with your values)

In [None]:
# Set your storage account URL and container name
account_url = 'https://<your_storage_account>.blob.core.windows.net'
container_name = '<your_container_name>'

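# DefaultAzureCredential tries several sources in turn, including environment
# variables, a managed identity (when running in Azure), and an Azure CLI login (`az login`)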
credential = DefaultAzureCredential()
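A common mistake at this step is leaving the `<your_storage_account>` placeholder in the URL. As a rough sketch, the account URL can be built from the account name with a basic format check (storage account names are 3 to 24 characters, lowercase letters and digits only). The helper `build_account_url` and the environment variable `STORAGE_ACCOUNT_NAME` are assumptions made for this example, not names the notebook or Azure defines.

```python
import os
import re

# Storage account names are 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME_PATTERN = re.compile(r"^[a-z0-9]{3,24}$")


def build_account_url(account_name):
    """Return the Blob endpoint URL for a storage account name."""
    if not _ACCOUNT_NAME_PATTERN.fullmatch(account_name):
        raise ValueError(f"Invalid storage account name: {account_name!r}")
    return f"https://{account_name}.blob.core.windows.net"


# STORAGE_ACCOUNT_NAME is a hypothetical variable name chosen for this sketch.
name_from_env = os.environ.get("STORAGE_ACCOUNT_NAME")
if name_from_env:
    account_url = build_account_url(name_from_env)
```

Reading the name from the environment keeps account details out of the notebook itself; an unfilled placeholder fails the format check immediately instead of surfacing later as a connection error.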

In [None]:
# Connect to the blob service
blob_service_client = BlobServiceClient(account_url=account_url, credential=credential)
container_client = blob_service_client.get_container_client(container_name)
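Creating the clients does not contact Azure, so a wrong container name or a missing role assignment only shows up on the first real request. One way to surface that early is a small check like the sketch below. It relies on `ContainerClient.exists()` and the `container_name` attribute from `azure-storage-blob`; the wrapper name `check_container` is made up for this example.

```python
def check_container(container_client):
    """Raise a clear error if the container does not exist."""
    # ContainerClient.exists() returns True or False; authentication or
    # network failures surface as exceptions raised by the SDK.
    if not container_client.exists():
        raise RuntimeError(
            f"Container {container_client.container_name!r} was not found."
        )
    return True
```

Calling `check_container(container_client)` in the next cell would then fail fast, before any upload is attempted.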

## 📤 Upload Sample Data to Azure Data Lake

In [None]:
# Upload the sample CSV file to the container
local_file_path = 'data-samples/cleaned/sample_countries.csv'
blob_name = 'raw/sample_countries.csv'

with open(local_file_path, 'rb') as data:
    container_client.upload_blob(name=blob_name, data=data, overwrite=True)

print(f"Uploaded {blob_name} to Azure Data Lake.")
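The print statement only confirms that the call returned without raising. To check the result more directly, a sketch like the following compares the blob's size with the local file's size. It assumes `container_client` behaves like the `ContainerClient` created earlier; the function name `verify_upload` is hypothetical.

```python
import os


def verify_upload(container_client, local_file_path, blob_name):
    """Return True if the blob exists with the same byte size as the local file."""
    blob_client = container_client.get_blob_client(blob_name)
    if not blob_client.exists():
        return False
    remote_size = blob_client.get_blob_properties().size
    return remote_size == os.path.getsize(local_file_path)
```

For the upload above, `verify_upload(container_client, local_file_path, blob_name)` should return `True`. A size comparison is a cheap sanity check, not a content checksum.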

✅ You have now uploaded a sample COVID-19 data file to Azure Data Lake Storage. This file can be accessed and processed from Databricks in the next step.