## Azure Databricks Environment Configuration
## Resource Group: sm-databricks-rg

1. Create Azure Databricks Workspace "sm-databricks-ws"
- Create a new Azure Databricks workspace named sm-databricks-ws in the resource group sm-databricks-rg.

2. Create Azure SQL Server and Azure SQL Database
- Server: Create an Azure SQL logical server named sm-srv. Its fully qualified domain name (FQDN) will be sm-srv.database.windows.net.
- Database: On the server above, create an Azure SQL Database named sm-db.
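For reference, the server FQDN and database name above are what a JDBC connection string is built from. The following is a minimal sketch, assuming the default SQL Server port 1433 and the usual Azure SQL encryption settings; the helper name `build_jdbc_url` is hypothetical and not part of the original notebook.

```python
def build_jdbc_url(server_fqdn: str, database: str, port: int = 1433) -> str:
    """Assemble a SQL Server JDBC URL for an Azure SQL Database.

    The port and the encryption options are assumptions that match the
    connection string format commonly used for Azure SQL.
    """
    return (
        f"jdbc:sqlserver://{server_fqdn}:{port};"
        f"database={database};"
        "encrypt=true;trustServerCertificate=false;"
        "hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
    )


# Server and database names taken from step 2.
jdbc_url = build_jdbc_url("sm-srv.database.windows.net", "sm-db")
print(jdbc_url)
```

A URL of this shape is the kind of value that would be stored as a secret rather than hard-coded in a notebook.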

3. Attach the Workspace to a Unity Catalog Metastore
- Assign the newly created Databricks workspace (sm-databricks-ws) to an existing Unity Catalog metastore. Clusters in the workspace can then use Unity Catalog for centralized governance.

4. Configure Access to Cloud Storage
4.1. Create an Access Connector for Azure Databricks
- Create an Access Connector for Azure Databricks resource named sm-databricks-ext-ac. This provides a managed identity for Azure Databricks.

4.2. Create a Storage Account (Azure Data Lake Storage Gen2)
- Provision a new Azure Data Lake Storage Gen2 (ADLS Gen2) account.

4.3. Assign Storage Account Role to the Access Connector
- On the ADLS Gen2 storage account, assign the Storage Blob Data Contributor role to the Access Connector's managed identity (sm-databricks-ext-ac). This grants read, write, and delete permissions to the data in the storage account.
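The role assignment can also be scripted instead of done in the portal. The sketch below only assembles an `az role assignment create` command line and prints it; nothing is executed. The subscription ID and principal ID are placeholders, the storage account name `smdatabricksext` is inferred from the `abfss://` URLs used in this notebook, and placing the account in `sm-databricks-rg` is an assumption.

```python
import shlex

SUBSCRIPTION_ID = "<subscription-id>"              # placeholder, fill in
RESOURCE_GROUP = "sm-databricks-rg"                # assumption: same resource group
STORAGE_ACCOUNT = "smdatabricksext"                # inferred from the abfss URLs
PRINCIPAL_ID = "<access-connector-principal-id>"   # placeholder, fill in


def storage_account_scope(subscription_id: str, resource_group: str, account: str) -> str:
    """Build the Azure resource ID used as the role assignment scope."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )


# Argument list for the Azure CLI; printed here, not run.
role_assignment_cmd = [
    "az", "role", "assignment", "create",
    "--assignee-object-id", PRINCIPAL_ID,
    "--assignee-principal-type", "ServicePrincipal",
    "--role", "Storage Blob Data Contributor",
    "--scope", storage_account_scope(SUBSCRIPTION_ID, RESOURCE_GROUP, STORAGE_ACCOUNT),
]
print(shlex.join(role_assignment_cmd))
```

Scoping the assignment to the single storage account, rather than to the whole resource group, keeps the managed identity's permissions as narrow as this setup requires.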

4.4. Create a Storage Credential in Unity Catalog
- Within the Unity Catalog interface (Catalog > Credentials), create a new Storage Credential.

- Name: sm_databricks_ext_sc
- Type: Azure Managed Identity
- Azure Managed Identity: Use the managed identity of the Access Connector sm-databricks-ext-ac. Unity Catalog uses this credential to authenticate to the storage account without storing keys or secrets.
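When creating the storage credential, the Access Connector is identified by its full Azure resource ID. The following sketch shows how that ID is composed. The subscription ID is a placeholder, the helper name `access_connector_id` is hypothetical, and it is assumed that the connector lives in the resource group `sm-databricks-rg`.

```python
def access_connector_id(subscription_id: str, resource_group: str, connector_name: str) -> str:
    """Compose the Azure resource ID of an Access Connector for Azure Databricks."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Databricks/accessConnectors/{connector_name}"
    )


connector_id = access_connector_id(
    "<subscription-id>",       # placeholder, fill in
    "sm-databricks-rg",        # assumption: connector is in this resource group
    "sm-databricks-ext-ac",    # connector name from step 4.1
)
print(connector_id)
```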

In [0]:
%sql
SELECT current_metastore()

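### 1. Create Storage Credential for the Azure Access Connector

The storage credential sm_databricks_ext_sc was created through the Catalog UI in step 4.4, so no notebook cell is needed here.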

### 2. Create External Location

In [0]:
%sql
CREATE EXTERNAL LOCATION sm_ext_dl_content
  URL 'abfss://content@smdatabricksext.dfs.core.windows.net/'
  WITH (STORAGE CREDENTIAL sm_databricks_ext_sc)
  COMMENT 'External location for Social Media Data Engineering Project'
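The same `abfss://` root is reused for the external location, the catalog, the schemas, and the volume, so it can help to build it in one place. Below is a minimal sketch with a hypothetical helper `abfss_url`; the container and account names are taken from the statement above.

```python
from urllib.parse import urlparse


def abfss_url(container: str, account: str, path: str = "") -> str:
    """Build an ADLS Gen2 URL: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    base = f"abfss://{container}@{account}.dfs.core.windows.net/"
    return base + path.strip("/")


location_url = abfss_url("content", "smdatabricksext")
print(location_url)

# Sanity check: the container is the user-info part, the account host follows "@".
parsed = urlparse(location_url)
print(parsed.scheme, parsed.username, parsed.hostname)
```

Inside a Databricks notebook, the created location could then be inspected with `spark.sql("DESCRIBE EXTERNAL LOCATION sm_ext_dl_content")`.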

In [0]:
#%fs ls 'abfss://content@smdatabricksext.dfs.core.windows.net'

### 3. Create Catalog

In [0]:
%sql
CREATE CATALOG content
   MANAGED LOCATION 'abfss://content@smdatabricksext.dfs.core.windows.net/'

### 4. Create Target (DLT) and Landing Schemas

In [0]:
%sql
CREATE SCHEMA IF NOT EXISTS content.target
  MANAGED LOCATION 'abfss://content@smdatabricksext.dfs.core.windows.net/target';
CREATE SCHEMA IF NOT EXISTS content.landing
  MANAGED LOCATION 'abfss://content@smdatabricksext.dfs.core.windows.net/landing';
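Because both schemas follow the same pattern (schema name equals the sub-folder name), the statements could also be generated from Python. This is a sketch only; the helper `create_schema_sql` is hypothetical, and the `spark.sql` call is left commented out because it requires a Databricks session.

```python
CATALOG = "content"
BASE_URL = "abfss://content@smdatabricksext.dfs.core.windows.net"


def create_schema_sql(catalog: str, schema: str, base_url: str) -> str:
    """Return a CREATE SCHEMA statement whose managed location is <base_url>/<schema>."""
    return (
        f"CREATE SCHEMA IF NOT EXISTS {catalog}.{schema} "
        f"MANAGED LOCATION '{base_url}/{schema}'"
    )


statements = [create_schema_sql(CATALOG, name, BASE_URL) for name in ("target", "landing")]
for stmt in statements:
    print(stmt)
    # spark.sql(stmt)  # uncomment inside a Databricks notebook
```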

### 5. Create External Volume

TODO: Remove the existing json_files_data volume, create a new schema content.landing, and place the volume there.

In [0]:
%sql
USE CATALOG content;
USE SCHEMA landing;

In [0]:
%sql
CREATE EXTERNAL VOLUME IF NOT EXISTS json_files_data
  LOCATION 'abfss://content@smdatabricksext.dfs.core.windows.net/landing/json_files_data'
  COMMENT 'Storage location for JSON files (user details)'
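Once the volume exists, its files are addressed through a `/Volumes/<catalog>/<schema>/<volume>/...` path rather than the raw `abfss://` URL. The sketch below builds that path with a hypothetical helper `volume_path`; the file name `users.json` is purely illustrative and does not come from the original notebook.

```python
def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a Unity Catalog volume path: /Volumes/<catalog>/<schema>/<volume>/<parts...>."""
    return "/".join(["/Volumes", catalog, schema, volume, *parts])


json_dir = volume_path("content", "landing", "json_files_data")
example_file = volume_path("content", "landing", "json_files_data", "users.json")  # hypothetical file
print(json_dir)
print(example_file)

# Inside a Databricks notebook the directory could then be read with, for example:
# df = spark.read.json(json_dir)
```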

### Test Connection to Azure SQL Database Using Azure Key Vault Secrets

In [0]:
# Read the users table from Azure SQL over JDBC. All connection details
# (URL, table name, login, password) come from the secret scope "sm-secret-scope".
employees_table = (spark.read
  .format("jdbc")
  .option("url", dbutils.secrets.get(scope="sm-secret-scope", key="social-media-project-db-jdbc"))
  .option("dbtable", dbutils.secrets.get(scope="sm-secret-scope", key="social-media-project-db-tab-acc-users"))
  .option("user", dbutils.secrets.get(scope="sm-secret-scope", key="social-media-project-dblog"))
  .option("password", dbutils.secrets.get(scope="sm-secret-scope", key="social-media-project-secret"))
  .load()
)

In [0]:
employees_table.display()
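The four secret lookups above can be collected into a single options dictionary, which keeps the scope name in one place and makes the mapping testable outside Databricks. This is a sketch under assumptions: the helper `jdbc_options` is hypothetical, and the secret getter is passed in as a function so that `dbutils` is only needed inside the notebook.

```python
from typing import Callable, Dict

SCOPE = "sm-secret-scope"

# JDBC option name -> secret key, as used in the cell above.
SECRET_KEYS = {
    "url": "social-media-project-db-jdbc",
    "dbtable": "social-media-project-db-tab-acc-users",
    "user": "social-media-project-dblog",
    "password": "social-media-project-secret",
}


def jdbc_options(get_secret: Callable[[str, str], str]) -> Dict[str, str]:
    """Resolve every JDBC option from the secret scope using the given getter."""
    return {option: get_secret(SCOPE, key) for option, key in SECRET_KEYS.items()}


# Inside a Databricks notebook:
# options = jdbc_options(lambda scope, key: dbutils.secrets.get(scope=scope, key=key))
# employees_table = spark.read.format("jdbc").options(**options).load()
```

Outside Databricks, a stub getter such as `lambda scope, key: f"{scope}/{key}"` is enough to check that each option resolves to the intended secret key.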