### Configuring an external location for access from Databricks

### Prerequisites

- **Create a storage account**
  - You need an Azure Storage Account, preferably with Data Lake Storage Gen2 (hierarchical namespace) enabled. This is where your data resides and where Databricks reads and writes it.

- **Create a Databricks Access Connector**
  - An Azure Databricks Access Connector is an Azure resource backed by a managed identity. It lets Databricks access Azure resources (like your storage account) securely, without managing secrets or keys.

- **Assign the Access Connector the Storage Blob Data Contributor role on the storage account**
  - You must grant the Access Connector the "Storage Blob Data Contributor" role on your storage account. This role assignment allows Databricks to read and write data in the storage account.

- **In the Databricks workspace, create a storage credential that wraps the Access Connector**
  - In Databricks, you create a storage credential that uses the Access Connector. You then reference this credential when you define external locations, so Databricks can authenticate to your storage and access it securely.

Together, these steps give Databricks access to your Azure storage through a managed identity, so no secrets or keys are stored in the workspace.
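
When you create the storage credential, Databricks asks for the Access Connector's Azure resource ID. The sketch below shows how that ID is composed. The subscription ID, resource group, and connector name are hypothetical placeholders, not values from this setup.

```python
def access_connector_id(subscription_id: str, resource_group: str, connector_name: str) -> str:
    """Compose the Azure resource ID of a Databricks Access Connector."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Databricks/accessConnectors/{connector_name}"
    )


# Hypothetical values, for illustration only.
connector_id = access_connector_id(
    "00000000-0000-0000-0000-000000000000",
    "rg-gizmobox",
    "databricks-ext-ac",
)
print(connector_id)
```

In practice you would normally copy this ID from the connector's overview page in the Azure portal rather than build it by hand.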

In [0]:
%sql
CREATE EXTERNAL LOCATION IF NOT EXISTS databricks_external_loc_gizmobox
    URL 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net/'
    WITH (STORAGE CREDENTIAL databricks_ext_sc)
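
The URL above follows the pattern `abfss://<container>@<storage-account>.dfs.core.windows.net/<path>`. As a rough sketch, the same statement could be assembled in a Python cell. The helper names `abfss_url` and `create_external_location_sql` are made up for this example, and the `spark.sql` call assumes you are inside a Databricks notebook.

```python
def abfss_url(container: str, storage_account: str, path: str = "") -> str:
    """Build an ABFSS URL for a container in an ADLS Gen2 storage account."""
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
    return base + path.strip("/")


def create_external_location_sql(name: str, url: str, credential: str) -> str:
    """Render the CREATE EXTERNAL LOCATION statement used in the cell above."""
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {name}\n"
        f"    URL '{url}'\n"
        f"    WITH (STORAGE CREDENTIAL {credential})"
    )


root = abfss_url("gizmobox", "databricksextstorageacc")
stmt = create_external_location_sql(
    "databricks_external_loc_gizmobox", root, "databricks_ext_sc"
)
print(stmt)

# Inside a Databricks notebook (assumption) you could then run:
# spark.sql(stmt)
```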


In [0]:
%fs ls 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net/Landing/Operational_data'

**Catalog**
- A catalog is the top-level container in Unity Catalog. It organizes schemas (databases) and provides a way to manage access control and data governance across multiple schemas and tables.

**Schema**
- A schema (also called a database) is a logical grouping of tables, views, and functions within a catalog. Schemas help organize data objects and manage permissions at a finer level than catalogs.

**Volume**
- A volume is a Unity Catalog object that provides governed, file-level access to data stored in cloud object storage. Volumes hold non-tabular data (such as images, documents, or other files) and are accessed through file paths and file APIs rather than as SQL tables.
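
To make the hierarchy concrete: tables are addressed as `catalog.schema.table`, and files in a volume are addressed as `/Volumes/<catalog>/<schema>/<volume>/<path>`. The small sketch below builds both forms. The helper names and the `customers` table are hypothetical and used only for illustration.

```python
from pathlib import PurePosixPath


def table_name(catalog: str, schema: str, table: str) -> str:
    """Three-level Unity Catalog name for a table or view."""
    return f"{catalog}.{schema}.{table}"


def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """File path for something stored inside a Unity Catalog volume."""
    return str(PurePosixPath("/Volumes", catalog, schema, volume, *parts))


# "customers" is a hypothetical table name.
print(table_name("gizmobox", "bronze", "customers"))
print(volume_path("gizmobox", "landing", "operational_data", "addresses"))
```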

### Create Catalog

In [0]:
%sql
SHOW CATALOGS

In [0]:
%sql
CREATE CATALOG IF NOT EXISTS gizmobox
    MANAGED LOCATION 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net/'

### Create Schema

In [0]:
%sql
USE CATALOG gizmobox;
CREATE SCHEMA IF NOT EXISTS Landing
  MANAGED LOCATION 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net/landing'
      

In [0]:
%sql
CREATE SCHEMA IF NOT EXISTS Bronze
  MANAGED LOCATION 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net/bronze';
CREATE SCHEMA IF NOT EXISTS Silver
  MANAGED LOCATION 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net/silver';
CREATE SCHEMA IF NOT EXISTS Gold
  MANAGED LOCATION 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net/gold'
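
The four schema statements differ only in the layer name, so they could also be generated in a loop. This is a minimal sketch, not the approach the notebook uses. `create_schema_sql` is a hypothetical helper, and it assumes each schema's managed location is a lowercase folder named after the layer, as in the cells above.

```python
ROOT = "abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net"
LAYERS = ["landing", "bronze", "silver", "gold"]


def create_schema_sql(catalog: str, layer: str, root: str = ROOT) -> str:
    """Render CREATE SCHEMA for one layer, with its own managed location."""
    return (
        f"CREATE SCHEMA IF NOT EXISTS {catalog}.{layer}\n"
        f"  MANAGED LOCATION '{root}/{layer}'"
    )


statements = [create_schema_sql("gizmobox", layer) for layer in LAYERS]
for statement in statements:
    print(statement + ";")

# Inside a Databricks notebook (assumption), run them one at a time,
# since spark.sql executes a single statement per call:
# for statement in statements:
#     spark.sql(statement)
```

Using the fully qualified `catalog.schema` name here avoids depending on an earlier `USE CATALOG` cell.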

In [0]:
%sql
SHOW SCHEMAS

### Create Volume

In [0]:
%sql
USE CATALOG gizmobox;
USE SCHEMA landing;

CREATE EXTERNAL VOLUME IF NOT EXISTS operational_data 
LOCATION 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net/Landing/Operational_data'

In [0]:
%sql
USE CATALOG gizmobox;
USE SCHEMA landing;

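-- Assumption: external data lives in its own folder (path not confirmed by the source).
-- Two external volumes cannot point at the same storage location.
CREATE EXTERNAL VOLUME IF NOT EXISTS external_data
LOCATION 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net/Landing/External_data'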
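
As far as I understand, Unity Catalog rejects an external volume whose location is the same as, or nested inside, another volume's location. A quick pre-check like the sketch below can catch that before you run the DDL. The `overlaps` helper and the `Other_data` folder are hypothetical, and the check is a plain string-prefix comparison, not a call to any Databricks API.

```python
def normalize(url: str) -> str:
    """Ensure exactly one trailing slash so prefixes match whole folders."""
    return url.rstrip("/") + "/"


def overlaps(a: str, b: str) -> bool:
    """True if the two locations are identical or one contains the other."""
    a, b = normalize(a), normalize(b)
    return a.startswith(b) or b.startswith(a)


base = "abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net"
operational = f"{base}/Landing/Operational_data"
other = f"{base}/Landing/Other_data"  # hypothetical sibling folder

print(overlaps(operational, operational))      # same location
print(overlaps(operational, other))            # sibling folders
print(overlaps(f"{base}/Landing", operational))  # parent and child
```

The trailing-slash normalization keeps folders such as `Operational_data` and `Operational_data_v2` from being treated as nested.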

In [0]:
%fs ls /Volumes/gizmobox/landing/operational_data/addresses

In [0]:
%sql
CREATE EXTERNAL LOCATION IF NOT EXISTS databricks_external_loc_gizmobox
    URL 'abfss://gizmobox@databricksextstorageacc.dfs.core.windows.net'
    WITH (STORAGE CREDENTIAL databricks_ext_loc_sc)