<a href = "https://www.pieriantraining.com"><img src="../PT Centered Purple.png"> </a>

<em style="text-align:center">Copyrighted by Pierian Training</em>

# Azure Synapse Workspaces with Python

## Azure Actions Covered

* Creating, listing, and deleting a Synapse Workspace with Python
* Creating, listing, and deleting dedicated SQL pools in a Synapse Workspace

In this lecture, we'll learn how to set up Synapse Workspaces in Azure with Python.

To begin, we'll import our usual libraries along with the settings we need (e.g. `AZURE_SUBSCRIPTION_ID`). We'll also add some new imports for Data Lake Storage and Synapse.

In [1]:
from azure.identity import AzureCliCredential
# New imports for data lake storage and Synapse
from azure.storage.filedatalake import DataLakeServiceClient
from azure.mgmt.synapse import SynapseManagementClient
from azure.mgmt.synapse import models

from settings import AZURE_SUBSCRIPTION_ID, DEFAULT_LOCATION, DEFAULT_RESOURCE_GROUP
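
The `settings` module itself isn't shown in this lecture. As a minimal sketch, assuming the values live in environment variables, it could be built around a helper like the one below. The function name `load_settings`, the variable names other than `AZURE_SUBSCRIPTION_ID`, and the fallback values are assumptions for illustration, not the course's actual file.

```python
import os


def load_settings(environ=None):
    """Return the three settings used in this lecture as a dict."""
    environ = os.environ if environ is None else environ

    subscription_id = environ.get("AZURE_SUBSCRIPTION_ID")
    if not subscription_id:
        raise RuntimeError("Set AZURE_SUBSCRIPTION_ID before running the notebook")

    return {
        "AZURE_SUBSCRIPTION_ID": subscription_id,
        # Assumed variable names and fallback values; adjust to your own setup.
        "DEFAULT_LOCATION": environ.get("DEFAULT_LOCATION", "eastus"),
        "DEFAULT_RESOURCE_GROUP": environ.get(
            "DEFAULT_RESOURCE_GROUP", "default-resource-group"
        ),
    }
```

A `settings.py` could then unpack the returned dict into module-level constants so the `from settings import ...` line keeps working.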

We'll first set up our credential and then our two clients, one for the data lake and one for our Synapse Workspaces.

In [2]:
credential = AzureCliCredential()
dl_service_client = DataLakeServiceClient(
    account_url='https://bendatalake1234.dfs.core.windows.net/',
    credential=credential
)
synapse_client = SynapseManagementClient(credential, AZURE_SUBSCRIPTION_ID)

Let's store our data lake storage account URL in a variable.

In [3]:
dl_account_url = dl_service_client.primary_endpoint
dl_account_url

'https://bendatalake1234.dfs.core.windows.net/'
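
The account URL above follows a predictable pattern, so it can also be derived from the storage account name. This is a small sketch that assumes the public Azure cloud, where Data Lake endpoints end in `dfs.core.windows.net`; the helper name `data_lake_account_url` is made up for this example.

```python
def data_lake_account_url(storage_account_name):
    """Build the Data Lake (dfs) endpoint for a storage account.

    Assumes the public Azure cloud; other clouds use a different suffix.
    """
    return f"https://{storage_account_name}.dfs.core.windows.net/"


print(data_lake_account_url("bendatalake1234"))
# prints https://bendatalake1234.dfs.core.windows.net/
```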

We'll also need the name of the file system before setting up the Synapse Workspace, so let's list the file systems in our data lake.

In [5]:
for fs in dl_service_client.list_file_systems():
    print(fs)

{'name': 'dl-file-system', 'last_modified': datetime.datetime(2023, 5, 29, 19, 55, 1, tzinfo=datetime.timezone.utc), 'etag': '"0x8DB607E96E390E8"', 'lease': {'status': 'unlocked', 'state': 'available', 'duration': None}, 'public_access': None, 'has_immutability_policy': False, 'has_legal_hold': False, 'metadata': None, 'deleted': None, 'deleted_version': None, 'encryption_scope': <azure.storage.filedatalake._models.EncryptionScopeOptions object at 0x7f2af84fad90>}
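
Printing each file system dumps every property, but only the name is needed for the workspace. A short sketch of pulling out just the names follows; `file_system_names` is a hypothetical helper, and the `SimpleNamespace` object stands in for the items returned by `list_file_systems()` so the snippet runs without an Azure connection.

```python
from types import SimpleNamespace


def file_system_names(file_systems):
    """Collect the `name` attribute of each file system in an iterable."""
    return [fs.name for fs in file_systems]


# Stand-in for the result of dl_service_client.list_file_systems()
sample = [SimpleNamespace(name="dl-file-system")]
print(file_system_names(sample))
# prints ['dl-file-system']
```

In the notebook, the same helper could be called as `file_system_names(dl_service_client.list_file_systems())`.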


To create a new Synapse Workspace, we'll use the `workspaces.begin_create_or_update()` method, which takes the following parameters:
* `resource_group_name` - Name of the resource group that will contain the workspace
* `workspace_name` - Name for the Synapse Workspace
* `workspace_info` - Parameters for the workspace, based on the `Workspace` model

The `Workspace` model has parameters that include:

* `location` - Azure location for the Synapse Workspace
* `identity` - Identity of the workspace
* `default_data_lake_storage` - `DataLakeStorageAccountDetails` object, which will take:
    * `account_url` - Storage account URL
    * `filesystem` - Name of the file system in the storage account
* `sql_administrator_login` - User name for SQL database administrator
* `sql_administrator_login_password` - Password for SQL database administrator

For the full list of parameters for these classes, see:
* the [Workspace class](https://learn.microsoft.com/en-us/python/api/azure-mgmt-synapse/azure.mgmt.synapse.models.workspace?view=azure-python)
* the [DataLakeStorageAccountDetails class](https://learn.microsoft.com/en-us/python/api/azure-mgmt-synapse/azure.mgmt.synapse.models.datalakestorageaccountdetails?view=azure-python)
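
Since `sql_administrator_login_password` is a secret, one option is to read it from the environment rather than typing it into the notebook. The sketch below assumes an environment variable named `SYNAPSE_SQL_ADMIN_PASSWORD`; both that name and the helper `get_sql_admin_password` are hypothetical.

```python
import os


def get_sql_admin_password(environ=None, var_name="SYNAPSE_SQL_ADMIN_PASSWORD"):
    """Read the SQL administrator password from an environment variable."""
    environ = os.environ if environ is None else environ
    password = environ.get(var_name)
    if not password:
        # Fail early instead of sending an empty password to Azure.
        raise RuntimeError(f"Set {var_name} before creating the workspace")
    return password
```

The returned value could then be passed as `sql_administrator_login_password` when building the `Workspace` model.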

In [18]:
# begin_create_or_update() returns a poller for a long-running operation
poller = synapse_client.workspaces.begin_create_or_update(
    resource_group_name=DEFAULT_RESOURCE_GROUP,
    workspace_name='bens-synapse-workspace1234',
    workspace_info=models.Workspace(
        location=DEFAULT_LOCATION,
        identity=models.ManagedIdentity(type='SystemAssigned'),
        default_data_lake_storage=models.DataLakeStorageAccountDetails(
            account_url=dl_account_url,
            filesystem='dl-file-system'
        ),
        sql_administrator_login='benadmin',
        # Example value only; avoid hard-coding real passwords
        sql_administrator_login_password='testpassword123!'
    )
)
# Block until the workspace has finished provisioning
workspace = poller.result()
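
The `begin_*` methods return a poller rather than the finished resource, and provisioning a workspace can take several minutes. As a rough sketch, the helper below reports progress while waiting, using the poller's `done()`, `status()`, and `result()` methods. The name `wait_with_status` and the default timings are assumptions for this example.

```python
import time


def wait_with_status(poller, poll_seconds=15.0, max_seconds=1800.0, sleep=time.sleep):
    """Print the poller's status until it finishes, then return its result."""
    waited = 0.0
    while not poller.done():
        if waited >= max_seconds:
            raise TimeoutError(f"Operation still running after {waited:.0f}s")
        print(f"status: {poller.status()} (waited {waited:.0f}s)")
        sleep(poll_seconds)
        waited += poll_seconds
    # result() returns the created resource, or raises if the operation failed.
    return poller.result()
```

Calling `wait_with_status(...)` on the object returned by `begin_create_or_update()` would be one way to use it; calling `.result()` directly is simpler when no progress output is needed.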

Now let's list all of our available Synapse Workspaces.

In [4]:
for workspace in synapse_client.workspaces.list():
    print(workspace.name)

bens-synapse-workspace1234


Although one of the main benefits of Azure Synapse is the availability of serverless SQL pools, you can also create dedicated SQL pools via the Python SDK. The `sql_pools.begin_create()` method takes the following parameters:
* `resource_group_name` - Name of the resource group under which to create the SQL pool
* `workspace_name` - Name of Synapse Workspace where pool will be created
* `sql_pool_name` - Name of the dedicated SQL pool
* `sql_pool_info` - Parameters for the SQL pool. For the full list of parameters, see the [SqlPool class](https://learn.microsoft.com/en-us/python/api/azure-mgmt-synapse/azure.mgmt.synapse.models.sqlpool?view=azure-python)

In [6]:
sql_pool = synapse_client.sql_pools.begin_create(
    resource_group_name=DEFAULT_RESOURCE_GROUP,
    workspace_name='bens-synapse-workspace1234',
    sql_pool_name='mySqlPool',
    sql_pool_info=models.SqlPool(
        location=DEFAULT_LOCATION
    )
)
result = sql_pool.result()

We can list all the dedicated SQL pools in our workspace with `list_by_workspace()`.

In [12]:
for pool in synapse_client.sql_pools.list_by_workspace(DEFAULT_RESOURCE_GROUP, 'bens-synapse-workspace1234'):
    print(pool)

{'additional_properties': {}, 'id': '/subscriptions/bf8c33be-e4bb-46c8-871a-85182d913c50/resourceGroups/default-resource-group/providers/Microsoft.Synapse/workspaces/bens-synapse-workspace1234/sqlPools/mySqlPool', 'name': 'mySqlPool', 'type': 'Microsoft.Synapse/workspaces/sqlPools', 'tags': None, 'location': 'eastus', 'sku': <azure.mgmt.synapse.models._models_py3.Sku object at 0x7f7413f58070>, 'max_size_bytes': 263882790666240, 'collation': 'SQL_Latin1_General_CP1_CI_AS', 'source_database_id': None, 'recoverable_database_id': None, 'provisioning_state': 'Succeeded', 'status': 'Online', 'restore_point_in_time': None, 'create_mode': None, 'creation_date': datetime.datetime(2023, 6, 16, 16, 51, 39, 283000, tzinfo=<isodate.tzinfo.Utc object at 0x7f7421052280>), 'storage_account_type': 'GRS'}
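
The raw output is hard to scan, and nested objects such as `sku` only show up as an object reference. A small sketch of picking out the most useful fields is shown below. `summarize_pool` is a hypothetical helper, and the `SimpleNamespace` stand-in, including its `DW100c` SKU name, is made-up sample data so the snippet runs without Azure.

```python
from types import SimpleNamespace


def summarize_pool(pool):
    """Return the key fields of a SQL pool as a plain dict."""
    sku = getattr(pool, "sku", None)
    return {
        "name": pool.name,
        "status": pool.status,
        "provisioning_state": pool.provisioning_state,
        "sku": sku.name if sku is not None else None,
    }


# Stand-in for one item from sql_pools.list_by_workspace(); the SKU is invented.
sample_pool = SimpleNamespace(
    name="mySqlPool",
    status="Online",
    provisioning_state="Succeeded",
    sku=SimpleNamespace(name="DW100c"),
)
print(summarize_pool(sample_pool))
```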


Dedicated SQL pools can be expensive, so let's delete ours.

In [14]:
poller = synapse_client.sql_pools.begin_delete(
    resource_group_name=DEFAULT_RESOURCE_GROUP,
    workspace_name='bens-synapse-workspace1234',
    sql_pool_name='mySqlPool'
)
result = poller.result()
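
If a workspace holds several dedicated pools, the list and delete calls shown above can be combined. The helper below is a sketch under that assumption: `delete_all_sql_pools` is a hypothetical name, and it expects a client shaped like `synapse_client`.

```python
def delete_all_sql_pools(synapse_client, resource_group_name, workspace_name):
    """Delete every dedicated SQL pool in a workspace; return the deleted names."""
    # Collect the names first so nothing is deleted while still paging results.
    names = [
        pool.name
        for pool in synapse_client.sql_pools.list_by_workspace(
            resource_group_name, workspace_name
        )
    ]
    for name in names:
        poller = synapse_client.sql_pools.begin_delete(
            resource_group_name=resource_group_name,
            workspace_name=workspace_name,
            sql_pool_name=name,
        )
        # Wait for each deletion to finish before starting the next one.
        poller.result()
    return names
```

In the notebook this could be called as `delete_all_sql_pools(synapse_client, DEFAULT_RESOURCE_GROUP, 'bens-synapse-workspace1234')`.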

Finally, let's delete our workspace as well.

In [15]:
poller = synapse_client.workspaces.begin_delete(
    resource_group_name=DEFAULT_RESOURCE_GROUP,
    workspace_name='bens-synapse-workspace1234'
)
result = poller.result()