# Training a custom GPT-2 model

We will use nanoGPT by Andrej Karpathy.

For the full source, see https://github.com/karpathy/nanoGPT.git

In [6]:
# Download the nanoGPT scripts from Andrej Karpathy's GitHub repository
import urllib.request
# No trailing slash here: the f-strings below add the separator
base_url = "https://github.com/karpathy/nanoGPT/raw/master"
urllib.request.urlretrieve(f"{base_url}/model.py", "model.py")
urllib.request.urlretrieve(f"{base_url}/train.py", "train.py")
urllib.request.urlretrieve(f"{base_url}/configurator.py", "configurator.py")
urllib.request.urlretrieve(f"{base_url}/sample.py", "sample.py")

('sample.py', <http.client.HTTPMessage at 0x7f3e4b698400>)

The model configuration lives in configs/azure_docs_training.py. It mostly keeps the defaults, which correspond to the small 124M-parameter version of GPT-2.

max_iters is set to 3000 to limit how much GPU time this demo consumes.

In [12]:
!/bin/python3 train.py configs/azure_docs_training.py

# 61 minutes on NVIDIA A100 GPU

Overriding config with configs/azure_docs_training.py:
out_dir = 'azure_docs_out'
eval_interval = 250 # keep frequent because we'll overfit
eval_iters = 200
log_interval = 10

always_save_checkpoint = False

wandb_log = False
wandb_project = 'azure_docs'
wandb_run_name = 'nano-gpt-training'

dataset = 'azure_docs'
batch_size = 12
block_size = 1024
gradient_accumulation_steps = 5 * 8

max_iters = 3000

tokens per iteration will be: 491,520
Initializing a new model from scratch
defaulting to vocab_size of GPT-2 to 50304 (50257 rounded up for efficiency)
number of parameters: 123.59M
num decayed parameter tensors: 50, with 124,354,560 parameters
num non-decayed parameter tensors: 25, with 19,200 parameters
using fused AdamW: True
compiling the model... (takes a ~minute)
step 0: train loss 10.9024, val loss 10.9210
iter 0: loss 10.9571, time 29977.77ms, mfu -100.00%
iter 10: loss 8.9336, time 3398.18ms, mfu 39.63%
iter 20: loss 9.5884, time 3423.06ms, mfu 39.60%
iter 30: loss 9.0825, time 
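
The figures in the log above can be cross-checked with a little arithmetic. The sketch below is not part of the notebook run. It assumes a single GPU, the GPT-2 small dimensions (12 layers, 12 heads, 768-dimensional embeddings) that the defaults imply, and the FLOPs estimate that nanoGPT's `estimate_mfu` uses to the best of our reading, with 312 TFLOPS taken as the A100 bfloat16 peak. None of these are stated in this notebook, so treat them as assumptions.

```python
import math

# Values copied from the config and log echoed above
batch_size = 12
block_size = 1024
gradient_accumulation_steps = 5 * 8
vocab_size = 50304          # GPT-2's 50257 rounded up, per the log
n_params = 123.59e6         # "number of parameters" line in the log
iter_time_s = 3.39818       # "time" reported at iter 10

# Assumptions not stated in the notebook: GPT-2 small dimensions, A100 bf16 peak
n_layer, n_head, n_embd = 12, 12, 768
peak_flops = 312e12

# Tokens processed per optimizer step (single GPU assumed)
tokens_per_iter = gradient_accumulation_steps * batch_size * block_size

# Cross-entropy of a model that guesses uniformly over the vocabulary
uniform_loss = math.log(vocab_size)

# Model FLOPs utilization: 6N per token plus the attention term
head_dim = n_embd // n_head
flops_per_token = 6 * n_params + 12 * n_layer * n_head * head_dim * block_size
flops_per_iter = flops_per_token * block_size * batch_size * gradient_accumulation_steps
mfu = flops_per_iter / iter_time_s / peak_flops

print(f"tokens per iteration: {tokens_per_iter:,}")
print(f"uniform-guess loss:   {uniform_loss:.2f}")
print(f"estimated MFU:        {mfu:.1%}")
```

Under those assumptions the three results line up with the log: 491,520 tokens per iteration, a starting loss close to ln(50304), which is about 10.83, and an MFU of roughly 39.6% at iteration 10.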

In [20]:
import subprocess

# Argument list instead of a shell string, so the prompt needs no quoting
prompt = "To configure blob storage account permissions, you need to "
cmd = ["/bin/python3", "sample.py", "--out_dir=azure_docs_out", f"--start={prompt}"]
output = subprocess.check_output(cmd)

In [22]:
from IPython.display import Markdown

Markdown(output.decode("utf-8"))

Overriding: out_dir = azure_docs_out
Overriding: start = To configure blob storage account permissions, you need to 
number of parameters: 123.59M
No meta.pkl found, assuming GPT-2 encodings...
To configure blob storage account permissions, you need to  blob storage account configuration settings.

The following sections describe how to define the blob indexer connection string. Depending on the context of your data connector has, you can use `DefaultStorageAccountSasToken`.

### Blob storage access signature
The blob indexer requires an Azure Storage account to retrieve a blob from Blob storage. Blob storage is an Azure resource that doesn't use blob storage as a storage account with a blob access policy. The following command creates a Blob storage container and uploads a blob to it:

```azurecli
az storage blob create -i <storage-account-name> -s <container-name> --account-name <storage-account-key>
```

To access blob data in the same virtual machine, the storage account must be granted access to a resource, such as a blob or a group.

## Read data

The blob indexer requires a read-only access signature. The blob indexer requires a read-only access signature. The key is of one of the following formats:

| Format | Format | Default | Notes |
|--------|----------------|-------|--------------|
| `<storage-account-name>` |`{`blob`*`| String | The blob name to read data from the storage account. Valid characters are `{` and `}` when using the blob name. Only valid characters are allowed.|
| `<container-name>` |`{`blob`*`| String | The name of the indexer to read data from the storage account. Default is `myblobstorage`. |
| `<container-name>` |`{`blob` name`| String | The name of the blob in storage account that contains the data. For example, `myfilestorage.jpg`|
| `<container-name>` |`{`blob` name`| String | The name of the blob in storage account that contains the data. The blob name can be `file` or `sig`. |
| `<container-name>` |`{`blob` name`| String | The name of the blob in storage account that contains the data. Valid characters are `blob` and `sig`. |
| `<ir-username>` |`{`user
---------------
To configure blob storage account permissions, you need to  [Blob Data Contributor](../role-based-access-control/built-in-roles.md#storage-account-contributor) or [Blob Data Reader](../role-based-access-control/built-in-roles.md#storage-blob-data-reader) roles, which can be one of the following permissions:

- Create user
- Set ACLs for both the source and target storage accounts

With a Blob Storage account, you can create a user with Azure role-based access control (Azure RBAC) to grant the user the ability to grant two permissions:

- Set the target storage accounts to the target storage accounts
- Set storage accounts to the target storage accounts
- Set storage accounts to Block Blob Data | Block Blob Data Contributor | Block Blob Data Contributor | Block Blob Data Reader | Block Blob Data Contributor | Block Blob Data Reader |

> [!NOTE]
> Storage account administrators can grant the following permissions at management account scope and scope, by ensuring that users only have the following permissions.

> [!div class="mx-tableFixed"]
> | Permission | Scope | Length | Valid Characters |
> | --- | --- | --- | --- |
> | accounts | storage | 5 | 20 |
> | accounts | storage | 5 | 20 |
> | accounts / storage accounts | 10 | 20 |
> | accounts / Storage accounts | 100 | 20 | |
> | accounts / storage accounts | 1 | 20 | 30 |
> | accounts / storage accounts | 100 | 20 | 30 |
> | accounts / storage accounts | 250 | 20 | 30 |
> | accounts / storage accounts | 250 | 20 | 30 |
> | accounts / storage accounts | 125 | 30 | 30 |
> | accounts / storage accounts | 250 | 20 | 30 |
> | accounts / storage accounts | 250 | 20 | 30 |
> | accounts / storage accounts | 250 | 20 | 30 |
> | accounts / storage accounts | 250 | 40 | 30 |
> | accounts / storage accounts | 250 | 20 | 30 |
> | accounts / storage accounts | 250 | 20 | 30 |
> | accounts / storage accounts | 250 | 20 | 30 |

## Microsoft.Storage

> [!div class
---------------
To configure blob storage account permissions, you need to 
to grant a user the appropriate permissions on the destination container to be changed.

If you have already configured a destination container to be changed, you can use the `--users` parameter in the Azure portal. The user can then use an existing user account that's associated with it, and then rerun the copy.

To add the source container to a destination, select the destination container in the context menu on the left as you can add a destination container.

![Screenshot shows how to add a destination container.](./media/connector-configure-storage-account-azure-portal/add-dest-container-container.png)


To add a destination container to a destination share, select the destination container and select the destination container. Select the destination container and then select the destination container.

![Screenshot shows how to add a destination container.](./media/connector-configure-storage-account-azure-portal/dest-container-select-destination.png)

For example:

![Screenshot shows how to add an additional destination container.](./media/connector-configure-storage-account-azure-portal/add-destination-container.png)


## Create a destination share

After you start creating a destination share, you can add a destination share.

To add a destination share to a destination share, select the destination share and then select the destination share in the context menu on the left side of the context menu.

![Screenshot shows how to create a destination share.](./media/connector-configure-storage-account-azure-portal/select-dest-share-1.png)

To create a destination share:

1. Under the **Destination** section, select **Add**.
1. On the **Add destination** pane, provide the name of the destination share.

![Screenshot shows how to add an additional destination share.](./media/connector-configure-storage-account-azure-portal/add-destination-share-2.png)

1. On the **Event** page, provide the path to the destination share.

## Create a destination share

After you create a destination share, you can optionally configure the storage account to store the
---------------
To configure blob storage account permissions, you need to 
- **Storage Blob Data Reader** - A blob container holds 
- **Blob Data Reader** - A blob container holds 
- **File Blob Data Reader** - An Azure storage container holds 

## Configure managed identity

In this section, you learn how to enable managed identity for Storage account, including how to enable storage account **managed identity** for Storage blob container. Data owner can disallow the creation of a storage account, and on a container that also grants **Storage Blob Data Reader** role on the container or a share access to the storage account storage.

### Enable system-managed identity for Azure Storage blob container

> [!NOTE]
> Currently the Azure portal doesn't support system-managed identity when creating a storage account.

1. Navigate to the storage account that has the user-assigned managed identity you want to disable from the portal.

    The storage account must be in the same region and in the same region as your Azure Storage account.

1. Navigate to your storage account's page in **Search Azure services**.

    ![Screenshot that shows search for search for storage account.](media/howto-manage-data-credentials/search-storage.png)

1. On the **System assigned** section, select **System assigned** to enable the system-assigned identity.

    ![Screenshot that shows the system assigned.](media/howto-manage-data-credentials/storage.png)

1. Select **Save**.

This action configures the system-assigned identity of the Storage Blob Data Reader role.

### Enable system-assigned identity

1. Navigate to your storage account's page in **Security** > **System assigned**.

    ![Screenshot that shows the system assigned](media/howto-manage-data-credentials/storage.png)

1. Select **Access control (IAM)**.

    ![Screenshot that shows the Access control page.](media/howto-manage-data-credentials/storage.png)

1. In the **Role assignment** section, select **Storage Blob Data Reader** role and select **Next**.
![Screenshot that shows
---------------
To configure blob storage account permissions, you need to  [Azure Virtual Machines for SAP workload](planning-guide.md).

### Configure a service principal

If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.

1. On the left navigation pane, select **All services**. Then, select **+ New service principal**.

1. In **Add service principal**, enter or select the following information:

    | Setting | Value |
    | ------- | ----- |
    | **Service principal key** | Select **Service principal key**. |
    | **Service principal key** | Select **Service principal key**. |
    | **Service principal key** | Select **Service principal key**. |
    | **Service principal key** | Select **Service principal key**. |
    | **Service principal key** | Select **Service principal key**. |
    | **Service principal key** | Select **Service principal key**. |
    | **Tenant ID** | Select **Service principal key**. |

1. Select **Add**.

1. Select the service principal or enter a **Service principal key**.

1. In the search box, enter the following information:

    | Setting | Value |
    | ------- | ----- |
    | **Service principal key** | Select **Service principal key**. |
    | **Service principal key** | Select **Service principal key**. |
    | **Service principal key** | Select **Service principal key**. |

1. Select **Add**.

### Create an Azure service principal

Create an Azure service principal for the service principal that you created in the previous step.

1. In the search box, enter the following information:

    | Setting | Value |
    | ------- | ----- |
    | **Service principal key** | Select **Service principal key**. |
    | **Service principal key** | Select **Service principal key**. |
    | **Service principal key** | Select **Service principal key
---------------
To configure blob storage account permissions, you need to  first [Azure Storage permissions](../storage/common/storage-account-overview.md). If you don't have an Azure Storage account, you can [sign up for a free trial](https://azure.microsoft.com/pricing/free-trial/).

The following table summarizes the process of creating a Storage account.

| Storage account                                          |Storage account access                                                                           |
|------------------------------------------------|-----------------------------------------------------------------------------------|
| SQL                                                             | [Copy data from SQL database from Blob](#step-2-copy-data-from-sql-database-from-blob)                                                          |
| SQL                                                                                                                               | [Copy data from SQL database to Azure SQL Database](#step
---------------
To configure blob storage account permissions, you need to  [Blob Data Lake Storage Gen2](data-lake-storage-configure-storage-accounts.md#create-an-azure-storage-account-azure-storage-account-through-the-azure-portal) and [Blob Data Lake Storage Gen2](data-lake-storage-use-blob-storage.md).

To access blob data:

1. Create a container.
2. Get a container.
3. Create a blob container.

The following example shows how to add a blob container to the **account** storage account. The account name needs to be specified in the blob storage account.

```powershell
$storageContainer = "storeame"
$containerName = "storeame"
$accountName = "aclname"
$containerName = "storename"
$containerName = "storename"

New-AzStorageBlobContainer -Name $containerName -Container $containerName -Context $ctx -Container $containerName -Context $ctx -Blob $containerName

```

## Create a blob container

An Azure Storage blob container provides a unique access model information for a blob. First, you need to set the **access level** of an anonymous access policy using the **New-AzStorageBlobAccessPolicy** flag. For more information about which access permissions were created and how to set access policies, see [Understand and fine-grained access permissions](../storage/blobs/storage-auth-abac.md).

To access blob data:

```powershell
$blob = Set-AzStorageBlobAccessPolicy -AccessType BlockBlob -AccessName Blob -AccessPolicy Blob -AccessPolicy Blob  -Context $ctx -Blob $blob
$blob.AccessPolicy.Name.ToString()
$policy = Set-AzStorageBlobAccessPolicy -AccessPolicy Blob -Context $ctx -Blob $blobName -Context $ctx
```

And the following command returns a list of the blobs:

```powershell
$blob = Set-AzStorageBlobContent -AccessPolicy Blob $ctx -Blob $blobName -Context $ctx

$blob.Blob.Description.Format.
---------------
To configure blob storage account permissions, you need to  Blob Data Owner role. Permission is the permission to create, read, write, and delete permissions. For more information about permissions, see [Permissions in Azure Data Box](../storage/blobs/data-lake-storage-auth-support.md).

## Step 5: Add a blob

The following steps describe how to add a blob in the list of Blob Storage.

1. In the **Storage Accounts** section, go to **Access Control List** > **Blob Data Owner**.
2. In the **Access Control List** section, select **Blob Data Owner**.
3. Search for the blob you're working in, and then select it.
4. Select the blob in the list of Storage Accounts you want to grant the permission to.
5. Select the **Access Control List** button.
6. Select **+ Add Blob Data Owner**.
7. Search for the context menu, and then select **Azure Blob Storage**.
8. Search for the context menu, and then select **Storage account**.
9. Select the container that you want to restrict.
10. Select **Review + Add** again, and then select **Review + Add** again.
11. Select **Create**.

## Step 3: Set an access control setting

1. In the storage account settings for Blob Storage, select **Blob Containers**.
2. Select the container that you want the access control setting.
3. Select **Access controls (IAM)** at the top.
4. Select **Add**.
5. Select **Review + Configure** again, and then select **Create** again.

## Step 3: Upload data to blob storage

1. In Blob Storage, select the blob you wish to upload.
2. Select the blob you wish to upload.
3. Upload the blob to a container in Azure Storage.
4. Upload each blob in Azure Storage.

## Step 4: Upload data to blob storage

1. In Storage Explorer, select the blob you wish to upload.
2. Select the blob you wish to upload.
3. Upload the uploaded data to Blob storage.
4. Upload the blob to Azure Data Lake Storage.
4. Upload the blob to a container.
5. Upload the
---------------
To configure blob storage account permissions, you need to 
- The blob can't be in the Gen 1, Gen 1, or Gen 2 storage accounts.

The following diagram shows how the **Blob service access permissions permissions (write permissions):**

![Diagram showing storage account permissions access permissions.](./media/blob-data-operations-access-control/create-blob-storage.png)

The **Blob service access permissions** section contains a list of **Blob service access permissions** for a storage account.

### How Blob access permissions work.

You can create your access policy to perform an action against a storage account on the storage account. For example, you can grant permissions to perform the action against a storage account, such as downloading blobs and calling Blob REST APIs.

For example, if you use Storage Explorer to submit blob storage data actions, you can be scoped to the Storage account, and then the blob service needs to run the action.

For example, let's assume that you have a Blob Storage resource for your blob container or blob, you need to create a Blob Storage account connection.

### How to create a blob connection

The following example shows how to create a Blob Storage connection.

```javascript
const Az.Storage.blobServiceConnectionString = "azstorage account get-connection-connection-id";
const BlobContainerClient = BlobContainerClient.from_string(storageConnectionString, containerName);
```

The [Azure Storage client library](../storage/blobs/storage-quickstart-blobs-powershell.md) is used to set up ACL and create a container for your storage account.

The following example shows how to create a container and then assign it to a blob container. Copy the contents of the container and create a blob every hour.

```javascript
blobStorageClient.blobContainerClient.SetContainer();
```

### How to publish data changes

In some cases, you can expect the operation to finish, which updates to a file's URL. If you want to see an error when you call the upload, you can specify that property to be modified. If a blob's property has been altered, it's cached to its source blob. This section provides an error when the upload is being processed.

If you
---------------
To configure blob storage account permissions, you need to  [blob container administrative access](../role-based-access-control/built-in-roles.md#contributor) or [Azure Blob Storage developer's Guide](../storage/blobs/documentation-government-overview.md).

You can assign access permissions to users when you create or invite a new file. For instructions to create a new Blob Storage container, see [Create a blob container and blob container](../storage/blobs/storage-quickstart-blobs-portal.md). 

> [!IMPORTANT]
> If you want to create a new Blob Storage container, see [Configure a Blob container](../storage/common/storage-configure-persistent.md?tabs=portal).
>
> Storage account key or SAS token can't be used for data access outside of Azure Blob Storage. For more information, see [Azure Key Vault access control](../storage/common/storage-configure-key-vault.md).

## Configure and share cross-tenant storage classes

The following steps will demonstrate how to configure and share cross-tenant storage classes:

1. Use the following command to set the access level:

    ```azurepowershell-interactive
    $storage`

    To set the access level for the cross-tenant storage class:

    ```azurepowershell-interactive
    Set-AzStorageAccount -ResourceGroupName <ResourceGroupName> -Name <Name of your Azure storage account>
    ```

1. When the access level is set to **Append Blob**, the Blob Storage client uses the [New-AzStorageServiceAcls](/powershell/module/az.storage/new-azstorageservicebusserver) cmdlet.

    ```azurepowershell-interactive
    Set-AzStorageContainerAcls -Name "<Name of your Azure storage account>/default" -StorageAccountKey "<Key>"
    ```

1. Create a container with the shared access signature or SAS token that you created in [Prerequisites](#prerequisites) and give the container to be named `ContainerService.Service
---------------
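
The raw output above is one long string: a few log lines, then the samples, each followed by a line of dashes. Several samples stop inside a code fence or a table, so rendering everything as a single Markdown document lets one sample's unclosed fence run into the next. A minimal sketch of one way to separate them is shown below; `split_samples` is a hypothetical helper, not part of nanoGPT, and it assumes every sample begins with the prompt and that separator lines consist only of dashes.

```python
import re

# Hypothetical helper (not part of nanoGPT): a separator is a line made only of dashes.
# Markdown table rules such as "|---|---|" start with a pipe and are not matched.
SEPARATOR = re.compile(r"^-{5,}\s*$", flags=re.MULTILINE)

def split_samples(raw: str, prompt: str) -> list[str]:
    """Split captured sample.py output into individual samples."""
    samples = []
    for chunk in SEPARATOR.split(raw):
        start = chunk.find(prompt)
        if start == -1:
            continue  # log-only or empty chunk
        # Drop any log lines printed before the prompt
        samples.append(chunk[start:].strip())
    return samples

# Small made-up input that mimics the shape of the real output
demo = (
    "Overriding: out_dir = azure_docs_out\n"
    "number of parameters: 123.59M\n"
    "To configure X, you need to do A.\n"
    "| a | b |\n"
    "|---|---|\n"
    "---------------\n"
    "To configure X, you need to do B.\n"
    "---------------\n"
)
samples = split_samples(demo, "To configure X, you need to")
print(len(samples))
```

In the notebook this could be called as `split_samples(output.decode("utf-8"), "To configure blob storage account permissions, you need to")`, with each returned string passed to `Markdown` separately.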
