tojozefi/azurebatch-beeond

Batch pool with scratch BeeOND shared filesystem

This repo provides scripts for deploying an Azure Batch pool with a BeeOND shared filesystem.
It is useful if you need a high-performance shared scratch filesystem for MPI jobs running on Azure Batch.

The BeeOND filesystem is hosted on the pool nodes' local SSD disks, and BeeOND is rebuilt to use RDMA over InfiniBand for its internal communication.
On VMs that have a second local NVMe disk (e.g. HB120rs_v2), that disk is used for the BeeOND filesystem.

Note: This repo is intended for Azure VM SKUs with InfiniBand SR-IOV support, currently Standard_HB60rs, Standard_HC44rs and Standard_HB120rs_v2.

Note 2: Credit goes to the HPC-azbatch and azurehpc/beeond projects for inspiration and implementation ideas ;-)

Prerequisites

  1. Azure subscription
  2. Azure Batch account and a blob storage account linked to it.
  3. Core quota for the VM SKUs you want to use in the chosen region, either in your Batch account or, for user subscription allocation mode, in your Azure subscription.
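
The quota from item 3 can be inspected with the Azure CLI before creating a pool. The following is a minimal sketch, not part of this repo's scripts: the function names and the example account, resource group and region values are hypothetical, and it assumes you are already logged in with `az login`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of this repo.

# Batch service allocation mode: the core quota is held by the Batch account.
batch_account_quota() {
  local account=$1 resource_group=$2
  az batch account show \
    --name "$account" \
    --resource-group "$resource_group" \
    --query "{dedicatedCores:dedicatedCoreQuota, lowPriorityCores:lowPriorityCoreQuota, pools:poolQuota}" \
    --output table
}

# User subscription allocation mode: the core quota is held by the
# subscription and reported per VM family for a given region.
subscription_quota() {
  local region=$1
  az vm list-usage --location "$region" --output table
}

# Example calls (placeholder values):
#   batch_account_quota mybatchaccount my-resource-group
#   subscription_quota westeurope
```

Look for the HB or HC family rows in the subscription output when you plan to use the SKUs listed above.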

Quickstart

  1. Open a Cloud Shell (Bash) session from the Azure Portal, or open a Linux shell session with Azure CLI v2.0 and jq packages installed.
  2. Clone the repository: git clone https://github.com/tojozefi/azurebatch-beeond.git
  3. Grant execute access to .sh scripts: cd azurebatch-beeond; chmod +x 0*.sh
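
If you are working from your own Linux shell rather than Cloud Shell, it may help to confirm that the tools from step 1 are on your PATH before continuing. This is a small sketch; the `missing_tools` function name is hypothetical and not part of the repo.

```shell
#!/usr/bin/env bash
# Print the names of the given command-line tools that are not on PATH.
missing_tools() {
  local tool missing=()
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing+=("$tool")
  done
  echo "${missing[*]}"
}

# The quickstart needs the Azure CLI (az), jq and git.
missing=$(missing_tools az jq git)
if [ -n "$missing" ]; then
  echo "Missing tools: $missing" >&2
else
  echo "All required tools found."
fi
```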

Procedure

  1. Update the params.tpl file with the values specific to your environment:
  • subscription : ID of the subscription that contains your Azure Batch account
  • resource_group : the Batch account's resource group
  • AZURE_BATCH_ACCOUNT : the name of the Batch account
  • AZURE_BATCH_ACCESS_KEY : the Batch account key (optional)
  • storage_account_name : the name of the storage account linked to your Batch account
  2. Log in to the Azure Batch account:
    ./00-login.sh params.tpl
  3. Create the Azure Batch pool:
    ./01-createpool.sh params.tpl
  4. Run a sample MPI job that mounts the BeeOND filesystem and tests its performance with IOR:
    ./02-runjob.sh params.tpl
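
Before running the scripts, you may want to check that the required parameters have been filled in. The sketch below rests on an assumption: it treats params.tpl as a file of shell-style `key=value` lines. Check the actual file format in the repo and adapt the parsing if it differs. The `unset_params` function is hypothetical and not part of the repo.

```shell
#!/usr/bin/env bash
# ASSUMPTION: the params file consists of shell-style key=value lines.
# Print the name of every required parameter that is absent or empty.
unset_params() {
  local file=$1 key value
  shift
  for key in "$@"; do
    # Take the last assignment of the key and strip surrounding quotes.
    value=$(sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" \
      | tail -n 1 | tr -d "\"'")
    [ -n "$value" ] || echo "$key"
  done
}

# AZURE_BATCH_ACCESS_KEY is optional, so it is not checked here.
if [ -f params.tpl ]; then
  unset_params params.tpl subscription resource_group \
    AZURE_BATCH_ACCOUNT storage_account_name
fi
```

Empty output means all four required values are set.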

Monitor your job

Use Batch Explorer to monitor your pools and jobs.
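
If you prefer the command line, the Azure CLI can report similar information. This is a minimal sketch under the assumption that the CLI session is already authenticated against the Batch account (for example with `az batch account login`). The function names and the chosen output columns are hypothetical and not part of the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; assume the Azure CLI is already logged in
# to the Batch account.

# Pools with their VM size, allocation state and dedicated node count.
list_pools() {
  az batch pool list \
    --query "[].{pool:id, vmSize:vmSize, allocation:allocationState, nodes:currentDedicatedNodes}" \
    --output table
}

# Jobs with their state and the pool they run on.
list_jobs() {
  az batch job list \
    --query "[].{job:id, state:state, pool:poolInfo.poolId}" \
    --output table
}

# Tasks of a single job with their state and exit code.
list_tasks() {
  local job_id=$1
  az batch task list --job-id "$job_id" \
    --query "[].{task:id, state:state, exitCode:executionInfo.exitCode}" \
    --output table
}

# Example calls (the job id is a placeholder):
#   list_pools
#   list_jobs
#   list_tasks my-job-id
```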
