This section explains how Azure works, how to take best advantage of it, and best practices for using the doAzureParallel package.
Azure Introduction (link)
Using Azure Batch
Getting Started (link)
Using the Getting Started guide to create credentials
i. Generate Credentials Script (link)
- Pre-built bash script for getting Azure credentials without the Azure Portal
ii. National Cloud Support (link)
- How to run workloads in Azure national clouds
Customize Cluster (link)
Setting up your cluster to fit your specific needs
i. Virtual Machine Sizes (link)
- How do you choose the best VM type/size for your workload?
ii. Autoscale (link)
- Automatically scale your cluster up or down to save time and/or money.
iii. Building Containers (link)
- Creating your own Docker containers for reproducibility
Managing Cluster (link)
Managing your cluster's lifespan
Setting up your job to fit your specific needs
i. Asynchronous Jobs (link)
- Best practices for managing long-running jobs
ii. Foreach Azure Options (link)
- Use Azure package-defined foreach options to improve performance and user experience
iii. Error Handling (link)
- How Azure handles errors in your foreach loop
Package Management (link)
Best practices for managing your R packages in code. This includes installation at the cluster or job level as well as how to use different package providers.
i. Distributing your Data (link)
- Best practices and limitations for working with distributed data.
ii. Persistent Storage (link)
- Taking advantage of persistent storage for long-running jobs
iii. Accessing Azure Storage through R (link)
- Manage your Azure Storage files via R
Performance Tuning (link)
Best practices for optimizing your foreach loop
Debugging and Troubleshooting (link)
Best practices for diagnosing common issues
Azure Limitations (link)
Learn about the limitations around the size of your cluster and the number of foreach jobs you can run in Azure.
Read our FAQ for known issues and common questions.