doAzureParallel Guide

This guide explains how Azure works, how to take best advantage of it, and best practices for using the doAzureParallel package.

Azure Introduction (link)

Using the Data Science Virtual Machine (DSVM) and Azure Batch.

Virtual Machine Sizes (link)

How to choose the best VM type and size for your workload.

Autoscale (link)

Automatically scale your cluster up or down to save time and/or money.

Azure Limitations (link)

Learn about the limits on the size of your cluster and on the number of foreach jobs you can run in Azure.

Package Management (link)

Best practices for managing your R packages in code. This includes installation at the cluster or job level, as well as how to use different package providers.

Distributing your Data (link)

Best practices and limitations for working with distributed data.

Parallelizing on each VM Core (link)

Best practices and limitations for parallelizing your R code across each core of each VM in your pool.

Persistent Storage (link)

Taking advantage of persistent storage for long-running jobs.

Customize Cluster (link)

Setting up your cluster to meet your specific needs.

Long Running Job (link)

Best practices for managing long-running jobs.

Programmatically generated config (link)

Generate credentials and cluster config programmatically at runtime.

Additional Documentation

Take a look at our Troubleshooting Guide for information on how to diagnose common issues.

Read our FAQ for known issues and common questions.
