This repository has been archived by the owner on Oct 12, 2023. It is now read-only.

doAzureParallel Guide

This guide explains how Azure works, how to get the most out of it, and best practices for using the doAzureParallel package.

  1. Azure Introduction (link)

    Using the Data Science Virtual Machine (DSVM) & Azure Batch

  2. Virtual Machine Sizes (link)

    How do you choose the best VM type/size for your workload?

  3. Autoscale (link)

    Automatically scale up/down your cluster to save time and/or money.

  4. Azure Limitations (link)

    Learn about the limitations around the size of your cluster and the number of foreach jobs you can run in Azure.

  5. Package Management (link)

    Best practices for managing your R packages in code. This includes installation at the cluster or job level as well as how to use different package providers.

  6. Distributing your Data (link)

    Best practices and limitations for working with distributed data.

  7. Parallelizing on each VM Core (link)

    Best practices and limitations for parallelizing your R code across every core of each VM in your pool.

  8. Persistent Storage (link)

    Taking advantage of persistent storage for long-running jobs.

  9. Customize Cluster (link)

    Setting up your cluster to meet your specific needs.

  10. Long Running Job (link)

    Best practices for managing long-running jobs.

  11. Programmatically generated config (link)

    Generate credentials and cluster config programmatically at runtime.

Additional Documentation

Take a look at our Troubleshooting Guide for information on how to diagnose common issues.

Read our FAQ for known issues and common questions.