---
page_type: sample
languages:
- python
- azurecli
products:
- azure-machine-learning
description: Learn how to read from cloud data and scale PyData tools (Numpy, Pandas, Scikit-Learn, etc.) with [Dask](https://dask.org) and Azure ML.
---

Using Dask

Dask "natively scales Python" and "provides advanced parallelism for analytics, enabling performance at scale for the tools you love." It is open source, freely available, and part of the PyData ecosystem of tools, developed in coordination with other projects like NumPy, pandas, and scikit-learn. It provides familiar APIs for Python users, allows for low-level customization and streaming with a futures API, and scales up on clusters.
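To make the "familiar APIs" point concrete, here is a minimal sketch, assuming `dask` and `numpy` are installed locally. It is not taken from the notebook in this folder; the array shape, chunk size, and the `square` helper are illustrative choices only.

```python
import dask
import dask.array as da

# A chunked array: each 250x250 block is an ordinary NumPy array.
x = da.ones((1000, 1000), chunks=(250, 250))

# The expression mirrors NumPy syntax but only builds a task graph.
lazy_total = (x + x.T).sum()

# compute() executes the graph on the default local scheduler.
total = float(lazy_total.compute())


# dask.delayed wraps plain Python functions for custom, lower-level graphs.
@dask.delayed
def square(n):
    # Hypothetical helper used only for this illustration.
    return n * n


lazy_sum = dask.delayed(sum)([square(i) for i in range(5)])
sum_of_squares = lazy_sum.compute()

print(total, sum_of_squares)
```

Nothing runs until `compute()` is called, which is what lets the same code be handed to a larger scheduler later without being rewritten.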

Dask is often compared to Spark - see this page to help evaluate which is the better tool for you. Common ML tools like Optuna, Scikit-Learn, XGBoost, LightGBM, and more can be distributed via Dask. There are numerous packages available for scaling on cloud clusters.
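The futures API and cluster scaling mentioned above can be sketched as follows, assuming the `distributed` package is installed. The in-process `Client` here is a stand-in for a real cluster (on Azure or elsewhere), and `double` is a hypothetical function used only for illustration.

```python
from dask.distributed import Client


def double(value):
    # Hypothetical unit of work; any picklable function would do.
    return value * 2


# processes=False starts a scheduler and workers as threads in this process.
# Passing a scheduler address instead would target a remote cluster.
client = Client(processes=False, dashboard_address=None)

# map() submits one task per input and returns futures immediately.
futures = client.map(double, range(4))

# Futures can be passed to further tasks; Dask resolves them on the workers.
total_future = client.submit(sum, futures)
result = total_future.result()

client.close()
print(result)
```

Because the client is the only part that knows where the work runs, moving from a laptop to a cluster is mostly a matter of changing how the `Client` is created.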

In this tutorial, the following notebooks demonstrate using Dask with Azure:

1.intro-to-dask.ipynb

The core dask and distributed packages are themselves small and focused. Thousands of tools, some built by the Dask organization and most not, use Dask for parallel or distributed processing. Some of the most useful for data science include:
