DOCS-#3795: Expand using_pandas_on_dask.rst doc (#3808)
Signed-off-by: Alexey Prutskov <alexey.prutskov@intel.com>
prutskov committed Dec 8, 2021
1 parent 6882ec2 commit 003f338
Showing 1 changed file with 19 additions and 4 deletions.
23 changes: 19 additions & 4 deletions docs/developer/using_pandas_on_dask.rst
@@ -1,8 +1,23 @@
pandas on Dask
==============

The Dask engine and documentation could use your help! Consider opening a
`pull request`_ or an issue_ to contribute or ask clarifying questions.
This section contains usage-related documentation for the pandas on Dask component of Modin.

.. _pull request: https://github.com/modin-project/modin/pulls
.. _issue: https://github.com/modin-project/modin/issues
Modin uses pandas as the primary in-memory format of the underlying partitions and optimizes queries
ingested from the API layer specifically for this format. Thus, there is no need to choose it yourself,
but you can still specify it explicitly, as shown below.

Dask is one of the execution engines that Modin supports. To enable pandas on Dask execution, set the following environment variables:

.. code-block:: bash

   export MODIN_ENGINE=dask
   export MODIN_STORAGE_FORMAT=pandas

or set them in source code:

.. code-block:: python

   import modin.config as cfg
   cfg.Engine.put('dask')
   cfg.StorageFormat.put('pandas')
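
The two approaches above can also be combined in one script. The following is a minimal sketch, not part of the original change: it sets the same environment variables from Python and then, if Modin happens to be installed, reads the values back through ``modin.config``. It assumes the variables are set before Modin is first imported, and the ``ImportError`` fallback is only there so the snippet also runs where Modin is absent.

```python
import os

# Equivalent of the `export` lines above, done from Python.
# Assumption: this runs before Modin is imported anywhere in the process.
os.environ["MODIN_ENGINE"] = "dask"
os.environ["MODIN_STORAGE_FORMAT"] = "pandas"

try:
    import modin.config as cfg
except ImportError:
    # Modin is not installed; only the environment variables are set.
    cfg = None

if cfg is not None:
    # Read back the settings Modin resolved from the environment.
    print("Engine:", cfg.Engine.get())
    print("Storage format:", cfg.StorageFormat.get())
else:
    print("Modin not available; MODIN_ENGINE =", os.environ["MODIN_ENGINE"])
```

Setting the variables in the shell is usually simpler for whole-application configuration, while the in-code form is convenient in notebooks or tests where the environment is not under your control.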
