DOCS-#3782: Add using_pandas_on_python.rst doc (#3809)
Signed-off-by: Alexey Prutskov <alexey.prutskov@intel.com>
prutskov committed Dec 9, 2021
1 parent 61bf043 commit 195b668
Showing 4 changed files with 41 additions and 3 deletions.
4 changes: 2 additions & 2 deletions docs/developer/architecture.rst
@@ -200,8 +200,8 @@ documentation page on :doc:`contributing </developer/contributing>`.
- Uses the `Dask Futures`_ execution framework.
- The storage format is `pandas` and the in-memory partition type is a pandas DataFrame.
- For more information on the execution path, see the :doc:`pandas on Dask </flow/modin/core/execution/dask/implementations/pandas_on_dask/index>` page.
- - pandas on Python
- - Uses native python execution - mainly used for for debugging.
+ - :doc:`pandas on Python </developer/using_pandas_on_python>`
+ - Uses native python execution - mainly used for debugging.
- The storage format is `pandas` and the in-memory partition type is a pandas DataFrame.
- For more information on the execution path, see the :doc:`pandas on Python </flow/modin/core/execution/python/implementations/pandas_on_python/index>` page.
- :doc:`OmniSci on Native </developer/using_omnisci>` (experimental)
1 change: 1 addition & 0 deletions docs/developer/index.rst
@@ -9,6 +9,7 @@ Developer
partition_api
using_pandas_on_ray
using_pandas_on_dask
+ using_pandas_on_python
using_omnisci
using_pyarrow_on_ray
using_sql_on_ray
37 changes: 37 additions & 0 deletions docs/developer/using_pandas_on_python.rst
@@ -0,0 +1,37 @@
pandas on Python
================

This section describes how to use the pandas on Python execution of Modin.

Modin uses pandas as the primary memory format of the underlying partitions and optimizes queries
from the API layer specifically for this format. Since pandas is the default, you do not need to specify
the memory format, but the examples below show how to set it explicitly.

One of the execution engines that Modin uses is Python. This engine is sequential and is intended for debugging.
To enable pandas on Python execution, set the following environment variables:

.. code-block:: bash

   export MODIN_ENGINE=python
   export MODIN_STORAGE_FORMAT=pandas

or turn debug mode on:

.. code-block:: bash

   export MODIN_DEBUG=True
   export MODIN_STORAGE_FORMAT=pandas

or do the same in source code:

.. code-block:: python

   import modin.config as cfg
   cfg.Engine.put('python')
   cfg.StorageFormat.put('pandas')

.. code-block:: python

   import modin.config as cfg
   cfg.IsDebug.put(True)
   cfg.StorageFormat.put('pandas')
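
The environment variables can also be set from Python itself instead of the shell. The following is a minimal sketch rather than an official Modin recipe: it assumes Modin reads ``MODIN_ENGINE`` and ``MODIN_STORAGE_FORMAT`` when its configuration is first queried, so they must be set before Modin is used in the process. The helper name ``enable_pandas_on_python`` is hypothetical and exists only for illustration.

```python
import os


def enable_pandas_on_python(env=os.environ):
    """Hypothetical helper: request pandas on Python via environment variables."""
    env["MODIN_ENGINE"] = "python"
    env["MODIN_STORAGE_FORMAT"] = "pandas"
    return env


# Assumption: this runs before Modin has read its configuration.
enable_pandas_on_python()

try:
    import modin.config as cfg
except ImportError:
    cfg = None  # Modin is not installed; only the variables were set.

if cfg is not None:
    # Report the engine and storage format that Modin resolved.
    print(cfg.Engine.get(), cfg.StorageFormat.get())
```

Passing a plain ``dict`` instead of ``os.environ`` gives the settings without touching the current process, which can be handy when building the environment for a subprocess.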
@@ -7,7 +7,7 @@ Queries that perform data transformation, data ingress or data egress using the
pass through the Modin components detailed below.

`pandas on Python` execution is sequential and is used for debugging purposes. To enable `pandas on Python` execution,
- please refer to the usage section in :doc:`pandas on Python </UsingPandasonPython/index>`.
+ please refer to the usage section in :doc:`pandas on Python </developer/using_pandas_on_python>`.

Data Transformation
'''''''''''''''''''
