PERF: Stop recomputing both indices for user-defined and dict-like axis-wide applies

### Describe the problem
We recompute both the column and row indices for column-wise and row-wise applies of user-defined callables and dict-like functions. For [dict applies](https://github.com/modin-project/modin/blob/49f47b1ed66583d4b90d3c06faf3e1d457cf4df8/modin/core/storage_formats/pandas/query_compiler.py#L2394) and for [callable applies](https://github.com/modin-project/modin/blob/49f47b1ed66583d4b90d3c06faf3e1d457cf4df8/modin/core/storage_formats/pandas/query_compiler.py#L2465), we specify neither `new_index` nor `new_columns`, so we end up [recomputing](https://github.com/modin-project/modin/blob/685f1b4eb60c3dd6c9127bee93b6c138f890f700/modin/core/dataframe/pandas/dataframe/dataframe.py#L2325) both axes. As a result, we always block on both the first row of partitions and the first column of partitions. The following example demonstrates this unnecessary blocking:

```python
import modin.pandas as pd
import numpy as np
import time
from modin.config import MinPartitionSize

num_columns = MinPartitionSize.get() + 1

# 3 rows, each containing the integers 0 through num_columns - 1 in order.
# The entire frame has a single row of two partitions: the first has
# num_columns - 1 columns and the second has one.
df = pd.DataFrame(np.tile(np.arange(num_columns), (3, 1)))

# This takes hours for the column containing 32 (num_columns - 1),
# but should return much faster for the other columns.
def col_func(col):
    if col[0] == num_columns - 1:
        time.sleep(10000)
    return col * 2

# This blocks on the last column to recompute
# the column index, so it blocks for 10000 seconds.
# Instead, it should return immediately while the last
# apply completes asynchronously.
print('starting apply...')
result = df.apply(col_func)
```
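As a rough illustration of why recomputing both axes looks avoidable, here is a minimal sketch in plain pandas (not Modin). It assumes the callable returns a Series, and `shrink` is a hypothetical function made up for this example. In that case a column-wise apply keeps the input's column labels and a row-wise apply keeps the input's row labels, so only the other axis would need to be recomputed. Whether Modin can rely on this for every callable and dict-like input is an assumption, not something verified here.

```python
import numpy as np
import pandas as pd  # plain pandas, used only to illustrate label behavior


def shrink(values):
    # Hypothetical callable: keeps the first two entries, so the
    # length along the applied axis changes.
    return values.iloc[:2] * 2


df = pd.DataFrame(np.tile(np.arange(4), (3, 1)), columns=list("abcd"))

# Column-wise apply: the row index changes, the column labels do not.
col_wise = df.apply(shrink, axis=0)

# Row-wise apply: the column labels change, the row index does not.
row_wise = df.apply(shrink, axis=1)

cols_preserved = col_wise.columns.equals(df.columns)
index_preserved = row_wise.index.equals(df.index)
print(cols_preserved, index_preserved)
```

If this holds for the cases above, the column-wise path could pass the existing columns as `new_columns` and the row-wise path could pass the existing index as `new_index`, leaving only one axis to recompute.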



PERF: Stop recomputing both indices for user-defined and dict-like axis-wide applies #4445
