Simplify rechunk implementation #2745

Merged: 1 commit, Feb 25, 2022

qinxuye
Collaborator

@qinxuye qinxuye commented Feb 23, 2022

What do these changes do?

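This PR simplifies the rechunk implementation; the previous version was complicated and hard to understand. In the runs below, rechunking a 12000 x 12000 tensor from a chunk size of 1680 to 2111 takes 2.95 s of wall time on master and 2.02 s on this branch.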

Master branch:

In [1]: import numpy as np; import mars; import mars.tensor as mt

In [2]: mars.new_session()
Web service started at http://0.0.0.0:30452
Out[2]: <mars.deploy.oscar.session.SyncSession at 0x7f8441101250>

In [3]: r = np.random.RandomState(0).rand(12000, 12000)

In [4]: a = mt.array(r, chunk_size=1680).execute()
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100.0/100 [00:04<00:00, 23.39it/s]

In [5]: %time a.rechunk(2111).execute()
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100.0/100 [00:02<00:00, 34.07it/s]
CPU times: user 1.48 s, sys: 302 ms, total: 1.78 s
Wall time: 2.95 s
Out[5]: 
array([[0.5488135 , 0.71518937, 0.60276338, ..., 0.84348096, 0.94290928,
        0.83282242],
       [0.5646904 , 0.83974605, 0.37688365, ..., 0.40347954, 0.46089504,
        0.01048474],
       [0.15319619, 0.0328116 , 0.2287953 , ..., 0.90022341, 0.49806663,
        0.77000204],
       ...,
       [0.81724816, 0.69873782, 0.06925849, ..., 0.67440032, 0.90861683,
        0.80968811],
       [0.45391215, 0.72735698, 0.11037642, ..., 0.45976168, 0.62173263,
        0.12291647],
       [0.96026503, 0.01277616, 0.49155639, ..., 0.29565832, 0.12990437,
        0.82810578]])

Current branch:

In [1]: import numpy as np; import mars; import mars.tensor as mt

In [2]: mars.new_session()
Web service started at http://0.0.0.0:27007
Out[2]: <mars.deploy.oscar.session.SyncSession at 0x7ff911701970>

In [3]: r = np.random.RandomState(0).rand(12000, 12000)

In [4]: a = mt.array(r, chunk_size=1680).execute()
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100.0/100 [00:04<00:00, 23.36it/s]

In [5]: %time a.rechunk(2111).execute()
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100.0/100 [00:02<00:00, 49.79it/s]
CPU times: user 1.39 s, sys: 209 ms, total: 1.6 s
Wall time: 2.02 s
Out[5]: 
array([[0.5488135 , 0.71518937, 0.60276338, ..., 0.84348096, 0.94290928,
        0.83282242],
       [0.5646904 , 0.83974605, 0.37688365, ..., 0.40347954, 0.46089504,
        0.01048474],
       [0.15319619, 0.0328116 , 0.2287953 , ..., 0.90022341, 0.49806663,
        0.77000204],
       ...,
       [0.81724816, 0.69873782, 0.06925849, ..., 0.67440032, 0.90861683,
        0.80968811],
       [0.45391215, 0.72735698, 0.11037642, ..., 0.45976168, 0.62173263,
        0.12291647],
       [0.96026503, 0.01277616, 0.49155639, ..., 0.29565832, 0.12990437,
        0.82810578]])
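
For readers unfamiliar with what a rechunk has to work out, here is a minimal sketch of the bookkeeping along a single axis, using the sizes from the sessions above (12000 elements, chunk size 1680 to 2111). It is plain NumPy and is not the Mars implementation in this PR; the helper names `split_sizes` and `overlaps` are hypothetical and exist only for illustration.

```python
import numpy as np


def split_sizes(total, chunk_size):
    """Chunk sizes along one axis when `total` elements are cut into
    pieces of `chunk_size` (the last piece may be smaller)."""
    full, rest = divmod(total, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])


def overlaps(old_sizes, new_sizes):
    """For each new chunk, list the (old_chunk_index, start, stop) slices
    it is assembled from; start/stop are relative to the old chunk."""
    old_bounds = np.concatenate([[0], np.cumsum(old_sizes)])
    new_bounds = np.concatenate([[0], np.cumsum(new_sizes)])
    plan = []
    for lo, hi in zip(new_bounds[:-1], new_bounds[1:]):
        # First and last old chunks that intersect the range [lo, hi).
        first = np.searchsorted(old_bounds, lo, side="right") - 1
        last = np.searchsorted(old_bounds, hi, side="left") - 1
        pieces = []
        for i in range(first, last + 1):
            start = max(lo, old_bounds[i]) - old_bounds[i]
            stop = min(hi, old_bounds[i + 1]) - old_bounds[i]
            pieces.append((int(i), int(start), int(stop)))
        plan.append(pieces)
    return plan


# Hypothetical illustration with the sizes used in the sessions above.
old = split_sizes(12000, 1680)  # 7 chunks of 1680 plus one of 240
new = split_sizes(12000, 2111)  # 5 chunks of 2111 plus one of 1445
plan = overlaps(old, new)
print(plan[0])  # [(0, 0, 1680), (1, 0, 431)]
```

Each new chunk is then a concatenation of slices of old chunks; for an n-dimensional tensor the same plan would be computed per axis and combined. Whether Mars organizes the computation this way is an assumption, not something this PR description states.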

Related issue number

Fixes #xxxx

Check code requirements

  • tests added / passed (if needed)
  • Ensure all linting tests pass, see here for how to run them

Member

@wjsi wjsi left a comment

LGTM

Contributor

@hekaisheng hekaisheng left a comment

LGTM

@hekaisheng hekaisheng merged commit d554d97 into mars-project:master Feb 25, 2022
@qinxuye qinxuye deleted the perf/rechunk branch February 25, 2022 03:55
qinxuye pushed a commit to hekaisheng/mars that referenced this pull request Mar 1, 2022
3 participants