Add mt.sort support #827

Merged (3 commits) on Nov 21, 2019

@qinxuye (Collaborator) commented Nov 18, 2019

What do these changes do?

This PR adds support for mt.sort, which implements the parallel sorting by regular sampling (PSRS) algorithm described at http://csweb.cs.wfu.edu/bigiron/LittleFE-PSRS/build/html/PSRSalgorithm.html.
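For readers unfamiliar with PSRS, the phases can be sketched in plain Python as below. This is only a sequential simulation under stated assumptions: `psrs_sort` and its parameter `p` (the number of logical workers) are hypothetical names, the pivot positions are one reasonable choice, and none of this is the code added to mars/tensor/base/sort.py, which operates on tensor chunks in parallel.

```python
import bisect
import heapq
import random


def psrs_sort(data, p):
    """Sequentially simulate PSRS with p logical workers (illustrative only)."""
    n = len(data)
    if p <= 1 or n < p * p:
        # Too little data for regular sampling to be meaningful.
        return sorted(data)

    # Phase 1: split into p contiguous chunks and sort each one locally.
    bounds = [i * n // p for i in range(p + 1)]
    chunks = [sorted(data[bounds[i]:bounds[i + 1]]) for i in range(p)]

    # Phase 2: take p regularly spaced samples from every sorted chunk.
    samples = []
    for chunk in chunks:
        step = len(chunk) // p
        samples.extend(chunk[j * step] for j in range(p))

    # Phase 3: sort the p*p samples and pick p-1 pivots,
    # one near the middle of each group of p samples after the first.
    samples.sort()
    pivots = [samples[i * p + p // 2] for i in range(1, p)]

    # Phase 4: cut every sorted chunk at the pivots; piece i goes to worker i.
    parts = [[] for _ in range(p)]
    for chunk in chunks:
        cuts = [0] + [bisect.bisect_right(chunk, pv) for pv in pivots] + [len(chunk)]
        for i in range(p):
            parts[i].append(chunk[cuts[i]:cuts[i + 1]])

    # Phase 5: each worker merges its already sorted pieces.
    merged = [list(heapq.merge(*runs)) for runs in parts]

    # Worker i holds only values between pivot i-1 and pivot i,
    # so concatenating the workers' results gives the sorted output.
    return [x for run in merged for x in run]


random.seed(0)
values = [random.random() for _ in range(10_000)]
assert psrs_sort(values, p=4) == sorted(values)
```

In a real parallel setting, phases 1, 4, and 5 run independently per worker, and only the small sample set in phase 3 has to be gathered in one place.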

I ran a small experiment sorting a 1-D array of 100 million elements.

In [1]: n = 100_000_000                                                         

In [2]: import numpy as np                                                      

In [3]: import mars.tensor as mt                                                

In [4]: a = np.random.rand(n)                                                   

In [5]: a.nbytes                                                                
Out[5]: 800000000

In [6]: %%time 
   ...: np.sort(a) 
   ...:  
   ...:                                                                         
CPU times: user 12.1 s, sys: 355 ms, total: 12.4 s
Wall time: 11.4 s

In [9]: at = mt.tensor(a, chunk_size=2_000_000)                                 

In [10]: %%time 
    ...: mt.sort(at).execute(fetch=False) 
    ...:  
    ...:                                                                        
CPU times: user 20.1 s, sys: 1.84 s, total: 21.9 s
Wall time: 3.48 s

By wall time, mt.sort is about 3.3x faster than np.sort (3.48 s vs. 11.4 s) on this 100-million-element array, although it uses more total CPU time (21.9 s vs. 12.4 s).
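The ratios can be checked directly from the timings reported in the session above. The variable names in this small sketch are illustrative; the numbers are copied from the `%%time` output.

```python
# Timings copied from the IPython session in this PR description (seconds).
np_wall, np_cpu = 11.4, 12.4   # np.sort: wall time, total CPU time
mt_wall, mt_cpu = 3.48, 21.9   # mt.sort: wall time, total CPU time

wall_speedup = np_wall / mt_wall   # how much sooner mt.sort finishes
cpu_ratio = mt_cpu / np_cpu        # how much more CPU time mt.sort spends

print(f"wall-clock speedup: {wall_speedup:.2f}x")   # wall-clock speedup: 3.28x
print(f"total CPU time ratio: {cpu_ratio:.2f}x")    # total CPU time ratio: 1.77x
```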

Related issue number

Resolves #828

@qinxuye added the mod: tensor, type: feature, and to be backported labels Nov 18, 2019
@qinxuye added this to the v0.3.0rc1 milestone Nov 18, 2019
@qinxuye added this to In progress in Tensor via automation Nov 18, 2019
@qinxuye added this to PR-In progress in v0.3 Release via automation Nov 18, 2019
@qinxuye moved this from PR-In progress to Issue-P0 in v0.3 Release Nov 19, 2019
@qinxuye moved this from Issue-P0 to Issue-Needs prioritizing in v0.3 Release Nov 19, 2019
@qinxuye moved this from Issue-Needs prioritizing to PR-In progress in v0.3 Release Nov 19, 2019
@qinxuye moved this from PR-In progress to PR-Needs review in v0.3 Release Nov 19, 2019
@hekaisheng (Contributor) left a comment

Great work! I left some comments.

mars/tensor/base/sort.py (outdated, resolved)
mars/tensor/base/sort.py (outdated, resolved)
mars/tensor/base/sort.py (outdated, resolved)
mars/tensor/base/sort.py (resolved)
@hekaisheng (Contributor) left a comment

LGTM

Tensor automation moved this from In progress to Reviewer approved Nov 21, 2019
@hekaisheng merged commit d8f4ae3 into mars-project:master Nov 21, 2019
Tensor automation moved this from Reviewer approved to Done Nov 21, 2019
v0.3 Release automation moved this from PR-Needs review to PR-Done Nov 21, 2019
@qinxuye deleted the feature/tensor-sort branch November 21, 2019 08:57
qinxuye pushed a commit to qinxuye/mars that referenced this pull request Dec 13, 2019
* add support for `mt.sort`

* add more ut and docs

* refine according to comments

(cherry picked from commit d8f4ae3)
wjsi pushed a commit that referenced this pull request Dec 13, 2019
@wjsi added the backported already label and removed the to be backported label Dec 14, 2019
@qinxuye mentioned this pull request Dec 24, 2019
Linked issue: Add mt.sort to support sorting on a tensor
3 participants