
Inconsistent results between numpy.median() and torch.median() #1837

Closed
lijunzh opened this issue Jun 18, 2017 · 7 comments

Comments

@lijunzh

lijunzh commented Jun 18, 2017

Numpy seems to give the correct median for an even number of elements (the mean of the two center elements, as defined on Wikipedia), while torch gives one of the two center elements, seemingly the one closer to their mean (I guessed this from the results I see). I am not sure if this is intended behavior or a bug we need to fix. At least, I think we should have a function that computes the standard median for comparison with numpy or other math programs, such as MATLAB.

In [88]: a = np.random.randn(4, 4)

In [89]: a
Out[89]:
array([[ 0.21654775, -2.84948564, -0.89086479,  0.31074037],
       [-2.02333919,  0.59465567,  1.55680421, -0.33646373],
       [ 1.15586001,  0.16046311,  0.01646207, -1.19663499],
       [ 0.7947269 , -0.22558656, -1.25525967, -0.29217645]])

In [90]: b = torch.from_numpy(a)

In [91]: np.median(a, axis=0)
Out[91]: array([ 0.50563732, -0.03256172, -0.43720136, -0.31432009])

In [92]: torch.median(b, dim=0)[0]
Out[92]:

 0.2165 -0.2256 -0.8909 -0.3365
[torch.DoubleTensor of size 1x4]
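
For reference, a minimal sketch of the kind of workaround I have in mind (sorting along the dimension and averaging the two middle elements; the helper name numpy_style_median is just illustrative):

import torch

def numpy_style_median(t, dim=0):
    # Sort along dim and average the two middle elements; for an odd
    # length both indices coincide, so this reduces to the usual median.
    sorted_t, _ = t.sort(dim=dim)
    n = t.size(dim)
    lower = sorted_t.select(dim, (n - 1) // 2)
    upper = sorted_t.select(dim, n // 2)
    return (lower + upper) / 2

With the 4x4 example above, numpy_style_median(b, dim=0) should match the np.median(a, axis=0) output.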
@lijunzh
Author

lijunzh commented Jun 19, 2017

By the way, it seems that PyTorch's CUDA tensors do not have a median implementation yet. I got this error message when trying to call it:

AttributeError: 'torch.cuda.DoubleTensor' object has no attribute 'median'

or

TypeError: Type torch.cuda.DoubleTensor doesn't implement stateless method median

Maybe this should be a separate issue, but I think I should mention it here since it is also related to the median function.

@soumith
Member

soumith commented Jul 13, 2017

wrt current behavior, it is intended and we will not be fixing it.

wrt median being defined for cuda, it is now defined on master and will be present in the next release.

@soumith soumith closed this as completed Jul 13, 2017
@fmassa
Member

fmassa commented Jul 14, 2017

@lijunzh for some background on the decision to implement median as is, see torch/torch7#182
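
In short, for an even number of elements torch.median returns the lower of the two middle values; if I read the behavior right, it is equivalent to kthvalue with k = (n + 1) // 2, e.g.:

import torch

a = torch.randn(6)
n = a.numel()
# The lower of the two middle values is the ((n + 1) // 2)-th smallest element.
lower_middle, _ = a.kthvalue((n + 1) // 2)
assert torch.equal(a.median(), lower_middle)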

@stiv-yakovenko

Hi! I've learned how to emulate np.median with torch:

import torch
import numpy as np

y = [1, 2, 3, 5, 9, 1]
print("numpy=", np.median(y))   # 2.5: mean of the two middle values
print(sorted(y))                # [1, 1, 2, 3, 5, 9]

yt = torch.tensor(y, dtype=torch.float32)
ymax = torch.tensor([yt.max()])
print("torch=", yt.median())    # 2.0: lower of the two middle values
# Appending the maximum makes the length odd and shifts the median up to the
# upper middle value; averaging the two calls reproduces numpy's result.
print("torch_fixed=", (torch.cat((yt, ymax)).median() + yt.median()) / 2.)

@lijunzh
Author

lijunzh commented May 1, 2019

@stiv-yakovenko Thanks for the suggestion.

ymax = torch.tensor([yt.max()])

This line can be simplified as ymax = yt.max()[None].

@sztal

sztal commented Nov 2, 2019

Hi, I know this issue is already closed, but I wonder why PyTorch has to implement median in such an inconsistent way with respect to other standard numerical libraries. I just got hit by this behavior and it took me a while to figure out what was going on.

I understand that there may be good reasons for this design decision, but I wonder what they are, as the documentation does not seem to discuss this issue.

@fxmarty

fxmarty commented Sep 18, 2020

It should be stated explicitly in the documentation that torch and numpy median behave differently.
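
Something like the following minimal example would make the difference obvious (the values in the comments assume the default behavior of both libraries):

import torch
import numpy as np

x = [1.0, 2.0, 3.0, 4.0]
print(np.median(x))                     # 2.5: mean of the two middle values
print(torch.tensor(x).median().item())  # 2.0: lower of the two middle values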
