Initial implementation of quantile operator #39417

heitorschueroff · 2020-06-02T22:54:11Z

Implementing the quantile operator similar to numpy.quantile.

For this implementation I'm reducing it to existing torch operators to get a CUDA implementation for free. Implementing a multiple-quickselect algorithm would be more efficient than sorting, but this can be addressed in a future PR.
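
As a rough illustration of the sort-based approach, here is a minimal pure-Python sketch of a quantile with linear interpolation. It is not the code in this PR: the function name `quantile_by_sort` is hypothetical, it handles a single 1-D sequence and a single `q`, and it assumes the same convention as NumPy's default linear method (fractional index `q * (n - 1)` into the sorted data).

```python
import math


def quantile_by_sort(values, q):
    """Return the q-th quantile (0 <= q <= 1) of values, interpolating linearly.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")

    # Full sort: O(n log n). A selection algorithm could avoid sorting everything.
    ordered = sorted(values)

    # Fractional position of the quantile within the sorted data.
    rank = q * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower

    # Interpolate between the two neighbouring data points.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


print(quantile_by_sort([4, 1, 3, 2], 0.5))
```

The full sort is what makes it easy to express in terms of existing operators; only the two elements around `rank` are actually needed, which is why a selection-based approach could do less work.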

dr-ci · 2020-06-03T21:01:15Z

💊 CI failures summary and remediations

As of commit 678b1dc (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

aten/src/ATen/native/native_functions.yaml

torch/_torch_docs.py

aten/src/ATen/native/Sorting.cpp

test/test_torch.py

torch/_torch_docs.py

mruberry · 2020-07-09T20:45:27Z

Ping me when this is ready for review again, @heitorschueroff

torch/_torch_docs.py

test/test_torch.py

mruberry · 2020-07-09T22:52:11Z

test/test_torch.py

+                result = torch.quantile(a, q, dim=dim, keepdim=keepdim).cpu()
+                expected = np.quantile(a.cpu().numpy(), q.cpu().numpy(), axis=dim, keepdims=keepdim)
+                expected = torch.from_numpy(np.array(expected)).type(result.type())
+                self.assertTrue(torch.allclose(result, expected, rtol=1e03, atol=1e06))


1e-03 and 1e-06?

Was there a problem with the defaults?

I've been struggling to find good values for atol and rtol because NumPy promotes float to double and therefore has better accuracy. For double it works fine, but for float the tests keep failing even though the values agree to about 5 digits.

Yes, because NumPy promotes float to double I'm struggling to find numbers that work for the float case.
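
To make the tolerance question concrete, here is a small pure-Python sketch of the elementwise criterion that `torch.allclose` documents, `|actual - expected| <= atol + rtol * |expected|`. The helper name `is_close` and the sample values are assumptions chosen for illustration; they are not taken from the failing tests.

```python
def is_close(actual, expected, rtol, atol):
    """Scalar version of the documented allclose criterion (illustrative helper)."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Two values that agree to roughly 5 significant digits (made-up sample values).
expected = 1.0
actual = 1.00002

# torch.allclose defaults (rtol=1e-05, atol=1e-08) reject this pair.
print(is_close(actual, expected, rtol=1e-05, atol=1e-08))

# The looser tolerances suggested in the review accept it.
print(is_close(actual, expected, rtol=1e-03, atol=1e-06))

# With the exponents' signs dropped, the check accepts almost anything,
# so the assertion no longer tests the result at all.
print(is_close(500.0, expected, rtol=1e03, atol=1e06))
```

The last call shows why the missing minus signs matter: a tolerance of `1e06` passes even a result that is wrong by orders of magnitude.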

mruberry · 2020-07-09T23:06:35Z

aten/src/ATen/native/Sorting.cpp

+    quantiles = quantiles.permute(numpy_dim_order);
+  }
+
+  out.copy_(quantiles.reshape(out_shape));


This copy is unfortunate. Can it be eliminated? One way to eliminate it most of the time would be to have this function return the reshaped quantiles tensor and have the out variant perform the shape check and copy it into the out_ param, if given. Another option would be to not support an out variant at all. Out is little-used and getting the result shape of this operation correct is tricky, after all.

I'll make it so that only the out variant makes the copy, but I agree that this is not optimal. I also don't like how I have to permute the dimensions to follow the NumPy shape. I think this would really benefit from a custom CPU/CUDA implementation. Maybe a follow-up PR?
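
The structure suggested above can be sketched in pure Python, with lists standing in for tensors. All three function names are hypothetical, the inputs are assumed to be a non-empty 1-D sequence with every `q` in [0, 1], and the strict length check is a simplification rather than a description of how PyTorch out variants treat the `out` argument.

```python
def quantile_impl(values, qs):
    """Compute the quantiles and return them as a fresh list (no copy into out)."""
    ordered = sorted(values)
    last = len(ordered) - 1
    result = []
    for q in qs:
        rank = q * last
        lower = int(rank)             # floor, since rank is non-negative
        upper = min(lower + 1, last)  # clamp so q == 1.0 stays in range
        weight = rank - lower
        result.append(ordered[lower] + (ordered[upper] - ordered[lower]) * weight)
    return result


def quantile(values, qs):
    # Functional variant: hands back the computed result directly, no extra copy.
    return quantile_impl(values, qs)


def quantile_out(values, qs, out):
    # Out variant: the only place that checks the shape and pays for the copy.
    result = quantile_impl(values, qs)
    if len(out) != len(result):
        raise ValueError(f"out has length {len(out)}, expected {len(result)}")
    out[:] = result
    return out


buffer = [0.0, 0.0, 0.0]
quantile_out([1, 2, 3, 4], [0.0, 0.5, 1.0], buffer)
print(buffer)
```

With this split, callers that never pass `out` avoid the copy entirely, and the shape logic lives in one function.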

mruberry

Looks really good overall. Made a few test and doc suggestions and asked a question about the algorithm.

mruberry · 2020-07-13T18:04:48Z

torch/_torch_docs.py

+quantile(input, q) -> Tensor
+
+Returns the q-th quantiles of all elements in the :attr:`input` tensor, doing a linear 
+interpolation when the q-th quantile lies between two data points i < j.


I'm a little confused by the "i < j" here. What's that intended to express? Rest of this sentence looks great.

mruberry · 2020-07-13T18:05:14Z

torch/_torch_docs.py

+
+Returns the q-th quantiles of each row of the :attr:`input` tensor along the dimension 
+:attr:`dim`, doing a linear interpolation when the q-th quantile lies between two data points 
+i < j.. By default, :attr:`dim` is `None` resulting in the :attr:`input` tensor


Same question about "i < j" here. Also you have two periods following "j."

facebook-github-bot

@heitorschueroff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Author: Heitor Schueroff <heitorschueroff@fb.com> Date: Mon Jun 8 09:18:02 2020 -0700

facebook-github-bot · 2020-07-17T19:37:50Z

@heitorschueroff merged this pull request in c7798dd.

Attempting to land quantile again after being landed here #39417 and reverted here #41616. [ghstack-poisoned]

Summary: Pull Request resolved: #42755

Attempting to land quantile again after being landed here #39417 and reverted here #41616.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D23030338

Pulled By: heitorschueroff

fbshipit-source-id: 124a86eea3aee1fdaa0aad718b04863935be26c7

mruberry self-requested a review June 3, 2020 02:04

mruberry added the "module: numpy" label (Related to numpy support, and also numpy compatibility of our operators) Jun 3, 2020

heitorschueroff requested review from apaszke, mrshenli, pritamdamania87 and zhaojuanmao as code owners June 3, 2020 18:31

mruberry removed request for apaszke, pritamdamania87, mrshenli and zhaojuanmao June 3, 2020 20:33