Choice equivalent for PyTorch #18624
Conversation
Let us know if you want a review prior to removal of WIP.

I think it would be nice.
```cpp
Tensor uniform_samples = at::rand({k}, weights.options());
Tensor cdf = weights.cumsum(0);
cdf /= cdf[-1];
samples = (uniform_samples.unsqueeze(1) > cdf.unsqueeze(0)).sum(1);
```
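For context, this hunk implements inverse-CDF sampling. A minimal standalone PyTorch illustration of the same idea (with hypothetical values, not code from the PR):

```python
import torch

# Inverse-CDF sampling: draw k uniforms and locate each one in the
# normalized cumulative distribution of the weights.
weights = torch.tensor([0.1, 0.2, 0.7])
k = 5
u = torch.rand(k)                          # k uniform samples in [0, 1)
cdf = weights.cumsum(0) / weights.sum()    # normalized CDF: [0.1, 0.3, 1.0]
# For each u_i, count how many CDF entries it exceeds; that count is the
# sampled index, so index j is drawn with probability weights[j] / sum.
samples = (u.unsqueeze(1) > cdf.unsqueeze(0)).sum(1)
print(samples)  # e.g. tensor([2, 2, 1, 2, 0])
```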
I'm making some changes in `sampling_with_replacement` for performance reasons. These are going to be the last improvements I have in mind prior to a review.

The CUDA extensions are not randomizing correctly; it's as if the random states don't actually change much. I'm investigating how to fix it, and I'll try to get some inspiration from the existing implementations.
I've made some checks available here. I won't be making any more changes until a review, and I'm looking forward to it! Thanks!
I was reviewing this PR, and my biggest comment is around implementing choice kernels. I think it'd be much simpler and probably much more performant if `choice` just used `torch.multinomial` + advanced indexing, instead of its own kernels -- considering how long we took to optimize multinomial. Also, the correctness tests look like a good start but are insufficient.
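A minimal sketch of that suggestion (the function name is hypothetical; only `torch.multinomial` and advanced indexing are real APIs):

```python
import torch

def choice_via_multinomial(x, weights, replace, k):
    # Draw k indices with probability proportional to the weights, then
    # gather the corresponding entries of x with advanced indexing.
    idx = torch.multinomial(weights, k, replacement=replace)
    return x[idx]

# Usage: sample 3 elements of x with replacement.
x = torch.arange(10.)
w = torch.rand(10)
print(choice_via_multinomial(x, w, True, 3))
```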
Thanks @soumith for your feedback. I've run some benchmarks; I could perform more tests with bigger tensors and with CUDA later if needed. I can improve the correctness tests, but first I'd like to know what would make you interested in this implementation. It seems to me that this implementation only makes sense now if I can prove performance gains in all cases.
I've done some tests with big CUDA tensors.
I've been checking `choice`'s sampling distribution correctness. So far, everything is working properly. Here is a snippet I used to run some tests (change the parameters as needed):

```python
import torch
import torch.nn.functional as F
import numpy as np

m = 20
n = 10000
k = 4
replace = False
device = 'cpu'

###################################
# Comparing Choice vs Multinomial #
###################################
multinomial_samples = []
choice_samples = []
weights = torch.rand(m, device=device)
for _ in range(n):
    multinomial_samples += torch.multinomial(
        weights,
        k,
        replace
    ).cpu().numpy().tolist()
    choice_samples += torch.choice(
        torch.arange(m).to(device),
        weights,
        replace,
        k
    ).cpu().numpy().tolist()
_, multinomial_dist = np.unique(multinomial_samples, return_counts=True)
_, choice_dist = np.unique(choice_samples, return_counts=True)
multinomial_dist = torch.Tensor(multinomial_dist) / (n * k)
choice_dist = torch.Tensor(choice_dist) / (n * k)
print(F.kl_div(choice_dist.log(), multinomial_dist, reduction='sum'))

############################################
# Comparing Choice vs Correct distribution #
############################################
choice_samples = []
weights = torch.rand(m, device=device)
for _ in range(n):
    choice_samples += torch.choice(
        torch.arange(m).to(device),
        weights,
        replace,
        1
    ).cpu().numpy().tolist()
correct_dist = weights / weights.sum()
correct_dist = correct_dist.to('cpu')
_, choice_dist = np.unique(choice_samples, return_counts=True)
choice_dist = torch.Tensor(choice_dist) / n
print(F.kl_div(choice_dist.log(), correct_dist, reduction='sum'))
```
I had a quick look at our current implementation. It seems that our GPU implementation launches many kernels (see `pytorch/aten/src/THC/generic/THCTensorRandom.cu`, lines 252 to 274 at `b6d0f6c`), which might explain the slowdown of torch.multinomial compared to @LeviViana's implementation of choice, which launches only one or two kernels.

I didn't think carefully about why this is needed in the current implementation.
```
@@ -1802,6 +1802,16 @@
    CPU: randperm_out_cpu
    CUDA: randperm_out_cuda

- func: choice(Tensor input, Tensor weights, bool replace, int k) -> Tensor
```
I would prefer if the arguments had the same names and order as in its inspiration in NumPy (with the exception of `input`, which we can leave as-is, since that's a PyTorch standard).
I'll make this change as well.
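For reference, NumPy's signature is `numpy.random.choice(a, size=None, replace=True, p=None)`. A declaration reordered to match it might look like the following sketch (the exact schema is an assumption, not the PR's final code):

```yaml
# Hypothetical argument order mirroring NumPy's (a, size, replace, p):
- func: choice(Tensor input, int k, bool replace, Tensor weights) -> Tensor
```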
```cpp
    int64_t k
){
    at::Tensor weights = at::empty({0}, input.options().dtype(at::kFloat));
    if (replace){
```
Here, instead of duplicating the code, you can just call through to `native::choice_cpu(input, weights, replace, k)`.
Indeed, I'll make this change.
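A hedged sketch of that refactor, with the signature assumed from the surrounding diff:

```cpp
// The no-weights overload delegates to the weighted one instead of
// duplicating the sampling logic; empty weights signal uniform sampling.
Tensor choice_cpu(const Tensor& input, bool replace, int64_t k) {
    Tensor weights = at::empty({0}, input.options().dtype(at::kFloat));
    return native::choice_cpu(input, weights, replace, k);
}
```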
```cpp
    const Tensor& weights,
    int64_t k
){
    int n = x.size(0);
```
So, I guess we are assuming the input tensor `x` is 1-D, which seems reasonable since that's what NumPy does. But we need to actually check that and report an error if it's not the case.
I'm not assuming `x` is 1-D; instead, I'm forcing the sampling to happen only along the first dimension. If `x = torch.Tensor([[1, 2], [3, 4]])`, then `torch.choice(x, w, True, 3)` can be `torch.Tensor([[1, 2], [1, 2], [3, 4]])`, for instance (i.e. it sampled indices `0, 0, 1`).
Got it. Anyway, we should still check that there is at least one dimension, because this will barf if you call it on a 0-dim tensor.
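A minimal sketch of such a check, using the same `AT_CHECK` macro as the rest of the PR (the message is illustrative):

```cpp
// Reject 0-dim tensors up front, since x.size(0) would throw otherwise.
AT_CHECK(x.dim() >= 1, "choice expects a tensor with at least one dimension");
```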
```cpp
int64_t *samples_ptr = samples.data<int64_t>();

Tensor cdf = weights.cumsum(0);
cdf /= cdf[-1];
```
If somebody passes in a tensor with all zero weights, this divides by zero.
You are right, I'll fix this.
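One possible guard, as a sketch (the message and exact placement are assumptions):

```cpp
// Refuse an all-zero weight vector before normalizing the CDF, since
// cdf /= cdf[-1] would otherwise divide by zero.
Tensor cdf = weights.cumsum(0);
AT_CHECK(cdf[-1].item<double>() > 0,
         "invalid weights: the sum of the weights must be positive");
cdf /= cdf[-1];
```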
```cpp
Tensor cdf = weights.cumsum(0);
cdf /= cdf[-1];

AT_DISPATCH_FLOATING_TYPES(weights.scalar_type(), "Sampling with replacement", [&] {
```
Why not allow weights to be an integral type?
I'll make that change as well. I guess some casting will be necessary in the CUDA kernels though.
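A sketch of what the wider dispatch could look like, assuming the lambda body stays as in the PR:

```cpp
// AT_DISPATCH_ALL_TYPES covers integral dtypes as well as float/double;
// real-valued intermediate math then needs explicit casts when scalar_t
// is integral, as noted above.
AT_DISPATCH_ALL_TYPES(weights.scalar_type(), "Sampling with replacement", [&] {
    // ... same body as before ...
});
```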
```cpp
AT_CHECK(
    weights.is_contiguous(),
    "The sampling weights must be contiguous."
```
why?
You are right, it doesn't need to be. I can just check whether the weights are contiguous and, if they aren't, force them to be. I'll make this change, thanks.
This is looking good, thanks!

I've looked at the CPU part for now; there are a few things that I think should be improved. Let me know what you think.
```cpp
Tensor weights_contiugous;

if(!weights.is_contiguous()){
    weights_contiugous = weights.contiguous();
```
You can always call `weights.contiguous()` unconditionally; this avoids a copy in the case the tensor is already contiguous.
```cpp
if(!weights.is_contiguous()){
    weights_contiugous = weights.contiguous();
}else{
    weights_contiugous = weights.clone();
```
do you need to clone it here?
Not anymore!
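Combining both comments gives a one-liner, as a sketch (the variable name is kept from the diff, typo included, for continuity):

```cpp
// contiguous() returns the tensor itself (no copy) when it is already
// contiguous, so neither the branch nor the clone() is needed.
Tensor weights_contiugous = weights.contiguous();
```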
```cpp
}

AT_CHECK(
    weights_contiugous.device() == x.device(),
```
There is a typo here: it's meant to be `contiguous`.
```cpp
AT_DISPATCH_FLOATING_TYPES(weights_contiugous.scalar_type(), "generate keys", [&] {
    generate_keys<scalar_t>(
        keys.data<scalar_t>(),
        weights_contiugous.data<scalar_t>(),
```
You have an assert at the beginning that the weights should be `float`, so this doesn't work for `double`?
You are right, I'll fix this.
```cpp
){

    AT_CHECK(
        weights.dtype() == kFloat,
```
Do you mean that you want it to be a floating point type, like `double` or `float`?
I'll remove this check.
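If some dtype validation is still wanted, one option is to accept any floating point type rather than only `kFloat` (a sketch, not the PR's final code):

```cpp
// at::isFloatingType accepts kHalf, kFloat, and kDouble alike.
AT_CHECK(at::isFloatingType(weights.scalar_type()),
         "The sampling weights must be a floating point tensor.");
```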
I've done some benchmarks against NumPy, using only the CPU implementation for these tests. Here are the results for sampling 2k items out of 10k elements.

To reproduce:
@LeviViana I'm not sure how the two implementations compare now. Could you try:

```bash
python3 -m timeit --setup="import torch; x = torch.arange(10 ** 7).cuda(); w = torch.arange(10 ** 7).cuda().float() + 1; J, q = torch._multinomial_alias_setup(w)" "x[torch._multinomial_alias_draw(q, J, 10 ** 4)]"
python3 -m timeit --setup="import torch; x = torch.arange(10 ** 7).cuda(); w = torch.arange(10 ** 7).cuda().float() + 1" "torch.choice(x, 10 ** 4, True, w)"
```
Thanks @ptrblck, you are right.
Thanks for the update @LeviViana! I've built your current branch and used this script to profile both methods:

```python
import torch
import time

# Setup
x = torch.arange(10 ** 7).cuda()
w = torch.arange(10 ** 7).cuda().float() + 1
nb_iters = 1000

# Warmup
for _ in range(50):
    J, q = torch._multinomial_alias_setup(w)
    output = x[torch._multinomial_alias_draw(q, J, 10 ** 4)]

# Profile 1
torch.cuda.synchronize()
t0 = time.time()
for _ in range(nb_iters):
    J, q = torch._multinomial_alias_setup(w)
    output = x[torch._multinomial_alias_draw(q, J, 10 ** 4)]
torch.cuda.synchronize()
t1 = time.time()
print('elapsed {:.6f}s/iter'.format((t1 - t0) / nb_iters))

# Warmup
for _ in range(50):
    output = torch.choice(x, 10 ** 4, True, w)

# Profile 2
torch.cuda.synchronize()
t0 = time.time()
for _ in range(nb_iters):
    output = torch.choice(x, 10 ** 4, True, w)
torch.cuda.synchronize()
t1 = time.time()
print('elapsed {:.6f}s/iter'.format((t1 - t0) / nb_iters))
```

On a TitanV with CUDA 10.1.105 and cuDNN 7500 I get the following numbers:
Could you run this script and check your current output?
Thanks @ptrblck for the script. These are my outputs (RTX 2080Ti, CUDA 10.1, CuDNN 7.5):
Indeed, the multinomial aliases are faster than choice for sampling with replacement.
It would be interesting to establish an objective for this PR. Based on what I've noticed so far, we could combine the two approaches; this way we get the best of both implementations. If you validate this plan, I can make the changes and update the PR. What do you think @gchanan @soumith?
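A hedged sketch of such a combined plan, inferred from the benchmarks above (every name except `torch._multinomial_alias_setup` and `torch._multinomial_alias_draw` is hypothetical):

```python
import torch

def choice(x, k, replace, weights):
    if replace:
        # The alias method won the with-replacement benchmarks above.
        J, q = torch._multinomial_alias_setup(weights)
        return x[torch._multinomial_alias_draw(q, J, k)]
    # Keep the dedicated kernel where it is competitive: sampling
    # without replacement.
    return torch._choice_without_replacement(x, k, weights)  # hypothetical
```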
Looks like this PR hasn't been updated in a while, so we're going to go ahead and mark this as Stale.
Any updates on this? It could be interesting to integrate this into the C++ API (I don't know if the .yaml files allow for auto-generating C++ API code too), since you currently have to use a slower alternative.
Also, why is this getting compared to
Looks like this PR hasn't been updated in a while, so we're going to go ahead and mark this as Stale.
/easycla As part of the transition to the PyTorch Foundation, this project now requires contributions be covered under the new CLA. See #85559 for additional details. This comment will trigger a new check of this PR. If you are already covered, you will simply see a new "EasyCLA" check that passes. If you are not covered, a bot will leave a new comment with a link to sign.
Any hope to get this re-opened?
Why did this issue get closed? It would be really great to have `torch.choice`.
Related to #16897 and #18457.