
cauchy_ few fixes (1) check gamma > 0 (2) better dtype error log #93314

Closed · wants to merge 2 commits

Conversation

min-jean-cho (Collaborator) commented Jan 30, 2023

Related #92047

(1) torch.Tensor.cauchy_ is missing a check for gamma > 0 (torch.distributions.cauchy.Cauchy correctly checks gamma > 0).
(2) Add a better dtype error message, similar to the one in exponential_.
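For illustration, here is a pure-Python sketch of the two fixes. The function name, error wording, and inverse-CDF sampling are hypothetical analogues for this discussion, not the actual ATen implementation:

```python
import math
import random

def cauchy_(values, median=0.0, sigma=1.0, rng=random):
    """Hypothetical pure-Python analogue of torch.Tensor.cauchy_.

    Mirrors the two fixes in this PR: reject a non-positive scale
    (sigma here, gamma in the PR title) and reject non-floating storage.
    """
    # Fix (1): the scale parameter must be strictly positive.
    if not sigma > 0.0:
        raise ValueError(f"cauchy_ expects sigma > 0.0, but found sigma={sigma}")
    # Fix (2): clearer dtype error, analogous to the exponential_ check.
    if values and not isinstance(values[0], float):
        raise TypeError("Cauchy distribution is a continuous probability "
                        "distribution; dtype must be a floating point type")
    # Inverse-CDF sampling: F^{-1}(u) = median + sigma * tan(pi * (u - 0.5)).
    for i in range(len(values)):
        values[i] = median + sigma * math.tan(math.pi * (rng.random() - 0.5))
    return values
```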

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

pytorch-bot bot commented Jan 30, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/93314

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7370c0a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions github-actions bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Jan 30, 2023
min-jean-cho added a commit to min-jean-cho/pytorch that referenced this pull request Jan 31, 2023
lezcano (Collaborator) left a comment

The entry point to these functions for both CPU and CUDA is the function called cauchy_. Why don't you put all these checks there?
The other common place for CPU and CUDA is cauchy_impl_, so either should work.

lezcano (Collaborator) commented Jan 31, 2023

Also, in the future, consider using ghstack for submitting these related PRs.

@@ -322,6 +322,9 @@ Tensor& exponential_impl_(Tensor& self, double lambda, c10::optional<Generator>

template<template<typename> class cauchy_kernel, typename RNG>
Tensor& cauchy_impl_(Tensor& self, double median, double sigma, c10::optional<Generator> gen) {
// TODO: instead of variable name 'sigma', use 'gamma' or 'scale'
// the variance, squared sigma, is undefined for cauchy distribution
TORCH_CHECK(sigma > 0.0, "cauchy_ expects sigma > 0.0, but found sigma=", sigma);
fritzo (Collaborator) commented:
Why restrict to sigma > 0 rather than sigma >= 0? I would expect .cauchy_(median, sigma=0) to behave as .fill_(median).
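The expectation that .cauchy_(median, sigma=0) behaves as .fill_(median) can be seen from the limit of the quantile function. A minimal sketch, assuming the standard Cauchy parameterization (the helper name is hypothetical):

```python
import math

def cauchy_quantile(u, median=0.0, sigma=1.0):
    # Inverse CDF of the Cauchy distribution:
    # F^{-1}(u) = median + sigma * tan(pi * (u - 0.5))
    return median + sigma * math.tan(math.pi * (u - 0.5))

# With sigma = 0 the quantile function is constant: every sample equals
# the median, i.e. the behavior of .fill_(median).
samples = [cauchy_quantile(u, median=3.0, sigma=0.0) for u in (0.1, 0.5, 0.9)]
```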

min-jean-cho (Collaborator, Author) replied:

Thanks @fritzo, I have two comments regarding sigma = 0.

(1) lim_{sigma -> 0} F^{-1}(x; sigma, median) = median. This is perfectly sound as sigma -> 0.
(2) Both the PDF and CDF of the Cauchy distribution are undefined when sigma = 0.

Considering the two points above, sure, we could define the behavior at sigma = 0 as the limit as sigma -> 0, but the limit is not the same as sigma = 0 itself, and the PDF and CDF collapse at sigma = 0.
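The PDF collapse can be checked numerically. A short sketch using the standard density f(x) = sigma / (pi * ((x - median)^2 + sigma^2)), with a hypothetical helper name:

```python
import math

def cauchy_pdf(x, median=0.0, sigma=1.0):
    # Cauchy density: f(x) = sigma / (pi * ((x - median)^2 + sigma^2))
    return sigma / (math.pi * ((x - median) ** 2 + sigma ** 2))

# As sigma -> 0 the density degenerates: it diverges at the median and
# vanishes everywhere else; at sigma = 0 exactly it is 0/0 at x = median.
for sigma in (1.0, 0.1, 0.01):
    print(sigma, cauchy_pdf(0.0, 0.0, sigma), cauchy_pdf(1.0, 0.0, sigma))
```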

fritzo (Collaborator) replied:

OTOH, this is a sampler, and the distribution of samples converges. I'd imagine this function being used for initialization, and it's nice to be able to set sigma=0 rather than add if ... else logic to decide whether to inject noise.

(I think this is less important, but the CDF does converge pointwise almost everywhere to the step function.)

min-jean-cho (Collaborator, Author) replied:

this is a sampler, and the distribution of samples converges

The Cauchy distribution doesn't obey the law of large numbers: the sample mean doesn't converge to a finite number (reflecting that the expected value is undefined for Cauchy) but instead converges, trivially, to a Cauchy random variable. So I don't think it's sound to return .fill_(median) for sigma = 0 and thereby have mean({x_1, x_2, ..., x_n}) = median.
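The law-of-large-numbers point can be checked numerically. A minimal sketch, assuming inverse-CDF sampling of the standard Cauchy:

```python
import math
import random

random.seed(0)

def standard_cauchy(rng):
    # Inverse-CDF sample: tan(pi * (U - 0.5)) with U uniform on [0, 1)
    return math.tan(math.pi * (rng.random() - 0.5))

n = 100_000
samples = sorted(standard_cauchy(random) for _ in range(n))
sample_median = samples[n // 2]
sample_mean = sum(samples) / n
# The sample median is a consistent estimator of the location (here 0),
# while the sample mean does not converge: it is itself Cauchy-distributed
# and can be arbitrarily far from 0 for any n.
print(sample_median, sample_mean)
```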

I'd imagine this function would be used for initialization, and it's sure nice if I can set sigma=0 rather than add if ... else logic to decide whether to inject noise.

I'm not sure which applications commonly use Cauchy random variables, but I would guess many can't soundly set sigma = 0. Intuitively, it would be a distribution of cut distances with no cut?

More importantly, the derivation of the Cauchy CDF from tan θ = x / b isn't sound for b = 0. What would we return for the CDF?

So I don't think it's sound, intuitively or physically, to define the behavior for sigma = 0. What do you think?

[reference] https://mathworld.wolfram.com/CauchyDistribution.html eq. 1-9
https://en.wikipedia.org/wiki/Cauchy_distribution#/media/File:Mean_estimator_consistency.gif

fritzo (Collaborator) commented Feb 1, 2023

I guess my main motivation is to continue satisfying an equation like

x.cauchy_(median, sigma) == x.cauchy_().mul_(sigma).add_(median)  # in distribution

for any median and sigma >= 0, as the current code does. I'm worried that if we add a check for sigma > 0, we'll bloat user code like this:

- x.cauchy_(median, sigma)
+ if sigma > 0:
+     x.cauchy_(median, sigma)
+ else:
+     x.fill_(median)

but maybe that's ok 🤔

I guess in the bigger picture, this is a sampler and not a Distribution object, so I'd think it's less important to talk about the CDF or central limit theorem. I do agree with you that torch.distributions.Cauchy should restrict scale > 0, since that Distribution object should support methods like .log_prob() that are undefined at scale == 0. Maybe as a policy we could require stricter checks in torch.distributions wrappers and use looser checks in .{distribution}_() methods?

min-jean-cho (Collaborator, Author) replied:

but maybe that's ok 🤔

Thanks @fritzo, I think this is OK : ). At the end of the day, whether torch.Tensor.cauchy_ is a sampler or torch.distributions.Cauchy is a distribution, the generated random variable should follow the Cauchy distribution.

If X follows the standard Cauchy distribution, median is a real number, and sigma > 0, then Y = median + sigma * X follows a Cauchy distribution. And we won't guarantee any behavior for parameterizations outside that range. What do you think?
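The affine property Y = median + sigma * X (and hence the equation in distribution above) can be verified through the quantile function: the two samplers agree at every quantile, so they agree in distribution when driven by the same uniforms. A minimal sketch with a hypothetical helper name:

```python
import math

def cauchy_quantile(u, median=0.0, sigma=1.0):
    # Inverse CDF: F^{-1}(u) = median + sigma * tan(pi * (u - 0.5))
    return median + sigma * math.tan(math.pi * (u - 0.5))

# For sigma > 0, sampling Cauchy(median, sigma) directly and shifting/scaling
# a standard Cauchy sample yield the same value at every quantile u.
median, sigma = 2.0, 0.5
for u in (0.05, 0.25, 0.5, 0.75, 0.95):
    direct = cauchy_quantile(u, median, sigma)
    affine = median + sigma * cauchy_quantile(u)  # standard, then shift/scale
    assert math.isclose(direct, affine)
```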

fritzo (Collaborator) commented Feb 3, 2023

Sounds good to me. Thanks for caring about design choices, I think it's worth the debate 😄

min-jean-cho (Collaborator, Author) replied:

haha, thanks @fritzo, enjoyed our debate : )

min-jean-cho (Collaborator, Author) commented:

@jgong5, accidentally clicked re-request from you. Please feel free to ignore.

lezcano (Collaborator) commented Feb 3, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 3, 2023
pytorchmergebot commented:

Merge failed

Reason: Approval needed from one of the following (Rule 'superuser'):
gottbrath, c-odrin, xing-liu, mattjgalloway, nanoax, ...

Details for Dev Infra team · Raised by workflow job

lezcano (Collaborator) commented Feb 3, 2023

@pytorchbot merge

pytorchmergebot commented:
Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

Labels: ciflow/trunk (Trigger trunk jobs on your pull request) · Merged · module: cpu (CPU specific problem (e.g., perf, algorithm)) · open source
6 participants