
Problem with Polytope Sampler #1225

Closed
jduerholt opened this issue May 17, 2022 · 3 comments

@jduerholt
Contributor

Hi,

I encountered a problem with `get_polytope_samples` when the individual variables live on very different scales. Here is a small didactic example:

```python
from torch import tensor
from botorch.utils.sampling import get_polytope_samples
import matplotlib.pyplot as plt

samples = get_polytope_samples(
    n=100,
    bounds=tensor([[0.1, 0.3, 0.1, 30.0], [0.6, 0.7, 0.7, 700.0]]),
    equality_constraints=[(tensor([1, 2, 0]), tensor([1.0, 1.0, 1.0]), 1.0)],
)

fig, ax = plt.subplots()
ax.hist(samples.numpy()[:, -1])
plt.show()
```

The bounds are as follows:

  • 0.1 - 0.6
  • 0.3 - 0.7
  • 0.1 - 0.7
  • 30 - 700

In addition, an equality constraint is applied to the first three variables, with all coefficients equal to 1 and an RHS of 1.

Sampling 100 points leads to the following histogram for the fourth variable (bounds: 30-700). The distribution is not uniform at all; the values stay within a very narrow interval.

[histogram: samples of the fourth variable clustered in a narrow sub-interval of 30-700]

In the next step, I scaled all variables into the 0-1 range, including the equality constraint, using the formula below:

x_i* = (x_i - lb_i) / s_i

where x_i* is the scaled variable, lb_i is the lower bound, and s_i is the difference between the upper and lower bound of variable i. The code below generates 100 samples for this case, including scaling back into the original bounds:

```python
lower = tensor([0.1, 0.3, 0.1, 30.0])
upper = tensor([0.6, 0.7, 0.7, 700.0])

space = upper - lower

scaled_samples = get_polytope_samples(
    n=100,
    bounds=tensor([[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]),
    equality_constraints=[(tensor([1, 2, 0]), tensor([0.4000, 0.6000, 0.5000]), 0.5)],
)

samples = lower + scaled_samples * space

fig, ax = plt.subplots()
ax.hist(samples.numpy()[:, -1])
plt.show()
```

Now, the fourth variable is sampled in a much more uniform way:

[histogram: samples of the fourth variable spread roughly uniformly over 30-700]

I think the reason for this behavior is that the hit-and-run sampler only needs a small step to reach the boundary of the polytope (defined by the equality constraint), and that this step is too small to guarantee proper sampling of the fourth variable.
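The anisotropy is easy to see numerically. A minimal sketch (stdlib only, using the box bounds from this issue and a direction with equal components; the actual sampler draws random directions, but the effect is the same): from the box center, the largest feasible step is capped by the narrow coordinates, so the wide fourth variable barely moves.

```python
import math

lower = [0.1, 0.3, 0.1, 30.0]
upper = [0.6, 0.7, 0.7, 700.0]
center = [(lo + up) / 2 for lo, up in zip(lower, upper)]

# Unit direction with equal weight on every coordinate.
d = [1 / math.sqrt(4)] * 4

# Moving from the center along +d, only the upper bounds bind;
# per-coordinate maximal step lengths before hitting a bound:
per_coord = [(up - c) / di for c, up, di in zip(center, upper, d)]
t_max = min(per_coord)

print(per_coord)  # the fourth coordinate alone would allow a step of 670
print(t_max)      # but the second coordinate caps the step at ~0.4
```

So the fourth variable moves by at most about 0.2 per step, a tiny fraction of its 670-wide range.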

I would recommend building this scaling directly into the `get_polytope_samples` method (I could provide a PR), as it is also used by `gen_batch_initial_conditions` when linear constraints are present. If people are not running the optimization in the scaled variable space but are using a `Normalize` input transform instead, this behavior can lead to problems in the optimization process.
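For illustration, the rescaling I am proposing can be sketched as a small standalone helper (hypothetical, stdlib only, not an actual BoTorch function); applied to the constraint from this issue it reproduces the scaled coefficients used above.

```python
def scale_equality_constraint(indices, coeffs, rhs, lower, upper):
    """Rewrite sum_i c_i * x_i = rhs in terms of unit-cube variables.

    With x_i = lower_i + s_i * y_i and s_i = upper_i - lower_i, the
    constraint becomes sum_i (c_i * s_i) * y_i = rhs - sum_i c_i * lower_i.
    """
    s = [u - l for l, u in zip(lower, upper)]
    new_coeffs = [c * s[i] for i, c in zip(indices, coeffs)]
    new_rhs = rhs - sum(c * lower[i] for i, c in zip(indices, coeffs))
    return indices, new_coeffs, new_rhs

idx, c, r = scale_equality_constraint(
    [1, 2, 0], [1.0, 1.0, 1.0], 1.0,
    lower=[0.1, 0.3, 0.1, 30.0], upper=[0.6, 0.7, 0.7, 700.0],
)
print(c, r)  # approximately [0.4, 0.6, 0.5] and 0.5
```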

What do you think?

Best,

Johannes

@Balandat
Contributor

Thanks for flagging, this is a great observation. It is indeed a problem that "small step" means different things in different directions; we really want this to be isotropic. I'm not quite sure I fully understand where things go wrong in the code; I'll have to take a closer look.

But I think your suggestion sounds great, and we'd very much welcome a PR! I think we can go even a bit more low-level than building it just into the `get_polytope_samples` method. Ideally we could make this part of the `PolytopeSampler` class: do the normalization in `__init__`, save the scaling factors there, and then change `draw` to call an abstract `_draw` method and automatically rescale the outputs. This might be a bit more work than your suggestion, though, so I guess we can start with the simpler approach.

One of the things we may want to think about a bit more is how we deal with cases in which we don't have bounds, and the polytope is defined purely via inequality constraints. I guess we could just normalize the coefficients in each column of A to have a range comparable to those in the other columns and then record those scaling factors?
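A rough sketch of that idea (stdlib only, hypothetical, not the eventual implementation): scale each column of A by its largest absolute coefficient, record the factors, and divide them back out of the samples.

```python
def normalize_columns(A):
    """Substitute x_j = y_j / alpha_j with alpha_j = max_i |A_ij|,
    so every column of the transformed matrix has max magnitude 1."""
    n_cols = len(A[0])
    alpha = [max(abs(row[j]) for row in A) or 1.0 for j in range(n_cols)]
    A_scaled = [[row[j] / alpha[j] for j in range(n_cols)] for row in A]
    return A_scaled, alpha

def unscale_sample(y, alpha):
    # Undo the substitution to recover a point of the original polytope.
    return [yj / aj for yj, aj in zip(y, alpha)]

A = [[1.0, 1.0, 1.0, 0.0],     # sum of the first three variables
     [0.0, 0.0, 0.0, 0.001]]   # a much smaller coefficient on the wide variable
A_scaled, alpha = normalize_columns(A)
print(alpha)        # [1.0, 1.0, 1.0, 0.001]
print(A_scaled[1])  # [0.0, 0.0, 0.0, 1.0]
```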

@jduerholt
Contributor Author

I will first file a PR with the simpler solution in `get_polytope_samples` for the case where we actually have bounds. For the case without bounds, I am not yet 100% sure.

I hope to have a first PR by the end of the week ;)

@esantorella
Member

I approved #1341, but I'm going to leave this open since there are a few minor improvements we can make:

  • #1341 introduces a minor break to backwards compatibility by raising an error when non-integer indices are passed to inequality_constraints or equality_constraints. In a way this is good, because non-integer indices don't make sense and aren't usually allowed, but it would be better to warn rather than error, at least for the next release. We could eventually deprecate support for the dtypes that don't work here.
  • Since this problem likely arises deeper in the stack than in get_polytope_samples, the fix should live where the issue starts (likely HitAndRunPolytopeSampler.draw or PolytopeSampler).
  • Ideally, we'd have tests for get_polytope_samples and PolytopeSampler subclasses that fail given the old behavior and pass given the new behavior. We should have been more specific about what polytope samples ought to look like in the first place to prevent this bug from arising.
  • It would be nice to have a clear understanding of where the flawed behavior came from in the first place!
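A behavioral test along these lines might assert that every coordinate of the draws covers most of its feasible range. Sketch below (stdlib only; `spans_bounds` is a hypothetical helper, exercised here with plain uniform draws rather than an actual polytope sampler, since the point is the shape of the assertion):

```python
import random

def spans_bounds(samples, lower, upper, min_coverage=0.9):
    """True if, for each coordinate, the sampled range covers at least
    `min_coverage` of the interval [lower_j, upper_j]."""
    for j, (lo, up) in enumerate(zip(lower, upper)):
        col = [s[j] for s in samples]
        coverage = (max(col) - min(col)) / (up - lo)
        if coverage < min_coverage:
            return False
    return True

random.seed(0)
lower, upper = [0.0, 30.0], [1.0, 700.0]
draws = [[random.uniform(lo, up) for lo, up in zip(lower, upper)]
         for _ in range(500)]
print(spans_bounds(draws, lower, upper))  # uniform draws pass the check
```

The buggy behavior reported in this issue (all draws of the wide variable stuck in a narrow interval) would fail such a check, while the scaled sampler would pass it.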

facebook-github-bot pushed a commit that referenced this issue Aug 10, 2022
Summary:
## Motivation

As discussed in issue #1225, the polytope sampler runs into problems if the variables live in different dimensions. This PR fixes the issue using the approach mentioned in the issue above.

### Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)?

Yes

Pull Request resolved: #1341

Test Plan: Unit tests

Reviewed By: esantorella

Differential Revision: D38485035

Pulled By: Balandat

fbshipit-source-id: 8e6ed1c1b81f636d108becb1ab5cf2ca13294420
@esantorella esantorella self-assigned this Aug 11, 2022
facebook-github-bot pushed a commit that referenced this issue Aug 12, 2022
Summary:
## Motivation

This is fixing a very minor issue and I've probably already spent too much time on it. Unless anyone feels strongly about this, I'd prefer to have it quickly either accepted or rejected rather than spend a while iterating on revisions.

In the past it was possible to use indexers with dtypes that torch does not accept as indexers via `equality_constraints` and `inequality_constraints`. This was never really intended behavior and stopped being supported in #1341 (discussed in #1225). This PR makes errors more informative if someone does try to use the wrong dtypes, since the existing error message did not make clear where the error came from.

I also refactored a test in test_initializers.py.

### Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)?

Yes

Pull Request resolved: #1345

Test Plan:
Unit tests for errors raised

## Related PRs

#1341

Reviewed By: Balandat

Differential Revision: D38627695

Pulled By: esantorella

fbshipit-source-id: e9e4e917b79f81a36d74b79ff3b0f710667283cb
@esantorella closed this as not planned on May 5, 2023
Balandat added a commit to Balandat/botorch that referenced this issue Jun 2, 2024
* adds an `n_thinning` argument to `sample_polytope` and `HitAndRunPolytopeSampler`; changes the defaults for `HitAndRunPolytopeSampler` args to `n_burnin=200` and `n_thinning=20`
* normalizes the (inequality and equality) constraints in `HitAndRunPolytopeSampler` to avoid the same issue as pytorch#1225
* introduces `normalize_dense_linear_constraints` to normalize constraints given in dense format to the unit cube
* deprecates `normalize_linear_constraint`; `normalize_sparse_linear_constraints` is to be used instead

Note: This change is in preparation for fixing facebook/Ax#2373
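For reference, the dense-format transformation such a helper performs can be sketched as follows (stdlib only, an assumed re-derivation of the math rather than the actual BoTorch code): with x = lb + s * y mapping the unit cube onto [lb, ub], the system A x <= b becomes (A diag(s)) y <= b - A lb.

```python
def normalize_dense(A, b, lower, upper):
    """Transform A x <= b (or A x = b) into unit-cube coordinates y,
    where x = lower + s * y and s = upper - lower."""
    s = [u - l for l, u in zip(lower, upper)]
    A_norm = [[aij * sj for aij, sj in zip(row, s)] for row in A]
    b_norm = [bi - sum(aij * lj for aij, lj in zip(row, lower))
              for row, bi in zip(A, b)]
    return A_norm, b_norm

# The equality constraint from this issue, written as a dense row:
A, b = [[1.0, 1.0, 1.0, 0.0]], [1.0]
A_norm, b_norm = normalize_dense(A, b,
                                 lower=[0.1, 0.3, 0.1, 30.0],
                                 upper=[0.6, 0.7, 0.7, 700.0])
print(A_norm[0], b_norm[0])  # approximately [0.5, 0.4, 0.6, 0.0] and 0.5
```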
Balandat added a commit to Balandat/botorch that referenced this issue Jun 5, 2024
Summary:
This set of changes does the following:
* adds an `n_thinning` argument to `sample_polytope` and `HitAndRunPolytopeSampler`; changes the defaults for `HitAndRunPolytopeSampler` args to `n_burnin=200` and `n_thinning=20`
* Changes `HitAndRunPolytopeSampler` to take the `seed` arg in its constructor, and removes the arg from the `draw()` method (the method on the base class is adjusted accordingly). The resulting behavior is that if a `HitAndRunPolytopeSampler` is instantiated with the same args and seed, then the sequence of `draw()`s will be deterministic. `DelaunayPolytopeSampler` is stateless, and so retains its existing behavior.
* normalizes the (inequality and equality) constraints in `HitAndRunPolytopeSampler` to avoid the same issue as pytorch#1225. If `bounds` are not provided, emits a warning that this cannot be performed (doing so would require vertex enumeration of the constraint polytope, which is NP-hard and too costly).
* introduces `normalize_dense_linear_constraints` to normalize constraints given in dense format to the unit cube
* removes `normalize_linear_constraint`; `normalize_sparse_linear_constraints` is to be used instead
* simplifies some of the testing code

Note: This change is in preparation for fixing facebook/Ax#2373


Test Plan: Ran a stress test to make sure this doesn't cause flaky tests: https://www.internalfb.com/intern/testinfra/testconsole/testrun/3940649908470083/

Differential Revision: D58068753

Pulled By: Balandat
facebook-github-bot pushed a commit that referenced this issue Jun 5, 2024
Summary:
This set of changes does the following:
* adds an `n_thinning` argument to `sample_polytope` and `HitAndRunPolytopeSampler`; changes the defaults for `HitAndRunPolytopeSampler` args to `n_burnin=200` and `n_thinning=20`
* Changes `HitAndRunPolytopeSampler` to take the `seed` arg in its constructor, and removes the arg from the `draw()` method (the method on the base class is adjusted accordingly). The resulting behavior is that if a `HitAndRunPolytopeSampler` is instantiated with the same args and seed, then the sequence of `draw()`s will be deterministic. `DelaunayPolytopeSampler` is stateless, and so retains its existing behavior.
* normalizes the (inequality and equality) constraints in `HitAndRunPolytopeSampler` to avoid the same issue as #1225. If `bounds` are not provided, emits a warning that this cannot be performed (doing so would require vertex enumeration of the constraint polytope, which is NP-hard and too costly).
* introduces `normalize_dense_linear_constraints` to normalize constraints given in dense format to the unit cube
* removes `normalize_linear_constraint`; `normalize_sparse_linear_constraints` is to be used instead
* simplifies some of the testing code

Note: This change is in preparation for fixing facebook/Ax#2373

Pull Request resolved: #2358

Test Plan: Ran a stress test to make sure this doesn't cause flaky tests: https://www.internalfb.com/intern/testinfra/testconsole/testrun/3940649908470083/

Reviewed By: saitcakmak

Differential Revision: D58068753

Pulled By: Balandat

fbshipit-source-id: 9a75c547a3493e393cd7e724edd984318b76e1f4