
Problem with Polytope Sampler #1225

Closed
jduerholt opened this issue May 17, 2022 · 3 comments

@jduerholt
Contributor

Hi,

I encountered a problem with `get_polytope_samples` when the individual variables live on very different scales. Here is a small didactic example:

```python
from torch import tensor
from botorch.utils.sampling import get_polytope_samples
import matplotlib.pyplot as plt

samples = get_polytope_samples(
    n=100,
    bounds=tensor([[0.1, 0.3, 0.1, 30.0], [0.6, 0.7, 0.7, 700.0]]),
    equality_constraints=[(tensor([1, 2, 0]), tensor([1.0, 1.0, 1.0]), 1.0)],
)

fig, ax = plt.subplots()
ax.hist(samples.numpy()[:, -1])
plt.show()
```

The bounds are as follows:

  • 0.1 - 0.6
  • 0.3 - 0.7
  • 0.1 - 0.7
  • 30 - 700

In addition, an equality constraint is applied to the first three variables, with all coefficients equal to 1 and an RHS of 1.

Sampling 100 points leads to the following histogram for the fourth variable (bounds: 30-700). The distribution is not uniform at all; the values stay within a very narrow interval.

[histogram: samples of the fourth variable clustered in a narrow sub-interval of 30-700]

In the next step, I scaled all variables into the 0-1 range, including the equality constraint, using the formula below:

x_i* = (x_i - lb_i) / s_i

where x_i* is the scaled variable, lb_i is the lower bound, and s_i is the difference between the upper and lower bound of variable i. The code below generates 100 samples for this case, including scaling back into the original bounds:

```python
lower = tensor([0.1, 0.3, 0.1, 30.0])
upper = tensor([0.6, 0.7, 0.7, 700.0])

space = upper - lower

scaled_samples = get_polytope_samples(
    n=100,
    bounds=tensor([[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]),
    equality_constraints=[(tensor([1, 2, 0]), tensor([0.4000, 0.6000, 0.5000]), 0.5)],
)

samples = lower + scaled_samples * space

fig, ax = plt.subplots()
ax.hist(samples.numpy()[:, -1])
plt.show()
```

Now, the fourth variable is sampled in a much more uniform way:

[histogram: samples of the fourth variable spread roughly uniformly over 30-700]

I think the reason for this behavior is that the hit-and-run sampler only needs a small step to reach the boundary of the polytope (defined by the equality constraint), and that this step is too small to guarantee proper sampling of the fourth variable.
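The anisotropy is easy to see numerically. A minimal sketch (stdlib only, using the box bounds from this issue and a direction with equal components; the actual sampler draws random directions, but the effect is the same): from the box center, the largest feasible step is capped by the narrow coordinates, so the wide fourth variable barely moves.

```python
import math

lower = [0.1, 0.3, 0.1, 30.0]
upper = [0.6, 0.7, 0.7, 700.0]
center = [(lo + up) / 2 for lo, up in zip(lower, upper)]

# Unit direction with equal weight on every coordinate.
d = [1 / math.sqrt(4)] * 4

# Moving from the center along +d, only the upper bounds bind;
# per-coordinate maximal step lengths before hitting a bound:
per_coord = [(up - c) / di for c, up, di in zip(center, upper, d)]
t_max = min(per_coord)

print(per_coord)  # the fourth coordinate alone would allow a step of 670
print(t_max)      # but the second coordinate caps the step at ~0.4
```

So the fourth variable moves by at most about 0.2 per step, a tiny fraction of its 670-wide range.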

I would recommend building this scaling directly into the `get_polytope_samples` method (I could provide a PR), as it is also used by `gen_batch_initial_conditions` when linear constraints are present. If people are not running the optimization in the scaled variable space but are using a `Normalize` input transform instead, this behavior can lead to problems in the optimization process.
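For illustration, the rescaling I am proposing can be sketched as a small standalone helper (hypothetical, stdlib only, not an actual BoTorch function); applied to the constraint from this issue it reproduces the scaled coefficients used above.

```python
def scale_equality_constraint(indices, coeffs, rhs, lower, upper):
    """Rewrite sum_i c_i * x_i = rhs in terms of unit-cube variables.

    With x_i = lower_i + s_i * y_i and s_i = upper_i - lower_i, the
    constraint becomes sum_i (c_i * s_i) * y_i = rhs - sum_i c_i * lower_i.
    """
    s = [u - l for l, u in zip(lower, upper)]
    new_coeffs = [c * s[i] for i, c in zip(indices, coeffs)]
    new_rhs = rhs - sum(c * lower[i] for i, c in zip(indices, coeffs))
    return indices, new_coeffs, new_rhs

idx, c, r = scale_equality_constraint(
    [1, 2, 0], [1.0, 1.0, 1.0], 1.0,
    lower=[0.1, 0.3, 0.1, 30.0], upper=[0.6, 0.7, 0.7, 700.0],
)
print(c, r)  # approximately [0.4, 0.6, 0.5] and 0.5
```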

What do you think?

Best,

Johannes

@Balandat
Contributor

Thanks for flagging, this is a great observation. It is indeed a problem that "small step" means different things in different directions; we really want this to be isotropic. I'm not quite sure I fully understand where things go wrong in the code; I'll have to take a closer look.

But I think your suggestion sounds great, and we'd very much welcome a PR! I think we can go even a bit more low-level than building it just into the `get_polytope_samples` method. Ideally we could make this part of the `PolytopeSampler` class: do the normalization in `__init__`, save the scaling factors there, and then change `draw` to call an abstract `_draw` method and automatically rescale the outputs. This might be a bit more work than your suggestion, though, so I guess we can start with the simpler approach.

One of the things we may want to think about a bit more is how we deal with cases in which we don't have bounds, and the polytope is defined purely via inequality constraints. I guess we could just normalize the coefficients in each column of A to have a range comparable to those in the other columns and then record those scaling factors?
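A rough sketch of that idea (stdlib only, hypothetical, not the eventual implementation): scale each column of A by its largest absolute coefficient, record the factors, and divide them back out of the samples.

```python
def normalize_columns(A):
    """Substitute x_j = y_j / alpha_j with alpha_j = max_i |A_ij|,
    so every column of the transformed matrix has max magnitude 1."""
    n_cols = len(A[0])
    alpha = [max(abs(row[j]) for row in A) or 1.0 for j in range(n_cols)]
    A_scaled = [[row[j] / alpha[j] for j in range(n_cols)] for row in A]
    return A_scaled, alpha

def unscale_sample(y, alpha):
    # Undo the substitution to recover a point of the original polytope.
    return [yj / aj for yj, aj in zip(y, alpha)]

A = [[1.0, 1.0, 1.0, 0.0],     # sum of the first three variables
     [0.0, 0.0, 0.0, 0.001]]   # a much smaller coefficient on the wide variable
A_scaled, alpha = normalize_columns(A)
print(alpha)        # [1.0, 1.0, 1.0, 0.001]
print(A_scaled[1])  # [0.0, 0.0, 0.0, 1.0]
```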

@jduerholt
Contributor Author

I will first file a PR with the simpler solution in `get_polytope_samples` for the case where we actually have bounds. For the case without bounds, I am not yet 100% sure.

I hope to have a first PR by the end of the week ;)

@esantorella
Member

I approved #1341, but I'm going to leave this open since there are a few minor improvements we can make:

  • #1341 introduces a minor break to backwards compatibility by raising an error when non-integer indices are passed to inequality_constraints or equality_constraints. In a way this is good, because non-integer indices don't make sense and aren't usually allowed, but it would be better to warn rather than error, at least for the next release. We could eventually deprecate support for the dtypes that don't work here.
  • Since this problem likely arises deeper in the stack than in get_polytope_samples, the fix should live where the issue starts (likely HitAndRunPolytopeSampler.draw or PolytopeSampler).
  • Ideally, we'd have tests for get_polytope_samples and PolytopeSampler subclasses that fail given the old behavior and pass given the new behavior. We should have been more specific about what polytope samples ought to look like in the first place to prevent this bug from arising.
  • It would be nice to have a clear understanding of where the flawed behavior came from in the first place!
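A behavioral test along these lines might assert that every coordinate of the draws covers most of its feasible range. Sketch below (stdlib only; `spans_bounds` is a hypothetical helper, exercised here with plain uniform draws rather than an actual polytope sampler, since the point is the shape of the assertion):

```python
import random

def spans_bounds(samples, lower, upper, min_coverage=0.9):
    """True if, for each coordinate, the sampled range covers at least
    `min_coverage` of the interval [lower_j, upper_j]."""
    for j, (lo, up) in enumerate(zip(lower, upper)):
        col = [s[j] for s in samples]
        coverage = (max(col) - min(col)) / (up - lo)
        if coverage < min_coverage:
            return False
    return True

random.seed(0)
lower, upper = [0.0, 30.0], [1.0, 700.0]
draws = [[random.uniform(lo, up) for lo, up in zip(lower, upper)]
         for _ in range(500)]
print(spans_bounds(draws, lower, upper))  # uniform draws pass the check
```

The buggy behavior reported in this issue (all draws of the wide variable stuck in a narrow interval) would fail such a check, while the scaled sampler would pass it.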

facebook-github-bot pushed a commit that referenced this issue Aug 10, 2022
Summary:
## Motivation

As discussed in issue #1225, the polytope sampler runs into problems if the variables live in different dimensions. This PR fixes the issue using the approach mentioned in the issue above.

### Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)?

Yes

Pull Request resolved: #1341

Test Plan: Unit tests

Reviewed By: esantorella

Differential Revision: D38485035

Pulled By: Balandat

fbshipit-source-id: 8e6ed1c1b81f636d108becb1ab5cf2ca13294420
@esantorella esantorella self-assigned this Aug 11, 2022
facebook-github-bot pushed a commit that referenced this issue Aug 12, 2022
Summary:
## Motivation

This is fixing a very minor issue and I've probably already spent too much time on it. Unless anyone feels strongly about this, I'd prefer to have it quickly either accepted or rejected rather than spend a while iterating on revisions.

In the past it was possible to use indexers with dtypes that torch does not accept as indexers via `equality_constraints` and `inequality_constraints`. This was never really intended behavior and stopped being supported in #1341 (discussed in #1225). This PR makes errors more informative if someone does try to use the wrong dtypes, since the existing error message did not make clear where the error came from.

I also refactored a test in test_initializers.py.

### Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)?

Yes

Pull Request resolved: #1345

Test Plan:
Unit tests for errors raised

## Related PRs

#1341

Reviewed By: Balandat

Differential Revision: D38627695

Pulled By: esantorella

fbshipit-source-id: e9e4e917b79f81a36d74b79ff3b0f710667283cb
@esantorella closed this as not planned on May 5, 2023
Balandat added a commit to Balandat/botorch that referenced this issue Jun 2, 2024
* adds an `n_thinning` argument to `sample_polytope` and `HitAndRunPolytopeSampler`; changes the defaults for `HitAndRunPolytopeSampler` args to `n_burnin=200` and `n_thinning=20`
* normalizes the (inequality and equality) constraints in `HitAndRunPolytopeSampler` to avoid the same issue as pytorch#1225
* introduces `normalize_dense_linear_constraints` to normalize constraints given in dense format to the unit cube
* deprecates `normalize_linear_constraint`; `normalize_sparse_linear_constraints` is to be used instead

Note: This change is in preparation for fixing facebook/Ax#2373
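For reference, the dense-format transformation such a helper performs can be sketched as follows (stdlib only, an assumed re-derivation of the math rather than the actual BoTorch code): with x = lb + s * y mapping the unit cube onto [lb, ub], the system A x <= b becomes (A diag(s)) y <= b - A lb.

```python
def normalize_dense(A, b, lower, upper):
    """Transform A x <= b (or A x = b) into unit-cube coordinates y,
    where x = lower + s * y and s = upper - lower."""
    s = [u - l for l, u in zip(lower, upper)]
    A_norm = [[aij * sj for aij, sj in zip(row, s)] for row in A]
    b_norm = [bi - sum(aij * lj for aij, lj in zip(row, lower))
              for row, bi in zip(A, b)]
    return A_norm, b_norm

# The equality constraint from this issue, written as a dense row:
A, b = [[1.0, 1.0, 1.0, 0.0]], [1.0]
A_norm, b_norm = normalize_dense(A, b,
                                 lower=[0.1, 0.3, 0.1, 30.0],
                                 upper=[0.6, 0.7, 0.7, 700.0])
print(A_norm[0], b_norm[0])  # approximately [0.5, 0.4, 0.6, 0.0] and 0.5
```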
Balandat added a commit to Balandat/botorch that referenced this issue Jun 5, 2024
Summary:
This set of changes does the following:
* adds an `n_thinning` argument to `sample_polytope` and `HitAndRunPolytopeSampler`; changes the defaults for `HitAndRunPolytopeSampler` args to `n_burnin=200` and `n_thinning=20`
* Changes `HitAndRunPolytopeSampler` to take the `seed` arg in its constructor, and removes the arg from the `draw()` method (the method on the base class is adjusted accordingly). The resulting behavior is that if a `HitAndRunPolytopeSampler` is instantiated with the same args and seed, then the sequence of `draw()`s will be deterministic. `DelaunayPolytopeSampler` is stateless, and so retains its existing behavior.
* normalizes the (inequality and equality) constraints in `HitAndRunPolytopeSampler` to avoid the same issue as pytorch#1225. If `bounds` are not provided, emits a warning that this cannot be performed (doing so would require vertex enumeration of the constraint polytope, which is NP-hard and too costly).
* introduces `normalize_dense_linear_constraints` to normalize constraints given in dense format to the unit cube
* removes `normalize_linear_constraint`; `normalize_sparse_linear_constraints` is to be used instead
* simplifies some of the testing code

Note: This change is in preparation for fixing facebook/Ax#2373


Test Plan: Ran a stress test to make sure this doesn't cause flaky tests: https://www.internalfb.com/intern/testinfra/testconsole/testrun/3940649908470083/

Differential Revision: D58068753

Pulled By: Balandat
facebook-github-bot pushed a commit that referenced this issue Jun 5, 2024
Summary:
This set of changes does the following:
* adds an `n_thinning` argument to `sample_polytope` and `HitAndRunPolytopeSampler`; changes the defaults for `HitAndRunPolytopeSampler` args to `n_burnin=200` and `n_thinning=20`
* Changes `HitAndRunPolytopeSampler` to take the `seed` arg in its constructor, and removes the arg from the `draw()` method (the method on the base class is adjusted accordingly). The resulting behavior is that if a `HitAndRunPolytopeSampler` is instantiated with the same args and seed, then the sequence of `draw()`s will be deterministic. `DelaunayPolytopeSampler` is stateless, and so retains its existing behavior.
* normalizes the (inequality and equality) constraints in `HitAndRunPolytopeSampler` to avoid the same issue as #1225. If `bounds` are not provided, emits a warning that this cannot be performed (doing so would require vertex enumeration of the constraint polytope, which is NP-hard and too costly).
* introduces `normalize_dense_linear_constraints` to normalize constraints given in dense format to the unit cube
* removes `normalize_linear_constraint`; `normalize_sparse_linear_constraints` is to be used instead
* simplifies some of the testing code

Note: This change is in preparation for fixing facebook/Ax#2373

Pull Request resolved: #2358

Test Plan: Ran a stress test to make sure this doesn't cause flaky tests: https://www.internalfb.com/intern/testinfra/testconsole/testrun/3940649908470083/

Reviewed By: saitcakmak

Differential Revision: D58068753

Pulled By: Balandat

fbshipit-source-id: 9a75c547a3493e393cd7e724edd984318b76e1f4