
support for truncated normal distribution #188

Merged
merged 11 commits on Oct 18, 2021

Conversation

@pbalapra (Contributor)

Adds support for a truncated normal distribution for NormalIntegerHyperparameter and NormalFloatHyperparameter with lower and upper bounds.
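
For illustration, a hedged usage sketch of the feature, based on the constructor calls that appear in the tests later in this thread (argument order and names are assumed and may differ in the released API):

    from ConfigSpace.hyperparameters import (
        NormalFloatHyperparameter,
        NormalIntegerHyperparameter,
    )

    # Normal(mu=0, sigma=1), truncated to the interval [-3, 3]
    nf = NormalFloatHyperparameter("nf", 0, 1, lower=-3, upper=3)
    ni = NormalIntegerHyperparameter("ni", 0, 1, lower=-3, upper=3)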

@Deathn0t (Contributor) commented Sep 1, 2021

Hello @mfeurer, what do you think about this PR? We found it extremely useful to have hyperparameters with a normal distribution and boundary constraints.

@eddiebergman (Contributor) left a comment

Hi @pbalapra, @Deathn0t,

These changes look good and don't intrude on any current functionality. However, I think tests are needed to ensure this works correctly in all cases. Some interesting cases that should be tested:

  • How do both of these act when upper == lower? Are the bounds inclusive or exclusive?
  • I assume from your usage that the quantization and the transformation to uniform work, but we would need tests to confirm that this functionality will always work as intended.

I think the main docstring for both classes could do with a simple line describing that bounds can be set. You can see how the docstrings are rendered here.

@Deathn0t (Contributor)

Hello @eddiebergman,

I think I took all of your comments into consideration. I added the tests in the test_hyperparameters.py file. All tests pass for me with pytest. Let me know if everything is OK.

@codecov (bot) commented Sep 14, 2021

Codecov Report

Merging #188 (f4494e0) into master (5e8acfd) will decrease coverage by 1.88%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #188      +/-   ##
==========================================
- Coverage   68.22%   66.33%   -1.89%     
==========================================
  Files          18       17       -1     
  Lines        1775     1613     -162     
==========================================
- Hits         1211     1070     -141     
+ Misses        564      543      -21     
Impacted Files (Coverage Δ):
ConfigSpace/read_and_write/irace.py


@eddiebergman (Contributor)

Hi @Deathn0t,

All looks good to me, thanks for adding the tests! The only remaining small change is that the pre-commit check failed. This is just a code-style checker, for which we use flake8.

You can either fix it manually and push (I'll rerun the tests and you can see if anything else needs fixing), or run the check locally:

# Using pre-commit
pip install pre-commit
pre-commit run --all-files

You can also just use a flake8 linter if it's readily available in your editor; the results should be the same.

@mfeurer if you want to have a final look, please do, otherwise I'm happy with the changes.

@mfeurer (Contributor) left a comment

Hey, thanks a lot for the PR. I also had a look and think there are a few things missing:

  • a test for sampling from the truncated normal distribution
  • the get_neighbors functions need to be adapted to take the bounds into account as well
  • please also see my inline comments

    self.assertEqual(
        "param, Type: NormalFloat, Mu: 5.0 Sigma: 10.0, Range: [0.1, 10.0], Default: 5.0, on log-scale, Q: 0.1", str(f6))

    self.assertNotEqual(f1, f2)

Contributor:

This is now duplicated.

Contributor:

I'm sorry, I don't see the duplication? I followed the previous test cases with f4 and f4_, which were already there.

Deathn0t and others added 2 commits September 15, 2021 09:13
Co-authored-by: Matthias Feurer <lists@matthiasfeurer.de>
@Deathn0t (Contributor)

I added this test for the sampling:

    def test_sample_NormalFloatHyperparameter_with_bounds(self):
        hp = NormalFloatHyperparameter("nfhp", 0, 1, lower=-3, upper=3)

        def actual_test():
            rs = np.random.RandomState(1)
            counts_per_bin = [0 for i in range(11)]
            for i in range(100000):
                value = hp.sample(rs)
                index = min(max(int((np.round(value + 0.5)) + 5), 0), 9)
                counts_per_bin[index] += 1

            self.assertEqual([0, 0, 0, 2184, 13752, 34078, 34139, 13669,
                              2178, 0, 0], counts_per_bin)

            self.assertIsInstance(value, float)
            return counts_per_bin

        self.assertEqual(actual_test(), actual_test())

@Deathn0t (Contributor) commented Sep 15, 2021

Hello @mfeurer and @eddiebergman, I think I have addressed all the comments.

While editing the get_neighbors function I found it quite inconsistent, especially when comparing it to other types of hyperparameters. In the case of normal hyperparameters, isn't it strange to have a "neighbor" based on the variance of the distribution? For example, the neighbor can be a value outside the bounds returned by the to_uniform function.

I was not able to find any documentation of the expected behaviour of this function.

@Deathn0t (Contributor)

Hello, any update on this @eddiebergman @mfeurer?

@eddiebergman (Contributor)

> Hello, any update on this @eddiebergman @mfeurer?

Hi @Deathn0t,

Sorry for the delay, and thanks for pinging us again on it. I reviewed the PR and I am happy with the changes. @mfeurer will be available later this week or next, and I don't want to merge without a final look-over by him. Apologies again for the delay.

In the meantime I dug a little deeper, and there are some discussion points if you'd like to have a say in how we handle the inconsistencies you brought to our attention.

> While editing the get_neighbors function I found it quite inconsistent, especially when comparing it to other types of hyperparameters. In the case of normal hyperparameters, isn't it strange to have a "neighbor" based on the variance of the distribution? For example, the neighbor can be a value outside the bounds returned by the to_uniform function.
>
> I was not able to find any documentation of the expected behaviour of this function.

  • get_neighbors - Seems to return n parameter values that are neighbors of the current value, by some definition of neighbor or closeness.
    • This function for NormalFloatHyperparameter, as it stands, has an odd definition of neighbors: it just creates a normal distribution centered on whatever value is passed and then draws n neighbors from that distribution:
      neighbours = [rs.normal(value, self.sigma) for _ in range(n)]
    • I'm not sure of my own definition, but to me it would make more sense to sample from some reduced sigma, i.e.
      neighbours = [rs.normal(value, self.sigma / 3) for _ in range(n)], where 3 is arbitrary and a more reasonable value should be used.
    • What would your definition of a neighbor be?
    • The bounding method you use creates a non-normal distribution. For example, consider the case where lower and upper are close to value (the mean) but self.sigma is quite wide. This bounding would result in a large number of neighbors with the same value as lower or upper. I don't really know how to tackle this, to be honest. Would users still expect neighbors to be normally distributed within these bounds, or should they be aware that with tight bounds most of the neighbors will land on the boundary?
                new_value = rs.normal(value, self.sigma)
                if self.lower is not None and self.upper is not None:
                    new_value = min(max(new_value, self.lower), self.upper)

  • to_uniform - Well, converts a normally distributed parameter to a uniform one.
    • You're right, NormalFloatHyperparameter.get_neighbors could return a value outside of the UniformFloatHyperparameter that is created by to_uniform. While I don't think this is an issue for our own use cases (maybe?), I can see how it might cause unexpected behaviors.
    • I'm not sure there's a fully sound solution to this, but please correct me and make suggestions if I am wrong here.
      • A normal distribution is technically unbounded in the values it can return (unless bounds are specified via parameters, as you did).
      • A uniform distribution is bounded (or at least it is in the way we have it implemented).
      • There is no fully principled way to derive a boundary from the normal distribution, so how should the bounds of the uniform distribution be set?
      • Without a formally correct answer, in practice we have to pick some value. I believe mu +- 3*sigma is reasonable (a small sketch of this fallback follows below), but if you have other suggestions, do let us know.
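
A minimal sketch of that fallback (illustrative only, not ConfigSpace's actual implementation; the function name is made up): reuse explicit bounds when they were given, otherwise fall back to mu +/- 3*sigma.

    def uniform_bounds_from_normal(mu, sigma, lower=None, upper=None):
        # Illustrative helper, not part of ConfigSpace.
        if lower is not None and upper is not None:
            # Explicit truncation bounds were given: reuse them directly.
            return lower, upper
        # Unbounded normal: fall back to the (admittedly arbitrary) mu +/- 3*sigma.
        return mu - 3 * sigma, mu + 3 * sigma

    # Example: an unbounded Normal(mu=5, sigma=10) maps to the range (-25, 35).
    print(uniform_bounds_from_normal(5.0, 10.0))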

@Deathn0t (Contributor)

Hello @eddiebergman @mfeurer, sorry to ping you again about this PR, but do you know when it could be integrated? I am wondering whether I should fork and build a separate wheel or wait for the integration.

@eddiebergman I did not have time to think very deeply about the neighbour generation, but for the truncated normal we could also sample directly from the truncated distribution instead of applying the min(max(...)) clipping, to perhaps avoid duplicate values at the bounds.
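
A hedged sketch of that idea (the helper name and signature are illustrative, not ConfigSpace API): drawing neighbours directly from scipy.stats.truncnorm keeps every sample strictly within the bounds instead of clamping many of them onto the boundary.

    import numpy as np
    from scipy.stats import truncnorm

    def sample_truncated_neighbours(value, sigma, lower, upper, n, rs):
        # truncnorm takes its bounds in units of standard deviations from loc
        a = (lower - value) / sigma
        b = (upper - value) / sigma
        return truncnorm.rvs(a, b, loc=value, scale=sigma, size=n, random_state=rs)

    rs = np.random.RandomState(1)
    # Neighbours of value=2.5 with sigma=1 on [-3, 3]: none are clamped to exactly 3.
    print(sample_truncated_neighbours(2.5, 1.0, -3, 3, n=5, rs=rs))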

@eddiebergman (Contributor)

Hi @Deathn0t,

Apologies once again, @mfeurer has been quite busy and will not be available for some time.

I'm going to merge this, seeing as the tests pass. I will also raise an issue with the remaining points, as I think there may be some unexpected outputs from get_neighbors and to_uniform.
