Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add configspace conversion #832

Merged
merged 52 commits into from
Jul 29, 2022
Merged

Add configspace conversion #832

merged 52 commits into from
Jul 29, 2022

Conversation

Delaunay
Copy link
Collaborator

Hi there! Thank you for contributing. Feel free to use this pull request template; It helps us reviewing your work at its true value.

Please remove the instructions in italics before posting the pull request :).

Description

Describe what is the purpose of this pull request and why it should be integrated to the repository.
When your changes modify the current behavior, explain why your solution is better.

If it solves a GitHub issue, be sure to link it.

Changes

Give an overview of the suggested solution.

Checklist

This is simply a reminder of what we are going to look for before merging your code.

Add an x in the boxes that apply.
If you're unsure about any of them, don't hesitate to ask. We're here to help!
You can also fill these out after creating the PR if it's a work in progress (be sure to publish the PR as a draft then)

Tests

  • I added corresponding tests for bug fixes and new features. If possible, the tests fail without the changes
  • All new and existing tests are passing ($ tox -e py38; replace 38 by your Python version if necessary)

Documentation

  • I have updated the relevant documentation related to my changes

Quality

  • I have read the CONTRIBUTING doc
  • My commits messages follow this format
  • My code follows the style guidelines ($ tox -e lint)

Further comments

Please include any additional information or comment that you feel will be helpful to the review of this pull request.

@Delaunay Delaunay marked this pull request as ready for review March 17, 2022 16:51
Copy link
Collaborator

@lebrice lebrice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff!
Can you try using singledispatch callables for the to/from conversion functions?
It would be a lot more intuitive than using the Visitor pattern, since your visitor doesn't have any state anyway:

from ConfigSpace import ConfigurationSpace
from orion.algo.space import Space
from functools import singledispatch
# ConfigSpace -> Orion
@singledispatch
def to_orion(config_space: ConfigSpace) -> Space:
    """ converts the given config space to an Orion space. """
    raise NotImplementedError(f"Don't have a handler for converting spaces of type {type(config_space)}.")

@to_orion.register(UniformFloatHyperparameter)
@to_orion.register(NormalFloatHyperparameter)
def _convert_uniform_float(config_space: UniformFloatHyperparameter) -> Real:
     prior_type = "normal" if isinstance(config_space, NormalFloatHyperParameter) else "uniform"
     ... # (shared logic for both classes)
      
@to_orion.register
def _convert_foobar(config_space: FooBarHyperParameter) -> Dimension:
    ...  
 ... # other handlers
 
 # Same thing for Orion -> ConfigSpace
@singledispatch
def to_configspace(orion_space: Space) -> ConfigurationSpace:
    """ Converts the given Orion thingy to a ConfigSpace thingy. """
    raise NotImplementedError(f"Don't know how to convert {orion_space} to an equivalent from ConfigSpace")

@to_configspace.register
def _real_to_uniform(dim: Real) -> UniformFloatHyperparameter:
     ... # convert from Orion to ConfigSpace

Another major benefit is that we could then extend the conversions externally (even from within ConfigSpace if you want!) by just registering new handlers for new dimensions when they are added.

@Delaunay
Copy link
Collaborator Author

That specific visitor does not have a states, future one could have.
I dont think single dispatch improve the code.

@lebrice
Copy link
Collaborator

lebrice commented Mar 18, 2022

That specific visitor does not have a states, future one could have.

What kind of state could it have?

I dont think single dispatch improve the code.

I'm curious: having considered both, you say that you prefer the Visitor to singledispatch, correct? So what advantages do you see in the Visitor pattern vs dispatching?

From my experience, this kind of if instance(thing, A): do_something() elif isinstance(thing, B): do_something_else() is exactly what singledispatch is great at simplifying. If you take out the Visitor thing, all you're left with is two functions, to_orion and to_configspace (your test is also a pretty clear indication of this!), and inside these functions will be a whole bunch of if isinstance statements. What singledispatch does is to then break up these two functions into smaller, more specific functions. The two generic functions can then be extended later by adding new handlers for new object types as needed.

@Delaunay
Copy link
Collaborator Author

Delaunay commented Mar 18, 2022

The visitor does not use isinstance though, only the to_orion conversion use it.

to_orion uses it because Uniform is split between Float & Integer which are converted the same way except for the class name.
If I create a function for each type I will duplicate all that code twice.
So the code with singledispatch will end up being longer and more repetitive.

  • Visitor could have a state if the conversion process had to go through a SpaceBuilder instead.

  • Having a public interface allow us to define clearly how users are expected to extend Orion Space (We are planning on adding conditionals, so that is more nodes to visit). Users can look at the visitor and see clearly all the methods he is expected to override to implement a valid feature.

NB: Visitor are more than just for the conversion they can be used to implement any new features outside of Dimensions
Users could implement their own sample/serialization/... using a Visitor.

@lebrice
Copy link
Collaborator

lebrice commented Mar 19, 2022

If I create a function for each type I will duplicate all that code twice.

I forgot to mention in my initial example that you can use the same handler for more than one type, by cascading the @to_orion.register(some_type). Edited the example above to show this a bit.
Idk if this changes your mind or not, but in any case, just thought you might want to know, since it would indeed really suck if you could only have one type per handler. This way, there is no need to duplicate a bunch of code between handlers.

I like how you made the conversion functions singledispatch callables, that might come in handy in the future!

Delaunay and others added 2 commits March 19, 2022 15:12
Co-authored-by: Fabrice Normandin <fabrice.normandin@gmail.com>
@Delaunay Delaunay force-pushed the issue_586 branch 5 times, most recently from c3eecbb to adc6bdc Compare March 21, 2022 18:11
weights=dim._probs,
)

def fidelity(self, dim: Fidelity) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incoherent with the type signature of the SpaceConverter[HyperParameter] class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump

T = TypeVar("T")


class SpaceConverter(Generic[T]):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see the point of having a SpaceConverter. Is it to force an external library to provide conversion functions functions for the given types of spaces?

Also, this generic signature isn't really respected by the ToConfigSpace class, right? Since it doesn't support Fidelity-type spaces, it returns None in that case, and so it becomes a SpaceConverter[Optional[HyperParameter]]!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump

Copy link
Collaborator

@lebrice lebrice Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users can look at the visitor and see clearly all the methods he is expected to override to implement a valid feature.

I guess this answers my first question. However, it's clear from the fidelity example that you actually don't want to enforce that every type of dimension has to be supported by subclasses. Therefore, it's not actually an interface, if all its members are optional. It's only use, as far as I can see, is to be a list of names of methods that "can" be implemented by subclasses. It also doesn't have any state.

NB: Visitor are more than just for the conversion they can be used to implement any new features outside of Dimensions
Users could implement their own sample/serialization/... using a Visitor.

At this point, why not just let the user code create a function that does what they want to do, e.g. serialize / print / do something for the types of Dimension of Orion they want to support?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair the fidelity is the only exception. I don't see how the conversions could be usable if they do not support all other type of dimensions. I think the abstract class is the clearest way to express the interface expected for contributors. The only thing needed would be to clarify the exception for the fidelity.

src/orion/algo/space/configspace.py Outdated Show resolved Hide resolved
src/orion/algo/space/configspace.py Outdated Show resolved Hide resolved
src/orion/algo/space/configspace.py Outdated Show resolved Hide resolved
Delaunay and others added 5 commits March 28, 2022 15:40
Co-authored-by: Fabrice Normandin <fabrice.normandin@gmail.com>
Co-authored-by: Fabrice Normandin <fabrice.normandin@gmail.com>
Co-authored-by: Fabrice Normandin <fabrice.normandin@gmail.com>
Comment on lines +102 to +104
def convert_dimension(self, dimension: Dimension) -> T:
"""Call the dimension conversion handler"""
return getattr(self, _to_snake_case(type(dimension).__name__))(dimension)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially a singledispatch callable in disguise: You have handlers for each type below, and you dispatch based on the "type" of the argument, indirectly.
However, having an implicit dependency on the name of the dimension classes is very brittle, and hard to keep track of. Could you make this more explicit somehow?
I also don't think this should be inside a class, since there isn't (and most likely won't be) any state.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what happens if a new type of dimension is added, and there is no method with that name? This will raise an AttributeError, but you'd ideally like a NotImplementedError with an informative message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make it an abstract class then non-implemented methods would already raise an informative message during code development. It's true there is no state so making it a class isn't super useful for functionality's sake but I think it is nevertheless useful to make the interface more clear. The other solution would be to have a module for these functions only (without the methods for the conversion the other way around).

In any case, we should make it such that if the dev forgets to write some methods it fails with a clear message at execution or some tests automatically detects this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this should use the property type to work with transformed dimensions. Looks like we should also have tests for converting transformed spaces.

Suggested change
def convert_dimension(self, dimension: Dimension) -> T:
"""Call the dimension conversion handler"""
return getattr(self, _to_snake_case(type(dimension).__name__))(dimension)
def convert_dimension(self, dimension: Dimension) -> T:
"""Call the dimension conversion handler"""
return getattr(self, _to_snake_case(dimension.type))(dimension)

Copy link
Collaborator

@lebrice lebrice Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as:

  • we don't dispatch based on the names of classes, and
  • the solution is consistent with itself and the base class,
    then I don't have a problem with using an ABC and marking all the methods with @abstractmethod.

Copy link
Collaborator

@lebrice lebrice Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true there is no state so making it a class isn't super useful for functionality's sake but I think it is nevertheless useful to make the interface more clear. The other solution would be to have a module for these functions only (without the methods for the conversion the other way around).

Since the abstract base class would only have static methods, this is actually a good use-case for Protocols:

from typing_extensions import Protocol

T = TypeVar("T")

class SpaceConverter(Protocol[T]):
     integer: Callable[[Integer], T]
     real: Callable[[Real], T]
     fidelity: Callable[[Fidelity], T | None]

This way, if you have a module like so:

# nevergrad/conversion.py (or something like that)

def integer(dim: Integer) -> NeverGradStuff:
     ...
    
def real(dim: Real) -> NeverGradReal:
    ... 
    
def fidelity(dim: Fidelity -> None:
    return None # Nevergrad doesn't support Fidelity dimensions.

The procol can be used just like the abstract base class, and has the benefit of allowing type checkers to also check against a module:

from nevergrad import conversion

space_converter: SpaceConverter[NeverGradStuff] = conversion    # Passes, since all the required methods are defined.

"""Ignores fidelity dimension as configspace does not have an equivalent"""
return None

def space(self, space: Space) -> ConfigurationSpace:
Copy link
Collaborator

@lebrice lebrice Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest moving this logic to a Space handler of to_configspace:

@to_configspace.register(Space)
<this code>

if not set inside ``Space``

"""
conversion = ToConfigSpace()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the "root" of the singledispatch callable, and just return a NotImplementedError for the given argument.

cspace.add_hyperparameters(dims)
return cspace


Copy link
Collaborator

@lebrice lebrice Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly suggest making this a singedispatch callable like you did with to_oriondim, and cutting-pasting the methods of the ToConfigSpace class into handler functions for each supported dimension type below.

Suggested change
@singledispatch

space.register(Real("n3", "norm", 0.9, 0.1))
space.register(Integer("n4", "norm", 1, 2))

newspace = to_configspace(space)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This to me is a very clear sign that you only need a function.

return None


class ToConfigSpace(SpaceConverter[Hyperparameter]):
Copy link
Collaborator

@lebrice lebrice Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be removed, see below.

cspace = ConfigurationSpace()
dims = []

for _, dim in space.items():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for _, dim in space.items():
for dim in space.values():

@@ -0,0 +1,69 @@
import pytest
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it could probably be useful to also test the transformed dimensions TransformedDimension, ReshapedDimension, etc. that are passed to the constructors of the algorithms.

If you choose to support these kind of dimensions, you can use this kind of pattern to get the delegate function based on the type of the wrapped dimension, instead of the type of the wrapper:

@to_configspace.register(TransformedDimension)
def _transformed_to_configspace(dim: TransformedDimension) -> HyperParameter:
    unwrapped_dim = dim.original_dimension
    # Get the handler for the unwrapped dim:
    handler_fn = to_configspace.dispatch(type(unwrapped_dim))    
    converted_unwrapped_dim = handler_fn(unwrapped_dim)
    # do something with it? Like re-add the transformation somehow?
    return converted_unwrapped_dim  # just drop the transformation for now.

Copy link
Member

@bouthilx bouthilx Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like most of the dispatch would be done this way since algorithms receive a transformed space. The use of strings to map the conversion methods may be more convenient than dispatch because of this.

Copy link
Collaborator

@lebrice lebrice Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of strings to map the conversion methods may be more convenient than dispatch because of this.

That's incorrect: https://github.com/Epistimio/orion/pull/832/files#diff-eec91bf300bfcaf8b606f6046f67669737948168c0e7a6b00133c91965f5ddd1R104

The name of the class is used for dispatching. The issue is the same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed: #832 (comment)

@lebrice
Copy link
Collaborator

lebrice commented Apr 13, 2022

The visitor does not use isinstance though, only the to_orion conversion use it.

It actually does, indirectly, since it dispatches based on the name of the type. This is equivalent to doing isinstance.

@donglinjy
Copy link
Collaborator

donglinjy commented May 11, 2022

Are we still working on this? I think HPOBench integration as Orion BenchmarkTask can take some benefit from this work.

Copy link
Collaborator

@lebrice lebrice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think some issues will pop up with transformed spaces, it would be a good idea to add some tests for those.

Feel free to disregard my other comments about the design. If something comes up, I'll make a PR.

@bouthilx
Copy link
Member

This version of the conversion would be sufficient for HPOBench that @donglinjy is working on. I will clean up this PR and merge, and then tackle the conversion of transformed space in another PR.

@bouthilx
Copy link
Member

There is a great opportunity here to improve the transformation process and uniformize it with the conversion of spaces. This will take a bit more work, so I will merge this PR in it's current form to allow @donglinjy progress meanwhile on the integration of HPOBench. The interface of the conversion class will likely change but that will be simple to adjust to.

@bouthilx bouthilx merged commit d017d1b into Epistimio:develop Jul 29, 2022
@notoraptor notoraptor mentioned this pull request Aug 2, 2022
@bouthilx bouthilx added the enhancement Improves a feature or non-functional aspects (e.g., optimization, prettify, technical debt) label Aug 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improves a feature or non-functional aspects (e.g., optimization, prettify, technical debt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants