Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esp::core::Configuration changed to use unordered_map of variant-like instead of Magnum::ConfigurationGroups #1433

Merged
merged 56 commits into from
Aug 25, 2021

Conversation

jturner65
Copy link
Contributor

@jturner65 jturner65 commented Aug 11, 2021

Motivation and Context

Currently Habitat-sim uses Magnum::ConfigurationGroups as the backing datastructure for all metadata loaded via JSON configuration files. A ConfigurationGroup saves all data as strings, and the original type of that data is lost once it is loaded into the system. This is particularly problematic with the user-defined attributes that habitat now supports. Upon read, the type of the user-defined value is inferred by the format of the data in the source JSON. With the upcoming functionality to save JSON configs to disk, including user_defined values, without having the type known will cause all data to be saved to disk as strings.

This PR replaces the ConfigurationGroup as the container for esp::core::Configuration and instead uses a ( collection of typed std::unordered_maps. ) single unordered map of ConfigValues, which use a tagged union/variant structure to retain type information and safety.

One question to consider : do we wish to remove the ConfigurationGroup as the backing structure for esp::core::Configuration. A different structure can be easily designed to replace esp::core::Configuration as the base class for the various configuration attributes that Habitat uses.

If nobody is using an esp::core::Configuration currently outside of the Metadata attributes/configuration subsystem, then the changes proposed by this PR can probably proceed with impunity; if, however, the ConfigurationGroup functionality backing habitat's Configuration is desired to remain, then I should instead create another class to back the attributes.

How Has This Been Tested

Locally, all c++ and python tests pass.

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have completed my CLA (see CONTRIBUTING)
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Aug 11, 2021
@Skylion007
Copy link
Contributor

Really would like to hear @mosra's input on this. Might be a better solution.

@jturner65 jturner65 force-pushed the Config_UseMaps branch 4 times, most recently from 484205c to 144b87d Compare August 16, 2021 13:16
@jturner65
Copy link
Contributor Author

jturner65 commented Aug 16, 2021

Discussing this refactor with @mosra, he suggested I investigate a tagged union storage container, so this push introduces this. Now all variables are stored in a single map, regardless of type, and accessed appropriately based on the "tag"-specified type they are. A std::variant/std::visitor system would have been safer and more streamlined, but these require c++17, which we do not currently support.

@jturner65 jturner65 force-pushed the Config_UseMaps branch 2 times, most recently from 41ba02c to 813ccca Compare August 17, 2021 01:24
@jturner65 jturner65 changed the title esp::core::Configuration changed to use unordered_maps instead of Magnum::ConfigurationGroups esp::core::Configuration changed to use unordered_map of tagged union instead of Magnum::ConfigurationGroups Aug 17, 2021
Copy link
Contributor

@eundersander eundersander left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall seems good. It is quite verbose, e.g. lots of switch statements... hopefully we aren't adding new types often.

I spotted a number of changes that seemed to be unrelated to the config stuff. Let's move them to separate PRs so they get properly reviewed.

Side note: we should move to C++17 soon! I guess a lot of this is re-inventing std::any/std::variant.

src/esp/core/Configuration.cpp Outdated Show resolved Hide resolved
src/esp/core/Configuration.cpp Show resolved Hide resolved
src/esp/metadata/attributes/SceneAttributes.h Outdated Show resolved Hide resolved
src/esp/physics/PhysicsObjectBase.h Outdated Show resolved Hide resolved
tests/test_configs.py Show resolved Hide resolved
src/esp/core/Configuration.h Outdated Show resolved Hide resolved
Copy link
Contributor

@aclegg3 aclegg3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also like Mosra's opinion before merge. Overall, I like this approach.

I chatted with @jturner65 and came up with a couple of other nice-to-have features:

  1. a query for all keys as a map/dict (key -> type enum) and supporting type enum bindings.
  2. a get(key, type) python function for easy access out of the dict returned by (1.)
  3. sub configuration accessor bindings: get a sub configuration as a separate Configuration object (nested hierarchal structure) with option to get copy or ref and to register modified copies back into a subconfiguration
  4. query list of sub configuration keys

return std::to_string(d);
case ConfigStoredType::String:
return s;
case ConfigStoredType::MagnumVec3: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't Magnum have some sort of serializer for these types @mosra?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has, one for debug output (where the goal is human readability) and one for (INI-style, space-delimited) configuration values. Neither of those produces what's supposedly needed here, tho (JSON-like formatting).

Copy link
Collaborator

@mosra mosra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through the python bindings and the "other code" first, will focus on the internals in a separate review.

const Magnum::Quaternion& val) {
const auto ptr = self.editUserConfiguration();
return ptr->set(key, val);
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those could all be just set_user_config overloads without the type name, no? Or, even more pythonic, implement either a __setitem__ operator or __setattr__, so instead of

foo.set_user_config("bar", bar)

you'd do

foo.user_config['bar'] = bar

or

foo.user_config.bar = bar

The last one might be a bit controversial, argparse uses it for example but it might be confusing to some. Ask the others for opinion first :)

src/esp/bindings/AttributesBindings.cpp Outdated Show resolved Hide resolved
Comment on lines 17 to 26
py::enum_<ConfigStoredType>(m, "ConfigStoredType")
.value("Unknown", ConfigStoredType::Unknown)
.value("Boolean", ConfigStoredType::Boolean)
.value("Integer", ConfigStoredType::Integer)
.value("Double", ConfigStoredType::Double)
.value("String", ConfigStoredType::String)
.value("MagnumVec3", ConfigStoredType::MagnumVec3)
.value("MagnumQuat", ConfigStoredType::MagnumQuat)
.value("MagnumRad", ConfigStoredType::MagnumRad);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggestion is out of my league and I don't know how to express it via pybind (maybe @Skylion007 could help?) but what about get_type() returning a Python type object? So you'd have e.g.

foo.set_string('bar', 'a string')
assert foo.get_type('bar') == str

Same comment about API naming here as well, could be just foo.bar = 'a string'. And because I think some minor overhead doesn't matter that much, getting the type could be as simple as this, instead of foo.get_type('bar') -- because with the interface designed like this, I think the need for querying just a type and not the value is very minimal:

type(foo.bar)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

py::type::of is what you are looking for: pybind/pybind11#2364

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do need to have an equivalent to 'ConfigStoredType::unknown' to handle the edge case where the value has not been properly initialized. Is there a python type equivalent i could use?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

py::none would be the thing, I'd say? Similar semantics to a null pointer.

src/esp/bindings/CoreBindings.cpp Outdated Show resolved Hide resolved
Comment on lines 154 to 169
.def(
"has_rad",
[](Configuration& self, const std::string& key) {
return self.checkMapForKeyAndType(key, ConfigStoredType::MagnumRad);
},
R"(Returns true if specified key references a Magnum::Rad value in this configuration.)")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about usefulness of the has_* APIs, now that you have everything stored in a single map there isn't a possibility of the same key being used for two different types and so this could be expressed simply as this and there's no need to have a dedicated overload for each and every type.

foo.get_type('bar') == mn.Rad

(assuming the above suggestion with get_type() returning a Python type object, which makes it a lot shorter than ConfigStoredType.MagnumRad)

src/esp/physics/PhysicsObjectBase.h Outdated Show resolved Hide resolved
src/tests/AttributesManagersTest.cpp Outdated Show resolved Hide resolved
Copy link
Contributor Author

@jturner65 jturner65 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to strip out all the string concatenation changes and put them in another PR. (The only modifications will be what are required to support the underlying changes in the Configuration).

Copy link
Collaborator

@mosra mosra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another batch, about the internals, I still need to look deeper into the breadcrumb parts.

Sorry in advance for all the comments! :)

src/esp/core/Configuration.h Outdated Show resolved Hide resolved
src/esp/metadata/attributes/AttributesBase.h Outdated Show resolved Hide resolved
src/esp/core/Configuration.h Outdated Show resolved Hide resolved
src/esp/core/Configuration.h Outdated Show resolved Hide resolved
src/esp/core/Configuration.h Outdated Show resolved Hide resolved
src/esp/core/Configuration.h Outdated Show resolved Hide resolved
src/esp/core/Configuration.h Outdated Show resolved Hide resolved
src/esp/core/Configuration.cpp Outdated Show resolved Hide resolved
src/esp/core/Configuration.cpp Outdated Show resolved Hide resolved
src/esp/core/Configuration.h Outdated Show resolved Hide resolved
@Skylion007
Copy link
Contributor

A std::variant/std::visitor system would have been safer and more streamlined, but these require c++17, which we do not currently support.
This would be nice to have and PyBind11 has native support for std::variant. Everything seems to compile just fine under C++17 so I would in favor of upgrading as well. https://pybind11.readthedocs.io/en/stable/advanced/cast/stl.html#c-17-library-containers. There are also some minor improvements in PyBind11 you get when using C++17 thanks to fold expressions and better constexpr support.

@jturner65
Copy link
Contributor Author

A std::variant/std::visitor system would have been safer and more streamlined, but these require c++17, which we do not currently support.

This would be nice to have and PyBind11 has native support for std::variant. Everything seems to compile just fine under C++17 so I would in favor of upgrading as well. https://pybind11.readthedocs.io/en/stable/advanced/cast/stl.html#c-17-library-containers. There are also some minor improvements in PyBind11 you get when using C++17 thanks to fold expressions and better constexpr support.

Just FYI, when I said safer and more streamlined, I meant as compared to the former tagged-union refactor that this PR originally introduced. The system that I replaced it with (with substantial guidance from @mosra ) does not have the same safety concerns.

@mosra
Copy link
Collaborator

mosra commented Aug 22, 2021

Re std::variant, to clarify my view and repeat to what I said to John privately -- I think what we created here does everything we need, is safe, memory- and runtime-efficient, compiles fast and is reasonably easy to maintain / extend for new types.

On the other hand, std::variant<bool, int, double, Vector3, Quaternion, Rad, std::string, Configuration, ...> would have notorious issues with compile times, where the compile-time cost of accessing a value scales linearly with the number of types a variant can store, and it can quickly become unbearable to the point where people have to "optimize" by storing e.g. a float as a Vector3 to reduce the amount of types used. (Not to mention the "fun" one has to go through with std::visit().) Yes, I'm skeptical about C++17 features.

image

About python bindings, I think we're fine here, exposing all types in a Pythonic way wasn't too complicated.

@jturner65 jturner65 force-pushed the Config_UseMaps branch 2 times, most recently from 6702022 to 9d11bef Compare August 22, 2021 17:09
Copy link
Contributor

@aclegg3 aclegg3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Once the last couple review changes are addressed I think we can ship this.

@jturner65 jturner65 merged commit be9bc69 into facebookresearch:master Aug 25, 2021
@jturner65 jturner65 deleted the Config_UseMaps branch August 25, 2021 13:34
@jturner65 jturner65 changed the title esp::core::Configuration changed to use unordered_map of tagged union instead of Magnum::ConfigurationGroups esp::core::Configuration changed to use unordered_map of variant-like instead of Magnum::ConfigurationGroups Aug 25, 2021
Copy link
Collaborator

@mosra mosra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more bits to add to #1452, it was easier to add the review here because this PR has the full code.

src/esp/bindings/ConfigBindings.cpp Show resolved Hide resolved
R"(Retrieves a string representation of the value referred to by the passed key.)")

.def("get_bool_keys",
&Configuration::getStoredKeys<ConfigStoredType::Boolean>)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last thing, I don't feel getStoredKeys() needs to be templated, instead it could take the type as a parameter (getKeysByType() and since you have the ConfigStoredType exposed anyway, all these overloads could be just one get_keys_by_type() with the ConfigStoredType as a parameter.

"key"_a)

.def(
"has_bool",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, there could be a single hasKeyOfType() / has_key_of_type(), taking a ConfigStoredType.

return std::to_string(r.operator float());
}
default:
ESP_CHECK(true, "Unknown/unsupported Type in ConfigValue::getAsString.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is always true, and the error message will never ever be displayed.
Is it a bug here?

consider using:

CORRADE_INTERNAL_ASSERT_UNREACHABLE();
return ""; // dummy to avoid compile warning

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of similar places like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These checks are reachable only if a new type was added to the ConfigStoredType enumeration but not supported by case entries. The ESP_CHECKs were added to provide feedback to illuminate this condition so that it may be remedied, should other safeguards against this situation fail.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, missed that, should have been false. @jturner65 can you fix these? Basically all occurences of ESP_CHECK(true, ... are wrong. And since it's unlikely to be hit from user code, even less Python code, it doesn't need to be an ESP_CHECK() either. This is what you're looking for:

CORRADE_ASSERT_UNREACHABLE("Unknown/unsupported Type in ConfigValue::getAsString");

(btw., @bigbike, the return isn't needed, the macro contains a compiler-specific "unreachable code" annotation which does the right thing with no warnins)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! thanks for finding this @bigbike and @mosra

case ConfigStoredType::MagnumRad:
return cfg.setValue(key, get<Mn::Rad>());
default:
ESP_CHECK(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is another place that needs to be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants