Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate automated schema loading from config options #9307

Closed
wants to merge 8 commits into from

Conversation

jdavcs
Copy link
Member

@jdavcs jdavcs commented Feb 4, 2020

Update: this is ready for review.

Delegate loading+processing configuration from schema to separate module.

config/__init__.py is doing too much. This is an attempt to separate 2 different responsibilities: (1) automatically loading and processing all config properties from the schema; and (2) setting individual config properties and handling special cases. The former is delegated to a ConfigurationLoader module. The idea is to use this loader module for any further automated schema processing, whereas GalaxyAppConfiguration is to be used for processing individual config properties (as it did before).

Last year's modifications to the config system, while simplifying configuration loading, have done the opposite to the configuration module's code. This modification is an attempt to address this problem.

(Edit: more detailed rationale in commit messages)

@jdavcs jdavcs added status/WIP kind/refactoring cleanup or refactoring of existing code, no functional changes area/configuration Galaxy's configuration system labels Feb 4, 2020
Rationale: group related items together. Constants OPTION_DEFAULTS and
UNKNOWN_OPTION are closely related and are used in the same context.

OPTION_DEFAULTS is only used in `schema.py` and `config_manage.py`; the
latter imports items from `schema.py`. Thus, it is reasonable to place
both constants in the same module.
This commit drops the exception that guarded against accidental
overwriting of existing attributes, and instead ignores the setattr
call, logging a debug message (which was initialy proposed in galaxyproject#8728).

Rationale: with the previous version, base configuration properties
(config_dir, data_dir, etc.) had to be set *after* this method was
called (otherwise an exeption was raised), but *before* any path
resolving code (because these properties are used for path
resolution). As a result, config loading required a rigid sequence of steps:
1: load defaults and kwargs, create all config attributes.
2: set base configuration properties [CANNOT MOVE THIS]
3: resolve paths, do the rest: load individual properties, etc.

However, setting base configuration properties is a special case that is
*independent of schema loading*, is handled separately (in
set_config_base), and takes presendence over schema loading procedures.

This modification makes config loading more flexible, and logical:
1. set base configuration properties
2. do the rest (now any refactoring can go here)
Main configuration module is doing too much.

Separate automated schema loading from loading individual config options
and misc. special cases.
Also, _raw_config is no longer "private", so remove underscore
@jdavcs jdavcs removed the status/WIP label Feb 5, 2020
@jdavcs jdavcs marked this pull request as ready for review February 5, 2020 18:46
@galaxybot galaxybot added this to the 20.05 milestone Feb 5, 2020
Copy link
Member

@bgruening bgruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great refactoring @ic4f!

@jdavcs
Copy link
Member Author

jdavcs commented Feb 10, 2020

Thank you for reviewing, @bgruening!

test/unit/config/test_load_config.py Outdated Show resolved Hide resolved
test/unit/config/test_load_config.py Outdated Show resolved Hide resolved
test/unit/config/test_load_config.py Outdated Show resolved Hide resolved
test/unit/config/test_load_config.py Outdated Show resolved Hide resolved
jdavcs and others added 2 commits February 10, 2020 20:34
Co-Authored-By: Nicola Soranzo <nicola.soranzo@gmail.com>
resolve(key)
def _load_config(self, kwargs):
loader = ConfigurationLoader()
loader.load(self, kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here loader.load() has the side effect of modifying values in self, which makes me wonder if ConfigurationLoader is the right abstraction here.
I don't have an alternative solution though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That side effect is by design: creating schema-loaded attributes on the config object is one of the loader's primary responsibilities. You have a point, of course: by doing so the loader manipulates the internal structure of the config object. Of course, these two end up being very tightly coupled.

I'm not sure whether this is a good enough solution; it might be. Or maybe, alternatively, the loader could prepare a dict and supply it to the config, and then the config would set its own attributes based on that data (and that data would be the current raw_config). That might be a better approach. I'll give this some more thought tomorrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they are so tightly coupled there is no point in separating some methods in a different class, IMO. This kind of side effects need to be very well documented, and will still surprise other developers.

The alternative approach could work, I guess. Maybe @jmchilton has suggestions? I tend to refer to him as expert in class/module separation 🙇‍♂️

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nsoranzo you are right - it is not the right abstraction. We can do better. I'll close this in favor of a follow-up. Thank you for the detailed review!

@jdavcs
Copy link
Member Author

jdavcs commented Feb 13, 2020

I'm closing this in favor of #9371 (currently WIP)

@jdavcs jdavcs closed this Feb 13, 2020
jdavcs added a commit to jdavcs/galaxy that referenced this pull request Feb 15, 2020
Rationale: group related items together. Constants OPTION_DEFAULTS and
UNKNOWN_OPTION are closely related and are used in the same context.

OPTION_DEFAULTS is only used in `schema.py` and `config_manage.py`; the
latter imports items from `schema.py`. Thus, it is reasonable to place
both constants in the same module.

(cherry-picked from closed galaxyproject#9307)
jdavcs added a commit to jdavcs/galaxy that referenced this pull request Feb 15, 2020
This commit drops the exception that guarded against accidental
overwriting of existing attributes, and instead ignores the setattr
call, logging a debug message (which was initialy proposed in galaxyproject#8728).

Rationale: with the previous version, base configuration properties
(config_dir, data_dir, etc.) had to be set *after* this method was
called (otherwise an exeption was raised), but *before* any path
resolving code (because these properties are used for path
resolution). As a result, config loading required a rigid sequence of steps:
1: load defaults and kwargs, create all config attributes.
2: set base configuration properties [CANNOT MOVE THIS]
3: resolve paths, do the rest: load individual properties, etc.

However, setting base configuration properties is a special case that is
*independent of schema loading*, is handled separately (in
set_config_base), and takes precendence over schema loading procedures.

This modification makes config loading more flexible, and logical:
1. set base configuration properties
2. do the rest (now any refactoring can go here)

(cherry-picked from closed galaxyproject#9307)
jdavcs added a commit to jdavcs/galaxy that referenced this pull request Mar 2, 2020
Rationale: group related items together. Constants OPTION_DEFAULTS and
UNKNOWN_OPTION are closely related and are used in the same context.

OPTION_DEFAULTS is only used in `schema.py` and `config_manage.py`; the
latter imports items from `schema.py`. Thus, it is reasonable to place
both constants in the same module.

(cherry-picked from closed galaxyproject#9307)
jdavcs added a commit to jdavcs/galaxy that referenced this pull request Mar 2, 2020
This commit drops the exception that guarded against accidental
overwriting of existing attributes, and instead ignores the setattr
call, logging a debug message (which was initialy proposed in galaxyproject#8728).

Rationale: with the previous version, base configuration properties
(config_dir, data_dir, etc.) had to be set *after* this method was
called (otherwise an exeption was raised), but *before* any path
resolving code (because these properties are used for path
resolution). As a result, config loading required a rigid sequence of steps:
1: load defaults and kwargs, create all config attributes.
2: set base configuration properties [CANNOT MOVE THIS]
3: resolve paths, do the rest: load individual properties, etc.

However, setting base configuration properties is a special case that is
*independent of schema loading*, is handled separately (in
set_config_base), and takes precendence over schema loading procedures.

This modification makes config loading more flexible, and logical:
1. set base configuration properties
2. do the rest (now any refactoring can go here)

(cherry-picked from closed galaxyproject#9307)
jdavcs added a commit to jdavcs/galaxy that referenced this pull request Mar 9, 2020
Rationale: group related items together. Constants OPTION_DEFAULTS and
UNKNOWN_OPTION are closely related and are used in the same context.

OPTION_DEFAULTS is only used in `schema.py` and `config_manage.py`; the
latter imports items from `schema.py`. Thus, it is reasonable to place
both constants in the same module.

(cherry-picked from closed galaxyproject#9307)
jdavcs added a commit to jdavcs/galaxy that referenced this pull request Mar 9, 2020
This commit drops the exception that guarded against accidental
overwriting of existing attributes, and instead ignores the setattr
call, logging a debug message (which was initialy proposed in galaxyproject#8728).

Rationale: with the previous version, base configuration properties
(config_dir, data_dir, etc.) had to be set *after* this method was
called (otherwise an exeption was raised), but *before* any path
resolving code (because these properties are used for path
resolution). As a result, config loading required a rigid sequence of steps:
1: load defaults and kwargs, create all config attributes.
2: set base configuration properties [CANNOT MOVE THIS]
3: resolve paths, do the rest: load individual properties, etc.

However, setting base configuration properties is a special case that is
*independent of schema loading*, is handled separately (in
set_config_base), and takes precendence over schema loading procedures.

This modification makes config loading more flexible, and logical:
1. set base configuration properties
2. do the rest (now any refactoring can go here)

(cherry-picked from closed galaxyproject#9307)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/configuration Galaxy's configuration system kind/refactoring cleanup or refactoring of existing code, no functional changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants