-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Auto registering of module dependencies #1338
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for getting started on this @mnjowe, this looks like a good first pass! I've made some suggested changes. A lot of these are just minor formatting tweaks and a few cases where I think using different variable names would improve readability.
There are a series of superflous whitespace changes in test_analysis.py
that I have suggested reverting - not sure if your editor setting is using a different default indent level - we are generally using 4 spaces? As I think the added test would be better situated in the tests/test_module_dependencies.py
test module, it maybe better to just move across the new test function there and then revert the tests/test_analysis.py
file to its previous state so it untouched by this PR.
I think it's also worth us thinking about whether we want to continue to have both data_folder
and resourcefilepath
as argument names referring to the same thing. Both are currently used in code, but as data_folder
argument to read_parameters
has not be used in practice, I think possibly just sticking to resourcefilepath
everywhere for consistency might be better as that will make it clearer to everyone that the argument to Simulation.register
has the same role as the current resourcefilepath
argument to module initialisers. @tbhallett any thoughts on this?
src/tlo/dependencies.py
Outdated
get_dependencies: DependencyGetter = get_init_dependencies, data_folder: Path = None, auto_register_modules: bool = | ||
False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_dependencies: DependencyGetter = get_init_dependencies, data_folder: Path = None, auto_register_modules: bool = | |
False | |
get_dependencies: DependencyGetter = get_init_dependencies, | |
data_folder: Optional[Path] = None, | |
auto_register_modules: bool = False, |
For consistency better to put line breaks after each argument and avoid introducing a break between the auto_register_modules
argument name and it's default value as it makes it a bit difficult to quickly see these are linked.
Also if an argument is allowed to accept a value of None
the type hint should allow for this by using Optional
(or equivalently specifying Union[..., None]
where ...
is the original type hint).
src/tlo/dependencies.py
Outdated
:param data_folder: resource files folder | ||
:param auto_register_modules: whether to register missing modules or not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:param data_folder: resource files folder | |
:param auto_register_modules: whether to register missing modules or not | |
:param data_folder: Resource files folder. | |
:param auto_register_modules: Whether to register missing modules or not. Any missing | |
modules will be registered with default values for their initialiser arguments. |
Argument descriptions in the docstring should ideally be full sentences. Also think it's worth adding a note about values used for module arguments to make this explicit.
src/tlo/dependencies.py
Outdated
module_class = get_module_class_map(set())[dependency](resourcefilepath=data_folder) | ||
module_instance_map.update({dependency: module_class}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
module_class = get_module_class_map(set())[dependency](resourcefilepath=data_folder) | |
module_instance_map.update({dependency: module_class}) | |
module_instance = get_module_class_map(set())[dependency](resourcefilepath=data_folder) | |
module_instance_map[dependency] = module_instance |
While get_module_class_map(set())[dependency]
evaluates to a module class, the value returned by calling its initialiser method is a module instance so I think module_instance
would be a more descriptive name here and more consistent with naming of module_instance_map
.
Also to add a new entry to a dictionary, the more idiomatic way is just to use an indexed assignment statement rather than the update
method (which is typically used to add multiple values from another dictionary at once).
To avoid repeatedly calling get_module_class_map
here, it would also be better to call this once outside of the for dependency in sorted(dependencies):
loop and assign to a variable module_class_map
and then reuse this in the snippet above as it doesn't depend on the value of dependency
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the third point, the module get_module_class_map
is called once in the for dependency in sorted(dependencies):
and that is done within :
if dependency not in module_instance_map:
if auto_register_modules:
This makes its not redundant. It would have been redundant if it was placed just above if statement. What do you think @matt-graham @mnjowe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @jkumwenda. Even inside that if statement it is still redundant because it will keep getting initialised at every dependecy in the for loop that's not in the module instance map. @matt-graham is right, we need to initialise that outside the for loop and re-use it within that it statement block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thank you for clarification, We will resolve that and do a quick debug before committing.
@@ -368,7 +369,7 @@ def test_get_parameter_functions(seed): | |||
|
|||
# Check that the parameter identified exists in the simulation | |||
assert ( | |||
name in sim.modules[module].parameters | |||
name in sim.modules[module].parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name in sim.modules[module].parameters | |
name in sim.modules[module].parameters |
Don't think this whitespace change is needed (4 space visual indent is what we would usually use) so suggesting reverting.
@@ -88,7 +89,7 @@ def initialise_simulation(self, sim: Simulation): | |||
|
|||
# At INFO level | |||
assert ( | |||
len(output["tlo.methods.dummy"]["_metadata"]["tlo.methods.dummy"]) == 2 | |||
len(output["tlo.methods.dummy"]["_metadata"]["tlo.methods.dummy"]) == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(output["tlo.methods.dummy"]["_metadata"]["tlo.methods.dummy"]) == 2 | |
len(output["tlo.methods.dummy"]["_metadata"]["tlo.methods.dummy"]) == 2 |
Don't think this whitespace change is needed (4 space visual indent is what we would usually use) so suggesting reverting.
fullmodel(resourcefilepath=resourcefilepath) | ||
+ [ | ||
ImprovedHealthSystemAndCareSeekingScenarioSwitcher( | ||
resourcefilepath=resourcefilepath | ||
), | ||
DummyModule(), | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fullmodel(resourcefilepath=resourcefilepath) | |
+ [ | |
ImprovedHealthSystemAndCareSeekingScenarioSwitcher( | |
resourcefilepath=resourcefilepath | |
), | |
DummyModule(), | |
] | |
fullmodel(resourcefilepath=resourcefilepath) | |
+ [ | |
ImprovedHealthSystemAndCareSeekingScenarioSwitcher( | |
resourcefilepath=resourcefilepath | |
), | |
DummyModule(), | |
] |
Don't think this whitespace change is needed so suggesting reverting.
"ImprovedHealthSystemAndCareSeekingScenarioSwitcher" | ||
== list(sim.modules.keys())[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"ImprovedHealthSystemAndCareSeekingScenarioSwitcher" | |
== list(sim.modules.keys())[0] | |
"ImprovedHealthSystemAndCareSeekingScenarioSwitcher" | |
== list(sim.modules.keys())[0] |
Don't think this whitespace change is needed so suggesting reverting.
@@ -586,7 +586,7 @@ def test_summarize(): | |||
names=("draw", "run"), | |||
), | |||
index=["TimePoint0", "TimePoint1"], | |||
data=np.array([[0, 20, 1000, 2000], [0, 20, 1000, 2000],]), | |||
data=np.array([[0, 20, 1000, 2000], [0, 20, 1000, 2000], ]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data=np.array([[0, 20, 1000, 2000], [0, 20, 1000, 2000], ]), | |
data=np.array([[0, 20, 1000, 2000], [0, 20, 1000, 2000],]), |
Don't think this whitespace change is needed so suggesting reverting.
@@ -637,7 +637,31 @@ def test_summarize(): | |||
pd.DataFrame( | |||
columns=pd.Index(["lower", "mean", "upper"], name="stat"), | |||
index=["TimePoint0", "TimePoint1"], | |||
data=np.array([[0.5, 10.0, 19.5], [0.5, 10.0, 19.5],]), | |||
data=np.array([[0.5, 10.0, 19.5], [0.5, 10.0, 19.5], ]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data=np.array([[0.5, 10.0, 19.5], [0.5, 10.0, 19.5], ]), | |
data=np.array([[0.5, 10.0, 19.5], [0.5, 10.0, 19.5],]), |
Don't think this whitespace change is needed so suggesting reverting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have to look into my environment configurations. Its adding these spaces automatically. at first I thought they're changes from merging master in my branch BUT no. I will look into this. I will also revert the extra space changes made to this file. Thanks
tests/test_analysis.py
Outdated
def test_auto_register_modules(tmpdir): | ||
""" check module dependencies can be registered automatically """ | ||
start_date = Date(2010, 1, 1) | ||
# configure logging | ||
log_config = { | ||
"filename": "LogFile", | ||
"directory": tmpdir, | ||
} | ||
sim = Simulation(start_date=start_date, seed=0, log_config=log_config, data_folder=resourcefilepath) | ||
try: | ||
# try executing the code in this block and go to except block if module dependency error exception is fired | ||
|
||
# register modules without their associated dependencies | ||
sim.register(demography.Demography(resourcefilepath=resourcefilepath), | ||
copd.Copd(resourcefilepath=resourcefilepath), | ||
auto_register_modules=True) | ||
|
||
except ModuleDependencyError as exception: | ||
# if auto register modules argument is false, there should be a module dependency error exception fired | ||
assert exception | ||
assert exception.__class__ == ModuleDependencyError | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a test is a good idea but I'm not sure if this is the most appropriate test module for it to go in. I would say tests/test_module_dependencies.py
would probably be the more obvious place? Was the reason for putting this in test_analysis.py
as the imagined use case is that this is mainly for performing analyses with the model where the user won't necessarily know or want to worry about the module dependencies?
I think also the test here could do with a bit of changing. Tests should generally check a piece of code adheres to some expected behaviour. At the moment the only assert
statements in the test are within the except
block which I think should not be hit if the auto_register_modules=True
argument is working as intended. I would say it would be better here to for example check if after registering the specified modules that all required dependencies of the copd.Copd
module are present. For example something like
required_dependencies = get_all_required_dependencies(sim.modules["Copd"])
registered_module_names = set(sim.modules.keys())
assert required_dependencies <= registed_module_names
It would also be good to check the Simulation.register
function raises an exception if auto_register_modules=False
and we don't pass in all required dependencies, which it looks like might have been what you were trying to do here? If so rather than wrap in try...except
block, it would be better to use the pytest.raises
context manager to check a ModuleDependencyError
is raised.
I think a separate unit test of the updated topologically_sort_modules
function in tlo.dependencies
would also be useful which tests the behaviour is as expected when the new data_folder
and auto_register_modules
arguments are explicitly specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @matt-graham. Part of your comment on the test is what also was also going on in my mind(I wasn't fully satisfied with it). Initially my plan was to use with pytest.raises(ModuleDependencyError)
. The only challenge is that this fails when the code is not raising any exception(with an error DID NOT RAISE
) and I wasn't able to find a way how I can capture that(as in how to assert that no exception was raised)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on where to place this test. I think test dependencies will also be fine. I just noted that there is already a test in there(test_missing_dependency_raises_error_on_register
, using dummy modules of course) that checks module dependency and I felt like I should not add another test that to some extent will do the same. Do you want me to extend/update this test with real disease modules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay - for testing that a code snippet doesn't raise an exception the usual pattern is to just have a test function which calls the relevant function / methods, as if an unhandled exception is encountered this will automatically be interpreted as a test failure. Your current test with the sim.register
call moved outside of the try
block (and no corresponding except
block) would work for this. While testing no error is raised is a useful test, I would also say it would be worth testing if no error happens whether the registered dependencies are as expected using something similar to what I suggested above. The test should then fail either if we get a module dependency error unexpectedly or if the module dependencies are not as expected (but we didn't get an error).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So helpful. Thanks @matt-graham
Ah okay - for testing that a code snippet doesn't raise an exception the usual pattern is to just have a test function which calls the relevant function / methods, as if an unhandled exception is encountered this will automatically be interpreted as a test failure. Your current test with the
sim.register
call moved outside of thetry
block (and no correspondingexcept
block) would work for this. While testing no error is raised is a useful test, I would also say it would be worth testing if no error happens whether the registered dependencies are as expected using something similar to what I suggested above. The test should then fail either if we get a module dependency error unexpectedly or if the module dependencies are not as expected (but we didn't get an error).
… the argument is not being implemented in this PR by the way.
Is this just about the name we'll use? On that I don't have a strong feeling... but I suspect 'resoucefilepath' would have instant recognition for everyone, if we moved it. |
Yep, was just checking there wasn't any reason I was missing for having both |
* fix failing test * fix unused import statement * edit optional dependency in demography.py * roll back simulation.py * put kwarg in demography.py * update test * roll back incidental change * factorize calc * add is_alive * roll back incidental changes * make static for clarity * roll back incidental changes --------- Co-authored-by: Tim Hallett <39991060+tbhallett@users.noreply.github.com>
…fix one case (#1349) * Globally disable Pylint E06060 possibly-used-before-assignment rule * Give more informative error message on invalid arguments
Hi @tbhallett and @matt-graham. Below are the next steps I'm taking from all the discussions made on this PR and a conversation we had on slack
Thanks |
… into mnjowe/auto-module-registration # Conflicts: # tests/test_analysis.py
@matt-graham. Just realised there have been some commits from other branches added. I guess its a of s a result of caches in my local master branch when I was trying to revert changes to test analyses. I'm thinking of opening a new PR for addressing the linked issue. This PR seems to have been messed up as there are now many files changed. |
@matt-graham and @tbhallett see PR #1379 for a continuation on this. Thanks |
@tbhallett , @matt-graham , @tamuri . This PR aims at resolving issue #1325.
In a separate issue and PR I think, we can also discuss implementing the newly added
resourcefilepath
argument(in simulation object) to theread parameters
section of disease modules rather than re-creating aresourcefilepath
in each disease module?