Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-15377: More robust parameter handling #83

Merged
merged 9 commits into from Sep 1, 2018
Merged

DM-15377: More robust parameter handling #83

merged 9 commits into from Sep 1, 2018

Conversation

timj
Copy link
Member

@timj timj commented Aug 30, 2018

  • Parameters must be defined in a storage class
  • Parameters for virtual composites are forwarded to the component
  • Parameters are handled by InMemoryDatastore via assemblers.
  • Assemblers are called for any parameters not supported by a formatter.

@timj timj requested a review from TallJimbo August 30, 2018 22:00
@timj timj force-pushed the tickets/DM-15377 branch 2 times, most recently from dbad9e1 to 81659ab Compare August 30, 2018 23:37
Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Only non-trivial comments are about whether we could simplify the implementation a bit by being less flexible (and musing about what kinds of flexibility we actually we need).

if parameters:
unusedParams = {k: v for k, v in parameters.items() if k not in usedParams}
else:
unusedParams = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know whether we should be splitting up responsibility for parameters like this, as opposed to:

  • requiring the combination of all component getters to use all parameters
  • passing even used parameters to the assembler, assuming they're no-ops when applied repeatedly.

Any of these approaches actually works for the only concrete parameters we have for images (because anything that handles bbox should also handle origin, so those are never split up). Row-slicing parameters for catalogs (which don't exist, but could) probably rule out applying the same parameters repeatedly, unless we have some xy0-like way of assigning absolute indices to rows. Column-selection parameters for catalogs (which also don't exist) could probably work with any of these approaches.

So, I don't have a reason not to take this approach. But the second (require all parameters to be used by at least one component) might be the safer approach (just because it's more restrictive) until/unless we have a use case for more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we pass all parameters to get() then datastore can no longer verify against the list of known parameters for that component without every component having to know about the composites that it might at some point be part of. I don't think we want to require that components know all possible composites they could belong to. Passing all parameters to the assembler could work if we know for sure that applying them twice will do no harm (another option is to pass all in to the assembler but also provide the list of those already applied to components, allowing the assembler class to know which might be relevant). I also wondered whether the assembler itself should take parameters rather than doing the assembly and then applying the parameters.

The main problem we have is that parameters have to be applicable to concrete composites and virtual composites with or without formatters being involved. Therefore the assembler has to be able to handle all parameters. handleParameters is called by InMemoryDatastore and for virtual composite assembly, and PosixDatastore calls it for parameters that have explicitly been rejected by the formatters (otherwise generic formatters such as YAMLFormatter have to know about parameters).

ie I think it's a tricky problem. For now would you be happy if I changed handleParameters so that it had an "already applied" argument that could be used for virtual composites?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually unhappy with how things are now; I just wanted to start a discussion about other possibilities. If you think the code as-is is still the best option, I'm happy to go with that.

I certainly agree that we don't want components to know about all of the composites they could be a part of. But if I understand it correctly, I think datastore validation of parameters is more of a "nice to have" than a necessity - that's just about failing earlier with a better error message, right, not correctness?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, only the Exposure formatter tries to validate parameters, but then it's really the only formatter that uses parameters. I'm happy to merge as is and let it evolve a bit. It's hard to get it completely right with only one formatter using parameters and only a couple of tests for them.

inMemoryDataset : `object`
Updated form of supplied in-memory dataset, after parameters
have been used.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe assert that parameters is None or an empty dict?

def knownParameters(self):
"""Return set of all parameters known to this `StorageClass`

The set includes parameters understood by components of a composite.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably get away with requiring the parameters of a composite to be a superset of the parameters of its components, and hence make knownParameters always the same as parameters. At least I can't come up with a kind of parameter for which that would not be true with the StorageClasses and parameters I envision.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For virtual composites the current behavior is exactly what you'd want (components can have their own parameters distinct from the parent). Given the constraint that concrete composite assemblers have to be able to deal with all the parameters you might be right. This automated collecting of child parameters seems like the right approach in general though.

@@ -73,6 +76,22 @@ def testCreation(self):
self.assertIsInstance(sc2.pytype(), StorageClassFactory)
self.assertIn("butler.core", repr(sc2))

def testParameters(self):
"""Test the we can set parameters and validate them"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: the -> that

@timj timj merged commit 873d001 into master Sep 1, 2018
@timj timj deleted the tickets/DM-15377 branch September 1, 2018 22:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants