
Prototype VVP #38

Closed
raar1 opened this issue Nov 5, 2018 · 9 comments

@raar1 (Collaborator)

raar1 commented Nov 5, 2018

The more I look at the library structure so far, the more it seems clear that the UQP/VVP distinction is artificial - helpful for the proposal writing, but not necessarily relevant for implementation.

For example, the uqp/ directory contains two sub-directories: sampling and analysis. This would imply that any VVPs we implement would now go into a new top-level directory called vvp/. However, I think a lot of validation primitives would fit naturally into uqp/analysis, or at least overlap massively with it.

As I see it, there are four categories:
sampling/ analysis/ comparison/ vis/

Do we insist on the UQP/VVP distinction, i.e.:

uqp/sampling
uqp/analysis
vvp/comparison
vvp/vis

Should we do away with this split and have each category sit on the top level? Does comparison belong in its own dir at all?

raar1 assigned raar1 and dww100 Nov 5, 2018
@djgroen (Contributor)

djgroen commented Nov 5, 2018

Some validation primitives fit in with UQPs, but I think a fair number of them do not.

I think the relation is likely as follows:

  1. VVPs are likely to rely on UQPs for underlying mechanisms.
  2. UQPs are unlikely to rely on VVPs for underlying mechanisms.

Also, some VVPs are likely to operate on aspects that are not at all covered by any UQPs (e.g. direct coupling intercommunication).

But these are things that, within VECMA, we plan to work out during the December 10/11 meeting.

@dww100 (Collaborator)

dww100 commented Nov 5, 2018

I can see your point, Robin, and also a case for keeping vis out of it.

Maybe:
primitives/sampling
primitives/analysis
primitives/comparison

I guess a better way to think about this is: what would we need in common base classes for each category of thing?

@raar1 (Collaborator, Author)

raar1 commented Nov 6, 2018

Some ideas and discussion points for primitive Base classes:

sampling:

  • Must implement a function in the vein of add_runs() (or generate_runs())
  • Function could either take a campaign as its first arg (essentially as it's currently done)
  • OR function behaves like a generator/iterator, and the campaign object adds runs returned by this generator (until exhaustion)
  • I prefer something like the latter, because I would rather external objects did not have permission to modify the campaign object's internal state (in this case, by adding runs directly to the run dict). I would much prefer a design philosophy in which the campaign object is responsible for maintaining its own integrity, with all verification done within the campaign, rather than assuming/expecting primitives (some of which will be written by users) to behave appropriately and not corrupt the campaign object's internal variables.
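The generator-style design above could look something like the following sketch. All class and method names here (BaseSampler, SweepSampler, Campaign, generate_runs) are hypothetical, not the current library API:

```python
from abc import ABC, abstractmethod


class BaseSampler(ABC):
    """Hypothetical base class: samplers yield run dicts and never
    touch the campaign's internal state directly."""

    @abstractmethod
    def generate_runs(self):
        """Yield run-parameter dicts until the sampling scheme is exhausted."""


class SweepSampler(BaseSampler):
    """Trivial example sampler: one run per value in a sweep."""

    def __init__(self, values):
        self.values = values

    def generate_runs(self):
        for v in self.values:
            yield {"param": v}


class Campaign:
    def __init__(self):
        self.runs = {}  # the campaign owns and validates its own run dict

    def add_runs(self, sampler):
        # all integrity checks happen here, inside the campaign,
        # not inside the (possibly user-written) sampler
        for i, run in enumerate(sampler.generate_runs()):
            if not isinstance(run, dict):
                raise TypeError("sampler must yield dicts")
            self.runs[f"run_{i}"] = run


campaign = Campaign()
campaign.add_runs(SweepSampler([0.1, 0.2, 0.3]))
```

The point of the sketch is that the sampler exposes runs passively (as a generator) and only the campaign writes to its run dict.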

analysis:

  • There is already a BaseAnalysisUQP, but it doesn't mandate any function implementations (it just provides helper functions), so it doesn't (yet) specify a common interface for analysis primitives
  • Both of its subclasses (basic_stats and ensemble_boot) have a run_analysis() method. Maybe we can mandate the implementation of a method with that name via the Base class?
  • Both subclasses also immediately check for a pandas data_frame - could this be another common point between analysis primitives?
  • Building of the analysis summary is currently done manually in each subclass. As the output style is known, we could probably formalize this, or auto-check it or something.
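Mandating run_analysis() and centralizing the pandas check might look like this sketch (class names here are illustrative, not the existing BaseAnalysisUQP interface):

```python
from abc import ABC, abstractmethod

import pandas as pd


class BaseAnalysis(ABC):
    """Hypothetical base: mandates run_analysis() and provides the
    DataFrame check that both current subclasses do by hand."""

    def check_input(self, data_frame):
        if not isinstance(data_frame, pd.DataFrame):
            raise RuntimeError("analysis primitives require a pandas DataFrame")

    @abstractmethod
    def run_analysis(self, data_frame):
        """Return a summary in the agreed common output form."""


class BasicStats(BaseAnalysis):
    """Toy subclass standing in for the real basic_stats primitive."""

    def run_analysis(self, data_frame):
        self.check_input(data_frame)
        return {"mean": data_frame.mean().to_dict()}
```

Because run_analysis is an abstractmethod, any subclass that forgets to implement it cannot even be instantiated, which enforces the common interface at class-definition time.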

comparison:

  • compare() I suppose, but with what arguments? Is it possible to prescribe arguments meaningfully without restricting all the different sorts of data one might be comparing with?
  • Enforce/formalize a common output form (as I would like to do for the analysis primitives)
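One way to leave the argument types open while still formalizing the output form is to prescribe only the result structure, not the input types. A sketch (all names hypothetical):

```python
from abc import ABC, abstractmethod


class BaseComparison(ABC):
    """Hypothetical base: compare() takes two datasets of any type;
    only the shape of the returned result is prescribed."""

    @abstractmethod
    def compare(self, dataset_1, dataset_2):
        """Return a dict of the common form {"metric": str, "value": float}."""


class MaxAbsDifference(BaseComparison):
    """Example comparison: largest elementwise absolute difference."""

    def compare(self, dataset_1, dataset_2):
        value = max(abs(a - b) for a, b in zip(dataset_1, dataset_2))
        return {"metric": "max_abs_difference", "value": value}
```

Here the base class says nothing about what dataset_1 and dataset_2 are, so each subclass can compare whatever data it likes, yet every result is machine-readable in the same way.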

@dww100 (Collaborator)

dww100 commented Nov 6, 2018

I think that sounds pretty good.

My only question is how to tie the sampling UQPs back to the Campaign, in terms of adding runs, in this new paradigm. Or do you envisage something in the script like:

for run in uqp.add_run(campaign):
  campaign.add_run(run)

where uqp.add_run is a generator?

I'm actually not sure that is a coherent thought but wanted to record it whilst I had the chance.

@raar1 (Collaborator, Author)

raar1 commented Nov 6, 2018

That's exactly what I was thinking of. That way the campaign does the adding to its own list, and any checks etc are all done within campaign.

@dww100 (Collaborator)

dww100 commented Nov 6, 2018

The issue then becomes logging the UQP used in the Campaign, assuming we still want to do that (I like the idea but it could become a real pain).

@raar1 (Collaborator, Author)

raar1 commented Nov 6, 2018

Yes. Maybe the base class could mandate implementation of a function that returns standardized information about the UQP: its name, what it does, etc. The logging would then be done by the campaign object, by calling this function and storing its return value.
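That logging scheme could be sketched like this; get_info, record_uqp, and the class names are all hypothetical placeholders:

```python
class BaseUQP:
    """Hypothetical: every primitive reports standardized metadata
    about itself for the campaign to log."""

    description = "no description"

    def get_info(self):
        return {
            "name": type(self).__name__,
            "description": self.description,
        }


class RandomSampler(BaseUQP):
    description = "Draws runs by uniform random sampling"


class Campaign:
    def __init__(self):
        self.log = []

    def record_uqp(self, uqp):
        # the campaign stores whatever the primitive reports about itself,
        # so the log format is controlled in one place
        self.log.append(uqp.get_info())
```

The campaign never inspects the primitive's internals; it only stores the standardized record the primitive hands over.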

@raar1 (Collaborator, Author)

raar1 commented Nov 6, 2018

Right, I'm thinking about making a BasePrimitive class whose main job will be logging helper stuff for its subclasses (e.g. BaseSampler, BaseAnalysis, BaseComparison).

The question is now: how do we standardise what is logged? And what stuff is general to all primitives, and what is specific?

General to all primitives:

  • The name of the primitive
  • Input type
  • Output type

Or perhaps the input/output types only make sense for the Analysis primitive...

Any ideas?
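One possible answer: let the root class log only the universally shared field (the name), and record input/output types only when a subclass actually defines them. A sketch, with all names hypothetical:

```python
class BasePrimitive:
    """Hypothetical root class holding the logging fields shared by
    all primitives; subclasses opt in to the category-specific ones."""

    name = "unnamed"
    input_type = None
    output_type = None

    def log_record(self):
        record = {"name": self.name}
        # input/output types may only make sense for some categories
        # (e.g. analysis), so record them only when a subclass sets them
        if self.input_type is not None:
            record["input_type"] = self.input_type
        if self.output_type is not None:
            record["output_type"] = self.output_type
        return record


class AnalysisPrimitive(BasePrimitive):
    input_type = "pandas.DataFrame"
    output_type = "summary dict"


class SamplerPrimitive(BasePrimitive):
    name = "sampler"
```

This keeps the general/specific split explicit: BasePrimitive owns the logging mechanism, while each category's base class (BaseSampler, BaseAnalysis, BaseComparison) decides which extra fields apply to it.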

@raar1 (Collaborator, Author)

raar1 commented Dec 10, 2018

The outcome of this discussion (and more) is effectively covered by #50 merge.

raar1 closed this as completed Dec 10, 2018