Skip to content

defaults, sid and env #81

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ftalbrecht opened this issue Nov 11, 2014 · 9 comments
Closed

defaults, sid and env #81

ftalbrecht opened this issue Nov 11, 2014 · 9 comments
Milestone

Comments

@ftalbrecht
Copy link
Contributor

We have to decide on how to handle defaults, sid generation and env in the future, see the discussion in pr #77 for instance. If I remember correctly todays discussion resulted in the following decisions (please corret me if I rememeber wrong):

  • It will be possible to add certain variables to defaults that are not going to influence the sid, as in (_sid_ignore may not be the final name):
    @defaults('bar1', 'bar2', _sid_ignore=['foobar'])
    def foo(bar1=1, bar2=2, foobar=None):
        pass
  • The above mechanism will mainly be used for I/O related functionality within pyMOR itself, like the cache location, log levels, plot backends, etc...
  • The PYMOR_DEFAULTS environment variable can be set to a list of filenames (instead of only one file), all of which are used. If a default is defined several times in these files a warning will be printed and the last occurence will be used. This allows to have harware independent defaults which affect the sid in a pymor_defaults.py that can be tracked by revision control for reproducability and a second pymor_defaults-$HOSTNAME.py with locations/settings specific to that machine.
  • In the long run there will be no environment variables read within pyMOR apart from PYMOR_DEFAULTS

Is this a correct summary, @sdrave, @andreasbuhr, @renemilk?

@sdrave
Copy link
Member

sdrave commented Nov 12, 2014

Sounds good to me, except for the part of removing all other environment variables. We still need some simple mechanisms for e.g. disabling caching for certain tests, and so on.

@sdrave sdrave modified the milestones: 0.3, 0.4 Nov 12, 2014
@sdrave
Copy link
Member

sdrave commented Jan 11, 2015

Having multiple defaults files is now implemented in 0df1069.
Implementing the sid_ignore functionality turned out to be harder than expected. Main problem is that you can only know which defaults should be ignored for sid calculation after importing all pyMOR modules with defaults. However, doing so creates various default argument objects like ConstantFunction() which need to know the sid of the defaults object during creation.

I spent some time thinking about this, also discussing with @renemilk and @andreasbuhr. There are various possibilities how to solve this:

Easiest would be to allow and force the user to specify for each default that is being set if it should enter sid calcuation. One could still have the sid_ignore argument, from which the standard setting would be derived. Using write_defaults_to_file would also output the current sid settings. On the plus side, this approach would be simple and give the user more power. On the other hand, a novice user probably will not understand the whole sid mechanism at all and use the power to produce hard to track down errors.

Next possibility would be to introduce a second mechanism to specify defaults which do not enter sid calculation. Let's call them io_options for now. We would introduce, write_io_options_to_file, read_io_options_from_file, print_io_options, set_io_options. Maybe there would also be a separate @io_options decorator. This approach would also be a quite clean and straightforward solution, but I dislike having two elaborate mechanisms in pyMOR which are basically doing the same thing.

Finally, would could let all defaults enter sid calculation, but only those which are needed to obtain a desired state. To be more precise: When a new instance of ImmutableInterface is created we start recording which defaults are actually used until __init__ has finished. The same would have to be done when a cached function is called. However, there are various technical details to keep in mind:

  • Tracking the used defaults would have to be done on a stack-like data structure, in case calls to __init__ or cached functions are nested.
  • Before the result of a cached function has been calculated, we do not know which defaults are used during calcuation. Thus, the defaults cannot become part of the key for cache lookup but rather have to be written as part of the value. If the key matches, we then have to retrieve the cached result along with the used defaults and check if they match.
  • If a cached result is used during __init__ or inside another cached function, the functions used for calculating this result will not be called. Thus the cached decorator will have to mark these defaults on its own.
  • NumpyMatrixBasedOperator currently implements its own trivial caching which would have to be aware of this problem. (A better choice would probably be to remove this mechanism, anyhow.)

Despite the complexity, I believe the last approach should work and, if so, would be the most elegant option. Since I would like to release soon and implementing this might break things, I would propose changing the milestone back to 0.4.

@ftalbrecht, is having multiple defaults files and not implementing sid_ignore ok for now? I still believe, #77 should be implemented using defaults, so this could not be done for the 0.3 release. (A good workaround might be to modify the loggers in the user code for now.)

@ftalbrecht
Copy link
Contributor Author

Thank you for your work! To be quite honest I am not fully aware of all consequences of the three approaches you suggest:

  • to me the first one sounds exactly like the one I suggested to begin with. Since I guess it is not, could you give a short example, countering the one I gave at the top?
  • I agree that the second approach is out of the question.
  • At the moment I can not say anything useful about the third approach you suggested. It seems reasonable to me but I can neither tell how much work it will be to implement or what the long term consequences would be.
    All in all I would trust your judgement in this one. I see no problem regarding [core.logger] allow to enable simultaneous file logging #77, I will implement it soon...

@sdrave
Copy link
Member

sdrave commented Jan 12, 2015

@ftalbrecht, the first approach would mean that not only you are able to specify sid_ignore with @defaults, but the user would be also allowed to say in her defaults.py that, e.g., the bicgstab maxiter should not enter sid calculation.

If the trivial caching is removed from the assemble method, implementing the third approach shouldn't be too hard, but it should be tested thoroughly ...

@sdrave sdrave modified the milestones: 0.4, 0.3 Jan 12, 2015
@ftalbrecht
Copy link
Contributor Author

Thank you for the clarification, this is (of couse) out of the question!

@sdrave
Copy link
Member

sdrave commented Jan 12, 2015

@ftalbrecht, what is out of the question?

@ftalbrecht
Copy link
Contributor Author

To allow the specifiation of sid_ignore in a defaults.py. I think that would invite hard to find errors...

@sdrave
Copy link
Member

sdrave commented Jan 12, 2015

Sure, but write_defaults_to_file would write the in-code defined settings to a file along with a big warning to better not change these settings. But I agree that the other approach would be more attractive.

@sdrave
Copy link
Member

sdrave commented Apr 29, 2015

After thinking about it some more, I have finally implemented the original proposal in commits 52a4873...1813d5d.

I am quite happy with the result. Some details:

  • The defaults decorator has now an additional keyword argument sid_ignore as described above.
  • To know which defaults enter sid calculation, we now import every module for which a default is defined.
  • write_defaults_to_file adds unchanged defaults only as a code comment to pymor_defaults.py, so we only need to import all of pyMOR, if the user removes all the comments.
  • I have removed the environment variables to configure caching.
  • The problem which I described above that the state of the defaults has to be known during import time becomes irrelevant with my changes in the 'sid_take_two' branch. Right now it is only a modest problem, since import order can only change if you add more settings to your pymor_defaults.py.

@sdrave sdrave closed this as completed Apr 29, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants