Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: arch: Mutable config properties #1122

Merged
merged 4 commits into from Aug 9, 2021

Conversation

pdxjohnny
Copy link
Member

@pdxjohnny pdxjohnny commented May 25, 2021

  • We started with NamedTuples, those had immutable properties
  • Then we moved to dataclasses, we currently have properties and mutable
  • Now we want to move to a dataclass or dataclass like approach where we default to having properties be immutable. We add a mutable keyword argument to field() which would signify that the config property is allowed to be modified without creating an entirely new instance of the config object.
    • Motivation here is that for model hyperparameter tuning we need to support in a standard way how properties of models can be tuned at runtime without instantiating an entirely new model class.
    • Another motivation which is father off is automl. When we go to do automl, whatever is “the” automl engine we implement will need to know which properties of model configs it can tweak in an automated fashion, and which things, for example the network architecture for neural networks, or even so basic as the directory property are not allowed be tweaked automatically.
  • Implementation
  • This discussion stemmed from lack of clarity on model saving and loading. Hashim was unsure when implementing a tutorial what should happen if a model is instantiated in Python with certain hyperparameters but the model exists in a saved state on disk.
    • What currently happens (at least in the case of scikit) is that the saved state is load and the config properties are ignored essentially, because they are not used to instantiate the scikit model class.
    • There is another ADR within the referenced PR that talks about model saving and loading. We essentially decided that in memory properties should override saved properties provided that those properties are mutable. And this is where the mutable discucussion came from.
  • We decided that these changes will need to take place as a part of Saahil’s project.

-------

See :doc:`0003-Config-Property-Mutable-vs-Immutable` for more context. These
documents we started during the same Weekly Sync Meeting (2021-05-25).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- What happens if package version changes?

- Should we record what default value was and use old value in event of
change? Or should we use the new value?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codecov-commenter
Copy link

codecov-commenter commented May 25, 2021

Codecov Report

Merging #1122 (a74aae7) into master (c61efb4) will decrease coverage by 0.11%.
The diff coverage is 71.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1122      +/-   ##
==========================================
- Coverage   86.32%   86.21%   -0.12%     
==========================================
  Files         168      168              
  Lines       10543    10624      +81     
  Branches     1732     1746      +14     
==========================================
+ Hits         9101     9159      +58     
- Misses       1100     1117      +17     
- Partials      342      348       +6     
Impacted Files Coverage Δ
dffml/df/memory.py 79.79% <0.00%> (-0.11%) ⬇️
dffml/source/df.py 75.60% <0.00%> (-0.94%) ⬇️
dffml/source/dir.py 82.53% <50.00%> (-1.07%) ⬇️
dffml/service/dev.py 71.53% <66.66%> (-0.03%) ⬇️
dffml/util/python.py 75.67% <69.23%> (-15.24%) ⬇️
dffml/base.py 83.52% <77.55%> (-1.15%) ⬇️
dffml/model/model.py 80.83% <100.00%> (+0.16%) ⬆️
dffml/source/file.py 97.89% <100.00%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c61efb4...a74aae7. Read the comment docs.

@mhash1m
Copy link
Contributor

mhash1m commented Jun 11, 2021

Tried out the patch from our last meeting on the example in Config Property Mutable vs Immutable.
It didn't work as-is (after implementing the getter) as the immutable setter gets called on default. I bypassed the exception to see what's happening and I get this:

File "/home/hash1m/miniconda3/lib/python3.7/dataclasses.py", line 357, in wrapper
    result = user_function(self)
  File "<string>", line 3, in __repr__
  File "/home/hash1m/dffml/dffml/base.py", line 336, in getter
    return self._data[key]
KeyError: 'operations'

So I'm guessing we need to find a way to somehow not set these initial keys and avoid them so it doesn't blow up, right?

Anyway, I tried setting the setter exclusively to make_mutable and the example seems to work fine (doesn't blow up and asserts pass).

So I have the following ambiguities:

  • How does this work with the interface? Are we going to have the user give a flag in the config when instantiating the model ie. along with features and directory. In this case, what's the link between the parameter flag and our data class flag.
  • I'm assuming we won't be explicitly setting each hyperparameter as mutable as we did in the example config(C: int = field("C hyperparameter", mutable=True)), so how'd that work?

ping @pdxjohnny

@pdxjohnny pdxjohnny changed the title docs: arch: Immutable config properties docs: arch: Mutable config properties Jun 17, 2021

.. todo::

If a model is instantiated with config. Saved state in directory. Saved state
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stuff was copy pasted since these two docs started as one

@pdxjohnny
Copy link
Member Author

pdxjohnny commented Jun 18, 2021

@sakshamarora1 will pick up investigation of the patch and communication between @mhash1m and @programmer290399

  • How does this work with the interface? Are we going to have the user give a flag in the config when instantiating the model ie. along with features and directory. In this case, what's the link between the parameter flag and our data class flag.
  • I'm assuming we won't be explicitly setting each hyperparameter as mutable as we did in the example config(C: int = field("C hyperparameter", mutable=True)), so how'd that work?
  • As model authors should we handle all possible hyperparameters and make them mutable?

    • Ideally model authors (of Model subclasses) would make their mutable callback function handle all hyperparameters of the underlying model.
    • However, realistically handling mutability for some hyperparameters will be more complicated than others.
      • This is why mutable=True is an opt in, rather than an opt out.
  • In the future we can have a way to set mutable=True as the default for every property

    • Let's NOT do this as a part of the initial implementation though
    • This could be implemented as a helper function which calls the config() decorator
      • We could also just modify the make_config_inspect/numpy/tensorflow/etc.() to auto detect things like ints and floats

.. code-block:: python

class MyModel(Model):
CONFIG = MyModelConfig
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CONFIG = MyModelConfig
CONFIG = MyModelConfig
CONTEXT = ModelContext

@pdxjohnny pdxjohnny force-pushed the arch_immutable branch 4 times, most recently from 0312db5 to 8f54870 Compare August 4, 2021 11:49
Signed-off-by: John Andersen <johnandersenpdx@gmail.com>
@pdxjohnny pdxjohnny force-pushed the arch_immutable branch 2 times, most recently from 0e24573 to f8d2037 Compare August 4, 2021 16:40
pdxjohnny and others added 3 commits August 4, 2021 09:45
Checks if a caller is being called from a given method of a given object.

Signed-off-by: John Andersen <johnandersenpdx@gmail.com>
Co-authored-by: mHash1m <hashimchaudry23@gmail.com>
Co-authored-by: Saahil Ali <programmer290399@gmail.com>
Co-authored-by: Sudhanshu kumar <sudhanshukumar5459@gmail.com>
Signed-off-by: John Andersen <johnandersenpdx@gmail.com>
As defined by arch/0003-Config-Property-Mutable-vs-Immutable.rst

Related: ipython/ipython#1456

Signed-off-by: John Andersen <johnandersenpdx@gmail.com>
@mhash1m
Copy link
Contributor

mhash1m commented Aug 4, 2021

I pulled the branch and played with it.
Spent the last couple of hours trying to make it work on a preset config from a previous session 🤦

Anyway, everything works perfectly fine while working in the same session.

Here's a little preview of how I verified mutable properties.

class ScikitConfig(ModelConfig, NamedTuple):
    C: int = field("C hyperparameter", mutable=True)
    tol: int = field("tol hyperparameter", mutable=True)

image

Instantiating with C=0 and training blows up the model.
Then we set model.config.C = 1 and it works 🎉

This is great, thank you @pdxjohnny .

Question:

What kind of support would this require for scikit models?

Followup:

#1186 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants