Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(De)serialization rework #4

Closed
garryod opened this issue Sep 17, 2021 · 3 comments · Fixed by #16
Closed

(De)serialization rework #4

garryod opened this issue Sep 17, 2021 · 3 comments · Fixed by #16
Assignees
Labels
enhancement New feature or request

Comments

@garryod
Copy link
Member

garryod commented Sep 17, 2021

The current (de)serialization system is clunky, it requires significant rework to improve usability of the library.

Current System

The current system operates by defining "config" classes for each interface (e.g. DeviceConfig for Device(Protocol)); such classes act as an apischema tagged union of derived "config" classes. User classes are made configurable by the addition of a nested "config" class (via a decorator or inheritance of a type which applies the decorator), which joins the tagged union of it's template.

At deserilization time, all classes referenced by the configuration file must be loaded prior to apischema deserialzation such that the nested config classes can be created. This necessitates an ugly two stage process in which the config file is first parsed for fully qualified names to load them, and then parsed again to perform deserialization.

On the end user front, this requires the following boilerplate:

class MyDevice(ConfigurableDevice):
    ...

Such that MyDevice can be given a nested class MyDeviceConfig which subtypes DeviceConfig, allowing for apischema deserialization by tagged union where the qualified name of MyDevice.

Target System

An ideal system would have the following benefits over the current system:

  • Requires no additional boilerplate for the end user
  • Removes the need for passing of "config" objects
  • Perform a single parsing of the configuration file

Proposed System

I expect deserialization would likely operate as follows, though I would love to hear suggestions of alternatives:

The config file (yaml) is read & parsed with `PyYAML`
Recursively, for each object in the parsed config:
    If module of object not already loaded:
        Load module of object
        Generate config (data)class for object type - based on __init__ signature
    Check class is instance of the expected type
    Check generated config class is instance of the config of the expected type (possibly superfluous)
    Instantiate config with configured parameters
    Build object and pass / pass config

The primary issue with this system stems from the fact that we wish to inject some init arguments during the instantiation process, there exist several possible solutions to this problem, the merits of which must be weighed:

  1. Allow for passing of either built objects or configs which implement a build() method which takes injected arguments
  2. Do not perform argument injection in __init__, instead pass them to methods such as run_forever()
  3. Require passing of configs which implement a build() method which takes injected arguments
@garryod garryod added the enhancement New feature or request label Sep 17, 2021
@thomascobb
Copy link
Contributor

Here's a sketch of how the 3 would work, with on demand importing implemented:
https://gist.github.com/thomascobb/c6fb7e385d7cfb64765da921a489ec8f

The downsides:

  • You can't have a subclass of Config as a field within a subclass of Config. This is similar to the current approach (you can't have a subclass of Adapter as a field of a subclass of Adapter) so I don't think this is a problem.
  • We basically have 2 parallel deserialization roots. The first is the existing tagged union, but this will only be used to make a json deserialization schema (after all modules are imported) to aid editing. The second is the on demand importing, which will be used when reading a config file

What do you reckon?

@garryod
Copy link
Member Author

garryod commented Sep 20, 2021

I reckon that looks good! Got a few questions though:

  • Is the need to retain the existing tagged union something you'd be able to elaborate on, I'm not sure I can see where it'd be used?

  • Do you have any thoughts on how we would auto-magically generate such config classes? My initial idea is below:

    1. Generate a config dataclass with fields matching the constructor signature, but all Optional and initially None
    2. Load any field values supplied by the config using apischema
    3. Replace field values with those passed to by build()
    4. Check field values satisfy constructor signature & build an instance

    This idea doesn't provide any static safety, but I don't really see that being required since it is always to interact with external data at run time. What do you think?

Also had a thought over the weekend, we could change build() to __call__(), as such Callable[...,MyClass] would hold true for both MyClassConfig and Type[MyClass] which would be rather nice for passing objects.

@thomascobb
Copy link
Contributor

Is the need to retain the existing tagged union something you'd be able to elaborate on, I'm not sure I can see where it'd be used?

VSCode YAML plugin supports a JSON schema to give you hints and validation when editing YAML files. I'd like tickit to be able to produce a JSON schema for its config file, but to do this we need the deserialization to be defined statically using those unions rather than dynamically as it encounters each key. Then you can create the schema like this: https://github.com/epics-containers/ibek/blob/8397dc972e9164c511a96f4035f652c49739f5d4/ibek/__main__.py#L43

Do you have any thoughts on how we would auto-magically generate such config classes?

I'll defer this question until I actually make some of these classes. I intend to make these classes manually, then only automate if we need to. Having only looked at the cryocooler so far, it doesn't seem complicated enough to have to generate the config class, so I was going to make it manually.

Also had a thought over the weekend, we could change build() to __call__(), as such Callable[...,MyClass] would hold true for both MyClassConfig and Type[MyClass] which would be rather nice for passing objects.

Sounds good, I'll do that

thomascobb pushed a commit that referenced this issue Sep 23, 2021
@garryod garryod mentioned this issue Oct 1, 2021
thomascobb pushed a commit that referenced this issue Oct 5, 2021
garryod added a commit that referenced this issue Oct 20, 2021
* Try to fix #4 for cryostream only

* Fix the mypy error for raise_interrupt

* Try to fix #4 for cryostream only

* Fix the mypy error for raise_interrupt

* Ported everything apart from nested and tests

* Support SystemSimulation

* Fixed remaining tests

* Fix docs

* Update tickit/core/management/event_router.py

Co-authored-by: Garry O'Donnell <56754322+garryod@users.noreply.github.com>

* Update tickit/cli.py

Co-authored-by: Garry O'Donnell <56754322+garryod@users.noreply.github.com>

* Update tickit/cli.py

Co-authored-by: Garry O'Donnell <56754322+garryod@users.noreply.github.com>

* Update tickit/core/components/system_simulation.py

Co-authored-by: Garry O'Donnell <56754322+garryod@users.noreply.github.com>

* Update tickit/core/components/device_simulation.py

Co-authored-by: Garry O'Donnell <56754322+garryod@users.noreply.github.com>

* Fix event_router indentation

* Add test for as_tagged_union and fix bug

* Rename lifetime_runnable to runner

* Fix test_runner to use correct mock library

Co-authored-by: Garry O'Donnell <56754322+garryod@users.noreply.github.com>
Co-authored-by: Garry O'Donnell <garry.o'donnell@diamond.ac.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants