# Converting Imperative Python Classes into Models

Creating a declarative, descriptive computing interface on top of existing Python libraries requires connecting the imperative Python code behind the scenes to the user-facing controls. The first step in that process is taking the Python classes that power the code and turning them into models: JSON schemas that can be used for data validation. There are three key pieces to that process: the imperative Python class (rewritten as a dataclass), type hinting, and the `ytBaseModel` class, which is based on pydantic's `BaseModel` class. The idea is to turn the imperative class into a dataclass that has typed attributes and lists `ytBaseModel` as its base class (the "argument" in parentheses in the class definition).

Half of that task is easy: pick the class you want to convert, add type hints to the attributes (if there aren't any already), and add `ytBaseModel` as the base class. The other half requires some domain knowledge. You have to know which attributes are required for the class to work, where the class should fit into the larger model, and where that class lives in the original codebase. Once you have figured that out, adding new classes to the analysis schema is quick and easy.

## Step 1: Pick a class to convert and identify the attributes and variables necessary for it to function

For example, in yt we have a class that creates a sphere out of the data, based on coordinates the user provides. In yt, the class looks something like this (for demo purposes, I've commented out code that relies on other classes or functions):


In [1]:
class YTSphere():
    """
    A sphere of points defined by a *center* and a *radius*.

    Parameters
    ----------
    center : array_like
        The center of the sphere.
    radius : float, width specifier, or YTQuantity
        The radius of the sphere. If passed a float,
        that will be interpreted in code units. Also
        accepts a (radius, unit) tuple or YTQuantity
        instance with units attached.

    Examples
    --------

    >>> import yt
    >>> ds = yt.load("RedshiftOutput0005")
    >>> c = [0.5,0.5,0.5]
    >>> sphere = ds.sphere(c, (1., "kpc"))
    """
    _type_name = "sphere"
    _con_args = ('center', 'radius')
    def __init__(self, center, radius, ds=None,
                 field_parameters=None, data_source=None):
        # validate_center(center)
        # validate_float(radius)
        # validate_object(ds, Dataset)
        # validate_object(field_parameters, dict)
        # validate_object(data_source, YTSelectionContainer)
        super(YTSphere, self).__init__(center, ds,
                                           field_parameters, data_source)
        # Unpack the radius, if necessary
        # radius = fix_length(radius, self.ds)
        # if radius < self.index.get_smallest_dx():
        #     raise YTSphereTooSmall(ds, radius.in_units("code_length"),
        #                            self.index.get_smallest_dx().in_units("code_length"))
        self.set_field_parameter('radius', radius)
        self.set_field_parameter("center", self.center)
        self.radius = radius

## Step 2: Create a Python dataclass with the same name and add attributes with the correct type hints

Looking at this class, it's clear that two arguments are required for it to work: `center` and `radius`. From the docstring, we can see that `center` is array-like (so a list will do) and that `radius` is either a float, a tuple, or a special yt data container called `YTQuantity`. The docstring doesn't say which types the tuple accepts, so we can look at the example: the first element should be a float and the second a string. The docstring also doesn't explain what a `YTQuantity` is, so the type hint in the model below leaves it out.

Using that information, we can now create a dataclass for sphere with all the parts it needs to function. We name the class and add `ytDataObjectAbstract` as its base class. This is a `BaseModel` class meant just for yt data objects, the group that sphere belongs to (it iterates through the data registry instead of all yt classes). For demo purposes, I've used plain `BaseModel` so we can see what the model looks like. I add the `center` and `radius` attributes, _keeping exactly the same names as the attributes in the original class_. This is important because the analysis schema uses the attribute names to match the correct arguments and pass the data on accordingly.

The last piece, also critical for analysis schema functionality, is the `_yt_operation` name. This is a string that matches the internal name of the imperative class to the internal name of the dataclass. In this example the string is `"sphere"`, the same value as `_type_name` in `YTSphere` above. This is how yt knows that these two classes, the original class and the dataclass, are really the same thing.

Below is what the dataclass version of sphere looks like. I've also printed the model that is generated by inheriting from `BaseModel`. The dataclass produces a JSON schema, which provides a form of data validation that controls what kind of data can be entered and then run. This is the foundation of the declarative user interface: the dataclass-turned-model gives users a structured guide for entering data. The model validates the user workflow, but behind the scenes the analysis schema runs the imperative code (the original class, with the attributes as its arguments) and produces an output. In this example I used `BaseModel`, but in the analysis schema different versions of `ytBaseModel` are used to both create the model and run the imperative code.

The user doesn't see the JSON printed below, but they will use it to validate any data they enter. For example, any time in their workflow that they want to create a sphere, the analysis schema will make sure they add a `center` and a `radius` in the correct format.

In [6]:
from pydantic import BaseModel, Field
from typing import List, Union, Tuple, Optional

class Sphere(BaseModel):
    """A sphere of points defined by a *center* and a *radius*.
    """
    center: List[float] = Field(alias="Center")
    radius: Union[float, Tuple[float, str]] = Field(alias="Radius")
    _yt_operation: str = "sphere"

print(Sphere.schema_json(indent=2))

{
  "title": "Sphere",
  "description": "A sphere of points defined by a *center* and a *radius*.\n    ",
  "type": "object",
  "properties": {
    "Center": {
      "title": "Center",
      "type": "array",
      "items": {
        "type": "number"
      }
    },
    "Radius": {
      "title": "Radius",
      "anyOf": [
        {
          "type": "number"
        },
        {
          "type": "array",
          "items": [
            {
              "type": "number"
            },
            {
              "type": "string"
            }
          ]
        }
      ]
    }
  },
  "required": [
    "Center",
    "Radius"
  ]
}
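
To make the validation described above concrete, here is a minimal sketch that feeds a few inputs to the same `Sphere` model. It uses plain pydantic rather than the analysis schema, and it leaves out `_yt_operation` for brevity because that attribute plays no part in validation. The coordinate and radius values are made up for illustration.

```python
from typing import List, Tuple, Union

from pydantic import BaseModel, Field, ValidationError


class Sphere(BaseModel):
    """A sphere of points defined by a *center* and a *radius*."""

    center: List[float] = Field(alias="Center")
    radius: Union[float, Tuple[float, str]] = Field(alias="Radius")


# A bare float radius and a (value, unit) pair both match the type hint.
plain = Sphere(Center=[0.5, 0.5, 0.5], Radius=2.0)
with_units = Sphere(Center=[0.5, 0.5, 0.5], Radius=(1.0, "kpc"))
print(plain.radius, with_units.radius)

# Leaving out a required attribute is rejected by the model itself,
# before any imperative code would have a chance to run.
try:
    Sphere(Center=[0.5, 0.5, 0.5])
    missing_radius_rejected = False
except ValidationError as err:
    missing_radius_rejected = True
    print(err)
```

Note that data is entered under the aliases (`Center`, `Radius`), which are the property names in the printed schema, while the Python attributes keep the original lowercase names.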


## Adding the right class
It is important to give the dataclass the correct `BaseModel` subclass as its base class. Different classes are designed for different functionality, including `ytBaseModel` and `ytDataObjectAbstract`. Both inherit from pydantic's `BaseModel` and have a function called `_run` that cycles through yt and finds the correct functions to run.
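
The dispatch idea behind `_run` can be sketched roughly as follows. This is not the analysis schema's actual implementation: `REGISTRY`, `make_sphere`, and `SketchBaseModel` are hypothetical stand-ins for yt and `ytBaseModel`, used only to show how a model could use `_yt_operation` to look up an imperative function and pass its attributes along as arguments.

```python
from typing import Callable, Dict, List, Tuple, Union

from pydantic import BaseModel


def make_sphere(center, radius):
    """Hypothetical stand-in for the imperative yt class."""
    return {"kind": "sphere", "center": center, "radius": radius}


# Hypothetical registry mapping internal names to imperative callables.
REGISTRY: Dict[str, Callable] = {"sphere": make_sphere}


class SketchBaseModel(BaseModel):
    """Assumed base class; only illustrates the lookup-and-call pattern."""

    _yt_operation: str = ""

    def _run(self):
        # Find the imperative callable whose internal name matches.
        func = REGISTRY[self._yt_operation]
        # Iterating a pydantic model yields (attribute name, value) pairs,
        # so the attribute names double as keyword argument names.
        return func(**dict(self))


class Sphere(SketchBaseModel):
    center: List[float]
    radius: Union[float, Tuple[float, str]]
    _yt_operation: str = "sphere"


result = Sphere(center=[0.5, 0.5, 0.5], radius=(1.0, "kpc"))._run()
print(result)
```

The sketch also shows why the attribute names must match the original class: they are passed straight through as keyword arguments.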


## Next Steps

Once the dataclass has become a model, it can be integrated into the larger analysis schema model by adding the dataclass reference (its name) to that model, either as an attribute or type hint on another model, or as a type hint on the main model. In the example below, `Sphere` is added as part of the `data_source` attribute.


In [5]:
class SlicePlot(BaseModel):
    # ds: Optional[Dataset] = Field(alias="Dataset")
    # fields: FieldNames = Field(alias="FieldNames")
    # axis: str = Field(alias="Axis")
    # center: Optional[Union[str, List[float]]] = Field(alias="Center")
    # width: Optional[Union[List[str], Tuple[int, str]]] = Field(alias="Width")
    data_source: Optional[Sphere]
    # Comments: Optional[str]
    # _yt_operation: str = "SlicePlot"
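
As a rough check of what nesting does, the sketch below rebuilds a cut-down `SlicePlot` and inspects the schema it generates. Two things are assumptions made for the sketch: `data_source` is given an explicit `None` default, and the schema lookup tries both the pydantic 2 and pydantic 1 method names, because the two versions also store nested models under different keys (`$defs` and `definitions`).

```python
from typing import List, Optional, Tuple, Union

from pydantic import BaseModel, Field


class Sphere(BaseModel):
    center: List[float] = Field(alias="Center")
    radius: Union[float, Tuple[float, str]] = Field(alias="Radius")


class SlicePlot(BaseModel):
    # Explicit default so the attribute is optional in either pydantic version.
    data_source: Optional[Sphere] = None


# pydantic 2 exposes model_json_schema(); pydantic 1 exposes schema().
build_schema = getattr(SlicePlot, "model_json_schema", None) or SlicePlot.schema
schema = build_schema()

# The nested model is listed once and referenced from data_source.
definitions = schema.get("$defs") or schema.get("definitions") or {}
print(sorted(definitions))

# A plain dict supplied for data_source is validated and turned into a Sphere.
plot = SlicePlot(data_source={"Center": [0.5, 0.5, 0.5], "Radius": (1.0, "kpc")})
print(type(plot.data_source).__name__)
```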

You've converted a Python class into an analysis schema model! To test it, follow these steps:

1. Run the model to create the schema file. The new class should appear in the definitions section and be referenced correctly wherever it was placed.
2. Enter data into a JSON file using the schema for validation. The class and its attributes should be available to enter data in.
3. Run the JSON file. The data you entered should be passed to that class, and the result should be reflected in the output.
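
The data-entry step can be tried without yt by validating a JSON file directly against the model. The sketch below is only an approximation of that step: the file names, the `load_workflow` helper, and the cut-down `SlicePlot` are hypothetical, and actually running the workflow is left out because it depends on the analysis schema and yt.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple, Union

from pydantic import BaseModel, Field, ValidationError


class Sphere(BaseModel):
    center: List[float] = Field(alias="Center")
    radius: Union[float, Tuple[float, str]] = Field(alias="Radius")


class SlicePlot(BaseModel):
    data_source: Optional[Sphere] = None


def load_workflow(path: Path) -> SlicePlot:
    """Hypothetical helper: read a JSON file and validate it against SlicePlot."""
    return SlicePlot(**json.loads(path.read_text()))


with tempfile.TemporaryDirectory() as tmp:
    # A well-formed document: JSON has no tuples, so the radius is a 2-item list.
    good = Path(tmp) / "good.json"
    good.write_text(json.dumps(
        {"data_source": {"Center": [0.5, 0.5, 0.5], "Radius": [1.0, "kpc"]}}
    ))
    workflow = load_workflow(good)

    # A document that forgets the radius fails validation when it is loaded.
    bad = Path(tmp) / "bad.json"
    bad.write_text(json.dumps({"data_source": {"Center": [0.5, 0.5, 0.5]}}))
    try:
        load_workflow(bad)
        bad_rejected = False
    except ValidationError:
        bad_rejected = True

print(workflow.data_source, bad_rejected)
```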