# Understanding Feature Attributes

## Overview 
 
Understanding and properly utilizing Feature Attributes is the most important step for successful use of Howso. Feature attributes can be specified manually but are often built using the `infer_feature_attributes()` (IFA) utility function. 

This notebook explores how to configure IFA, and feature attributes in general, using a user interface designed to simplify and inform the manipulation of feature attributes.
It follows the most basic workflow for using Howso Engine to gain insight into your data. This workflow is a form of exploratory data analysis and can help you develop a deeper understanding of the data.

Reference: [Docs - Feature Attributes](https://docs.howso.com/user_guide/basic_capabilities/feature_attributes.html)

In [1]:
import os
import pandas as pd
import plotly.io as pio
import plotly.express as px
from pmlb import fetch_data
from howso.engine import Trainee
from howso.utilities import infer_feature_attributes

pio.renderers.default = os.getenv("HOWSO_RECIPE_RENDERER", "notebook")
df = fetch_data('iris', local_cache_dir="../../data/iris")


## Section 1: Load and Infer

We'll begin with the loaded Iris data set and use a naive `infer_feature_attributes()` call to generate an initial set of feature attributes.

In [2]:
# Any params you wish to use during infer_feature_attributes may be included here.
# "features" may contain any features, or their configurations, that should be used as given rather than inferred
initial_params = { "features": {} }

# Infer feature attributes
initial_features = infer_feature_attributes(df, **initial_params)
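
As a minimal sketch of a non-empty `features` argument, the block below pre-configures two features and leaves the rest to inference. The column names (`target`, `sepal-length`) are an assumption about how the PMLB copy of Iris is labeled, and the chosen types are illustrative only; check `df.columns` before reusing them.

```python
# Hypothetical pre-configuration: the column names assume the PMLB copy of Iris.
preset_features = {
    # Treat the class label as a category rather than a number.
    "target": {"type": "nominal"},
    # Pin one measurement as continuous; unlisted features are still inferred.
    "sepal-length": {"type": "continuous"},
}

initial_params = {"features": preset_features}

# With Howso installed and `df` loaded as above, the call itself is unchanged:
# initial_features = infer_feature_attributes(df, **initial_params)
print(sorted(initial_params["features"]))
```

Attributes supplied this way are expected to be kept as given, with any attributes you leave out filled in by inference.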

## Section 2: Review

### Section 2a: Review as a DataFrame

You may use the `.to_dataframe()` utility to get a summary of the features for read-only review.

In [None]:
initial_features.to_dataframe()

### Section 2b: Review Interactively

For interactive configuration, you may wish to launch the included `FeaturesAttributesControl` widget.

This functionality is currently in `alpha` status. You may safely rely on the parameters it outputs, but the Python code used to load, render, and pull values out of the widget is expected to change.

In [4]:
import pathlib
from IPython.display import HTML
import ipyreact
import json

# Setup notebook relative widget file path
widget_path = pathlib.Path("../../widgets/node_modules/@howso/jupyter-notebook-react-components/lib")
# Define your theme preference
widgets_theme_mode = "dark" # "auto" (OS or browser preference), "dark" or "light"

In [None]:
# Output the required global css
css_rules = pathlib.Path(widget_path / "index.esm.css").read_text()
HTML("<style>" + css_rules + "</style>")

In [None]:
# Launch the widget - This step must be done manually, not as part of run all.
class FeatureAttributesWidget(ipyreact.ValueWidget):
    _esm = widget_path / "index.esm.js"

feature_attributes_widget = FeatureAttributesWidget(
    props={
        "control": "FeaturesAttributesControl",
        "themeMode": widgets_theme_mode,
    },
    value=json.dumps({ "features": initial_features }),
)
feature_attributes_widget

We can access the updated `infer_feature_attributes()` arguments coming out of the widget at any time by deserializing its `.value` attribute.


In [None]:
configured_infer_feature_parameters = json.loads(feature_attributes_widget.value) 
configured_infer_feature_parameters

`configured_infer_feature_parameters` contains the full parameter set, and its `features` entry includes the full set of values for every feature. You should review the JSON object for important changes you've made versus the `initial_features` output. You are advised to *omit* as many configurations as possible, allowing inference to select the most appropriate values dynamically.

Values you wish to save can be copied into a new object, `updated_parameters`. Future versions of the widget will produce a smaller output, containing only the parameters required and a `features` object made up of only the changes you made.
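
Until the widget produces that smaller output, the comparison can be approximated by hand. The sketch below is not part of Howso: `changed_attributes` is a hypothetical helper, and the stand-in dictionaries only imitate the shape of `initial_features` and of the deserialized widget value. It keeps the attributes that were added or changed and does not detect attributes removed in the widget.

```python
import json


def changed_attributes(initial, configured):
    """Return only the per-feature attributes that are new or differ from `initial`."""
    changes = {}
    for name, attrs in configured.items():
        baseline = initial.get(name, {})
        delta = {
            key: value
            for key, value in attrs.items()
            if key not in baseline or baseline[key] != value
        }
        if delta:
            changes[name] = delta
    return changes


# Stand-in data; in the notebook these would come from `initial_features`
# and `json.loads(feature_attributes_widget.value)`.
initial = {
    "sepal-length": {"type": "continuous", "decimal_places": 1},
    "target": {"type": "continuous", "decimal_places": 0},
}
widget_value = json.dumps({
    "features": {
        "sepal-length": {"type": "continuous", "decimal_places": 1},
        "target": {"type": "nominal", "decimal_places": 0},
    }
})

configured = json.loads(widget_value)["features"]
updated_parameters = {"features": changed_attributes(initial, configured)}
print(updated_parameters)
```

Only the edited `type` of `target` survives here, which keeps `updated_parameters` small and leaves every other attribute to inference.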

In [8]:
# Parameters to carry forward from the widget output
updated_parameters = {
    # Any parameters you wish to maintain
    "features": {
        # Any feature specific attributes you wish to maintain
    }
}
# Re-run inference with the retained parameters, then create the Trainee
configured_features = infer_feature_attributes(df, **updated_parameters)
t = Trainee(features=configured_features)

# Train
t.train(df)

With your Trainee configured and trained, you are now free to use it for further _Howso_ operations.

## Section 3: Caveats

### Bundled providers

Some providers require all assets to be included in a single bundle. For those providers, you may use the following code instead of the separate CSS and widget cells above:

```py
import traitlets

# Launch the widget
class FeatureAttributesWidget(ipyreact.ValueWidget):
    _esm = widget_path / "index.bundle.esm.js"
    name = traitlets.Unicode().tag(sync=True)

feature_attributes_widget = FeatureAttributesWidget(
    props={
        "control": "FeaturesAttributesControl",
        "themeMode": widgets_theme_mode,
    },
    value=json.dumps({ "features": initial_features }),
)
feature_attributes_widget
```

Providers known to require a bundle:
- Databricks
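
If the same notebook has to run on both kinds of provider, the choice of file can be made in one place. The following is a rough sketch rather than part of the widget package: `widget_entrypoint` is a hypothetical helper, and detecting Databricks through the `DATABRICKS_RUNTIME_VERSION` environment variable is an assumption about that platform, so pass `bundled` explicitly if it does not hold for you.

```python
import os
import pathlib


def widget_entrypoint(widget_path, bundled=None):
    """Return the bundled or unbundled widget module path under `widget_path`."""
    if bundled is None:
        # Assumption: Databricks runtimes export DATABRICKS_RUNTIME_VERSION.
        bundled = "DATABRICKS_RUNTIME_VERSION" in os.environ
    filename = "index.bundle.esm.js" if bundled else "index.esm.js"
    return pathlib.Path(widget_path) / filename


widget_path = pathlib.Path(
    "../../widgets/node_modules/@howso/jupyter-notebook-react-components/lib"
)
# Force the bundled file, as a bundle-only provider would need.
print(widget_entrypoint(widget_path, bundled=True).name)
```

The returned path can be assigned to `_esm` in the widget class; when the bundled file is selected, the separate CSS cell is not needed.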