# FuzzyART in Parts

In the other `FuzzyART` notebook, we implemented a `FuzzyART` module as a class with all of the requisite methods for running it as a standalone module.
Here, we will look a little further down the rabbit hole to understand the moving parts of `FuzzyART` in finer detail.
In programming terms, this is closer to a functional programming style, in that we examine each moving part in isolation as its own function.

## Dependencies

First, we load all of our dependencies for the notebook.

In [1]:
# From scikit-learn, for casting the data to 2D for visualization.
# This is not how the data actually looks in 4D, but the best that we can do is to cast it to 2D such that relative distances are mostly maintained.
from sklearn.manifold import TSNE
# The most common way of importing matplotlib for plotting in Python
from matplotlib import pyplot as plt
# For manipulating axis tick locations
from matplotlib import ticker

## Data

Next, we load our dataset!
Here, we will use the UCI Iris dataset again as a relatively simple example.
This time, we will modularize the preprocessing code a little more!
First, we have the function to load the data:

In [2]:
# Pandas for loading and manipulating data as a DataFrame
import pandas as pd
# For loading the iris dataset as an example
from sklearn.datasets import load_iris

def load_data() -> pd.DataFrame:
    # Load the iris dataset as a DataFrame
    iris = load_iris(as_frame=True)
    # Extract the DataFrame from the dictionary the loader provides
    data = iris['frame']
    # Return the data as a DataFrame
    return data

# Load the data
data = load_data()
# Print the first several rows to get an idea of what it looks like
data.head()

|   | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0 |


This dataset has 4 features, and it does include supervised labels.

Working directly with `DataFrame`s is good for data analysis, but for machine learning, we often have to think more carefully about how we handle our datasets in efficient ways during training and inference, especially when words like "mini-batching", "memory pinning", and "transforms" come up!

### Data Container

For a simple demonstration, we create a convenient container for the preprocessed samples:

In [3]:
# The PyTorch library containing neural network utilities and the Tensor datatype
import torch

# Dataclass for a structured way of passing around a dataset
from dataclasses import dataclass

# Create a simple container for our preprocessed data with some introspection methods
@dataclass
class PreprocessedDataset:
    x: torch.Tensor
    y: torch.Tensor

    # The number of samples
    def n_samples(self):
        return self.y.shape[0]

    # The "original" (not complement-coded) data feature dimension
    def dim(self):
        return self.x.shape[1] // 2

    # The complement-coded dimension
    def dim_cc(self):
        return self.x.shape[1]

# Check out added methods from the decorator.
help(PreprocessedDataset)

Help on class PreprocessedDataset in module __main__:

class PreprocessedDataset(builtins.object)
 |  PreprocessedDataset(x: torch.Tensor, y: torch.Tensor) -> None
 |
 |  PreprocessedDataset(x: torch.Tensor, y: torch.Tensor)
 |
 |  Methods defined here:
 |
 |  __eq__(self, other)
 |      Return self==value.
 |
 |  __init__(self, x: torch.Tensor, y: torch.Tensor) -> None
 |      Initialize self.  See help(type(self)) for accurate signature.
 |
 |  __repr__(self)
 |      Return repr(self).
 |
 |  dim(self)
 |      # The "original" (not complement-coded) data feature dimension
 |
 |  dim_cc(self)
 |      # The complement-coded dimension
 |
 |  n_samples(self)
 |      # The number of samples
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables
 |
 |  __weakref__
 |      list of weak references to the object
 |
 |  ------------------------------------------------------------

Let's create a quick dataset with some random dummy data to see what creating a "formal" dataset class gets us:

In [4]:
# Create a dummy dataset with some dimension and number of samples
dim = 2
n_samples = 5
# Set a manual random seed for reproducibility
torch.manual_seed(1234)
dummy_data = PreprocessedDataset(
    torch.rand((n_samples, dim)),
    torch.randint(0, 3, (n_samples,))
)
dummy_data

PreprocessedDataset(x=tensor([[0.0290, 0.4019],
        [0.2598, 0.3666],
        [0.0583, 0.7006],
        [0.0518, 0.4681],
        [0.6738, 0.3315]]), y=tensor([0, 1, 1, 1, 0]))

One thing that we notice is that we got a fancier string representation (`__repr__`) of the object when it's displayed on the terminal, and we didn't have to write it ourselves.
This is a simple example, but it is one of the beneficial things that we can get when thinking a little bit about how we handle datasets (in this case, hooking into a Python standard library `dataclass` decorator).

Creating an object to hold your data, such as a class annotated with the standard library decorator `@dataclasses.dataclass`, is a useful trick to inherit some boilerplate machinery.
In PyTorch, the most universal API to utilize and interface with datasets is in [`torch.utils.data`](https://docs.pytorch.org/docs/stable/data.html), which includes the `Dataset` and `DataLoader` classes.
For now, we'll stick with this simpler implementation for illustration.
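
For reference, here is a minimal sketch of what the `torch.utils.data` route might look like. The tensors are random stand-in data rather than the Iris samples, and the batch size is an arbitrary choice for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 6 samples with 4 features each
x = torch.rand((6, 4))
y = torch.randint(0, 3, (6,))

# TensorDataset pairs each row of x with the matching entry of y
dataset = TensorDataset(x, y)

# DataLoader handles batching (and optionally shuffling) for us
loader = DataLoader(dataset, batch_size=2, shuffle=False)

# Collect the shape of every mini-batch that the loader yields
batch_shapes = [(xb.shape, yb.shape) for xb, yb in loader]
print(len(dataset), batch_shapes[0])
```

Each iteration over `loader` yields a tuple of feature and label batches, which is where ideas like mini-batching and transforms usually plug in.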

Next up is how to preprocess the data!

### Data Preprocessing

Now we define the function to preprocess the data.
In this example, we do this on the full batch rather than incrementally for the following reason:

`FuzzyART` uses complement coding, which maps $x \rightarrow [x, 1-x]$ and is bounded in $[0, 1]$.
To do this, we need the original $x$ to also be bounded inside $[0, 1]$.
Most real data is not neatly normalized and bounded, so we need to do it ourselves at some point.
However, that requires knowing the bounds of the full data in advance!

Even though we are working with an incremental algorithm, we have the luxury of having all of the Iris dataset up-front, and the dataset surely isn't going to change any time soon.
This is not always the case, especially if you are dealing with streaming datasets, where you are provided samples one at a time!
In those cases, you have two options:

1. Know the statistics of the dataset in advance (e.g., the upper and lower bounds) and preprocess each sample incrementally off of that.
2. Use some sort of intelligent normalization scheme that enforces the bounds of the data to $[0, 1]$, such as a smooth limiting function like the sigmoid $\sigma(x) = \dfrac{1}{1+e^{-x}}$ or a hard limiting function such as clipping.
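
The two options above can be sketched for a single streamed sample as follows. This is a minimal illustration using plain Python lists rather than tensors; the function names, the bounds, and the sample values are all hypothetical and are not part of the notebook's pipeline.

```python
import math

def normalize_known_bounds(x, lower, upper):
    """Option 1: min-max scale one sample using bounds known in advance."""
    scaled = [(v - lo) / (hi - lo) for v, lo, hi in zip(x, lower, upper)]
    # Clip in case a streamed sample falls outside the assumed bounds
    return [min(1.0, max(0.0, s)) for s in scaled]

def normalize_sigmoid(x):
    """Option 2: squash each feature into (0, 1) with the logistic sigmoid."""
    return [1.0 / (1.0 + math.exp(-v)) for v in x]

def complement_code(x):
    """Map x -> [x, 1 - x]; assumes x is already in [0, 1]."""
    return list(x) + [1.0 - v for v in x]

# Hypothetical bounds and sample, for illustration only
lower, upper = [4.0, 2.0], [8.0, 4.0]
sample = [5.0, 3.5]

x_bounds = complement_code(normalize_known_bounds(sample, lower, upper))
x_sigmoid = complement_code(normalize_sigmoid(sample))
print(x_bounds)
```

Note that applying the sigmoid to raw, uncentered features pushes most values close to 1, so in practice you would likely center and scale the data before squashing it.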

In [5]:
# Numpy for handling numpy arrays
import numpy as np
# A sklearn utility for handling normalization of data automatically
from sklearn.preprocessing import MinMaxScaler

# Define a preprocess function that returns a PreprocessedDataset
def preprocess(
    data: pd.DataFrame,
    shuffle: bool = True,
    random_seed: int = 12345,
) -> PreprocessedDataset:
    # Shuffle the data if necessary
    if shuffle:
        np.random.seed(random_seed)
        data = data.sample(frac=1).reset_index(drop=True)

    # Whether shuffled or not, separate the labels from the features
    # (without mutating the caller's DataFrame)
    labels = torch.tensor(data['target'].to_numpy(), dtype=torch.int)
    features = data.drop(columns='target')

    # Initialize the scaler, which linearly normalizes each feature to [0, 1]
    scaler = MinMaxScaler()

    # Fit the scaler, normalize the data, and put it into a tensor
    data_cc = torch.tensor(scaler.fit_transform(features))

    # Complement code the data by appending the vector [1-x] along the feature dimension
    data_cc = torch.cat((data_cc, 1 - data_cc), dim=1)

    # What we get is a tensor of 8-dimensional samples
    return PreprocessedDataset(data_cc, labels)

# Preprocess the data and check it out
data_cc = preprocess(data)
data_cc

PreprocessedDataset(x=tensor([[0.3611, 0.2083, 0.4915,  ..., 0.7917, 0.5085, 0.5833],
        [0.0278, 0.5000, 0.0508,  ..., 0.5000, 0.9492, 0.9583],
        [0.5556, 0.5417, 0.6271,  ..., 0.4583, 0.3729, 0.3750],
        ...,
        [0.5278, 0.3333, 0.6441,  ..., 0.6667, 0.3559, 0.2917],
        [0.8056, 0.4167, 0.8136,  ..., 0.5833, 0.1864, 0.3750],
        [0.1111, 0.5000, 0.1017,  ..., 0.5000, 0.8983, 0.9583]],
       dtype=torch.float64), y=tensor([1, 0, 1, 0, 0, 0, 1, 0, 1, 2, 0, 2, 1, 1, 2, 2, 0, 2, 1, 1, 1, 1, 0, 1,
        0, 2, 0, 1, 0, 2, 0, 0, 2, 2, 1, 2, 2, 1, 0, 1, 0, 1, 0, 2, 1, 1, 0, 1,
        1, 1, 2, 0, 1, 0, 2, 2, 0, 2, 0, 2, 2, 2, 2, 1, 2, 1, 0, 0, 1, 0, 2, 2,
        0, 0, 2, 0, 2, 1, 0, 0, 1, 2, 0, 2, 1, 0, 1, 1, 1, 0, 0, 1, 2, 0, 1, 1,
        2, 1, 1, 0, 0, 1, 2, 0, 1, 2, 2, 1, 2, 1, 1, 2, 2, 2, 2, 2, 0, 1, 0, 0,
        0, 2, 1, 0, 1, 2, 1, 1, 0, 2, 0, 0, 0, 1, 1, 0, 2, 2, 1, 2, 1, 2, 2, 0,
        2, 0, 2, 2, 2, 0], dtype=torch.int32))

Now that we have our data defined and preprocessed (linearly normalized and complement coded), we can define some parts of our `FuzzyART` module!

## FuzzyART

We need to define the following things to have a complete FuzzyART module:

1. **Weights**: How the weights are defined in memory.
This is a growing data structure in the sense that FuzzyART instantiates new nodes when necessary, so whatever it is that we do, we need a mechanism to append weights.
2. **Activation**: The FuzzyART activation function; this takes the weights and input sample, and it spits out a number.
3. **Match**: The FuzzyART match function.
This also is a function of the weights and current sample, and it returns a number.
4. **Learning**: Once a winning node is selected, we need to define what learning is; the weight update function returns a new weight as a function of the old weight, the current sample, and the learning rate.
5. **Match Rule/Competition**: This function iterates over all of the weights, calculating their activation and match values, selecting a winning node, and updating it.
This is essentially the main loop that we will call during training and inference, performing the FuzzyART "match rule."

What this translates to is an empty API to "fill out" in steps.
We often use the abstract class pattern in software development to do such a thing, which is implemented in Python using the standard library module `abc`, such as the following:

In [None]:
from abc import ABC, abstractmethod

class FuzzyARTBase(ABC):

    # A method to initialize a new category, however the weights are defined
    @abstractmethod
    def grow(self, x: torch.Tensor):
        pass

    # The activation function for a single sample and weight
    @abstractmethod
    def activation(self, x: torch.Tensor, j: int) -> torch.float:
        pass

    # The match function for a single sample and weight
    @abstractmethod
    def match(self, x: torch.Tensor, j: int) -> torch.float:
        pass

    # The learning function, applied once a winning node has been selected
    @abstractmethod
    def learn(self, x: torch.Tensor, j: int) -> torch.Tensor:
        pass

    # The FuzzyART match rule (competition) loop, which iterates over the weights and selects a winning node
    @abstractmethod
    def competition(self, x: torch.Tensor, j: int) -> int:
        pass

    # The training method
    @abstractmethod
    def train(self, x: torch.Tensor):
        pass

    # After training, the classification method
    @abstractmethod
    def classify(self, x: torch.Tensor) -> int:
        pass


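
To see what the abstract class pattern buys us, here is a small stand-alone sketch. The class names `Base`, `Incomplete`, and `Complete` are hypothetical toys, not part of the FuzzyART API; the point is only that Python refuses to instantiate a subclass that has not filled in every abstract method.

```python
from abc import ABC, abstractmethod

class Base(ABC):
    # Subclasses must provide this method before they can be instantiated
    @abstractmethod
    def activation(self, x: float) -> float:
        pass

class Incomplete(Base):
    # Forgets to implement `activation`
    pass

class Complete(Base):
    # Fills in the abstract method with a toy implementation
    def activation(self, x: float) -> float:
        return 2.0 * x

# Instantiating the incomplete subclass raises a TypeError
try:
    Incomplete()
    error = None
except TypeError as e:
    error = e

print(type(error).__name__)
print(Complete().activation(1.5))
```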

For a full implementation of almost exactly this, check out the [notebook at this link](https://art-book-online.github.io/notebooks/fuzzyart-pytorch/) that does this in a single `FuzzyART` class with PyTorch tensor weights.

In this notebook we'll be inspecting each of these functions individually, so we'll modularize it into a set of functions that take all of the info that they need and spit out the relevant information.
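
As a preview, a set of such standalone functions might look something like the following. This is a sketch under assumptions rather than this notebook's final implementation: it uses the commonly cited Fuzzy ART definitions (the fuzzy AND as an element-wise minimum and the L1 norm as a sum), and the parameter names `alpha` (choice parameter), `beta` (learning rate), and `rho` (vigilance), along with the sample values, are chosen here for illustration.

```python
import torch

def activation(x: torch.Tensor, w: torch.Tensor, alpha: float = 1e-3) -> float:
    # Choice function: |x ^ w| / (alpha + |w|), with ^ the element-wise minimum
    return (torch.minimum(x, w).sum() / (alpha + w.sum())).item()

def match(x: torch.Tensor, w: torch.Tensor) -> float:
    # Match function: |x ^ w| / |x|, later compared against the vigilance
    return (torch.minimum(x, w).sum() / x.sum()).item()

def learn(x: torch.Tensor, w: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    # Weight update: beta * (x ^ w) + (1 - beta) * w
    return beta * torch.minimum(x, w) + (1.0 - beta) * w

def competition(x, weights, rho: float = 0.6, alpha: float = 1e-3):
    # Visit nodes from highest to lowest activation and return the first
    # one whose match meets the vigilance; None means "grow a new node"
    order = sorted(
        range(len(weights)),
        key=lambda j: activation(x, weights[j], alpha),
        reverse=True,
    )
    for j in order:
        if match(x, weights[j]) >= rho:
            return j
    return None

# A complement-coded sample and two existing weights (hypothetical values)
x = torch.tensor([0.2, 0.6, 0.8, 0.4])
w = torch.tensor([0.5, 0.5, 0.5, 0.5])
weights = torch.stack([w, torch.tensor([0.9, 0.9, 0.1, 0.1])])

print(activation(x, w), match(x, w), learn(x, w))
print(competition(x, weights, rho=0.6), competition(x, weights, rho=0.9))
```

Each function takes everything it needs as arguments and holds no state, which is what lets us inspect the pieces one at a time.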

<div class="alert alert-block alert-info">
<b>Note:</b>
This isn't the most efficient method of implementing `FuzzyART`.
In fact, it might actually be the slowest way of going about it!
However, that is not the point of this exercise; the point is to get an understanding of each of the working parts so that you can go forth and implement it however you need in your own application!
</div>

As a complete side note, you could *conceivably* combine both methods (creating an abstract base class and implementing the methods *after the fact*) with something like the following:

In [None]:
# The types module has all sorts of black magic within
from types import MethodType

# Define some basic class with data
class MyClass:
    def __init__(self, x):
        self.x = x

# Initialize the object
obj = MyClass(1)

# Verify that the object doesn't have the method we need yet
print(f"Checking if `obj` has 'added_method': {hasattr(obj, 'added_method')}")

# Define a free-standing function with a "self" argument after the class is defined
def added_method(self):
    return f"My x is: {self.x}"

# Attach the function as a bound method of the instantiated object
obj.added_method = MethodType(added_method, obj)

# Verify that the instantiated object has the method, and see what it outputs
print(f"Checking if `obj` has an 'added_method': {hasattr(obj, 'added_method')}")
print(f"Output of 'added_method': {obj.added_method()}")

Checking if `obj` has 'added_method': False
Checking if `obj` has an 'added_method': True
Output of 'added_method': My x is: 1


That's not what we'll do here, but it's neat what you can do with code when you think outside the box!

### Weights

Our weights are defined as a growing matrix, to which new weight vectors are appended as FuzzyART instantiates new nodes.
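
One possible reading of this, sketched under assumptions: the weights live in a 2D tensor with one row per category, and a hypothetical helper `grow` appends a row for each new category. Initializing the new row to the sample itself is also an assumption here (it is what fast learning from an all-ones weight would produce), not something this notebook has defined yet.

```python
import torch

def grow(weights: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    # Append the sample as a new row; torch.cat returns a new, larger tensor
    return torch.cat((weights, x.unsqueeze(0)), dim=0)

# The complement-coded dimension of our hypothetical samples
dim_cc = 4

# Start with zero categories: a (0, dim_cc) matrix
weights = torch.empty((0, dim_cc))

# Each call adds one category (one row) to the weight matrix
weights = grow(weights, torch.tensor([0.2, 0.6, 0.8, 0.4]))
weights = grow(weights, torch.tensor([0.9, 0.1, 0.1, 0.9]))
print(weights.shape)
```

Because `torch.cat` allocates a new tensor on every call, this is simple rather than fast, which is in keeping with the goal of this notebook.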

## TODO

This notebook is a work in progress!
If you see this, it means that there is more to come for this notebook.