Commit
Merge pull request #48 from AP6YC/release/0.4
Release/0.4.0
Showing 34 changed files with 1,896 additions and 1,412 deletions.
```
@@ -4,6 +4,7 @@ on:
  push:
    branches:
      - master
      - develop
  tags: '*'
  pull_request:
```
```
@@ -1,4 +1,7 @@
[deps]
AdaptiveResonance = "3d72adc0-63d3-4141-bf9b-84450dd0395b"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
LiveServer = "16fef848-5104-11e9-1b77-fb7a48bbb589"
MLDataUtils = "cc2ba9b6-d476-5e6d-8eaf-a92d5412d41d"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
```
Binary file not shown.
# Basic Example

In the example below, we create a dataset generated from two multivariate Gaussian distributions in two dimensions, showing how an ART module can be used in unsupervised or simple supervised modes alongside an ARTMAP module that is explicitly supervised-only.
```@example
# Copyright © 2021 Alexander L. Hayes
# MIT License

using AdaptiveResonance
using Distributions, Random
using MLDataUtils
using Plots

"""
Demonstrates unsupervised DDVFA, supervised DDVFA, and (supervised) SFAM on a toy problem
with two multivariate Gaussians.
"""

# Set up two multivariate Gaussians and sample 1000 points from each
rng = MersenneTwister(1234)
dist1 = MvNormal([0.0, 6.0], [1.0 0.0; 0.0 1.0])
dist2 = MvNormal([4.5, 6.0], [2.0 -1.5; -1.5 2.0])
N_POINTS = 1000
X = hcat(rand(rng, dist1, N_POINTS), rand(rng, dist2, N_POINTS))
y = vcat(ones(Int64, N_POINTS), zeros(Int64, N_POINTS))
p1 = scatter(X[1, :], X[2, :], group=y, title="Original Data")

# Split into training and testing sets, preserving the class proportions
(X_train, y_train), (X_test, y_test) = stratifiedobs((X, y))

# Standardize data types
X_train = convert(Matrix{Float64}, X_train)
X_test = convert(Matrix{Float64}, X_test)
y_train = convert(Vector{Int}, y_train)
y_test = convert(Vector{Int}, y_test)

# Unsupervised DDVFA
art = DDVFA()
train!(art, X_train)
y_hat_test = AdaptiveResonance.classify(art, X_test)
p2 = scatter(X_test[1, :], X_test[2, :], group=y_hat_test, title="Unsupervised DDVFA")

# Supervised DDVFA
art = DDVFA()
train!(art, X_train, y=y_train)
y_hat_test = AdaptiveResonance.classify(art, X_test)
p3 = scatter(X_test[1, :], X_test[2, :], group=y_hat_test, title="Supervised DDVFA", xlabel="Performance: " * string(round(performance(y_hat_test, y_test); digits=3)))

# Supervised SFAM
art = SFAM()
train!(art, X_train, y_train)
y_hat_test = AdaptiveResonance.classify(art, X_test)
p4 = scatter(X_test[1, :], X_test[2, :], group=y_hat_test, title="Supervised SFAM", xlabel="Performance: " * string(round(performance(y_hat_test, y_test); digits=3)))

# Display all of the plots together
plot(p1, p2, p3, p4, layout=(1, 4), legend=false, xtickfontsize=6, xguidefontsize=8, titlefont=font(8))
```
# Background

This page provides a theoretical overview of Adaptive Resonance Theory and what this project aims to accomplish.

## What is Adaptive Resonance Theory?
Adaptive Resonance Theory (commonly abbreviated to ART) is both a **neurological theory** and a **family of neurogenitive neural network models for machine learning**.

ART began as a neurocognitive theory of how fields of cells can continuously learn stable representations, and it evolved into the basis for a myriad of practical machine learning algorithms.
Pioneered by Stephen Grossberg and Gail Carpenter, the field has had contributions across many years and from many disciplines, resulting in a plethora of engineering applications and theoretical advancements that have enabled ART-based algorithms to compete with many other modern learning and clustering algorithms.

Because of the high degree of interplay between the neurocognitive theory and the engineering models born of it, the term ART is frequently used to refer to both in the modern day (for better or for worse).

Stephen Grossberg has recently summarized his work and that of his wife Gail Carpenter and his colleagues on Adaptive Resonance Theory in his book [Conscious Mind, Resonant Brain](https://www.amazon.com/Conscious-Mind-Resonant-Brain-Makes/dp/0190070552).
## ART Basics

![art](../assets/figures/art.png)

### ART Dynamics
Nearly every ART model shares a basic set of dynamics:

1. ART models typically have two layers/fields denoted F1 and F2.
2. The F1 field is the feature representation field.
   Most often, it is simply the input feature sample itself (after some necessary preprocessing).
3. The F2 field is the category representation field.
   With some exceptions, each node in the F2 field generally represents its own category.
   This is most easily understood as a weight vector representing a prototype for a class or the centroid of a cluster.
4. An activation function is used to rank the categories "most activated" by a given sample in F1.
5. In order of highest activation, a match function is used to compute the agreement between the sample and each category.
6. If the match function for a category evaluates to a value above a threshold known as the vigilance parameter ($$\rho$$), the weights of that category may be updated according to a learning rule.
7. If there is complete mismatch across all categories, then a new category is created according to some instantiation rule.
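The search cycle above (steps 4-7) can be sketched in a few lines of Julia. The functions below (`art_activation`, `art_match`, `train_step!`) and the parameter values are an illustrative FuzzyART-style sketch, not the internal API of any module in this package.

```julia
# Minimal FuzzyART-style sketch of the ART search cycle (steps 4-7 above).
# Function names and parameters are illustrative, not this package's API.

# Fuzzy intersection (elementwise min) underlies both activation and match.
fuzzy_and(x, w) = min.(x, w)

# Step 4: activation of category weight w by sample x (choice parameter alpha).
art_activation(x, w; alpha=1e-3) = sum(fuzzy_and(x, w)) / (alpha + sum(w))

# Step 5: match (agreement) between x and w, compared against vigilance rho.
art_match(x, w) = sum(fuzzy_and(x, w)) / sum(x)

function train_step!(weights, x; rho=0.7, beta=1.0)
    # Rank categories from most to least activated (step 4)
    order = sortperm([art_activation(x, w) for w in weights], rev=true)
    for j in order
        # Check the match function against the vigilance parameter (steps 5-6)
        if art_match(x, weights[j]) >= rho
            # Resonance: update the winning weight by the fuzzy learning rule
            weights[j] = beta .* fuzzy_and(x, weights[j]) .+ (1 - beta) .* weights[j]
            return j
        end
    end
    # Complete mismatch: instantiate a new category (step 7)
    push!(weights, copy(x))
    return length(weights)
end

weights = Vector{Vector{Float64}}()
train_step!(weights, [0.9, 0.1])   # no categories yet: creates category 1
train_step!(weights, [0.1, 0.9])   # mismatch with category 1: creates category 2
train_step!(weights, [0.85, 0.15]) # resonates with category 1 and updates it
```

Note how the vigilance parameter `rho` alone decides whether a sample refines an existing category or spawns a new one, which is the root of the vigilance-selection and category-proliferation issues discussed below.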
### ART Considerations

In addition to the dynamics typical of an ART model, you must know:

1. ART models are inherently designed for unsupervised learning (i.e., learning in the absence of supervisory labels for samples).
   This is also known as clustering.
2. ART models are capable of supervised learning and reinforcement learning through some redesign and/or combination of ART models.
   For example, ARTMAP models are combinations of two ART models in a special way, one learning feature-to-category mappings and the other learning category-to-label mappings.
   ART modules are used for reinforcement learning by representing the mappings between state, value, and action spaces with ART dynamics.
3. Almost all ART models face the problem of selecting an appropriate vigilance parameter, whose optimal value depends on the problem at hand.
4. Being a class of neurogenitive neural network models, ART models gain theoretically infinite capacity along with the problem of "category proliferation": an undesirable increase in the number of categories as the model continues to learn, which leads to increasing computational time.
   In contrast, while the evaluation time of a deep neural network is always *exactly the same*, there exist upper bounds on its representational capacity.
5. Nearly every ART model requires feature normalization (i.e., feature elements lying within $$[0, 1]$$) and a process known as complement coding, where the feature vector is appended with its vector complement $$1 - \bar{x}$$.
   This is because real-numbered vectors can be arbitrarily close to one another; learning requires a degree of contrast enhancement between samples to ensure their separation.
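As a quick illustration of point 5, min-max normalization to $$[0, 1]$$ followed by complement coding can be sketched as follows. This is a generic sketch with hypothetical helper names, not this package's internal preprocessing functions.

```julia
# Generic sketch of ART preprocessing: min-max normalization to [0, 1]
# followed by complement coding. Helper names are illustrative only.

# Scale each element of x into [0, 1] given feature-wise lower/upper bounds.
normalize01(x, lo, hi) = (x .- lo) ./ (hi .- lo)

# Complement code: append the elementwise complement, doubling the dimension.
complement_code(x) = vcat(x, 1 .- x)

x = [2.0, 6.0]
xn = normalize01(x, [0.0, 0.0], [4.0, 8.0])  # -> [0.5, 0.75]
xc = complement_code(xn)                     # -> [0.5, 0.75, 0.5, 0.25]
```

After complement coding, every sample has the same city-block norm (`sum(xc)` equals the original dimension), which provides the contrast enhancement that fuzzy ART learning relies on.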
To learn about their implementations, nearly every practical ART model is listed in a recent [ART survey paper by Leonardo Enzo Brito da Silva](https://arxiv.org/abs/1905.11437).

## History and Development
At a high level, ART began with a neural network model known as the Grossberg Network, named after Stephen Grossberg.
This network treats the firing of neurons in the frequency domain as basic shunting models, which are recurrently connected to increase their own activity while suppressing the activities of others nearby (i.e., on-center, off-surround).
Using this shunting model, Grossberg showed that autonomous, associative learning can occur with what are known as instar networks.

By representing categories as a field of instar networks, new categories could be optimally learned by the instantiation of new neurons.
However, it was shown that the learning stability of Grossberg Networks degrades as the number of represented categories increases.
Discoveries in the neurocognitive theory and breakthroughs in their implementation led to the introduction of recurrent connections between the two fields of the network to stabilize learning.
These breakthroughs were based upon the discovery that autonomous learning depends on the interplay and agreement between *perception* and *expectation*, frequently referred to as bottom-up and top-down processes.
Furthermore, it is *resonance* between these states in the frequency domain that gives rise to conscious experiences and permits adaptive weights to change, leading to the phenomenon of learning.
The theory has many explanatory consequences in psychology, such as why attention is required for learning, but its consequence in the engineering models is that it stabilizes learning in cooperative-competitive dynamics, such as interconnected fields of neurons, which are most often chaotic.

Chapters 18 and 19 of [Neural Network Design by Hagan, Demuth, Beale, and De Jesús](https://hagan.okstate.edu/NNDesign.pdf) provide a good theoretical basis for learning how these network models were eventually implemented into the first binary-vector implementation of ART1.
dce1bb2
@JuliaRegistrator register
Release notes:

This minor release refactors most internal methods to separate incremental and batch methods and standardizes numerical types across all modules.
dce1bb2
Registration pull request created: JuliaRegistries/General/49736

After the above pull request is merged, it is recommended that a tag be created on this repository for the registered package version.
This will be done automatically if the Julia TagBot GitHub Action is installed, or it can be done manually through the GitHub interface, or via: