# BioActive version 1.2 Demonstration

This notebook runs illustrative campaigns using BioActive. Each example first generates simulated experimental data for the full experimental space from user-specified parameters, then simulates acquiring specific experimental results by drawing them from that simulated data.

There are also campaigns not included in this notebook that perform real experiments by communicating with laboratory automation software and equipment.

The campaigns can be run individually (they do not interact with each other) but the cell below must be run first.

Each campaign will create a folder in the current working directory and store in it the HDF5 database for that campaign, a pickle file containing the completed campaign structure, and various visualizations of the campaign results.  The visualizations will be displayed in the notebook cell.

In [1]:
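import CreateAndRunCampaign        # builds and runs a campaign of the named type
from visualize import visualize    # displays the visualizations a campaign returns
randomseed = 11                    # shared random seed passed to every campaign below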
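
Once a campaign has run, its folder can be inspected from Python. The following is a minimal sketch that assumes only that the outputs live somewhere under the campaign folder (possibly in subfolders); the helper name `list_campaign_outputs` is hypothetical and not part of BioActive, and the file extensions used for the database and pickle file are not stated in this notebook, so the sketch simply groups whatever it finds.

```python
from collections import defaultdict
from pathlib import Path


def list_campaign_outputs(folder):
    """Group every file under a campaign folder by its (lower-cased) extension.

    Hypothetical helper for illustration; it is not part of BioActive.
    """
    by_suffix = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            # Files without an extension are grouped under the empty string.
            by_suffix[path.suffix.lower()].append(path)
    return dict(by_suffix)
```

For example, `list_campaign_outputs("MV2-6-0-6")` would collect the files written by the first MultiVars campaign, since `"MV2-6-0-6"` is the folder name passed to it.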

## Index

1. [MultiVars - Fitting Multivariate Regression](#MultiVars)
1. [PoolBased - Classify Items from a Pool](#PoolBasedClassification)
1. [Battleship - Categorical Clustering](#Battleship)

## MultiVars
This is the simplest BioActive campaign.  A discrete "Experimental Space" is specified as the number of independent variables and the number of discrete values that each can take on, and "Experimental Data" is generated at the outset from a linear model to fill that Experimental Space (with optional added noise).  Data for specific values of the independent variables are provided upon request to the active learner (in batches of specified size), and the learner seeks to estimate the regression function that generated the data.  
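
The setup described above can be sketched in plain Python. This is an illustration under stated assumptions, not BioActive's implementation: the function names are hypothetical, each variable is assumed to take `NumberPerIvar + 1` values (which matches the shapes the campaigns in this section print, e.g. `(49, 2)` for `NumberPerIvar = 6`), the coefficient range is an arbitrary choice, and `AddedNoise` is treated here as the standard deviation of Gaussian noise.

```python
import itertools
import random


def build_experimental_space(n_ivars, n_per_ivar):
    """Enumerate every combination of discrete independent-variable values.

    Assumption: each variable takes n_per_ivar + 1 values (0..n_per_ivar).
    """
    levels = range(n_per_ivar + 1)
    return list(itertools.product(levels, repeat=n_ivars))


def simulate_linear_data(space, noise_sd=0.0, seed=11):
    """Fill the whole space with responses from a randomly drawn linear model."""
    rng = random.Random(seed)
    n_ivars = len(space[0])
    # Hypothetical choice: integer coefficients and intercept in [-50, 50].
    coeffs = [rng.randint(-50, 50) for _ in range(n_ivars)]
    intercept = rng.randint(-50, 50)
    data = {}
    for point in space:
        y = intercept + sum(c * x for c, x in zip(coeffs, point))
        if noise_sd > 0:
            # Assumption: the added noise is zero-mean Gaussian.
            y += rng.gauss(0.0, noise_sd)
        data[point] = y
    return coeffs, intercept, data


def request_batch(data, points):
    """Return the simulated results for the requested points only."""
    return {p: data[p] for p in points}
```

A learner in this setting only ever sees what `request_batch` returns, so the quality of its regression estimate depends on which points it chooses to request.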

In [2]:
NumberOfIvars = 2
NumberPerIvar = 6
AddedNoise = 0
ExperimentsPerRound = 6
UseActiveLearning = True
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("MultiVar2","createCampaignDiscreteRegression",
                                      [NumberOfIvars,NumberPerIvar,AddedNoise,ExperimentsPerRound,UseActiveLearning],
                                      randomseed,"MV2-6-0-6",["*.tif", "*.mp4"])
visualize(visualizations)

(49, 2)
[('int', -50, 50)]
<class 'createCampaignDiscreteRegression.main.<locals>.campaign.ESS'>
dimarr
[7, 7]
Length of finalPredictions7
Fetching data for round 0
Fetching data for round 1
Fetching data for round 2
Stopping: Not enough points with confidence below 0.9 (4) to fill a batch (6)
['<h3>MV2-6-0-6/simsDirectory_2021-09-12 11:35:12.763298/BioActive_LinReg(2iVar)Campaign_batchsize6.tif</h3><img src="MV2-6-0-6/simsDirectory_2021-09-12 11:35:12.763298/BioActive_LinReg(2iVar)Campaign_batchsize6.tif">']


In [3]:
NumberOfIvars = 4
NumberPerIvar = 3
AddedNoise = 20
ExperimentsPerRound = 9
UseActiveLearning = True
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("MultiVar4","createCampaignDiscreteRegression",
                                      [NumberOfIvars,NumberPerIvar,AddedNoise,ExperimentsPerRound,UseActiveLearning],
                                      randomseed,"MV4-3-20-9",["*.tif", "*.mp4"])
visualize(visualizations)

(256, 4)
[('int', -50, 50)]
<class 'createCampaignDiscreteRegression.main.<locals>.campaign.ESS'>
dimarr
[4, 4, 4, 4]
Length of finalPredictions4
Fetching data for round 0
Fetching data for round 1
Fetching data for round 2
Fetching data for round 3
Fetching data for round 4
Fetching data for round 5
Fetching data for round 6
Fetching data for round 7
Fetching data for round 8
Fetching data for round 9
Fetching data for round 10
Fetching data for round 11
Fetching data for round 12
Fetching data for round 13
Fetching data for round 14
Fetching data for round 15
Fetching data for round 16
Fetching data for round 17
Fetching data for round 18
Fetching data for round 19
Fetching data for round 20
Fetching data for round 21
Fetching data for round 22
Fetching data for round 23
Fetching data for round 24
Fetching data for round 25
Fetching data for round 26
Fetching data for round 27
Stopping: Not enough points with confidence below 0.9 (4) to fill a batch (9)
['<h3>MV4-3-20-9/simsDirectory

In [None]:
NumberOfIvars = 8
NumberPerIvar = 2
AddedNoise = 20
ExperimentsPerRound = 200
UseActiveLearning = True
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("MultiVar8","createCampaignDiscreteRegression",
                                      [NumberOfIvars,NumberPerIvar,AddedNoise,ExperimentsPerRound,UseActiveLearning],
                                      randomseed,"MV8-2-20-200",["*.tif", "*.mp4"])
visualize(visualizations)

(6561, 8)
[('int', -50, 50)]
<class 'createCampaignDiscreteRegression.main.<locals>.campaign.ESS'>
dimarr
[3, 3, 3, 3, 3, 3, 3, 3]
Length of finalPredictions3
Fetching data for round 0
Fetching data for round 1
Fetching data for round 2
Fetching data for round 3
Fetching data for round 4
Fetching data for round 5
Fetching data for round 6


## PoolBasedClassification
In Pool-Based Active Learning, the "Experimental Space" is specified as a single independent variable that is an index into a pool of samples, each described by one or more features.  An "experiment" consists of selecting a sample and asking for its label.  The goal is to learn to classify as many members of the pool correctly as possible while asking as few questions as possible.
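
To make the idea concrete, here is a minimal sketch of pool-based querying. It is not BioActive's learner: a nearest-centroid classifier stands in for the real model, "uncertainty" is taken to be a small gap between the distances to the two closest class centroids, and all function names are hypothetical. Setting `active=False` queries samples at random instead, as a baseline for comparison.

```python
import math
import random


def centroids(features, labels):
    """Mean feature vector of the labeled samples in each class."""
    sums, counts = {}, {}
    for idx, label in labels.items():
        acc = sums.setdefault(label, [0.0] * len(features[idx]))
        for k, value in enumerate(features[idx]):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(cents, x):
    """Return (label, margin); a small margin means an uncertain prediction."""
    dists = sorted((math.dist(c, x), lab) for lab, c in cents.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else math.inf
    return dists[0][1], margin


def run_pool_campaign(features, oracle, n_queries, active=True, seed=11):
    """Ask the oracle for n_queries labels (n_queries >= 1), then classify the pool."""
    rng = random.Random(seed)
    labels = {}
    unlabeled = list(range(len(features)))
    for _ in range(n_queries):
        if not unlabeled:
            break
        cents = centroids(features, labels)
        if active and len(cents) > 1:
            # Query the sample the current model is least sure about.
            pick = min(unlabeled, key=lambda i: predict(cents, features[i])[1])
        else:
            # Random choice: the baseline, and the fallback until two classes are seen.
            pick = rng.choice(unlabeled)
        unlabeled.remove(pick)
        labels[pick] = oracle(pick)  # one "experiment": ask for this sample's label
    cents = centroids(features, labels)
    return labels, [predict(cents, x)[0] for x in features]
```

Here `oracle` plays the role of the experiment: it maps a pool index to that sample's true label, and each call counts as one question asked.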

In [None]:
NumberOfPoints = 40
NumberOfAvars = 1
NumberOfClasses = 2
NumberOfClustersPerClass = 1
UseActiveLearning = True
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("PoolBased2-1","createCampaignPoolBased",
                                [NumberOfPoints,NumberOfAvars,NumberOfClasses,NumberOfClustersPerClass,UseActiveLearning],
                                randomseed,"PB2-1",["*.tif", "*.mp4"])
visualize(visualizations)

In [None]:
NumberOfPoints = 40
NumberOfAvars = 1
NumberOfClasses = 2
NumberOfClustersPerClass = 1
UseActiveLearning = False
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("PoolBased2-1Random","createCampaignPoolBased",
                                [NumberOfPoints,NumberOfAvars,NumberOfClasses,NumberOfClustersPerClass,UseActiveLearning],
                                randomseed,"PB2-1Random",["*.tif", "*.mp4"])
visualize(visualizations)

In [None]:
NumberOfPoints = 40
NumberOfAvars = 4
NumberOfClasses = 2
NumberOfClustersPerClass = 1
UseActiveLearning = True
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("PoolBased2-4","createCampaignPoolBased",
                                [NumberOfPoints,NumberOfAvars,NumberOfClasses,NumberOfClustersPerClass,UseActiveLearning],
                                randomseed,"PB2-4",["*.tif", "*.mp4"])
visualize(visualizations)

In [None]:
NumberOfPoints = 40
NumberOfAvars = 4
NumberOfClasses = 2
NumberOfClustersPerClass = 1
UseActiveLearning = False
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("PoolBased2-4Random","createCampaignPoolBased",
                                [NumberOfPoints,NumberOfAvars,NumberOfClasses,NumberOfClustersPerClass,UseActiveLearning],
                                randomseed,"PB2-4Random",["*.tif", "*.mp4"])
visualize(visualizations)

In [None]:
NumberOfPoints = 100
NumberOfAvars = 4
NumberOfClasses = 2
NumberOfClustersPerClass = 2
UseActiveLearning = True
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("PoolBased2-4-2","createCampaignPoolBased",
                                [NumberOfPoints,NumberOfAvars,NumberOfClasses,NumberOfClustersPerClass,UseActiveLearning],
                                randomseed,"PB2-4-2",["*.tif", "*.mp4"])
visualize(visualizations)

## Battleship
In the Battleship campaign, a board of hits and misses is initialized. Depending on user preferences, the board can range from one with clear structure in its data to one that is completely randomized. The purpose of this campaign is to test the ability of its associated categorical modeler to find the clusters and patterns in the data and to predict unseen values on the board. Each cycle of the campaign has four basic steps:

1. The campaign checks whether the stopping criterion has been met.
1. If it has not, the campaign builds a model from the existing data.
1. Based on the confidence levels the modeler assigns to its own predictions, the active learner chooses which data to request next, prioritizing data for which the modeler has low-confidence predictions.
1. The experiments requested by the active learner are performed by the data acquisition function, and the data are stored in a database.

The cycle repeats until the stopping criterion has been met, after which the campaign terminates. More detailed documentation on each of these four functions can be found in the function library.
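
The cycle described above can be sketched as a small loop. This is an illustration under stated assumptions rather than BioActive's code: the function names are hypothetical, a plain dictionary stands in for the database, the toy modeler is invented for the example, and the stopping rule (too few unmeasured points with confidence below 0.9 to fill a batch) is borrowed from the log message printed by the regression campaigns earlier in this notebook. Because that rule needs confidences, the sketch rebuilds the model before checking it.

```python
import random


def neighbour_confidence(database, space):
    """Toy modeler: confidence is the share of a cell's 4-neighbours already measured."""
    cells = set(space)
    confidence = {}
    for (r, c) in space:
        if (r, c) in database:
            confidence[(r, c)] = 1.0
            continue
        neighbours = [n for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                      if n in cells]
        measured = sum(n in database for n in neighbours)
        confidence[(r, c)] = measured / len(neighbours) if neighbours else 0.0
    return confidence


def run_campaign(space, acquire, build_model, batch_size,
                 threshold=0.9, active=True, seed=11):
    """Run rounds until too few uncertain points remain to fill a batch."""
    rng = random.Random(seed)
    database = {}  # stands in for the campaign database
    rounds = 0
    while True:
        # Build a model from the existing data and get per-point confidences.
        confidence = build_model(database, space)
        # Stopping criterion: not enough low-confidence, unmeasured points.
        candidates = [p for p in space
                      if p not in database and confidence[p] < threshold]
        if len(candidates) < batch_size:
            return database, rounds
        if active:
            # Active learner: request the least-confident points first.
            batch = sorted(candidates, key=lambda p: confidence[p])[:batch_size]
        else:
            # Baseline: request random points from the same candidates.
            batch = rng.sample(candidates, batch_size)
        # Data acquisition: perform the requested experiments and store results.
        for point in batch:
            database[point] = acquire(point)
        rounds += 1
```

Every round measures `batch_size` new points, so the loop is guaranteed to terminate on a finite board.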

In [None]:
NumberOfIvars = 2
NumberPerIvar = 12
NumberGroups = 3
AddedNoise = 0
ExperimentsPerRound = 4
UseActiveLearning = True
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("CatModel2","createCampaignCategoricalModeler",
                                [NumberOfIvars,NumberPerIvar,NumberGroups,AddedNoise,ExperimentsPerRound,UseActiveLearning],randomseed,
                                "CM2-12-3-0",["*.tif", "*.mp4"])
visualize(visualizations)

In [None]:
NumberOfIvars = 2
NumberPerIvar = 12
NumberGroups = 3
AddedNoise = 0
ExperimentsPerRound = 4
UseActiveLearning = False
visualizations = CreateAndRunCampaign.CreateAndRunCampaign("CatModel2Random","createCampaignCategoricalModeler",
                                [NumberOfIvars,NumberPerIvar,NumberGroups,AddedNoise,ExperimentsPerRound,UseActiveLearning],randomseed,
                                "CM2-12-3-0Random",["*.tif", "*.mp4"])
visualize(visualizations)