<img src="https://d24cdstip7q8pz.cloudfront.net/t/ineuron1/content/common/images/final%20logo.png" height="50%" width="50%" alt="iNeuron.ai logo">

**Table of Contents**
- **<a href="#CL">Continual Learning</a>**
- **<a href="#CCED">Challenge of Continually Evolving Data</a>**
- **<a href="#AAML">Automatically Adaptive Machine Learning</a>**
- **<a href="#CA">Complete Architecture</a>**
  - **<a href="#SJ">Sketcher and Joiner</a>**
  - **<a href="#MS">Monitoring System</a>**
  - **<a href="#MP">Model Policy</a>**
  - **<a href="#SI">Shared Infrastructure</a>**
- **<a href="#PMR">Points to Consider During Model Retraining</a>**
- **<a href="#R">Reference</a>**


<a id="CL"></a>
## 1. Continual Learning

Self-maintaining systems that learn continually as data arrives are referred to as `continual AutoML` or `Automatically Adaptive Machine Learning`.

- The principal characteristic of a `Machine Learning (ML)` system is that it is data-driven. An ML system is a software system where the software is written by data.

- ML models act as compressors: taking the original input (training) data they produce compact representations (e.g. neural network weights).
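
The compression view can be made concrete with a toy case. The sketch below is an illustration only, not something taken from the source: it fits a straight line by ordinary least squares, so that 1000 training points are summarised by two numbers. The function name `fit_line` and the synthetic data are assumptions made for the example.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic, noise-free training data (hypothetical): 1000 (x, y) pairs.
xs = [float(i) for i in range(1000)]
ys = [3.0 * x + 2.0 for x in xs]

# The "model" is just two parameters that stand in for all 1000 points.
slope, intercept = fit_line(xs, ys)
```

A neural network plays the same role at a larger scale: its weights are the compact representation of the training set.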

<a id="CCED"></a>
## 2. Challenge of Continually Evolving Data

- System outputs may depend on the inputs in non-trivial ways, making integration testing challenging.
- ML systems are often entangled with interrelated control parameters and external dependencies.
- The outputs of ML models may be used as inputs to downstream systems, which may themselves involve ML models. This can result in hidden feedback loops.

<a id="AAML"></a>
## 3. Automatically Adaptive Machine Learning 
- The remedy we propose stems from a notion of ‘zero touch’ ML, i.e. ML models that can be deployed and do not need to be maintained manually. 
- ML requires a hypervisor: a system that monitors, creates, and maintains deployed models.
- Our focus is maintenance of the model once deployment has occurred. 

<a id="CA"></a>
## 4. Complete Architecture
<img src="imgs/Architecture.jpg" width="800"/>
[Source](https://arxiv.org/abs/1903.05202)

The overall system architecture is shown in the figure above. The following subsections outline the role of each component in more detail.

<a id="SJ"></a>
### 4.1 Sketcher and Joiner
- Classical software and ML systems have traditionally processed data in batches, but many modern software systems are built around streaming data.
- In a high data-rate streaming scenario, queries are continuous and never-ending. To deal with the quantity of data, a `sketcher` (a smart, heavy down-sampler) is important.
- The `sketcher/compressor` component deals with the challenge that arises when data throughput is too high for the downstream training systems to cope with. 
- In `batch processing`, the data is already there and you query it. In `stream processing`, data might arrive late or out of order, or it might be dropped altogether; the `joiner` component mitigates this.
- The `joiner` subsystem ensures that the trainer and predictor receive the data in the format they require.
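
The two roles above can be sketched in a few lines. The source does not say which techniques the sketcher or joiner use, so both choices here are assumptions: reservoir sampling stands in for the down-sampler, and a small watermark buffer stands in for the joiner. The names `ReservoirSketcher`, `join_in_order`, and `allowed_lateness` are hypothetical.

```python
import heapq
import random
from typing import Any, Iterable, Iterator, List, Tuple


class ReservoirSketcher:
    """Keeps a uniform random sample of at most `capacity` stream items.

    Reservoir sampling is one simple down-sampling choice (an assumption).
    """

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.sample: List[Any] = []
        self.seen = 0
        self._rng = random.Random(seed)

    def offer(self, item: Any) -> None:
        self.seen += 1
        if len(self.sample) < self.capacity:
            self.sample.append(item)
        else:
            # Keep the new item with probability capacity / seen.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = item


def join_in_order(
    events: Iterable[Tuple[float, Any]], allowed_lateness: float
) -> Iterator[Tuple[float, Any]]:
    """Re-orders (timestamp, payload) events that arrive out of order.

    Events are held back until the newest timestamp seen is at least
    `allowed_lateness` ahead of them. Events older than the last emitted
    one are dropped (a hypothetical policy, for illustration only).
    """
    buffer: List[Tuple[float, int, Any]] = []
    watermark = float("-inf")
    last_emitted = float("-inf")
    for seq, (ts, payload) in enumerate(events):
        if ts < last_emitted:
            continue  # too late: a newer event was already emitted
        heapq.heappush(buffer, (ts, seq, payload))
        watermark = max(watermark, ts - allowed_lateness)
        while buffer and buffer[0][0] <= watermark:
            out_ts, _, out_payload = heapq.heappop(buffer)
            last_emitted = out_ts
            yield out_ts, out_payload
    while buffer:  # end of stream: flush what is left, still in order
        out_ts, _, out_payload = heapq.heappop(buffer)
        yield out_ts, out_payload
```

In this sketch the trainer would read from `ReservoirSketcher.sample` rather than the raw stream, and both trainer and predictor would consume the output of `join_in_order`. A larger `allowed_lateness` tolerates more disorder at the cost of added delay.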



<a id="MS"></a>
### 4.2 Monitoring System
- Auto-adaptability implies that the system self-diagnoses when errors are likely to occur. This is enabled by the `data monitoring subsystem`, which analyses the incoming streams for possible anomalies, drift, and change-points.
- It is complemented by the `prediction monitoring subsystem`, which analyses the prediction and health monitoring streams generated by the predictor subsystem.
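
As one possible illustration of change-point detection on a monitored stream, the sketch below implements the Page-Hinkley test for an upward shift in the mean. The source does not name a specific detector, so this choice, the class name `PageHinkley`, the helper `first_alarm`, and the default values of `delta` and `threshold` are all assumptions.

```python
from typing import Iterable, Optional


class PageHinkley:
    """Page-Hinkley test for an increase in the mean of a stream."""

    def __init__(self, delta: float = 0.005, threshold: float = 5.0) -> None:
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level (lambda)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation from the mean
        self.min_cum = 0.0          # smallest cumulative deviation so far

    def update(self, x: float) -> bool:
        """Consumes one observation; returns True if a change is flagged."""
        self.n += 1
        self.mean += (x - self.mean) / self.n  # running mean
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


def first_alarm(stream: Iterable[float], detector: PageHinkley) -> Optional[int]:
    """Returns the 0-based index of the first alarm, or None if none fires."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None
```

The same detector could watch an input feature (data monitoring) or a prediction error signal (prediction monitoring). Detecting a downward shift would need a second, mirrored instance fed with `-x`.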



<a id="MP"></a>
### 4.3 Model Policy
- The `policy engine` is responsible for updating models by triggering retraining or other actions. In its simplest form, the policy engine could use classical AutoML techniques in combination with retraining triggers. 
- Decisions would be made on the basis of signals from the data monitoring subsystem, signals from the prediction monitoring subsystem, and any associated business logic.
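
A rule-based policy is the simplest way to picture this. The sketch below is an assumption-laden example, not the policy engine described in the source: the fields of `MonitoringSnapshot`, the thresholds in `PolicyConfig`, and the three actions are hypothetical and chosen only to show how monitoring signals and a business rule can be combined.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    RETRAIN = "retrain"
    ROLLBACK = "rollback"


@dataclass(frozen=True)
class MonitoringSnapshot:
    drift_detected: bool         # from the data monitoring subsystem
    accuracy: float              # from the prediction monitoring subsystem
    hours_since_training: float  # age of the deployed model


@dataclass(frozen=True)
class PolicyConfig:
    # All thresholds are illustrative assumptions.
    min_accuracy: float = 0.80
    critical_accuracy: float = 0.50
    max_model_age_hours: float = 168.0
    min_hours_between_retrains: float = 1.0  # business rule: cool-down


def decide(snapshot: MonitoringSnapshot,
           config: PolicyConfig = PolicyConfig()) -> Action:
    """Maps monitoring signals and business rules to a single action."""
    if snapshot.accuracy < config.critical_accuracy:
        return Action.ROLLBACK  # model is badly broken: restore an older one
    if snapshot.hours_since_training < config.min_hours_between_retrains:
        return Action.NONE      # just retrained: wait before acting again
    if (snapshot.drift_detected
            or snapshot.accuracy < config.min_accuracy
            or snapshot.hours_since_training > config.max_model_age_hours):
        return Action.RETRAIN
    return Action.NONE
```

For example, `decide(MonitoringSnapshot(True, 0.95, 10.0))` returns `Action.RETRAIN` because drift was flagged, even though accuracy is still high.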


<a id="SI"></a>
### 4.4 Shared Infrastructure
The `shared infrastructure` can be seen as an intelligent cache that manages the storage requirements of the system. Logs must be kept, since customers might want to replay historical episodes.

- The `shared infrastructure` is shown as a collection of databases.

- **Model DB:** Trained models (when feasible) are written to the `Model DB` to ensure provenance (to be able to answer questions about why a particular decision was made at a particular time) and rollback functionality. 

- **Training DB:** The `Training DB` holds a history of the training episodes from the Trainer subsystem. 

- **Validation Data Reservoir:** The `Validation Data Reservoir` is used for validating models during training, and (subject to space constraints) can also be used for reproducing old models.

>**Note:** Collectively, the databases also enable offline experimentation, e.g. A/B testing. 

- **Diagnostic Logs:** The `Diagnostic Logs` provide a full history for system debugging.

- **System State DB:** The `System State DB` is used by the policy engine to enable decision making.
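
The provenance and rollback roles of the `Model DB` can be sketched with an in-memory registry. This is a minimal sketch under stated assumptions: the class `ModelDB`, its methods, and the `ModelRecord` fields are hypothetical, and a real system would use persistent storage.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ModelRecord:
    version: int
    model: Any
    trained_at: float         # timestamp, e.g. seconds since the epoch
    metadata: Dict[str, Any]  # e.g. training data range, metrics


class ModelDB:
    """In-memory stand-in for a model store (illustration only)."""

    def __init__(self) -> None:
        self._records: List[ModelRecord] = []
        self._active: Optional[int] = None

    def register(self, model: Any, trained_at: float,
                 metadata: Optional[Dict[str, Any]] = None) -> int:
        """Stores a newly trained model and makes it the active version."""
        version = len(self._records) + 1
        record = ModelRecord(version, model, trained_at, dict(metadata or {}))
        self._records.append(record)
        self._active = version
        return version

    def active(self) -> ModelRecord:
        if self._active is None:
            raise LookupError("no model registered")
        return self._records[self._active - 1]

    def rollback(self) -> ModelRecord:
        """Re-activates the version before the current one."""
        if self._active is None or self._active <= 1:
            raise LookupError("no earlier version to roll back to")
        self._active -= 1
        return self.active()

    def model_at(self, timestamp: float) -> ModelRecord:
        """Provenance: latest model trained at or before `timestamp`.

        Simplification: this ignores rollbacks; a fuller version would
        also log activation events.
        """
        candidates = [r for r in self._records if r.trained_at <= timestamp]
        if not candidates:
            raise LookupError("no model existed at that time")
        return max(candidates, key=lambda r: r.trained_at)
```

Keeping every version, rather than overwriting the deployed model, is what makes it possible to ask later which model produced a given decision.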

<a id="PMR"></a>
## 5. Points to Consider During Model Retraining

- Ask questions such as: How quickly does the most recent data point need to become part of the model? How long does it take for data to become irrelevant? The answers help to find `the right scope of data for relevant learning at the current moment`. Generally, newer instances are more relevant, but in some cases (e.g. retail), data from the previous quarter or year is more relevant.

- It is also necessary to know when to retrain. This decision involves balancing the `operational cost` against the potential downstream benefit resulting from improvements to the model.

- It is essential to be able to trace decisions back to their underlying causes. The system must include comprehensive logging and provide `continuous health monitoring output streams`.

- It is necessary to analyze the `operational costs of retraining` and the `risks of not performing retraining`.
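
The points above about data scope and retraining cost can be sketched as three small helpers. None of this comes from the source: exponential recency weighting, the seasonal index selection, and the simple cost comparison are assumed techniques, and all function and parameter names are hypothetical.

```python
from typing import List, Sequence


def recency_weights(ages_days: Sequence[float],
                    half_life_days: float) -> List[float]:
    """Sample weights that halve every `half_life_days` (newer = heavier)."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return [0.5 ** (age / half_life_days) for age in ages_days]


def seasonal_indices(ages_days: Sequence[float], period_days: float,
                     width_days: float) -> List[int]:
    """Indices of samples whose age is within `width_days` of a whole
    number of periods (including zero), e.g. the same season last year."""
    selected = []
    for i, age in enumerate(ages_days):
        offset = age % period_days
        if min(offset, period_days - offset) <= width_days:
            selected.append(i)
    return selected


def should_retrain(expected_benefit: float, retrain_cost: float,
                   risk_of_stale_model: float = 0.0) -> bool:
    """Retrain when the expected gain plus the avoided risk exceeds the
    operational cost. All three values must share one unit (e.g. money)."""
    return expected_benefit + risk_of_stale_model > retrain_cost
```

With a 7-day half-life, `recency_weights([0, 7, 14], 7)` gives weights of 1.0, 0.5, and 0.25. For the retail case, `seasonal_indices(ages, 365, 10)` keeps recent data together with data from roughly one year earlier.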

<a id="R"></a>
## Reference

Tom Diethe, Tom Borchert, Eno Thereska, Borja de Balle Pigem, Neil Lawrence. "Continual Learning in Practice". 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montréal, Canada. arXiv:1903.05202.