## 1. MLOps motivation
Only one in two organizations has moved beyond pilots and proofs of concept.
Moreover, 72% of a cohort of organizations that began AI pilots before 2019 have not been able to deploy
even a single application in production, according to *The AI-Powered Enterprise* by the Capgemini Research Institute.

Models don’t make it into production, and if they do, they break because they fail to adapt to changes in the environment.

### Reasons ML projects fail
* Teams engage in a high degree of manual and one-off work.
* Teams do not have reusable or reproducible components.
* Processes involve difficult handoffs between data scientists and IT.
* Organizations lack talent and face integration issues.
* Organizations lack strong governance models for achieving scale.


### ML engineering is at the center of building ML-enabled systems
ML engineering is a superset of the discipline of software engineering that handles the unique complexities of the practical application of ML. These complexities include:

- Preparing and maintaining high-quality data for training ML models.
- Tracking models in production to detect performance degradation.
- Performing ongoing experimentation of new data sources, ML algorithms, and hyperparameters, and then
tracking these experiments.
- Maintaining the veracity of models by continuously retraining them on fresh data.
- Avoiding training-serving skews that are due to inconsistencies in data and in runtime dependencies between
training environments and serving environments.
- Handling concerns about model fairness and adversarial attacks.

## MLOps Definition
MLOps is a methodology for ML engineering that unifies:
1. ML system development (the ML element) with 
2. ML system operations (the Ops element)

MLOps supports ML development and deployment in the way that DevOps and DataOps support application engineering and data engineering (analytics). The difference is that when you deploy a web service, you care about resilience, queries per second, load balancing, and so on. When you deploy an ML model, you also need to worry about changes in the data, changes in the model, users trying to game the system, and so on.

MLOps advocates formalizing and (when beneficial) automating critical steps of ML system
construction. It provides a set of standardized processes and technology capabilities for building, deploying,
and operationalizing ML systems rapidly and reliably.

### Benefits of MLOps practices
- Shorter development cycles, and as a result, shorter time to market.
- Better collaboration between teams.
- Increased reliability, performance, scalability, and security of ML systems.
- Streamlined operational and governance processes.
- Increased return on investment of ML projects.

## MLOps Lifecycle
<img src="graphs/mlops_lifecycle.png" alt="image" width="300" height="auto">

### The MLOps lifecycle encompasses seven integrated and iterative processes
- ML development concerns experimenting and developing a robust and reproducible model training procedure (training pipeline code), which consists of multiple tasks from data preparation and transformation to
model training and evaluation.
- Training operationalization concerns automating the process of packaging, testing, and deploying repeatable and reliable training pipelines.
- Continuous training concerns repeatedly executing the training pipeline in response to new data or to code
changes, or on a schedule, potentially with new training settings.
- Model deployment concerns packaging, testing, and deploying a model to a serving environment for online
experimentation and production serving.
- Prediction serving is about serving the model that is deployed in production for inference.
- Continuous monitoring is about monitoring the effectiveness and efficiency of a deployed model.
- Data and model management is a central, cross-cutting function for governing ML artifacts to support auditability, traceability, and compliance. Data and model management can also promote shareability, reusability,
and discoverability of ML assets.

## MLOps end-to-end workflow
The diagram below shows a simplified but canonical flow of how the MLOps processes interact with each other, focusing on
the high-level flow of control and on key inputs and outputs.

This is not a waterfall workflow that has to pass sequentially through all the processes. Processes can be skipped, and the flow can repeat a given phase or a subsequence of the processes.

<img src="graphs/mlops_workflow.png" alt="image" width="500" height="auto">


1. ML development (experimentation)
    * prototype model architectures and training routines
    * create labeled datasets
    * use features and other reusable ML artifacts that are governed through the data and model management process
    * the primary output of this process is a formalized training procedure, which includes:
        * data preprocessing
        * model architecture
        * model training settings

2. Training operationalization (if required)
    * the training procedure is operationalized as a training pipeline
    * requires a CI/CD routine to
        * build,
        * test,
        * deploy the pipeline to the target execution environment.

3. Continuous training
    * the training pipeline is executed repeatedly based on retraining triggers
    * each run produces a model as output
    * the model is retrained
        * as new data becomes available, or
        * if model performance decay is detected
    * other training artifacts and metadata that are produced by the training pipeline are also tracked
    * if the pipeline produces a successful model candidate, that candidate is then tracked by the model management process as a registered model

4. Model deployment: the registered model is
    * annotated
    * reviewed
    * approved for release
    * finally deployed to a production environment

    This process might be relatively opaque if you are using a no-code solution, or it can involve building a custom CI/CD pipeline for progressive delivery.

5. Prediction serving: the deployed model serves predictions using the deployment pattern that you have specified:
    * online,
    * batch, or
    * streaming predictions

    In addition to serving predictions, the serving runtime can generate model explanations and capture serving logs to be used by the continuous monitoring process.

6. Continuous monitoring: the process monitors the model for predictive effectiveness and service efficiency.
    * The primary concern of effectiveness monitoring is detecting model decay, for example data drift and concept drift.
    * The model deployment can also be monitored for efficiency metrics such as latency, throughput, hardware resource utilization, and execution errors.




## Technical capabilities needed for MLOps
The technical capabilities that MLOps requires
* can be provided by a single integrated ML platform, or
* can be created by combining vendor tools that are each best suited to particular tasks, by developing custom services, or by a combination of these approaches.

<img src="graphs/techincal capabilities.png" alt="image" width="500" height="auto">






# MLOps key tasks

## 1. ML development
<img src="graphs/ml_dev.png" alt="image" width="600" height="auto">
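
The workflow section describes the primary output of ML development as a formalized training procedure made up of data preprocessing, model architecture, and training settings. One possible way to capture that output is a small, serializable record. The sketch below is only an illustration: the class name `TrainingProcedure`, its field names, and the example values are assumptions, not part of any specific platform.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass(frozen=True)
class TrainingProcedure:
    """Hypothetical container for the three outputs of ML development."""
    preprocessing_steps: tuple   # ordered names of data preprocessing steps
    model_architecture: str      # identifier of the chosen model architecture
    training_settings: dict = field(default_factory=dict)  # hyperparameters

    def to_json(self) -> str:
        # Serialize with sorted keys so the procedure can be versioned and diffed.
        return json.dumps(asdict(self), sort_keys=True)


# Example values are made up for illustration.
procedure = TrainingProcedure(
    preprocessing_steps=("impute_missing", "standardize"),
    model_architecture="gradient_boosted_trees",
    training_settings={"learning_rate": 0.1, "n_estimators": 200},
)
print(procedure.to_json())
```

Keeping the procedure in one immutable, serializable object makes it easier to hand over to training operationalization, where it becomes the definition of a pipeline.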

## 2. Training operationalization

<img src="graphs/training_op.png" alt="image" width="600" height="auto">


## 3. Continuous training

<img src="graphs/cont_train.png" alt="image" width="600" height="auto">
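
Continuous training runs the pipeline in response to new data, code changes, or detected performance decay. The sketch below shows one way such retraining triggers could be evaluated. The `PipelineState` fields and the threshold values (`min_new_rows`, `max_accuracy_drop`) are hypothetical and would need to be chosen per use case.

```python
from dataclasses import dataclass


@dataclass
class PipelineState:
    """Hypothetical snapshot of what the trigger logic can observe."""
    new_rows_since_last_run: int
    current_accuracy: float
    baseline_accuracy: float
    code_changed: bool = False


def retraining_reasons(state, min_new_rows=10_000, max_accuracy_drop=0.05):
    """Return the triggers that fired; an empty list means no retraining."""
    reasons = []
    if state.new_rows_since_last_run >= min_new_rows:
        reasons.append("new_data")
    if state.baseline_accuracy - state.current_accuracy > max_accuracy_drop:
        reasons.append("performance_decay")
    if state.code_changed:
        reasons.append("code_change")
    return reasons


# Too little new data, but accuracy dropped from 0.90 to 0.81.
state = PipelineState(
    new_rows_since_last_run=2_500,
    current_accuracy=0.81,
    baseline_accuracy=0.90,
)
print(retraining_reasons(state))
```

Returning the reasons rather than a plain boolean lets the pipeline record why a run was started, which is useful metadata to track alongside the resulting model.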


## 4. Model deployment

<img src="graphs/model_dep.png" alt="image" width="600" height="auto">
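
In the workflow, a registered model is annotated, reviewed, approved for release, and only then deployed. That sequence can be modeled as a simple stage progression that rejects skipped steps. This is a minimal sketch under that assumption: the `Stage` enum and `RegisteredModel` class are hypothetical and do not correspond to a particular model registry API.

```python
from enum import Enum


class Stage(Enum):
    REGISTERED = "registered"
    ANNOTATED = "annotated"
    REVIEWED = "reviewed"
    APPROVED = "approved"
    DEPLOYED = "deployed"


# Stages in release order; a model may only advance one step at a time.
_ORDER = list(Stage)


class RegisteredModel:
    def __init__(self, name, version):
        self.name = name
        self.version = version
        self.stage = Stage.REGISTERED
        self.history = [Stage.REGISTERED]

    def advance(self, target):
        """Move to `target` only if it is the immediate next stage."""
        current_index = _ORDER.index(self.stage)
        is_last = current_index + 1 >= len(_ORDER)
        if is_last or _ORDER[current_index + 1] is not target:
            raise ValueError(
                f"cannot move from {self.stage.value} to {target.value}"
            )
        self.stage = target
        self.history.append(target)


model = RegisteredModel("churn_classifier", version=3)  # made-up model name
model.advance(Stage.ANNOTATED)
model.advance(Stage.REVIEWED)
try:
    model.advance(Stage.DEPLOYED)  # skipping approval is rejected
except ValueError as err:
    print(err)
```

The `history` list gives a basic audit trail of the stages a model version has passed through.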


## 5. Prediction serving

<img src="graphs/pred_serv.png" alt="image" width="600" height="auto">
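
The workflow names three deployment patterns for prediction serving: online, batch, and streaming. The sketch below contrasts them using a stand-in model. The function names, the feature names, and the weights in `toy_model` are all assumptions made for illustration; a real serving runtime would add request handling, logging, and explanations.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Features = Dict[str, float]
Model = Callable[[Features], float]


def toy_model(features: Features) -> float:
    """Stand-in for a deployed model; the weights are made up."""
    return 0.5 * features["tenure"] + 2.0 * features["support_tickets"]


def predict_online(model: Model, request: Features) -> float:
    # Online: one request in, one response out.
    return model(request)


def predict_batch(model: Model, rows: Iterable[Features]) -> List[float]:
    # Batch: score a whole stored dataset in a single job.
    return [model(row) for row in rows]


def predict_stream(model: Model, events: Iterable[Features]) -> Iterator[float]:
    # Streaming: score events lazily, one at a time, as they are consumed.
    for event in events:
        yield model(event)


rows = [
    {"tenure": 4.0, "support_tickets": 1.0},
    {"tenure": 10.0, "support_tickets": 0.0},
]
print(predict_online(toy_model, rows[0]))
print(predict_batch(toy_model, rows))
print(list(predict_stream(toy_model, iter(rows))))
```

All three patterns call the same model; they differ in how inputs arrive and when results are produced.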


## 6. Continuous monitoring

<img src="graphs/cont_monit.png" alt="image" width="600" height="auto">
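
Continuous monitoring is primarily concerned with detecting model decay such as data drift. One commonly used statistic for comparing a feature's training distribution with its serving distribution is the Population Stability Index (PSI). PSI is a swapped-in example technique, not something the source prescribes. The sketch below is a dependency-free approximation: the bin count, the `eps` floor for empty bins, and the 0.25 alert threshold (a frequently quoted rule of thumb) are assumptions.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a reference sample and a live sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference sample

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the reference range into the edge bins.
            index = min(max(int((v - lo) / width), 0), bins - 1)
            counts[index] += 1
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(1000)]   # synthetic training-time values
shifted = [v + 3.0 for v in reference]       # same shape, shifted upward

same_psi = population_stability_index(reference, reference)
drift_psi = population_stability_index(reference, shifted)
print(same_psi)          # identical samples give 0.0
print(drift_psi > 0.25)  # the shifted sample exceeds the assumed threshold
```

A check like this would typically run per feature on the serving logs, and a sustained high value could act as one of the retraining triggers for continuous training.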


## 7. Dataset and feature management

<img src="graphs/data_feat_management.png" alt="image" width="600" height="auto">


## 8. ML metadata tracking

<img src="graphs/ml_meta.png" alt="image" width="600" height="auto">


## 9. Putting it all together - E2E MLOps workflow

<img src="graphs/e2e.png" alt="image" width="600" height="auto">


## Useful links:
* https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
* https://cloud.google.com/resources/mlops-whitepaper?hl=pl