# Beyond the Training...

#### Model scoring, online learning, and workflow orchestration

## Prediction / Inference / Scoring

In the ML context, these terms all refer to the same thing: using a trained model to make predictions on new data.

There are several patterns:
* Batch (bulk) scoring
* Request/response
* Streaming

In the general case, these are -- happily -- trivially parallelizable and scalable, because each prediction depends only on the model and its own input record.
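
As a minimal sketch of why batch scoring parallelizes so easily, the example below splits the input into chunks and scores each chunk independently. The `predict` function is a hypothetical stand-in for a real model, and the thread pool is only an illustration; for CPU-bound models a process pool or a tool such as Dask would play the same role.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence

Row = Sequence[float]


def predict(row: Row) -> float:
    """Hypothetical stand-in model: a fixed linear scorer."""
    weights = (0.5, -0.25, 2.0)
    return sum(w * x for w, x in zip(weights, row))


def chunked(rows: List[Row], size: int) -> List[List[Row]]:
    """Split rows into consecutive chunks of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def score_chunk(chunk: List[Row]) -> List[float]:
    """Score one chunk; no chunk needs anything from any other chunk."""
    return [predict(row) for row in chunk]


def batch_score(rows: List[Row], chunk_size: int = 2, workers: int = 4) -> List[float]:
    """Score all rows in parallel, returning scores in the original row order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order the chunks were submitted.
        results = pool.map(score_chunk, chunked(rows, chunk_size))
        return [score for chunk_scores in results for score in chunk_scores]


rows = [
    (1.0, 2.0, 3.0),
    (0.0, 0.0, 1.0),
    (2.0, 4.0, 0.0),
    (1.0, 1.0, 1.0),
    (4.0, 0.0, 0.5),
]
scores = batch_score(rows)
```

Because the chunks share no state, the parallel result is identical to scoring the rows one at a time; only the wall-clock time changes.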

The newer tools are all able to address these use cases:
* Ray includes a subproject called Ray Serve 
    * https://docs.ray.io/en/master/serve/
* Dask has various examples addressing these use cases
    * https://examples.dask.org/machine-learning/parallel-prediction.html
    * https://examples.dask.org/machine-learning/torch-prediction.html
    * https://examples.dask.org/applications/async-web-server.html
    
We should also include the most promising open-source "ML platform", Kubeflow: https://www.kubeflow.org/

__However__, one of the biggest challenges is to neither over- nor under-architect a model serving solution.
* Dask is ideal for batch prediction
* But for request-response model serving, one might argue that both Dask and Ray are overly complex relative to the functionality they offer
* Kubeflow is complex, but has a broader set of functionality 
    * ... which may make it worthwhile *if you need that functionality*
* For streaming prediction, a simple Kafka or Pulsar application may be sufficient
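
To make the streaming case concrete, here is a minimal sketch of a scoring loop written as a plain generator. It assumes JSON-encoded messages with hypothetical `id` and `features` fields, and `predict` is a stand-in model. In a real deployment the input iterable would be a Kafka or Pulsar consumer and each yielded message would be handed to a producer; those clients are deliberately not shown.

```python
import json
from typing import Dict, Iterable, Iterator


def predict(features: Dict[str, float]) -> float:
    """Hypothetical stand-in model: flag large amounts."""
    return 1.0 if features.get("amount", 0.0) > 100.0 else 0.0


def score_stream(messages: Iterable[bytes]) -> Iterator[bytes]:
    """Consume raw messages one at a time and emit one scored message each."""
    for raw in messages:
        event = json.loads(raw)
        scored = {"id": event["id"], "score": predict(event["features"])}
        yield json.dumps(scored).encode("utf-8")


# A list stands in for the message stream here.
incoming = [
    b'{"id": 1, "features": {"amount": 250.0}}',
    b'{"id": 2, "features": {"amount": 12.5}}',
]
outgoing = [json.loads(message) for message in score_stream(incoming)]
```

The loop holds no state between messages, so scaling out amounts to running more copies of it against different partitions of the stream.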

Keep in mind that the fundamental scoring (prediction) operation is typically uncomplicated and does not warrant any "special" software system. So if you are going to use or build a larger system, make sure it is meeting, while not exceeding, your actual data management needs.
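
To illustrate how little the scoring operation itself requires, here is a sketch of a request/response scorer that uses only the Python standard library. The `predict` function, the JSON payload shape, and the port are all assumptions made for illustration; this is not a production server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List


def predict(features: List[float]) -> float:
    """Hypothetical stand-in model: the mean of the features."""
    return sum(features) / len(features) if features else 0.0


def score_request(body: bytes) -> bytes:
    """Turn a JSON request body into a JSON response body."""
    payload = json.loads(body)
    result = {"prediction": predict(payload["features"])}
    return json.dumps(result).encode("utf-8")


class ScoringHandler(BaseHTTPRequestHandler):
    """Answers POST requests by scoring the posted features."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        response = score_request(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port: int = 8000) -> None:
    """Run the scoring server until interrupted (not called here)."""
    HTTPServer(("127.0.0.1", port), ScoringHandler).serve_forever()
```

Everything specific to the model lives in `score_request`; the rest is generic HTTP plumbing, which is why the interesting design questions tend to lie elsewhere.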

### Model Serving is Trivial; Model Management in Production May Not Be

Why might you want a more complex system?
* Model performance monitoring
* Caching layer
* Model drift
* A/B or bandit testing
* Rolling deploy of new model versions
* ...and so on
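
As one small example of what these concerns involve, the sketch below shows a deterministic traffic split of the kind that could sit behind an A/B test or a gradual rollout. The variant names, the 10% default share, and the bucketing scheme are assumptions chosen for illustration, not a recommendation of any particular tool.

```python
import hashlib
from collections import Counter

BUCKETS = 10_000  # resolution of the traffic split


def assign_variant(user_id: str, treatment_share: float = 0.1) -> str:
    """Map a user to a model variant; the same user always gets the same one."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS
    return "candidate" if bucket < treatment_share * BUCKETS else "current"


# Simulate routing for 10,000 hypothetical users.
counts = Counter(assign_variant(f"user-{i}") for i in range(10_000))
```

Raising `treatment_share` step by step moves more users to the new model without reshuffling those already assigned to it. Even this small piece implies further machinery (logging which variant served each request, comparing outcomes, rolling back), which is where the real complexity accumulates.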

Those concerns are beyond our scope here. The key point is that, if you need them, your focus as a system designer should be on accommodating them first. I would advise against choosing an overly complex tool for model serving and then having to bolt on those additional capabilities afterward.

### A Note on Orchestration

Apache Airflow, the incumbent orchestration solution, is a nice product. 

For the next generation of architecture -- particularly where ML is the goal from the start, rather than just data transformation -- take a look at Prefect (https://www.prefect.io/core).

Prefect's argument -- borrowed straight from https://docs.prefect.io/core/getting_started/why-not-airflow.html#overview -- is:

>Airflow was designed to run static, slow-moving workflows on a fixed schedule, and it is a great tool for that purpose. Airflow was also the first successful implementation of *workflows-as-code*, a useful and flexible paradigm. It proved that workflows could be built without resorting to config files or obtuse DAG definitions.
>
> However, because of the types of workflows it was designed to handle, Airflow exposes a limited "vocabulary" for defining workflow behavior, especially by modern standards. Users often get into trouble by forcing their use cases to fit into Airflow's model. A sampling of examples that Airflow can not satisfy in a first-class way includes:
> 
> -   DAGs which need to be run off-schedule or with no schedule at all
> -   DAGs that run concurrently with the same start time
> -   DAGs with complicated branching logic
> -   DAGs with many fast tasks
> -   DAGs which rely on the exchange of data
> -   Parametrized DAGs
> -   Dynamic DAGs
>
> If your use case resembles any of these, you will need to work *around* Airflow's abstractions rather than *with* them. For this reason, almost every medium-to-large company using Airflow ends up writing a custom DSL or maintaining significant proprietary plugins to support its internal needs. This makes upgrading difficult and dramatically increases the maintenance burden when anything breaks.

Naturally, a project's own pitch is not a neutral argument for using that tool -- and I'm not suggesting you take it as such.

It is, however, well worth consideration in your system design.
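
To make a few items from the quoted list concrete (parametrized workflows, tasks that exchange data, runs with no schedule), here is a toy sketch of workflows-as-code. It uses only the standard library's `graphlib` (Python 3.9+) and is not the Airflow or Prefect API; the task names and parameters are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List

Params = Dict[str, Any]
Inputs = Dict[str, Any]


def load(params: Params, inputs: Inputs) -> List[int]:
    """Produce some rows; the row count is a run-time parameter."""
    return list(range(params["n_rows"]))


def score(params: Params, inputs: Inputs) -> List[int]:
    """Consume the output of `load` directly."""
    return [x * params["weight"] for x in inputs["load"]]


def report(params: Params, inputs: Inputs) -> Dict[str, int]:
    """Summarize the scores produced upstream."""
    return {"count": len(inputs["score"]), "total": sum(inputs["score"])}


TASKS: Dict[str, Callable[[Params, Inputs], Any]] = {
    "load": load,
    "score": score,
    "report": report,
}
# Each task maps to the tasks it depends on.
DEPENDS_ON: Dict[str, List[str]] = {"load": [], "score": ["load"], "report": ["score"]}


def run_flow(params: Params) -> Dict[str, Any]:
    """Run every task once, in dependency order, passing upstream results along."""
    results: Dict[str, Any] = {}
    for name in TopologicalSorter(DEPENDS_ON).static_order():
        upstream = {dep: results[dep] for dep in DEPENDS_ON[name]}
        results[name] = TASKS[name](params, upstream)
    return results


# Run on demand with explicit parameters; no schedule is involved.
summary = run_flow({"n_rows": 4, "weight": 2})["report"]
```

A real orchestrator adds what this sketch lacks: retries, scheduling, state tracking, and distributed execution. When evaluating one, a useful question is how naturally it expresses flows of this shape.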