# Introduction

This set of notebooks demonstrates how to upload an externally trained prediction model into an Exasol database. In this particular scenario, a regression or classification model is trained using the [scikit-learn](https://scikit-learn.org/stable/) library. The trained scikit-learn model is then uploaded into [BucketFS](https://docs.exasol.com/db/latest/database_concepts/bucketfs/bucketfs.htm) and used for making predictions.

The prediction is made for each row in a "test" database. The computation is invoked by a user-defined function ([UDF](https://docs.exasol.com/db/latest/database_concepts/udf_scripts.htm)). If the database has multiple nodes, the test rows are evenly distributed across the nodes. The UDF making the predictions runs on all these nodes in parallel, with each node processing its portion of the test data. In the ideal case, this parallelization reduces the overall computation time by a factor equal to the number of nodes.
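The distribution described above can be illustrated with a small, self-contained sketch. It is only a toy simulation in plain Python, not Exasol code: the round-robin split, the thread pool standing in for the nodes, the assumed cluster size of three, and the helper names `split_evenly` and `predict_partition` are all hypothetical and chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(rows, num_nodes):
    """Distribute rows round-robin so partition sizes differ by at most one."""
    return [rows[i::num_nodes] for i in range(num_nodes)]


def predict_partition(partition):
    """Hypothetical stand-in for the UDF body: score every row of one partition.

    The "model" here is a toy that simply sums the features of each row.
    """
    return [sum(features) for features in partition]


# Ten toy rows with two features each.
rows = [(i, i + 1) for i in range(10)]
num_nodes = 3  # assumed cluster size, for illustration only

partitions = split_evenly(rows, num_nodes)
partition_sizes = [len(p) for p in partitions]

# Each worker thread plays the role of one node handling its own partition.
with ThreadPoolExecutor(max_workers=num_nodes) as pool:
    per_node = list(pool.map(predict_partition, partitions))

total_predictions = sum(len(p) for p in per_node)
```

Each simulated node handles roughly a third of the rows, which is where the ideal speed-up by the number of nodes comes from. In practice the gain also depends on how evenly the data is actually distributed and on per-node overhead.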

For the prediction UDF, the scikit-learn model is simply an arbitrary object that supports the `predict` method. It can just as well be a pipeline that includes pre-processing and/or post-processing steps.
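The following minimal sketch shows what relying only on `predict` means in practice. The class `ScaleThenThreshold` and the function `run_prediction` are hypothetical stand-ins written in plain Python, not part of scikit-learn or of the prediction UDF in these notebooks. A scikit-learn estimator or `Pipeline` exposes the same `predict` method and could be passed in the same way.

```python
class ScaleThenThreshold:
    """Hypothetical two-step "pipeline": scale the row sum, then threshold it."""

    def __init__(self, scale, threshold):
        self.scale = scale
        self.threshold = threshold

    def predict(self, rows):
        # Pre-processing (scaling) and classification happen inside predict,
        # so the caller does not need to know about the individual steps.
        return [1 if sum(row) * self.scale > self.threshold else 0 for row in rows]


def run_prediction(model, rows):
    """Caller that depends only on the presence of a `predict` method."""
    return model.predict(rows)


model = ScaleThenThreshold(scale=0.5, threshold=2.0)
predictions = run_prediction(model, [(1, 2), (3, 4)])
print(predictions)  # [0, 1]
```

Because the caller never inspects the model's type, a bare estimator can be swapped for a full pipeline without changing the calling code.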

## Prerequisites

Before using this set of notebooks, complete the following steps:
1. [Configure the AI-Lab](../main_config.ipynb).
2. [Load the MAGIC Gamma Telescope data](../data/data_telescope.ipynb).
3. [Load the Abalone data](../data/data_abalone.ipynb).

## Content

This section consists of two tutorials, covering one classification and one regression problem. Please note that the tutorials focus on showing the in-database prediction rather than presenting best ML practices. For the sake of clarity, we perform only very basic ML pre-processing and performance analysis.

The first notebook to run is the one that creates a [prediction UDF](./sklearn_predict_udf.ipynb).

After that, the two tutorials can be run in any order:

- [Solving a classification problem (MAGIC Gamma Telescope)](./sklearn_train_telescope.ipynb)
- [Solving a regression problem (Abalone)](./sklearn_train_abalone.ipynb)

