# MLOps Tutorial - NYC Taxi Fare

This tutorial demonstrates a complete ML project and its development flow, from initial exploration to continuous deployment at scale.
The example is based on a [Kaggle competition](https://www.kaggle.com/competitions/new-york-city-taxi-fare-prediction) whose goal is to predict taxi trip fares using the public NYC Taxi dataset.

This example is intended to explain and demonstrate the overall MLOps flow by using the [MLRun](https://www.mlrun.org/) MLOps orchestration framework. It is not designed to dive into the individual components or models.

To run the tutorial in SageMaker, open the project there and follow the SageMaker installation section in the [README.md](./README.md) file.

The ML application development and productization flow consists of the following steps (demonstrated through notebooks):

- [**Exploratory data analysis (EDA) and modeling**](./00-exploratory-data-analysis.ipynb).
- [**Data and model pipeline development**](./01-dataprep-train-test.ipynb) (data preparation, training, evaluation, and so on).
- [**Application & serving pipeline development**](./02-serving-pipeline.ipynb) (intercept requests, process data, run inference, and so on).
- [**Scaling and automation**](./03-automation-monitoring.ipynb) (run at scale, hyperparameter tuning, monitoring, pipeline automation, and so on).
- Continuous operations (automated tests, CI/CD integration, upgrades, retraining, live ops, and so on).

<img src="./images/project-dev-flow.png" alt="project-dev-flow"/><br>
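As a rough illustration of the data preparation, training, and evaluation steps listed above, the sketch below runs a miniature version of that flow in plain Python. It is not the tutorial's code and does not use MLRun or the real dataset. The function names, the synthetic fare formula (`2.5 + 1.6 * distance`), and the coordinate ranges are all assumptions made up for illustration.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def make_synthetic_trips(n, seed=0):
    """Hypothetical stand-in for the raw data: random trips with noisy fares."""
    rng = random.Random(seed)
    trips = []
    for _ in range(n):
        pickup = (40.70 + rng.random() * 0.1, -74.00 + rng.random() * 0.1)
        dropoff = (40.70 + rng.random() * 0.1, -74.00 + rng.random() * 0.1)
        distance = haversine_km(*pickup, *dropoff)
        # Assumed fare rule, invented for this sketch only.
        fare = 2.5 + 1.6 * distance + rng.gauss(0, 0.5)
        trips.append({"pickup": pickup, "dropoff": dropoff, "fare": fare})
    return trips


def prepare(trips):
    """Data preparation: derive a distance feature and the fare target."""
    xs = [haversine_km(*t["pickup"], *t["dropoff"]) for t in trips]
    ys = [t["fare"] for t in trips]
    return xs, ys


def fit_line(xs, ys):
    """Training: ordinary least squares for fare = slope * distance + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(xs, ys, slope, intercept):
    """Evaluation: root mean squared error of the fitted line."""
    errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(errors) / len(errors))


def run_flow(n=500, seed=0):
    """Chain the steps: prepare, split into train and test, train, evaluate."""
    xs, ys = prepare(make_synthetic_trips(n, seed))
    split = int(0.8 * n)
    slope, intercept = fit_line(xs[:split], ys[:split])
    return slope, intercept, rmse(xs[split:], ys[split:], slope, intercept)
```

Calling `run_flow()` returns the fitted slope, the intercept, and the test-set RMSE. In the notebooks, each of these steps is developed in full and then run and tracked through MLRun.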

You can find the Python source code under [/src](./src) and the tests under [/tests](./tests).
