# Task 2: Model Serving (30 points)


In this task, we will explore different ways to serve your model. Most of your assignments in other DS/ML courses are script based: you train a model, make some predictions, and produce some output. In production, however, you need a more systematic way to run your classifiers. We will discuss four different ways of running them.

## Task 2a : Command Line (7.5 points)

The simplest possible mechanism is to serve your model as a command line script. So, you will call your script with some parameters (such as classifier name, features etc) and the model will output the prediction in the terminal.

This might sound like a retro and clumsy way of serving things. However, it has a lot of practical use cases. For example, you can use the full power of Unix command line tools to slice and dice the model predictions, which might otherwise require complex coding. You can also use this to test that your model is working as expected before deploying it to production. A number of companies (including mine) have CI/CD pipelines that test models in the terminal using sophisticated shell scripts.

### Developing a command line utility with Typer Library

In this assignment, we will implement a very simple scenario. However, things will be much more complex in production. So, it is useful to develop this utility in a good library.

The classical way is to use the [argparse](https://docs.python.org/3/library/argparse.html) library that comes with standard Python. This is often sufficient if your requirements are simple enough. However, there are a number of advanced features that are missing or hard to use in argparse. 
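
To make the comparison concrete, here is a minimal argparse sketch of a two-command utility. It is only an illustration and not part of the assignment code: the `build_parser` helper, the program name `demo`, and the default classifier are assumptions made for this example, and the `--classifier` choices are taken from the sample commands later in this task.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical `train` and `predict` subcommands."""
    parser = argparse.ArgumentParser(prog="demo")
    subparsers = parser.add_subparsers(dest="command", required=True)

    train = subparsers.add_parser("train", help="Train a classifier.")
    train.add_argument(
        "--classifier",
        choices=["KNN", "NaiveBayes"],  # assumed set of names
        default="KNN",  # assumed default
        help="Classifier name.",
    )

    predict = subparsers.add_parser(
        "predict", help="Make a prediction using a trained classifier."
    )
    predict.add_argument(
        "--classifier",
        choices=["KNN", "NaiveBayes"],
        default="KNN",
        help="Classifier name.",
    )
    # nargs="+" collects one or more positional values into a list of floats.
    predict.add_argument("item", nargs="+", type=float, help="Feature values.")
    return parser


# Parse an explicit argument list instead of sys.argv so this runs anywhere.
args = build_parser().parse_args(["predict", "--classifier", "NaiveBayes", "1", "2.5"])
print(args.command, args.classifier, args.item)
```

Even in this small case, the subcommands, choices, and type conversions all have to be wired up by hand, which is the boilerplate that the libraries below reduce.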

A more modern way is to use the [click](https://click.palletsprojects.com) library. This is an advanced and very powerful tool that allows you to create sophisticated command line utilities. I have some click scripts that automate the entire ML pipeline. While it is quite powerful, it also has a steep learning curve. Nevertheless, I would definitely recommend that you learn this library.

In this assignment, we will take an intermediate approach and use the [typer](https://typer.tiangolo.com/) library. Typer is built on top of click and adds some nice syntactic sugar that makes life much easier. Usually, you can get 90% of what click offers by using Typer.

For this assignment, you have to skim four concepts from Typer. 

1. Commands : More details [here](https://typer.tiangolo.com/tutorial/commands/). 

2. Arguments : More details [here](https://typer.tiangolo.com/tutorial/arguments/). 

3. Options: More details [here](https://typer.tiangolo.com/tutorial/options/). 

4. Variadic Arguments: More details [here](https://typer.tiangolo.com/tutorial/multiple-values/). 
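
The four concepts fit together in a few lines. The following is a minimal sketch and deliberately not the solution to this task: the `summarize` and `stats` commands, the `Stat` enum, and the `--stat` option are hypothetical names chosen for illustration. It is exercised through Typer's `CliRunner` so that it runs without a terminal.

```python
from enum import Enum
from typing import List

import typer
from typer.testing import CliRunner

app = typer.Typer()


class Stat(str, Enum):
    """Hypothetical enum; an enum-typed option only accepts these values."""

    total = "total"
    mean = "mean"


# Command: registered under an explicit name with a help string.
@app.command(name="summarize", help="Summarize a list of numbers.")
def summarize(
    # Variadic argument: one or more positional floats, collected into a list.
    values: List[float] = typer.Argument(..., help="Numbers to summarize."),
    # Option: passed as --stat, with a default and its own help message.
    stat: Stat = typer.Option(Stat.total, help="Statistic to compute."),
) -> None:
    result = sum(values)
    if stat is Stat.mean:
        result /= len(values)
    typer.echo(f"{stat.value}: {result}")


# A second command, so that the command name must be given on the command line.
@app.command(name="stats", help="List the supported statistics.")
def stats() -> None:
    for stat in Stat:
        typer.echo(stat.value)


runner = CliRunner()
result = runner.invoke(app, ["summarize", "--stat", "mean", "1", "2", "3"])
print(result.output)
```

In a real script you would call `app()` at the end instead of using `CliRunner`, and Typer would then read the arguments from the command line.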



### Command Line Utility Task Specifications

1. The code for this task is inside `task2/typer_demo.py` 

2. You will develop two Typer commands: train and predict. The "business" logic for these is already written by me. All you need to do is to convert them to typer commands. 

3. The train command should have the name `train` (see how to add name to a Typer command) and should have the help string as "Train a classifier.". The predict command should have the name `predict` and should have the help string as "Make a prediction using a trained classifier.". 

4. The `predict` command should accept one required argument named `item` and an optional option named `classifier`. The item should accept a variadic input of floats. In other words, I should be able to pass the feature values as a b c d e etc., where a/b/c/d/e are floats. The classifier should take values from the `ValidClassifiers` enum that can be found in `ds5612_pa2.utils.ml_utils`.

5. For both `train` and `predict`, annotate the `classifier` option so that it outputs the help message as "Classifier name." You do not need to do anything for item argument.

6. You can test your code with the grader using the command

> rye test -- -m t2typer

7.  You can manually test the code by running the following commands from the project root folder (DS5612_PA2). 

> rye run typer_task --help 
> 
> rye run typer_task train --help
> 
> rye run typer_task predict --help
>
> rye run typer_task train
> 
> rye run typer_task train --classifier KNN
>
> rye run typer_task predict --classifier NaiveBayes 1 2 3 4 5 6 7 8 9 0

8. Optional: Nerd stuff: Check out the pyproject.toml file's rye script section to see how this custom command is done. You can use the script section of rye/uv/pdm to create scripts that can simplify your life.

## Task 2b: Terminal Application with Textual (7.5 points)

The second type of model serving is terminal based. Typically, your ML models will be deployed on a Linux server that does not have any UI. So, you need to be able to do some simple evaluation of your model using your terminal.

[Textual](https://textual.textualize.io/) is a popular library for building complex terminal applications. In this task, we will develop a terminal variant of the model serving that you did with Typer.

A screenshot of the application can be found below. I have partially implemented the business and styling logic. You just need to create the UI part. The application should have the following features. Note that while some ideas are open to experimentation, some (like the element ids) are fixed, as they are needed for automatic grading.

The code for this task is inside `task2/textual_demo.py`. You can test it using the command

> rye test -- -m t2textual

1. It should have two panels: input and output. This can be done using a combination of `Horizontal` and `Vertical` containers.
2. It should have a dropdown (`Select`) with the id `classifier` that takes its values from the variable `CLASSIFIER_OPTIONS`
3. It should have a `TextArea` with the id `input-text`
4. It should have a `Button` with text "Predict" and id as `predict`
5. The output of the classifier will be displayed using `Label` and `ProgressBar` widgets.
6. There should be two labels. First should have text "Positive Score: " and id `positive_score`. Second should have "Negative Score: " and id `negative_score`
7. The classifier score should be displayed using `ProgressBar`. The first should have the id `positive` and the second should have the id `negative`. They should be able to display a number between 0.0 and 1.0. So if I give a value of 0.6, then the progress bar should be 60% highlighted.
8. In the on_mount event, reset the value of both progress bars to 0.
9. When the predict button is pressed, ensure that the classifier and features are specified. You can enter a sample feature value as "1 2 3 4 5 6 7 8 9 0" (without the quotes). In general, the text area should accept 10 space-separated features. If the inputs are not specified, then an error message should be shown using the `notify` method.


**Hints**
1. You can terminate the application using Ctrl+c.
2. You can run the application using a rye command `rye run textual_task`

**References**:
1. Basic tutorial: [here](https://textual.textualize.io/tutorial/)
2. You need to use the following UI types: Horizontal, Vertical, Button, Footer, Header, Label, ProgressBar, Select, TextArea




![Textual Application](resources/textual_screenshot.jpg)

## Task 2c: API based Model Serving using FastAPI (7.5 points)

The third type of model serving is API based. In this case, you will serve your model as a web service. It should have some predict function that accepts some information required for inference and should return the prediction. 

This style of model serving is often required when your application consists of multiple microservices, one of which is your model. It also makes testing much easier, for example with tools such as [Postman](https://www.postman.com/).

We will be using [FastAPI](https://fastapi.tiangolo.com/). It is an exceptional library that is based on Pydantic and [Starlette](https://www.starlette.io/) and is blazing fast. Unless you have a good reason, you should implement your microservices using FastAPI (and not some old methods such as Flask, Django etc).

A really cool feature of FastAPI is that it is built on top of Pydantic. So it will automatically validate the HTTP request and convert the input into valid Python objects that you can immediately use in your code. Pydantic also allows you to write complex validations (as you did in Task 1). Not long ago, you would have written a lot of code to do this manually; now FastAPI does it automatically.

We will only explore FastAPI briefly. But it is worth learning it in detail. 

The code for this task is inside `task2/fastapi_demo.py` . You can test it using the command

> rye test -- -m t2fastAPI

You can run the FastAPI server using the following command. But remember to stop the server before running the test suite.

> rye run fastapi_task

### Task 2c1: Modeling Request and Response

Create three Pydantic BaseModel classes with the names `PredictRequest`, `PredictionResponse` and `DetailedPredictionResponse`. The request will be an instance of `PredictRequest`. To make things interesting, we will do a minor demonstration of API versioning. We will implement two versions of the API. The V1 will return `PredictionResponse` while the V2 will return `DetailedPredictionResponse`.

The `PredictRequest` should have two fields: `classifier` that can take only values from `pipeline_configs.ValidClassifierNames` and `features` that is a list of float.


The `PredictionResponse` should have two fields: `predicted_class` that is an integer and `ml_model_version` that is a string with default value of "V1". This should allow the clients to understand that they are processing the output of V1 model.

The `DetailedPredictionResponse` should extend `PredictionResponse` and add a new field: `probabilities`, which is a tuple of two floats (i.e., the probabilities for the positive and negative classes). Additionally, it should override `ml_model_version` so that its default value is "V2". This should allow the clients to understand that they are processing the output of the V2 model.

### Task 2c2: Inference API with Versioning

We will implement two inference functions. I have already written the business logic of this.

`predict_v1` that is accessible via the route "/v1/predict". It should accept PredictRequest and output PredictionResponse. 

`predict_v2` that is accessible via the route "/v2/predict". It should accept PredictRequest and output DetailedPredictionResponse. 

**Optional**: Nerd Stuff: In a long-running product, you often want to have different versions of the model running at the same time. For simplicity, we did a simple API versioning using different routes. There are other versioning strategies that do not require new "routes". These include:

(a) Header-based versioning: Use a custom header (e.g., X-API-Version) to specify the desired version.

(b) Query parameter versioning: Use a query parameter (e.g., ?version=2) to specify the version.

(c) Content negotiation: Use the Accept header to specify the desired response format and version.

### Task 2c3: Endpoint for Bulk Inference

To make things more challenging, let us implement a bulk inference endpoint. This endpoint should accept a file as input and should output a list of `DetailedPredictionResponse`. Given a file, you should read it and parse it line by line. In the previous inference endpoints, Pydantic already converted the input into a list of floats. For this endpoint, we need to do the parsing ourselves. Each line is a string of space-separated values. So convert each line into a list of floats and pass it to the ML pipeline code (that I have already written). It is possible that the input has some errors (e.g., string values sent as features, or an insufficient number of features). In this case, you should raise an `HTTPException` with the status code 422 and include the exception details in the message.

Details about how to process files can be found [here](https://fastapi.tiangolo.com/tutorial/request-files/).
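
The parsing step on its own can be sketched without FastAPI. The helper below is a hypothetical illustration: the name `parse_feature_lines`, the default of 10 expected features (borrowed from the Textual task), and the choice to skip blank lines are assumptions, not part of the provided code.

```python
from typing import List


def parse_feature_lines(raw: bytes, expected: int = 10) -> List[List[float]]:
    """Parse uploaded file contents into one list of floats per line.

    Raises ValueError naming the offending line if a token is not a number
    or if a line has the wrong number of features.
    """
    rows: List[List[float]] = []
    for line_no, line in enumerate(raw.decode("utf-8").splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        try:
            row = [float(token) for token in line.split()]
        except ValueError as exc:
            raise ValueError(f"line {line_no}: {exc}") from exc
        if len(row) != expected:
            raise ValueError(
                f"line {line_no}: expected {expected} features, got {len(row)}"
            )
        rows.append(row)
    return rows


print(parse_feature_lines(b"1 2 3\n4 5 6\n", expected=3))
```

Inside the endpoint, one option is to catch the `ValueError` and re-raise it as `HTTPException(status_code=422, detail=str(exc))`, so that the client sees which line was rejected.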

### Task 2c4: Automatic API Documentation using FastAPI (ungraded)

By default, FastAPI provides two endpoints for accessing the documentation:

1. Swagger UI: Available at [/docs](http://127.0.0.1:8000/docs)

2. ReDoc: Available at [/redoc](http://127.0.0.1:8000/redoc)

It should show a neat UI with the API documentation. It even has a nifty utility to test these APIs. FastAPI generates these pages from an OpenAPI schema of your application and uses the docstring of each path function as the description of that endpoint. So you can add a docstring to the Python function and it will show up in the documentation automatically.

If you do not know what Swagger is, please take some time to learn about Swagger and OpenAPI. 





## Task 2d: Model Serving using Gradio (7.5 points)

As the final task, we will experiment with [Gradio](https://www.gradio.app/), which is becoming increasingly popular for demoing ML models. The high-level idea is to provide a Pythonic interface so that data scientists can develop web applications without knowing much about frontend development. As you will see, you can create a nice-looking UI in 4-5 lines of Python code.

I am partial to Gradio as I have used it in many of my demos. However, there are other alternatives too. Other popular ones include [Streamlit](https://streamlit.io/) and [Taipy](https://taipy.io/) that have their tradeoffs. Both are solid alternatives to Gradio. However, Gradio is more popular because of its tight integration with [HuggingFace hub](https://huggingface.co/docs/hub/en/index) where most of the deep learning projects are showcased. 

The code for this task is inside `task2/gradio_demo.py` . You can test it using the command

> rye test -- -m t2gradio



#### Product Specification

You can see a simple UI of the code in the screenshot below. The UI should have a `Dropdown` with label "Classifier" for the classifier (use the `CLASSIFIER_OPTIONS` variable for creating the values) and a textbox with label "Features".

There will be two panels on the RHS for the output. If the inputs are correct and there is no error, you should output the class (Negative/Positive) and progress bar to show the respective probabilities. If there is an error, the error will be displayed in the error textbox. 

The "business" logic is written by me. So you can focus on the UI. It should not take more than 5 lines or so for the UI. You also have to change the code a bit to handle the error scenario, which should take another couple of lines.

You can run the application using the command below. This will start a web server whose UI can be accessed (by default) [here](http://127.0.0.1:7860/). Remember to stop the server before running the test suite.

> rye run gradio_task



You have to use the following Gradio components: Dropdown, Textbox and Label. The Label class can accept class probabilities that my code will already give in the right format.




![Gradio Success](resources/gradio_success.jpg) 

![Gradio Failure](resources/gradio_failure.jpg) 

