# Deploying Reliable Models in Production

import Video from "@site/src/components/Video";
import deployVideo from "@site/static/video/inference-tutorial/deploy_model.mp4";

In this tutorial, we'll train a reliable model on messy real-world data and deploy it in production to obtain predictions on new data using Cleanlab Studio's Python API. We'll work with a product classification dataset from Web Data Commons. This dataset, which is available in its original form [here](https://data.dws.informatik.uni-mannheim.de/largescaleproductcorpus/data/amazon_training.json.gzip), contains titles, descriptions, and category labels for 23,000 products on Amazon.

**Overview of what we'll do in this tutorial:**
- [Upload a dataset](#2-prepare-and-upload-dataset)
- [Clean your data](#3-clean-your-data)
- [Train and deploy a model on the cleaned data](#4-train-a-model)
- [Use the model in production to obtain predictions on new data](#5-use-the-model-in-production)

![Product Descriptions Dataset](/img/inference-tutorial/dataset_preview.png)

```python
>>> model.predict(['$100 Dollar Bill Design (Benjamin) Eraser'])
['Office_Products']
```

## 1. Install and Import Required Dependencies

You can use `pip` to install all packages required for this tutorial as follows:

In [None]:
!pip install pandas
!pip install cleanlab-studio

In [2]:
import os

from cleanlab_studio import Studio
import pandas as pd

In [4]:
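# code to render dataframes
from IPython.display import HTML

def display(data: pd.DataFrame | pd.Series) -> HTML:
    # a Series has no to_html(), so convert it to a single-column DataFrame first
    if isinstance(data, pd.Series):
        data = data.to_frame()
    return HTML(data.to_html(escape=False))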

## 2. Prepare and Upload Dataset

The dataset used in this demo is hosted at this [link](https://cleanlab-public.s3.amazonaws.com/StudioDemoDatasets/amazon-products/amazon_products_train.csv). The link points to a preprocessed version of the original Web Data Commons dataset (which was constructed from Common Crawl). The preprocessed dataset concatenates the title and description of each product into a single `text` field, which is then used to predict the category label.
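
If you want to prepare your own product data in the same shape, the concatenation step can be sketched with pandas as below. The rows are toy values made up for illustration, and the newline separator is an assumption based on how the `text` column appears in the preview further down, not a documented detail of the preprocessing.

```python
import pandas as pd

# Toy rows for illustration only; they are not taken from the dataset.
products = pd.DataFrame(
    {
        "title": ["Solar Step Lights - 2 Pack", "Beauty Dish Adapter"],
        "description": ["Weather resistant LED lights.", None],
    }
)

# Join title and description with a newline (assumed separator).
# Missing values are treated as empty strings, and the result is stripped
# so a missing description does not leave a trailing newline.
products["text"] = (
    products["title"].fillna("") + "\n" + products["description"].fillna("")
).str.strip()
```

Handling missing descriptions explicitly matters here because adding a string to a missing value would otherwise leave the whole `text` entry missing.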

### Prepare Data

For this tutorial, we'll be uploading our dataset to Cleanlab Studio as a CSV file structured as follows:
```
asin,brand,categoryLabel,description,title,text
B007CC3KXI,,Tools_and_Home_Improvement,"Light up your fence, gate, deck or stairs with this pair of solar-powered LED lights. Weather resistant plastic, no wiring req. Each includes a rechargeable battery. 4 1/2""L x 2 1/4""H",Deck And Fence Wall Mount Solar Stair &amp; Step Lights - 2 Pack,"Deck And Fence Wall Mount Solar Stair &amp; Step Lights - 2 Pack
Light up your fence, gate, deck or stairs with this pair of solar-powered LED lights. Weather resistant plastic, no wiring req. Each includes a rechargeable battery. 4 1/2""L x 2 1/4""H"
...
```

In [12]:
df = pd.read_csv("https://cleanlab-public.s3.amazonaws.com/StudioDemoDatasets/amazon-products/amazon_products_train.csv")
display(df.head(3))

Unnamed: 0,asin,brand,categoryLabel,description,title,text
0,B007CC3KXI,,Tools_and_Home_Improvement,"Light up your fence, gate, deck or stairs with this pair of solar-powered LED lights. Weather resistant plastic, no wiring req. Each includes a rechargeable battery. 4 1/2""L x 2 1/4""H",Deck And Fence Wall Mount Solar Stair & Step Lights - 2 Pack,"Deck And Fence Wall Mount Solar Stair & Step Lights - 2 Pack\nLight up your fence, gate, deck or stairs with this pair of solar-powered LED lights. Weather resistant plastic, no wiring req. Each includes a rechargeable battery. 4 1/2""L x 2 1/4""H"
1,B00IHPX9FS,,Home_and_Garden,"Mezzati Luxury LinensModern luxury meets classic comfort. Affordable, top quality linens.BenefitsMicrofiber sheets arewarm and cuddly in winterandcool in summer- like a cozy t-shirt.Hypoallergenic and resistant to dust mites.Excellent solution for those who tend to sweat at night. Wick moisture and remain cool, and very fast drying.Provide aluxurious softness, resist wrinkling and are strong and durable.You will be surprised how comfortable you will feel.1800 Prestige CollectionThese sheets shall complement any bed no matter what bedroom style you are trying to match.What's more important is that it will insure that you and your close ones have a good night sleep.We spend third of our lives in bed so why not make sure that we spend it in comfort.Besides with Mezzati sheet sets you don't need to spend a fortune like you would on some other luxury linens.The price that fits any budget and product that will satisfy your need for luxurious comfort that you deserve.Easy CareWrinkle and Fade resistant, machine wash in cold water. Tumble dry on low.100% Satisfaction GuaranteedIf within 30 days you're not 100% happy with your purchase of Mezzati Luxury Linenslet us know and we'll refund your entire purchase price...no questions asked!Mezzati Luxury Linens offers the Highest Quality Brushed Microfiber on the Market!Choose Quality - Choose Comfort! Order now while supplies last!","Mezzati Luxury Bed Sheets Set - #1 On Amazon! ★ Best, Softest, Coziest Bed Sheets Ever! ★ Sale Today Only ★ 1800 Prestige Collection Brushed Microfiber Luxury Wrinkle Resistant Bedding Sheets - Deep Pocket - High Quality with Soft Silky Touch ★ All with 100% Money Back Guarantee!! (Gold, Queen)","Mezzati Luxury Bed Sheets Set - #1 On Amazon! ★ Best, Softest, Coziest Bed Sheets Ever! 
★ Sale Today Only ★ 1800 Prestige Collection Brushed Microfiber Luxury Wrinkle Resistant Bedding Sheets - Deep Pocket - High Quality with Soft Silky Touch ★ All with 100% Money Back Guarantee!! (Gold, Queen)\nMezzati Luxury LinensModern luxury meets classic comfort. Affordable, top quality linens.BenefitsMicrofiber sheets arewarm and cuddly in winterandcool in summer- like a cozy t-shirt.Hypoallergenic and resistant to dust mites.Excellent solution for those who tend to sweat at night. Wick moisture and remain cool, and very fast drying.Provide aluxurious softness, resist wrinkling and are strong and durable.You will be surprised how comfortable you will feel.1800 Prestige CollectionThese sheets shall complement any bed no matter what bedroom style you are trying to match.What's more important is that it will insure that you and your close ones have a good night sleep.We spend third of our lives in bed so why not make sure that we spend it in comfort.Besides with Mezzati sheet sets you don't need to spend a fortune like you would on some other luxury linens.The price that fits any budget and product that will satisfy your need for luxurious comfort that you deserve.Easy CareWrinkle and Fade resistant, machine wash in cold water. Tumble dry on low.100% Satisfaction GuaranteedIf within 30 days you're not 100% happy with your purchase of Mezzati Luxury Linenslet us know and we'll refund your entire purchase price...no questions asked!Mezzati Luxury Linens offers the Highest Quality Brushed Microfiber on the Market!Choose Quality - Choose Comfort! Order now while supplies last!"
2,B00AE1L9VO,,Camera_and_Photo,ThisImpact BDA-PRO Adapter for Beauty Dishallows you to mount all sizes of Impact Beauty Dishes on your studio flash. The adapter slips quickly into the reflectors and is secured by a simple thumbscrew.Compatible with:Profoto strobes,Impact BDA-PRO Adapter for Beauty Dish,Impact BDA-PRO Adapter for Beauty Dish\nThisImpact BDA-PRO Adapter for Beauty Dishallows you to mount all sizes of Impact Beauty Dishes on your studio flash. The adapter slips quickly into the reflectors and is secured by a simple thumbscrew.Compatible with:Profoto strobes


### Upload Dataset

<details><summary>This tutorial focuses on using the Python API, but you can also use our <a href="https://app.cleanlab.ai">Web UI</a> for a no-code option <b>(click to expand)</b></summary>

If you would like to upload your data without writing code, simply go to [https://app.cleanlab.ai/upload](https://app.cleanlab.ai/upload) and follow these steps:
1. Click "Upload from URL"
2. Enter [link to dataset](https://cleanlab-public.s3.amazonaws.com/StudioDemoDatasets/amazon-products/amazon_products_train.csv)
3. Click "Upload" and wait for the file to upload
4. Click "Next"
5. Make sure "text" is selected as the dataset modality. Leave everything else on the schema editing page as default
6. Click "Confirm"
7. Wait for data ingestion to complete

</details>

To upload your dataset to Cleanlab Studio using our Python API, use the following code:

In [6]:
# you can find your API key by going to app.cleanlab.ai/upload, 
# clicking "Upload via Python API", and copying the API key there
API_KEY = "<YOUR_API_KEY>"

# initialize studio object
studio = Studio(API_KEY)
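
For anything beyond a quick experiment, you may prefer not to paste the API key into the notebook itself. A minimal sketch, assuming you have exported the key in an environment variable, is shown below. The variable name `CLEANLAB_STUDIO_API_KEY` and the helper `load_api_key` are hypothetical names chosen for this example; they are not defined by the `cleanlab-studio` package.

```python
import os


def load_api_key(var_name: str = "CLEANLAB_STUDIO_API_KEY") -> str:
    """Read the API key from an environment variable (hypothetical variable name)."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable to your API key.")
    return key


# Usage (requires cleanlab-studio and a valid key):
# studio = Studio(load_api_key())
```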

In [None]:
# upload dataset
dataset_id = studio.upload_dataset(df, dataset_name="Product Descriptions")

## 3. Clean Your Data

Real-world data is messy and often contains issues such as label errors, outliers, and duplicate examples. If we use Cleanlab Studio to address these issues and train a model on the improved data, we'll obtain a [model that gives more reliable predictions](https://cleanlab.ai/blog/model-deployment/).

Since cleaning your data using Cleanlab Studio isn't the main focus of this tutorial, we won't go into detail on it here. Instead, see our [Python API](/guide/quickstart/api#creating-a-project) or [Web UI](/guide/quickstart/web#create-a-project-to-find-and-correct-outliers-and-label-issues) quickstarts.
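
A few of the simpler issues mentioned above can also be spotted locally before you open a project. The sketch below is only a rough pandas sanity check and does not replace Cleanlab Studio's analysis: it counts exact duplicate texts, missing labels, and empty texts, and it cannot find label errors or outliers. The helper name `quick_data_checks` is hypothetical, and the default column names are taken from the dataset preview above.

```python
import pandas as pd


def quick_data_checks(
    df: pd.DataFrame, text_col: str = "text", label_col: str = "categoryLabel"
) -> dict:
    """Count a few easy-to-detect problems in a text classification dataset."""
    texts = df[text_col].fillna("").str.strip()
    return {
        "num_rows": len(df),
        # rows whose text exactly repeats an earlier row
        "exact_duplicate_texts": int(df.duplicated(subset=[text_col]).sum()),
        "missing_labels": int(df[label_col].isna().sum()),
        "empty_texts": int((texts == "").sum()),
    }
```

Calling `quick_data_checks(df)` on the DataFrame loaded earlier returns a small dictionary of counts that you can inspect before uploading.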

## 4. Train a Model

Once you're happy with your dataset corrections, you can use Cleanlab Studio to automatically train and deploy a model using the cleaned data. To do this, click on the "Deploy Model" button on the project page, name your model, and click deploy! Cleanlab Studio will automatically train many models using state-of-the-art autoML to find the best model for your dataset.

<Video
  width="1792"
  height="1010"
  src={deployVideo}
  autoPlay={false}
  loop={false}
  muted={true}
  shadow={true}
/>

## 5. Use the Model in Production

Now that you've deployed your model, you can use Cleanlab Studio's Python API to obtain predictions for new data points! For this tutorial, we've prepared several batches of samples to run inference on.

Batches for text models must be provided as lists, NumPy arrays, or pandas Series of strings. Batches for tabular models must be provided as pandas DataFrames.
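
To catch format problems before sending a batch for inference, you could validate it locally first. The sketch below checks the text-model requirements stated above; `check_text_batch` is a hypothetical helper written for this tutorial, not part of the Cleanlab Studio API.

```python
import numpy as np
import pandas as pd


def check_text_batch(batch) -> None:
    """Raise if `batch` is not a list, NumPy array, or pandas Series of strings."""
    if not isinstance(batch, (list, np.ndarray, pd.Series)):
        raise TypeError(
            f"Expected list, NumPy array, or pandas Series, got {type(batch).__name__}"
        )
    # Missing values (None/NaN) and non-string entries are rejected here.
    if not all(isinstance(item, str) for item in batch):
        raise ValueError("Every element of a text batch must be a string")
```

A DataFrame fails the type check, which is a useful reminder to select the text column (for example `batch["text"]`) before predicting with a text model.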

In [5]:
# load and prepare batch
batch = pd.read_csv("https://cleanlab-public.s3.amazonaws.com/StudioDemoDatasets/amazon-products/amazon_products_inference_batch_0.csv")
batch_text = batch["text"]
display(batch_text)

0     Tablet Portfolio for iPad Tablets\nTablet Port...
1     Factory-Reconditioned Milwaukee 2471-81 12-Vol...
2     Microsoft Zune Armor Case - The Metal Case (Bl...
3     90W AC Power Adapter/Battery Charger for HP Pa...
4     Golda\nAs Israel's prime minister from 1969 to...
                            ...                        
95    HQRP 2.0Ah Power Tools Battery for DeWalt DE90...
96    L-Tryptophan - 100 Grams (3.53 Oz) - 99+% Pure...
97    Beba Toy\nThe Beba Toy is the &#x201C;World&#x...
98    Coby TFTV3925 39-Inch 1080p 60Hz LCD HDTV (Bla...
99    Mokingtop Fashion New 6 Pieces Babys Girls Hea...
Name: text, Length: 100, dtype: object

In [3]:
# create studio client
API_KEY = "<YOUR_API_KEY>"

# initialize studio object
studio = Studio(API_KEY)

![Model ID](/img/inference-tutorial/model_id.png)

In [7]:
# load model from Studio
# you can find your model ID in the models table on the dashboard!
model_id = "<YOUR_MODEL_ID>"
model = studio.get_model(model_id)

predictions = model.predict(batch_text)
display(pd.DataFrame({"text": batch_text, "predictions": predictions}).head(3))

Unnamed: 0,text,predictions
0,Tablet Portfolio for iPad Tablets\nTablet Port...,Computers_and_Accessories
1,Factory-Reconditioned Milwaukee 2471-81 12-Vol...,Tools_and_Home_Improvement
2,Microsoft Zune Armor Case - The Metal Case (Bl...,Other_Electronics


<details><summary>It's also possible to use our <a href="https://app.cleanlab.ai/">Web UI</a> to get predictions</summary>

If you would like to get predictions without writing code, simply go to [https://app.cleanlab.ai/](https://app.cleanlab.ai/) and follow these steps:
1. Click "View model" for the model you created
2. Click "Predict new labels"
3. Upload a CSV containing the examples you want predictions for
    1. For text models, your CSV should have a single column
    2. For tabular models, your CSV should contain all of your predictive columns
4. Click "Predict New Labels"
5. Wait for inference to complete
6. Click "Export" for the query you made

</details>
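
If you take the Web UI route with a text model, the single-column CSV described above can be produced with pandas. This is a minimal sketch: the helper name `write_text_batch_csv`, the output path, and the column name `text` are assumptions for illustration.

```python
import pandas as pd


def write_text_batch_csv(texts, path: str, column: str = "text") -> None:
    """Write an iterable of strings to a CSV file with exactly one column."""
    # index=False keeps the row index out of the file, so only one column is written
    pd.Series(list(texts), name=column).to_csv(path, index=False)


# Usage, assuming `batch_text` holds the strings to score:
# write_text_batch_csv(batch_text, "inference_batch.csv")
```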

Congrats! You now have a model trained on reliable data that effectively predicts product categories given product descriptions. This model can be deployed in production for real-time queries, or as part of an ETL pipeline.
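
For the ETL case, one common pattern is to score a large collection in fixed-size chunks rather than in a single call. The sketch below assumes only that `model` exposes a `predict` method that accepts a list of strings and returns one prediction per string, as used earlier in this tutorial. `predict_in_chunks` and the default chunk size are hypothetical choices, not Cleanlab Studio recommendations.

```python
from typing import List


def predict_in_chunks(model, texts: List[str], chunk_size: int = 100) -> list:
    """Run model.predict over `texts` in consecutive chunks and concatenate the results."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    predictions = []
    for start in range(0, len(texts), chunk_size):
        chunk = texts[start : start + chunk_size]
        predictions.extend(model.predict(chunk))
    return predictions


# Usage with the deployed model from this tutorial:
# all_predictions = predict_in_chunks(model, batch_text.tolist(), chunk_size=50)
```

Chunking keeps each request small and makes it easier to retry or log a single failed chunk without redoing the whole job.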