# 🔄 Fine-Tune and Serve a BERT Model (LLM)

<a target="_blank" href="https://colab.research.google.com/github/unionai-oss/bert-llm-classification-pipeline/blob/main/tutorial.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

This tutorial will walk you through building an end-to-end Large Language Model (LLM) fine-tuning pipeline using Hugging Face Transformers and Union's AI workflow and inference platform. We'll download a dataset, fine-tune a BERT model for classification on unstructured data, serve the model for inference, and track the pipeline artifacts using Union's powerful MLOps features. Although this example may seem relatively simple, all the concepts and tools used here can be applied to more complex machine learning and AI projects.


By just adding a few lines of code to your Python functions, you'll be able to create a reproducible ML pipeline, taking advantage of Union's features:

- Reproducible AI workflows: Ensure your ML pipeline produces the same environments every time.
- Versioning of code and artifacts: Track changes in your code and models automatically.
- Data Caching for faster iterations: Reuse results from previous executions to save time.
- Declarative Infrastructure: Define your ML infrastructure needs directly in your code without worrying about provisioning.
- Artifact Management for models and data: Automatically manage your model files and datasets.
- Container Image Builder: Build and deploy your code in a consistent environment.
- Local Development: Test your workflows locally before deploying them to the cloud.
- Actors for long-running stateful containers: Handle tasks that require continuous state or interaction.
- And more...

Here is an example of how to define a simple ML pipeline with Union's Python SDK:

```python
import pandas as pd
import torch
from union import Resources, task, workflow

# `image` is assumed to be a container image (e.g. an ImageSpec) defined elsewhere.

@task(
    cache=True,
    cache_version="4",
    container_image=image,
    requests=Resources(cpu="2", mem="2Gi"),
)
def download_data() -> pd.DataFrame:
    ...

@task(
    container_image=image,
    requests=Resources(cpu="2", mem="20Gi", gpu="1"),
)
def train_model(data: pd.DataFrame) -> torch.nn.Module:
    ...

@workflow
def pipeline_workflow():
    data = download_data()
    train_model(data=data)
    ...

```
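To make the `cache` and `cache_version` arguments above more concrete, here is a plain-Python analogy of how cached task results can be reused. It is a simplified sketch, not Union's implementation: the `cached_task` decorator, the `_cache` dictionary, and the `call_log` list are hypothetical names invented for this illustration, and the real platform stores results remotely rather than in memory.

```python
import functools

# Conceptual stand-in for task caching; NOT Union's implementation.
_cache = {}    # maps (task name, cache_version, inputs) -> stored output
call_log = []  # records which task bodies actually ran


def cached_task(cache_version):
    """Hypothetical decorator mimicking cache=True with a cache_version."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            key = (fn.__name__, cache_version, args)
            if key not in _cache:
                call_log.append(fn.__name__)  # the body runs only on a cache miss
                _cache[key] = fn(*args)
            return _cache[key]
        return wrapper
    return decorator


@cached_task(cache_version="4")
def download_data(split):
    # Placeholder for an expensive download.
    return [f"{split}-row-{i}" for i in range(3)]


first = download_data("train")   # cache miss: the body runs
second = download_data("train")  # cache hit: the stored result is reused
print(first == second, call_log)
```

In this sketch, changing the inputs or bumping `cache_version` produces a new key, so the body runs again. That is the same reason you would bump the version after changing a task's logic.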


## 🧰 Setup 

Sign up for a Union Serverless account at [Union.ai](https://union.ai) by clicking the "Get Started" button. No card required, and you'll get $30 in free credits to get started. Signing up can take a few minutes.

Or you can use your [Union BYOC Enterprise](https://www.union.ai/pricing) login if you have one.

### 📦 Install Python Packages & Clone Repo

To clone the repo, run the following command in your environment: `git clone https://github.com/unionai-oss/bert-llm-classification-pipeline`

Then install the packages listed in [requirements.txt](requirements.txt) with your preferred package manager, for example `pip install -r requirements.txt`.

If you're running this notebook in a Google Colab environment, you can install the packages and clone the GitHub repo directly in the notebook by running the following cell:


In [None]:
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    !git clone https://github.com/unionai-oss/bert-llm-classification-pipeline
    %cd bert-llm-classification-pipeline
    !pip install -r requirements.txt

### 🔐 Authenticate

If you're using [Union BYOC Enterprise](https://www.union.ai/pricing), run: `union create login --host <union-host-url>`

Otherwise, authenticate to [Union Serverless](https://www.union.ai/) by running the command below. If you don't have an account, you can create one for free at [Union.ai](https://union.ai).
 

In [2]:
!union create login --serverless --auth device-flow

🔐 Configuration saved to /Users/sageelliott/.union/config.yaml
Login successful into serverless


## 🔀 BERT Fine-Tuning Pipeline

In this section we'll run tasks and workflows defined in Python under the relevant folders.

Navigate to the `tasks` and `workflows` folders to see the code. If you're following along in a hosted Jupyter notebook, you should be able to view the code by clicking the folder icon (usually on the left side of the screen).

First we'll create a machine learning pipeline that fine-tunes a BERT model for text classification.

Our workflow will have the following steps:
- Download Dataset & Preprocess
- Download BERT Model
- Fine-tune BERT Model
- Evaluate the model
- Save model as an artifact (We'll serve the model in the next section)
- Run a prediction with new test data

Note: In more complex projects, data pipelines are often kept separate from model training pipelines. In this example we'll keep it simple and combine them into one workflow.

Navigate to the [workflows/train_pipeline.py](workflows/train_pipeline.py) file and find the `train_pipeline()` function to see the code for the workflow. This workflow uses tasks defined in the [/tasks](tasks/data.py) folder and builds a container image from [containers.py](containers.py).

In [None]:
!union run --remote workflows/train_pipeline.py train_pipeline

## 🚀 Serving the Fine-Tuned BERT Model

### Live App Serving (Beta)

In [None]:
!union deploy apps app.py bert-sentiment-analysis

### Batch Serving

In [16]:
!union register workflows/batch_inference.py

Running pyflyte register from /Users/sageelliott/Documents/gitrepos/tut-bert-hf with images ImageConfig(default_image=Image(name='default', fqn='cr.union.ai/v1/unionai/union', tag='py3.11-0.1.142', digest=None), images=[Image(name='default', fqn='cr.union.ai/v1/unionai/union', tag='py3.11-0.1.142', digest=None)]) and image destination folder /root on 1 package(s) ('/Users/sageelliott/Documents/gitrepos/tut-bert-hf/workflows/batch_inference.py',)
Registering against serverless-1.us-east-2.s.union.ai
Detected Root /Users/sageelliott/Documents/gitrepos/tut-bert-hf, using this to create deployable package...
Loading packages ['workflows.batch_inference'] under source root /Users/sageelliott/Documents/gitrepos/tut-bert-hf
No output path provided, using a temporary directory at /var/folders/nv/hcrpygqd6xvd6m2cf6w3pbvc0000gn/T/tmpgpen8_qr instead
Computed version is hfFIHDHk88f_AR_c4CKprA
Image flytekit:pQPzS42lZRRflp2iFhumig found. Skip bu

In [17]:
from union.remote import UnionRemote
# Create a remote connection
remote = UnionRemote()

In [18]:
def predict_with_container(data):

    inputs = {"texts": data}

    workflow = remote.fetch_workflow(name="workflows.batch_inference.batch_inference_workflow")
    execution = remote.execute(workflow, inputs=inputs, wait=True) # wait=True will block until the execution is complete

    print(execution.outputs)

    return execution.outputs["o0"]

In [19]:
print(predict_with_container(["I love this movie",
                               "I hate this movie"]
                               ))

{'o0': [b'gqVsYWJlbKdMQUJFTF8xpXNjb3Jlyz/sIz+AAAAA', b'gqVsYWJlbKdMQUJFTF8wpXNjb3Jlyz/lTCjgAAAA']}
[{'label': 'LABEL_1', 'score': 0.8793027400970459}, {'label': 'LABEL_0', 'score': 0.6655468344688416}]
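The raw `o0` entries in the first printed line appear to be base64-encoded MessagePack maps. That is an inference from the bytes themselves, not something this tutorial states. The sketch below decodes them using only the standard library. The helper name `decode_prediction` is hypothetical, and it handles only the small subset of MessagePack seen here (a short map with short string keys and string or float64 values). For anything more general, the `msgpack` package is the usual tool.

```python
import base64
import struct


def decode_prediction(raw: bytes) -> dict:
    """Decode one base64-encoded MessagePack map (str keys, str or float64 values)."""
    data = base64.b64decode(raw)
    pos = 0

    def read_value():
        nonlocal pos
        tag = data[pos]
        pos += 1
        if 0xA0 <= tag <= 0xBF:  # fixstr: the low 5 bits hold the byte length
            length = tag & 0x1F
            text = data[pos:pos + length].decode("utf-8")
            pos += length
            return text
        if tag == 0xCB:  # float64, big-endian
            (value,) = struct.unpack(">d", data[pos:pos + 8])
            pos += 8
            return value
        raise ValueError(f"unsupported MessagePack tag: {tag:#x}")

    header = data[pos]
    pos += 1
    if not 0x80 <= header <= 0x8F:  # fixmap: the low 4 bits hold the entry count
        raise ValueError("expected a small MessagePack map")

    result = {}
    for _ in range(header & 0x0F):
        key = read_value()
        result[key] = read_value()
    return result


# The two raw values copied from the output above.
raw_outputs = [
    b"gqVsYWJlbKdMQUJFTF8xpXNjb3Jlyz/sIz+AAAAA",
    b"gqVsYWJlbKdMQUJFTF8wpXNjb3Jlyz/lTCjgAAAA",
]
predictions = [decode_prediction(raw) for raw in raw_outputs]
print(predictions)
```

Decoded this way, each entry yields the same `label` and `score` fields as the second printed line above.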


### ⚡ Faster batch serving with Union Actors

Union [Actors](https://docs.union.ai/serverless/user-guide/core-concepts/actors/#actors) dramatically reduce the cost of cold starts by maintaining long-running stateful environments that stay ready for use until a defined time-to-live (TTL) expires. This persistent setup eliminates redundant initialization between executions. It can be especially useful for AI pipelines that benefit from long-running environments, such as those that use large containers or serve models.
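As a rough illustration of why a warm environment helps, the toy class below keeps an expensive-to-load "model" in memory between calls and reloads it only after it has been idle longer than a TTL. This is a conceptual analogy in plain Python, not Union's Actors API: `WarmWorker`, `load_count`, and the stand-in model are all hypothetical names made up for this sketch.

```python
import time
from typing import Optional


class WarmWorker:
    """Toy stand-in for a long-lived actor; not Union's API."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self.load_count = 0  # how many times the expensive setup ran
        self._model = None
        self._last_used: Optional[float] = None

    def _load_model(self):
        # Placeholder for an expensive step such as loading model weights.
        self.load_count += 1
        return lambda text: "LABEL_1" if "love" in text else "LABEL_0"

    def predict(self, text: str, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        expired = (
            self._last_used is not None
            and now - self._last_used > self.ttl_seconds
        )
        if self._model is None or expired:
            self._model = self._load_model()  # cold start
        self._last_used = now
        return self._model(text)


worker = WarmWorker(ttl_seconds=30)
worker.predict("I love this movie", now=0.0)    # cold start: the model is loaded
worker.predict("I hate this movie", now=10.0)   # warm: reuses the loaded model
worker.predict("I love this movie", now=100.0)  # idle past the TTL: reloads
print(worker.load_count)
```

The explicit `now` argument is there only so the timing can be simulated without sleeping. In a real deployment the platform manages the environment's lifetime for you.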

In [None]:
def predict_with_actor(data):

    inputs = {"texts": data}

    workflow = remote.fetch_workflow(name="workflows.batch_inference.batch_inference_workflow")
    execution = remote.execute(workflow, inputs=inputs, wait=True) # wait=True will block until the execution is complete

    # print(execution.outputs)

    return execution.outputs['o0']

In [None]:
print(predict_with_actor(["I love this movie",
                          "I hate this movie"]))