In [57]:
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

<table align="left">

  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/tuning/getting_started_tuning.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/tuning/getting_started_tuning.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/blob/main/tuning/getting_started_tuning.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>

## Tune and deploy a foundation model

Creating an LLM requires massive amounts of data, significant computing resources, and specialized skills. On Vertex AI, tuning allows you to customize a foundation model for more specific tasks or knowledge domains.

While prompt design is excellent for quick experimentation, you can achieve higher quality by tuning the model when training data is available. Tuning lets you customize the model's responses based on examples of the task you want the model to perform.

For more details on tuning, see the [official documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models).

### Objective

This tutorial teaches you how to tune a foundation model on new, unseen data. You will use the following Google Cloud products:

- Vertex AI Generative AI Studio
- Vertex AI Pipelines
- Vertex AI Model Registry
- Vertex AI Endpoints

The steps performed include:

- Get training data from BigQuery and generate a JSONL file
- Upload training data
- Create a pipeline job
- Inspect your model on Vertex AI Model Registry
- Get predictions from your tuned model

### Quota
**Important**: Tuning the text-bison@001 model uses the tpu-v3-8 training resources and the accompanying quotas from your Google Cloud project. Each project has a default quota of eight v3-8 cores, which allows for one to two concurrent tuning jobs. To run more concurrent jobs, request additional quota via the [Quotas page](https://console.cloud.google.com/iam-admin/quotas).

### Costs
This tutorial uses billable components of Google Cloud:

* Vertex AI Generative AI Studio

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing),
and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

### Install Vertex AI SDK

In [58]:
!pip install google-cloud-aiplatform google-cloud-bigquery sequence-evaluate sentence-transformers rouge --upgrade --user --quiet

**Colab only:** Run the following cell to restart the kernel, or use the restart button. For Vertex AI Workbench, you can restart the kernel using the button at the top.

In [59]:
# Automatically restart kernel after installs so that your environment can access the new packages
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

## Environment Variables
Set the environment variables below to reflect your environment.

PROJECT_ID - your project ID  <br/>
REGION - your compute region <br/>
BUCKET_NAME - the name of the regional Cloud Storage (GCS) bucket. See the bucket naming guidelines.  <br/>
MODEL_ID - the foundation model to tune (defaults to `text-bison@001`) <br/>

In [64]:
import sys
IN_COLAB = 'google.colab' in sys.modules 
%env IN_COLAB=$IN_COLAB

try: PROJECT_ID
except NameError: PROJECT_ID = None

try: REGION
except NameError: REGION = None

try: BUCKET_NAME
except NameError: BUCKET_NAME = None

try: MODEL_ID
except NameError: MODEL_ID = None


if not PROJECT_ID:
    PROJECT_ID=input('PROJECT_ID?')
    
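if not REGION:
    # input() takes no default argument, so fall back manually on empty input
    REGION = input('REGION? [us-central1] ') or 'us-central1'

if not BUCKET_NAME:
    BUCKET_NAME = input('BUCKET_NAME?')

if not MODEL_ID:
    MODEL_ID = input('MODEL_ID? [text-bison@001] ') or 'text-bison@001'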
    


print(f"PROJECT_ID: {PROJECT_ID}")
print(f"REGION: {REGION}")
print(f"BUCKET_NAME: {BUCKET_NAME}")
print(f"MODEL_ID: {MODEL_ID}")



env: IN_COLAB=False
PROJECT_ID: kallogjeri-project-345114
REGION: us-central1
BUCKET_NAME: generative-ai-workshop
MODEL_ID: text-bison@001


### Authenticating your notebook environment
* If you are using **Colab** to run this notebook, run the cell below and continue.
* If you are using **Vertex AI Workbench**, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env).

In [65]:
if IN_COLAB:
    from google.colab import auth
    auth.authenticate_user()

In [66]:
%reload_ext autoreload
%reload_ext google.cloud.bigquery

%autoreload 1

In [67]:
from google.cloud.bigquery import magics
import google.auth
credentials, project = google.auth.default()

magics.context.credentials = credentials

### BigQuery IAM
Now grant BigQuery permissions to the service account:
- Go to the [IAM page](https://console.cloud.google.com/iam-admin/) in the console
- Look for the default compute service account. It should look something like this: `<project-number>-compute@developer.gserviceaccount.com`
- Grant the default compute service account the `bigquery.user` role

### Set your project ID

**If you don't know your project ID**, you may be able to get your project ID using `gcloud`. Otherwise, check the support page: Locate the [project ID](https://support.google.com/googleapi/answer/7014113). Please update `PROJECT_ID` below.

In [68]:
# Set the project id
! gcloud config set project {PROJECT_ID}

Updated property [core/project].


### Create a bucket
Now create a bucket to store your tuning data. To avoid name collisions between users, generate a short random ID for each session and append it to the names of the resources you create in this tutorial.

In [69]:
import random
import string

# Generate a random lowercase alphanumeric ID of a specified length (default=8)
def generate_uuid(length: int = 8) -> str:
    return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))

UUID = generate_uuid()

Choose a bucket name and update the `BUCKET_NAME` parameter.

In [70]:
BUCKET_URI = f"gs://{BUCKET_NAME}"

In [71]:
if BUCKET_NAME == "" or BUCKET_NAME is None or BUCKET_NAME == "[your-bucket-name]":
    BUCKET_NAME = "vertex-" + UUID

BUCKET_URI = f"gs://{BUCKET_NAME}"
print(BUCKET_URI)
    

gs://generative-ai-workshop
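Before creating a bucket, a quick local check of the name can catch obvious mistakes. The sketch below covers only a subset of the Cloud Storage naming rules (length, allowed characters, start and end characters, and the reserved `goog` prefix) for names without dots. The function name `looks_like_valid_bucket_name` is hypothetical and not part of any Google SDK.

```python
import re

# Partial check only: 3-63 characters, lowercase letters, digits, dashes,
# and underscores, starting and ending with a letter or digit.
# Names containing dots follow additional rules that are not covered here.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes a basic subset of the bucket naming rules."""
    if name.startswith("goog") or "google" in name:
        return False
    return bool(_BUCKET_RE.fullmatch(name))


print(looks_like_valid_bucket_name("vertex-ab12cd34"))
print(looks_like_valid_bucket_name("My_Bucket"))
```

A name that passes this check can still be rejected by the service, for example when it is already taken, because bucket names are globally unique.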


Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket.

In [72]:
! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI

Creating gs://generative-ai-workshop/...
ServiceException: 409 A Cloud Storage bucket named 'generative-ai-workshop' already exists. Try another name. Bucket names must be globally unique across all Google Cloud projects, including those outside of your organization.


Finally, validate access to your Cloud Storage bucket by examining its contents:

In [73]:
! gsutil ls -al $BUCKET_URI

  23559724  2023-06-30T16:49:18Z  gs://generative-ai-workshop/tune_data_stack_overflow_python_qa.jsonl#1688143758407993  metageneration=1
TOTAL: 1 objects, 23559724 bytes (22.47 MiB)


### Import libraries

**Colab only**: Run the following cell to initialize the Vertex AI SDK. For Vertex AI Workbench, you don't need to run this.

In [74]:
import vertexai
vertexai.init(project=PROJECT_ID, location=REGION)

In [75]:
from typing import Union

import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np

from vertexai.preview.language_models import TextGenerationModel
from google.cloud import aiplatform
from google.cloud import bigquery

## Tune your Model

Now it's time to create a tuning job. You can tune a foundation model by creating a pipeline job using Generative AI Studio, cURL, or the Python SDK. This notebook uses the Python SDK, with a question-and-answer dataset in JSONL format.

### Training Data
💾 Your model tuning dataset must be in a JSONL format where each line contains a single training example. You must make sure that you include instructions.
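As a minimal sketch of this format, the snippet below serializes a few examples to JSONL, one JSON object per line, using the `input_text` and `output_text` keys that this notebook uses later. The `INSTRUCTION` prefix, the sample records, and the `to_jsonl` helper are illustrative assumptions, not part of the Vertex AI SDK.

```python
import json

REQUIRED_KEYS = ("input_text", "output_text")

# Hypothetical instruction, prepended to every prompt for illustration
INSTRUCTION = "Answer the following Python question.\n\n"


def to_jsonl(records):
    """Serialize a list of dicts to JSONL, checking the required keys."""
    lines = []
    for i, record in enumerate(records):
        for key in REQUIRED_KEYS:
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"record {i}: missing or empty {key!r}")
        # json.dumps escapes newlines, so each record stays on one line
        lines.append(json.dumps({key: record[key] for key in REQUIRED_KEYS}))
    return "\n".join(lines) + "\n"


examples = [
    {"input_text": INSTRUCTION + "How do I reverse a list?",
     "output_text": "Use my_list[::-1] or my_list.reverse()."},
    {"input_text": INSTRUCTION + "How do I read a file?",
     "output_text": "Use open(path) inside a with block."},
]

jsonl_text = to_jsonl(examples)
print(jsonl_text)
```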

You will use the Stack Overflow data from BigQuery Public Datasets, limited to questions whose tags contain `python` and their accepted answers posted since 2020-01-01.

The `%%bigquery` cell magic below runs the query and returns the results as a pandas DataFrame named `df`.

In [76]:
%%bigquery df --project $PROJECT_ID --use_bqstorage_api

SELECT
    CONCAT(q.title, q.body) as input_text,
    a.body AS output_text
FROM
    `bigquery-public-data.stackoverflow.posts_questions` q
JOIN
    `bigquery-public-data.stackoverflow.posts_answers` a
ON
    q.accepted_answer_id = a.id
WHERE
    q.accepted_answer_id IS NOT NULL AND
    REGEXP_CONTAINS(q.tags, "python") AND
    a.creation_date >= "2020-01-01"
LIMIT
    10000

There should be 10k questions and answers.

In [77]:
print(len(df))

10000


In [78]:
df.head()

Unnamed: 0,input_text,output_text
0,How do I use the markers parameter of a sympy ...,<p>Nice find!</p>\n<p>The documentation doesn'...
1,I wanted sigup button to give the value to my ...,<p>ok imade it work using lambda i tried to us...
2,How to drop columns if a row in a column has a...,"<p>The following code snippet should work, rem..."
3,OpenCV text and shapes look vague on my PC com...,<p>I found the problem. In my Windows settings...
4,Does heroku automatically update the repositor...,"<p>Heroku's <a href=""https://devcenter.heroku...."


Let's split the data into training and evaluation sets. For extractive Q&A tasks we advise 100+ training examples. In this case you will use 8,000.

In [79]:
# split is set to 80/20
train, evaluation = train_test_split(df, test_size=0.2)
print(len(train))

8000


For tuning, the training data first needs to be converted into a JSONL format.

In [80]:
tune_jsonl = train.to_json(orient='records', lines=True)

print(f"Length: {len(tune_jsonl)}")
print(tune_jsonl[0:100])

Length: 23558470
{"input_text":"Sole problems with Slug and Date Field problem in Django<p>I have a problem with Djan


Next, you can write it to a local JSONL file before transferring it to Google Cloud Storage (GCS).

In [81]:
training_data_filename = "tune_data_stack_overflow_python_qa.jsonl"

with open(training_data_filename, "w") as f:
    f.write(tune_jsonl)

You can then export the local file to GCS, so that it can be used by Vertex AI for the tuning job.

In [82]:
! gsutil cp $training_data_filename $BUCKET_URI

Copying file://tune_data_stack_overflow_python_qa.jsonl [Content-Type=application/octet-stream]...
- [1 files][ 22.5 MiB/ 22.5 MiB]                                                
Operation completed over 1 objects/22.5 MiB.                                     


You can check to make sure that the file successfully transferred to your Google Cloud Storage bucket:

In [83]:
! gsutil ls -al $BUCKET_URI

  23558470  2023-06-30T16:53:30Z  gs://generative-ai-workshop/tune_data_stack_overflow_python_qa.jsonl#1688144010222872  metageneration=1
TOTAL: 1 objects, 23558470 bytes (22.47 MiB)


In [84]:
TRAINING_DATA_URI = f"{BUCKET_URI}/{training_data_filename}"
print(TRAINING_DATA_URI)

gs://generative-ai-workshop/tune_data_stack_overflow_python_qa.jsonl


### Model Tuning
Now it's time to start tuning a model. You will use the Vertex AI SDK to submit your tuning job.

#### Recommended Tuning Configurations
✅ Here are some recommended configurations for tuning a foundation model based on the task, in this example Q&A. You can find more in the [documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models).

Extractive QA:
- Make sure that your train dataset size is 100+
- Training steps [100-500]. You can try more than one value to get the best performance on a particular dataset (e.g. 100, 200, 500)
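The recommendations above can be turned into a small sanity check that runs before a tuning job is submitted. This is a sketch only: `check_tuning_config` is a hypothetical helper, and the thresholds are copied from the list above, not read from any API.

```python
def check_tuning_config(num_examples: int, train_steps: int) -> list:
    """Return warnings for an extractive Q&A tuning setup (empty list if none)."""
    warnings = []
    if num_examples < 100:
        warnings.append(
            f"only {num_examples} training examples; 100+ are recommended"
        )
    if not 100 <= train_steps <= 500:
        warnings.append(
            f"train_steps={train_steps} is outside the recommended 100-500 range"
        )
    return warnings


# Try several step counts, as suggested above
for steps in (100, 200, 500):
    print(steps, check_tuning_config(num_examples=8000, train_steps=steps))
```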

In [85]:
MODEL_NAME= f"genai-workshop-tuned-model-{UUID}"

In [86]:
#Function that starts the tuning job
def tuned_model(
    project_id: str,
    location: str,
    training_data: str,
    model_display_name: str,
    train_steps = 100,
):

    """Tune a new model based on prompt-response data.

    "training_data" can be either the GCS URI of a file formatted in JSONL format
    (for example: training_data=f'gs://{bucket}/{filename}.jsonl'), or a pandas
    DataFrame. Each training example should be a JSONL record with two keys, for
    example:
      {
        "input_text": <input prompt>,
        "output_text": <associated output>
      },

    Args:
      project_id: GCP Project ID, used to initialize aiplatform
      location: GCP Region, used to initialize aiplatform
      training_data: GCS URI of training file or pandas dataframe of training data
      model_display_name: Name for your model
      train_steps: Number of training steps to use when tuning the model
    """

    aiplatform.init(project=project_id, location=location)
    model = TextGenerationModel.from_pretrained(MODEL_ID)        

    model.tune_model(
        training_data=training_data,
        model_display_name=model_display_name,
        train_steps=train_steps,
        # Tuning can only happen in the "europe-west4" location
        tuning_job_location="europe-west4",
        # Model can only be deployed in the "us-central1" location
        tuned_model_location="us-central1",
    )

    # Test the tuned model:
    print(model.predict("Can you provide me with a Python implementation of BERT with Tensorflow? Example: "))

    return model

Next it's time to start your tuning job. **Disclaimer:** tuning and deploying a model takes time, on the order of four hours.

In [87]:
# This will start the tuning job and output a URL where you can monitor the pipeline execution.
model = tuned_model(PROJECT_ID, REGION, TRAINING_DATA_URI, MODEL_NAME)

Following the link printed by the cell above, you can view your pipeline run. It executes the following steps:

- Validation
- Export managed dataset
- Convert JSONL to TFRecord
- Large language model tuning
- Upload LLM Model

## View your tuned foundation model on Vertex AI Model Registry
When your tuning job is finished, your model will be available on Vertex AI Model Registry. The following Python SDK sample shows you how to list tuned models.

In [None]:
def list_tuned_models(project_id, location):

    aiplatform.init(project=project_id, location=location)
    model = TextGenerationModel.from_pretrained("text-bison@001")
    tuned_model_names = model.list_tuned_model_names()
    print(tuned_model_names)

In [None]:
list_tuned_models(PROJECT_ID, REGION)

You can also use the Google Cloud Console UI to view all of your models in [Vertex AI Model Registry](https://console.cloud.google.com/vertex-ai/models?e=13802955&jsmode=O&mods=-ai_platform_fake_service&project=cloud-llm-preview1).

## Use your tuned model to get predictions
Now it's time to get predictions. First you need to get the latest tuned model from the Vertex AI Model Registry.

In [None]:
def fetch_model(project_id, location):

    aiplatform.init(project=project_id, location=location)
    model = TextGenerationModel.from_pretrained("text-bison@001")
    # Use a distinct name so the list_tuned_models() function defined above is not shadowed
    tuned_model_names = model.list_tuned_model_names()
    tuned_model = tuned_model_names[0]

    return tuned_model

In [None]:
deployed_model = fetch_model(PROJECT_ID, REGION)
deployed_model = TextGenerationModel.get_tuned_model(deployed_model)

Now you can start sending prompts to the API. Feel free to update the following prompt.

In [None]:
PROMPT = """
How can I store my TensorFlow checkpoint on Google Cloud Storage?

Python example:

"""

In [None]:
print(deployed_model.predict(PROMPT))

## Evaluation
It's essential to evaluate your model to understand its performance. Evaluation can be automated using metrics like F1 or ROUGE. You can also use human evaluation, which involves asking people to rate the quality of the LLM's answers, either through crowdsourcing or by having experts review the responses. Some standard human evaluation criteria are fluency, coherence, relevance, and informativeness. Often you want a mix of evaluation metrics to get a good understanding of your model's performance. Below you will find an example of how you can do the evaluation.

In this example you will use [sequence-evaluate](https://pypi.org/project/sequence-evaluate/) to evaluate the tuned model.

In [None]:
from seq_eval import SeqEval
evaluator = SeqEval()

Earlier in the notebook, you created a train and eval dataset. Now it's time to take some of the eval data. You will use the questions to get responses from your tuned model, and the original answers as references:

- **Candidates**: Answers generated by the tuned model.
- **References**: Original answers that the candidates are compared against.

In [None]:
evaluation = evaluation.head(10) # you can change the number of rows you want to use
evaluation_question = evaluation["input_text"]
evaluation_answer = evaluation["output_text"]

Now you can go ahead and generate candidates using the tuned model based on the questions you took from the eval dataset.

In [None]:
candidates = []

for i in evaluation_question:
    response = deployed_model.predict(i)
    candidates.append(response.text)

len(candidates)
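The metrics computed later compare candidates and references position by position, so the two lists need the same length and order. The sketch below shows one way to keep them aligned if a single prediction fails: it records an empty string and continues. `generate_candidates` and `predict_fn` are hypothetical names; in this notebook `predict_fn` could be something like `lambda q: deployed_model.predict(q).text`.

```python
from typing import Callable, Iterable, List


def generate_candidates(
    predict_fn: Callable[[str], str], questions: Iterable[str]
) -> List[str]:
    """Call predict_fn once per question, keeping one output slot per question."""
    results = []
    for question in questions:
        try:
            results.append(predict_fn(question))
        except Exception as err:  # broad on purpose: this is only a sketch
            print(f"prediction failed: {err}")
            results.append("")
    return results


def fake_predict(question: str) -> str:
    """Stand-in for a real model call, used only to exercise the helper."""
    if not question:
        raise ValueError("empty prompt")
    return question.upper()


demo_candidates = generate_candidates(fake_predict, ["how?", "", "why?"])
print(demo_candidates)
```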

You will also have to create a list of references. You will use these to evaluate the model's performance.

In [None]:
references = evaluation_answer.tolist()

len(references)

Next you will generate the evaluation metrics. `evaluator.evaluate` will return a few eval metrics. Some of the important ones are:
- [BLEU](https://en.wikipedia.org/wiki/BLEU): The BLEU evaluation metric is a measure of the similarity between a machine-generated text and a human-written reference text.
- [Rouge](https://en.wikipedia.org/wiki/ROUGE_(metric)): The ROUGE evaluation metric is a measure of the overlap between a machine-generated text and a human-written reference text.

In [None]:
scores = evaluator.evaluate(candidates, references, verbose=False)
print(scores)
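To build intuition for what overlap-based metrics measure, here is a simplified unigram-overlap F1 in the spirit of ROUGE-1. It is a sketch only, assuming lowercase whitespace tokenization with no stemming, and it is not what `sequence-evaluate` computes internally. The function name `unigram_overlap_f1` is hypothetical.

```python
from collections import Counter


def unigram_overlap_f1(candidate: str, reference: str) -> float:
    """F1 of the unigram overlap between a candidate and a reference text."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(unigram_overlap_f1("the cat sat", "the cat sat on the mat"))
```

In this example every candidate token appears in the reference (precision 1.0), but only half of the reference tokens are covered (recall 0.5), so the score falls between the two.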