https://codelabs.developers.google.com/llm-finetuning-supervised#4

# Model Tuning with Vertex AI Foundation Model

# Objective

This lab teaches you how to tune a foundation model on new, unseen data. You will use the following Google Cloud products:
*   Vertex AI Pipelines
*   Vertex AI Evaluation Services
*   Vertex AI Model Registry
*   Vertex AI Endpoints

# Use Case

Using generative AI, we will generate a suitable title for a news article body taken from the BBC full-text dataset (sourced from the BigQuery public dataset *bigquery-public-data.bbc_news.fulltext*). We will fine-tune text-bison@002 into a new model called "bbc-news-summary-tuned" and compare its output with the response from the base model.

# Install and Import Dependencies

In [None]:
%pip install google-cloud-aiplatform
%pip install --user datasets
%pip install --user google-cloud-pipeline-components

In [None]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

In [None]:
import IPython
from google.cloud import aiplatform
from google.colab import auth as google_auth
google_auth.authenticate_user()

In [None]:
import vertexai
PROJECT_ID = "red-delight-346705" #@param
vertexai.init(project=PROJECT_ID)

In [None]:
REGION = "us-central1"
region = REGION  # lowercase alias of the same value
project_id = PROJECT_ID  # reuse the project ID set in the previous cell

In [None]:
! gcloud config set project {project_id}

Updated property [core/project].


In [None]:
# Import the necessary libraries

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # silence TensorFlow C++ logging
import warnings
warnings.filterwarnings('ignore')
import sys
import uuid
import json
import kfp
import pandas as pd
import vertexai
from google.auth import default
from datasets import load_dataset
from google.cloud import aiplatform
from vertexai.preview.language_models import TextGenerationModel, EvaluationTextSummarizationSpec

# Initialize the Vertex AI SDK with the project and region set above
vertexai.init(project=PROJECT_ID, location=REGION)


# Prepare & Load Training Data

In [None]:
BUCKET_NAME = 'data-16-05-2024'
BUCKET_URI = f"gs://{BUCKET_NAME}/TRAIN.jsonl"
REGION = "us-central1"

In [None]:
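# URL of the training file in the bucket (not used below; the DataFrame is read from the local path)
json_url = 'https://storage.googleapis.com/data-16-05-2024/TRAIN.jsonl'
df = pd.read_json("/content/TRAIN.jsonl", lines=True)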
df.head()

|   | input_text | output_text |
|---|---|---|
| 0 | The BBC News website takes a look at how games... | Mobile games come of age |
| 1 | The explosion in consumer technology is to con... | Gadget market 'to grow in 2005' |
| 2 | The proportion of surfers using Microsoft's In... | New browser wins over net surfers |
| 3 | 'God games' in which players must control virt... | Games help you 'learn and play' |
| 4 | Online communities set up by the UK government... | Online commons to spark debate |
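
The preview shows the two fields each training example carries: `input_text` (the article body) and `output_text` (the title). As a rough sketch, records with this shape could be serialized to JSON Lines as follows. The helper name `to_jsonl` and the sample rows are illustrative assumptions; the lab itself uses a ready-made `TRAIN.jsonl`.

```python
import json

def to_jsonl(pairs):
    """Serialize (body, title) pairs as JSON Lines with input_text/output_text keys.

    Hypothetical helper for illustration only; not part of the lab.
    """
    lines = []
    for body, title in pairs:
        record = {"input_text": body, "output_text": title}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)

# Made-up sample rows with the same two fields as the preview
sample_pairs = [
    ("Article body one ...", "Title one"),
    ("Article body two ...", "Title two"),
]
jsonl_text = to_jsonl(sample_pairs)
print(jsonl_text)
```

Each output line is one standalone JSON object, which is the layout `pd.read_json(..., lines=True)` expects.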


In [None]:
print(df.shape)

(744, 2)
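
Before starting a tuning job, it may be worth checking that every example has a non-empty body and title. The sketch below is one way to do that on plain dictionaries; `find_bad_rows` and the sample records are assumptions made for illustration, not part of the lab.

```python
def find_bad_rows(records):
    """Return indices of records whose input_text or output_text is missing or blank."""
    bad = []
    for i, rec in enumerate(records):
        for key in ("input_text", "output_text"):
            value = rec.get(key)
            if not isinstance(value, str) or not value.strip():
                bad.append(i)
                break  # one problem per record is enough to flag it
    return bad

# Made-up records: one valid, one with a blank body, one with no title
records = [
    {"input_text": "Some article body", "output_text": "Some title"},
    {"input_text": "   ", "output_text": "Title without a body"},
    {"input_text": "Body without a title"},
]
bad_rows = find_bad_rows(records)
print(bad_rows)
```

With the DataFrame loaded above, the same check could be run on `df.to_dict("records")`.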


# Fine Tune Text Bison@002 Model

In [None]:
model_display_name = 'bbc-finetuned-model' # @param {type:"string"}
tuned_model = TextGenerationModel.from_pretrained("text-bison@002")
# Launch the tuning pipeline; the display name set above is passed through to the tuned model
tuned_model.tune_model(
    training_data=df,
    train_steps=100,
    tuning_job_location="europe-west4",
    tuned_model_location="europe-west4",
    model_display_name=model_display_name,
)


INFO:google.cloud.aiplatform.pipeline_jobs:Creating PipelineJob
INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob created. Resource name: projects/1060807096482/locations/europe-west4/pipelineJobs/tune-large-model-20240516161743
INFO:google.cloud.aiplatform.pipeline_jobs:To use this PipelineJob in another session:
INFO:google.cloud.aiplatform.pipeline_jobs:pipeline_job = aiplatform.PipelineJob.get('projects/1060807096482/locations/europe-west4/pipelineJobs/tune-large-model-20240516161743')
INFO:google.cloud.aiplatform.pipeline_jobs:View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/europe-west4/pipelines/runs/tune-large-model-20240516161743?project=1060807096482
INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob projects/1060807096482/locations/europe-west4/pipelineJobs/tune-large-model-20240516161743 current state:
PipelineState.PIPELINE_STATE_RUNNING
INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob projects/1060807096482/locations/europe-west4/pipe

KeyboardInterrupt: 
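
The log above prints the pipeline job's resource name and notes that `aiplatform.PipelineJob.get(...)` can reattach to the job from another session, which is useful here because the cell was interrupted. As a small sketch, the resource name can be split into its parts to confirm the job runs in the intended location. The helper `parse_pipeline_job_name` is a hypothetical name introduced for illustration.

```python
import re

# Pattern for resource names of the form shown in the log above
_PIPELINE_JOB_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/pipelineJobs/(?P<job_id>[^/]+)$"
)

def parse_pipeline_job_name(resource_name):
    """Split a pipeline job resource name into project, location, and job ID."""
    match = _PIPELINE_JOB_RE.match(resource_name)
    if match is None:
        raise ValueError(f"Not a pipeline job resource name: {resource_name!r}")
    return match.groupdict()

# Resource name copied from the log output above
resource_name = (
    "projects/1060807096482/locations/europe-west4/"
    "pipelineJobs/tune-large-model-20240516161743"
)
parts = parse_pipeline_job_name(resource_name)
print(parts)

# In a new session, the log suggests reattaching with:
# pipeline_job = aiplatform.PipelineJob.get(resource_name)
```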

# Predict with the new Fine Tuned Model

In [None]:
response = tuned_model.predict("Summarize this text to generate a title: \n Ever noticed how plane seats appear to be getting smaller and smaller? With increasing numbers of people taking to the skies, some experts are questioning if having such packed out planes is putting passengers at risk. They say that the shrinking space on aeroplanes is not only uncomfortable it it's putting our health and safety in danger. More than squabbling over the arm rest, shrinking space on planes putting our health and safety in danger? This week, a U.S consumer advisory group set up by the Department of Transportation said at a public hearing that while the government is happy to set standards for animals flying on planes, it doesn't stipulate a minimum amount of space for humans.")
print(response.text)

 Shrinking space on planes putting our health and safety in danger
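
The same instruction prefix is used for every prediction in this lab, so it can help to build the prompt in one place. A minimal sketch, assuming a hypothetical helper `build_title_prompt` and a shortened article body:

```python
# Instruction prefix as used in the predict() call above
PROMPT_PREFIX = "Summarize this text to generate a title: \n "

def build_title_prompt(body):
    """Prepend the title-generation instruction to an article body."""
    return PROMPT_PREFIX + body.strip()

# Shortened body for illustration
prompt = build_title_prompt(
    "  Ever noticed how plane seats appear to be getting smaller and smaller?  "
)
print(prompt)

# With the lab's objects this would be used as:
# response = tuned_model.predict(prompt)
```

Keeping the prefix in a single constant means the tuned and base models receive identical prompts, which makes the comparison fairer.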


In [None]:
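# Resource name of the model deployed behind the tuned endpoint
# (_endpoint is a private SDK attribute and may change between versions)
tuned_model_name = tuned_model._endpoint.gca_resource.deployed_models[0].model
# Reload the tuned model from its resource name
tuned_model_1 = TextGenerationModel.get_tuned_model(tuned_model_name)
#TextGenerationModel.get_tuned_model("bbc-finetuned-model")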
response = tuned_model_1.predict("Summarize this text to generate a title: \n Ever noticed how plane seats appear to be getting smaller and smaller? With increasing numbers of people taking to the skies, some experts are questioning if having such packed out planes is putting passengers at risk. They say that the shrinking space on aeroplanes is not only uncomfortable it it's putting our health and safety in danger. More than squabbling over the arm rest, shrinking space on planes putting our health and safety in danger? This week, a U.S consumer advisory group set up by the Department of Transportation said at a public hearing that while the government is happy to set standards for animals flying on planes, it doesn't stipulate a minimum amount of space for humans.")
print(response.text)

 Shrinking space on planes putting our health and safety in danger


# Predict with Base Model for comparison

In [None]:
base_model = TextGenerationModel.from_pretrained("text-bison@002")
response = base_model.predict("Summarize this text to generate a title: \n Ever noticed how plane seats appear to be getting smaller and smaller? With increasing numbers of people taking to the skies, some experts are questioning if having such packed out planes is putting passengers at risk. They say that the shrinking space on aeroplanes is not only uncomfortable it it's putting our health and safety in danger. More than squabbling over the arm rest, shrinking space on planes putting our health and safety in danger? This week, a U.S consumer advisory group set up by the Department of Transportation said at a public hearing that while the government is happy to set standards for animals flying on planes, it doesn't stipulate a minimum amount of space for humans.")
print(response.text)

 Shrinking Space on Planes: Putting Our Health and Safety at Risk?
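
To put a number on how far the two titles diverge, one simple option is a unigram-overlap F1 score. This is a rough sketch only: it is not the Vertex AI evaluation service, and since no reference title is available for this article, it measures agreement between the two model outputs rather than quality. The function names are assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())

def unigram_f1(candidate, reference):
    """F1 score over overlapping word tokens of two strings."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # shared tokens, counted with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Titles copied from the outputs above
tuned_title = "Shrinking space on planes putting our health and safety in danger"
base_title = "Shrinking Space on Planes: Putting Our Health and Safety at Risk?"

score = unigram_f1(tuned_title, base_title)
print(f"Unigram F1 between tuned and base titles: {score:.3f}")
```

The two titles share nine of their eleven tokens, so the difference comes down to the final phrase and the punctuation and casing style.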
