In [1]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

## Overview

## Objective

This notebook shows how to connect a BigQuery dataset to Claude models on Vertex AI using BigQuery DataFrames.

### Claude on Vertex AI

Anthropic Claude models on Vertex AI offer fully managed and serverless models. To use a Claude model on Vertex AI, send a request directly to the Vertex AI API endpoint.

For more information, see the [Use Claude](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude) documentation.

### BigQuery DataFrames
BigQuery DataFrames provides a Pythonic DataFrame and machine learning (ML) API powered by the BigQuery engine. BigQuery DataFrames is an open-source package.

For more information, see the [BigQuery DataFrames documentation](https://cloud.google.com/bigquery/docs/reference/bigquery-dataframes).


### Getting Started


#### Authenticate your notebook environment (Colab only)
If you are running this notebook on Google Colab, uncomment and run the following cell to authenticate your environment. This step is not required if you are using [Vertex AI Workbench](https://cloud.google.com/vertex-ai-workbench).

In [2]:
# from google.colab import auth
# auth.authenticate_user()

## Using Anthropic's Vertex SDK + BQ for *Python*

### Getting Started


#### Install the latest bigframes package if your installed version is older than 1.15.0



In [3]:
# !pip install bigframes --upgrade
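
To decide whether the upgrade is needed, you can compare the installed version against 1.15.0. The cell below is a minimal sketch using only the standard library; the helper names (`parse_version`, `needs_upgrade`, `installed_bigframes_version`) are hypothetical and not part of bigframes, and the parser only looks at the leading `major.minor.patch` numbers.

```python
import re
from importlib import metadata

MIN_BIGFRAMES = "1.15.0"  # minimum version named in the heading above


def parse_version(version):
    """Turn '1.15.0' into (1, 15, 0), ignoring suffixes such as 'rc1'."""
    numbers = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def needs_upgrade(installed, minimum=MIN_BIGFRAMES):
    """Return True when the installed version is older than the minimum."""
    return parse_version(installed) < parse_version(minimum)


def installed_bigframes_version():
    """Return the installed bigframes version string, or None if it is missing."""
    try:
        return metadata.version("bigframes")
    except metadata.PackageNotFoundError:
        return None


installed = installed_bigframes_version()
if installed is None or needs_upgrade(installed):
    print("bigframes is missing or older than", MIN_BIGFRAMES, "- run the pip cell above")
else:
    print("bigframes", installed, "is recent enough")
```

Comparing tuples of integers avoids the string-comparison pitfall where "1.9.0" would sort after "1.15.0".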

#### Restart current runtime

To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which will restart the current kernel.

In [4]:
# # Restart kernel after installs so that your environment can access the new packages
# import sys

# if "google.colab" in sys.modules:
#     import IPython

#     app = IPython.Application.instance()
#     app.kernel.do_shutdown(True)

#### Define Google Cloud project and region information

In [5]:
# Input your project id
PROJECT_ID = "bigframes-dev"  # @param {type:"string"}

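#### Select a Claude model and check its region availability

See [Anthropic Claude quotas and supported context length](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude#anthropic_claude_quotas_and_supported_context_length) and set `REGION` to a region where your chosen model is available.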

In [6]:
REGION = "us-east5"  # @param {type:"string"}

### Load raw sample data to a BigQuery dataset

Create a BigQuery dataset and table. You can use the [sample museum data CSV](https://github.com/googleapis/python-bigquery-dataframes/tree/main/notebooks/generative_ai/museum_art.csv).

The dataset should be in the **same region** as your chosen Claude model. For example, if you selected us-east5 for Claude 'haiku', load the sample data into a dataset in us-east5.
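
One way to do this from Python is with the `google-cloud-bigquery` client library. The sketch below is only an illustration under assumptions: the dataset name, table name, and local CSV path are placeholders you would supply, and it assumes the CSV has a single header row and that schema autodetection is acceptable. The client library is imported inside the function, so defining the helpers does not require it to be installed.

```python
def table_id(project, dataset, table):
    """Build a fully qualified BigQuery table ID: project.dataset.table."""
    return f"{project}.{dataset}.{table}"


def load_csv_to_bigquery(csv_path, project, dataset, table, location):
    """Create the dataset in `location` (if needed) and load a local CSV into it."""
    # Imported lazily so the helpers above can be used without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project, location=location)

    # Create the dataset in the same region as the Claude model.
    ds = bigquery.Dataset(f"{project}.{dataset}")
    ds.location = location
    client.create_dataset(ds, exists_ok=True)

    # Assumption: one header row, schema inferred from the data.
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,
    )
    destination = table_id(project, dataset, table)
    with open(csv_path, "rb") as source_file:
        job = client.load_table_from_file(
            source_file, destination, job_config=job_config
        )
    job.result()  # wait for the load job to finish
    return destination
```

A call might look like `load_csv_to_bigquery("museum_art.csv", PROJECT_ID, "my_dataset", "museum_art", REGION)`, where `my_dataset` and the file path are hypothetical.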

### Text generation for BQ Tables using Python BigFrames


In [8]:
import bigframes
import bigframes.pandas as bpd
bigframes.options.bigquery.project = PROJECT_ID  # replace with your project
bigframes.options.bigquery.location = REGION  # choose a region where your chosen Claude model is available
df = bpd.read_gbq("bigframes-dev.garrettwu_us_east5.museum_art")  # replace with your table
df.dtypes

object_number                             string[pyarrow]
is_highlight                                      boolean
is_public_domain                                  boolean
object_id                                           Int64
department                                string[pyarrow]
object_name                               string[pyarrow]
title                                     string[pyarrow]
culture                                   string[pyarrow]
period                                    string[pyarrow]
dynasty                                   string[pyarrow]
reign                                     string[pyarrow]
portfolio                                 string[pyarrow]
artist_role                               string[pyarrow]
artist_prefix                             string[pyarrow]
artist_display_name                       string[pyarrow]
artist_display_bio                        string[pyarrow]
artist_suffix                             string[pyarrow]
artist_alpha_s

In [9]:
# @title query: select top 10 records from table and put into dataframe

df = df[["object_id", "title"]].head(10)
df

Unnamed: 0,object_id,title
0,285844,"Addie Card, 12 years. Spinner in North Pownal ..."
1,437141,Portrait of a Man
2,670650,[Snow Crystal]
3,268450,Newhaven Fisherman
4,646996,전(傳) 오원 장승업 (1843–1897) 청동기와 화초가 있는 정물화 조선|傳 吾...
5,287958,Bridge of Augustus at Nani
6,435869,"Antoine Dominique Sauveur Aubert (born 1817), ..."
7,55834,
8,45087,
9,56883,


### Enable Claude model on Vertex AI and Create a BQ External Model Connection


*   Step 1: Visit the Vertex AI Model Garden console and select the model tile for the Claude model of your choice, as described in the [Use Claude documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude). Click the **“Enable”** button and follow the instructions.

*   Step 2: Create a BQ external connection by following the [Create a connection](https://cloud.google.com/bigquery/docs/generate-text#create_a_connection) instructions. Pay attention to the **supported regions** of Claude models and create your connection in the same region, for example us-east5 for Claude 3.5.
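
The connection name used later in this notebook has the form `project.location.connection_id`. As a quick sanity check before calling the model, a small helper can confirm that the connection's location matches the region you chose. This is a minimal sketch in plain Python; the function names are hypothetical and it only handles the three-part dotted form.

```python
def parse_connection_name(connection_name):
    """Split 'project.location.connection_id' into its three parts."""
    parts = connection_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "expected 'project.location.connection_id', got: " + repr(connection_name)
        )
    project, location, connection_id = parts
    return project, location, connection_id


def connection_matches_region(connection_name, region):
    """Return True when the connection's location equals the chosen region."""
    _, location, _ = parse_connection_name(connection_name)
    return location.lower() == region.lower()


# Example with the connection name and region used in this notebook.
name = "bigframes-dev.us-east5.bigframes-rf-conn"
if not connection_matches_region(name, "us-east5"):
    print("Connection and model region differ; create the connection in the model's region.")
```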







### Use BigQuery DataFrames ML package with Claude LLM

In this example, we use the `Claude3TextGenerator` class from BigQuery DataFrames to translate the titles of art pieces into English.

Documentation for the Claude3TextGenerator Class: https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.llm.Claude3TextGenerator

In [10]:
from bigframes.ml import llm

# Create the Claude text generator, backed by the BigQuery connection.
model = llm.Claude3TextGenerator(
    model_name="claude-3-5-sonnet",
    connection_name="bigframes-dev.us-east5.bigframes-rf-conn",  # replace with your connection
)

# Build one prompt per row by prefixing each title with the instruction.
df["input_prompt"] = "translate this into English: " + df["title"]
result = model.predict(df["input_prompt"])
result

Unnamed: 0,ml_generate_text_llm_result,ml_generate_text_status,prompt
0,This text is already in English. It appears to...,,"translate this into English: Addie Card, 12 ye..."
1,"The phrase ""Portrait of a Man"" is already in E...",,translate this into English: Portrait of a Man
2,"The phrase ""[Snow Crystal]"" is already in Engl...",,translate this into English: [Snow Crystal]
3,"The phrase ""Newhaven Fisherman"" is already in ...",,translate this into English: Newhaven Fisherman
4,"Here's the English translation: ""Attributed t...",,translate this into English: 전(傳) 오원 장승업 (1843...
5,"I apologize, but I'm not sure which language ""...",,translate this into English: Bridge of Augustu...
6,This title is already in English. It describes...,,translate this into English: Antoine Dominique...
7,,,
8,,,
9,,,
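
Rows 7 to 9 above appear to have no title, so their prompts and results are empty. The sketch below reproduces the prompt construction in plain Python (not BigQuery DataFrames) to make that behavior explicit; `build_prompts` is a hypothetical helper. In the DataFrame itself, filtering first with something like `df[df["title"].notna()]` should avoid sending empty rows, assuming BigQuery DataFrames mirrors pandas here.

```python
PROMPT_PREFIX = "translate this into English: "


def build_prompts(titles):
    """Prefix each title with the instruction; keep None for missing or blank titles."""
    prompts = []
    for title in titles:
        if title is None or not title.strip():
            prompts.append(None)  # nothing to translate for this row
        else:
            prompts.append(PROMPT_PREFIX + title)
    return prompts


# Example: the third row has no title, so no prompt is built for it.
titles = ["Portrait of a Man", "[Snow Crystal]", None]
for prompt in build_prompts(titles):
    print(prompt)
```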


In [11]:
# Join the original rows with the model results, dropping helper and status columns.
output_df = df.drop(columns=["input_prompt"]).join(result.drop(columns="ml_generate_text_status"))
output_df

Unnamed: 0,object_id,title,ml_generate_text_llm_result,prompt
0,285844,"Addie Card, 12 years. Spinner in North Pownal ...",This text is already in English. It appears to...,"translate this into English: Addie Card, 12 ye..."
1,437141,Portrait of a Man,"The phrase ""Portrait of a Man"" is already in E...",translate this into English: Portrait of a Man
2,670650,[Snow Crystal],"The phrase ""[Snow Crystal]"" is already in Engl...",translate this into English: [Snow Crystal]
3,268450,Newhaven Fisherman,"The phrase ""Newhaven Fisherman"" is already in ...",translate this into English: Newhaven Fisherman
4,646996,전(傳) 오원 장승업 (1843–1897) 청동기와 화초가 있는 정물화 조선|傳 吾...,"Here's the English translation: ""Attributed t...",translate this into English: 전(傳) 오원 장승업 (1843...
5,287958,Bridge of Augustus at Nani,"I apologize, but I'm not sure which language ""...",translate this into English: Bridge of Augustu...
6,435869,"Antoine Dominique Sauveur Aubert (born 1817), ...",This title is already in English. It describes...,translate this into English: Antoine Dominique...
7,55834,,,
8,45087,,,
9,56883,,,


In [12]:
# @title Save results to BigQuery

# Write the joined results to another BigQuery table, replacing it if it already exists.
output_df.to_gbq("bigframes-dev.garrettwu_us_east5.museum_art_translate", if_exists="replace")  # replace with your table

'bigframes-dev.garrettwu_us_east5.museum_art_translate'